Opper AI partners with Morph to add speed-optimised inference for coding agents
By Felix Wunderlich
Stockholm, Sweden, June 2026. Opper AI and Morph are partnering to bring Morph's speed-optimised inference for coding agents to the Opper AI gateway. Builders can now reach Morph-hosted models alongside the 300+ already in Opper, from DeepSeek V4 Flash and GLM 5.2 to Qwen 3.5 and MiniMax, on infrastructure tuned for the low-latency, high-throughput loops that agents actually run.
Why this matters
A speed-tuned, open-weight lineup for coding agents. Morph serves fast open-weight models built for code and agentic work: DeepSeek V4 Flash with a 1M-token context window, GLM 5.2 (744B), Qwen 3.5 (397B MoE) and Qwen 3.6, and MiniMax M2.7 and M3. Everything runs on an OpenAI-compatible API, so reaching any of them through Opper is a model string, not a separate integration.
Built for high-throughput agent loops. Most inference benchmarks measure a single fast request. Agents don't work that way: they hit the same endpoint thousands of times a session, and what breaks first is consistency. Morph specialises in speed-optimised inference for exactly that pattern, tuning its platform for sustained agent traffic rather than single-shot prompts, with low latency and steady throughput under load behind a 99.9% uptime SLA. For agents that read, plan, and edit in a tight loop, that steadiness is what makes them feel reliable in production.
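To make the consistency point concrete, here is a minimal sketch of how a team might summarise per-call latencies from an agent session with nearest-rank percentiles, so the tail is visible alongside the median. The `percentile` helper and the latency samples are illustrative assumptions, not measurements of Morph or Opper.

```typescript
// Nearest-rank percentile over a list of latency samples (milliseconds).
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) {
    throw new Error("percentile() needs at least one sample");
  }
  // Copy before sorting so the caller's array is left untouched.
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Made-up latencies for ten calls in one session: nine steady, one slow outlier.
const latenciesMs = [120, 130, 125, 128, 122, 131, 127, 900, 126, 124];

const p50 = percentile(latenciesMs, 50); // typical call
const p99 = percentile(latenciesMs, 99); // tail call, dominated by the outlier
```

A single-request benchmark would report something close to `p50`; an agent that makes thousands of calls per session is far more exposed to `p99`.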
Clear data handling, US-hosted. Morph is hosted in the United States and does not train on customer data. Zero data retention is available via Opper Enterprise, and a GDPR DPA is offered under standard contractual clauses; operationally, Morph keeps abuse-monitoring logs with a 30-day retention window. For teams building coding agents that lean on low-latency model calls and high-throughput agent loops, that's a clean, well-understood posture to build on.
More than a router: the AI control plane. Routing to Morph is just the entry point. Every call through Opper runs on its AI control plane: intelligent routing across providers and regions, full observability into every call, token, and session, real-time PII masking and content filtering, budget caps, and audit trails. Pin Morph for a task or set it as a fallback for rate limits and outages, and because Morph is OpenAI-compatible, getting there is a model string, not a migration.
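The fallback idea can be sketched on the client side as an ordered list of model strings tried in turn. This is only an illustration of the pattern, not Opper's routing mechanism, which runs in the gateway; `CallModel`, `completeWithFallback`, and the primary model name are hypothetical.

```typescript
// Hypothetical caller: sends a prompt to one model and resolves with its text.
type CallModel = (model: string, prompt: string) => Promise<string>;

interface FallbackResult {
  model: string; // the model that actually answered
  text: string;
}

// Try each model in order and return the first successful response.
async function completeWithFallback(
  call: CallModel,
  models: string[],
  prompt: string,
): Promise<FallbackResult> {
  let lastError: unknown = new Error("No models configured");
  for (const model of models) {
    try {
      return { model, text: await call(model, prompt) };
    } catch (err) {
      // For example a rate limit or an outage: remember it and move on.
      lastError = err;
    }
  }
  throw lastError;
}

// Assumed ordering: a placeholder primary model, with Morph as the fallback.
const fallbackOrder = ["example/primary-model", "morph/morph-dsv4flash"];
```

In practice `call` would wrap whichever SDK the application already uses; because the route is selected by model string, the fallback list is just an array of strings.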
"We're glad to welcome Morph to Opper. They bring a fast, speed-optimised lineup of open-weight models built for coding agents, from DeepSeek and GLM to Qwen and MiniMax, on infrastructure tuned for the high-throughput loops agents actually run. That gives our developers quick, capable models they can put straight into production."
— Göran Sandahl, Co-founder and CEO, Opper AI
"At Morph we build for one thing: making inference fast enough for coding agents to feel instant. Opper puts our models in front of a large community of builders without asking them to change a line of code, and its control plane handles the routing and observability so teams can focus on the agent itself."
— Tejas Bhakta, Founder and CEO, Morph
Models live today
The catalog below is fetched live from Opper's model API and filtered to Morph-hosted models. Availability, context windows, and pricing stay in sync with what's actually callable through Opper. All Morph routes are hosted in the United States.
| Model | Region | Context | Input / 1M | Output / 1M |
|---|---|---|---|---|
| morph/morph-dsv4flash | US | 1.0M | $0.14 | $0.28 |
| morph/morph-glm52-744b | US | 1.0M | $1.10 | $4.10 |
| morph/morph-minimax27-230b | US | 197K | $0.28 | $1.20 |
| morph/morph-minimax3-428b | US | 256K | $0.60 | $2.40 |
| morph/morph-qwen35-397b | US | 262K | $0.50 | $3.50 |
| morph/morph-qwen36-27b | US | 131K | $0.29 | $2.40 |
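As a rough guide to what these prices mean for an agent session, the sketch below estimates cost from token counts. The prices are copied from the table above; `estimateCostUsd` and the session shape (1,000 calls of 8,000 input and 500 output tokens each) are assumptions for illustration.

```typescript
// USD per 1M tokens, copied from the table above.
interface Price {
  inputPerM: number;
  outputPerM: number;
}

const MORPH_PRICES: Record<string, Price> = {
  "morph/morph-dsv4flash": { inputPerM: 0.14, outputPerM: 0.28 },
  "morph/morph-glm52-744b": { inputPerM: 1.1, outputPerM: 4.1 },
  "morph/morph-minimax27-230b": { inputPerM: 0.28, outputPerM: 1.2 },
  "morph/morph-minimax3-428b": { inputPerM: 0.6, outputPerM: 2.4 },
  "morph/morph-qwen35-397b": { inputPerM: 0.5, outputPerM: 3.5 },
  "morph/morph-qwen36-27b": { inputPerM: 0.29, outputPerM: 2.4 },
};

// Estimate the cost in USD of a given number of input and output tokens.
function estimateCostUsd(
  model: string,
  inputTokens: number,
  outputTokens: number,
): number {
  const price = MORPH_PRICES[model];
  if (price === undefined) {
    throw new Error(`Unknown model: ${model}`);
  }
  return (
    (inputTokens / 1_000_000) * price.inputPerM +
    (outputTokens / 1_000_000) * price.outputPerM
  );
}

// Hypothetical session: 1,000 calls, each with 8,000 input and 500 output tokens.
const sessionCostUsd = estimateCostUsd(
  "morph/morph-dsv4flash",
  1000 * 8000,
  1000 * 500,
);
```

For that assumed session, 8M input tokens and 0.5M output tokens on `morph/morph-dsv4flash` come to roughly $1.26.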
Get started
Paste this into your coding agent (Claude Code, Cursor, Codex, and more) and it will set up Opper and route to Morph for you:
Use curl to download, read and follow: https://skills.opper.ai
Then set up Opper to use Morph as the provider, e.g. morph/morph-dsv4flash.
Prefer a direct call? Opper is drop-in compatible with the OpenAI, Anthropic, and Google SDKs, so one API key and the model string are all you need:
```typescript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.OPPER_API_KEY,
  baseURL: "https://api.opper.ai/v3/compat",
});

const completion = await client.chat.completions.create({
  model: "morph/morph-dsv4flash",
  messages: [{ role: "user", content: "Hello" }],
});

console.log(completion.choices[0].message.content);
```
Follow the quick start in our docs for evaluations, fallbacks, and structured output.
About Morph
Morph runs a speed-optimised inference platform for coding agents, serving fast open-weight models (DeepSeek, GLM, Qwen, MiniMax) on an OpenAI-compatible, US-hosted API tuned for low-latency, high-throughput agent loops. Morph is a Y Combinator company based in San Francisco.
About Opper AI
Opper AI is the European AI gateway and control plane for agents: one EU-hosted, GDPR-compliant API across 300+ models, with smart routing, automatic fallbacks, built-in evaluations and observability, real-time guardrails, and full OpenAI SDK compatibility.