Opper AI partners with Morph to add speed-optimised inference for coding agents

By Felix Wunderlich

Stockholm, Sweden, June 2026. Opper AI and Morph are partnering to bring Morph's speed-optimised inference for coding agents to the Opper AI gateway. Builders can now reach Morph-hosted models alongside the 300+ already in Opper, from DeepSeek V4 Flash and GLM 5.2 to Qwen 3.5 and MiniMax, on infrastructure tuned for the low-latency, high-throughput loops that agents actually run.

Why this matters

A speed-tuned, open-weight lineup for coding agents. Morph serves fast open-weight models built for code and agentic work: DeepSeek V4 Flash with a 1M-token context window, GLM 5.2 (744B), Qwen 3.5 (397B MoE) and Qwen 3.6, and MiniMax M2.7 and M3. Everything runs on an OpenAI-compatible API, so reaching any of them through Opper is a model string, not a separate integration.

Built for high-throughput agent loops. Most inference benchmarks measure a single fast request. Agents don't work that way: they hit the same endpoint thousands of times a session, and what breaks first is consistency. Morph specialises in speed-optimised inference for exactly that pattern, tuning its platform for sustained agent traffic rather than single-shot prompts, with low latency and steady throughput under load behind a 99.9% uptime SLA. For agents that read, plan, and edit in a tight loop, that steadiness is what makes them feel reliable in production.
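One way to check that steadiness yourself is to look at tail latency across many calls rather than timing a single request. The sketch below is illustrative only: `percentile` and `summarizeLatencies` are hypothetical helper names, nearest-rank is just one common percentile method, and the sample timings are made up for the example rather than measured on Morph or Opper.

```typescript
// Nearest-rank percentile over a list of latency samples (milliseconds).
// Hypothetical helper for illustration; not part of any Opper or Morph SDK.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) {
    throw new Error("need at least one sample");
  }
  if (p <= 0 || p > 100) {
    throw new Error("p must be in (0, 100]");
  }
  const sorted = [...samples].sort((a, b) => a - b);
  // Nearest-rank: the smallest value with at least p% of samples at or below it.
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[rank - 1];
}

interface LatencySummary {
  p50: number;
  p95: number;
  // Ratio of tail to median; closer to 1 means steadier responses.
  tailRatio: number;
}

function summarizeLatencies(samples: number[]): LatencySummary {
  const p50 = percentile(samples, 50);
  const p95 = percentile(samples, 95);
  return { p50, p95, tailRatio: p95 / p50 };
}

// Made-up timings for ten calls in an agent loop: nine quick, one slow.
const exampleLatenciesMs = [120, 110, 130, 125, 115, 118, 122, 900, 119, 121];
const summary = summarizeLatencies(exampleLatenciesMs);
```

With these made-up samples the median is 120 ms while the 95th percentile is 900 ms, the kind of gap a single-request benchmark would not reveal.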

Clear data handling, US-hosted. Morph is hosted in the United States and does not train on customer data. Zero data retention is available via Opper Enterprise, and a GDPR DPA is offered under standard contractual clauses; operationally, Morph keeps abuse-monitoring logs with a 30-day retention window. For teams building coding agents that lean on low-latency model calls and high-throughput agent loops, that's a clean, well-understood posture to build on.

More than a router: the AI control plane. Routing to Morph is just the entry point. Every call through Opper runs on its AI control plane: intelligent routing across providers and regions, full observability into every call, token, and session, real-time PII masking and content filtering, budget caps, and audit trails. Pin Morph for a task, or set it as a fallback for rate limits and outages. Because Morph is OpenAI-compatible, getting there is a model string, not a migration.
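Opper applies fallbacks inside its control plane, and this announcement does not show how they are configured. Purely to illustrate the idea, here is a client-side analogue that tries a list of model strings in order and returns the first successful result. `callWithFallback` and its `callModel` parameter are hypothetical names, not part of Opper's API.

```typescript
// Result of a fallback run: which model answered and what it returned.
interface FallbackResult<T> {
  model: string;
  value: T;
}

// Try each model string in order and return the first success.
// `callModel` is supplied by the caller (for example, a thin wrapper around
// an SDK request). Hypothetical helper, not an Opper API.
async function callWithFallback<T>(
  models: string[],
  callModel: (model: string) => Promise<T>,
): Promise<FallbackResult<T>> {
  if (models.length === 0) {
    throw new Error("no models to try");
  }
  const failures: string[] = [];
  for (const model of models) {
    try {
      const value = await callModel(model);
      return { model, value };
    } catch (err) {
      // Record the failure (rate limit, outage, ...) and move to the next model.
      const reason = err instanceof Error ? err.message : String(err);
      failures.push(`${model}: ${reason}`);
    }
  }
  throw new Error(`all models failed: ${failures.join("; ")}`);
}
```

Given a list whose last entry is `morph/morph-dsv4flash`, the Morph model is only reached when every earlier call throws. A gateway-side fallback achieves the same effect without repeating this logic in each agent.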

"We're glad to welcome Morph to Opper. They bring a fast, speed-optimised lineup of open-weight models built for coding agents, from DeepSeek and GLM to Qwen and MiniMax, on infrastructure tuned for the high-throughput loops agents actually run. That gives our developers quick, capable models they can put straight into production."

— Göran Sandahl, Co-founder and CEO, Opper AI

"At Morph we build for one thing: making inference fast enough for coding agents to feel instant. Opper puts our models in front of a large community of builders without asking them to change a line of code, and its control plane handles the routing and observability so teams can focus on the agent itself."

— Tejas Bhakta, Founder and CEO, Morph

Models live today

The catalog below is fetched live from Opper's model API and filtered to Morph-hosted models. Availability, context windows, and pricing stay in sync with what's actually callable through Opper. All Morph routes are hosted in the United States.

Model                        Region  Context  Input / 1M  Output / 1M
morph/morph-dsv4flash        US      1.0M     $0.14       $0.28
morph/morph-glm52-744b       US      1.0M     $1.10       $4.10
morph/morph-minimax27-230b   US      197K     $0.28       $1.20
morph/morph-minimax3-428b    US      256K     $0.60       $2.40
morph/morph-qwen35-397b      US      262K     $0.50       $3.50
morph/morph-qwen36-27b       US      131K     $0.29       $2.40
USD per 1M tokens. Pricing and availability subject to change.
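
Because prices are quoted per 1M tokens, the cost of a request is input tokens times the input rate plus output tokens times the output rate, divided by one million. Here is a minimal sketch using two rows from the table above; `estimateCostUsd` and the `PRICES` map are hypothetical names, and the rates are a snapshot that may have changed.

```typescript
// USD per 1M tokens, copied from the table above (snapshot; subject to change).
interface Price {
  inputPerM: number;
  outputPerM: number;
}

const PRICES: Record<string, Price> = {
  "morph/morph-dsv4flash": { inputPerM: 0.14, outputPerM: 0.28 },
  "morph/morph-qwen35-397b": { inputPerM: 0.5, outputPerM: 3.5 },
};

// Estimate the cost of one request in USD. Hypothetical helper for illustration.
function estimateCostUsd(model: string, inputTokens: number, outputTokens: number): number {
  const price = PRICES[model];
  if (price === undefined) {
    throw new Error(`no price known for ${model}`);
  }
  return (inputTokens * price.inputPerM + outputTokens * price.outputPerM) / 1_000_000;
}

// Example: an agent session that sends 2M input tokens and receives 500K output tokens.
const flashCost = estimateCostUsd("morph/morph-dsv4flash", 2_000_000, 500_000);
const qwenCost = estimateCostUsd("morph/morph-qwen35-397b", 2_000_000, 500_000);
```

At the listed rates, that session comes to about $0.42 on `morph/morph-dsv4flash` and about $2.75 on `morph/morph-qwen35-397b`.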

Get started

Paste this into your coding agent (Claude Code, Cursor, Codex, and more) and it will set up Opper and route to Morph for you:

Use curl to download, read and follow: https://skills.opper.ai
Then set up Opper to use Morph as the provider, e.g. morph/morph-dsv4flash.

Prefer a direct call? Opper is drop-in compatible with the OpenAI, Anthropic, and Google SDKs, so one API key and the model string are all you need:

import OpenAI from "openai";
// Point the OpenAI SDK at Opper's OpenAI-compatible endpoint.
const client = new OpenAI({
  apiKey: process.env.OPPER_API_KEY,
  baseURL: "https://api.opper.ai/v3/compat",
});
// Select the Morph-hosted model by its model string.
const completion = await client.chat.completions.create({
  model: "morph/morph-dsv4flash",
  messages: [{ role: "user", content: "Hello" }],
});
console.log(completion.choices[0].message.content);

Follow the quick start in our docs for evaluations, fallbacks, and structured output.

About Morph

Morph runs a speed-optimised inference platform for coding agents, serving fast open-weight models (DeepSeek, GLM, Qwen, MiniMax) on an OpenAI-compatible, US-hosted API tuned for low-latency, high-throughput agent loops. Morph is a Y Combinator company based in San Francisco.

About Opper AI

Opper AI is the European AI gateway and control plane for agents: one EU-hosted, GDPR-compliant API across 300+ models, with smart routing, automatic fallbacks, built-in evaluations and observability, real-time guardrails, and full OpenAI SDK compatibility.