Claude Opus 4.8 by Anthropic, highly capable model for complex reasoning, coding, and agentic work with 1M context and improved efficiency.
Available on
Privacy
- Zero data retention
- Always-on
- Training
- No
- Region
- EU + US
AI Model Directory
Search by what matters. Zero data retention, residency, training and more.
Claude Opus 4.8 by Anthropic, highly capable model for complex reasoning, coding, and agentic work with 1M context and improved efficiency.
Claude Sonnet 4.6 by Anthropic, full upgrade across coding, computer use, and long-context reasoning.
Claude Haiku 4.5 by Anthropic, fast model matching Claude Sonnet 4 coding performance at a fraction of the cost, more than twice the speed.
GPT-5.4 Mini is OpenAI's strongest small model, improving over GPT-5 Mini in coding and reasoning while running more than 2x faster, with a 400k context window.
GPT Realtime 2 by OpenAI: realtime voice model with GPT-5-class reasoning, vision input and configurable reasoning effort.
Gemini 3.5 Flash by Google pairs frontier-level intelligence with fast throughput and a 1M token context for agentic and coding tasks.
Gemini 3.5 Live Translate by Google delivers near real-time speech-to-speech translation across 70+ languages while preserving the speaker's voice.
Gemini 3.1 Flash-Lite Preview by Google, a preview of the cost-efficient multimodal LLM with 1M context, optimized for low latency and high throughput.
Gemini 3.1 Pro Preview by Google, a frontier reasoning LLM with 1M context and strong agentic coding, scoring 77% on ARC-AGI-2 and 81% on SWE-Bench Verified.
Gemini 3.1 Flash Image by Google, high-efficiency image generation and editing at multiple resolutions, optimized for speed and high-volume use.
Gemini 3.1 Flash Live by Google, a low-latency real-time audio-to-audio model with video input, optimized for voice-first agents and multimodal dialogue.
Gemini 3 Flash Preview by Google delivers near Pro-level reasoning at flash speed, with a 1M token context for agentic workflows.
Gemini 3 Pro Image Preview by Google generates and edits images with state-of-the-art reasoning for creative and technical work.
Gemini Flash Latest by Google is an alias that always points to the newest Flash model, for fast, frontier-class performance.
Gemini Flash Lite Latest by Google is an alias for the newest lightweight Flash model, tuned for high speed and throughput.
Gemini Pro Latest by Google is an alias for the newest Pro model, providing advanced reasoning for complex problem-solving.
Google Gemma 4 31B, dense open-weight model with 31B parameters, multimodal reasoning, and 256K context.
Grok Voice, xAI's real-time speech-to-speech agent with 20+ languages, five voices, function calling, web and X search.
Claude Opus 4.7 by Anthropic, advanced vision and coding model with 1M context for complex long-running agentic workflows.
Claude Opus 4.6 by Anthropic, professional knowledge-work model with 1M context, extended thinking, and 128K token outputs.
Claude Opus 4.5 by Anthropic, highly capable model for coding and agents with extended thinking and a tunable effort parameter.
Claude Sonnet 4.5 by Anthropic, strong coding and agentic model scoring 61.4% on the OSWorld computer-use benchmark, 30+ hour focus.
Claude Opus 4.1 by Anthropic, improved agentic reasoning and multi-file code refactoring at 74.5% SWE-bench.
Claude Opus 4 by Anthropic, leading coding model with extended thinking for complex software engineering and long-running agents.
Claude Sonnet 4 by Anthropic, significant upgrade with 72.7% SWE-bench, vision, and extended thinking for everyday coding.
Claude 3 Haiku by Anthropic, fast and affordable vision model for customer support and batch processing of large documents.
GPT-5.5 Pro uses extended reasoning compute to deliver OpenAI's strongest answers, built for the hardest tasks, long-running workflows, and deep analysis.
GPT-5.4 Pro is the highest-performance GPT-5.4 variant, using extended reasoning compute and a 1M context window for the hardest multi-step professional work.
GPT-5.4 Nano is the fastest, lightest GPT-5.4 variant, built for classification, data extraction, and lightweight coding subagents that need low latency.
GPT-5.3 Chat by OpenAI, chat-tuned model with a 272K context, vision, and tools for conversational use.
GPT-5.3 Chat Latest by OpenAI, newest chat alias with a 272K context, vision, reasoning, and tool support.
GPT-5.3 Codex is OpenAI's most capable agentic coding model, reaching state-of-the-art on SWE-Bench Pro and Terminal-Bench while using fewer tokens than prior models.
GPT-5.2 Codex by OpenAI, code-tuned model with a 272K context for complex refactoring and architectural work.
GPT-5.2 Pro by OpenAI, extended-compute variant with a 272K context for deeper reasoning on hard problems.
GPT-5.1 Codex by OpenAI, code-tuned model with a 272K context for code generation, refactoring, and debugging.
GPT-5.1 Codex Max by OpenAI, agentic coding model with a 272K context for large-scale, long-horizon development.
GPT-5.1 Codex Mini by OpenAI, fast, cost-efficient code model with a 272K context for everyday development tasks.
GPT-5.1 Chat Latest by OpenAI, chat-tuned variant with a 128K context and vision for conversational applications.
GPT-5 Codex is OpenAI's agentic coding model, trained on real-world tasks to generate code matching human style while iteratively running tests until they pass.
The Opper model directory lists every large language model on the gateway, with image, video, and voice models alongside them. Filter by price, context window, capability, benchmark score, hosting region, and zero data retention, then run any of them through one OpenAI-compatible API and a single key. Put two models head to head on the comparison pages, or read how routing and fallbacks work on the LLM gateway.