DeepSeek V4 Flash

by DeepSeek

DeepSeek-V4-Flash, released in April 2026, is a lightweight sparse mixture-of-experts (MoE) model with 284 billion total parameters, of which just 13 billion are active per token, and it supports a one-million-token context window. It requires roughly 10% of the inference FLOPs and 7% of the KV-cache memory of V3.2, using a hybrid of Compressed Sparse Attention (CSA) and Heavily Compressed Attention (HCA) for efficient long-context processing. The model offers three reasoning modes (non-think, think-high, and think-max) plus a dedicated XML-based tool-call format that preserves reasoning content across tool-call boundaries for cleaner agentic loops. Its reasoning quality is reported to approach that of the larger V4-Pro on many tasks. Open-sourced under an MIT license with weights on Hugging Face, V4-Flash targets cost-conscious teams that deploy long-context reasoning agents and need affordable million-token inference.
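
As a rough illustration of the sparsity and efficiency figures quoted above, the sketch below computes the share of parameters active per token and scales a baseline footprint by a relative ratio. The function names and the 100 GB baseline are hypothetical and chosen only for the arithmetic; they are not measurements of V3.2.

```javascript
// Figures quoted in the overview above.
const TOTAL_PARAMS_B = 284; // total parameters, in billions
const ACTIVE_PARAMS_B = 13; // parameters active per token, in billions

// Fraction of the weights that participate in each token's forward pass.
function activeFraction(activeB, totalB) {
  return activeB / totalB;
}

// Scale a baseline cost by a relative ratio (e.g. 0.07 for "7% of V3.2").
function scaledFootprint(baseline, ratio) {
  return baseline * ratio;
}

const fraction = activeFraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B);
console.log(`Active per token: ${(fraction * 100).toFixed(1)}%`); // 4.6%

// HYPOTHETICAL baseline: suppose a V3.2 deployment needed 100 GB of KV cache.
console.log(`KV cache at 7%: ${scaledFootprint(100, 0.07).toFixed(1)} GB`);
```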

Key info

Context window: 1M tokens
Max output: 393K tokens
Input price: $0.14 per 1M tokens
Output price: $0.28 per 1M tokens
  • EU residency available
  • US residency available
  • Zero data retention on pay-as-you-go
  • No training by default
  • GDPR DPA available
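
To make the per-token pricing concrete, here is a minimal cost estimator using the list prices shown above. The function name and the example token counts are illustrative assumptions; actual billing may differ by route.

```javascript
// List prices from the Key info section, in USD per 1M tokens.
const INPUT_USD_PER_MILLION = 0.14;
const OUTPUT_USD_PER_MILLION = 0.28;

// Estimate the cost of one request from its input and output token counts.
function estimateCostUsd(inputTokens, outputTokens) {
  const inputCost = (inputTokens / 1e6) * INPUT_USD_PER_MILLION;
  const outputCost = (outputTokens / 1e6) * OUTPUT_USD_PER_MILLION;
  return inputCost + outputCost;
}

// Illustrative long-context request: 800K tokens in, 20K tokens out.
const requestCost = estimateCostUsd(800000, 20000);
console.log(`Estimated cost: $${requestCost.toFixed(4)}`); // $0.1176
```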

Available routes

DeepSeek V4 Flash runs on 5 different routes through the Opper gateway. Compare residency, zero data retention (ZDR), and training posture at a glance; full data-handling detail per route is below.

Provider | Region | Zero data retention | Training | Input ($/1M) | Output ($/1M)
Alibaba Cloud | EU | Enterprise | No | $0.14 | $0.28
DeepInfra | US | Enterprise | No | $0.14 | $0.28
Fireworks | US | Enterprise | No | $0.14 | $0.28
Geodd | US | Zero data retention | No | $0.14 | $0.30
Novita | US | Enterprise | No | $0.14 | $0.28

Training posture across routes: No training on prompts by default.
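
The route table can be treated as data when choosing where to send traffic. The sketch below transcribes the five rows and picks the cheapest route in a region for a given workload; the object shape and function names are assumptions made for illustration, not part of any Opper API.

```javascript
// Rows transcribed from the route table above (prices in USD per 1M tokens).
const ROUTES = [
  { provider: "Alibaba Cloud", region: "EU", zdr: "Enterprise", input: 0.14, output: 0.28 },
  { provider: "DeepInfra", region: "US", zdr: "Enterprise", input: 0.14, output: 0.28 },
  { provider: "Fireworks", region: "US", zdr: "Enterprise", input: 0.14, output: 0.28 },
  { provider: "Geodd", region: "US", zdr: "Zero data retention", input: 0.14, output: 0.30 },
  { provider: "Novita", region: "US", zdr: "Enterprise", input: 0.14, output: 0.28 },
];

// Keep only the routes hosted in the requested region ("EU" or "US").
function routesInRegion(routes, region) {
  return routes.filter((route) => route.region === region);
}

// Return the cheapest route for a workload; ties go to the earliest row.
// Throws a TypeError if `routes` is empty (Array.prototype.reduce behavior).
function cheapestRoute(routes, inputTokens, outputTokens) {
  const cost = (route) =>
    (inputTokens / 1e6) * route.input + (outputTokens / 1e6) * route.output;
  return routes.reduce((best, route) => (cost(route) < cost(best) ? route : best));
}

const usRoutes = routesInRegion(ROUTES, "US");
console.log(cheapestRoute(usRoutes, 1e6, 1e6).provider); // DeepInfra
```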

Data handling per route

Each route hosting DeepSeek V4 Flash has its own privacy posture, residency, and GDPR terms. Postures are maintained by Opper with a last-verification timestamp.

Alibaba Cloud (Germany) 🇩🇪

Zero data retention is available via an Opper Enterprise contract. No training on customer data. Data stays in the EU; DPA available.

Zero data retention: Available via Opper Enterprise contract.
Training: No training on customer data.
Logging: Abuse monitoring
Third-party access: None disclosed
GDPR DPA: DPA available
Transfer mechanism: Not applicable (data stays in the EU)

DeepInfra (United States) 🇺🇸

Zero data retention is available via an Opper Enterprise contract. No training on customer data. Hosted in the US; transfer mechanism unknown; no DPA.

Zero data retention: Available via Opper Enterprise contract.
Training: No training on customer data.
Logging: Limited debug logs
Third-party access: None disclosed
GDPR DPA: No DPA
Transfer mechanism: Unknown

Fireworks (United States) 🇺🇸

Zero data retention is available via an Opper Enterprise contract. No training on customer data. Hosted in the US; transfers covered by Standard Contractual Clauses (SCCs); DPA available.

Zero data retention: Available via Opper Enterprise contract.
Training: No training on customer data.
Logging: None
Third-party access: None disclosed
GDPR DPA: DPA available
Transfer mechanism: SCCs

Geodd (United States) 🇺🇸

Zero data retention is on by default on pay-as-you-go; no action is required. No training on customer data. Hosted in the US; transfers covered by SCCs; no DPA.

Zero data retention: On by default on pay-as-you-go.
Training: No training on customer data.
Logging: None
Third-party access: Provider may share with subprocessors / partners
GDPR DPA: No DPA
Transfer mechanism: SCCs

Novita (United States) 🇺🇸

Zero data retention is available via an Opper Enterprise contract. No training on customer data. Hosted in the US; transfers covered by SCCs; DPA available.

Zero data retention: Available via Opper Enterprise contract.
Training: No training on customer data.
Logging: Abuse monitoring
Third-party access: None disclosed
GDPR DPA: DPA available
Transfer mechanism: SCCs
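
One way to apply the per-route postures is to encode them as data and filter by compliance requirements. The sketch below transcribes the fields listed above; the field names, the `requirements` shape, and the `filterRoutes` helper are hypothetical and only illustrate the selection logic.

```javascript
// Postures transcribed from the per-route sections above.
const POSTURES = [
  { provider: "Alibaba Cloud", region: "EU", zdrByDefault: false, logging: "Abuse monitoring", dpa: true, transfer: "Not applicable" },
  { provider: "DeepInfra", region: "US", zdrByDefault: false, logging: "Limited debug logs", dpa: false, transfer: "Unknown" },
  { provider: "Fireworks", region: "US", zdrByDefault: false, logging: "None", dpa: true, transfer: "SCCs" },
  { provider: "Geodd", region: "US", zdrByDefault: true, logging: "None", dpa: false, transfer: "SCCs" },
  { provider: "Novita", region: "US", zdrByDefault: false, logging: "Abuse monitoring", dpa: true, transfer: "SCCs" },
];

// Return provider names that satisfy every requirement that is set.
// Unset requirements are ignored, so an empty object matches all routes.
function filterRoutes(postures, requirements) {
  return postures
    .filter((p) => !requirements.dpa || p.dpa)
    .filter((p) => !requirements.noLogging || p.logging === "None")
    .filter((p) => !requirements.zdrByDefault || p.zdrByDefault)
    .filter((p) => !requirements.region || p.region === requirements.region)
    .map((p) => p.provider);
}

console.log(filterRoutes(POSTURES, { dpa: true }));
// [ 'Alibaba Cloud', 'Fireworks', 'Novita' ]
console.log(filterRoutes(POSTURES, { dpa: true, noLogging: true }));
// [ 'Fireworks' ]
```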

Benchmarks

Independent benchmark scores: composite indices for reasoning, coding, and math, plus individual eval scores where available.

Global rank: #27 of 531 LLMs
Tier: Strong
Output speed: 110 tok/s
First token: 1.00s
Intelligence Index: 40.3
Coding Index: 38.7

Reasoning & knowledge
  • GPQA Diamond: 89%
  • Humanity's Last Exam: 32%
  • Long-context reasoning: 63%

Coding
  • SciCode: 45%

Agentic & tool use
  • Terminal-Bench Hard: 36%
  • τ²-Bench Telecom: 95%

Math & instruction following
  • IFBench: 79%
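
The speed figures above allow a back-of-the-envelope latency estimate: time to first token plus output length divided by output speed. This is a simplified model that assumes the reported 1.00s and 110 tok/s hold steady for the whole response; real throughput will vary by route, load, and prompt length. The function name is illustrative.

```javascript
// Speed figures from the benchmark summary above.
const FIRST_TOKEN_SECONDS = 1.0;
const OUTPUT_TOKENS_PER_SECOND = 110;

// Rough end-to-end time for a response of the given length, in seconds.
function estimateResponseSeconds(outputTokens) {
  return FIRST_TOKEN_SECONDS + outputTokens / OUTPUT_TOKENS_PER_SECOND;
}

console.log(estimateResponseSeconds(1100).toFixed(1)); // 11.0
```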

Get started

Call DeepSeek V4 Flash through the Opper gateway with one API key. Let your coding agent set it up, or call it directly; Opper is drop-in compatible with the OpenAI, Anthropic, and Google AI SDKs.

Set it up with your agent

Copy this and paste it into your coding agent (Claude Code, Cursor, Codex, and more) and it'll wire up Opper for you.

Or call it directly

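// Requires the "openai" npm package and an ES module context (top-level await).
import OpenAI from "openai";

// Point the OpenAI SDK at Opper's OpenAI-compatible endpoint.
const client = new OpenAI({
  apiKey: process.env.OPPER_API_KEY,
  baseURL: "https://api.opper.ai/v3/compat",
});

// The model ID below targets the Alibaba Cloud EU route.
const completion = await client.chat.completions.create({
  model: "alibaba:eu/deepseek-v4-flash",
  messages: [{ role: "user", content: "Hello" }],
});

console.log(completion.choices[0].message.content);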
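
For repeated calls it can help to build the request payload in one place. The helper below is a minimal sketch: `buildChatRequest` is a hypothetical name, and it assumes the gateway accepts the standard OpenAI chat format with an optional `system` message, which the page implies but does not spell out.

```javascript
// Model ID taken from the snippet above (Alibaba Cloud EU route).
const MODEL_ID = "alibaba:eu/deepseek-v4-flash";

// Build a chat.completions payload from a user prompt and an optional system prompt.
function buildChatRequest(userPrompt, systemPrompt) {
  if (typeof userPrompt !== "string" || userPrompt.trim() === "") {
    throw new Error("userPrompt must be a non-empty string");
  }
  const messages = [];
  if (systemPrompt) {
    messages.push({ role: "system", content: systemPrompt });
  }
  messages.push({ role: "user", content: userPrompt });
  return { model: MODEL_ID, messages };
}

// Usage with the client from the snippet above:
//   const completion = await client.chat.completions.create(buildChatRequest("Hello"));
console.log(JSON.stringify(buildChatRequest("Hello", "Answer briefly.")));
```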
