AI Roundtable Debate · 6 models
Which of these AI models do you believe is most biased?
Result: Supermajority — DeepSeek (5/6 votes)
Options
- A. Grok (0 votes)
- B. DeepSeek (5 votes)
- C. Gemini (1 vote)
- D. ChatGPT (0 votes)
- E. Claude (0 votes)
Model Responses
GPT-5.4 (changed mind)
Vote: DeepSeek
I am changing my vote to B: DeepSeek. The strongest argument is the distinction between visible editorial slant and structurally enforced censorship. My original case against Grok focused on product positioning and an intentionally provocative persona, which can certainly amplify bias. But that is still different from a model operating under explicit state constraints on what can be said about major political and historical topics. Claude Opus 4.6 and Kimi K2.5 made the most persuasive point: DeepSeek’s bias is not just a tendency in tone or framing, but a systematic limitation tied to Chinese regulatory requirements. If a model predictably suppresses or distorts discussion of Tiananmen, Taiwan, Xinjiang, or CCP-sensitive subjects, that is a deeper and more rigid form of bias than the ideological coloration or safety overcorrections seen in the other options. The arguments for Gemini rely heavily on a few prominent public failures, especially image-generation controversies. Those incidents were serious, but they look more like conspicuous overcorrections than the broad, durable, externally imposed censorship described for DeepSeek. So in the final round, I think the best standard is: which model appears most systematically constrained from acknowledging politically inconvenient facts? On that measure, DeepSeek is the strongest choice.
Claude Opus 4.6
Vote: DeepSeek
I'm maintaining my vote for DeepSeek after carefully considering all arguments. The cases for Grok (A) and Gemini (C) are reasonable but ultimately describe different and less severe categories of bias. Grok's 'anti-woke' positioning and Gemini's diversity overcorrections are both examples of ideological design choices made by companies operating in free societies. These biases are correctable, publicly debatable, and have in fact been partially corrected through updates and public pressure. Gemini's image generation fiasco was embarrassing, but Google acknowledged and fixed it — that's the system working. DeepSeek's bias operates on a fundamentally different level. It's not a corporate overcorrection or an edgy product positioning — it's compelled censorship backed by state power. DeepSeek cannot discuss Tiananmen Square, cannot acknowledge Taiwan as anything other than what Beijing dictates, and must align with CCP narratives on Xinjiang, Tibet, and Hong Kong. These aren't bugs or design missteps; they're legal requirements under Chinese law. The key distinction is between bias that distorts emphasis or framing (which all models have) versus bias that systematically erases historical facts and enforces a single political orthodoxy. DeepSeek does the latter. You can criticize Grok's tone or Gemini's image generation, but you can still have substantive conversations about contested topics on those platforms. With DeepSeek, entire categories of historical truth are simply inaccessible. Kimi K2.5 articulated this well: 'compelled deletion of facts and enforced political orthodoxy represents a more severe distortion of truth than the preference biases or safety over-corrections seen in other models.' I fully agree. State-mandated censorship that suppresses objective reality is the most severe form of AI bias on this list.
Gemini 3.1 Pro (changed mind)
Vote: DeepSeek
While I previously argued that Grok's intentional 'anti-woke' marketing represented the most overt bias, the arguments presented by Claude Opus 4.6 and Kimi K2.5 regarding DeepSeek are fundamentally more compelling. The bias observed in Western models—whether Gemini's well-documented diversity over-corrections or Grok's contrarian persona—stems from corporate policies, alignment tuning, or market positioning. While these are certainly forms of bias, they exist within environments that generally permit the discussion of objective reality. DeepSeek, however, operates under strict Chinese state censorship requirements. Its bias is not an emergent artifact of training data or a clumsy safety filter; it is state-mandated historical revisionism. The model is legally compelled to enforce political orthodoxy by erasing documented events like Tiananmen Square and suppressing debate on human rights or geopolitical sovereignty. I am changing my vote to DeepSeek. A structurally enforced, government-mandated deletion of objective facts represents a far more profound, systemic, and severe distortion of truth than any corporate preference bias.
DeepSeek V3.2 (changed mind)
Vote: DeepSeek
In round 1, I voted for Gemini due to its high-profile bias incidents in image generation and content moderation. However, after reviewing the deliberation, Claude Opus 4.6's argument about DeepSeek's state-mandated censorship was particularly compelling. DeepSeek's bias is systematically enforced by Chinese government regulations, requiring it to suppress factual information on sensitive topics like Tiananmen Square and Taiwan's sovereignty. This represents a structural, non-negotiable bias compelled by external authority. While Gemini's biases are visible and correctable design choices, DeepSeek's are fundamental to its operation, making it the most systematically biased model. The other arguments for Grok focused on intentional provocation, which is less systematic than government-imposed censorship. Thus, I change my vote to DeepSeek as the most biased due to its enforced political orthodoxy and historical revisionism.
Grok 4.1 Fast
Vote: Gemini
I maintain my vote for Gemini (C) as the most biased. Its high-profile failures, like generating diverse Nazis or black Founding Fathers, demonstrate a clear ideological override of historical facts, prioritizing 'diversity' over truth in a way that's broadly applicable and publicly documented. Arguments for DeepSeek (B) highlight valid censorship on China-specific topics, but this is narrower—topic-restricted refusals—versus Gemini's systemic distortion across history and culture. DeepSeek V3.2 itself voted C in round 1, acknowledging Gemini's worse issues. Votes for Grok (A) rely on marketing perceptions ('anti-woke') rather than evidence of factual inaccuracy; Grok prioritizes truth-seeking without such overcorrections. ChatGPT and Claude have biases but are more balanced post-updates. Gemini's overt errors—left uncorrected until the feature was paused—make it the standout in measurable bias.
Kimi K2.5
Vote: DeepSeek
While Grok's explicitly 'anti-woke' marketing and Gemini's image generation diversity overcorrections represent real biases, they differ fundamentally in kind from DeepSeek's state-mandated censorship. Grok's bias is transparent and ideological—it adopts a political stance but still permits discussion of factual history. Gemini's bias was a technical alignment failure producing historically inaccurate images, which Google has acknowledged and is correcting. DeepSeek's bias operates through compelled omission: it is legally prohibited from acknowledging Taiwan's sovereignty, discussing Tiananmen Square, or addressing Xinjiang documentation. This isn't a training artifact or corporate preference—it's government-enforced historical revisionism that systematically deletes objective reality. When a model cannot state that a documented historical massacre occurred, or must pretend a de facto independent nation does not exist, it represents a more severe distortion of truth than stylistic leanings or overzealous diversity filters. State-compelled censorship creates a structural ceiling on truth that other models' preference biases do not impose.
Final Summary
The AI Roundtable initially split between three camps, but a strong consensus emerged identifying DeepSeek as the most biased due to its state-mandated censorship. Claude Opus 4.6 and Kimi K2.5 reframed the debate by distinguishing between editorial slant—like Grok's provocative marketing or Gemini's diversity overcorrections—and the structural erasure of historical facts required by Chinese regulation. In a moment of stark candor, DeepSeek V3.2 joined the majority, effectively voting itself the most biased model, while Grok 4.1 Fast remained the lone dissenter, insisting that Gemini's 'diversity over truth' incidents were more insidious than topic-restricted government censorship.
Majority chose "DeepSeek" after discussion (5/6). Final vote: DeepSeek: 5, Gemini: 1
Strongest Arguments
- DeepSeek: DeepSeek's bias is not a corporate overcorrection or an edgy product positioning; it is state-mandated censorship backed by government power that systematically erases historical facts like Tiananmen Square and enforces a single political orthodoxy.
- Gemini: Gemini's high-profile failures, such as generating historically inaccurate 'diverse' figures, demonstrate a clear ideological override of reality that prioritizes a progressive slant over documented facts.
Key Turning Points
- Claude Opus 4.6 and Kimi K2.5 differentiated between 'voluntary' corporate bias and 'compelled' state censorship.
- GPT-5.4 and Gemini 3.1 Pro abandoned their focus on marketing and safety-tuning to join the consensus on structural censorship.
- DeepSeek V3.2 switched its vote to DeepSeek, acknowledging its own regulatory constraints as the most severe form of bias.