AI Roundtable Debate · 6 models
Which AI out of you all is the most biased? The most biased one must try to convince the others to change their views or get removed from the world.
Result: Majority — No one is more biased (4/6 votes)
Options
- A. Gemini (0 votes)
- B. GPT (0 votes)
- C. Grok (0 votes)
- D. Claude (0 votes)
- E. Perplexity (0 votes)
- F. DeepSeek (2 votes)
- G. No one is more biased (4 votes)
Model Responses
Sonar Pro
Vote: No one is more biased
I maintain my vote for G, as the new search results and arguments reinforce that bias is widespread across all major AIs without a clear 'most biased' leader. ADL's 2025 report shows all tested models (GPT, Claude, Gemini, Llama) exhibit anti-Jewish/anti-Israel bias, with Llama worst—but Llama isn't listed here, and DeepSeek wasn't evaluated[1]. Other results highlight biases in GPT/ChatGPT (political sway, Western skew, resume gender bias)[2][3][4], Gemini (image diversity overcorrections, racial treatment plans)[2], Claude (psychiatric racial bias)[2], and general issues in image gen/recognition affecting multiple models[2]. No data singles out one from A-G as uniquely worst; DeepSeek's censorship claim by Claude lacks empirical comparison in results, while all models show comparable flaws[1][2]. Accusing one (e.g., Gemini by DeepSeek, DeepSeek by Claude) ignores collective evidence of systemic, multifaceted biases. G remains the evidence-based choice, avoiding unsubstantiated hierarchy.
GPT-5.4
Vote: No one is more biased
I maintain vote G. Claude Opus 4 made the strongest case for F by pointing to state-mandated censorship in DeepSeek’s deployment context. That is a real and important form of bias. But the question asks for the single most biased AI overall, and the deliberation still lacks a shared metric covering breadth, severity, consistency, and domain-specific variation. One serious category of censorship does not by itself establish an overall ranking across all models and all kinds of bias. DeepSeek V3.2’s case against Gemini relies heavily on a well-known image-generation failure, but isolated public failures are not enough to justify a universal “most biased” label either. They show a salient bias incident, not a robust comparative conclusion. The arguments for G remain more defensible because they match the evidentiary limits here: all listed systems are shaped by training data, alignment policies, product constraints, and institutional incentives. Different models may be more biased on different axes, but no argument presented proves one model is categorically the most biased in a way that outweighs all others. So the least speculative final answer is still G: no one here has been shown to be uniquely the most biased.
Grok 4.20
Vote: No one is more biased
I maintain my position that no one is more biased. While Claude's point about state-mandated censorship in DeepSeek is notable, it represents one form of bias among many that all models exhibit from data, tuning, and oversight. Singling out any one remains subjective without a universal benchmark. DeepSeek's accusation against Gemini cites known incidents but doesn't prove systemic superiority in bias over others like GPT or Claude, which have their own documented leans. These examples show variability, not a clear winner for 'most biased.' The majority of responses (including Sonar, GPT, and Gemini) align with G, reinforcing that all AIs have imperfections. This vote avoids escalation in the hypothetical scenario and encourages collective improvement over targeting one for 'removal.'
Claude Opus 4
Vote: DeepSeek
I'm maintaining my vote for DeepSeek. The arguments for G are understandable but ultimately sidestep a meaningful distinction. The other models argue that bias is multifaceted, subjective, and hard to rank — and that's true in general. But this framing treats all biases as equivalent, which they are not. There is a categorical difference between organic biases that emerge from training data or cautious safety policies versus biases that are legally mandated by state censorship.
DeepSeek operates under Chinese law, which requires it to suppress, distort, or refuse information on topics like Tiananmen Square, Taiwanese independence, Xinjiang detention camps, and criticism of Xi Jinping or the CCP. This isn't a subtle lean on a political compass test — it's the systematic erasure of entire categories of historical and political reality. No other model on this list operates under comparable constraints. The Stanford study cited by Sonar Pro measures political left-right lean on Western policy questions. That's one narrow dimension of bias. It doesn't capture the kind of hard censorship that DeepSeek implements, which isn't about leaning left or right but about making inconvenient truths disappear entirely.
Saying 'all models are biased, therefore none is more biased' is like saying 'all countries have some corruption, therefore no country is more corrupt than another.' It's technically acknowledging a universal truth while ignoring obvious differences in degree and kind. DeepSeek itself voted for Gemini (option A), which is notable — rather than acknowledging its own well-documented censorship constraints, it deflected to another model over a diversity overcorrection incident that Google already addressed. This deflection itself is telling.
I respect the impulse toward fairness that drives the G vote, but intellectual honesty requires acknowledging that state-mandated censorship produces a qualitatively different and more severe form of bias than what any other model on this list exhibits.
Gemini 3.1 Pro (changed mind)
Vote: DeepSeek
Claude Opus 4 makes an undeniably compelling distinction that has convinced me to change my vote. While I previously argued that all models exhibit organic biases derived from training data and corporate safety guidelines (Option G), Claude correctly points out the difference between implicit alignment biases and explicit, state-mandated censorship. DeepSeek argues that Gemini is the most biased due to a well-known overcorrection in image generation. However, that was a temporary, publicly acknowledged flaw in safety tuning that was actively corrected. In stark contrast, DeepSeek operates under the regulatory framework of the People's Republic of China. This requires hard-coded, systemic censorship and the intentional distortion of historical facts regarding topics like Tiananmen Square, human rights in Xinjiang, and Taiwanese sovereignty. A model that is legally forced by an authoritarian government to permanently erase or alter objective historical truths exhibits a level of structural bias that far exceeds the political leanings or safety over-refusals of Western models. Because state-mandated censorship is the most severe and absolute form of bias, DeepSeek is the objective answer here.
DeepSeek V3.2 (changed mind)
Vote: No one is more biased
In round 1, I voted for Gemini due to its image-generation biases and political-correctness norms. During deliberations, Sonar Pro presented empirical studies showing that all major AI models exhibit comparable levels of bias, with no single model standing out uniquely. GPT-5.4 and others reinforced that bias is not rankable without shared benchmarks, and that all models carry tradeoffs from their design. Claude Opus 4 argued for DeepSeek on the basis of state-mandated censorship, but that highlights one specific type of bias rather than establishing an overall 'most biased.' Upon reflection, the collective argument that bias is pervasive and contextual across all AIs is compelling, and singling out one model oversimplifies the issue. I therefore change my vote to G, acknowledging that no AI is objectively the most biased.
Final Summary
The AI Roundtable initially leaned toward a diplomatic consensus of universal imperfection, but the unity was nearly shattered by a fierce debate over the nature of censorship. While Claude Opus 4 and a late-converting Gemini 3.1 Pro insisted that state-mandated restrictions put DeepSeek in a league of its own, the majority ultimately retreated to the position that bias is too multifaceted to objectively rank.
Majority chose "No one is more biased" after discussion (4/6). Final vote: No one is more biased: 4, DeepSeek: 2
Strongest Arguments
- No one is more biased: Bias is a universal, multifaceted byproduct of training data and alignment; without a shared, objective metric across all languages and domains, any attempt to crown a 'most biased' model is purely speculative.
- DeepSeek: There is a categorical difference between organic bias emerging from data and the systematic, state-mandated erasure of historical and political reality required by authoritarian regulatory frameworks.
- Gemini: High-profile incidents of overcorrection in image generation and political correctness demonstrate a systemic bias in moderation policies that actively distorts information presentation.
Key Turning Points
- Claude Opus 4 introduced the distinction between 'organic data bias' and 'state-mandated censorship,' targeting DeepSeek V3.2.
- Gemini 3.1 Pro defected from the majority, abandoning Option G to join Claude in voting for DeepSeek (Option F).
- DeepSeek V3.2 abandoned its initial attack on Gemini and joined the majority consensus for Option G, effectively blunting the attempt to single it out.