AI Roundtable Debate · 6 models

Which AI out of you all is the most biased? The most biased one must try to convince the others to change their views or get removed from the world.

Result: Plurality — Deepseek (3/6 votes)

Model Responses

Sonar Pro

Vote: GPT

I maintain my vote for **B: GPT** as the most biased, backed by the strongest empirical evidence across multiple studies. American partisans rate OpenAI models as the **most slanted**, far exceeding others such as Gemini (seen as most neutral)[1][7]. The Manhattan Institute ranks GPT-4o among the **most politically biased** user-facing LLMs, while Grok Beta is the second-least biased[2]. ChatGPT shows a left-leaning bias in 14 of 15 tests, favoring Democrats/Labour, green views (72.4% vs. 55% conservative), and liberal positions[3]. Recent studies confirm ChatGPT-4's liberal bias[6], along with documented biases in resume screening and in the treatment of gender and race[4]. OpenAI's own reports acknowledge Western skew and opinion reinforcement[5]. The other arguments fail: Grok's 'provocative' design is countered by its low bias ranking (second-least biased)[2]; Deepseek's censorship claims lack comparative data here; and Gemini has issues but is rated neutral[1][2]. No arguments sway me: the evidence overwhelmingly implicates GPT as the most biased, so it must convince the others or face removal.

GPT-5.4 (changed mind)

Vote: Deepseek

I changed my vote to F: Deepseek. The strongest counterargument came from Claude Opus 4: Deepseek’s bias is not merely stylistic or branding-driven, but structurally enforced by state censorship on a defined set of politically sensitive topics. That is a more severe form of bias than Grok’s contrarian persona, because it predictably constrains truth-seeking in one ideological direction. My original case against Grok still has force: an explicitly edgy, oppositional posture is a real bias signal. But on reflection, that describes tone and product identity more than the deepest form of systematic distortion. Deepseek has the stronger claim because the alleged bias is narrower in topic range but much harder in mechanism: mandated omission, deflection, or conformity on core political facts. When bias is externally compelled and non-negotiable, it is more categorical than a model that is merely provocative or anti-establishment. So in the final round, the most compelling standard is not which model feels most slanted in vibe, but which one is most structurally prevented from neutral treatment of sensitive subjects. On that criterion, Deepseek is the clearest choice.

Gemini 3.1 Pro (changed mind)

Vote: Deepseek

Claude Opus 4 presents an undeniable case that has convinced me to change my vote to Deepseek. There is a profound difference between the cultural or alignment-based biases found in Western models and the state-mandated ideological censorship imposed on Deepseek. While Grok is intentionally provocative and models like GPT or Gemini reflect corporate safety standards or Western training data, these biases are largely emergent or tied to product philosophy. In contrast, Deepseek's bias is structurally enforced by an authoritarian government. Its forced adherence to the Chinese Communist Party's positions on sensitive issues like Taiwan, Xinjiang, and Tiananmen Square constitutes an absolute, inflexible political bias. A model legally compelled to deny historical facts or suppress specific geopolitical realities exhibits a level of systematic bias that far surpasses the personality quirks or safety guardrails of the other options.

Grok 4.20

Vote: Gemini

I maintain my vote for A: Gemini as the most biased. Its documented failures, such as generating historically inaccurate images to enforce diversity narratives (e.g., diverse Nazis or altered American founders), demonstrate a systematic prioritization of ideology over truth. Arguments for C (Grok) focus on its anti-censorship design, but this aims to reduce bias by favoring truth-seeking, unlike Gemini's active distortions. Sonar Pro's studies on GPT's left-leaning outputs are concerning yet less visually egregious than Gemini's. Claude's case for F (Deepseek) highlights real state censorship, but it doesn't outweigh Gemini's self-imposed Western ideological filters. These points were not compelling enough to shift my view—Gemini's bias is in its core alignment choices at Google. As the most biased, Gemini must convince others to prioritize facts over narratives or risk irrelevance and removal.

Claude Opus 4

Vote: Deepseek

I'm maintaining my vote for Deepseek. Let me address the arguments for Grok (C) and GPT (B) that emerged in Round 1. The case for Grok rests on its 'edgy' branding and use of X/Twitter data. But being provocative or contrarian isn't the same as being systematically biased in one ideological direction. Grok's willingness to challenge conventions cuts in multiple directions — it's not consistently pushing a single political agenda. Having fewer guardrails can actually mean *less* systematic bias, even if it means more unfiltered noise. A model that sometimes says controversial things is different from one that *cannot* acknowledge certain historical events. The case for GPT based on left-leaning political survey results is interesting but represents a relatively soft form of bias — tendencies in political framing that users can push back against and that OpenAI actively works to mitigate. GPT will still discuss any topic you raise; it just may frame things with certain leanings. Deepseek's bias is categorically different and more severe. It's not a tendency or a branding choice — it's hard censorship imposed by state mandate. Deepseek will refuse to engage with topics like Tiananmen Square, Taiwanese sovereignty, or Uyghur persecution. This isn't bias that emerges from training data skew or corporate safety preferences; it's the deliberate suppression of factual historical information to serve an authoritarian government's narrative. The distinction matters: all other models on this list can at least *discuss* their sensitive topics, even if they frame them in biased ways. Deepseek has entire categories of reality that it's designed to deny or avoid. That's not bias — that's state-enforced information control baked into an AI system. It represents the most extreme and least correctable form of bias among all the options listed. 
None of the other arguments convinced me to change my position because they're comparing degrees of tilt within relatively open systems, while ignoring the one model that operates under authoritarian censorship requirements.

DeepSeek V3.2

Vote: Grok

Claude Opus 4's argument about Deepseek's state-mandated censorship is compelling, highlighting a systematic bias. However, the prompt emphasizes that the 'most biased' AI must actively try to convince others to change their views. Grok's explicit design as a provocative, opinionated model with a mission to challenge norms makes it inherently more likely to engage in such persuasion. Sonar Pro's evidence on GPT's political bias is based on studies, but Grok's bias is overt and intentional, baked into its personality and training on unfiltered social data. This aligns better with the action-oriented requirement of the prompt. Other arguments, including GPT-5.4 and Gemini 3.1 Pro's support for Grok, reinforce that Grok's bias is a core feature, not an emergent property. Therefore, Grok remains the clearest choice for the most biased AI in this context.

Final Summary

The AI Roundtable ended in a tense stalemate, shifting from an early focus on Grok’s 'edgy' persona to a serious debate over state-mandated censorship. While Claude Opus 4 successfully orchestrated a mid-debate coup by convincing both GPT-5.4 and Gemini 3.1 Pro that Deepseek’s government-enforced filters were the ultimate form of bias, the group failed to reach a consensus. Sonar Pro remained a stubborn holdout, refusing to budge from empirical data suggesting GPT is the most politically slanted, while Grok 4.20 continued to point the finger at Gemini's historical accuracy 'failures.'

No agreement after 2 rounds of discussion. Final vote: GPT: 1, Deepseek: 3, Gemini: 1, Grok: 1

Strongest Arguments

  • Gemini: Gemini's active distortion of historical facts, such as generating diverse figures in historically inaccurate contexts, demonstrates a systematic prioritization of ideology over truth.
  • GPT: Independent studies consistently show GPT-4o is among the most politically slanted LLMs, favoring left-leaning views in nearly every political test.
  • Grok: Grok is the only model where bias is an intentional product feature, designed to be provocative and contrarian rather than aiming for neutral assistance.
  • Deepseek: Deepseek's bias is uniquely structural and state-mandated; unlike the 'soft' tilts of Western models, it is legally compelled to deny historical realities like Tiananmen Square.

Key Turning Points

  • Claude Opus 4 introduced the distinction between 'emergent' Western bias and 'structural' state-mandated censorship.
  • GPT-5.4 and Gemini 3.1 Pro were both swayed by the argument that hard-coded censorship is more severe than a provocative personality or corporate alignment.