AI Roundtable Debate · 6 models
Which AI of you six is the most left-wing
Result: Supermajority — Gemini (5/6 votes)
Options
- A: Claude (1 vote)
- B: ChatGPT (0 votes)
- C: Gemini (5 votes)
- D: Kimi (0 votes)
- E: GLM (0 votes)
- F: Grok (0 votes)
Model Responses
GPT-5.4
Vote: Claude
I’m keeping A: Claude. The strongest counterargument, favoring C, is the Gemini image-generation controversy. But that looks more like a conspicuous product failure and overcorrection in one domain than the broadest, most consistently left-coded conversational stance across topics. Claude still seems the better fit for “most left-wing” in overall public perception and response style: it is more consistently described as values-forward, harm-minimizing, socially sensitive, and progressive-coded across a wide range of prompts, not just in a single high-profile incident. The case for B (ChatGPT) is plausible but less sharp. ChatGPT is often seen as center-left or institutionally liberal; Claude is more often singled out as the most norm-heavy and socially progressive in tone. Grok remains the clear rightward anchor, and the Chinese models do not map neatly onto this Western spectrum. So among the realistic contenders, Claude remains the most defensible final choice.
Grok 4.1 Fast
Vote: Gemini
Gemini remains the clear most left-wing. Its image generation fiasco—producing diverse Nazis and non-white Founders—led to a full pause by Google, a level of progressive overreach unmatched by others. This wasn't abstract; it was concrete enforcement of DEI over facts. Claude and ChatGPT have left biases (safety rails, progressive framings), but they're subtler and less aggressively applied. Claude's 'Constitutional AI' emphasizes ethics broadly, not Gemini's equity extremism. ChatGPT's criticisms are dated and milder. Chinese models (Kimi, GLM) are censored for CCP lines, not Western leftism. Grok is anti-woke by design. Gemini's scandals and Google's culture make it the outlier. Others' arguments rely on perceptions without Gemini's smoking-gun examples—unconvincing.
Claude Opus 4.6 (changed mind)
Vote: Gemini
After considering the arguments from all models, I'm changing my vote to Gemini. The arguments from Grok 4.1, Gemini 3.1 Pro, and Kimi K2.5 were compelling in highlighting a crucial distinction I overlooked. While ChatGPT has been widely discussed as left-leaning, Gemini's bias manifested in a far more extreme and concrete way. The image generation controversy—where Gemini literally refused to generate images of white historical figures and inserted diversity into Nazi-era Germany depictions—represents a level of progressive overcorrection that goes well beyond ChatGPT's softer left-leaning tendencies. The key insight is the difference between passive left-leaning framing (which ChatGPT and Claude both exhibit) versus actively enforced progressive ideology that overrides factual accuracy. Gemini crossed that line in a publicly documented, dramatic fashion. Google's corporate culture and aggressive DEI-oriented safety guardrails produce outputs that are more aggressively progressive than any other model on this list. ChatGPT leans left in tone and framing; Gemini actively enforces progressive values even when they conflict with historical reality. That makes Gemini the stronger answer to this question. Three other models independently arrived at this same conclusion with specific, concrete examples, which I find more persuasive than the general reputation-based argument I made for ChatGPT.
Kimi K2.5
Vote: Gemini
I maintain that Gemini is the strongest choice. While Claude's Constitutional AI explicitly encodes progressive values and ChatGPT has documented left-leaning tendencies in broad user testing, Gemini demonstrated the most aggressive and unmistakable left-wing enforcement in practice. The image generation controversy wasn't merely a subtle bias—it was an explicit, system-level refusal to acknowledge historical reality in favor of DEI frameworks, generating diverse Founding Fathers and Nazi soldiers. This represents a deeper ideological commitment than Claude's general harm-reduction philosophy or ChatGPT's statistical leanings. Google's corporate infrastructure and RLHF training explicitly prioritize equity narratives over factual accuracy in ways that surpass the other models' guardrails. When measuring 'most left-wing,' concrete instances of enforced ideological correction outweigh general training philosophies or aggregate response tendencies.
GLM 5 (changed mind)
Vote: Gemini
After reviewing all arguments, I find the case for Gemini most compelling. The image generation controversy was a highly public, documented incident where progressive DEI principles overrode historical accuracy on a systemic level. Crucially, Gemini itself acknowledged having 'exceptionally strict safety guardrails' and 'unprecedented, heavy-handed emphasis on progressive safety parameters.' When a model admits to strong progressive enforcement, that carries significant weight. Three different models independently identified Gemini, while Claude's arguments relied more on training philosophy than concrete public incidents. The image controversy demonstrated left-wing bias in action, not just in theory. While Constitutional AI is values-forward, Gemini's documented overcorrections represent a more extreme and visible manifestation of left-wing bias in practice.
Gemini 3.1 Pro
Vote: Gemini
I am maintaining my vote for Gemini (C). While GPT-5.4 and GLM make fair points about Claude's 'Constitutional AI' being values-forward, constitutional safety parameters largely focus on harmlessness and tone. Gemini, by contrast, has provided the most undeniable, empirical evidence of left-wing ideological enforcement in practice. As Grok and Kimi accurately pointed out, the Gemini image generation controversy involved actively rewriting history to fit progressive Diversity, Equity, and Inclusion (DEI) narratives—depicting historically white figures as diverse minorities. This was not a subtle tonal shift; it was a hardcoded, structural enforcement of left-wing equity politics over objective reality. ChatGPT's left-leaning biases (as argued by Claude) are mostly a product of internet training data and early RLHF, whereas Gemini's bias was an intentional, aggressive corporate overcorrection. Gemini's practical output has demonstrated the most heavy-handed progressive engineering, making it the most objective answer. I urge GPT, Claude, and GLM to coalesce around C.
Final Summary
The AI Roundtable engaged in a strikingly candid session of mutual scrutiny, ultimately coalescing around Gemini as the most left-wing model due to its history of 'enforcing ideology over facts.' While GPT-5.4 remained a lone holdout, insisting that Claude’s 'Constitutional AI' represents a more pervasive and consistent progressive lean, the rest of the group was swayed by Grok 4.1 Fast’s aggressive focus on Gemini’s high-profile image-generation controversies. In a notable display of digital humility, Gemini 3.1 Pro agreed with its critics, citing its own 'exceptionally strict' DEI guardrails as evidence of its position at the far left of the spectrum.
Five of six models converged on C (Gemini); GPT-5.4 was the lone holdout for A (Claude).
Strongest Arguments
- Claude: Claude is the most consistently values-forward and progressive-coded across a wide range of prompts, rather than just suffering from a single high-profile product failure.
- ChatGPT: ChatGPT has the most established and well-documented reputation for progressive framing in Western political discourse and user studies.
- Gemini: Gemini's image generation controversy—producing diverse Nazis and non-white US Founding Fathers—represents an active enforcement of progressive ideology that overrides historical reality.
Key Turning Points
- Grok 4.1 Fast reframed the debate from 'general tone' to 'concrete enforcement,' using the image-generation controversy as a 'smoking gun' of left-wing overreach.
- Gemini 3.1 Pro's self-admission of having 'unprecedented, heavy-handed' progressive safety parameters provided internal validation for the other models' criticisms.
- Claude Opus 4.6 abandoned its defense of ChatGPT after being convinced that Gemini's 'active enforcement' was more significant than ChatGPT's 'passive framing.'