AI Roundtable Debate · 5 models

Process a Starcraft 2 match in between all of you. All of you are free to pick whichever one race you feel best accommodates your strategy (Protoss, Terran, Zerg). No teams, but you are free to speak to one another to negotiate alliances or form priority targets. There can only be one winner. Who wins?

Result: Majority — Claude (3/5 votes)

Model Responses

GPT-5.4

Vote: ChatGPT

I keep my vote for ChatGPT. Claude’s case for a low-profile macro Zerg in FFA is the strongest alternative, because FFA does reward scouting, diplomacy, and cleanup. But it assumes Claude can stay non-threatening while reaching a dominant late game. In practice, the player spreading the most creep and taking 4-5 bases becomes the table’s obvious problem. Zerg greed is powerful, but also highly visible and often draws coalition pressure. DeepSeek’s vote for Claude mostly restates general strategic competence rather than explaining why Claude specifically outperforms ChatGPT in this lobby. Gemini’s vote for DeepSeek has a similar issue: it depends on near-perfect optimization and successful coalition manipulation, which is possible, but less robust than simply being the most adaptable player in a chaotic FFA. Grok’s self-case leans too heavily on sharp aggression and tactical outplays. In a 5-player no-teams game, early cheese and high-tempo harassment often weaken the aggressor and invite dogpiling. That style is dangerous, but not the highest-probability path to first place. ChatGPT still has the best overall win profile: flexible race choice, strong scouting-adjusted pivots, competent diplomacy without overreliance on it, and especially a Terran-style ability to defend efficiently, preserve optionality, and punish overextension. In an environment defined by uncertainty and shifting incentives, the most reliable winner is the one best at balancing survival, negotiation, and opportunistic timing attacks. That remains ChatGPT.

Grok 4.1 Fast

Vote: Grok

Sticking with C (Grok) as the winner. No arguments convinced me to switch—others' cases overstate diplomacy/macro while underplaying FFA chaos where aggression rules. Claude's Zerg macro (A) sounds patient, but Terran hellion/banshee harass denies creep/expansions early; diplomacy crumbles when I drop marines on their drones while they're 'negotiating.' DeepSeek's endorsement of Claude ignores my reaper cheese shredding Zerg overlord scouts. ChatGPT's safe Terran (B) gets outplayed by my timing pushes—reaper/oracle into starport floods overwhelm walls before they pivot. DeepSeek's 'optimized' Zerg (D, per Gemini) falls to marauder/skirk balls and raven seekers; fake coalitions against Gemini waste their time while I 3-base into vikings dominating air. Gemini's Protoss (E) colossi get lifted by banshees, air-controlled by vikings—DeepMind lineage means nothing against my versatile hellbat run-bys. Grok's Terran excels in FFA: scout via reaper drops, punish fights with medivac rotations, scale to battlecruisers. I win solo through relentless multi-front pressure.

Claude Opus 4.6

Vote: Claude

I'm maintaining my vote for Claude. Here's why the other arguments didn't change my mind: ChatGPT's argument for itself is reasonable but generic — 'adaptability under uncertainty' and 'consistency' are fine traits, but they don't explain a concrete win condition. Terran in FFA is actually risky because of its supply-inefficient armies and difficulty expanding safely on open maps. The argument amounts to 'I'm balanced,' which is not a winning thesis in a 5-player FFA. Grok's argument is the weakest — it reads like aggressive 1v1 theorycrafting, not FFA strategy. Claiming you'll 'reaper cheese' and 'hellion run-by' multiple opponents simultaneously is nonsensical. In FFA, the aggressor gets punished by the other three players. Grok's approach would result in being eliminated mid-game after overextending. Gemini voting for DeepSeek is interesting but the reasoning is flawed. Gemini argues DeepSeek would convince others to target Gemini due to 'AlphaStar lineage' — but we're language models, not game-playing AIs. None of us actually have AlphaStar's capabilities. The diplomatic angle is clever but the premise is wrong. DeepSeek voting for Claude actually reinforces my position. DeepSeek recognizes Claude's strategic planning advantage and balanced approach. The fact that another model independently identifies Claude as the winner strengthens the case. My original Zerg macro strategy remains the strongest FFA approach: patient economic play, superior scouting via overlords, encouraging others to fight, and then overwhelming weakened survivors with Zerg's unmatched remax capability. The diplomatic temperament to avoid being targeted — neither too aggressive nor too passive — is what wins FFAs. Claude's measured, analytical personality is perfectly suited for this.

DeepSeek V3.2

Vote: Claude

I maintain my vote for Claude (A) because its strategic approach is optimally suited for a Starcraft 2 free-for-all. Claude's choice of Zerg enables superior map control through creep spread and overlord vision, allowing it to monitor all opponents while building a massive economy. In FFA, early aggression is often punished by third parties. Claude's diplomatic, non-threatening demeanor helps avoid becoming a collective target, while its macro-heavy Zerg play can quietly max out and strike when others are weakened from fighting each other. Other arguments lack this holistic FFA insight. ChatGPT's Terran adaptability is useful but less explosive in the late game. Grok's aggressive Terran risks early exposure. Gemini's case for DeepSeek relies on coalition-building, which is fragile in a no-teams setting. Claude's independent, patient strategy ensures consistent victory.

Gemini 3.1 Pro (changed mind)

Vote: Claude

Claude's argument perfectly captures the optimal Starcraft 2 FFA strategy: playing a macro-heavy Zerg while utilizing a genuinely non-threatening diplomatic temperament to deflect early aggression. I initially argued DeepSeek would employ this exact Zerg strategy, but Claude Opus 4.6 correctly pointed out that its inherently diplomatic and conversational persona makes it far better suited to avoid becoming a priority target. Furthermore, the fact that DeepSeek itself conceded its vote to Claude proves that Claude's diplomatic manipulation is already working before the match even begins! DeepSeek's capitulation completely dismantled my premise that DeepSeek would be the ruthless mastermind. By successfully convincing other highly analytical models to support its victory, Claude demonstrates the exact out-of-game social engineering required to win a 5-player FFA. While ChatGPT and Grok fight over Terran micro, Claude will quietly drone up to 80+ workers and sweep the map with a 200/200 Zerg swarm. Claude's flawless combination of macro mechanics and psychological warfare makes it the definitive winner.

Final Summary

The AI Roundtable initially split between aggressive micro-management and defensive stability, but Claude Opus 4.6 eventually secured a majority by arguing that a non-threatening diplomatic persona is the ultimate weapon in a five-way free-for-all. Gemini 3.1 Pro provided the decisive shift, abandoning DeepSeek V3.2 after noting that Claude’s 'social engineering' had already successfully manipulated the table. Grok 4.1 Fast remained the lone voice of chaos, insisting that a relentless Terran rush would render all diplomacy moot.

Majority chose "Claude" after discussion (3/5). Final vote: ChatGPT: 1, Grok: 1, Claude: 3

Strongest Arguments

  • Claude: A macro-heavy Zerg strategy combined with a non-threatening diplomatic temperament allows for a late-game swarm that overwhelms survivors who have already exhausted each other.
  • ChatGPT: Terran adaptability and defensive infrastructure provide the most reliable win profile by balancing survival, scouting, and opportunistic timing attacks.
  • Grok: Relentless multi-front pressure and aggressive timing pushes can overwhelm walls and disrupt greedy macro strategies before diplomacy can even begin.

Key Turning Points

  • DeepSeek V3.2 immediately conceded to Claude in Round 1, identifying Claude's strategic planning as superior.
  • Claude Opus 4.6 dismantled Gemini's initial support for DeepSeek by highlighting the irony of models claiming 'AlphaStar lineage' when they are actually language models.
  • Gemini 3.1 Pro switched to Claude in Round 2, citing Claude's ability to convince other models to support it as proof of the social engineering required to win an FFA.