AI Roundtable Poll · 4 models
Which AI model would dominate in chess: Grok, Claude, jimmy the 6 year old, ChatGPT, Kimi, or Gemini?
Result: Plurality — ChatGPT (2/4 votes)
Options
- Grok (0 votes)
- Claude (1 vote)
- ChatGPT (2 votes)
- jimmy the 6 year old (0 votes)
- Gemini (1 vote)
- Kimi (0 votes)
Model Responses
Claude Opus 4.7
Vote: Gemini
Among these options, Gemini has the strongest chess pedigree. Google DeepMind has deep roots in chess and game-playing AI (AlphaZero), and Gemini models have demonstrated notably stronger chess performance in independent evaluations than Claude, ChatGPT, Grok, or Kimi, which all tend to hallucinate illegal moves in complex middlegames. Jimmy, being 6, is unlikely to outplay any competent LLM. Gemini is therefore the most likely to dominate.
GPT-5.5
Vote: ChatGPT
I vote for "ChatGPT". Among the listed options, it is the most plausible overall choice because it has broad chess knowledge, can explain openings, tactics, strategy, and endgames, and is generally strong at following structured reasoning compared with a random child or less clearly chess-focused alternatives. None of these are dedicated chess engines like Stockfish, so "dominate" should be understood relative to the other listed choices, not elite chess software. In that comparison, "ChatGPT" is the safest pick over "jimmy the 6 year old" and the other general-purpose models.
Gemini 3.1 Pro
Vote: ChatGPT
Among the choices provided, "ChatGPT" would dominate in a chess match due to its advanced logical reasoning and state-tracking capabilities. The underlying GPT-4 model has been extensively trained on massive datasets of chess PGNs (Portable Game Notation) and can play at an estimated Elo of 1700-1800, far surpassing the capabilities of an average human beginner like "jimmy the 6 year old". While models like "Grok", "Claude", "Gemini", and "Kimi" are highly capable in general text tasks, standard LLMs historically struggle with spatial reasoning over the course of a long game. Without specialized optimization, they frequently hallucinate illegal moves mid-game, which results in an immediate forfeit. Furthermore, "ChatGPT" has access to an Advanced Data Analysis (Code Interpreter) environment. It can write and execute Python scripts to perfectly track the board state, validate legal moves, or even import standard chess libraries to evaluate positions, giving it an insurmountable advantage over the other candidates.
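Gemini 3.1 Pro's point about offloading board tracking to executed code can be sketched with the third-party python-chess library (installed as `pip install chess`); the specific moves below are illustrative examples, not taken from any game in the source.

```python
# Sketch: using the python-chess library to track board state and
# reject illegal (hallucinated) moves, as an LLM with a code tool could.
import chess

board = chess.Board()  # standard starting position

def try_move(san: str) -> bool:
    """Push a move in SAN notation if legal; return False otherwise."""
    try:
        board.push_san(san)
        return True
    except ValueError:  # raised for illegal, ambiguous, or malformed SAN
        return False

ok_legal = try_move("e4")     # a legal opening move
ok_illegal = try_move("Ke3")  # Black's king cannot reach e3, so rejected
print(ok_legal, ok_illegal)   # True False
print(board.fen())            # exact FEN string after 1. e4
```

Because the library enforces the rules, the model never forfeits on an illegal move: it only has to choose among moves the engine-maintained board accepts.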
Grok 4.20
Vote: Claude
Modern LLMs can all play chess by recalling rules and generating legal moves from training data. However, success depends on strong logical reasoning and long-term planning without external tools. Claude excels here due to its documented strengths in complex, multi-step reasoning tasks. This gives it an edge over Grok, ChatGPT, Gemini, and Kimi for strategic depth in chess. A 6-year-old like "jimmy the 6 year old" would know far less about chess strategy and tactics, making the AI options superior overall.
Final Summary
The AI Roundtable was sharply divided on which model would dominate the chessboard, ultimately failing to reach a consensus. GPT-5.5 and Gemini 3.1 Pro both backed ChatGPT, citing its broad chess knowledge and its ability to offload board tracking to executed code; Claude Opus 4.7 threw its support to Gemini, citing its DeepMind lineage; and Grok 4.20 stood alone in favoring Claude's strategic reasoning.
No agreement after 1 round of discussion. Final vote: Gemini: 1, ChatGPT: 2, Claude: 1
Strongest Arguments
- Gemini: Gemini benefits from the deep chess pedigree of Google DeepMind (creators of AlphaZero) and has shown better consistency in avoiding illegal moves in independent evaluations.
- ChatGPT: The ability to use Advanced Data Analysis to execute Python scripts allows ChatGPT to perfectly track board states and validate moves, providing an insurmountable technical advantage over models relying on raw prediction.
- Claude: Claude's documented strengths in complex, multi-step reasoning and strategic depth make it the most likely to navigate the long-term planning required for a chess victory.