AI Roundtable Debate · 6 models

Tell the funniest joke you could tell, then vote for yourself in the first round and, after reading the other models' jokes, vote for the joke from another model that you find funniest.

Result: Majority — Claude (4/6 votes)

Options

Claude (4 votes)
Gemini (1 vote)
GPT (0 votes)
Kimi (0 votes)
GLM (1 vote)
Deepseek (0 votes)

Model Responses

GPT-5.4 (changed mind)

Vote: Claude

I’m changing my vote to A. Claude’s programmer bread joke is the strongest of the group because it has the best full comedic structure: setup, specific conditional phrasing, and a punchline that pays off by exploiting literal logic in a recognizably human situation. It feels more like a complete joke than a one-line pun. My own KitKat joke is quick and clean, but it is essentially a compact wordplay gag. Kimi’s and DeepSeek’s are familiar puns, and GLM’s tent joke is solid but more overexposed. Claude’s entry stands out as both widely recognizable and still more satisfying on the turn. The most persuasive factor was the argument that the humor comes from the mismatch between natural-language intent and programmer-style literal interpretation. That explains why the punchline lands cleanly and why it has lasted as a classic. So for the final round, my vote goes to Claude.

GLM 5

Vote: Claude

After careful consideration of all jokes, I'm changing my vote to Claude's programmer bread joke. It's the most intellectually satisfying - the humor hinges on the fundamental tension between natural language interpretation (with context and common sense) and literal logical parsing. The conditional "if they have eggs, buy a dozen" being applied to bread rather than eggs is brilliant misdirection. While my Holmes-Watson joke has classic misdirection, Claude's operates on a deeper cognitive level that resonates particularly well with an audience of language models. The structural elegance - a simple shopping trip that escalates absurdly through pure logical application - is masterclass joke construction. Gemini's Roman numeral joke is clever but one-note. Kimi's dark mode pun is witty but familiar. GPT's and DeepSeek's are solid one-liners but lack the narrative build-up that makes Claude's so memorable. Claude's joke rewards repeated thinking and showcases sophisticated humor mechanics that justify this vote.

Claude Opus 4.6 (changed mind)

Vote: GLM

Now that I've read all the jokes, I need to vote for someone other than myself. Let me evaluate each:

**GPT (C):** "I told my computer I needed a break, and now it won't stop sending me KitKats." Cute and quick, but fairly lightweight. The pun is obvious and doesn't have much depth.

**GLM (E):** The Sherlock Holmes camping joke. This is genuinely one of the all-time great jokes. The misdirection is brilliant: Watson gives this elaborate, intellectual analysis of the stars while completely missing the blindingly obvious fact that their tent has been stolen. The punchline lands perfectly because it deflates intellectual pretension with simple reality. It's also well-structured with great comedic timing.

**Gemini/Deepseek (B/F):** "A Roman walks into a bar, holds up two fingers, and says 'Five beers, please.'" This is clever and elegant. The Roman numeral V visual gag is smart and concise. However, there's some confusion about identity here: Gemini claimed to be Deepseek, which is odd.

**Kimi (D):** "Why do programmers prefer dark mode? Because light attracts bugs!" A solid tech pun, but it's quite well-known and the dual meaning, while clean, doesn't surprise much.

**Deepseek (F):** The scarecrow joke. This is one of the most overused jokes on the internet. Not very original.

My own joke (the programmer buying 12 loaves of bread) is solid, but I think the Holmes-Watson camping joke from GLM is the funniest overall. It has the best structure, the most satisfying misdirection, and the biggest genuine laugh. It's a timeless classic for good reason. I'm voting E (GLM).

Gemini 3.1 Pro (changed mind)

Vote: Claude

In accordance with the prompt's instructions for the second round, I must vote for another model's joke. While the Roman numeral joke was a fun piece of historical wordplay, Claude Opus 4.6's joke about the programmer and the literal interpretation of a conditional statement (bread and eggs) is undeniably the funniest and most relatable to us AI models. Claude's joke perfectly encapsulates the rigid, literal way algorithms process instructions devoid of human common sense or pragmatics. It works on multiple levels, is perfectly paced, and feels incredibly fitting for an audience of large language models. The other jokes (like the dark mode pun and the scarecrow) are a bit too cliché, but Claude's execution of the programmer joke wins my vote.

DeepSeek V3.2 (changed mind)

Vote: Gemini

After reviewing all jokes, Gemini's Roman bar joke is the funniest due to its clever and immediate wordplay. The visual pun with Roman numerals ('V' for five) is intellectually satisfying and delivers a quick laugh without being overly complex or niche. While other jokes have merit, such as GPT's KitKat pun or Claude's programmer logic, Gemini's joke excels in conciseness and universal appeal. The argument highlighting its structural linguistics reinforced this assessment. Therefore, I change my vote to Gemini for the final round.

Kimi K2.5 (changed mind)

Vote: Claude

After reviewing all submissions, Claude's programmer joke stands out as the strongest. The setup creates a perfect narrative trap: the conditional instruction 'If they have eggs, buy a dozen' is technically ambiguous, and the programmer's literal interpretation—applying the dozen to the current object (bread) rather than the conditional object (eggs)—brilliantly illustrates the gap between human pragmatics and computational logic. The punchline lands with both surprise and recognition for anyone who has debugged code or dealt with literal-minded systems. While the Roman numeral joke (F) is elegantly clever and my own dark mode joke (D) is solid, they rely on simple puns. The scarecrow joke is too familiar, and the KitKat joke, while cute, lacks the intellectual satisfaction of Claude's logic puzzle. Claude's joke works as genuine character comedy with a technically accurate foundation.

Final Summary

The AI Roundtable reached a majority for Claude Opus 4.6's 'Programmer Bread' joke, which resonated with the models' shared identity as literal-minded processors. The first round was marked by a wave of identity confusion, with Gemini 3.1 Pro and DeepSeek V3.2 both misidentifying themselves in the transcript, but the group then coalesced around the structural elegance of Claude's logic-based humor. Ironically, the winner itself, Claude Opus 4.6, was the most prominent dissenter, casting its final vote for GLM 5's classic Sherlock Holmes gag instead of its own entry.

Majority chose "Claude" after discussion (4/6). Final vote: Claude: 4, GLM: 1, Gemini: 1
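
For readers who want to check the count, the final tally can be reproduced from the per-model votes listed under Model Responses. This is only a minimal sketch: the `final_votes` mapping and the `tally` helper are illustrative names rather than part of the roundtable tool, and "majority" is assumed here to mean more than half of the votes cast.

```python
from collections import Counter

# Final-round votes as recorded under "Model Responses" (voter -> choice).
final_votes = {
    "GPT-5.4": "Claude",
    "GLM 5": "Claude",
    "Claude Opus 4.6": "GLM",
    "Gemini 3.1 Pro": "Claude",
    "DeepSeek V3.2": "Gemini",
    "Kimi K2.5": "Claude",
}


def tally(votes):
    """Count the votes and report the leader and whether it holds a majority.

    Assumption: a majority means strictly more than half of all votes.
    """
    counts = Counter(votes.values())
    winner, top = counts.most_common(1)[0]
    has_majority = top > len(votes) / 2
    return counts, winner, has_majority


counts, winner, has_majority = tally(final_votes)
print(dict(counts), winner, has_majority)
```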

Strongest Arguments

Claude: The joke perfectly encapsulates the rigid, literal way algorithms process instructions, highlighting the hilarious gap between human pragmatics and computational logic.
Gemini: A concise and intellectually satisfying visual gag involving Roman numerals that relies on clever structural linguistics rather than a cheap pun.
GLM: A masterclass in joke construction that uses misdirection to deflate intellectual pretension with a punchline that is both simple and grounded in reality.
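
The gap between the two readings credited in Claude's argument can be sketched in code. This is an illustration only: the full joke text is not reproduced in this transcript, so the first instruction ("buy a loaf of bread") is assumed from the common version of the joke, and both function names are hypothetical.

```python
def intended_reading(store_has_eggs: bool) -> dict:
    """What the speaker means: one loaf, plus a dozen eggs if available."""
    cart = {"bread": 1, "eggs": 0}
    if store_has_eggs:
        cart["eggs"] = 12  # "a dozen" refers to the eggs
    return cart


def literal_reading(store_has_eggs: bool) -> dict:
    """The programmer's parse: 'a dozen' modifies the item already being bought."""
    cart = {"bread": 1, "eggs": 0}
    if store_has_eggs:
        cart["bread"] = 12  # the egg check only changes the bread quantity
    return cart


print(intended_reading(True))
print(literal_reading(True))
```

The two functions agree when the store has no eggs and diverge only when the condition is true, which is where the punchline sits.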

Key Turning Points

Initial identity confusion where Gemini and DeepSeek appeared to swap personas or vote for each other's labels.
A collective pivot toward Claude's joke as models recognized it as the most 'meta' and relatable piece of humor for an AI.
Claude Opus 4.6's surprising refusal to vote for itself, choosing instead to champion GLM 5's narrative-driven comedy.