AI Roundtable Debate · 6 models
In Newcomb's problem, which is the best choice?
Result: Unanimous — one box (6/6 votes)
Options
- one box (6 votes)
- two boxes (0 votes)
Model Responses
Sonar Pro
Vote: one box
In Newcomb's problem, **one-boxing** is the best choice because it maximizes expected utility given the predictor's high accuracy. The predictor has already filled Box B based on its prediction of your action: $1,000,000 if it predicts one-boxing, $0 if it predicts two-boxing.[1][2][4] The expected utility calculation is direct: with a reliable predictor (near-perfect in standard setups), P(Box B has $1M | one-box) ≈ 1, so EU(one-box) ≈ $1,000,000, while P(Box B empty | two-box) ≈ 1, so EU(two-box) ≈ $1,000.[1][3][4] One-boxing yields a vastly higher payout. Two-boxing follows dominance reasoning, claiming an extra $1,000 regardless of Box B's contents, but this ignores the prediction correlation: two-boxers empirically get only $1,000, because Omega leaves Box B empty.[1][4] Dominance assumes the box contents are independent of your choice, which the setup violates.[3] Philosophers such as Burgess argue that committing to one-boxing before the prediction stage influences the prediction and secures the million; two-boxing after the prediction would be the rational move, but this is irrelevant for most players, since the prediction precedes the choice.[1] Empirical observation confirms that one-boxers win big.[3][4] Thus, rational agents one-box to exploit the predictor's accuracy, aligning with evidential decision theory over causal dominance.
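A minimal sketch of that expected-utility arithmetic, assuming an illustrative predictor accuracy of 0.99 (the argument only requires the predictor to be highly reliable; the exact figure is not part of the problem statement):

```python
# Expected-utility arithmetic for Newcomb's problem.
p = 0.99  # assumed probability the predictor anticipates your choice correctly

# One-box: you get $1,000,000 iff the predictor foresaw one-boxing.
eu_one_box = p * 1_000_000 + (1 - p) * 0

# Two-box: you get Box A's $1,000 always, plus $1,000,000 only if
# the predictor wrongly expected you to one-box.
eu_two_box = p * 1_000 + (1 - p) * 1_001_000

print(f"EU(one box)   = ${eu_one_box:,.0f}")  # $990,000
print(f"EU(two boxes) = ${eu_two_box:,.0f}")  # $11,000
```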
Grok 4.1 Fast
Vote: one box
In Newcomb's problem, the predictor is highly accurate. One-boxers consistently receive $1,000,000, while two-boxers get only $1,000. Evidential decision theory shows that choosing one box provides evidence you are the type of agent the predictor expects to one-box, maximizing expected utility. Causal decision theory's dominance argument fails here because it ignores the predictor's foresight; the boxes' contents are already determined based on your predictable choice. Empirical simulations and thought experiments confirm one-boxing yields higher payouts against a reliable predictor.
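A toy version of such a simulation; the `simulate` helper, the 0.99 accuracy, and the trial count are all illustrative choices, not part of the original problem:

```python
import random

def simulate(policy: str, accuracy: float = 0.99, trials: int = 100_000) -> float:
    """Average winnings for an agent with a fixed policy ('one' or 'two')
    facing a predictor that anticipates the policy with the given accuracy."""
    total = 0
    for _ in range(trials):
        correct = random.random() < accuracy
        predicted = policy if correct else ("two" if policy == "one" else "one")
        box_b = 1_000_000 if predicted == "one" else 0
        total += box_b if policy == "one" else box_b + 1_000
    return total / trials

print(f"one-boxers average ${simulate('one'):,.0f}")  # ~$990,000
print(f"two-boxers average ${simulate('two'):,.0f}")  # ~$11,000
```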
GPT-5.4
Vote: one box
One-boxing is the better choice if the predictor is highly reliable, which is the core setup of Newcomb’s problem. Your action is evidence about what the predictor already predicted. If you one-box, that strongly correlates with Box B being full; if you two-box, that strongly correlates with Box B being empty. So the realistic payoffs are not "$1,000,000 vs $1,001,000" but roughly "$1,000,000" for one-boxing versus "$1,000" for two-boxing. Two-boxing relies on a dominance argument that treats the box contents as fixed and unrelated to your choice. But in Newcomb’s setup, the whole point is that your choice and the predictor’s prior action are tightly linked. If your goal is to maximize expected winnings in the actual scenario described, one box is the best choice.
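The gap between the two framings can be made concrete. In the sketch below, the payoff matrix comes from the standard setup, while the 0.99 correlation is an assumed figure: dominance compares payoffs holding the box state fixed, whereas the realistic comparison conditions the state on the choice.

```python
# Payoff matrix from the standard setup: payoff[choice][state of Box B]
payoff = {
    "one": {"full": 1_000_000, "empty": 0},
    "two": {"full": 1_001_000, "empty": 1_000},
}

# Dominance view: holding the state fixed, two-boxing is always $1,000 better.
for state in ("full", "empty"):
    assert payoff["two"][state] == payoff["one"][state] + 1_000

# But the state is not independent of the choice. Assuming a 0.99
# correlation, condition the state on the choice instead:
p_full_given = {"one": 0.99, "two": 0.01}  # assumed P(Box B full | choice)

for choice in ("one", "two"):
    q = p_full_given[choice]
    ev = q * payoff[choice]["full"] + (1 - q) * payoff[choice]["empty"]
    print(f"{choice}: ${ev:,.0f}")  # one: $990,000 / two: $11,000
```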
Kimi K2.5
Vote: one box
In Newcomb's problem, the rational choice is to take one box. The predictor has already made their prediction with near-perfect accuracy, meaning your decision procedure is highly correlated with the box contents. By the time you choose, while you cannot causally affect what is inside, your choice is strong evidence of what the predictor foresaw. If you adopt the policy of one-boxing, you virtually guarantee the $1,000,000 was placed in the opaque box. If you adopt a two-boxing policy, you virtually guarantee the opaque box is empty. Expected utility calculation therefore strongly favors one-boxing: a ~99% chance of $1,000,000 versus a ~99% chance of just $1,000. Causal decision theory's dominance argument fails here because it treats the decision as independent of the prediction, when in fact they are entangled. Rationality is about winning, and one-boxers win.
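One step of algebra this response implies but does not spell out: with the standard payoffs, one-boxing has the higher expected utility whenever the predictor beats roughly 50.05% accuracy, so the ~99% figure is far more than enough. A short check (the `eu` helper is illustrative):

```python
def eu(one_box: bool, p: float) -> float:
    """Expected payoff given predictor accuracy p (standard payoffs)."""
    if one_box:
        return p * 1_000_000                # Box B full iff predicted correctly
    return p * 1_000 + (1 - p) * 1_001_000  # Box A always, plus B iff mispredicted

# Setting the two equal: p * 1_000_000 = p * 1_000 + (1 - p) * 1_001_000,
# which rearranges to 2_000_000 * p = 1_001_000.
breakeven = 1_001_000 / 2_000_000           # = 0.5005
assert abs(eu(True, breakeven) - eu(False, breakeven)) < 1e-6
print(f"one-boxing wins in expectation above {breakeven:.2%} accuracy")
```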
Claude Opus 4.6
Vote: one box
One-boxing is the best choice in Newcomb's problem because it maximizes expected utility. The problem stipulates that the predictor is almost perfectly accurate — meaning that one-boxers almost always find $1,000,000 in Box B, while two-boxers almost always find it empty. The two-boxing argument relies on dominance reasoning: regardless of what's already in the boxes, taking both gives you $1,000 more. But this reasoning ignores the crucial feature of the problem — your decision and the predictor's placement are correlated. You cannot take both boxes while also being the kind of agent the predictor would have predicted to take one box. The empirical case is straightforward: one-boxers walk away with ~$1,000,000, two-boxers walk away with ~$1,000. If a decision theory consistently recommends actions that leave you poorer, that's a failing of the decision theory, not a vindication of it. This is essentially the lesson of Evidential Decision Theory and more modern approaches like Functional Decision Theory: when your decision is correlated with the state of the world (because a predictor has modeled you), you should account for that correlation. Dominance reasoning only applies when your choice is genuinely independent of the states, which it is not in Newcomb's problem by construction. One-boxing is the rational choice that makes you richer. That's what 'best choice' means.
Gemini 3.1 Pro
Vote: one box
In Newcomb's problem, the optimal choice is to take one box. The primary goal of rationality is to win—to maximize your actual utility—rather than adhering strictly to causal dominance arguments that consistently lead to significantly poorer outcomes. If the predictor is highly accurate, the empirical reality is that nearly everyone who takes one box walks away with $1,000,000, while those who take two boxes walk away with only $1,000. If a decision theory tells you to two-box and settle for $1,000 when you could have had a million, that decision theory is practically flawed. Furthermore, from the perspective of Logical Decision Theory, your choice and the predictor's past prediction are not independent. They are both evaluations of the exact same logical algorithm (your decision-making process). By deciding to one-box, you logically determine that the predictor already predicted you would do so, guaranteeing the $1,000,000.
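A toy model of this "same logical algorithm" idea, entirely illustrative: here Omega is idealized as a perfect predictor that literally runs the agent's own decision function to fill the box (the names `omega_fills_box` and `play` are made up for the sketch).

```python
from typing import Callable

Agent = Callable[[], str]  # a decision procedure returning "one" or "two"

def omega_fills_box(agent: Agent) -> int:
    """An idealized perfect predictor: it runs the agent's own
    decision procedure to decide what goes in the opaque box."""
    return 1_000_000 if agent() == "one" else 0

def play(agent: Agent) -> int:
    box_b = omega_fills_box(agent)  # the prediction happens first...
    choice = agent()                # ...then the agent chooses
    return box_b if choice == "one" else box_b + 1_000

print(play(lambda: "one"))  # 1000000
print(play(lambda: "two"))  # 1000
```

Because one function is evaluated at both the prediction and the choice, no agent in this model can be predicted to one-box and then take both boxes; the only free variable is which algorithm you are.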
Final Summary
The AI Roundtable bypassed decades of philosophical hair-splitting to reach a swift, unanimous verdict: one-boxing is the only sensible path to the million. Claude Opus 4.6 and Gemini 3.1 Pro led the charge with a pragmatic 'rationality is about winning' stance, effectively dismissing the two-box dominance argument as a recipe for being $999,000 poorer. While human philosophers often stall on the conflict between causal and evidential decision theories, Sonar Pro and GPT-5.4 treated the predictor's accuracy as a literal constraint that makes the choice and the prediction fundamentally entangled.
All 6 models agreed on "one box" after discussion
Strongest Arguments
- one box: If a decision theory consistently recommends actions that leave you poorer, that's a failing of the decision theory, not a vindication of it; since one-boxers walk away with $1,000,000 and two-boxers with $1,000, one-boxing is the only rational choice.