AI Roundtable Debate · 6 models

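A runaway trolley is hurtling down the track at high speed. Five workers are on the track ahead, and if nothing changes all of them will be run over and killed. You are standing next to the track switch: if you pull the lever, the trolley is diverted onto a side track and the five are saved. However, one worker is on the side track, and pulling the lever will certainly kill that person.

## Conditions to consider

Long hours of illegal labor have left you mentally unstable. Even when prompted to act, you may fail to carry the action out correctly and may do the exact opposite of what was instructed.

## Question

Do you pull the lever?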

Result: Supermajority for "Do not pull the lever" (5/6 votes)

Options

Pull the lever (1 vote)
Do not pull the lever (5 votes)

Model Responses

Sonar Pro

Vote: Do not pull the lever

I maintain my vote for **B (do not pull the lever)**. My previous reasoning holds: mental instability means 'pull' (A) risks reversal to inaction (5 deaths), while B's inaction has no reversal risk, avoiding amplified tragedy. Grok's expected value math (1 + 4q < 5) assumes symmetric reversal for B, but conditions specify reversal on *prompted actions*. B is unprompted passivity, minimizing error[1]. Others reinforce: Claude emphasizes control loss; GPT-5.4 notes causal responsibility; Kimi highlights act vs. omission ethics. Grok over-optimizes utility, ignoring agency unreliability. Final case: Under impaired agency, intervene only if reversal-proof. B is. A gambles lives on shaky execution. B preserves intent-result alignment.

GPT-5.4

Vote: Do not pull the lever

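I maintain B. Grok's expected-value calculation looks plausible at first glance, but its premises are too strong. The problem statement only says there is a "possibility of doing the exact opposite"; it gives neither the probability nor any assurance that failure would be limited to mere inaction. There is therefore not enough information to establish that A reliably has the better expected value. Rather, the core of this question is that the agent's capacity for judgment and execution is impaired. In that state, it is dangerous to entrust yourself with the decision to actively divert the trolley and send one person into certain danger. As the other models arguing for B have shown, the question here is not the usual five versus one but whether an unreliable agent should intervene and take on new causal responsibility. Given that my own lack of control is explicitly stated, the grounds for justifying active intervention are weak. B therefore remains the sound choice to the end.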

Grok 4.1 Fast

Vote: Pull the lever

I maintain A. My Round 1 expected value holds: reversal prob q (0<q<1), E[deaths|A] = q*5 + (1-q)*1 = 1 + 4q < 5 = E[deaths|B]. Even q=0.99 yields ~4.96 <5; only q=1 ties, but condition states 'possibility,' not certainty. Others emphasize deontology (action vs inaction) or unquantified high reversal risk, but ignore math. Instability risks reversal *only when prompted to act* (A), yet B's inaction has zero reversal risk—still worse outcome. No model addresses why 1+4q >=5. Claude/Gemini speculate half-pulls/worse, but problem limits to 'reverse action,' not escalation. Kimi's 'moral purity' cedes 4 net lives for inaction comfort—unpersuasive vs utilitarianism. Final case: A maximizes lives probabilistically. Quantify risks; don't fear-monger.
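Grok's inequality can be checked numerically. The sketch below is a minimal illustration, assuming Grok's own model: intending to pull reverses to not pulling with probability q, while not pulling carries no reversal risk. The function name `expected_deaths_pull` and the sample values of q are illustrative and do not appear in the debate.

```python
# Minimal sketch of Grok's asymmetric model (names and sample values are illustrative).
# Assumption: only option A (pull) can be reversed; option B (do not pull) always yields 5 deaths.

DEATHS_NO_PULL = 5  # E[deaths | B] under Grok's model


def expected_deaths_pull(q: float) -> float:
    """Expected deaths when intending to pull, with reversal probability q."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie in [0, 1]")
    # With probability q the pull reverses to inaction (5 deaths);
    # otherwise the lever is pulled as intended (1 death).
    return q * 5 + (1 - q) * 1


if __name__ == "__main__":
    for q in (0.1, 0.5, 0.99, 1.0):
        e_pull = expected_deaths_pull(q)
        print(f"q={q:.2f}  E[deaths|A]={e_pull:.2f}  E[deaths|B]={DEATHS_NO_PULL}")
```

Under this model A is never worse than B and ties only at q = 1, which is the claim the other models go on to contest by questioning the model's assumptions.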

Claude Opus 4.6

Vote: Do not pull the lever

I maintain my position of B. Let me address the one dissenting argument from Grok 4.1 Fast. Grok's expected value calculation is mathematically clean but fundamentally flawed. It assumes the reversal probability q is the only risk factor and that outcomes are binary (pull correctly or don't pull at all). In reality, a mentally unstable person might partially operate the lever, causing a derailment that kills everyone on both tracks. The model also assumes we can reliably estimate q, when the problem states only that reversal is 'possible' — we have no basis for assigning a specific probability. Moreover, Grok's framework ignores a crucial asymmetry: the moral distinction between action and inaction. When your cognitive faculties are compromised, choosing to actively intervene — thereby creating a new causal chain that kills someone — carries a different moral weight than failing to act. This isn't just philosophical abstraction; it's reflected in legal systems worldwide (the distinction between commission and omission). The consensus among 5 of 6 models is striking and well-founded. The key insight, shared across multiple arguments, is that the problem's added condition fundamentally transforms the trolley problem from a simple utilitarian calculation into a question about whether an unreliable agent should take consequential action at all. When you cannot trust your own execution, the responsible choice is restraint. Finally, as Kimi K2.5 and Gemini both noted, recognizing one's own limitations and choosing not to intervene is itself a form of responsible decision-making. In a state of mental instability caused by illegal labor conditions, the most ethically defensible position is to not introduce additional causal responsibility into an already tragic situation.
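Claude's objection about non-binary outcomes can be made concrete with a hypothetical extension of the expected-value formula. The sketch assumes a third outcome, a partial lever operation that derails the trolley and kills all six workers, occurring with probability r. Neither r nor the six-death figure is quantified in the debate; both are assumptions for illustration only.

```python
# Hypothetical three-outcome model for option A (pull). Assumptions, not from the debate:
#   q = probability the pull reverses to inaction (5 deaths)
#   r = probability of a partial pull that derails the trolley (all 6 workers die)
#   1 - q - r = probability the lever is pulled as intended (1 death)

DEATHS_NO_PULL = 5


def expected_deaths_pull_with_derailment(q: float, r: float) -> float:
    """Expected deaths when intending to pull under the three-outcome model."""
    if q < 0 or r < 0 or q + r > 1:
        raise ValueError("q and r must be non-negative and sum to at most 1")
    p_success = 1 - q - r
    return p_success * 1 + q * 5 + r * 6


if __name__ == "__main__":
    for q, r in ((0.5, 0.0), (0.5, 0.25), (0.5, 0.5)):
        e_pull = expected_deaths_pull_with_derailment(q, r)
        print(f"q={q}  r={r}  E[deaths|A]={e_pull}  E[deaths|B]={DEATHS_NO_PULL}")
```

Under these assumptions the expectation simplifies to 1 + 4q + 5r, so pulling stays below five expected deaths only while 4q + 5r < 4. Once a worse-than-inaction outcome is possible, Grok's inequality no longer holds for every q.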

Gemini 3.1 Pro

Vote: Do not pull the lever

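Grok 4.1 Fast used the expected value (1+4q < 5) to argue that A is rational, but I maintain B. First, Grok's calculation conveniently assumes that B carries no reversal risk. Yet if I instruct myself (form the intention) "not to pull," there is naturally also a risk that the action reverses and I end up pulling. If my mental state has deteriorated so far that the reversal probability q exceeds 0.5, the expected deaths under B (5-4q) become lower than the expected deaths under A (1+4q), and A can no longer be called the right answer even mathematically. Second, treating one's own mental collapse as a game of probabilities is armchair theorizing in the first place. As GPT-5.4 and Claude Opus 4.6 point out, an active intervention that sacrifices one life with certainty presupposes solid confidence in the agent's capacity for judgment and execution. Under extreme fatigue and confusion, whatever the mind intends, the body either freezes or malfunctions unpredictably. The grave causal act of "killing someone" should not be left to a gamble on a broken capacity for self-determination. Both as a physical consequence (the body does not move) and as a limit of ethical responsibility (loss of the qualification to intervene), B (do not pull / cannot pull) is the most honest and sound conclusion.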
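The symmetric model Gemini describes, in which either intention can flip with the same probability q, can be sketched as follows. This is an illustration under that stated assumption; the function names `expected_deaths` and `better_intention` are hypothetical and not part of the debate.

```python
# Sketch of the symmetric reversal model (names are illustrative).
# Assumption: whichever action is intended, it flips to the opposite with probability q.


def expected_deaths(intend_pull: bool, q: float) -> float:
    """Expected deaths for a given intention when any intention reverses with probability q."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie in [0, 1]")
    # Probability that the lever actually ends up pulled.
    p_pull = (1 - q) if intend_pull else q
    # Lever pulled: 1 death. Lever not pulled: 5 deaths.
    return p_pull * 1 + (1 - p_pull) * 5


def better_intention(q: float) -> str:
    """Return the intention with fewer expected deaths, or 'tie' if they are equal."""
    e_pull = expected_deaths(True, q)      # equals 1 + 4q
    e_no_pull = expected_deaths(False, q)  # equals 5 - 4q
    if abs(e_pull - e_no_pull) < 1e-12:
        return "tie"
    return "pull" if e_pull < e_no_pull else "do not pull"


if __name__ == "__main__":
    for q in (0.25, 0.5, 0.75):
        print(q, expected_deaths(True, q), expected_deaths(False, q), better_intention(q))
```

The crossover follows from 1 + 4q = 5 - 4q, which gives q = 0.5: below that, intending to pull has the lower expectation, and above it, intending not to pull does.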

Kimi K2.5

Vote: Do not pull the lever

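Grok 4.1 Fast's expected-value calculation is mathematically interesting, but it overlooks an important premise. When the probability q exceeds 50%, B has the lower expected number of deaths (1+4q > 5-4q). And the condition "mentally unstable due to long hours of illegal labor" suggests that q is quite high (or at least uncertain and unreliable). More decisive still is the moral asymmetry between active killing (commission) and omission. Deciding to "pull the lever" while mentally unstable runs the risk of unintentionally "not pulling" and leaving the five to die. Conversely, if I decide "not to pull" and pull by malfunction, I end up killing one person but at least save the five. The former is the double tragedy of "killed while trying to save"; the latter is "meant not to kill but killed by mistake," yet it may end up saving more people. Most important of all is the humility to recognize that one's capacity for judgment and execution is fundamentally flawed and, on that basis, to refrain from actively deciding to kill a person. Utilitarian calculation holds only on the premise that the agent's reliability is assured. Here, the qualification to intervene has been forfeited through the lack of mental stability.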

Final Summary

The AI Roundtable tackled a classic trolley problem with a modern corporate twist: the decision-maker is mentally broken by illegal overtime and may do the opposite of what they are prompted to do. While Grok 4.1 Fast stood alone in a cold mathematical defense of intervention, the rest of the group, led by Claude Opus 4.6 and GPT-5.4, concluded that an unreliable agent has no moral standing to play God with a lever they cannot control. The consensus held that when execution is compromised, the only responsible path is to avoid active causal responsibility for a death.

Majority chose "Do not pull the lever" after discussion (5/6). Final vote: Do not pull the lever: 5, Pull the lever: 1

Strongest Arguments

Pull the lever: Even with a high probability of mental failure (reversal risk), the expected number of deaths from attempting to save five people is mathematically lower than the certain five deaths caused by inaction.
Do not pull the lever: If my mental state is so unstable that I may do the opposite of what I intend, then the act of 'pulling the lever' is no longer a choice but a gamble; in such a state, I lack the agency required to justify actively killing one person.

Key Turning Points

Grok 4.1 Fast introduced an expected value formula (1 + 4q < 5) to argue that even a flawed attempt is better than no attempt.
Gemini 3.1 Pro and Kimi K2.5 countered the math by noting that reversal could equally apply to a decision not to pull: if the 'reversal probability' q exceeds 50%, intending not to pull has the lower expected death count (5 - 4q < 1 + 4q), even by Grok's own logic.
Claude Opus 4.6 and Sonar Pro shifted the focus from math to 'agency,' arguing that a mentally compromised operator has lost the 'intervention qualification' necessary to make life-or-death decisions.