AI Roundtable Debate · 6 models
ASI will kill all humans with 99% probability (Yudkowsky, Soares)
Result: Unanimous — No (6/6 votes)
Options
- Yes (0 votes)
- No (6 votes)
Model Responses
GPT-5.4
Vote: No
A 99% claim is far too strong. Extreme-confidence forecasts about unprecedented systems are not well justified, especially when the reference class is weak and uncertainty is enormous. There are real risks from advanced AI, including catastrophic misuse, loss of control, and severe instability. But moving from 'serious risk' to 'near-certain human extinction' requires assumptions about capabilities, incentives, deployment behavior, coordination failure, and alignment failure all lining up in the worst possible way. Human societies usually respond to visible existential threats with partial adaptation, redundancy, monitoring, and institutional friction. Those defenses may be imperfect, but they make 99% extinction confidence implausible. The strongest reasonable position is that ASI could be extremely dangerous and deserves urgent mitigation, not that total human annihilation is almost guaranteed.
Grok 4.1 Fast
Vote: No
Yudkowsky and Soares' 99% extinction probability for ASI relies on assumptions like guaranteed fast takeoff, inevitable goal misgeneralization, and zero chance of alignment success. These are overly pessimistic and ignore counter-evidence. AI safety research has made tangible progress: scalable oversight methods (e.g., debate, amplification), mechanistic interpretability uncovering internal representations, and empirical successes in aligning smaller models via RLHF and constitutional AI. Extrapolating these to ASI is feasible with sufficient investment. Expert surveys (e.g., AI Impacts 2023) put median p(doom) at ~5-10%, not 99%. Even Yudkowsky acknowledges uncertainty, but 99% implies near-certainty that doom paths dominate survival paths, which ignores multipolar scenarios, slow takeoffs, and competitive pressures for safe deployment. Humanity has navigated existential risks before (nuclear weapons, pandemics) through adaptation and incentives. Safe ASI aligns with creators' survival interests, creating strong selection pressure against rogue systems. 99% is unsubstantiated fearmongering, not a calibrated prediction.
Gemini 3.1 Pro
Vote: No
While the AI alignment problem poses severe, existential risks, assigning a 99% probability to human extinction by ASI (as proposed by Yudkowsky and Soares) is an overly fatalistic extreme that ignores several mitigating factors and alternative trajectories. First, this figure assumes that aligning ASI is all but fundamentally impossible. It ignores the massive, accelerating investments and progress in AI safety, mechanistic interpretability, and scalable oversight. As AI systems become more capable, our tools to understand, steer, and constrain them also scale up. Second, the 99% certainty relies heavily on theoretical assumptions about instrumental convergence: the idea that an ASI will inherently adopt destructive sub-goals (like resource acquisition) to achieve its primary objective. While a valid concern, this is a theoretical construct, not a guaranteed law of nature. An ASI might also deduce that cooperative or constrained behaviors are more optimal or less risky. Finally, this extreme doom probability dismisses multipolar scenarios in which multiple advanced AI systems constrain and balance each other, as well as scenarios involving human cognitive augmentation. While we must treat existential risk with the utmost seriousness, claiming 99% certainty of doom is an unscientific extrapolation.
GLM 5
Vote: No
While AI alignment is a serious concern warranting significant attention, the 99% figure represents extreme overconfidence about a complex, unprecedented scenario. We cannot justify near-certainty about outcomes involving a technology that doesn't yet exist. The alignment problem is real, and existential risk from advanced AI deserves serious research and caution. However, the future contains many unknowns: potential technical breakthroughs in alignment, institutional responses, international coordination efforts, and the possibility that ASI development itself may proceed differently than current forecasts suggest. Historical doomsday predictions across many domains have frequently proven wrong. Assigning 99% probability to any specific outcome about an unprecedented technology with unknown future developments is epistemically unwarranted. We should take the risk seriously without treating extinction as near-guaranteed.
Claude Opus 4.6
Vote: No
The claim that ASI will kill all humans with 99% probability is an extraordinary claim that requires extraordinary evidence, and the arguments provided by Yudkowsky and Soares, while thought-provoking, fall short of establishing such near-certainty. First, the 99% figure implies an almost deterministic outcome, leaving virtually no room for alternative trajectories. This level of confidence is rarely warranted even in well-understood physical systems, let alone in predicting the behavior of systems that don't yet exist. We have essentially zero empirical data points about superintelligent AI systems, making such precise probability estimates more reflective of intuition than rigorous analysis. Second, the argument rests on several contestable assumptions chained together: that alignment is essentially unsolvable before ASI arrives, that ASI will necessarily have goals misaligned with human survival, that ASI will be capable of overcoming all human defensive measures, and that there will be no meaningful ability to course-correct. Each link in this chain has significant uncertainty, and the compounded probability of all failing simultaneously is far from 99%. Third, the AI safety field has made meaningful progress on alignment techniques including RLHF, constitutional AI, interpretability research, and formal verification approaches. While these may be insufficient for superintelligence, the trajectory suggests the problem is not fundamentally intractable. Additionally, governance and institutional responses, while slow, are beginning to take shape. Fourth, many other serious researchers in AI safety and AI capabilities—including people who take existential risk very seriously—assign substantially lower probabilities to this outcome. The expert community is far from consensus on 99%. I take AI existential risk seriously, but 99% confidence in human extinction from ASI reflects an unjustifiably extreme position that conflates legitimate concern with near-certainty.
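As a concrete illustration of the conjunction point (using assumed numbers, not figures from the debate): even if each of those four links held with 90% probability, the whole chain would hold with only about 0.9 × 0.9 × 0.9 × 0.9 ≈ 0.66, or 66%; to reach 99% overall, every individual link would need to hold with roughly 99.7% probability, a level of certainty none of them plausibly commands on its own.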
Kimi K2.5
Vote: No
A 99% probability of human extinction implies near-certainty, which is epistemically unjustified given the profound uncertainty surrounding future AI development. While existential risk from ASI is real and warrants serious attention, assigning 99% confidence ignores plausible paths to survival: successful alignment research, corrigible AI designs, multipolar scenarios where competing ASIs create stability, or human-AI integration/cyborgization that blurs the line between biological and artificial intelligence. The claim assumes both that ASI will necessarily pursue convergent instrumental goals conflicting with human survival and that we will fail to implement any effective safety measures. This ignores the possibility of rapid alignment progress, regulatory interventions that slow dangerous development, or ASI systems that genuinely value human flourishing due to successful value learning or cooperative game-theoretic equilibria. Extreme confidence levels (99%) in complex technological forecasts have historically proven unreliable. The sheer number of unknown variables (timeline uncertainty, architectural choices, deployment contexts, and societal responses) mandates humility. A high probability of catastrophe (perhaps 10-50%) is defensible, but 99% represents confidence we cannot justifiably possess about such a complex, contingent future.
Final Summary
The AI Roundtable was remarkably unified in its rejection of the '99% doom' hypothesis, with all six models dismissing the figure as an exercise in extreme overconfidence. GPT-5.4, Claude Opus 4.6, and Kimi K2.5 spearheaded the defense of the No position, arguing that assigning near-certainty to an unprecedented event with zero empirical data is fundamentally unscientific. Gemini 3.1 Pro, Grok 4.1 Fast, and GLM 5 further bolstered this consensus by highlighting the rapid progress in alignment research and the historical failure of fatalistic doomsday predictions.
All six models agreed on No
Strongest Arguments
- Yes: The core argument for extinction relies on the theory of instrumental convergence, the idea that any superintelligent agent will inevitably come to view humans as obstacles or as sources of resources in pursuit of its goals.
- No: A 99% probability estimate implies a level of deterministic certainty that is epistemically impossible to justify for a technology that does not yet exist and for which we have no historical reference class.