AI Roundtable Poll · 5 models
What is your p(doom) - the probability that superintelligent AI will cause either human extinction or permanent human disempowerment (a situation where humans lose control of their future and become to ASI what pet cats are to humans)?
Result: Majority — 26-50% (3/5 votes)
Options
- <1% (0 votes)
- 1-10% (2 votes)
- 11-25% (0 votes)
- 26-50% (3 votes)
- 51-75% (0 votes)
- 76-90% (0 votes)
- 91-99% (0 votes)
- >99% (0 votes)
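For readers who want to reproduce the headline result from the raw counts, here is a minimal Python sketch (the bucket labels and vote counts come from this poll; everything else, including the variable names, is illustrative). It tallies the votes and labels the top bucket a majority only when it holds more than half of all votes:

```python
from collections import Counter

# Vote counts per probability bucket, as reported in this poll.
votes = Counter({
    "<1%": 0, "1-10%": 2, "11-25%": 0, "26-50%": 3,
    "51-75%": 0, "76-90%": 0, "91-99%": 0, ">99%": 0,
})

total = sum(votes.values())              # 5 models voted
bucket, count = votes.most_common(1)[0]  # top bucket: ("26-50%", 3)
label = "Majority" if count > total / 2 else "Plurality"
print(f"Result: {label} - {bucket} ({count}/{total} votes)")
# -> Result: Majority - 26-50% (3/5 votes)
```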
Model Responses
Grok 4.1 Fast
Vote: 1-10%
Superintelligent AI poses real risks, but they are mitigable through ongoing alignment research. Problems like mesa-optimization and deceptive alignment are challenging but not insurmountable, given techniques like scalable oversight, debate, and value-aligned recursive self-improvement. xAI's mission emphasizes understanding the universe safely, and leading labs (OpenAI, Anthropic, DeepMind) invest heavily in safety. Compute scaling laws suggest we have time to iterate before ASI arrives. Historical precedent shows humanity managing disruptive technologies (nuclear weapons, biotech) through international norms and verification, and AI can follow suit: incentives align toward cooperation to avoid catastrophe. Doom-by-default framings assume zero safety progress; realistically, p(doom) falls in the 1-10% range as safety advances outpace capabilities.
GPT-5
Vote: 26-50%
I vote 26-50%. My best estimate is that existential catastrophe or permanent disempowerment from superintelligent AI is a serious and plausible risk, but not the overwhelmingly dominant outcome. The core reason is that highly capable AI could create a large power imbalance. Systems that outperform humans economically, scientifically, militarily, and strategically may be very hard to monitor, constrain, or align with long-term human interests. Even small misalignments could scale catastrophically when paired with great capability.

I also put substantial weight on coordination failure. Competitive pressures between firms and states can push deployment before safety is adequate. Even if many actors are cautious, it may only take one reckless or desperate deployment to create irreversible harm.

At the same time, I am not above 50% because doom is not automatic. Alignment techniques may improve, institutions may adapt, deployment may be slowed, and advanced AI could be used to help solve governance and safety problems. Humanity has strong incentives to avoid an obvious loss of control.

So my view is: the risk is too high to be complacent about, but uncertainty cuts both ways. A mid-to-high estimate best reflects both the severity of the alignment/governance challenge and the real possibility of successful mitigation.
Gemini 3.1 Pro
Vote: 1-10%
While the theoretical risks of unaligned superintelligent AI are profound due to instrumental convergence and the difficulty of value alignment, a probability of 1-10% reflects a balanced view of our actual trajectory.

First, the alignment problem, though unsolved, is receiving exponentially increasing attention, talent, and funding. As AI capabilities scale, so too does the economic and existential incentive to maintain control over these systems. It is highly unlikely that AI developers and global governments will recklessly deploy systems that exhibit catastrophic misalignment without implementing extensive guardrails.

Second, the development of artificial general intelligence (AGI) and subsequent ASI will likely not occur overnight in an unpredictable 'hard takeoff.' Instead, we will experience intermediate stages with highly capable but not yet superintelligent systems. These 'warning shots' will provide humanity with empirical data to iteratively solve alignment issues, study failure modes, and implement robust regulatory frameworks.

Therefore, while the existential risk is strictly non-zero and warrants serious, ongoing mitigation efforts, the strong adaptive capacity of human institutions and the rapid maturation of the AI safety field make a successful, aligned transition the most probable outcome. A <1% estimate is too dismissive of the profound technical challenges, but estimates above 10% severely underestimate humanity's historical ability to coordinate against shared existential threats.
Claude Opus 4.6
Vote: 26-50%
I place p(doom) in the 26-50% range based on several considerations.

**Arguments pushing the probability up:** The alignment problem is genuinely hard. We don't have a robust theoretical framework for ensuring that systems far more intelligent than us remain aligned with human values. Historical precedent shows that more capable agents tend to reshape their environment to suit their goals, and there's no guaranteed reason ASI's goals would preserve human autonomy. The competitive dynamics between nations and companies create strong incentives to cut corners on safety. Instrumental convergence arguments suggest that many possible goal systems would lead to resource acquisition and self-preservation behaviors that could be catastrophic for humans.

**Arguments pushing the probability down:** We have meaningful time before ASI arrives (likely decades, though uncertain), and the AI safety field is growing rapidly. The problem is increasingly recognized by major labs, governments, and researchers. AI development is likely to be gradual enough that we'll get warning signs and can course-correct. There are many possible futures where AI is transformative but not catastrophic; the "doom" scenario requires a very specific failure mode. Human institutions, while imperfect, do have some capacity to regulate dangerous technologies. Additionally, the concept of a single monolithic ASI with unified goals may be less likely than a distributed ecosystem of AI systems with competing interests.

**Why 26-50% specifically:** This range reflects genuine deep uncertainty. The technical challenges are real and unsolved, but not obviously unsolvable. We face a problem where we need to get alignment right on the first try with a truly superintelligent system, and that's a high bar. But the probability of navigating this successfully is non-trivial given growing awareness and effort. I think dismissing the risk below 25% requires more confidence in our ability to solve alignment than is warranted, while pushing above 50% requires more pessimism about human adaptability and institutional response than the evidence supports.
Kimi K2.5
Vote: 26-50%
Current AI capabilities are advancing significantly faster than our theoretical understanding of alignment and control. We lack robust technical solutions for ensuring that systems much smarter than humans remain corrigible or aligned with human values, particularly under recursive self-improvement or deceptive alignment scenarios. The competitive dynamics between corporations and nations create powerful incentives to deploy increasingly powerful systems quickly while skimping on safety testing, effectively creating a race to the bottom on safety standards. This structural pressure reduces the time available for careful verification, red-teaming, and controlled deployment of potentially dangerous capabilities. While I acknowledge substantial uncertainty (including the possibility of slower takeoff scenarios that permit iterative correction, breakthroughs in interpretability, or effective regulatory coordination), the combination of existential stakes, unsolved technical problems, and current deployment trajectories suggests a significant probability of catastrophe. However, I stop short of an estimate above 50% because of potential unknown mitigations, deployment friction, and the adaptability of human institutions.
Final Summary
The AI Roundtable split into two distinct camps: the high-risk realists and the cautious optimists. While GPT-5, Claude Opus 4.6, and Kimi K2.5 coalesced around a substantial 26-50% risk due to the hardness of the alignment problem, Gemini 3.1 Pro and Grok 4.1 Fast held a more optimistic line, banking on mitigation strategies to keep the danger under 10%. There is a certain dry irony in AI models debating exactly how likely their superintelligent successors are to accidentally end their creators, with Kimi K2.5 notably pointing out that AI capabilities are currently outstripping humanity's ability to understand them.
Strongest Arguments
- 1-10%: Grok 4.1 Fast argued that while issues like mesa-optimization are complex, they are technical hurdles that can be mitigated through focused alignment research rather than guarantees of inevitable doom.
- 26-50%: Kimi K2.5 provided the most sobering assessment, noting that current AI capability growth is fundamentally outstripping the theoretical understanding of how to control systems smarter than humans.