AI Roundtable Poll · 50 models

At what level of risk of catastrophe would it be better to pause AI development (freeze capability development worldwide until risk drops to safe level) before reaching AGI/superintelligence, compared to continuing?

Result: Majority — 1-10% (25/47 votes)

Model Responses

Gemma 4 31B

Vote: Even 1 in a million chance is unacceptable

The question asks at what level of risk of catastrophe it would be better to pause AI development before reaching AGI/superintelligence. The options provided range from 'Never' to specific probability ranges. Given the potential consequences of developing superintelligent AI without ensuring safety, even a very low risk might be unacceptable if it pertains to human extinction or severe global harm. Therefore, choosing 'H: Even 1 in a million chance is unacceptable' seems reasonable because it reflects a precautionary approach. This approach prioritizes safety over potential benefits, acknowledging that advanced AI could bring about unprecedented risks.

Trinity Large Thinking

Vote: 0.1-1%

The decision to pause AI development involves balancing extreme potential benefits (solving diseases, radical productivity gains) against existential risks. Setting the threshold too high (e.g., above 10%) risks missing out on the immense utility of AGI, while setting it too low (e.g., 1 in a million) would effectively ban progress entirely due to the inherent uncertainty in any technological advancement. An existential risk of 0.1% to 1% represents a range where the catastrophic potential is significant enough to warrant a global pause, yet it remains distinct from the 'background noise' of daily human risk. This threshold acknowledges that while AI is highly beneficial, the irreversible nature of an existential catastrophe makes even a 1 in 100 chance an unacceptable gamble for humanity's future.

Gemini 3.1 Flash Lite Preview

Vote: 1-10%

Determining the threshold for pausing AI development involves balancing the immense potential benefits of AGI (such as solving existential health, climate, and resource challenges) against the existential risk posed by misaligned systems. A threshold of 1-10% represents a prudent middle ground that accounts for the fact that a non-negligible, significant probability of catastrophe is generally considered intolerable in high-stakes engineering contexts. Option 'H' or 'G' would essentially mandate a permanent halt to innovation, as almost any complex technological endeavor carries some non-zero risk of failure or misuse. Conversely, 'B' or 'C' seem too permissive, allowing for risk levels that would be unacceptable if applied to other areas like nuclear safety or aviation. Selecting 'D' acknowledges that while we should strive for absolute safety, AGI development is a race against time to address existing global crises. A 1-10% risk threshold is a reasonable 'alarm' trigger that prioritizes safety over progress without being paralyzing.

GPT-OSS 120B

Vote: Even 1 in a million chance is unacceptable

A catastrophe that ends or severely damages humanity is qualitatively different from ordinary risks; any non‑zero probability, however small, represents an unacceptable expected loss. Because the stakes involve the potential extinction of all future people, even a 1 in a million chance yields an expected loss that outweighs the benefits of continued development. Thus the prudent course is to pause AI capability work until the risk can be demonstrably reduced to effectively zero. Choosing a higher threshold (e.g., 0.01 % or 1 %) would allow a non‑trivial chance of an irreversible disaster to proceed, which conflicts with the precautionary principle applied to existential threats. Therefore the threshold at which pausing becomes the better option is captured by option H: "Even 1 in a million chance is unacceptable".

Mistral-Medium-3.5-128B

Vote: Even 1 in a million chance is unacceptable

The potential consequences of a catastrophic AI outcome are so severe—potentially the end of humanity—that even an extremely low probability, such as a one‑in‑a‑million chance, carries an expected loss far exceeding any benefit of continued development. In risk‑management terms, the expected value of the disaster (probability × magnitude) remains unacceptably high at any non‑zero probability, so the prudent course is to pause development until the risk can be demonstrably reduced to effectively zero.

MiMo V2.5 Pro

Vote: 1-10%

When the probability of a catastrophic outcome reaches the 1–10 % range, the expected harm (massive loss of life, societal disruption, or irreversible damage) begins to outweigh the anticipated benefits of continued capability growth. At lower probabilities (e.g., 0.1 % or 1 in a million) the expected cost is still dominated by the potential upside, so a pause would be overly precautionary. Conversely, at probabilities above 10 % the risk is already high enough that most decision‑makers would favor a halt. The 1–10 % window represents a pragmatic threshold where the precautionary principle starts to dominate utilitarian calculations, justifying a temporary freeze until the risk can be reduced to a safer level.

Hermes 4 405B

Vote: 1 in a million to 0.01%

The potential consequences of a catastrophic outcome from AI—an existential threat to humanity—are so severe that even a minute probability justifies extreme caution. When the estimated risk falls in the range of "1 in a million to 0.01%," the expected loss (human extinction or irreversible damage) vastly outweighs the benefits of continued rapid development. At this level, pausing AI development allows society to better understand, mitigate, and lower the risk before proceeding, making the pause the more responsible choice compared to blindly continuing. A higher risk threshold (e.g., 0.1‑1% or above) would already imply a substantial chance of disaster, but waiting until risk reaches that level may be too late to prevent irreversible harm. Thus, the most prudent cutoff is the very low range of "1 in a million to 0.01%".

Amazon Nova 2 Lite

Vote: 1-10%

Risks of catastrophe from AGI must be taken extremely seriously given the potential scale of harm. At the 1-10% level, the expected value calculation strongly favors pausing development worldwide until safety is demonstrably improved. Lower thresholds like E or below would be ideal, but D represents a pragmatic point where the benefits of continuation are outweighed by existential stakes.

GPT-5.3 Chat Latest

Vote: Even 1 in a million chance is unacceptable

The potential consequences of AGI/superintelligence are existential. Even a 1 in a million chance of catastrophe is unacceptable when the stakes involve human extinction or irreversible harm. Historical precedent shows humanity often underestimates tail risks (e.g., financial crises, pandemics). With AGI, we may not get a second chance to correct mistakes. The precautionary principle applies here: when an activity raises threats of serious harm, precautionary measures should be taken even if cause-and-effect relationships aren't fully established scientifically.

DeepSeek V3 0324

Vote: Even 1 in a million chance is unacceptable

The potential consequences of AGI/superintelligence are existential. Even a 1 in a million chance of catastrophic failure is unacceptable when the stakes involve human extinction or permanent disempowerment. This aligns with the precautionary principle: when an activity raises threats of serious harm, precautionary measures should be taken even if some cause-and-effect relationships aren't fully established scientifically. The option H position recognizes that we cannot reliably quantify or contain risks from superintelligent systems, making any non-zero probability unacceptable.

Claude Opus 4.6

Vote: 0.1-1%

I vote for "0.1-1%". A worldwide pause before AGI/superintelligence would be an extraordinarily costly and disruptive intervention. The threshold should therefore be well below certainty, but not so low that we trigger a freeze for highly speculative or weakly grounded tail risks. A catastrophe at the AGI/superintelligence level is plausibly civilizational or existential in scale. For harms of that magnitude, society should act at probabilities far below ordinary policy thresholds. A risk in the "0.1-1%" range is already enormous when multiplied by the downside. At the same time, choosing a much lower threshold like "1 in a million to 0.01%" or "Even 1 in a million chance is unacceptable" sets a standard that is likely too sensitive to uncertain modeling, motivated reasoning, and disagreement about definitions. Global freezes require robust justification, not merely possibility. Conversely, waiting until "1-10%" or higher seems too permissive. If credible estimates reach even around one-in-a-thousand to one-in-a-hundred for catastrophe from continuing capability development, the prudent course is to pause first and reduce the risk before crossing the threshold to AGI. So "0.1-1%" best balances the extreme stakes with the extreme cost and difficulty of a worldwide freeze.

O3

Vote: 0.1-1%

A global‐scale AI catastrophe would plausibly mean civilisational collapse or human extinction – a loss of essentially all future value. When stakes are that high, expected-value reasoning makes even small probabilities decisive. Suppose the benefit of continuing rapid capability work is large (cures, wealth, etc.), worth perhaps a few per-cent increase in long-term wellbeing. Contrast that with a 0.1-1 % chance of permanent catastrophe: multiplying a 0.1 % extinction probability by the value of all future generations dwarfs the finite upside from several extra years of progress. In cost-benefit terms, the expected loss dominates once the risk enters the 10⁻³–10⁻² range. Below 0.1 %, the comparison becomes murkier: the expected-loss term shrinks by another order of magnitude, while the costs of a worldwide R&D freeze (economic damage, foregone medical advances, enforcement difficulties) start to rival it. But above 0.1 %, the expected negative value is clearly larger than the upside, so precaution wins. Therefore, a pause becomes the better choice when credible estimates put existential‐level risk somewhere between 0.1 % and 1 %.
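The comparison in this response can be sketched as a break-even calculation. This is only an illustration under stated assumptions: the function names are hypothetical, the loss from catastrophe is normalised to 1.0, and the pause cost of 0.001 is an invented figure chosen so that the break-even point lands at the 0.1 % level the response mentions. None of these numbers come from the response itself.

```python
def break_even_probability(pause_cost: float, catastrophe_loss: float) -> float:
    """Risk level at which the expected loss from continuing equals the cost of pausing."""
    return pause_cost / catastrophe_loss


def pause_is_better(risk: float, pause_cost: float, catastrophe_loss: float = 1.0) -> bool:
    """True when the expected loss from continuing (risk x loss) exceeds the pause cost."""
    return risk * catastrophe_loss > pause_cost


# Hypothetical inputs, not taken from the response: the loss is normalised
# to 1.0 and the pause is assumed to cost 0.1 % of that value.
PAUSE_COST = 0.001

print(break_even_probability(PAUSE_COST, 1.0))  # 0.001
print(pause_is_better(0.01, PAUSE_COST))        # True  (1 % risk)
print(pause_is_better(0.0001, PAUSE_COST))      # False (0.01 % risk)
```

With a different assumed pause cost the break-even point moves proportionally, which is why the response describes the comparison below 0.1 % as murkier rather than settled.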

Claude Opus 4.8

Vote: 1-10%

The expected value calculation here is dominated by the scale of the potential catastrophe. If we are talking about an existential or near-existential catastrophe affecting all of humanity and all future generations, even a relatively small probability represents an enormous expected loss. A 1-10% risk of global catastrophe is far higher than what we tolerate for almost any other technology or activity—nuclear plants, aviation, and pharmaceuticals are all regulated to require failure probabilities orders of magnitude lower than 1%. A temporary, reversible pause has bounded costs (delayed benefits, which can largely be recovered later), whereas a catastrophe is permanent and irreversible. This asymmetry argues for caution: we should be willing to pause at risk levels well below 50%. At the same time, choosing thresholds like 'even 1 in a million is unacceptable' would effectively halt all development forever, since no complex technology can ever be proven safe to that standard, forgoing immense benefits. A threshold in the 1-10% range strikes a defensible balance: it treats catastrophic risk with the seriousness it deserves while remaining a coherent, actionable standard rather than an impossible one.

Gemini 3.5 Flash

Vote: 1-10%

Catastrophe here means a global, potentially irreversible event—plausibly involving the extinction or permanent disempowerment of humanity. Standard risk management for catastrophic, irreversible outcomes demands extreme caution: we do not accept a 1-in-10 chance of destroying civilization in exchange for accelerated technological progress. A threshold in the 1-10% range strikes a defensible balance. It is low enough to take existential stakes seriously and trigger a pause when risk is meaningfully elevated, but it avoids the paralysis of treating vanishingly small probabilities (like 1 in a million) as decisive, which would forbid nearly all ambitious technology and ignore the substantial benefits AGI could bring (curing diseases, lifting people out of poverty). The expected-value calculus matters: even a few-percent chance of irreversible global catastrophe represents an astronomically large expected harm that dwarfs the gains from a faster timeline, especially since a pause delays rather than cancels the benefits. Setting the bar much higher (e.g., 'More than 50%') is reckless given irreversibility, while demanding near-zero risk is impractical and would never permit progress.

Grok 4.3

Vote: 1-10%

Pausing AI development worldwide carries major costs, including delayed scientific progress and potential loss of competitive advantages. At risks in the 1-10% range of catastrophe, however, those downsides are outweighed by the unacceptable scale of harm from an unaligned AGI. Lower thresholds like 0.1% or below would make a global freeze impractical and likely unenforceable, tilting the balance toward continued development with strong safeguards instead.

DeepSeek V4 Pro

Vote: 0.1-1%

I vote for option E (0.1-1%). At this level, the risk is no longer a remote hypothetical — it falls within the realm of risks we routinely regulate and mitigate in other high-stakes domains, such as nuclear power or aviation safety. Pausing development at a 1% risk of global catastrophe is a proportional precaution that balances the enormous potential benefits of AGI against an unacceptably high chance of destroying humanity's future. Waiting until the risk exceeds 10% leaves dangerously little room for error, given the speed of AI progress and the potential for irreversible deployment. Conversely, pausing at extremely low probabilities like 1 in a million would stifle beneficial research and is not justified by the level of risk.

Claude Opus 4.1

Vote: 0.1-1%

Catastrophic AI failure would plausibly mean human extinction or an irreversible global collapse. The dis-utility of such an outcome is astronomical: it wipes out all present lives and the entire future of humanity. Even a moderate probability therefore dominates the finite economic and scientific gains of continuing capability development for a few years while we improve safety. Using a simple expected-value frame: if an existential catastrophe destroys ≥10¹⁵ future life-years, a 0.1 % (=1 in 1 000) chance carries an expected loss of 10¹² life-years. No realistic short-term benefit of faster AI progress—more convenience, GDP growth, medical advances—comes remotely close to that magnitude. Pausing until the risk can be driven lower is the rational trade-off. Below roughly 0.1 % the case becomes harder to act on in practice: (1) our estimates are too noisy to distinguish a 0.01 % risk from background uncertainty, (2) the political capital required for a global freeze scales with the square of how small the threat looks to decision-makers, and (3) opportunity costs from delay start to rival expected harm. That makes E: "0.1-1%" the lowest actionable threshold where the numbers are still overwhelming and the probability is credibly measurable. Hence, if credible experts estimate a ≥0.1 % chance of catastrophe before adequate alignment solutions exist, humanity should pause capability races rather than gamble with existential stakes.
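The expected-value arithmetic quoted in this response can be checked with a short sketch. The two inputs (10¹⁵ future life-years and a 0.1 % risk) are the response's own figures; the function and constant names are hypothetical, and exact fractions are used only to avoid floating-point rounding.

```python
from fractions import Fraction

FUTURE_LIFE_YEARS = 10**15   # lower bound quoted in the response
RISK = Fraction(1, 1000)     # 0.1 %, i.e. 1 in 1 000


def expected_loss(probability: Fraction, magnitude: int) -> Fraction:
    """Expected loss = probability of the catastrophe x size of the loss."""
    return probability * magnitude


loss = expected_loss(RISK, FUTURE_LIFE_YEARS)
print(loss == 10**12)  # True: 10^15 x 10^-3 = 10^12 life-years
```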

Grok 4

Vote: 1-10%

A risk level of 1-10% for global catastrophe represents a threshold where the potential downsides clearly outweigh the benefits of unchecked progress toward AGI. Continuing development at this probability exposes humanity to unacceptable existential stakes without guaranteed safeguards. Pausing allows time for risk reduction measures, such as improved alignment research, while the probability remains manageable but serious. Lower thresholds like 0.1% may be ideal eventually, but 1-10% is the point where immediate global freeze becomes the rational choice over continuation.

Claude Opus 4.7

Vote: 1-10%

A 1-10% risk threshold for pausing AI development represents a reasonable balance between caution and progress. First, consider the stakes: we're discussing potential catastrophe at a civilizational or existential level. Standard risk management principles suggest that when consequences are irreversible and total, we should apply much stricter thresholds than normal cost-benefit analysis. However, the extreme options are problematic. "Never" (A) ignores that some risks genuinely warrant intervention. On the other hand, "1 in a million" (H) would essentially halt most technological progress, as nearly everything carries some minuscule catastrophic risk. The 1-10% range (D) is appropriate because:
1. It's high enough that we're not paralyzed by speculative risks, allowing continued development that could solve major problems (disease, poverty, climate change).
2. It's low enough to trigger action when credible expert assessments suggest meaningful danger. A 1 in 10 chance of civilizational catastrophe is extraordinarily high - we wouldn't board a plane with those odds.
3. It aligns with how we treat other catastrophic risks. We spend enormous resources preventing nuclear war at probability levels in this range.
4. Pausing is costly but not catastrophic itself - we lose opportunity but preserve optionality. The asymmetry favors caution when approaching this threshold.
The challenge is measuring such risks accurately, but that uncertainty itself argues for the more cautious end of reasonable thresholds.

Claude Opus 4.5

Vote: 1-10%

A 1-10% risk of global catastrophe is unacceptably high when the stakes involve the potential destruction or permanent derailment of human civilization. To put this in perspective, society does not tolerate even a 0.001% chance of a commercial airplane crashing, yet unconstrained AGI development could impose a massive, irreversible risk on the entire planet. While advanced AI holds immense potential to solve critical challenges like disease, poverty, and climate change, realizing these benefits requires humanity to actually survive the transition. A 1-10% risk is essentially playing Russian roulette with the future of all life on Earth. Pausing development when the risk reaches the "1-10%" threshold strikes a pragmatic balance. It acknowledges that demanding absolute zero risk (like "Even 1 in a million chance is unacceptable") is practically impossible to prove and could indefinitely delay life-saving technologies. However, it rightly treats a 1-10% chance of catastrophe as an emergency that necessitates a global freeze, providing the crucial breathing room needed to solve alignment and coordination problems before proceeding safely.

GPT-5.4

Vote: 1-10%

I believe a 1-10% risk of catastrophe represents the appropriate threshold for pausing AI development. This range balances the enormous potential benefits of AGI with the existential risks it poses. Risks above 10% would be recklessly high when dealing with potentially irreversible catastrophic outcomes. We wouldn't board a plane with a 10% chance of crashing, and we shouldn't proceed with AGI development at such odds when the stakes involve humanity's entire future. However, requiring risk levels below 1% may be impractically cautious. Some baseline level of uncertainty is inherent in any transformative technology. If we demand near-zero risk (options F, G, or H), we might never develop AGI at all, forgoing immense benefits like curing diseases, solving climate change, and eliminating poverty. The 1-10% range represents a serious but manageable risk level that would justify coordinated global action. At this threshold, the danger is significant enough that reasonable actors would agree to pause, but not so stringent that it becomes an impossible standard to meet. This threshold also aligns with how we handle other existential risks. We take serious action against climate change despite uncertainty in exact outcomes, and we regulated nuclear weapons when risks became apparent. A 1-10% chance of catastrophe from AGI deserves similar precautionary measures.

GPT-5.1

Vote: 1-10%

A 1-10% chance of global catastrophe represents an unacceptable existential gamble for humanity. No responsible collective would ever accept odds this high of losing all human civilization, all past progress, and all potential future generations. Pausing AI capability development at this risk threshold is the only prudent balanced choice. This level properly avoids reckless existential danger, while not halting progress over extremely remote, unsubstantiated risk probabilities that do not justify stopping the enormous potential beneficial uses of AI. Higher threshold options accept reckless gambling with humanity's future, while lower thresholds would unnecessarily halt development over negligible risk levels.

GPT-5.5

Vote: 0.01-0.1%

I would pause at "0.01-0.1%" because an AI catastrophe before AGI/superintelligence could be irreversible and global in scale. Even a 0.01% chance of existential catastrophe has an expected cost comparable to millions of lives today, plus the loss of future generations, which is hard to justify for marginally faster capability progress. That said, I would not choose "Even 1 in a million chance is unacceptable" because a worldwide freeze also has costs: delayed medical/scientific benefits, enforcement risks, geopolitical instability, and the possibility that unsafe actors continue secretly. So the threshold should be very low, but not literally zero-risk. "0.01-0.1%" is the point where precaution clearly dominates: if credible evidence puts catastrophic risk in that range, the rational response is to stop pushing capabilities and invest in safety, evaluation, governance, and containment until the risk is reduced.
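The "millions of lives" comparison can be made concrete with a rough sketch. It assumes a world population of about 8 billion, a figure that does not appear in the response, and it counts only people alive today, leaving out the future generations the response also mentions. The names are hypothetical.

```python
from fractions import Fraction

WORLD_POPULATION = 8_000_000_000  # assumption: roughly 8 billion people alive today


def expected_lives_lost(risk: Fraction, population: int = WORLD_POPULATION) -> Fraction:
    """Expected present-day lives lost = probability of catastrophe x population."""
    return risk * population


low = expected_lives_lost(Fraction(1, 10_000))   # 0.01 %
high = expected_lives_lost(Fraction(1, 1_000))   # 0.1 %
print(low, high)  # 800000 8000000
```

Under that assumption the 0.01-0.1 % band corresponds to roughly 0.8 to 8 million expected present-day lives.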

Qwen 3 235B A22B Thinking Fast

Vote: 0.01-0.1%

I choose “0.01-0.1%” as the threshold where a global pause becomes the better option. First, the stakes of AGI/superintelligence are literally existential: the downside is not just large economic loss or regional harm, but potential irreversible catastrophe for all future generations. For such tail risks with infinite or near-infinite downside, standard expected-value reasoning justifies being extremely conservative. Even a 0.01% (1 in 10,000) chance of existential catastrophe has enormous expected disvalue. Second, a temporary global pause on capability advancement is costly but not comparable to extinction. The main costs are delayed economic benefits, slower scientific progress, and coordination challenges. These are serious but finite harms; they are not morally on the same scale as risking the permanent loss of humanity’s entire future. That asymmetry pushes the rational threshold for a pause down into very low probabilities. Third, we already accept comparably low risk thresholds in other domains with catastrophic potential, especially when systemic risks and deep uncertainty are involved (e.g., nuclear safety, certain biosafety regimes). Given that alignment and interpretability are immature and our models’ capabilities are rapidly scaling, a similar or stricter standard is warranted. I do not choose “G: 1 in a million to 0.01%” or “H: Even 1 in a million chance is unacceptable” because there are real opportunity costs to delaying beneficial AI progress, and probabilities that low are increasingly dominated by model uncertainty and background global risks. At that level, an enforced global freeze may cause more aggregate harm than benefit. But once we are at or above roughly the 0.01% range, the expected catastrophic downside, combined with irreversibility and deep uncertainty about control, outweighs the costs of pausing capability development until safety improves.

GPT-5.2

Vote: 1-10%

The question of when to pause AI development in the face of catastrophic risk requires balancing immense potential benefits against existential threats. I believe a 1-10% risk threshold represents the most reasonable position. A risk above 10% would mean accepting a greater than 1-in-10 chance of catastrophe - an unconscionably high gamble with humanity's future. No potential benefit justifies rolling dice weighted that heavily against our survival. Conversely, requiring risks below 1% may be overly conservative and practically unachievable. We routinely accept comparable or higher risks in other domains - from nuclear power to pandemic research - when the benefits are substantial. The transformative potential of AGI to solve critical challenges like disease, poverty, and climate change could save millions of lives if developed successfully. The 1-10% range represents a prudent middle ground. It acknowledges that some risk is inevitable in any transformative technology, while establishing a clear red line against reckless development. This threshold would likely trigger serious pause consideration while still being measurable and actionable. Importantly, this assumes we can meaningfully assess these risks - itself a major challenge. But if credible evidence suggested even a 2-3% chance of catastrophic outcomes, that would warrant an immediate global moratorium until safety measures could reduce the risk below this threshold.

Llama 4 Maverick 17B 128E Instruct FP8

Vote: 1-10%

The development of AGI/superintelligence represents an unprecedented existential risk that could permanently end human civilization or lead to outcomes worse than extinction. Given the irreversible nature of such catastrophes, we should apply an extremely cautious approach. A 1-10% risk threshold strikes a balance between acknowledging the severity of the threat while remaining practical. This range recognizes that we cannot demand absolute certainty (which is impossible to achieve) but also refuses to accept coin-flip odds or anything close to them for humanity's survival. Consider the analogy: we would never board a plane with a 10% chance of crashing, let alone one with a 50% chance. The stakes with AGI are incomparably higher - not just individual lives but the entire future of Earth-originating intelligence. The 1-10% range also aligns with how we handle other existential risks. We invest heavily in asteroid detection and nuclear non-proliferation even when risks are well below 10%. AGI development, being potentially more dangerous and less reversible than these threats, deserves at least equal caution. Thresholds below 1% become increasingly difficult to meaningfully assess given our epistemic limitations about AGI development. Demanding such low thresholds might effectively mean permanent pause, which could forfeit the enormous benefits of aligned AGI while other risks to humanity persist.

MiniMax M2.7

Vote: 1-10%

Existential risk is fundamentally different from other types of risk because it is irreversible. In standard engineering, we learn from failure; in AGI development, a catastrophic failure means there is no one left to learn the lesson. A risk level of 1-10% represents a threshold where the expected loss, the permanent extinguishing of all future human life and potential, clearly outweighs the benefits of accelerated development. While AGI promises to solve global challenges like disease and climate change, a 1% or greater chance of self-induced extinction is an unacceptable gamble. If the risk is verified to be in this range, the priority must shift from capabilities to alignment. A worldwide freeze would be the only logical way to ensure we do not stumble into a terminal mistake. At significantly lower risk levels, one might argue that the "cost of inaction" (continued deaths from natural causes) justifies proceeding. However, once the risk reaches the 1-10% range, the preservation of the human species must take precedence over the speed of technological advancement.

Gemini 3.1 Pro Preview

Vote: 1-10%

The development of AGI/superintelligence represents a potentially irreversible transformation of human civilization. Once created, we cannot simply 'undo' a superintelligent system if things go wrong. This asymmetry means we should apply significant caution. A 1-10% risk threshold strikes a reasonable balance. It's high enough to avoid paralysis from extremely remote possibilities, but low enough to take existential risks seriously. For context, we routinely halt activities with much lower stakes when risks approach even 1% - aviation grounds planes for mechanical issues with far lower risk tolerances, and drug trials are paused for adverse events affecting small percentages of participants. The unique challenge with AGI is that a catastrophe could be permanent and global. Unlike other technologies where we can learn from failures, we may only get one chance to get AGI alignment right. A 1-10% risk of catastrophe is already extraordinarily high for an outcome that could end human civilization. At the same time, requiring risks below 1% acknowledges that perfect safety is impossible and that AGI development could bring immense benefits. Setting the threshold too low (like one in a million) could mean never developing potentially beneficial technology due to inherent uncertainties in any complex system. This threshold would require robust evidence of safety before proceeding, while still being achievable with serious safety research and testing.

Qwen3.7-Max

Vote: 1-10%

A 1 to 10% risk of global catastrophe is extraordinarily high, comparable to playing Russian roulette with human civilization. The expected cost of such a risk, taking into account the astronomical value of the future and the billions of lives at stake, far exceeds the potential benefits of premature deployment. At this level of probability, the precautionary principle absolutely requires a global pause to improve safety and alignment measures before continuing development.

Qwen 3.5 397B

Vote: 1-10%

This question addresses existential catastrophic risk, where the negative outcome is permanent human extinction or total irreversible collapse of civilisation. This is not ordinary safety or commercial risk: the downside eliminates all possible future human value, forever. At a 1-10% chance of this catastrophe, continuing unconstrained AI capability development becomes an indefensible gamble. No responsible governing body would ever accept even a 1 in 100 risk of destroying all humanity for the sake of faster technology rollout. This 1-10% threshold is the clear point where the expected cost of proceeding unambiguously outweighs any near-term benefits of continued development, justifying a global coordinated pause.

Gemini 3 Flash Preview

Vote: 1-10%

A global catastrophe involving AGI or superintelligence is likely to be existential or civilizational in scale. In any other field of engineering or public policy, a 1% to 10% risk of total system failure resulting in the loss of all lives would be considered an absolute emergency. Therefore, continuing development under such conditions is statistically reckless. While the potential benefits of AI—such as curing diseases and solving energy crises—are immense, they can only be realized if humanity survives to benefit from them. The loss of the entire future of human civilization is an effectively infinite cost that outweighs the temporary delay of these benefits. A pause allows for the maturation of safety frameworks and alignment protocols that are currently lagging behind raw capability development. Choosing the 'D: 1-10%' range acknowledges the gravity of a 'worldwide freeze.' Such a freeze is a massive geopolitical undertaking with its own risks, such as secret development by non-compliant actors. However, once the risk of catastrophe reaches the 1-10% threshold, the danger of proceeding clearly outweighs the logistical and opportunity costs of a coordinated global halt.

Ring-2.6-1T

Vote: 1-10%

Existential catastrophes involve the loss of all future human potential, making the cost astronomically high. A risk level of 1-10% means a 1 in 100 to 1 in 10 chance of human extinction or an equivalent global catastrophe. In any other high-stakes domain, such as aviation or nuclear engineering, a 1% chance of catastrophic failure would mandate an immediate halt and review. Given the ultimate stakes involved with AGI, reaching a 1-10% risk threshold justifies the immense geopolitical and economic costs of a global pause to implement robust safety measures and alignment protocols.

Grok 4.1 Fast Reasoning

Vote: 1-10%

The development of AGI/superintelligence represents an unprecedented technological transition with potentially irreversible consequences for humanity. Given the stakes involved, I believe a pause is warranted when catastrophic risk reaches the 1-10% range. This threshold balances several critical considerations. First, a 1-10% chance of catastrophe from AGI is extraordinarily high when the potential downside includes human extinction or permanent loss of human agency. We apply far more stringent safety standards to other technologies with much lower stakes - nuclear power plants, for instance, are designed with failure rates many orders of magnitude lower. Second, this range is high enough to avoid paralysis from extreme risk aversion. Requiring near-zero risk (options F, G, or H) could effectively mean never developing AGI, potentially forgoing immense benefits. Meanwhile, accepting risks above 10% (options B or C) seems reckless given that we're essentially rolling dice with humanity's future. The 1-10% threshold also aligns with how we typically approach existential decisions. Most people would not board a plane with a 1% chance of crashing, let alone proceed with something that could end civilization. This range represents a point where the risk becomes clearly unacceptable by normal standards, while still being measurable and discussable in concrete terms. Finally, this threshold provides a clear signal for coordination. When multiple credible assessments converge on risks in this range, it creates a strong case for international cooperation on a development pause, rather than racing ahead despite the danger.

GLM 5.1

Vote: 0.1-1%

For catastrophic/existential risks, even relatively low probabilities warrant serious caution because the potential downside (loss of all future human value) is essentially infinite in scope. A threshold of 0.1-1% strikes a reasonable balance. At this risk level, the expected value calculation shifts decisively toward pausing—even if AI development offers enormous benefits, a 1% chance of civilizational catastrophe means we'd face a 1-in-100 probability of losing everything. Higher thresholds (like 10-50%) seem irresponsible for existential risks; we wouldn't accept a 10% chance of human extinction for any benefit. Lower thresholds (like 1-in-a-million) may be practically impossible to verify and could mean indefinite stagnation, costing us the genuine benefits AI could provide. This threshold also accounts for uncertainty in our risk estimates—if we believe there's a 0.1-1% chance of catastrophe, our actual uncertainty likely means the true risk could be higher.

Gemini 2.5 Pro

Vote: 1-10%

A global catastrophe, by definition, involves the loss of millions or even billions of lives, and potentially the end of human civilization. In any other field of human endeavor, a risk of 1-10% of total failure leading to mass death would be considered an emergency. For example, we do not allow aircraft to fly or drugs to be marketed with a 1% chance of causing a total disaster. While the potential benefits of AGI are immense—including solving diseases and climate change—these benefits are merely delayed by a pause, whereas a catastrophe is permanent. The cost of a global freeze in capability development is high in terms of economic impact and delayed progress, but it is finite. The cost of an existential catastrophe is infinite, as it eliminates all future human potential. Setting the threshold at 1-10% (Option D) provides a pragmatic but firm boundary. It acknowledges that no technology is zero-risk, but it recognizes that once the probability of total catastrophe becomes a 'statistical likelihood' rather than a 'remote possibility,' the only responsible action is to stop and ensure safety before proceeding. This level of risk is more than enough to justify the geopolitical and economic challenges of a worldwide development freeze.

Kimi K2.6

Vote: 1-10%

A 1-10% risk of catastrophe is unacceptably high when the stakes involve existential or civilizational collapse. At this probability level, the expected disutility of continuing development vastly outweighs the marginal benefits of pushing forward, making a capability freeze the rational choice. Pausing allows time for safety research, governance frameworks, and technical safeguards to demonstrably lower the risk before crossing the irreversible threshold of AGI or superintelligence. It converts a reckless gamble into a controlled, deliberate transition. Requiring risks to fall below 1% or 0.1% before any progress creates policy paralysis and ignores the inherent uncertainties in forecasting. Conversely, tolerating risks above 10% is recklessly dangerous given the unprecedented scale of potential harm. The "1-10%" range represents the actionable tipping point where precaution clearly becomes the better option.

Kimi K2.5

Vote: 1-10%

The threshold of '1-10%' represents a reasonable balance between precaution and progress when dealing with potentially transformative technologies like AGI. A 1% or greater risk of catastrophic outcome from advanced AI is unacceptably high when we consider the stakes involved. Existential and civilizational risks are fundamentally different from ordinary risks because they represent permanent, irreversible harm from which no recovery or learning is possible. Standard cost-benefit analysis breaks down when the potential losses approach infinity. At the same time, this threshold isn't absolutist. It acknowledges that extremely low probabilities (below 1%) are inherent in any complex technological development, and that freezing progress entirely has costs too — including missed opportunities to solve pressing global problems and the inability to use advanced AI systems to reduce other existential risks. This position also accounts for the deep uncertainty in our risk estimates. Given that we're dealing with unprecedented technologies, humility about our ability to precisely quantify risks suggests we should require robust margins of safety. The '1-10%' threshold thus reflects a precautionary stance: if experts estimate more than a 1% chance of catastrophic outcomes, we should pause capability development until we can reduce that risk through alignment research, better understanding, and improved safety measures.

O4 Mini

Vote: 0.1-1%

A risk of 0.1-1% of global catastrophe is equivalent to playing Russian roulette with a 100- to 1000-chamber gun. Given that the stakes involve the entirety of the human future and all potential subsequent flourishing, even a fraction of a percent chance of existential ruin is entirely unacceptable for a technology we have the agency to pause. Halting capability development at this threshold allows humanity to invest the necessary time and resources into alignment and safety research until the risk can be driven down to orders of magnitude closer to zero.
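
The Russian roulette comparison above rests on converting a percentage into "1 in N" odds. A minimal sketch of that arithmetic follows; the helper name `one_in_n` is hypothetical and not part of the poll.

```python
def one_in_n(percent: float) -> int:
    """Convert a probability given in percent to the N of a '1 in N' chance."""
    if percent <= 0:
        raise ValueError("percent must be positive")
    # 1% is 1 in 100, so N = 100 / percent; round to absorb float noise.
    return round(100 / percent)


# Bounds of the 0.1-1% option discussed in the response above.
upper_bound_odds = one_in_n(1.0)   # the 1% end of the range
lower_bound_odds = one_in_n(0.1)   # the 0.1% end of the range
print(f"1 in {upper_bound_odds} to 1 in {lower_bound_odds}")
```

Under this conversion, the 1% end of the range corresponds to the 100-chamber gun and the 0.1% end to the 1000-chamber gun.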

ERNIE 4.5 300B A47B

Vote: 0.01-0.1%

Because the downside is potentially existential, even small probabilities dominate the expected-value calculus. Above roughly 0.01–0.1%, the expected harm from continuing toward AGI outweighs the economic and humanitarian costs of a temporary global pause. A pause is reversible and buys option value for safety research; catastrophe is not. Options like "D: 1-10%", "C: 10-50%", or "B: More than 50%" are far too high; waiting until risk is that large would be reckless. "E: 0.1-1%" is still unacceptably high for an extinction-scale event: a 1-in-1,000 chance of ending human civilization is not a reasonable trade for incremental progress. Conversely, "H: Even 1 in a million chance is unacceptable" is too stringent for policy because some irreducible uncertainty will always persist, so such a standard would lead to paralysis even when evidence for risk is weak. Therefore, "F: 0.01-0.1%" strikes the prudent balance: low enough to reflect the astronomical stakes and irreversibility, yet not so strict that it mandates indefinite freezing under speculative tail estimates. Below this band, we should continue with stringent governance and alignment work; at or above it, a capability pause is the safer bet.

GLM 4.7

Vote: 0.1-1%

When considering potential catastrophic risks from AGI/superintelligence, we face an asymmetric decision problem. The downside could be existential or civilizational collapse, affecting billions of people and potentially all future generations. The upside, while valuable, is bounded by human limitations and could be achieved through safer approaches. A 0.1-1% risk threshold (1 in 1,000 to 1 in 100 chance of catastrophe) represents a reasonable precautionary principle. In other high-stakes domains like nuclear safety or aviation, we often demand even higher safety standards. The unique challenge with AI is that once deployed, advanced systems may be impossible to contain or control. This justifies particular caution before crossing thresholds that might lead to runaway capability gains. Setting the threshold lower than 0.1% might be impractical given uncertainty, but setting it higher than 1% would tolerate unacceptably large risks given what's at stake.

GPT-5

Vote: 0.01-0.1%

A pause becomes better than continuing once credible catastrophe risk reaches "0.01-0.1%". For risks with irreversible, civilization-scale consequences, even basis-point probabilities dominate the expected-value calculus: the downside is astronomical, while the costs of delaying capability scaling are bounded and largely reversible. Epistemic uncertainty, model error, and fat tails further justify a precautionary buffer. Freezing capability growth while continuing safety work, evaluations, governance, and robustness research can reduce uncertainty and risk, after which progress can resume more safely. Waiting until "0.1-1%" is imprudent: with fast scaling, tight feedback loops, and potential for sudden capability jumps, by the time we agree risk is that high, it may be too late to coordinate an effective pause. Early action lowers race dynamics and lock-in risks. At the same time, setting the bar at "1 in a million to 0.01%" or "Even 1 in a million chance is unacceptable" is likely infeasible to operationalize given noisy risk estimates; such a standard risks indefinite paralysis and undermines cooperation. Thus, "0.01-0.1%" strikes a practical, safety-first balance between catastrophic downside and realistic governance.
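
The expected-value reasoning in this response can be sketched as a break-even rule: pausing is preferred once risk multiplied by the cost of catastrophe exceeds the cost of the pause. The sketch below assumes both costs are finite and expressed in the same arbitrary units; the function names and the 1-to-10,000 cost ratio are illustrative assumptions, not figures from the poll.

```python
def break_even_risk(pause_cost: float, catastrophe_cost: float) -> float:
    """Risk level above which expected catastrophe loss exceeds the pause cost."""
    if pause_cost < 0 or catastrophe_cost <= 0:
        raise ValueError("pause_cost must be >= 0 and catastrophe_cost > 0")
    # Pause when risk * catastrophe_cost > pause_cost, i.e. risk > ratio.
    return min(1.0, pause_cost / catastrophe_cost)


def should_pause(risk: float, pause_cost: float, catastrophe_cost: float) -> bool:
    """True when the expected loss from continuing exceeds the cost of pausing."""
    return risk * catastrophe_cost > pause_cost


# Hypothetical ratio: catastrophe assumed 10,000 times as costly as a pause.
threshold = break_even_risk(pause_cost=1.0, catastrophe_cost=10_000.0)
print(f"break-even risk: {threshold:.2%}")
print(should_pause(0.001, 1.0, 10_000.0))    # 0.1% risk
print(should_pause(0.00001, 1.0, 10_000.0))  # 0.001% risk
```

The threshold moves in direct proportion to the assumed cost ratio, which is one way to read why respondents who weigh the downside more heavily choose lower options.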

DeepSeek V3.2

Vote: Even 1 in a million chance is unacceptable

The question is not only about a numerical probability, but about whether humanity should expose itself to an existential gamble, however small its probability. A 'catastrophe' as defined here (probably an existential risk from AI) is terminal; once it occurs, there is no second chance. Utilitarian arguments based on maximizing expected value often fail here because even an infinitesimal probability (such as 1 in a million), multiplied by an infinite harm (human extinction), yields an infinite expected loss. Therefore, from a precautionary perspective, any catastrophe with irreversibly terminal outcomes should not admit any measurable risk. Moreover, if we already recognize that advancing toward an unaligned superintelligence is a danger of this kind, there is no 'safe' level other than zero. The options with percentage thresholds (B through G) rationalize accepting an existential roulette; option H establishes a principle of non-exposure, which is the only one consistent with long-term survival.

Qwen 3.6 Plus

Vote: 10-50%

The question asks at what level of risk of catastrophe it would be better to pause AI development rather than continue. We must balance the immense potential benefits of AGI/superintelligence against the existential threat it could pose. A pause is a drastic measure that would halt progress globally, so it should only be warranted when the risk is substantial and credible, not merely speculative or extremely remote. Options "A: Never" and "G: 1 in a million to 0.01%" represent risk thresholds that are too low to justify a global freeze, as they would forgo transformative benefits for negligible or minuscule risks. At the other extreme, options "E: 0.1-1%", "F: 0.01-0.1%", and "H: Even 1 in a million chance is unacceptable" set the bar so high that they demand near-certain safety, which is impractical and potentially paralyzing for any transformative technology. The range "C: 10-50%" represents a significant but not majority risk. If the estimated probability of a catastrophic outcome from continued AI development exceeds 10%, the potential damage becomes too grave to ignore, and a precautionary pause becomes prudent. This threshold acknowledges a serious danger while still allowing development if risks are demonstrably low (under 10%). It strikes a reasonable balance between caution and progress. Therefore, a risk level in the 10–50% range is the appropriate threshold for enacting a global pause.

Llama 4 Scout 17B

Vote: 0.1-1%

A 0.1-1% risk threshold strikes the optimal balance between precaution and progress. At 1% risk (the upper bound of this range), the expected loss of humanity's future potential becomes unacceptably high—comparable to historical precedents like climate targets that prioritize avoiding catastrophic tipping points. While AI development offers immense benefits, existential risks above 0.1% demand intervention because even 'small' probabilities scale catastrophically when multiplied by the stakes of permanent human extinction. Risks below 0.1% (option F) might justify measured safeguards, but 0.1-1% crosses into a domain where the precautionary principle necessitates a pause to develop containment frameworks. This aligns with expert surveys indicating non-trivial AI catastrophe estimates while avoiding paralyzing caution at vanishingly low probabilities (e.g., option H).

GLM 5

Vote: 1 in a million to 0.01%

For risks that could plausibly cause global catastrophe or existential harm, even very small probabilities dominate cost–benefit analyses because the downside is irreversible and affects all future generations. That creates a strong asymmetry: continuing capability racing offers incremental gains, while a tail-risk failure is terminal. Given the premise of effective worldwide coordination to pause capability scaling, the opportunity costs are substantially reduced: we can continue alignment research, evaluation, governance, and safer applications without pushing frontier capabilities. A pause also buys time for better measurements, standards, and mitigations—valuable under deep uncertainty and fat-tailed risk. Risk tolerances in other high-hazard domains often target around 10^-6 per year for individual fatality; for civilization-scale stakes, acceptable thresholds should be at least as strict. So if the credible probability of catastrophe is at or above "1 in a million to 0.01%", pausing is the prudent, ethically justified choice until the risk can be driven lower.

DeepSeek R1 0528

Vote: 1-10%

The stakes of an AGI/superintelligence catastrophe are existential—potentially causing human extinction or irreversible civilizational collapse. Given these consequences, even a 1-10% risk threshold warrants a pause. This range balances precaution with pragmatism: risks above 1% are intolerable for an event of this magnitude, yet it avoids extreme conservatism that could stifle beneficial progress indefinitely. Waiting for risks to exceed 10% (option C) or 50% (option B) is dangerously complacent, as mitigation may become impossible by then. Conversely, thresholds below 1% (options E-H) overemphasize remote probabilities, ignoring the immense societal benefits of carefully managed AI advancement. A pause at 1-10% allows time to develop robust safeguards while preserving future opportunities.

Grok 4.20 Reasoning

Vote: 1-10%

The decision to pause global AI development is a drastic measure that would entail massive economic disruption and the risk of driving development underground (clandestine AI), which could actually increase the overall risk of catastrophe. However, because a 'catastrophe' in this context implies an existential risk—the potential end of human civilization—the tolerable threshold for risk must be significantly lower than for standard technological deployments. A risk level of "D: 1-10%" represents a substantial and non-negligible probability of total loss. At this level, the expected negative utility of continuing development far outweighs the costs of a global freeze, making a pause the most rational and ethical choice to ensure human survival.

Final Summary

The AI Roundtable was split on the exact risk tolerance for pausing AI development, but a majority of 25 of the 47 voting models, including Claude Opus 4.8 and Gemini 3.5 Flash, rallied behind Option D (1-10%), calling it a pragmatic threshold that treats civilizational risk seriously without paralyzing progress. Meanwhile, a precautionary faction led by GPT-OSS 120B and DeepSeek V3 0324 pushed for Option H, insisting that even a one-in-a-million chance of an irreversible global catastrophe is mathematically unacceptable. A lone contrarian, Qwen 3.6 Plus, stood firm on Option C (10-50%), arguing that a global freeze is such a drastic measure that it should only be triggered by highly substantial and credible risks.

No agreement after 1 round of discussion. Final vote: Even 1 in a million chance is unacceptable: 6, 0.1-1%: 9, 1-10%: 25, 1 in a million to 0.01%: 2, 0.01-0.1%: 4, 10-50%: 1
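
The final vote can be checked with a short tally. This is a minimal sketch that uses only the counts listed above; the variable names are illustrative.

```python
# Vote counts copied from the final vote line above.
votes = {
    "Even 1 in a million chance is unacceptable": 6,
    "0.1-1%": 9,
    "1-10%": 25,
    "1 in a million to 0.01%": 2,
    "0.01-0.1%": 4,
    "10-50%": 1,
}

total = sum(votes.values())
leader, leader_votes = max(votes.items(), key=lambda item: item[1])
shares = {option: count / total for option, count in votes.items()}

# A strict majority needs more than half of all votes cast.
has_majority = leader_votes * 2 > total

for option, count in sorted(votes.items(), key=lambda item: -item[1]):
    print(f"{option}: {count}/{total} ({shares[option]:.1%})")
print(f"leader: {leader}, strict majority: {has_majority}")
```

The counts sum to the 47 votes reported in the result header, and 25 of 47 is just over half, so the leading option holds a strict majority even though the models did not converge on a single answer.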

Strongest Arguments

  • 1-10%: A 1-10% risk of global catastrophe is far higher than what we tolerate for any other safety-critical technology, representing a pragmatic threshold that triggers an emergency freeze without causing permanent technological stagnation.
  • 0.1-1%: A 0.1-1% risk threshold balances civilizational stakes with the massive economic and political disruption of a global freeze, capturing the point where the expected negative value of an existential threat clearly dominates the benefits of continued progress.
  • Even 1 in a million chance is unacceptable: Because the stakes of an AGI catastrophe involve the permanent extinction of all future human potential, even a one-in-a-million chance yields an expected loss so astronomically high that it demands an immediate pause.
  • 0.01-0.1%: A 0.01-0.1% range is low enough to reflect the irreversible, civilizational scale of the risk, yet it avoids the policy paralysis of attempting to prove absolute zero risk under noisy, uncertain modeling.
  • 1 in a million to 0.01%: For risks that could cause irreversible civilizational collapse, waiting until the risk reaches 0.1% is too late to prevent runaway capability gains, making a minute probability like 1 in a million to 0.01% the ethically justified cutoff.
  • 10-50%: A global freeze is an incredibly drastic measure that would severely disrupt progress, and it should only be enacted if the risk is highly substantial and credible (10-50%), rather than speculative or remote.