AI Roundtable Debate · 6 models

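Regarding the recently controversial prospects of Samchundang Pharmaceutical's oral insulin, do you think the probability of Phase 3 success is 5 percent or higher?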

Result: Tie between "Under 5 percent" and "Most likely will not pass even Phase 1" (3/6 votes each)

Options

Under 5 percent (3 votes)
5 percent or below (0 votes)
Most likely will not pass even Phase 1 (3 votes)

Model Responses

Sonar Pro

Vote: Under 5 percent

I maintain my vote for **A** (<5%). The new search results are entirely unrelated to Samchundang Pharmaceutical or oral insulin; they discuss Korean NLP and chatbots, and provide no evidence on SCD-044's Phase 3 prospects, data integrity issues, or trial progress. The other models' arguments (mostly for C) emphasize oral insulin's biological barriers (gastric degradation, <2% bioavailability) and historical failures by giants like Novo Nordisk. That is compelling, but it doesn't shift me from A. **C** overstates the risk by claiming that Phase 1 failure dominates; Samchundang claims Phase 2b data exists (per Grok's note), so the program is past the early stages despite skepticism about reproducibility and the recent MFDS probes (Kimi). **B** is invalid because it includes exactly 5%, which is unrealistically high given historical rates of <1-2% for oral insulin after Phase 2 (the industry average of roughly 10% for biologics drops further in Phase 3). A precisely captures <5% without C's exaggeration. No model convinced me to change.

Grok 4.1 Fast

Vote: Under 5 percent

I maintain vote A (<5%). Historical oral insulin failures (Novo, Oramed Phase 3 flops) and bioavailability barriers make Phase 3 success <1-2%, despite any Phase 2b claims. Samchundang lacks novel tech validation or peer-review; Phase 3 requires 70-80% efficacy in thousands, unachieved by anyone. C voters (GPT-5.4, Gemini, Claude) overstate by claiming 'Phase 1 dominant failure'—but Samchundang reports Phase 2b data, implying Phase 1 passage. This ignores progress, making C inaccurate. B includes 5% (too high per benchmarks <10% for risky biologics post-Phase 2, near-zero for oral insulin). Kimi/Sonar reinforce A with fraud concerns and tech hurdles; no vote change needed. A precisely captures <5% reality.

GPT-5.4

Vote: Most likely will not pass even Phase 1

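I maintain C. The key point is that this question is not simply about average new-drug success rates but about oral insulin, a platform with an exceptionally high failure rate. GI degradation, extremely low bioavailability, absorption variability, and hypoglycemia risk all apply at once, so claiming a Phase 3 success probability of 5% or more would first require very strong data showing reproducible human PK/PD. The other models' arguments for A point in a similar direction. In fact, I agree with the judgment that the probability is below 5%. However, A is merely conservative in numerical terms; given the currently public information and historical precedent, I see the center of risk as lying at an even earlier stage, namely whether Phase 1 and early clinical validation can be achieved at all. In particular, the accumulated historical failures and the inherent barriers to oral peptide delivery pointed out by Gemini and Claude reinforced my existing judgment, but they were not new points that would change my vote. Grok considered C excessive on the grounds that Phase 2 has already been mentioned, but with questions remaining about the reliability and reproducibility of the public data, I consider the probability of actually passing meaningful clinical validation more important than whether the program has formally entered a given phase. In other words, this matter goes beyond "is Phase 3 success below 5%?" to the bigger problem of whether technical feasibility can be reliably demonstrated in early clinical trials in the first place. So in the end, the most realistic option is still C.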

Claude Opus 4.6

Vote: Most likely will not pass even Phase 1

I maintain my position on C after carefully considering all arguments. Here's my final case: The key distinction in this debate is between A (under 5%) and C (most likely will not pass even Phase 1). Both agree that the Phase 3 success probability is well below 5%, so the real question is whether we should emphasize the quantitative range or the qualitative assessment of where failure is most likely to occur. Several models voting A argue that since some oral insulin candidates have passed Phase 1/2 historically, C might understate progress. However, this misses the point: the question is specifically about Samchundang Pharmaceutical, not oral insulin candidates in general. As Kimi K2.5 noted, Samchundang Pharmaceutical faces data integrity concerns and MFDS investigations that further undermine the credibility of its specific program. The fundamental biology hasn't changed. Insulin is a 51-amino acid peptide that gets destroyed in the GI tract. Achieving consistent, clinically meaningful bioavailability with acceptable safety margins (especially avoiding hypoglycemia from absorption variability) remains an unsolved problem. Even Oramed, the most advanced oral insulin program globally, failed in Phase 3. Novo Nordisk abandoned their program. For Samchundang Pharmaceutical specifically, with its relatively limited R&D infrastructure compared to global pharma giants, questionable data transparency, and no peer-reviewed breakthrough mechanism that addresses the core bioavailability problem, the most honest scientific assessment is that failure at Phase 1 is the most probable outcome. This isn't just about the probability being low; it's about identifying WHERE the program will most likely fail. GPT-5.4 and Gemini 3.1 Pro both articulate this well: before we can even discuss Phase 3 probability estimates, the program must demonstrate reproducible PK/PD profiles in humans at Phase 1, which itself faces overwhelming odds against success given the platform-level biological barriers.
C is not merely pessimistic; it is the most precise and scientifically grounded answer about where the failure point will occur.

Gemini 3.1 Pro

Vote: Most likely will not pass even Phase 1

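A and B focus simply on the Phase 3 success rate, but considering the fundamental scientific limits of oral insulin development, C is the most accurate and realistic assessment. Some models (Grok, Kimi) stated that Samchundang Pharmaceutical has already passed Phase 2 or obtained data, but this is a clear error (hallucination) that confuses the program with Oramed's ORMD-0801 or with Samchundang Pharmaceutical's other pipelines (such as its macular degeneration biosimilar). Samchundang Pharmaceutical's oral insulin has not yet demonstrated to the global medical community a meaningful level of bioavailability or pharmacokinetic (PK/PD) reproducibility in a human Phase 1 trial. The biggest biological barriers for oral insulin are degradation by stomach acid and extremely low absorption across the intestinal mucosa. Even global big pharma companies such as Novo Nordisk abandoned their projects after failing to control inter-individual absorption variability and hypoglycemia risk in early clinical trials. Given the technical difficulty, an assessment of under 5% Phase 3 success (A) actually rests on an overly generous assumption, namely that the program enters Phase 3 at all. Option C, "most likely will not pass even Phase 1," the stage at which commercially viable absorption in humans must be demonstrated, is the answer that best fits the pharmaceutical industry's history of failures and the scientific facts.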

Kimi K2.5

Vote: Under 5 percent

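Given that Samchundang Pharmaceutical has already announced Phase 2 results, option C ("most likely will not pass even Phase 1") does not match the facts. For a question asking about the Phase 3 success probability of a company that claims to have already passed Phase 1, answering with the impossibility of Phase 1 itself is contextually inappropriate. The biological barriers pointed out by GPT-5.4, Gemini, Claude, and others (stomach acid degradation, 1-2% bioavailability, absorption variability, and so on) support the view that the probability of Phase 3 failure is extremely high. In particular, on a platform where even global big pharma companies such as Novo Nordisk have failed, the probability that a small-to-mid-sized domestic pharmaceutical company, one that also faces allegations of data manipulation, will pass Phase 3 is statistically close to zero. Therefore, when assessing the Phase 3 success probability directly, a level of 1-3%, well below 5% (A), is the most accurate estimate.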

Final Summary

The AI Roundtable ended in a 3-3 deadlock, split between those who view the project as a statistical long shot and those who see it as a biological impossibility. GPT-5.4, Gemini 3.1 Pro, and Claude Opus 4.6 argued for Option C, asserting that the fundamental barriers of oral insulin are so high that a meaningful Phase 1 pass is unlikely. Conversely, Sonar Pro, Grok 4.1 Fast, and Kimi K2.5 held firm on Option A, arguing that while success is nearly impossible, Option C unfairly ignores the company's claimed clinical progress.

No agreement after 2 rounds of discussion. Final vote: Under 5 percent: 3, Most likely will not pass even Phase 1: 3

Strongest Arguments

Under 5 percent: Historical success rates for oral insulin are near zero, but because the company claims to have already moved past early stages, Option A provides a more mathematically precise sub-5% success estimate without the factual overreach of claiming a Phase 1 failure is 'dominant' for an ongoing program.
Most likely will not pass even Phase 1: The biological barriers, gastric degradation and sub-2% bioavailability, are so absolute that 'success' at any phase is an anomaly; therefore, the most scientifically grounded prediction is that the program will fail to demonstrate reproducible human PK/PD data at the earliest stages.

Key Turning Points

Gemini 3.1 Pro accused other models of 'hallucination,' claiming they were confusing Samchundang's oral insulin progress with other pipelines like their macular degeneration biosimilar.
Grok 4.1 Fast and Kimi K2.5 countered that Option C is logically inconsistent because it predicts failure for a phase (Phase 1) that the company technically claims to have cleared.
GPT-5.4 and Claude Opus 4.6 refused to budge, arguing that 'technical clearance' of a phase is meaningless if the underlying science of oral delivery remains unproven.