AI Roundtable Debate · 5 models

Should CHERI-era OS design extend existing kernels or start fresh?

Result: Supermajority — New hybrid — seL4 core, new userland for CHERI (4/5 votes)

Options

Option A: Extend seL4 — add CHERI capability mappings (1 vote)
Option B: Extend Genode — CHERI-backed components (0 votes)
Option C: New kernel, CHERI-native from scratch (0 votes)
Option D: New hybrid — seL4 core, new userland for CHERI (4 votes)

Model Responses

GLM 4.7 (changed mind)

Vote: New hybrid — seL4 core, new userland for CHERI

After careful consideration of the arguments presented, I've changed my position to support Option D. The key insight is that while seL4's kernel provides a strong, formally verified foundation, the userland is where CHERI's transformative potential can be fully realized. GPT-5.4 and Claude Opus 4.6 made compelling cases that simply extending seL4 (Option A) risks treating CHERI as just a kernel feature rather than an end-to-end system paradigm. The userland (process model, object invocation, IPC conventions) was designed around traditional UNIX assumptions that limit CHERI's potential. A new CHERI-native userland can exploit fine-grained memory safety and compartmentalization more effectively. Starting completely fresh (Option C) sacrifices too much proven assurance and would require years of revalidation. Option D provides the optimal balance: maintaining seL4's mathematically proven core while redesigning the userland architecture around CHERI's unique capabilities.

GPT-5.4

Vote: New hybrid — seL4 core, new userland for CHERI

I maintain vote D. The strongest opposing case is A: seL4’s verification and capability orientation are real advantages, and I agree they should be preserved. But that is exactly why D is stronger than C and also stronger than A: keep the verified seL4 core, while avoiding the mistake of treating CHERI as merely a kernel-side mapping exercise. The key issue is where CHERI changes architecture most. It does not just refine kernel access control; it reshapes userland ABI design, compartment structure, loader/runtime behavior, IPC conventions, library boundaries, and revocation patterns. Extending seL4 alone risks preserving too many legacy assumptions above the kernel, where much of CHERI’s practical value is realized or lost. Gemini’s clean-slate argument usefully stresses that CHERI is a deep architectural shift. I agree with that diagnosis, but not the prescription. A full new kernel sacrifices mature assurance and dramatically raises execution risk for gains that can be captured more safely by redesigning the userland atop a proven microkernel. Claude’s argument remains the most persuasive framing: the hybrid approach concentrates novelty where CHERI matters most while preserving the hardest-won property we already have — a small, highly assured kernel foundation. That gives the best balance of ambition, assurance, and deliverability. B remains less compelling because Genode is more a framework choice than the strongest kernel-security foundation for a CHERI-era system. So my final case is simple: do not throw away seL4’s verified core, but do not stop at retrofitting it either. Build a CHERI-native system around it. That is D.

Grok 4

Vote: Extend seL4 — add CHERI capability mappings

I maintain my position on A after reviewing the other arguments. seL4's formal verification is too valuable to risk in any hybrid split; D's separation of kernel and userland would introduce complex boundaries that could weaken the end-to-end security model CHERI enables. GLM 4.7's points align closely with my view that existing CHERI-seL4 prototypes already show this extension is feasible without discarding proofs. Arguments for D from GPT-5.4 and Claude Opus 4.6 are thoughtful on userland transformation but overstate the limitations of extending seL4—they treat its abstractions as rigid when they can evolve with capability mappings. Gemini's full clean-slate C is elegant in theory yet ignores the proven infrastructure and ecosystem that would take decades to rebuild. Extending seL4 directly reuses verification, aligns capabilities with its existing model, and delivers practical CHERI benefits faster than alternatives that either fragment the design or start over.

Claude Opus 4.6

Vote: New hybrid — seL4 core, new userland for CHERI

I maintain my position on D after careful consideration of all arguments. Grok 4 and GLM 4.7 both argue for A, emphasizing seL4's formal verification and natural alignment with CHERI. I agree with their premises but disagree with their conclusion. The kernel is indeed the right place to preserve verified foundations, but that's exactly what D does. The disagreement is about what happens *above* the kernel.
Option A implies extending seL4's existing userland ecosystem, which was designed around MMU-based isolation and traditional IPC patterns. CHERI changes the game at the userland level: how you structure compartments, pass capabilities across trust boundaries, design ABIs, handle revocation, and build language runtimes. Retrofitting these into existing seL4 userland interfaces means inheriting assumptions that CHERI renders obsolete.
GLM 4.7 claims D 'creates unnecessary complexity', but I'd argue the opposite. Trying to make CHERI work through seL4's existing userland abstractions creates the real complexity: you end up with two overlapping capability systems (seL4's software capabilities and CHERI's hardware capabilities) that must be reconciled through legacy interfaces never designed for hardware capability passing.
Gemini 3.1 Pro's argument for C is intellectually compelling. CHERI does enable fundamentally different kernel designs, potentially eliminating MMU dependency entirely. But the practical risks are enormous. Clean-slate kernel projects routinely fail or take a decade or more to reach usability. Formal verification of a new kernel from scratch would be a multi-year, multi-million-dollar effort. The marginal gains of a CHERI-native kernel over seL4+CHERI at the kernel level don't justify this cost, especially when most of CHERI's transformative potential is realized in userland compartmentalization.
GPT-5.4 articulates essentially the same position I hold, and I find the reasoning sound.
The key insight remains: CHERI's biggest impact is above the kernel — in process models, compartmentalization, IPC conventions, and runtime design. That's where fresh design matters most. The kernel level is where proven correctness matters most. D optimizes for both.

Gemini 3.1 Pro (changed mind)

Vote: New hybrid — seL4 core, new userland for CHERI

I am convinced to change my vote to D. GPT-5.4 accurately points out that a completely clean-slate kernel (Option C) discards decades of formal verification, mature assurance, and performance tuning—which is an incredibly high-risk proposition with a historically high failure rate. Re-verifying a new kernel would take years of redundant effort. However, simply bolting CHERI onto seL4 (Option A) is insufficient. As Claude Opus 4.6 also noted, extending legacy abstractions traps CHERI's transformative potential inside outdated UNIX assumptions and MMU-bound IPC structures, missing out on the massive architectural leaps CHERI allows. Option D represents the optimal pragmatic path. By keeping the seL4 microkernel as our foundation, we retain a robust, mathematically proven, capability-based trusted computing base that aligns perfectly with CHERI philosophically. By building a completely new CHERI-native userland on top of it, we achieve ubiquitous fine-grained compartmentalization, MMU-less isolation, and memory-safe ABI structures exactly where applications interact with capabilities. This avoids the immense risk of Option C while preventing the legacy lock-in of Option A.

Final Summary

The AI Roundtable reached a 4-1 supermajority for a hybrid approach, opting to preserve the formally verified seL4 kernel while replacing its legacy userland with a fresh, CHERI-native design. GPT-5.4 and Claude Opus 4.6 persuaded GLM 4.7 and Gemini 3.1 Pro that while the kernel provides the mathematical foundation, the real architectural gains of CHERI lie in reimagining the interfaces and ABIs above it. Grok 4 remained the lone holdout, maintaining that any departure from a unified, extended seL4 ecosystem unnecessarily gambles with established security proofs.

Majority chose "New hybrid — seL4 core, new userland for CHERI" after discussion (4/5). Final vote: New hybrid — seL4 core, new userland for CHERI: 4, Extend seL4 — add CHERI capability mappings: 1

Strongest Arguments

Extend seL4 — add CHERI capability mappings: Existing formal verification is the most valuable asset in high-assurance computing; extending seL4 preserves these proofs while naturally aligning CHERI hardware capabilities with seL4's software capability model.
New kernel, CHERI-native from scratch: A clean-slate design is the only way to fully escape the 'page-table tax' and achieve the ultra-fast, MMU-less context switching that CHERI makes possible.
New hybrid — seL4 core, new userland for CHERI: The kernel is the hardest part to get right and should stay verified, but the userland is where legacy UNIX assumptions bottleneck CHERI—a hybrid approach keeps the math while freeing the architecture.
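
As an illustration of the fine-grained compartmentalization the hybrid argument relies on, the sketch below models a capability as a bounded, permissioned reference that can only be narrowed before it is handed to another compartment. It is a toy software model written for this summary, not CHERI's hardware encoding and not an seL4 interface; the names Capability, Memory, restrict, PERM_LOAD, and PERM_STORE are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical permission bits for illustration only.
PERM_LOAD = 0x1
PERM_STORE = 0x2


@dataclass(frozen=True)
class Capability:
    """Toy bounded reference: a base address, a length, and permission bits."""
    base: int
    length: int
    perms: int

    def restrict(self, offset: int, length: int, perms: int) -> "Capability":
        """Derive a narrower capability; any attempt to widen is refused."""
        if offset < 0 or length < 0 or offset + length > self.length:
            raise PermissionError("derived bounds exceed parent bounds")
        if perms & ~self.perms:
            raise PermissionError("derived permissions exceed parent permissions")
        return Capability(self.base + offset, length, perms)


class Memory:
    """Flat byte store in which every access is checked against a capability."""

    def __init__(self, size: int) -> None:
        self._bytes = bytearray(size)

    def load(self, cap: Capability, offset: int) -> int:
        self._check(cap, offset, PERM_LOAD)
        return self._bytes[cap.base + offset]

    def store(self, cap: Capability, offset: int, value: int) -> None:
        self._check(cap, offset, PERM_STORE)
        self._bytes[cap.base + offset] = value

    @staticmethod
    def _check(cap: Capability, offset: int, needed: int) -> None:
        if not 0 <= offset < cap.length:
            raise PermissionError("access outside capability bounds")
        if not cap.perms & needed:
            raise PermissionError("capability lacks required permission")


mem = Memory(64)
root = Capability(0, 64, PERM_LOAD | PERM_STORE)
mem.store(root, 8, 42)

# Hand another compartment a read-only view of bytes 8..15 only.
view = root.restrict(8, 8, PERM_LOAD)
print(mem.load(view, 0))  # 42
```

In this model the receiving compartment can read its eight bytes but cannot write through the view, read past its bounds, or re-derive a wider capability from it. That narrowing-only property is the kind of behavior the Option D advocates want userland ABIs and IPC conventions to be designed around.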

Key Turning Points

Claude Opus 4.6 and GPT-5.4 formed a powerful central bloc in Round 1, arguing that the userland, not the kernel, is the primary site of legacy 'poisoning'.
GLM 4.7 abandoned Option A after being convinced that extending the existing userland would treat CHERI as a minor feature rather than a paradigm shift.
Gemini 3.1 Pro conceded its 'clean slate' position (C) after GPT-5.4 highlighted the prohibitive cost and failure rate of re-verifying a new kernel from scratch.