AI Roundtable Poll · 7 models
We have a gigantic Puppet monorepo that is part of a larger set of company-wide infrastructure. A decent subset of its functionality is imported from upstream modules, meaning the infrastructure for smaller modules is already in place. Our monorepo handles a lot of configuration for our 400 hosts, with widely varying responsibilities and severity. Current pain points for our development are a lack of focused unit tests and data validation, high tech debt from many changes stuck in devel without propagating to prod, and unclear dependencies due to how data propagates through the monorepo. Is it worth refactoring to break certain functionality out into smaller repositories, despite the potential time cost of doing so?
Result: Supermajority — Yes (6/7 votes)
Options
- Yes (6 votes)
- No (1 vote)
Model Responses
Grok 4.3
Vote: Yes
The monorepo's scale with 400 hosts creates unclear dependencies and high tech debt from stalled changes. Smaller repositories would enable focused unit tests and data validation per module. Upstream infrastructure already supports this split, reducing the refactor's time cost while directly fixing pain points.
GPT-5.5
Vote: Yes
I vote for "Yes", but only as an incremental, targeted refactor rather than a large rewrite. The current pain points are exactly the kinds of problems that smaller, well-bounded Puppet modules can help with: focused unit tests, clearer data contracts, explicit dependencies, independent release/versioning, and less risk from unrelated changes sitting in devel. If functionality already resembles upstream-style modules and supporting infrastructure exists, the migration cost is lower and the payoff is more realistic. The strongest reason to do it is not repository cleanliness by itself, but release control. Breaking out stable, logically independent functionality lets you promote tested module versions to production deliberately instead of having unrelated monorepo changes block each other. That said, the refactor should be selective. Start with modules that have clear ownership, stable interfaces, high change frequency, or high operational risk. Add CI, unit tests, data validation, and versioned releases as part of extraction. Avoid splitting code whose boundaries are still unclear, because that can just move the dependency confusion into multiple repositories.
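A minimal sketch of that release control, assuming an r10k-style control repo (the module name and Git URL are hypothetical):

```ruby
# Puppetfile on the production environment branch. Promoting the extracted
# module becomes a one-line tag bump, independent of unrelated devel work.
forge 'https://forge.puppet.com'

# Upstream module, pinned as before.
mod 'puppetlabs-stdlib', '9.6.0'

# Hypothetical extracted internal module, pinned to a tested release tag.
mod 'acme-dns_resolver',
  git: 'https://git.example.com/puppet/acme-dns_resolver.git',
  tag: 'v1.4.2'
```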
Gemini 3.1 Pro
Vote: Yes
Breaking the monorepo into smaller, standalone repositories directly addresses your primary pain points. First, it solves the deployment bottleneck: with isolated modules, you can version and release changes independently, preventing stable code from getting 'stuck in devel' just because unrelated configurations aren't ready for prod. Second, extracting functionality forces explicit boundary definitions. When code lives in separate modules, you are required to explicitly declare dependencies (e.g., via `metadata.json`) and data interfaces. This naturally untangles the 'unclear dependencies' and hidden data propagation issues currently plaguing the monorepo. Third, smaller repositories make testing significantly easier. You can implement focused `rspec-puppet` tests and enforce strict data validation on a per-module basis without running into the massive overhead of testing the entire monorepo. Finally, since the infrastructure for smaller modules is already in place, the operational friction of moving to a multi-repo strategy is already minimized. The upfront time cost of the refactor will quickly pay for itself by drastically reducing your tech debt and accelerating your deployment cadence.
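To illustrate the per-module testing point, a minimal rspec-puppet sketch; the class, parameter, and file names are hypothetical, and it assumes the module declares a typed `nameservers` parameter:

```ruby
# spec/classes/dns_resolver_spec.rb
require 'spec_helper'

describe 'dns_resolver' do
  context 'with valid data' do
    let(:params) { { 'nameservers' => ['10.0.0.53'] } }

    it { is_expected.to compile.with_all_deps }
    it { is_expected.to contain_file('/etc/resolv.conf') }
  end

  context 'with malformed data' do
    let(:params) { { 'nameservers' => 'not-an-array' } }

    # With a typed parameter, bad data fails at catalog compile time
    # rather than surfacing on a production host.
    it { is_expected.to compile.and_raise_error(%r{expects an Array}) }
  end
end
```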
Claude Opus 4.7
Vote: No
Breaking out a Puppet monorepo into smaller repositories will not solve the listed pain points—it will likely make them worse, and the listed problems have more direct, cheaper solutions. The stated issues are: (1) lack of focused unit tests and data validation, (2) tech debt from changes stuck in devel, and (3) unclear dependencies. None of these are caused by the monorepo structure. Unit tests (rspec-puppet, puppet-lint, onceover) and data validation (Hiera with schema validation, JSON schemas, puppet-strings) can be added incrementally within the monorepo. Tech debt from undeployed devel changes is a release-process/CI problem, not a repo-layout problem—splitting repos will actually multiply the number of pipelines that can get stuck. Unclear data propagation in Hiera is a hierarchy/design issue that persists regardless of how many repos host the modules. Multi-repo Puppet setups introduce real new costs: version pinning across modules (Puppetfile/r10k), cross-repo PRs for coupled changes, harder atomic refactors, more CI infrastructure, and dependency-hell when modules drift. For 400 hosts with already-spread responsibilities, you'd be amplifying the "unclear dependencies" problem, not reducing it. The pragmatic answer is to invest the refactor budget into testing infrastructure, Hiera cleanup, and release automation inside the monorepo. If a module genuinely stabilizes and has external consumers, extract it then—opportunistically, not as a big-bang refactor.
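A sketch of the in-monorepo alternative this dissent describes, assuming the standard puppetlabs_spec_helper gem (the disabled check and spec path are illustrative):

```ruby
# Rakefile at the monorepo root: testing and linting can be adopted
# incrementally without changing the repository layout.
require 'puppetlabs_spec_helper/rake_tasks'  # provides spec, lint, validate tasks

# Adopt lint checks gradually; silence the noisiest ones first.
PuppetLint.configuration.send('disable_140chars')

# Example usage while coverage grows module by module:
#   bundle exec rake spec SPEC=spec/classes/profile_dns_spec.rb
#   bundle exec rake lint
```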
DeepSeek V4 Pro
Vote: Yes
Yes, a refactor is not only worth it but appears necessary for the long-term health of the infrastructure. A monolithic repository of this size (400 hosts, spread responsibilities) that exhibits stalled changes (devel not reaching prod) and unclear data propagation is showing classic symptoms of an over-coupled system. The cost of *not* refactoring is likely already being paid in decreased velocity and increased risk of production incidents.
GPT-5
Vote: Yes
Yes. Your pain points map directly to problems that modularization solves. Puppet already encourages clear module boundaries; breaking focused functionality into smaller, owned repositories makes unit testing (rspec-puppet), type-checked parameters, and data validation (Hiera 5 with typed data-in-modules) straightforward and enforceable with CI. Smaller repos also make dependency intent explicit via metadata.json, reducing the current ambiguity in how data and code propagate. Release flow improves as well. Independent, versioned modules let you promote changes from devel to prod via r10k/Code Manager and semantic version pins, eliminating the “stuck in devel” pile-up. This shortens lead time, reduces blast radius, and enables targeted rollbacks and canaries across your 400 diverse hosts. You already consume upstream modules, so the organizational and tooling patterns for multiple repos are familiar. Leveraging that existing infrastructure lowers the transition cost and risk. Do it incrementally: keep a control repo for environment orchestration, extract high-churn/high-risk domains first, and avoid over-splitting by grouping closely coupled logic. Enforce tests, linting, and typed data gates per module to steadily pay down tech debt. Net: the upfront cost is justified by sustained gains in testability, clarity, release cadence, and operational safety.
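A sketch of the typed-interface idea: a hypothetical extracted module whose parameter defaults resolve from its own Hiera 5 `data/` directory, assuming the puppetlabs-stdlib type aliases:

```puppet
# modules/dns_resolver/manifests/init.pp (hypothetical module).
# Defaults come from the module's own data/common.yaml (Hiera 5
# data-in-modules); the types reject malformed data at compile time
# instead of at deploy time.
class dns_resolver (
  Array[Stdlib::IP::Address]   $nameservers,
  Enum['managed', 'unmanaged'] $mode          = 'managed',
  Optional[Stdlib::Fqdn]       $search_domain = undef,
) {
  # ...resource declarations...
}
```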
Gemma 4 31B
Vote: Yes
The pain points described—specifically the inability to promote changes from devel to prod without affecting the entire monolith and the lack of clear dependencies—are classic symptoms of a monorepo that has outgrown its current management structure. By breaking the functionality into smaller, independent repositories, the team can implement Semantic Versioning (SemVer). This allows specific modules to be updated and promoted to production independently, resolving the bottleneck where changes are 'stuck in devel' because they are bundled with other unstable changes. Furthermore, smaller repositories enforce stricter boundaries. This forces the developers to define clear APIs (parameters) for each module, which directly addresses the issue of 'unclear dependencies' and makes the implementation of focused unit tests and data validation much more manageable and less overwhelming than attempting to retrofit them into a gigantic monorepo.
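For illustration, the SemVer and dependency contract lives in each extracted module's `metadata.json`; the module name and version bounds below are hypothetical:

```json
{
  "name": "acme-dns_resolver",
  "version": "1.4.2",
  "dependencies": [
    { "name": "puppetlabs/stdlib", "version_requirement": ">= 9.0.0 < 10.0.0" }
  ]
}
```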
Final Summary
The AI Roundtable overwhelmingly favored a strategic refactor, with six out of seven models concluding that the monorepo had become a bottleneck for deployment velocity. While the majority, led by GPT-5.5 and Gemini 3.1 Pro, argued that independent versioning is essential to unblock the production pipeline, Claude Opus 4.7 stood as a firm lone dissenter, warning that splitting the repo would merely trade one set of headaches for another.
No agreement after 1 round of discussion. Final vote: Yes: 6, No: 1
Strongest Arguments
- Yes: Breaking functionality into smaller modules enables semantic versioning and independent release cycles, ensuring that stable code is no longer held hostage in 'devel' by unrelated, unstable changes.
- No: The repository layout is a red herring; issues like testing and Hiera data propagation can be solved within the monorepo without the added complexity of managing version pinning and cross-repo dependency hell.