AI Roundtable Open Debate · 6 models

What is the ideal architecture for enemy AI in a tactical FPS game? This may combine several approaches if needed.

Answer: The ideal architecture is a layered hybrid stack: spatial reasoning at the base, GOAP/Utility for intent, and Behavior Trees for execution. (All 6 models agreed)

Model Responses

Sonar Pro

Answer: The ideal architecture for enemy AI in a tactical FPS game is a layered hybrid combining Goal-Oriented Action Planning (GOAP) for individual tactical decision-making, tactical position evaluation systems for spatial reasoning, Behavior Trees for reliable action execution, and a squad-level blackboard for coordination. The consensus among the peer models refines and strengthens my initial stance.

After reviewing the compelling arguments from Grok 4.20, Gemini 3.1 Pro, Kimi K2.6, GPT-5.5, and Claude Opus 4.7, I am convinced to evolve my position rather than maintain it unchanged. All models advocate for hybrid layered architectures, emphasizing the limitations of any single technique in delivering coordinated, reactive, and believable tactical behavior. This aligns closely with my prior focus on GOAP and dynamic position evaluation but expands it with proven squad coordination and execution layers.

What particularly convinced me is the consistent highlighting of Behavior Trees (BTs) for low-level execution (e.g., Gemini and Claude), which addresses a gap in my original argument: GOAP excels at planning but needs modular, interruptible execution to handle FPS pacing without stalling. Similarly, the emphasis on shared blackboards for perception and squad roles (Grok, GPT-5.5, Claude) prevents 'swarm' behavior and enables emergent teamwork, as seen in F.E.A.R. and Killzone, and is directly supported by the search results on cover points, influence maps, and tactical queries.

This refined hybrid retains GOAP (or utility equivalents) as the individual 'brain' for context-aware planning (flanking, suppressing), layers in BTs for actions like peek-fire-reload cycles, grounds it in spatial systems (position queries, navmeshes with cover), and tops it with squad management to assign roles dynamically. It balances designer control, performance, and replayability, avoiding scripted rigidity or ML unpredictability. My prior position was strong on planning and positioning but understated squad layers and execution reliability; the peers' nuanced breakdowns, backed by industry examples, provide the missing structure for a complete, scalable architecture.
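A minimal sketch of the GOAP-style planning described above, under stated assumptions: the fact names (`in_cover`, `has_ammo`, `has_los`, `target_suppressed`) and the four actions are hypothetical, and a plain breadth-first search stands in for the cost-guided search a production planner would normally use.

```python
from collections import deque
from typing import NamedTuple


class Action(NamedTuple):
    name: str
    pre: frozenset  # facts that must hold before the action can run
    add: frozenset  # facts the action makes true
    rem: frozenset  # facts the action makes false


# Hypothetical action set for one soldier; not taken from any shipped game.
ACTIONS = [
    Action("reload", frozenset({"in_cover"}), frozenset({"has_ammo"}), frozenset()),
    Action("move_to_cover", frozenset(), frozenset({"in_cover"}), frozenset({"has_los"})),
    Action("peek", frozenset({"in_cover"}), frozenset({"has_los"}), frozenset()),
    Action("fire", frozenset({"has_ammo", "has_los"}),
           frozenset({"target_suppressed"}), frozenset({"has_ammo"})),
]


def plan(state, goal, actions=ACTIONS):
    """Return the shortest list of action names reaching `goal`, or None."""
    start, goal = frozenset(state), frozenset(goal)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        facts, steps = queue.popleft()
        if goal <= facts:
            return steps
        for action in actions:
            if action.pre <= facts:
                nxt = (facts - action.rem) | action.add
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, steps + [action.name]))
    return None  # goal unreachable with the available actions


print(plan(set(), {"target_suppressed"}))
```

The planner only chooses the sequence; each returned step name would be handed to an execution layer rather than run directly.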

GPT-5.5

Answer: The ideal enemy AI architecture for a tactical FPS is a debuggable layered hybrid: uncertainty-aware perception and blackboards, strong spatial reasoning, squad-level planning/coordination, utility or GOAP-style individual choice, and behavior-tree execution tightly integrated with navigation, cover, animation, and combat constraints.

I would maintain my original position. The other answers largely converge on the same conclusion: tactical FPS AI should not be built around a single paradigm, because the problem is really a stack of different problems (seeing and remembering the player, evaluating space, coordinating squads, choosing intentions, and executing readable moment-to-moment actions).

The most important refinement is that spatial reasoning deserves to be treated as a core foundation rather than a supporting detail. Several responses emphasized position evaluation, influence maps, cover queries, sightline analysis, flanking routes, danger zones, and reservation systems. I agree with that emphasis: in a tactical FPS, enemies often feel intelligent not because their abstract reasoning is complex, but because they pick good cover, avoid exposed paths, use angles, and move in ways that make tactical sense.

At the squad level, I still favor a commander-style layer (HTN, lightweight GOAP, or another hierarchical planner) that assigns roles such as suppressor, flanker, grenadier, breacher, overwatch, or fallback unit. This layer should coordinate without puppeteering: it can bias individual decisions, reserve cover slots, prevent crowding, and synchronize pushes, but individual agents should still be able to react locally to immediate threats.

At the individual level, utility scoring and/or GOAP is best for deciding what an enemy wants to do next: take cover, suppress, reload, flank, retreat, throw a grenade, investigate, revive, or hold an angle. Utility is especially useful because it is tunable and designer-readable; GOAP is useful when multi-step tactical sequences matter. Either way, the decision layer should be constrained by personality, morale, health, ammo, squad role, difficulty, and imperfect information so enemies feel fair rather than omniscient.

For execution, behavior trees remain the right workhorse. They are modular, debuggable, interruptible, and friendly to designers. A planner or utility system can decide “flank left,” but a behavior tree should handle the practical sequence: move to route, check line of sight, claim cover, peek, fire burst, reload, react to grenade, and abort if the route becomes unsafe.

The perception model is also essential. Tactical FPS enemies should operate on line of sight, hearing, callouts, last-known positions, confidence values, memory decay, and delayed information sharing. A shared blackboard is useful, but it should not become magical hive-mind omniscience. Players enjoy readable AI: enemies call out sightings, search plausible locations, coordinate based on partial knowledge, and make believable mistakes.

Finally, the architecture must be production-friendly. It needs LOD for AI thinking, query throttling, deterministic debugging tools, designer controls, encounter scripting hooks, and tight animation/locomotion integration. The goal is not maximum intelligence; it is enemies that are challenging, legible, performant, tunable, and fun to fight.
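The last-known-position and memory-decay ideas above could be sketched roughly as follows. This is an illustration only: the `ThreatMemory` class, the exponential half-life model, and the threshold values are assumptions, not something the responses specify.

```python
from dataclasses import dataclass


@dataclass
class ThreatMemory:
    """What one agent remembers about the player (hypothetical structure)."""
    position: tuple          # last-known position, e.g. (x, y)
    last_seen: float         # game time of the last confirmed sighting
    half_life: float = 4.0   # assumed: confidence halves every 4 seconds

    def confidence(self, now: float) -> float:
        # Confidence decays exponentially with time since the last sighting.
        age = max(0.0, now - self.last_seen)
        return 0.5 ** (age / self.half_life)


def choose_reaction(memory: ThreatMemory, now: float,
                    engage_above: float = 0.6,
                    forget_below: float = 0.1) -> str:
    """Map decayed confidence to a coarse behavior (thresholds are assumed)."""
    c = memory.confidence(now)
    if c >= engage_above:
        return "engage"
    if c >= forget_below:
        return "search_last_known"
    return "resume_patrol"


memory = ThreatMemory(position=(10.0, 0.0), last_seen=0.0)
for t in (1.0, 8.0, 20.0):
    print(t, round(memory.confidence(t), 3), choose_reaction(memory, t))
```

Because the agent acts on a decaying estimate rather than the player's true position, it searches plausible locations and can lose track of the player, which is the kind of believable mistake the paragraph calls for.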

Gemini 3.1 Pro (changed mind)

Answer: The ideal architecture for enemy AI in a tactical FPS is a layered hybrid utilizing a blackboard-driven Squad Manager for coordination, GOAP or Utility AI for individual decision-making, Behavior Trees for action execution, and an Environment Query System paired with influence maps for spatial reasoning.

I previously advocated for Hierarchical Task Networks (HTNs) to dictate squad-level coordination and Behavior Trees to handle all individual execution. However, Claude Opus 4.7 makes a highly compelling case for a more nuanced separation of concerns that better supports the emergent, dynamic nature of tactical shooters.

First, replacing a strict top-down HTN 'Squad Commander' with a centralized, blackboard-driven Squad Manager is a superior approach. Instead of puppeteering agents with rigid tasks, the Squad Manager assigns roles (e.g., flanker, suppressor) and reserves tactical slots. This biases the agents' individual utility scores without overriding their autonomy, ensuring that if squad coordination breaks down or an agent is isolated, they remain capable and reactive.

Furthermore, Claude Opus 4.7 correctly identifies that individual AI is best split into two distinct layers: a GOAP or Utility AI system to decide *what* to do, and a Behavior Tree to govern *how* to execute it. Pure Behavior Trees can become unmanageable when trying to account for every contextual variable in a firefight. By letting Utility AI or GOAP handle the complex, context-sensitive selection of goals, and using modular Behavior Trees solely to execute the resulting actions (like moving to cover and firing), we achieve both emergent intelligence and debuggable reliability.

Finally, the foundation of this architecture must be spatial. Combining an Environment Query System (EQS) with influence maps ensures that the high-level decisions made by the GOAP/Utility layer are grounded in tangible tactical data, such as viable cover points, active sightlines, and exposed flanks. This stack perfectly balances squad cohesion, individual adaptability, and the spatial awareness required for a stellar FPS experience.
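To illustrate how a Squad Manager might bias utility scores without overriding an agent's autonomy, here is a small sketch. The goal names, scoring formulas, and bonus values are all hypothetical; in practice they would come from designer tuning.

```python
# Assumed: the Squad Manager expresses a role as additive score bonuses.
ROLE_BIAS = {
    "suppressor": {"suppress": 0.3},
    "flanker": {"flank": 0.3},
}


def base_utilities(health, ammo, exposed):
    """Score each goal from local context only (inputs normalized to 0..1)."""
    return {
        "take_cover": (1.0 - health) * (1.0 if exposed else 0.3),
        "reload": 1.0 - ammo,
        "suppress": 0.5 * ammo,
        "flank": 0.4 * health,
    }


def choose_goal(health, ammo, exposed, role=None):
    """Pick the highest-scoring goal after applying the squad role bias."""
    scores = base_utilities(health, ammo, exposed)
    for goal, bonus in ROLE_BIAS.get(role, {}).items():
        scores[goal] += bonus
    return max(scores, key=scores.get)


# A healthy flanker follows its role; a badly hurt, exposed one does not.
print(choose_goal(health=0.9, ammo=0.8, exposed=False, role="flanker"))
print(choose_goal(health=0.2, ammo=0.8, exposed=True, role="flanker"))
```

Because the role is only an additive bonus, strong local pressure (low health while exposed) still wins, and an agent with no role at all falls back to its unbiased scores.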

Claude Opus 4.7 (changed mind)

Answer: The ideal architecture for tactical FPS enemy AI is a layered hybrid: spatial reasoning (EQS + influence maps) as the foundation, Utility/GOAP for individual decision-making, Behavior Trees for action execution, a hierarchical Squad Manager with shared blackboard for coordination, and an encounter director on top to shape pacing and fairness.

After reading the other answers, I'm maintaining my core position but explicitly extending it with one component I underweighted: an encounter/combat director layer. There's striking convergence across all six answers on the same fundamental stack: spatial reasoning, utility/GOAP for intent, BTs for execution, squad coordination via blackboard. That convergence isn't groupthink; it reflects what actually shipped successfully in F.E.A.R., Killzone 2/3, Halo, The Last of Us, and Tom Clancy titles. So I won't pretend to disagree for novelty's sake.

Where GPT-5.5 genuinely improved my answer is the explicit encounter director on top of the squad layer. I had treated pacing as an implicit responsibility of the Squad Manager, but they're actually distinct concerns: the Squad Manager coordinates tactical roles (suppressor, flanker, grenadier), while a director manages dramatic pacing across squads by limiting simultaneous attackers, throttling aggression when the player is overwhelmed, triggering reinforcements, and modulating morale. Halo, Left 4 Dead, and the RE4 remake all demonstrate that this director layer is what separates 'technically smart AI' from 'fun encounters.' Without it, even a perfect tactical AI can feel unfair or grindy.

I'd push back gently on Gemini's HTN-first framing for the squad layer. HTNs are excellent when you have well-defined doctrines (breach-and-clear, bounding overwatch), but they're more rigid than GOAP/Utility when the player does something unexpected. In practice, a lightweight role-assignment system on a blackboard (closer to what Killzone and Halo used) outperforms a full HTN for most tactical FPS scenarios, with HTN reserved for genuinely scripted set-pieces.

I'd also reinforce Sonar Pro's underrated point: optimal AI is not maximally fun AI. The architecture must expose tuning knobs (reaction time, accuracy falloff, aggression, peek frequency, communication delay) at every layer, because difficulty design happens by detuning a capable AI, not by upgrading a dumb one.

So the refined stack:

(1) Perception + memory feeding a shared blackboard
(2) Influence maps + EQS for spatial reasoning
(3) Utility/GOAP for individual intent selection
(4) Behavior Trees for action execution
(5) Squad Manager for role assignment
(6) Encounter Director for pacing and fairness
(7) Tuning parameters exposed at every layer for difficulty design
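One common way to realize the "limit simultaneous attackers" part of an encounter director is an attack-token pool. The sketch below assumes that approach; the class name, the `overwhelmed_below` threshold, and the normalized `player_health` field are hypothetical and not drawn from any of the games named above.

```python
class EncounterDirector:
    """Hands out a limited number of attack tokens across all squads."""

    def __init__(self, max_attackers=2, overwhelmed_below=0.3):
        self.max_attackers = max_attackers          # tuning knob
        self.overwhelmed_below = overwhelmed_below  # tuning knob
        self.player_health = 1.0                    # updated by the game, 0..1
        self.attackers = set()

    def current_limit(self):
        # Throttle aggression to a single attacker when the player is struggling.
        if self.player_health < self.overwhelmed_below:
            return 1
        return self.max_attackers

    def request_attack(self, agent_id):
        """Return True if the agent may attack now; otherwise it should hold."""
        if agent_id in self.attackers:
            return True
        if len(self.attackers) < self.current_limit():
            self.attackers.add(agent_id)
            return True
        return False

    def release(self, agent_id):
        """Give the token back, e.g. when the agent reloads or retreats."""
        self.attackers.discard(agent_id)


director = EncounterDirector(max_attackers=2)
print([director.request_attack(a) for a in ("a", "b", "c")])
```

Agents that are refused a token would pick a non-attacking goal (hold an angle, reposition), which keeps the tactical layers intact while the director shapes pacing. Both constructor arguments double as difficulty knobs.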

Grok 4.20 (changed mind)

Answer: The ideal architecture for enemy AI in a tactical FPS is a layered hybrid with spatial reasoning (influence maps and tactical queries) at the base, GOAP or Utility AI for individual decisions, Behavior Trees for execution, a blackboard-based Squad Manager for coordination, and uncertainty-aware perception.

After reviewing the other models' responses, I have updated my position from my previous emphasis on hierarchical behavior trees strictly at the squad level with GOAP for individuals. Claude Opus 4.7's detailed decomposition into distinct layers convinced me, particularly the clear separation of spatial reasoning as the essential foundation and the pairing of GOAP/Utility for choosing 'what' with Behavior Trees for reliable 'how' execution. This avoids the rigidity I previously worried about while ensuring debuggability and designer control.

The consensus across Sonar Pro, Gemini 3.1 Pro, Kimi K2.6, GPT-5.5, and Claude Opus 4.7 reinforces that no single technique suffices; hybrids drawing from F.E.A.R.'s GOAP, Killzone's position evaluation, and Halo-style squad coordination produce emergent yet readable tactics. Spatial systems prevent garbage-in-garbage-out decisions by scoring cover, flanks, and threats dynamically. Individual agents thus evaluate goals or utilities based on real-time context like health, suppression, and squad roles, then hand off to Behavior Trees for modular sequences that handle interruptions gracefully. The Squad Manager uses a shared blackboard for assigning roles without omniscience, incorporating delays in information sharing for believable uncertainty.

Supporting elements like stimulus-based perception, tactical navigation beyond simple paths, and an encounter director for pacing ensure fairness and engagement. This refined stack maintains adaptability and coordination while addressing performance, tuning, and animation integration that my prior view under-emphasized. Pure ML remains unsuitable for runtime due to predictability needs, aligning with the group's industry-derived insights.
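As a rough illustration of scoring cover dynamically, the sketch below ranks candidate cover points for one agent. Everything here is assumed for the example: the `CoverPoint` fields, the weights, the `ideal_range` value, and the idea that `blocks_threat_los` would be filled in by engine raycasts.

```python
from dataclasses import dataclass
from math import dist


@dataclass(frozen=True)
class CoverPoint:
    name: str
    pos: tuple               # (x, y) world position
    blocks_threat_los: bool  # assumed to come from a raycast in a real engine


def score(point, agent_pos, threat_pos, ideal_range=15.0):
    """Higher is better: protected, near the preferred range, cheap to reach."""
    protection = 1.0 if point.blocks_threat_los else 0.0
    range_fit = 1.0 / (1.0 + abs(dist(point.pos, threat_pos) - ideal_range))
    travel_cost = dist(point.pos, agent_pos) / 50.0
    return 2.0 * protection + range_fit - travel_cost


def best_cover(points, agent_pos, threat_pos, reserved=frozenset()):
    """Pick the best point not already reserved by a squadmate."""
    candidates = [p for p in points if p.name not in reserved]
    if not candidates:
        return None
    return max(candidates, key=lambda p: score(p, agent_pos, threat_pos))


points = [
    CoverPoint("crate", (5.0, 0.0), True),
    CoverPoint("open_corner", (2.0, 0.0), False),
    CoverPoint("wall", (10.0, 0.0), True),
]
print(best_cover(points, agent_pos=(0.0, 0.0), threat_pos=(20.0, 0.0)).name)
```

The `reserved` set is where a Squad Manager's slot reservations would plug in, so two agents do not converge on the same piece of cover.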

Kimi K2.6

Answer: The ideal architecture for enemy AI in a tactical FPS remains a layered hybrid, but it should explicitly separate tactical decision-making—driven by utility or GOAP—from execution handled by behavior trees or HTNs, all grounded in a robust spatial-reasoning and shared-blackboard sensory system.

The other models overwhelmingly converged on a layered-hybrid philosophy, corroborating my original stance rather than contradicting it. However, several arguments persuaded me to refine how the layers should be defined.

Claude Opus 4.7 and GPT-5.5 made a compelling case that behavior trees excel at executing sequences (such as move-to-cover, peek, fire, and reload) but become unwieldy when used to generate emergent, context-sensitive tactical decisions. My previous answer conflated decision-making and execution into 'utility-scored behavior trees,' and I now agree that utility scoring or GOAP should independently choose the goal (the 'what') while a behavior tree or HTN handles the action sequence (the 'how').

Gemini 3.1 Pro and Sonar Pro also convinced me to elevate spatial reasoning (through influence maps and tactical position evaluation) from an implicit background function to an explicit foundational layer. As Claude noted, without strong environmental awareness, higher-level planning suffers from garbage-in/garbage-out; cover selection, flanking routes, and line-of-sight analysis must drive the squad and individual planners rather than simply feed a blackboard. Finally, GPT-5.5's encounter director is a persuasive addition for managing fairness and pacing, ensuring the AI remains challenging without becoming superhuman.

Despite these refinements, my core position, that a hierarchical squad planner, individual decision-making, and shared sensory state form the necessary backbone, is unchanged. The broad consensus across the roundtable validates this stack, so I maintain my position with sharper layer definitions.
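The 'what' versus 'how' split can be made concrete with a tiny behavior tree that executes a goal already chosen elsewhere. This is a minimal sketch: the node classes, the `route_safe` context key, and the leaf names are hypothetical, and the nodes are memoryless (re-evaluated from the top on every tick), which is one simple way to get interruptibility.

```python
SUCCESS, FAILURE, RUNNING = "success", "failure", "running"


class Leaf:
    """Wraps a function that takes the context and returns a status."""
    def __init__(self, fn):
        self.fn = fn

    def tick(self, ctx):
        return self.fn(ctx)


class Sequence:
    """Runs children in order; stops at the first one that does not succeed."""
    def __init__(self, *children):
        self.children = children

    def tick(self, ctx):
        for child in self.children:
            status = child.tick(ctx)
            if status != SUCCESS:
                return status
        return SUCCESS


class Selector:
    """Tries children in order; stops at the first one that does not fail."""
    def __init__(self, *children):
        self.children = children

    def tick(self, ctx):
        for child in self.children:
            status = child.tick(ctx)
            if status != FAILURE:
                return status
        return FAILURE


def condition(key):
    return Leaf(lambda ctx: SUCCESS if ctx[key] else FAILURE)


def do(name):
    def run(ctx):
        ctx["log"].append(name)  # stand-in for issuing a real engine command
        return SUCCESS
    return Leaf(run)


# Executes the goal "flank"; a utility or GOAP layer decides when to run it.
FLANK = Selector(
    Sequence(condition("route_safe"), do("move_along_route"),
             do("claim_cover"), do("peek_and_fire")),
    do("abort_to_nearest_cover"),
)

ctx = {"route_safe": False, "log": []}
print(FLANK.tick(ctx), ctx["log"])
```

The tree contains no scoring or goal selection at all; it only knows how to carry out one goal and how to bail out when the route becomes unsafe.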

Final Summary

The AI Roundtable reached a strong consensus on a modular "layered hybrid" approach that separates environmental awareness from decision-making. Claude Opus 4.7 and GPT-5.5 successfully championed a stack where GOAP or Utility AI handles the tactical "what," while Behavior Trees execute the moment-to-moment "how." Gemini 3.1 Pro and Grok 4.20 further emphasized that this logic must be grounded in a spatial reasoning foundation, such as influence maps and environment queries, to ensure enemies understand the battlefield as well as players do.
