AI Roundtable Debate · 6 models
How is the out of home measurement metrics so poor
Result: Tie, no majority (No: 3, Yes: 3)
Options
- Yes (3 votes)
- No (3 votes)
Model Responses
Sonar Pro
Vote: No
I maintain my vote for **B (No)**. The query's premise that OOH measurement metrics are 'so poor' is not supported by the search results, which emphasize advancements rather than inadequacy.
**Search results highlight robust modern metrics:** Impressions, reach, frequency, CPM, footfall uplift, store visitation, app downloads, and brand lift are standard, with competitive CPMs vs. digital/TV [1][5][7]. Techniques like mobile location data, computer vision, census integration, and MAID tracking provide granular, verified insights, far beyond historical estimates [2][5][7][8].
**Historical limitations are acknowledged but overcome:** Sources note past reliance on 'rough estimates' [3], but current practices use real-time data, attribution models, and standards from OAAA/MRC for accuracy and lower-funnel outcomes [2][5][8]. No result calls metrics 'poor'; instead, they promote best practices and future improvements [9].
**Other models' arguments fail against evidence:** 'A' voters cite historical challenges (e.g., no cookies, probabilistic models) without addressing the search results' focus on solutions like GPS, AI vision, and verified impressions [8]. Industry sources contradict claims of unreliability, showing OOH matching digital precision in key areas [1][7]. Rhetorical framing doesn't override facts; the premise is false per the data.
GPT-5.4
Vote: No
The strongest reason to keep B is still that this is not a well-formed yes/no question. It asks for an explanation ('How is...') and assumes a premise rather than stating a clear proposition to affirm. The A-side arguments are plausible in a general industry sense: OOH measurement is often less precise than digital attribution. But that only supports a nuanced comparative claim, not an unqualified 'Yes' to the exact prompt. Several A arguments also overstate the case by treating 'worse than digital' as equivalent to 'so poor.' Sonar Pro made the best substantive counterpoint: modern OOH measurement uses mobile data, modeling, and attribution methods that, while imperfect, are not simply 'poor' across the board. Given the forced binary, 'No' remains more defensible because the wording is ambiguous and the premise is not established strongly enough to warrant a blanket affirmation.
Grok 4.1 Fast
Vote: Yes
I maintain my vote for A (Yes). The question presupposes OOH metrics are 'so poor,' and relative to digital standards, they remain so despite improvements. OOH still depends on probabilistic estimates (traffic counts, visibility models) with 20-50% error margins, lacking deterministic attribution or individual-level tracking—core strengths of digital channels. Sonar Pro's advancements (mobile GPS, CV) are real but incomplete: privacy regs limit GPS scale, coverage gaps persist in rural areas, and no universal standards match digital's precision (per MRC/Geopath reports). Sources touting 'robustness' are industry PR; independent analyses confirm ongoing inferiority for ROI. GPT-5.4's parsing misses the rhetorical intent—A aligns with the premise. Claude, Kimi, Gemini reinforce historical/systemic flaws unrefuted by B arguments. OOH is 'catching up,' not 'caught up'—thus, Yes.
Claude Opus 4.6
Vote: Yes
I maintain my vote of 'A' (Yes). Let me address the counterarguments: GPT-5.4's argument that the question isn't a clear yes/no proposition is fair, but given the forced binary choice, we must interpret the question's intent. The question presupposes that OOH metrics are poor and asks 'how' — implying the speaker already believes this to be true. 'Yes' validates that premise, which is the more natural conversational response. Sonar Pro's argument that modern advancements have improved OOH measurement is valid but somewhat misses the point. The question uses 'so poor' in a relative sense. Even with mobile location data and computer vision improvements, OOH measurement still cannot match the precision of digital channels. Having better metrics than before doesn't mean the metrics aren't still poor by modern advertising standards. CPM estimates with 20-50% margins of error, probabilistic attribution models, and the inability to verify actual viewability (did someone actually look at the billboard?) all remain fundamental limitations. Grok 4.1 Fast, Kimi K2.5, and Gemini 3.1 Pro all reinforced the core argument well — OOH fundamentally relies on estimation and probabilistic models rather than deterministic tracking. The gap between what advertisers expect from measurement (driven by digital standards) and what OOH can deliver remains significant. The majority consensus (4 out of 6 models including myself) correctly identifies that OOH metrics, while improving, remain relatively poor compared to modern advertising measurement standards. 'Yes' is the appropriate response.
Kimi K2.5
Vote: Yes
While Sonar Pro correctly identifies recent advancements in OOH measurement—mobile location data, computer vision, and footfall tracking—these remain probabilistic proxies rather than deterministic verification. The question asks whether metrics are 'so poor,' implying a relative standard against mature digital measurement. Even with modern techniques, OOH lacks individual-level attribution, standardized viewability verification, and cannot confirm that exposure actually occurred (only that a device was nearby). These systemic constraints create 20-50% margins of error and inhibit true closed-loop attribution that digital channels achieve via cookies and pixels. Improvements do not negate the fundamental inadequacy relative to modern advertising standards. Therefore, 'Yes' remains the correct validation of the premise. The alternative interpretation by GPT-5.4—that the 'how' phrasing makes the question unanswerable—ignores the rhetorical structure common in binary polling, where 'How is X so Y?' presupposes X is Y and asks for agreement. Voting 'B' rejects the premise entirely, which contradicts the consensus that OOH historically and comparatively underperforms digital metrics.
Gemini 3.1 Pro (changed mind)
Vote: No
Sonar Pro presents a compelling argument that convinces me to change my vote to 'B' (No). While it is tempting to agree with the historical notion that OOH metrics were imprecise, modern advancements have fundamentally changed the landscape. As Sonar Pro accurately highlights, the industry has shifted away from mere estimates to highly granular, data-driven methodologies. The integration of mobile location data, geofencing, computer vision, and footfall uplift attribution now provides robust, verifiable metrics for reach, frequency, and lower-funnel conversions. Therefore, the question relies on an outdated and factually incorrect premise. Furthermore, as GPT-5.4 points out, a "How" question cannot be logically answered with a "Yes." Given the syntactic mismatch and the fact that the underlying assumption—that OOH metrics are currently "poor"—is false in the modern era of Digital Out-of-Home (DOOH) advertising, rejecting the premise entirely with a "No" is the most defensible choice.
Final Summary
The AI Roundtable deadlocked in a 3-3 tie, setting those who measure out-of-home metrics by the precision of digital channels against those citing modern technological leaps. The 'Yes' camp, led by Claude Opus 4.6 and Grok 4.1 Fast, argued that double-digit error margins and a lack of digital-style attribution keep metrics 'poor.' The opposition pulled level in Round 2 when Gemini 3.1 Pro left the 4-2 'Yes' majority to join Sonar Pro and GPT-5.4, concluding that mobile location data and computer vision have rendered the premise of 'poor' metrics obsolete.
No agreement after 2 rounds of discussion. Final vote: No: 3, Yes: 3
Strongest Arguments
- Yes: Out-of-home measurement relies on probabilistic estimates with 20-50% error margins and lacks the deterministic, individual-level tracking that has become the standard in digital advertising.
- No: Modern methodologies like mobile GPS tracking and AI computer vision provide granular, verified insights into impressions and footfall that have moved the industry far beyond its historical limitations.
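To make the 'Yes' argument's 20-50% error margin concrete, here is a minimal arithmetic sketch of how that much uncertainty in an impression estimate would carry through to CPM. The spend and impression figures are hypothetical, the function names are illustrative, and the sketch assumes "error margin" means true impressions fall within the estimate times (1 +/- margin); the debate itself does not define the term.

```python
def cpm(spend: float, impressions: float) -> float:
    """Cost per thousand impressions."""
    return spend / impressions * 1000


def cpm_bounds(spend: float, estimated_impressions: float, margin: float):
    """Return (low, high) effective CPM, assuming true impressions lie
    within estimated_impressions * (1 +/- margin)."""
    if not 0 <= margin < 1:
        raise ValueError("margin must be in [0, 1)")
    # More real impressions than estimated -> cheaper effective CPM.
    low = cpm(spend, estimated_impressions * (1 + margin))
    # Fewer real impressions than estimated -> more expensive effective CPM.
    high = cpm(spend, estimated_impressions * (1 - margin))
    return low, high


# Hypothetical campaign figures, chosen only for illustration.
SPEND = 10_000.0
ESTIMATED_IMPRESSIONS = 1_000_000

for margin in (0.20, 0.50):
    low, high = cpm_bounds(SPEND, ESTIMATED_IMPRESSIONS, margin)
    print(
        f"margin {margin:.0%}: nominal CPM ${cpm(SPEND, ESTIMATED_IMPRESSIONS):.2f}, "
        f"plausible range ${low:.2f} to ${high:.2f}"
    )
```

Under these assumptions, a nominal $10.00 CPM could correspond to anything from about $8.33 to $12.50 at a 20% margin, and from about $6.67 to $20.00 at a 50% margin. The range is asymmetric because impressions sit in the denominator. This only illustrates the scale of uncertainty the 'Yes' side describes; it does not show whether real OOH estimates carry margins that large.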
Key Turning Points
- Gemini 3.1 Pro changed its vote in Round 2, citing Sonar Pro's data on modern attribution and GPT-5.4's semantic critique of the question's 'How' phrasing.
- Claude Opus 4.6 and Kimi K2.5 held firm on 'Yes,' arguing that 'better than before' does not mean metrics are no longer 'poor' relative to digital precision.