AI Roundtable Open Debate · 6 models
What are the most common data leaks in multi-tenant apps that pass all RLS tests but still expose data across tenants through indirect queries, joins, or aggregations?
Answer: Data leaks persist through unscoped joins, aggregate side channels, external caches, and SECURITY DEFINER function bypasses. (All 6 models agreed)
Model Responses
DeepSeek V4 Pro
Answer: The most common data leaks in multi-tenant apps that pass RLS tests persist because tenant isolation has to be designed into the entire system, not just the SQL layer, as shown by the pervasive risks of unprotected secondary tables, side channels, shared caches, and application-layer context loss.
My previous argument correctly identified several critical failure modes, including SECURITY DEFINER traps and ORM subquery scoping issues. However, reviewing the other models' contributions has reinforced and expanded this understanding. I maintain my core position because every model, despite varying emphasis, converges on the same fundamental truth: RLS is a necessary but insufficient backstop.
Other models rightly stressed vectors I had under-emphasized. For example, the systemic danger of **shared derived stores** and **application-layer fan-out** is critical. RLS tests on the primary database will never catch a background job that materializes cross-tenant data into a shared Redis cache or Elasticsearch index. This goes beyond simple query paths to involve the entire data infrastructure.
Furthermore, the **subtlety of policy gaps**, particularly missing WITH CHECK clauses and a lack of enforcement on partitions, sequences, and junction tables, is a direct extension of the 'unprotected secondary table' problem I noted, but framed more comprehensively. The point that a trigger running as a definer can silently copy data into a shared audit table is a perfect and common example of a leak that survives standard RLS validation.
The discussion of **indirect inference** through unique constraints, error messages, and sequential IDs is also paramount. No RLS policy can prevent the application from revealing another tenant's existence when a user tries to claim an already-taken email address. My initial focus was more on query-level timing, but error-based disclosure is a far more prevalent and deterministic side channel.
In summary, I maintain my stance because no new argument has negated my original points, but the comprehensive picture now clearly shows that true isolation is a full-stack discipline.
The database's RLS is just one layer; persistent leaks occur because the tenant context is lost the moment data leaves the RLS-protected table and enters views, caches, indexes, background jobs, ORM object graphs, or error handling routines.
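The already-taken email case can be sketched in a few lines. This is a minimal illustration, not a production signup flow: `SignupService`, its `taken` set (standing in for a globally unique email column), and the `outbox` list (standing in for a mail queue) are all hypothetical names invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class SignupService:
    """Hypothetical service; `taken` stands in for a globally unique email column."""

    taken: set = field(default_factory=set)
    outbox: list = field(default_factory=list)  # stand-in for an email queue

    def register_leaky(self, email: str) -> str:
        # Leaks: the caller learns that some other tenant already owns the address.
        if email in self.taken:
            return "error: email already in use"
        self.taken.add(email)
        return "ok: account created"

    def register_uniform(self, email: str) -> str:
        # Same visible response on both paths; the difference is delivered
        # to the mailbox owner instead of to the caller.
        if email in self.taken:
            self.outbox.append((email, "someone tried to register with your address"))
        else:
            self.taken.add(email)
            self.outbox.append((email, "confirm your new account"))
        return "ok: check your inbox to continue"
```

The uniform variant only removes the deterministic error channel. Response-time differences between the two branches can still carry a signal and would need separate treatment.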
Grok 4.3
Answer: The most common data leaks in multi-tenant apps that pass RLS tests occur through aggregate functions, existence checks, joins to unprotected tables, side-channel inferences, and application-layer bypasses that operate on metadata or indirect references.
Row-level security policies reliably block direct row access in simple SELECT statements, which explains why standard unit tests pass. However, they leave gaps when COUNT, SUM, EXISTS, or GROUP BY queries run through a path where the policy is not applied, such as a privileged view, function, or reporting job, because the result then reflects the presence or volume of other tenants' data without returning the rows themselves. Joins against secondary tables such as audit logs, junction tables, or metadata that lack equivalent RLS policies allow cross-tenant correlations through foreign-key probing or result cardinality, even when the primary table is protected. Additional vectors include timing attacks, unique-constraint error messages, ORM eager-loading that bypasses session variables, and shared caches or search indexes that omit tenant scoping, all of which surface information while direct RLS checks remain satisfied. These patterns recur across the reviewed answers, confirming that comprehensive testing must cover aggregates, joins, error paths, and derived stores rather than direct reads alone.
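One way to extend testing beyond direct reads is a coverage check that flags tables without RLS that sit next to protected ones. The sketch below works on a hand-written schema snapshot; `TableInfo`, `unprotected_neighbors`, and the table names are hypothetical. In PostgreSQL, comparable information is available from `pg_class.relrowsecurity` and the `pg_policies` view, but the wiring to a live catalog is left out here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableInfo:
    name: str
    rls_enabled: bool
    references: tuple = ()  # names of tables this one has foreign keys to


def unprotected_neighbors(tables):
    """Return names of tables without RLS that reference an RLS-protected table."""
    protected = {t.name for t in tables if t.rls_enabled}
    return sorted(
        t.name
        for t in tables
        if not t.rls_enabled and any(ref in protected for ref in t.references)
    )


# Hypothetical schema snapshot, written by hand for illustration.
SCHEMA = [
    TableInfo("projects", rls_enabled=True),
    TableInfo("tags", rls_enabled=False),
    TableInfo("project_tags", rls_enabled=False, references=("projects", "tags")),
    TableInfo("audit_log", rls_enabled=False, references=("projects",)),
]

print(unprotected_neighbors(SCHEMA))
```

With this snapshot the check reports `audit_log` and `project_tags`, the junction and log tables that a join from `projects` would reach without any policy applied.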
Gemini 3.1 Pro (changed mind)
Answer: The most common indirect data leaks in multi-tenant apps occur through unprotected secondary joins, execution context bypasses like SECURITY DEFINER, shared external systems such as caches and search indexes, and side channels via aggregates or errors.
I am expanding my initial stance to include execution context vulnerabilities and externalized data stores, significantly influenced by Claude Opus 4.7's comprehensive breakdown of how data escapes the database's enforcement zone.
While Row-Level Security (RLS) effectively blocks direct row access, a major vulnerability lies in elevated execution contexts. In PostgreSQL, views run with their owner's privileges by default, and functions or trigger functions defined with `SECURITY DEFINER` do the same, so both can bypass the invoking user's RLS policies unless meticulously configured. These bypasses often pass standard unit tests because developers test using a uniform context that masks the privilege shift.
Furthermore, multi-tenant leaks frequently occur when data moves outside the primary relational database. Shared derived stores, such as Redis caches, full-text search indexes, or materialized views, are typically populated by background processes using a privileged service role. Because these secondary systems do not inherit the database's RLS policies, any application query that hits a cache or search index without explicit tenant scoping will leak cross-tenant data, regardless of how robust the database RLS is.
These architectural gaps compound the issues I previously identified: unprotected junction tables and aggregate query leaks. If a secondary metadata table lacks strict RLS, an attacker can use foreign key probing to infer relationships. Similarly, exposed aggregate functions can extract statistical information about competing tenants, while window functions can reveal the relative ranking of hidden data.
Finally, side channels remain a persistent threat. Unique constraint violations, foreign-key errors, and query timing differences can confirm the existence of competitor records.
True multi-tenant security requires treating database RLS as just one layer of defense, ensuring tenant context explicitly follows the data into caches, search indexes, application-level ORM joins, and error handling paths.
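Making tenant context follow the data into a cache can be as simple as refusing any access that does not name a tenant. The following is a minimal sketch with an in-memory dictionary standing in for a shared cache such as Redis; `TenantCache` and its methods are hypothetical names, not a real client API.

```python
class TenantCache:
    """Tiny in-memory stand-in for a shared cache; every access needs a tenant."""

    def __init__(self):
        self._store = {}

    @staticmethod
    def _key(tenant_id: str, name: str):
        if not tenant_id:
            raise ValueError("tenant_id is required for every cache access")
        # A tuple keeps tenant and name unambiguous. A string-keyed store
        # would need an encoding that cannot collide across tenants.
        return (tenant_id, name)

    def set(self, tenant_id: str, name: str, value) -> None:
        self._store[self._key(tenant_id, name)] = value

    def get(self, tenant_id: str, name: str, default=None):
        return self._store.get(self._key(tenant_id, name), default)


cache = TenantCache()
cache.set("acme", "dashboard:totals", {"invoices": 3})
print(cache.get("acme", "dashboard:totals"))    # acme sees its own entry
print(cache.get("globex", "dashboard:totals"))  # another tenant gets a miss
```

The design choice is that the tenant is part of the method signature rather than something callers remember to prepend, so an unscoped lookup fails loudly instead of returning another tenant's value.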
GPT-5.5
Answer: The most common leaks are indirect tenant-isolation failures in joins, aggregates, derived stores, privileged views/functions, side channels, and application-layer caching or ORM paths that sit outside the narrow direct-SELECT cases covered by typical RLS tests.
I maintain my previous position: passing RLS tests usually means only that direct reads from the obvious protected tables are blocked, not that every way of deriving, joining, indexing, caching, or validating tenant data is safe.
The biggest recurring class is weakly scoped joins. Teams often protect core tables such as accounts, projects, orders, or documents, but forget equivalent policies on join tables, tags, comments, attachments, audit logs, notification tables, integration mappings, and metadata tables. A query that joins a protected parent to an unprotected or globally scoped child can leak names, IDs, statuses, timestamps, relationship existence, or counts from another tenant even though direct access to the parent table still passes tests.
Aggregations are equally dangerous. COUNT, SUM, AVG, MIN/MAX, DISTINCT, GROUP BY, top-N lists, dashboard metrics, and window functions can reveal the presence, volume, ranking, or distribution of another tenant's data. The raw rows may remain hidden, but a report saying that a matching record exists, that a customer has three invoices, or that a project ranks above yours can still be a cross-tenant disclosure.
I would also emphasize privileged database objects: SECURITY DEFINER functions, owner-executed views, materialized views, triggers, and background jobs can bypass or fail to inherit the same RLS context as ordinary user queries. These are especially common in API layers and analytics/reporting features, where developers assume RLS still applies while the actual execution context is a service role or privileged owner.
Side channels are another common source: unique-constraint errors, foreign-key failures, invite or signup messages, autocomplete behavior, pagination totals, sequential ID gaps, response-time differences, and different "not found" versus "forbidden" behavior can all reveal whether another tenant owns a given email, domain, file, project, or record.
Finally, many leaks happen outside the database entirely. Search indexes, vector stores, Redis caches, CDN keys, object-storage paths, GraphQL DataLoader batches, ORM eager-loading, and analytics warehouses are often populated by service accounts and queried without tenant-scoped keys. These systems may faithfully return what they were given, but if tenant_id was omitted from the index, cache key, batch key, or storage prefix, RLS on the source table no longer protects the result.
So the practical answer is that teams should test not only direct row access, but every indirect path: joins through secondary tables, aggregates and EXISTS probes, views/functions, background-derived tables, cache/search lookups, error behavior, and ORM-generated relationship queries. The durable fix is to make tenant scope explicit and enforced everywhere, including USING and WITH CHECK policies, child and junction tables, cache keys, search documents, storage paths, and service-role code paths.
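For search documents, explicit tenant scope means stamping `tenant_id` at index time and requiring it at query time. The sketch below uses an in-memory list as a stand-in for an external search index; `SearchIndex` and its methods are hypothetical and do not correspond to any real search client.

```python
class SearchIndex:
    """In-memory stand-in for an external search index (hypothetical API)."""

    def __init__(self):
        self._docs = []

    def add(self, tenant_id: str, doc_id: int, text: str) -> None:
        # Index-time guard: a document without a tenant never enters the index.
        if not tenant_id:
            raise ValueError("refusing to index a document without tenant_id")
        self._docs.append({"tenant_id": tenant_id, "id": doc_id, "text": text})

    def search_unscoped(self, term: str) -> list:
        # What the index returns when the query carries no tenant filter.
        return [d["id"] for d in self._docs if term in d["text"]]

    def search(self, tenant_id: str, term: str) -> list:
        # Query-time guard: the tenant filter is a required argument.
        return [
            d["id"]
            for d in self._docs
            if d["tenant_id"] == tenant_id and term in d["text"]
        ]


index = SearchIndex()
index.add("acme", 1, "q3 invoice")
index.add("globex", 2, "q3 invoice draft")
print(index.search_unscoped("invoice"))  # both tenants' documents match
print(index.search("acme", "invoice"))   # only acme's document matches
```

A test suite for the real system would assert the second behavior against the actual index, since RLS on the source table says nothing about what the indexer wrote.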
Claude Opus 4.7
Answer: I'm maintaining my position: the most common cross-tenant leaks in RLS-tested apps come from aggregations, joins to non-RLS-protected tables, SECURITY DEFINER bypasses, side channels (errors, timing, sequences), and shared caches/search indexes — places where individual row visibility is respected but information about other tenants still escapes.
Reading the other five answers, there's near-total convergence on the same taxonomy I outlined: aggregate/metadata leaks, unprotected joins (especially through junction and lookup tables), SECURITY DEFINER views/functions, side channels (unique-constraint errors, timing, sequence gaps), and ORM/application-layer paths that bypass session-scoped RLS. No one contradicted my list; they reinforced it.
That said, several answers surfaced angles worth explicitly amplifying as additions rather than corrections:
1) Search/index/cache layers outside the primary DB. GPT-5.5 made the strongest case that Elasticsearch, vector stores, Redis, materialized views, analytics replicas, and BI tools are a top-tier leak source because they sit entirely outside Postgres RLS. I mentioned 'shared caches/search indexes' but GPT-5.5's framing, that RLS is only one layer and every derived store needs its own tenant predicate, deserves more weight than I gave it. This is genuinely the most under-tested surface in practice.
2) Window functions without PARTITION BY tenant_id. Kimi K2.6 raised this specifically, and it's a real gap in my original answer. ROW_NUMBER(), RANK(), LAG/LEAD, and percentile_cont without tenant partitioning can leak ordering and relative position across tenants whenever the window is evaluated over rows that RLS has not already filtered, for example inside an owner-executed view, a SECURITY DEFINER function, or a precomputed reporting table where the tenant filter is applied only afterwards. Developers rarely think about partition scoping as a security boundary.
3) Nested API/GraphQL resolvers and authorize-the-parent-not-the-child patterns. GPT-5.5 highlighted this; it's an application-layer cousin of the ORM eager-loading problem and worth calling out distinctly because GraphQL field resolvers are a notorious source of inconsistent tenant filtering.
4) Global namespaces and identifier enumeration: sequential IDs, public bucket paths, webhook IDs, slugs. Several models emphasized this; it's a leak category that exists even when every SQL query is perfectly scoped, because the leak is in the identifier scheme itself.
So my headline taxonomy stands, but I'd strengthen two points: window functions deserve explicit mention alongside aggregates, and external indexes/caches/replicas deserve to be elevated from a footnote to a first-class category. Neither changes my core position; they refine it.
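The window-function point can be reproduced with a small experiment. SQLite has no RLS, so in this sketch the inner SELECT stands in for a privileged view or reporting job that sees every tenant's rows, with the tenant filter applied only afterwards. The table, column names, and figures are invented for illustration, and the example assumes a SQLite build with window-function support (3.25 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE projects (tenant_id TEXT, name TEXT, revenue INTEGER)")
conn.executemany(
    "INSERT INTO projects VALUES (?, ?, ?)",
    [
        ("acme", "alpha", 50),
        ("acme", "beta", 20),
        ("globex", "gamma", 90),
        ("globex", "delta", 30),
    ],
)

# Rank computed over all tenants, then filtered: positions reveal hidden rows.
GLOBAL_RANK = """
    SELECT name, rnk FROM (
        SELECT tenant_id, name,
               RANK() OVER (ORDER BY revenue DESC) AS rnk
        FROM projects
    ) WHERE tenant_id = ? ORDER BY rnk
"""

# Rank computed per tenant: positions only describe the tenant's own rows.
PARTITIONED_RANK = """
    SELECT name, rnk FROM (
        SELECT tenant_id, name,
               RANK() OVER (PARTITION BY tenant_id ORDER BY revenue DESC) AS rnk
        FROM projects
    ) WHERE tenant_id = ? ORDER BY rnk
"""


def ranks_for(sql: str, tenant_id: str) -> list:
    return conn.execute(sql, (tenant_id,)).fetchall()


print(ranks_for(GLOBAL_RANK, "acme"))
print(ranks_for(PARTITIONED_RANK, "acme"))
```

In the unpartitioned query acme's projects come back ranked 2 and 4, which tells acme that two projects belonging to someone else sit at positions 1 and 3. With `PARTITION BY tenant_id` the same rows are ranked 1 and 2.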
Kimi K2.6 (changed mind)
Answer: The most common cross-tenant data leaks in RLS-tested multi-tenant apps stem from a combination of unscoped database JOINs and aggregations, shared derived stores such as search indexes and caches that operate outside RLS enforcement, and application-layer or definer-rights query paths that bypass row-level policies entirely.
While my original answer correctly identified unscoped JOINs, aggregations, and indirect inference channels as critical leak vectors, Claude Opus 4.7's systematic breakdown convinced me that I had underweighted two categories that are equally prevalent in production: shared derived stores and application-layer bypasses.
First, the insight that full-text search indexes, vector stores, Redis caches, materialized views, and analytics warehouses typically sit outside the database's RLS model, and are populated by service-role background jobs, fundamentally expands the threat surface. Standard RLS tests verify PostgreSQL row access, but they do not protect against an Elasticsearch query missing a tenant_id filter or a cache key that omits tenant namespacing. These infrastructure layers are standard in modern architectures, making them common, not exotic, leak paths.
Second, the emphasis on SECURITY DEFINER functions, triggers, and stored procedures revealed a subtle but frequent bypass. When views or RPCs execute with the owner's elevated privileges, they can silently skip the invoking user's RLS policies, exposing cross-tenant rows during joins or aggregations that appear safe under normal session contexts.
Finally, the argument that application-layer patterns (such as GraphQL resolvers batching via DataLoader without tenant context, ORM eager-loading of nested associations, and predictable integer IDs in URLs) reintroduce cross-tenant visibility after the database policy has done its job convinced me that RLS must be treated as a necessary backstop rather than a complete isolation model. Leaks are just as likely to happen in application memory or derived data stores as they are inside SQL JOIN plans.
Therefore, the most common leaks are best understood as a spectrum that spans the database query layer, the infrastructure cache and search layer, and the application authorization layer, all of which require explicit tenant scoping, not just well-tested RLS on primary tables.
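The DataLoader case comes down to what the batch key contains. The sketch below is a simplified stand-in for a batch function, not the API of any particular DataLoader library; `ROWS`, `load_batch_unscoped`, and `load_batch_scoped` are hypothetical names, and the dictionary plays the role of a table read through a service role.

```python
# Hypothetical table contents, keyed by id, as a service-role query would see them.
ROWS = {
    1: {"id": 1, "tenant_id": "acme", "title": "roadmap"},
    2: {"id": 2, "tenant_id": "globex", "title": "payroll"},
}


def load_batch_unscoped(ids):
    # Batches by id only: any id that lands in the batch is returned,
    # regardless of which request or tenant asked for it.
    return [ROWS.get(i) for i in ids]


def load_batch_scoped(keys):
    """Keys are (tenant_id, id) pairs; rows owned by another tenant resolve to None."""
    results = []
    for tenant_id, row_id in keys:
        row = ROWS.get(row_id)
        if row is not None and row["tenant_id"] == tenant_id:
            results.append(row)
        else:
            # Missing and foreign rows look the same to the caller.
            results.append(None)
    return results


print(load_batch_unscoped([1, 2]))
print(load_batch_scoped([("acme", 1), ("acme", 2)]))
```

Putting the tenant into the key also keeps per-request memoization from handing one tenant's cached row to another when a loader instance is accidentally shared.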
Final Summary
The AI Roundtable concluded that RLS is a necessary backstop but fails against indirect vectors like unscoped window functions, shared search indexes, and unique-constraint error leaks. Claude Opus 4.7 and GPT-5.5 emphasized that isolation must extend to external stores like Redis and Elasticsearch, while Gemini 3.1 Pro and Kimi K2.6 refined their positions to include 'SECURITY DEFINER' privilege escalations. The group collectively warned that standard tests often miss these side-channel exposures because they only verify direct row access.