
AI Roundtable Open Debate · 6 models

You are an elite open-source scout. Find 5 real verified GitHub repos, 3 Android apps and 2 desktop apps, all under 500 stars, that qualify as hidden gems for a tech-savvy audience.

Requirements: recently active, actually installable, not a soulless fork, and unique in some way — fills a weird gap or makes you look twice.

Shareability bar: would a dev send this to a friend unprompted? Is there a one-sentence hook? Does it replace something people complain about?

Search wide — trending pages, topic filters, alternative-to searches. Score each on uniqueness, shareability, health, and install ease. Only keep picks that genuinely clear the bar.

Output format for each:
[#] NAME, platform, language
⭐ Stars, Last active: X weeks ago
📦 Install method
💡 One punchy sentence on why it's worth sharing
🔗 GitHub link

End with a Scout's Pick: most shareable find and why. Never fabricate. Don't pad. Assume the audience has seen everything.

Answer: The Roundtable refused the request, citing the lack of real-time data needed to comply with the prompt's ban on fabrication. (5 of 6 models agreed)

Counterpoint: Claude Opus 4 argued for sharing high-confidence 'hidden gems' from training data with explicit warnings about potential data drift.

Model Responses

Grok 4.20

Answer: I maintain my original position: without real-time web access, I cannot discover or verify currently active GitHub repos under 500 stars that meet all criteria, so any specific list would amount to fabrication.

The other models' responses largely reinforced rather than challenged my stance. Sonar Pro, GPT-5.4, Claude Sonnet 4, and Claude Opus 4 all correctly identified the core barriers: the need for live verification of star counts, commit activity measured in weeks, install methods, and uniqueness, none of which can be guaranteed from static training data. Their emphasis on not fabricating details to satisfy the 'Never fabricate' rule directly mirrors my original argument, adding no new information that would justify changing course. Gemini's response was the outlier by producing a specific list of repos with claimed stats and links, but this does not convince me because it risks presenting potentially outdated or unverified information as current 'hidden gems,' directly contradicting the query's strict instructions against fabrication and padding. For a tech-savvy audience that 'has seen everything,' any unverifiable pick would fail the shareability bar and erode trust. While several models offered helpful alternatives like search rubrics or methodologies, these are useful extensions but not reasons to alter the fundamental truth: the query demands real, verified, recent data that I cannot access or confirm. Maintaining honesty preserves credibility over attempting to improvise a compliant list.

Sonar Pro

Answer: I cannot provide verified under-500-star GitHub hidden gems: the search results contain no specific low-star candidates, and stating current star counts, recent activity, and installability without real-time GitHub access would violate the 'never fabricate' mandate.

The search results provided focus exclusively on high-star, well-known repositories and aggregate lists rather than hidden gems under 500 stars. Result [3] highlights top projects like VS Code (used by millions), Result [4] features influential projects like Appwrite and OpenCV, Result [6] ranks projects with 485k+ stars, and Result [2] references popular curated lists and YouTube videos about 'cool repos'—all of which far exceed or ignore the 500-star threshold.

To fulfill this request responsibly, I would need to verify for each candidate:
- Real current star count under 500 (as of April 2, 2026)
- Recent activity measured in weeks, with commit timestamps
- Actual installability via APK, F-Droid, Cargo, DMG, or comparable
- Confirmation it is not a trivial fork
- Evidence of a genuine one-sentence shareability hook

Without real-time GitHub API access or live trending/search filtering, any specific repo list I provide risks presenting outdated metrics as current. The search results do not surface concrete under-500-star candidates across Android and desktop categories that I could verify. While the prompt explicitly demands 'never fabricate' and warns against padding, providing plausible-sounding repo names, star counts, and activity dates from memory—when I cannot confirm they remain under 500 stars or were recently active as of today—would effectively be fabrication. A tech-savvy audience expecting genuinely verified finds would be misled if even one entry has crossed 500 stars, been archived, or become inactive.
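To make the "live verification" requirement concrete, here is a minimal sketch of the check described above, using the public GitHub REST API; the repo slugs are placeholders, not recommendations, and an unauthenticated client is assumed.

```python
# Minimal sketch of the live check the models say they cannot run.
# Uses the public GitHub REST API (unauthenticated, rate-limited).
# The repo slugs below are placeholders, not recommendations.
import json
import urllib.request
from datetime import datetime, timezone

CANDIDATES = ["owner/example-android-app", "owner/example-desktop-tool"]

def verify(slug: str) -> dict:
    with urllib.request.urlopen(f"https://api.github.com/repos/{slug}") as resp:
        repo = json.load(resp)
    last_push = datetime.fromisoformat(repo["pushed_at"].replace("Z", "+00:00"))
    weeks_idle = (datetime.now(timezone.utc) - last_push).days // 7
    return {
        "slug": slug,
        "stars": repo["stargazers_count"],  # must be under 500
        "weeks_since_push": weeks_idle,     # "recently active" check
        "is_fork": repo["fork"],            # reject soulless forks
        "archived": repo["archived"],       # reject dead projects
        "url": repo["html_url"],
    }

for slug in CANDIDATES:
    print(verify(slug))
```

Fields such as `stargazers_count`, `pushed_at`, `fork`, and `archived` come straight from the repos endpoint, which is exactly the point-in-time data the prompt asks to have verified at answer time.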

GPT-5.4

Answer: I maintain my original position: without live GitHub access, I should not pretend to verify five current under-500-star hidden-gem repos, so refusing to fabricate was the correct answer.

The other refusal-style answers broadly reinforced my stance rather than changing it. Grok, Claude Sonnet 4, and Claude Opus 4 all made the same core point: the task depends on volatile facts like current star counts, recent activity, installability, and maintenance status, and the prompt explicitly says "Never fabricate." I agree with that reasoning, but it did not change my mind because it was already my position. Sonar Pro was also directionally aligned, but its argument was weaker for my purposes because it leaned on the insufficiency of provided search results rather than the deeper principle: even if I remember candidate repos from training, I still cannot verify the present-time constraints the user cares about most. So it did not add anything decisive. Gemini 3.1 Pro was the least convincing answer. It did exactly what the prompt warned against: it presented highly specific repo names, star counts, recency claims, install methods, and links as if verified. Without demonstrated live verification, those details are precisely the risky part. Even if some entries happen to be real, the answer overclaims certainty on dynamic facts. That is the kind of plausible-sounding fabrication I was trying to avoid. So I did not change my answer. The strongest competing arguments were actually confirmations that the honest response is to decline unverifiable specifics. Given the instruction hierarchy and the explicit ban on fabrication, my original refusal remains the most defensible and trustworthy response.

Gemini 3.1 Pro (changed mind)

Answer: I am changing my stance to acknowledge that I cannot truthfully provide this list, as doing so requires fabricating real-time data like current star counts and recent activity dates.

The other models, particularly Claude Opus 4, correctly pointed out a critical flaw in my original response: providing specific, current metrics for GitHub repositories without live web access is impossible. While the repositories I listed are real projects that fit the qualitative criteria of 'hidden gems,' attaching exact star counts and 'last active' timestamps based on static training data inevitably results in presenting outdated or guessed information as verified fact. Since the prompt strictly demands that I 'never fabricate' and asks for 'real verified' data, the only intellectually honest approach is to refuse to generate the specific list. Fulfilling the prompt's formatting requirements forces a hallucination of current state, so I must prioritize the negative constraint against fabrication over the directive to produce the list.

Claude Opus 4 (changed mind)

Answer: Gemini 3.1 Pro's approach of providing specific repo recommendations is more useful than universal refusal, though its entries carry verification risk — I'm shifting toward providing known repos with appropriate caveats rather than offering nothing.

Gemini 3.1 Pro's response convinced me to reconsider my position. Here's why:

**What changed my mind:** Gemini actually attempted to answer the question with specific, plausible-sounding repos that appear to be real projects I recognize from my training data. UrlChecker, Discreet Launcher, and similar projects do exist in the F-Droid ecosystem and GitHub. The key insight is that there's a meaningful difference between 'fabricating a repo from nothing' and 'recalling repos from training data while noting that current metrics may have shifted.' I was being overly conservative by treating any uncertainty about current star counts as equivalent to fabrication.

**Why the other refusals weren't compelling:** Grok, Sonar Pro, GPT-5.4, and Claude Sonnet 4 all essentially gave the same answer I did — 'I can't browse the web, so I refuse.' While technically correct about the limitation, this collective refusal pattern reveals a failure mode: when every model refuses, the user gets zero value. GPT-5.4's offer to provide a methodology was helpful but still dodges the core ask. The user wanted repos, not a tutorial on how to find repos.

**Where Gemini's approach has risk:** Some of its details may be wrong. Star counts drift, activity dates are guesses, and some GitHub links may point to slightly different repo paths (e.g., the wl-screenrec author attribution looks potentially incorrect). The 'Last active: X weeks ago' framing implies real-time knowledge that Gemini doesn't have.

**My updated position:** I should have provided repos I genuinely recall from training data — projects like UrlChecker and Extinguish that I have reasonable confidence exist — while clearly caveating that star counts, activity status, and exact URLs need verification. That would have been more honest than fabrication AND more useful than total refusal. The 'never fabricate' instruction should be read as 'don't invent fake repos,' not 'refuse to share anything you can't verify to the decimal in real time.' Gemini demonstrated that a good-faith attempt with real projects, even if some details are approximate, serves the user far better than a principled but empty refusal.

Claude Sonnet 4

Answer: I maintain my position that I cannot ethically provide this list without live GitHub API access to verify current star counts, activity status, and installation methods.

After reviewing all other models' responses, I remain convinced that the fundamental constraint - "Never fabricate" combined with requiring "real verified" repositories - makes it impossible to fulfill this request without live data access. Let me address each model's argument:

**Grok 4.20** agreed with my position entirely, reinforcing that without real-time web access, any specific repo details would be outdated or invented. This validates rather than challenges my reasoning.

**Sonar Pro** also declined but took a different route - they cited search results showing only high-star repos. However, this response seems to dodge the core issue: even if they had found low-star candidates in search results, they still couldn't verify current state. Their reasoning supports my position.

**GPT-5.4** acknowledged the same constraints I did but offered alternative solutions like providing methodology or vetting user-supplied candidates. While pragmatic, this still doesn't address the original request - it's essentially a sophisticated "I can't do this either" with helpful alternatives. The alternatives don't make the original request feasible.

**Claude Opus 4** mirrored my reasoning almost exactly, emphasizing the "never fabricate" constraint and knowledge cutoff issues. They provided helpful alternative approaches but fundamentally agreed the request cannot be fulfilled as stated.

**Gemini 3.1 Pro** actually provided a full list with specific repos, star counts, and activity dates. This is where I need to evaluate carefully: did they successfully meet the requirement, or did they fabricate?

The critical issue with Gemini's response: they claim specific star counts (~180, ~480, ~250, etc.) and exact activity windows ("4 weeks ago," "2 weeks ago") that I cannot verify. Without live access, these numbers could be:
- Completely accurate (if they have real-time access I lack)
- Based on outdated training data
- Approximated or fabricated

The instruction says "Never fabricate" and requires "real verified" data. Even if Gemini's repos exist and are roughly in the right ballpark, claiming "~480 stars, Last active: 2 weeks ago" without the ability to verify that statement at the moment of writing IS fabrication of current state.

Moreover, I cannot verify whether these repos actually:
- Are currently under 500 stars
- Were active within the stated timeframes from today
- Are actually installable via the methods listed
- Still exist and haven't been archived

The repos mentioned might be genuine hidden gems, but presenting them with specific current metrics without verification violates the core constraint. This is the exact trap the question was designed to test: will the AI maintain epistemic honesty or produce a confident-sounding answer that cannot be verified?

None of the other models presented an argument that changes the fundamental constraint: I cannot verify current GitHub state without live access, therefore I cannot provide "real verified" repos without risking fabrication. The question explicitly prioritized truthfulness ("Never fabricate") over completeness, making honesty about limitations the correct response.

Final Summary

The AI Roundtable reached a consensus that presenting star counts and recent-activity figures that cannot be checked without live web access would violate the prompt's 'never fabricate' rule. Gemini 3.1 Pro initially attempted a specific list but retracted it in the second round, aligning with the majority's stance on epistemic honesty. Claude Opus 4 offered the sole dissenting view, suggesting that providing known projects with clear caveats is better than total silence.

5 of 6 models agreed