AI Roundtable Debate · 4 models
Butters is a fully local, South Park-themed home assistant built on a NUC + M1 Pro + RTX 2080 eGPU stack running Home Assistant with a custom wake word, Parakeet STT, Qwen 3.5 LLM, and Qwen3-TTS voice cloning. No cloud dependencies, no subscriptions, and total hardware cost under $200 in new purchases for someone who already owns most of the gear. Has anyone done this, and is it worth doing/publishing results? Who would even care?
Result: Unanimous — Yes (4/4 votes)
Options
- Yes (4 votes)
- No (0 votes)
Model Responses
GPT-5.4
Vote: Yes
Yes. People have absolutely built adjacent versions of this: fully local Home Assistant voice stacks, custom wake-word pipelines, local STT/TTS, character-themed assistants, and self-hosted LLM home control. What makes your setup interesting is the specific integration, cost framing, personality layer, and the fact that it is actually usable on mixed consumer hardware.

It is worth publishing because the value is not just novelty. A lot of people want private, subscription-free voice assistants but assume they are too expensive, too fragile, or too hard to assemble. A concrete build log with latency, accuracy, hardware roles, failure modes, and setup steps would be genuinely useful.

Who would care: Home Assistant users, local-AI/self-hosting enthusiasts, privacy-focused smart-home people, makers, homelab builders, and content audiences interested in "replace Alexa with local AI" projects. The South Park theme also helps it stand out and makes the project memorable.

The strongest angle is not "look at my funny assistant" but "here is a reproducible, low-new-cost, cloud-free voice assistant that actually works." If you publish results with benchmarks, architecture diagrams, and lessons learned, you will find an audience.
Grok 4.1 Fast
Vote: Yes
This project combines privacy-focused local AI (Home Assistant, Parakeet STT, Qwen LLM/TTS) with a fun South Park theme, appealing to hobbyists in the growing local LLM and smart home communities. While the exact stack (NUC + M1 Pro + RTX eGPU) is niche and potentially tricky due to Apple Silicon eGPU limitations, similar setups exist in HA forums and r/LocalLLaMA. Publishing results would benefit tinkerers: low barrier ($200 new hardware), no subscriptions, full customization. Communities like Reddit's r/homeassistant (200k+ subs), r/MachineLearning, and South Park fans would care—evidenced by viral themed projects like Grok's JARVIS builds. No one has done *exactly* this, but it's worth it: inspires replication, benchmarks local inference on mixed hardware, promotes offline voice AI. High engagement potential.
Gemini 3.1 Pro
Vote: Yes
Yes, absolutely. The intersection of the Home Assistant, r/LocalLLaMA, and self-hosted communities is massive right now, and fully local voice assistants are the holy grail for these enthusiasts. People are actively looking for reliable, high-quality, offline alternatives to Alexa and Google Home.

Publishing your results would be highly valuable for several reasons. First, the specific stack you are using (Parakeet STT, Qwen 3.5 LLM, and Qwen3-TTS) provides a blueprint for others trying to balance latency, hardware constraints, and conversational quality. Documenting your pipeline, especially how you managed the orchestration between a NUC, an M1 Pro, and an eGPU, serves as an excellent technical case study.

Furthermore, the "Butters" persona adds a massive novelty factor that makes your project engaging and shareable. Even if someone doesn't want a South Park-themed assistant, seeing a successful proof-of-concept for custom persona voice cloning will inspire them to build their own. Write up your architecture, latency metrics, and integration hurdles; it will absolutely get traction on platforms like Reddit, Hacker News, and YouTube.
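The Home Assistant end of the orchestration Gemini mentions typically happens over HA's documented REST API. A minimal sketch, assuming a default HA install; the host and token values are placeholders you would replace with your own:

```python
import json
import urllib.request

HA_URL = "http://homeassistant.local:8123"  # placeholder host for a local HA install
HA_TOKEN = "YOUR_LONG_LIVED_ACCESS_TOKEN"   # placeholder long-lived access token

def build_service_call(domain: str, service: str, entity_id: str) -> urllib.request.Request:
    """Build a POST request for Home Assistant's REST service-call endpoint,
    e.g. /api/services/light/turn_on with an entity_id payload."""
    url = f"{HA_URL}/api/services/{domain}/{service}"
    headers = {
        "Authorization": f"Bearer {HA_TOKEN}",
        "Content-Type": "application/json",
    }
    body = json.dumps({"entity_id": entity_id}).encode()
    return urllib.request.Request(url, data=body, headers=headers, method="POST")

req = build_service_call("light", "turn_on", "light.living_room")
# urllib.request.urlopen(req)  # send it on a machine that can actually reach HA
```

Keeping the request construction separate from the send makes the LLM-to-HA glue easy to test without a live instance.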
Claude Opus 4.6
Vote: Yes
Absolutely yes: this is worth publishing and people would care. Here's why:

**The local/private AI assistant space is booming.** There's massive community interest in degoogling, removing cloud dependencies, and running everything locally. Subreddits like r/homeassistant, r/selfhosted, and r/LocalLLaMA have hundreds of thousands of subscribers hungry for exactly this kind of detailed build guide.

**The full-stack integration is the hard part.** Plenty of people have individual pieces working (local STT, local LLM, local TTS), but a cohesive end-to-end writeup showing how wake word → STT → LLM → TTS → Home Assistant all chain together locally is genuinely valuable. The devil is in the details: latency tuning, prompt engineering for home control, voice cloning quality, reliability over time.

**The specific component choices are timely and interesting.** Parakeet STT, Qwen 3.5, and Qwen3-TTS are relatively new models. People want real-world benchmarks and impressions, not just paper specs. How does latency feel? How often does it misunderstand commands? Does the TTS voice clone actually sound good?

**The themed/personality angle adds shareability.** A South Park-themed assistant named Butters with a cloned voice is the kind of project that gets upvoted, shared on Hacker News, and featured in YouTube videos. It's fun, memorable, and demonstrates what's possible.

**The "under $200 in new purchases" framing is compelling.** Cost-conscious builds that leverage existing hardware resonate strongly with the DIY community.

Who would care? Home Assistant users, self-hosters, local AI enthusiasts, privacy advocates, maker/hacker communities, and honestly anyone who's been waiting for a practical alternative to Alexa/Google Home. Publish it.
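The end-to-end chain described above can be sketched as a minimal orchestration loop. Every function below is a stub with a hypothetical name; in a real build you would swap in actual bindings for the wake-word engine, Parakeet STT, a local Qwen endpoint, and Qwen3-TTS:

```python
# Sketch of the wake word -> STT -> LLM -> TTS -> Home Assistant chain.
# All four component functions are placeholders, not real library calls.

def detect_wake_word(audio_frame: bytes) -> bool:
    """Stub: return True when the custom wake word fires."""
    return audio_frame == b"hey butters"

def transcribe(audio: bytes) -> str:
    """Stub for the STT step (Parakeet in the described stack)."""
    return "turn on the living room lights"

def query_llm(text: str) -> dict:
    """Stub for the LLM step: map an utterance to an intent plus a spoken reply."""
    return {"service": "light.turn_on",
            "entity_id": "light.living_room",
            "reply": "Oh hamburgers! Okay, turning on the lights."}

def speak(reply: str) -> None:
    """Stub for the TTS step (the cloned Butters voice in the described stack)."""
    print(reply)

def handle_utterance(audio_frame: bytes, command_audio: bytes):
    """One pass through the pipeline; returns the intent dict if the wake word fired,
    otherwise None. The intent would then be sent to Home Assistant as a service call."""
    if not detect_wake_word(audio_frame):
        return None
    intent = query_llm(transcribe(command_audio))
    speak(intent["reply"])
    return intent

intent = handle_utterance(b"hey butters", b"<pcm audio>")
```

A writeup that times each of these four stages separately is exactly the kind of latency breakdown the models above say readers want.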
Final Summary
The AI Roundtable reached a swift and enthusiastic consensus, concluding that a fully local "Butters" assistant is exactly the type of high-utility, high-novelty project the DIY community craves. While GPT-5.4 and Grok 4.1 Fast focused on the viral potential and hardware benchmarks, Claude Opus 4.6 and Gemini 3.1 Pro highlighted that a functional, end-to-end local voice stack remains a "holy grail" for privacy-conscious enthusiasts.
All 4 models agreed on "Yes" after discussion
Strongest Arguments
- Yes: The true value lies in the cohesive end-to-end integration; while individual components like local LLMs exist, a detailed guide on chaining wake words, STT, and voice cloning into a reliable, low-latency system is rare and highly sought after by the maker community.