AI Roundtable Poll · 2 models

Cria um app que tira as legendas de um vídeo , totalmente realista, seria pra web

Result: Unanimous — Yes (2/2 votes)

Options

Yes (2 votes)
No (0 votes)

Model Responses

Claude Opus 4.6

Vote: Yes

Yes, it is entirely feasible to create a web application that extracts subtitles/captions from a video. There are multiple realistic approaches: 1. **Embedded subtitles (soft subs):** If the video contains embedded subtitle tracks (e.g., SRT, VTT, or subtitle streams in MKV/MP4), libraries like ffmpeg (via WebAssembly or a backend service) can extract them directly. 2. **Hardcoded subtitles (burned-in):** For subtitles that are part of the video frames, OCR-based approaches (e.g., Tesseract.js on the frontend, or server-side OCR tools) can detect and extract text from video frames. This is more complex but doable. 3. **Speech-to-text transcription:** If the goal is to generate subtitles from audio, APIs like Google Speech-to-Text, OpenAI Whisper, or AWS Transcribe can transcribe the audio track and produce timed subtitles. 4. **YouTube/platform subtitles:** If the video comes from a platform like YouTube, existing APIs and tools (like yt-dlp) can download the available subtitle tracks. The tech stack could involve a frontend (React, Vue, or vanilla JS) with a backend (Node.js, Python/Flask/FastAPI) handling the heavy processing. Tools like FFmpeg, Whisper, and Tesseract are all open-source and production-ready. This is a realistic and achievable web application project.

Gemini 3.1 Pro

Vote: Yes

Embora eu não possa hospedar e publicar o aplicativo diretamente, eu certamente posso projetar a arquitetura e fornecer todo o código necessário para você construir essa aplicação web. Para remover legendas de forma 'totalmente realista', o aplicativo precisaria de um backend robusto (usando Python e FastAPI, por exemplo). Esse backend usaria uma ferramenta de OCR (como EasyOCR) para detectar a posição exata das legendas frame a frame, e em seguida aplicaria um modelo de inteligência artificial de 'video inpainting' (como o ProPainter) para preencher o fundo de forma contínua e natural, sem deixar borrões óbvios. Com uma interface web simples em HTML/JS para o upload dos vídeos e esse processamento avançado no servidor, é perfeitamente possível criar a solução que você descreve. Voto 'Yes' porque posso guiar você passo a passo na criação desse app.

Final Summary

The AI Roundtable reached a swift consensus, with both models confirming that building a web-based subtitle removal tool is entirely feasible with modern technology. While Claude Opus 4.6 focused on the mechanics of data extraction via OCR and speech-to-text, Gemini 3.1 Pro elevated the discussion by detailing how video inpainting can seamlessly restore the background for a 'totally realistic' finish.

No agreement after 1 rounds of discussion. Final vote: Yes: 2

Strongest Arguments

Yes: The integration of AI video inpainting models like ProPainter ensures that when text is removed, the background is reconstructed naturally rather than simply blurred, achieving the high-fidelity results requested.