captions.events is an open-source real-time captioning platform built on Next.js, ElevenLabs Scribe v2 Realtime API, and Supabase Realtime.
It allows you to broadcast live transcriptions to unlimited viewers with automatic language detection and on-device translation.
You can use this platform to create professional-quality live captions for conferences, webinars, and presentations without external translation API costs or privacy concerns.
All translation processing happens locally in the viewer’s browser through Chrome’s Translator API, so caption text is never sent to an external translation service.
The system separates broadcaster and viewer roles, where broadcasters control the recording session and viewers receive real-time updates through Supabase’s realtime subscriptions.
Features
🎤 Real-Time Transcription: Captures and processes speech with 100-200ms latency using ElevenLabs Scribe v2 Realtime API.
📡 Unlimited Viewer Broadcasting: Streams live captions to any number of concurrent viewers through Supabase Realtime.
🌍 On-Device Translation: Translates captions into 14+ languages using Chrome’s built-in Translator API without external API calls.
🔍 Automatic Language Detection: Identifies the spoken language in real time using Chrome’s Language Detector API, keeping detections with at least 50% confidence.
👤 GitHub OAuth Authentication: Manages user sessions and event ownership through GitHub authentication.
📜 Caption History Management: Stores and displays complete transcription history with language metadata.
🔒 Single-Use Token System: Generates secure, time-limited tokens for ElevenLabs API access without exposing credentials.
📱 Separate Broadcaster and Viewer Interfaces: Provides dedicated pages for recording and viewing with real-time synchronization.
Use Cases
- Conference Accessibility: Provide real-time captions for conference talks with automatic translation for international attendees.
- Corporate Webinars: Broadcast live presentations with captions to remote employees across different language regions.
- Educational Lectures: Record lectures with timestamped captions that students can review and translate into their preferred language.
- Live Event Streaming: Add professional captioning to virtual events, product launches, or community meetings with multilingual support.
Installation
1. Clone the repository and install dependencies:

```bash
git clone https://github.com/yourusername/v0_realtime_scribe.git
cd v0_realtime_scribe
pnpm install
```

2. Create a new project at supabase.com. Run the database migrations located in supabase/migrations/ in sequential order through the Supabase Dashboard SQL editor. For local development, you can use the Supabase CLI by running supabase start in the project directory.
3. Navigate to github.com/settings/developers and create a new OAuth App. Set the callback URL to http://localhost:3000/auth/callback for local development. Copy the Client ID and Client Secret. Go to your Supabase Dashboard, navigate to Authentication, then Providers, and enable GitHub. Enter your Client ID and Client Secret in the respective fields.
4. Create a .env.local file in your project root with the following variables:

```
NEXT_PUBLIC_SUPABASE_URL=https://your-project.supabase.co
NEXT_PUBLIC_SUPABASE_ANON_KEY=your-anon-key
NEXT_PUBLIC_SITE_URL=http://localhost:3000
ELEVENLABS_API_KEY=your_api_key
```

5. Get your ElevenLabs API key by signing in at elevenlabs.io and copying it from your profile settings. The ELEVENLABS_API_KEY remains server-side only and never gets exposed to clients. The system uses it to generate single-use tokens that expire after 15 minutes.
6. Start the development server:

```bash
pnpm dev
```

Access the application at http://localhost:3000. Sign in using GitHub authentication to create and manage events.
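For reference, GitHub sign-in and all browser-side Supabase access rely only on the NEXT_PUBLIC_ variables from step 4. The following is a minimal sketch with supabase-js, assuming a standard browser client; the exact auth wiring in this repository may differ:

```ts
import { createClient } from "@supabase/supabase-js";

// Browser client: NEXT_PUBLIC_ variables are inlined into the bundle by Next.js.
// ELEVENLABS_API_KEY is intentionally absent here; it stays server-side.
const supabase = createClient(
  process.env.NEXT_PUBLIC_SUPABASE_URL!,
  process.env.NEXT_PUBLIC_SUPABASE_ANON_KEY!
);

// Redirects to GitHub, then back to the callback URL configured in step 3.
export async function signInWithGitHub() {
  await supabase.auth.signInWithOAuth({
    provider: "github",
    options: {
      redirectTo: `${process.env.NEXT_PUBLIC_SITE_URL}/auth/callback`,
    },
  });
}
```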
Broadcasting Captions
After signing in, create a new event from the dashboard. Navigate to the broadcast page at /broadcast/[uid] where [uid] is your event’s unique identifier. Click “Start Recording” to begin transcription.
Grant microphone access when prompted by your browser. For optimal audio quality, the system requests the microphone with echo cancellation, noise suppression, and auto gain control enabled.
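A minimal sketch of that microphone request using the standard getUserMedia API (the exact constraints in this codebase may differ):

```ts
// Open the microphone with the audio-processing constraints described above.
// The resulting MediaStream feeds the realtime transcription session.
async function openMicrophone(): Promise<MediaStream> {
  return navigator.mediaDevices.getUserMedia({
    audio: {
      echoCancellation: true,
      noiseSuppression: true,
      autoGainControl: true,
    },
  });
}
```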
Speak into your microphone. Partial transcripts appear immediately in italics with a light background as you speak. Final transcripts appear with a solid background when you pause or finish a sentence.
The Chrome Language Detector API automatically identifies the spoken language with real-time updates during recording. All final transcripts save to Supabase with the detected language code.
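A hedged sketch of that detection step using Chrome’s Language Detector API; the global LanguageDetector object follows Chrome’s documentation, while the surrounding integration is an assumption rather than this repo’s exact code:

```ts
// Returns a BCP 47 language code when detection confidence is at least 50%
// (the threshold described above); otherwise returns null.
async function detectCaptionLanguage(text: string): Promise<string | null> {
  if (!("LanguageDetector" in self)) return null; // API unavailable in this browser
  const detector = await (self as any).LanguageDetector.create();
  const results = await detector.detect(text); // sorted by descending confidence
  const top = results[0];
  return top && top.confidence >= 0.5 ? top.detectedLanguage : null;
}
```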
Click “Stop Recording” to end the session. The recording stops, and the microphone connection closes.
Viewing Captions
Viewers access the event at /view/[uid] without authentication. The latest caption displays prominently at the top of the page. Caption history appears below in chronological order. All captions update automatically through Supabase realtime subscriptions.
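A minimal sketch of that subscription with supabase-js; the captions table and event_id column are named elsewhere in this README, but the channel name and row shape here are assumptions:

```ts
import type { SupabaseClient } from "@supabase/supabase-js";

// Listen for newly inserted captions for one event; `supabase` is a browser
// client created with the NEXT_PUBLIC_ variables (see the installation sketch).
function subscribeToCaptions(
  supabase: SupabaseClient,
  eventId: string,
  onCaption: (row: Record<string, unknown>) => void
) {
  return supabase
    .channel(`captions-${eventId}`)
    .on(
      "postgres_changes",
      { event: "INSERT", schema: "public", table: "captions", filter: `event_id=eq.${eventId}` },
      (payload) => onCaption(payload.new)
    )
    .subscribe();
}
```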
For translation, viewers using Chrome 138 or later can select their preferred language from the dropdown menu. On first use of a language pair, Chrome downloads the translation model locally with progress displayed on screen.
Once downloaded, translation happens instantly on the device with no network latency. The system automatically detects the source language from the transcription data, so viewers only select their target language.
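A hedged sketch of the viewer-side translation flow using Chrome’s Translator API; the global Translator object and downloadprogress event follow Chrome’s documentation, while the wiring here is an assumption, not this repo’s exact code:

```ts
// Create a translator for one language pair, surfacing model-download progress
// the first time the pair is used, then translate a caption on-device.
async function translateCaption(
  text: string,
  sourceLanguage: string, // detected language from the transcription data
  targetLanguage: string  // the viewer's selection from the dropdown
): Promise<string> {
  if (!("Translator" in self)) return text; // fall back to the original caption
  const translator = await (self as any).Translator.create({
    sourceLanguage,
    targetLanguage,
    monitor(m: any) {
      m.addEventListener("downloadprogress", (e: any) => {
        console.log(`Model download: ${Math.round(e.loaded * 100)}%`);
      });
    },
  });
  return translator.translate(text);
}
```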
Supported translation languages include English, Spanish, French, German, Italian, Portuguese, Dutch, Russian, Japanese, Korean, Chinese (Simplified), Arabic, Hindi, and Turkish.
If the Translator API is unavailable, viewers see an informational message with a link to Chrome documentation, but original captions still display normally.
Production Deployment
Deploy to Vercel by importing your repository at vercel.com. Add all environment variables from your .env.local file to the Vercel project settings.
Update your GitHub OAuth callback URL to match your production domain, such as https://yourdomain.com/auth/callback. Redeploy the application for the changes to take effect.
The /api/scribe-token endpoint handles token generation server-side. It accepts an optional eventUid query parameter to verify ownership. The endpoint returns a JSON response with a single-use token or appropriate error codes: 401 for unauthenticated requests, 403 for unauthorized access, 404 for missing events, and 500 for configuration errors.
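A hedged example of how a broadcaster page might call this endpoint from the client; the { token } response shape is an assumption based on the description above:

```ts
// Request a single-use ElevenLabs token for the event being broadcast.
async function fetchScribeToken(eventUid: string): Promise<string> {
  const res = await fetch(
    `/api/scribe-token?eventUid=${encodeURIComponent(eventUid)}`
  );
  if (!res.ok) {
    // 401, 403, 404, or 500 per the endpoint description above.
    throw new Error(`Token request failed with status ${res.status}`);
  }
  const { token } = await res.json();
  return token;
}
```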
Related Resources
- ElevenLabs Scribe Documentation – Official API reference for ElevenLabs Scribe v2 Realtime API with configuration options and usage examples.
- Chrome Translator API Documentation – Technical documentation for Chrome’s built-in translation capabilities and browser requirements.
- Supabase Realtime Guide – Official guide to implementing real-time subscriptions and broadcasting with Supabase.
- Chrome Language Detector API – Documentation for Chrome’s on-device language detection API with implementation details.
FAQs
Q: Why aren’t captions appearing for viewers?
A: Check that Row Level Security policies allow public reads on the captions table in your Supabase project. Verify that realtime is enabled on the captions table through the Supabase Dashboard. Confirm the event_id matches between the broadcaster and viewer pages. Open the browser console to see any connection or subscription errors.
Q: How do I fix translation features not working?
A: Verify you’re using Google Chrome 138 or later. Visit chrome://flags and enable “Enables optimization guide on device” and “Prompt API for Gemini Nano” flags. On first use of a language pair, Chrome downloads the translation model which requires an active internet connection and may take a few moments. After initial download, models are cached and work offline. Try selecting “Original (No Translation)” and then reselecting your target language if issues persist.
Q: What happens if token generation fails?
A: Verify your ELEVENLABS_API_KEY is correctly set in the .env.local file. Confirm you’re authenticated through GitHub and own the event you’re trying to broadcast. Check server logs for detailed error messages.
Q: Can viewers access captions without authentication?
A: Yes, viewers access the /view/[uid] page without signing in. The Row Level Security policies on the captions table allow public reads while restricting inserts to authenticated event owners. This design lets you share caption links widely without requiring viewer accounts.