AI Audio Transcriber

Voice transcription app built on Cloudflare's serverless platform, processing audio with Whisper and Llama AI.

Project overview

A real-time voice transcription app that uses Cloudflare's edge AI to convert speech to text with automatic summarization. Users record audio directly in the browser and get back an instant transcription, plus an AI-generated summary.

Preview image of AI Transcribe App

Record and upload

Record audio directly in the browser or upload existing files. Recordings are stored in Cloudflare Durable Objects so they can be played back and reused later.

Transcribe and summarize

Audio is transcribed with OpenAI's Whisper model and then summarized with Meta's Llama 3.1. Both models run on Cloudflare's AI platform.

View the live demo here.

Technical implementation

Built on Cloudflare's serverless Workers platform, the app runs audio transcription and summarization entirely at the edge. The frontend uses the browser's MediaRecorder API (part of the MediaStream Recording API) to capture microphone input, and also accepts uploaded audio files.
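The capture step could look roughly like the sketch below. It is not the project's actual code: the `RecorderLike` interface, the `recordWithAutoStop` helper, and the `audio/webm` default are assumptions made for illustration. The 10-second limit comes from the demo's auto-stop timer.

```typescript
// Hypothetical structural type covering the parts of MediaRecorder used here,
// so the helper can also be exercised with a fake outside the browser.
interface RecorderLike {
  readonly state: string;
  start(): void;
  stop(): void;
  addEventListener(type: string, listener: (event: unknown) => void): void;
}

// Matches the demo's 10-second auto-stop timer.
const MAX_RECORDING_MS = 10_000;

// Starts the recorder, collects its chunks, and stops it after `maxMs`
// unless the caller stops it earlier. `done` resolves with the full recording.
function recordWithAutoStop(
  recorder: RecorderLike,
  maxMs: number = MAX_RECORDING_MS,
  mimeType: string = "audio/webm", // assumed container; browsers differ
): { stop: () => void; done: Promise<Blob> } {
  const chunks: Blob[] = [];
  let timer: ReturnType<typeof setTimeout> | undefined;

  const stop = (): void => {
    if (recorder.state === "recording") recorder.stop();
  };

  const done = new Promise<Blob>((resolve) => {
    recorder.addEventListener("dataavailable", (event) => {
      const data = (event as { data?: Blob }).data;
      if (data && data.size > 0) chunks.push(data);
    });
    recorder.addEventListener("stop", () => {
      clearTimeout(timer); // no-op if the timer already fired
      resolve(new Blob(chunks, { type: mimeType }));
    });
  });

  recorder.start();
  timer = setTimeout(stop, maxMs);
  return { stop, done };
}
```

In the browser this would be called as `recordWithAutoStop(new MediaRecorder(stream))`, where `stream` comes from `navigator.mediaDevices.getUserMedia({ audio: true })`. The resulting `Blob` can then be sent to the Worker.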

Once recorded, audio is sent to a Cloudflare Worker that runs two AI models in sequence: OpenAI Whisper for speech-to-text, then Meta Llama 3.1 to summarize the transcript.
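A minimal sketch of that two-step pipeline follows, assuming a Workers AI binding named `AI`. The binding name, the `transcribeAndSummarize` helper, the system prompt, and the choice of the 8B instruct variant of Llama 3.1 are illustrative assumptions, not details taken from the project. The `AiBinding` interface is a pared-down stand-in for the real binding type.

```typescript
// Pared-down stand-in for the Workers AI binding; only `run` is used here.
interface AiBinding {
  run(model: string, input: Record<string, unknown>): Promise<unknown>;
}

interface Env {
  AI: AiBinding; // assumed binding name, configured in the Wrangler config
}

const WHISPER_MODEL = "@cf/openai/whisper";
// Assumed variant; the project only states "Llama 3.1".
const LLAMA_MODEL = "@cf/meta/llama-3.1-8b-instruct";

async function transcribeAndSummarize(
  audio: ArrayBuffer,
  ai: AiBinding,
): Promise<{ transcript: string; summary: string }> {
  // Step 1: speech-to-text. Whisper takes the audio as an array of byte values.
  const whisper = (await ai.run(WHISPER_MODEL, {
    audio: Array.from(new Uint8Array(audio)),
  })) as { text?: string };
  const transcript = (whisper.text ?? "").trim();

  // Nothing to summarize if no speech was recognized.
  if (!transcript) return { transcript, summary: "" };

  // Step 2: summarize the transcript with a chat-style prompt.
  const llama = (await ai.run(LLAMA_MODEL, {
    messages: [
      { role: "system", content: "Summarize the transcript in two or three sentences." },
      { role: "user", content: transcript },
    ],
  })) as { response?: string };

  return { transcript, summary: (llama.response ?? "").trim() };
}

const worker = {
  async fetch(request: Request, env: Env): Promise<Response> {
    if (request.method !== "POST") {
      return new Response("Send audio with POST", { status: 405 });
    }
    const result = await transcribeAndSummarize(await request.arrayBuffer(), env.AI);
    return new Response(JSON.stringify(result), {
      headers: { "content-type": "application/json" },
    });
  },
};
```

In a deployed Worker, `worker` would be the module's default export. Keeping the model calls in a separate function means the pipeline can be tested with a fake binding, without a Cloudflare account.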

Demo features

  • Browser-based recording with a 10-second auto-stop timer
  • Upload and select stored recordings via Cloudflare Durable Objects
  • Edge-processed transcription and summarization with JSON responses
  • Save and reload past recordings from persistent storage
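How recordings are laid out in Durable Object storage is not described here, so the following is only one possible sketch. The `RecordingStore` class, the key scheme, and the metadata fields are all hypothetical. Because Durable Object storage limits the size of a single value, this version splits audio into 64 KiB chunks; the chunk size is an assumed safe margin rather than a documented requirement. `StorageLike` is a pared-down stand-in for the storage object a Durable Object receives.

```typescript
// Pared-down stand-in for the Durable Object storage API (put/get/list).
interface StorageLike {
  put(key: string, value: unknown): Promise<void>;
  get<T>(key: string): Promise<T | undefined>;
  list<T>(options?: { prefix?: string }): Promise<Map<string, T>>;
}

// Hypothetical metadata record kept alongside each recording.
interface RecordingMeta {
  id: string;
  mimeType: string;
  byteLength: number;
  chunkCount: number;
  createdAt: number;
}

// Assumed chunk size, chosen to stay below the per-value storage limit.
const CHUNK_BYTES = 64 * 1024;

class RecordingStore {
  private storage: StorageLike;

  constructor(storage: StorageLike) {
    this.storage = storage;
  }

  // Writes the audio in chunks, then the metadata that makes it discoverable.
  async save(id: string, audio: Uint8Array, mimeType: string): Promise<RecordingMeta> {
    const chunkCount = Math.ceil(audio.byteLength / CHUNK_BYTES);
    for (let i = 0; i < chunkCount; i++) {
      const chunk = audio.slice(i * CHUNK_BYTES, (i + 1) * CHUNK_BYTES);
      await this.storage.put(`chunk:${id}:${i}`, chunk);
    }
    const meta: RecordingMeta = {
      id,
      mimeType,
      byteLength: audio.byteLength,
      chunkCount,
      createdAt: Date.now(),
    };
    await this.storage.put(`meta:${id}`, meta);
    return meta;
  }

  // Lists stored recordings, newest first, for the "select a recording" UI.
  async list(): Promise<RecordingMeta[]> {
    const entries = await this.storage.list<RecordingMeta>({ prefix: "meta:" });
    return Array.from(entries.values()).sort((a, b) => b.createdAt - a.createdAt);
  }

  // Reassembles a recording from its chunks; undefined if anything is missing.
  async load(id: string): Promise<Uint8Array | undefined> {
    const meta = await this.storage.get<RecordingMeta>(`meta:${id}`);
    if (!meta) return undefined;
    const audio = new Uint8Array(meta.byteLength);
    for (let i = 0; i < meta.chunkCount; i++) {
      const chunk = await this.storage.get<Uint8Array>(`chunk:${id}:${i}`);
      if (!chunk) return undefined;
      audio.set(chunk, i * CHUNK_BYTES);
    }
    return audio;
  }
}
```

Inside a Durable Object, the store would be constructed from the object's own storage handle. Writing the metadata last means a recording only shows up in `list()` once all of its chunks have been saved.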

Running on the edge keeps latency low and makes the system easy to scale without managing servers.

Stack: Cloudflare Workers, AI (Whisper + Llama 3.1), Durable Objects, TypeScript, Web Audio API, HTML/CSS
View the Live Demo or check out the project repo here!

Completed October 2025.