Whisper Audio Transcription

Transcribe audio and video files to text using OpenAI Whisper. Generates transcripts with timestamps and subtitle files (SRT/VTT) in 99+ languages. Speaker diarization is not built into Whisper, so speaker labels require a separate diarization step.

This skill automates audio transcription workflows. It splits long-form content into chunks for processing, generates timestamped transcripts, creates subtitle files in multiple formats, and supports translating speech into English (Whisper's translation task targets English only). Works with podcasts, meetings, lectures, and video content.
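As a rough illustration of the subtitle step, the sketch below turns a list of timed segments into SRT text. It assumes each segment is a dict with `start`, `end` (seconds) and `text` keys, which is the shape the openai-whisper package returns under `result["segments"]`. The helper names `format_timestamp` and `segments_to_srt` are hypothetical and not part of this skill.

```python
def format_timestamp(seconds: float, sep: str = ",") -> str:
    """Format seconds as HH:MM:SS,mmm (SRT). Pass sep='.' for VTT-style times."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{sep}{ms:03d}"


def segments_to_srt(segments) -> str:
    """Render segments (dicts with 'start', 'end', 'text') as SRT text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    # Cues are separated by a blank line.
    return "\n".join(blocks)


# Hypothetical usage with the openai-whisper package (file name is an assumption):
#   import whisper
#   model = whisper.load_model("base")
#   result = model.transcribe("meeting.mp3")
#   print(segments_to_srt(result["segments"]))
```

Working in integer milliseconds avoids rounding drift when splitting a time into hours, minutes, and seconds.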

Tags: transcription, audio, whisper, subtitles, speech-to-text

When to use

Use when you need to transcribe meetings, generate subtitles for videos, convert podcasts to text, or create searchable archives of audio content.

Examples

Transcribe meeting recording

Convert a meeting recording to a formatted transcript

Transcribe this 1-hour meeting recording and format it with speaker labels and timestamps every 30 seconds
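One possible way to get a timestamp every 30 seconds is to bucket the transcribed segments by start time. The sketch below assumes segments are dicts with `start` (seconds) and `text` keys; `group_by_interval` and `format_transcript` are hypothetical names. Speaker labels are not covered here, since they would have to come from a separate diarization step.

```python
def group_by_interval(segments, interval: float = 30.0):
    """Group segment text into consecutive windows of `interval` seconds."""
    buckets = {}
    for seg in segments:
        # A segment belongs to the window in which it starts.
        key = int(seg["start"] // interval)
        buckets.setdefault(key, []).append(seg["text"].strip())
    return [(key * interval, " ".join(texts)) for key, texts in sorted(buckets.items())]


def format_transcript(segments, interval: float = 30.0) -> str:
    """Render one '[HH:MM:SS] text' line per non-empty window."""
    lines = []
    for start, text in group_by_interval(segments, interval):
        minutes, seconds = divmod(int(start), 60)
        hours, minutes = divmod(minutes, 60)
        lines.append(f"[{hours:02d}:{minutes:02d}:{seconds:02d}] {text}")
    return "\n".join(lines)
```

Windows with no speech are skipped rather than printed as empty lines, which keeps the transcript compact for recordings with long pauses.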

Generate video subtitles

Create SRT subtitle files from a video

Generate SRT subtitles for this video with proper line breaks and timing, max 2 lines per subtitle
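The two-line limit can be sketched with Python's standard `textwrap` module. The example below wraps a segment's text and splits it into several cues when it would exceed two lines, dividing the segment's duration in proportion to each cue's length. The 42-character line width is an assumed convention, and `wrap_cue` and `split_segment` are hypothetical helper names.

```python
import textwrap


def wrap_cue(text: str, width: int = 42, max_lines: int = 2):
    """Split text into cue bodies of at most `max_lines` lines of `width` chars."""
    lines = textwrap.wrap(text, width=width)
    return ["\n".join(lines[i:i + max_lines]) for i in range(0, len(lines), max_lines)]


def split_segment(seg, width: int = 42, max_lines: int = 2):
    """Split one segment into cues, sharing its duration by text length."""
    parts = wrap_cue(seg["text"].strip(), width, max_lines)
    total_chars = sum(len(p) for p in parts) or 1
    duration = seg["end"] - seg["start"]
    cues, cursor = [], seg["start"]
    for part in parts:
        # Assumption: speaking time is roughly proportional to character count.
        share = duration * len(part) / total_chars
        cues.append({"start": cursor, "end": cursor + share, "text": part})
        cursor += share
    return cues
```

Proportional timing is only an approximation. If word-level timestamps are available, splitting on those would give more accurate cue boundaries.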