Vidxt extracts the audio track from your video and converts the speech into a clean, time-coded transcript. You can use the result as plain text, captions or a starting point for translation, all without leaving the browser.
The tool handles MP4, MOV, AVI, MKV, WebM and FLV files, and works with anything from a short screen recording to a multi-hour conference talk. Speaker pauses, sentence boundaries and punctuation are inferred automatically so the transcript stays readable.
Common containers are covered: MP4, MOV, AVI, MKV, WebM and FLV. The tool decodes locally through FFmpeg WebAssembly, pulls the audio track out, and feeds only that audio to the speech engine, so even unusual codecs inside those containers usually work.
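As a rough illustration of the audio-extraction step, the sketch below builds an FFmpeg argument list that drops the video stream and writes uncompressed mono audio. The helper name `buildAudioExtractArgs`, the file names and the 16 kHz mono defaults are assumptions chosen for the example (16 kHz mono is a common input format for speech recognition); the page does not state which arguments Vidxt actually uses. The flags themselves are standard FFmpeg options.

```typescript
// Hypothetical helper, not Vidxt's actual code: builds the FFmpeg arguments
// for pulling the audio track out of a video container.
interface AudioExtractOptions {
  sampleRateHz?: number; // assumed default: 16000
  channels?: number;     // assumed default: 1 (mono)
}

function buildAudioExtractArgs(
  inputName: string,
  outputName: string,
  options: AudioExtractOptions = {},
): string[] {
  const sampleRateHz = options.sampleRateHz ?? 16000;
  const channels = options.channels ?? 1;
  return [
    "-i", inputName,            // input container (MP4, MKV, WebM, ...)
    "-vn",                      // ignore the video stream entirely
    "-ac", String(channels),    // downmix to the requested channel count
    "-ar", String(sampleRateHz),// resample to the requested rate
    "-c:a", "pcm_s16le",        // 16-bit PCM, suitable for a .wav output
    outputName,
  ];
}

const args = buildAudioExtractArgs("talk.mkv", "audio.wav");
console.log(args.join(" "));
```

An argument array like this could then be handed to whichever FFmpeg WebAssembly wrapper the page uses. Because only the audio stream is decoded and re-encoded, the video codec inside the container matters much less than the container being readable.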
Demuxing and audio extraction run on your device, not on a remote server. Only the extracted audio reaches the speech model, and it is discarded once the transcript is generated, so confidential interviews and unreleased footage never accumulate in cloud storage.
Yes, subtitles are supported. After transcription you can export a timestamped SRT file, which loads into most video players and non-linear editors (NLEs). You can also copy plain text if you only need a script or article-style version of the content.
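To make the SRT format concrete, here is a minimal sketch of turning time-coded segments into SRT text. The `Segment` shape and the function names are hypothetical and only illustrate the format; they are not Vidxt's internal API. SRT numbers each cue from 1, writes timestamps as `HH:MM:SS,mmm` with a comma before the milliseconds, and separates cues with a blank line.

```typescript
// Hypothetical segment shape for illustration: times are in seconds.
interface Segment {
  startSec: number;
  endSec: number;
  text: string;
}

// Left-pads a non-negative integer with zeros to the given width.
function pad(n: number, width: number): string {
  let s = String(n);
  while (s.length < width) s = "0" + s;
  return s;
}

// Formats seconds as an SRT timestamp: HH:MM:SS,mmm
function formatSrtTimestamp(seconds: number): string {
  const totalMs = Math.max(0, Math.round(seconds * 1000));
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)},${pad(ms, 3)}`;
}

// Joins segments into SRT cues: index, time range, text, blank line.
function toSrt(segments: Segment[]): string {
  return segments
    .map((seg, i) =>
      `${i + 1}\n` +
      `${formatSrtTimestamp(seg.startSec)} --> ${formatSrtTimestamp(seg.endSec)}\n` +
      `${seg.text.trim()}\n`)
    .join("\n");
}

console.log(toSrt([
  { startSec: 0, endSec: 2.5, text: "Hello." },
  { startSec: 2.5, endSec: 4, text: "World." },
]));
```

The plain-text export mentioned above corresponds to joining the `text` fields alone and dropping the indices and timestamps.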
Yes, screen recordings and meeting exports work. As long as the spoken audio is reasonably clear, screen recordings, Zoom exports and webinar replays transcribe well. Speech from multiple speakers is transcribed, but speaker labels are not assigned automatically.
If a file is too large, either compress it at a lower bitrate or trim it into shorter sections and transcribe each one. For very long talks, splitting is often faster anyway, since shorter segments can be processed in parallel.
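The trimming approach can be planned with a few lines of code. The sketch below splits a recording of known duration into consecutive ranges no longer than a chosen maximum; `planSegments`, `TimeRange` and the 30-minute limit in the usage example are illustrative assumptions, not limits documented for Vidxt.

```typescript
// Hypothetical helper for illustration: a time range in seconds.
interface TimeRange {
  startSec: number;
  endSec: number;
}

// Splits [0, durationSec) into back-to-back ranges of at most maxSegmentSec.
function planSegments(durationSec: number, maxSegmentSec: number): TimeRange[] {
  if (durationSec <= 0 || maxSegmentSec <= 0) return [];
  const ranges: TimeRange[] = [];
  for (let start = 0; start < durationSec; start += maxSegmentSec) {
    ranges.push({
      startSec: start,
      // The last range is shortened so it ends exactly at the duration.
      endSec: Math.min(start + maxSegmentSec, durationSec),
    });
  }
  return ranges;
}

// A 4000-second talk cut into pieces of at most 30 minutes (1800 s).
console.log(planSegments(4000, 1800));
```

When the per-segment transcripts are merged back together, each segment's timestamps need to be shifted by that segment's `startSec`. Fixed cut points can also land in the middle of a word, so cutting at a pause near each boundary is usually the better choice.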
No installation is needed. Vidxt runs entirely in a modern browser, including Chrome, Edge, Firefox and Safari. There is no plugin, no desktop app and no command-line setup, so you can transcribe a video from whatever laptop you have at hand.