Speech to Text

Speech to text used to mean a document. Here it means captions on screen. Upload a video, and Recapo listens to the spoken words, turns every line into a readable caption, and burns those captions straight into the picture — so the words travel with the video instead of living in a separate file.

There is nothing to set up and nothing to stitch together afterward. You hand over a clip, the AI does the listening and the writing, and you get back a finished captioned video (MP4) you can preview right on the page and download in one click.

Processing runs on this page — don't leave while it's running, or the task is cancelled.

Words that stay on the screen, not in a side file

The point of speech to text here isn't a transcript you have to manage — it's a video people can actually watch with the words right there. Recapo recognizes the speech, writes it out as captions, and bakes them into the frame, so the finished clip plays with its text anywhere: muted in a feed, on a phone, in a player that strips out subtitle tracks. One file, captions included, nothing to attach.

  • Interviews and podcasts: turn spoken answers into on-screen captions so quotes are readable as the clip plays.
  • Lectures and talks: capture every spoken point as a caption baked into the video for silent viewing.
  • Social and feed videos: ship a captioned cut that reads with the sound off, no subtitle file to manage.

From spoken audio to a finished captioned video

AI does the listening and the writing in one pass: it recognizes the speech, turns it into caption lines that follow the timing of the talk, and renders them into the picture. What comes out is a single captioned video — not a script, not a file to sync later — ready to preview on the page and download as MP4. Speech goes in; a watchable, captioned cut comes out.

How it works

How to use Recapo speech to text

Three steps, fully in the cloud — nothing to install.

Step 1: Upload your video

Add a video file from your device, or import from a link. Interviews, talking-head clips, lectures, and podcast videos all work.

Step 2: Let AI recognize the speech

AI speech recognition listens to the audio and turns the spoken words into clean, readable caption lines, timed to match what's being said on screen.
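For a sense of what "timed caption lines" means in practice, here is a minimal Python sketch. It is not Recapo's implementation: the `Word` and `Caption` types, the `group_words` helper, and the 32-character default are all hypothetical, and it assumes a recognizer that returns each word with start and end times in seconds. It packs words into lines under a length limit and writes them out in the common SRT subtitle layout.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds from the start of the video
    end: float


@dataclass
class Caption:
    text: str
    start: float
    end: float


def _to_caption(words):
    # A caption spans from its first word's start to its last word's end.
    return Caption(" ".join(w.text for w in words), words[0].start, words[-1].end)


def group_words(words, max_chars=32):
    """Greedily pack timed words into caption lines of at most max_chars.

    A single word longer than max_chars still gets its own line.
    """
    captions, current = [], []
    for word in words:
        candidate = " ".join(w.text for w in current + [word])
        if current and len(candidate) > max_chars:
            captions.append(_to_caption(current))
            current = [word]
        else:
            current.append(word)
    if current:
        captions.append(_to_caption(current))
    return captions


def srt_timestamp(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(captions):
    """Render captions as numbered SRT blocks separated by blank lines."""
    blocks = [
        f"{i}\n{srt_timestamp(c.start)} --> {srt_timestamp(c.end)}\n{c.text}"
        for i, c in enumerate(captions, start=1)
    ]
    return "\n\n".join(blocks) + "\n"
```

A shorter limit gives more, shorter lines that are easier to read on a phone; a longer one gives fewer line changes during fast speech.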

Step 3: Preview and download the captioned video

The captions are burned right into the picture. Preview the finished video on the page, then download the captioned MP4 — words and footage together in one file.
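If you are curious what "burned into the picture" looks like outside Recapo, the sketch below builds a command for the open-source ffmpeg tool, whose `subtitles` video filter draws a subtitle file onto every frame. This is an illustration only, not how Recapo renders video; the `burn_in_command` helper and the file names are hypothetical.

```python
import shlex


def burn_in_command(video_in, srt_path, video_out):
    """Build an ffmpeg command that renders an SRT file into the frames.

    The video is re-encoded because the pixels change; the audio stream
    is copied through untouched.
    """
    return [
        "ffmpeg",
        "-i", video_in,
        "-vf", f"subtitles={srt_path}",
        "-c:a", "copy",
        video_out,
    ]


cmd = burn_in_command("talk.mp4", "captions.srt", "talk_captioned.mp4")
print(shlex.join(cmd))
```

The sketch only builds and prints the command; running it requires ffmpeg to be installed. Subtitle paths containing characters such as colons or quotes need extra escaping inside the filter argument, which this example does not handle.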

Use it free
FAQ

Frequently asked questions about speech to text

Do I get a text file or a video?

A video. Recapo turns the spoken words into captions and burns them into the picture, so you download a finished captioned MP4 — the words live on screen with the footage, not in a separate transcript file.

Do I upload audio or a video?

Upload a video. Recapo listens to the speech in its audio, writes it out as captions, and renders those captions back into the same video — so what you download is the captioned clip.

What if a name or technical term comes out wrong?

AI recognition handles clear speech well, including most names and jargon in context. The captions are timed to the speech and burned into the final video, so the words stay matched to what's being said on screen.

How is this different from the caption generator?

Speech to Text gives you a full editable transcript of the words for use in scripts, notes, or search. The caption generator is aimed at producing time-synced subtitle lines to display on screen. Many creators transcribe first, edit the text, then build captions from it.

Ready to try Speech to Text?

Speech to text, on screen. Upload a video and AI recognizes the spoken words, turns them into captions, and burns them into the picture — giving you a finished captioned video to preview and download. Recapo.ai.
