Generate speech in any voice, from text or a short sample.

Voice Cloning

Turn text or a short voice recording into natural-sounding speech in your own voice or a licensed synthetic voice.

How it works, under the hood

From a short reference sample, the model builds a voice profile capturing pitch range, cadence, and characteristic pronunciation. This is not a static "voice font" but a profile that adapts its delivery to context. Longer or cleaner reference samples produce a closer match: a noisy 15-second phone clip yields a recognizable clone, but with less nuance than a clean 60-second studio sample.

What it’s good for

Narrating video or course content without hiring a voice actor
Prototyping dialogue before a final recording session
Localizing content into a consistent voice across languages

Can I clone the voice of a public figure or celebrity?
No. Voice cloning requires the consent of the person whose voice is being cloned. Use a licensed library voice instead.

How short can the reference sample be?
15 seconds is the technical minimum. Quality improves noticeably up to about 60 seconds, with diminishing returns beyond that.
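
If you want to check a recording's length before using it as a reference sample, the thresholds above can be sketched in a few lines of Python. This is a minimal illustration, not part of any product API: it assumes an uncompressed WAV file, and the function names and the labels "too short", "usable", and "recommended" are hypothetical.

```python
import wave

MIN_SECONDS = 15.0          # technical minimum stated in the answer above
RECOMMENDED_SECONDS = 60.0  # quality gains taper off beyond roughly this length


def classify_duration(seconds: float) -> str:
    """Hypothetical helper: rate a reference sample by its length alone."""
    if seconds < MIN_SECONDS:
        return "too short"
    if seconds < RECOMMENDED_SECONDS:
        return "usable"
    return "recommended"


def wav_duration_seconds(path: str) -> float:
    """Compute a WAV file's duration from its header (standard library only)."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()
```

For a local file, `classify_duration(wav_duration_seconds("sample.wav"))` returns one of the three labels. Length is only half the picture: the check says nothing about background noise, which also affects how close the match is.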

Built for your use case

Voice Cloning for Audiobooks & Podcasts

Narrate long-form scripts with consistent pacing and emphasis across hours of runtime — built for chapters, not 15-second clips.

Ready to get started?

Start free — no credit card required.
