Voice Studio
Voice Studio turns written words into spoken audio. It takes a Forge script and produces the voice-over you’d hear in a video, a podcast, a voice note, or any other audio destination.
Voice Studio is one of the four specialised studios inside Production Studio.
What Voice Studio produces
Voice Studio produces audio takes: .mp3 (or equivalent) files that contain a spoken version of a written piece. The voice character, pacing, and quality depend on which voice engine you’ve selected:
- Edge voices — Microsoft Edge’s online voice catalogue. Wide selection of accents and speakers. Fast, reliable, no local resources needed.
- Kokoro voices — locally-hosted neural voices, with their own roster of speakers. More distinctive sound; runs on your own instance.
Other engines may be available depending on your instance configuration. Your AI knows which are active and will suggest a default; you can switch per piece or set a standing preference.
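To picture how a voice might be picked for a piece, here is a minimal sketch. It assumes a precedence that this page does not spell out: a per-piece switch wins over a standing preference, which wins over the suggested default, and any choice whose engine is not active on your instance is skipped. The names `VoiceChoice` and `resolve_voice` are hypothetical and are not part of Voice Studio.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass(frozen=True)
class VoiceChoice:
    engine: str   # e.g. "edge" or "kokoro"
    speaker: str  # e.g. "Jenny"


def resolve_voice(
    per_piece: Optional[VoiceChoice],
    standing: Optional[VoiceChoice],
    suggested: Optional[VoiceChoice],
    active_engines: Set[str],
) -> VoiceChoice:
    """Return the first candidate whose engine is active on this instance.

    Assumed order: per-piece switch, then standing preference,
    then the suggested default.
    """
    for candidate in (per_piece, standing, suggested):
        if candidate is not None and candidate.engine in active_engines:
            return candidate
    raise ValueError("no candidate voice uses an active engine")


# Illustrative values only; "example-speaker" is a placeholder name.
default = VoiceChoice("edge", "Jenny")
override = VoiceChoice("kokoro", "example-speaker")
chosen = resolve_voice(override, None, default, {"edge", "kokoro"})
```

With both engines active, the per-piece switch is used. If Kokoro were not running on the instance, the same call would fall back to the Edge default.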
A typical use
You’ve drafted a script in Forge. The script is approved. From there:
“Read this with Edge → Jenny, slightly slower pacing, and add 200ms of breath between paragraphs.”
Voice Studio produces an audio take. You play it back, push back on anything that’s off (“the second sentence sounds rushed”), and Voice Studio re-takes — same script, refined delivery.
If the script is part of an Assignment, the audio take is attached automatically and downstream studios (typically Video Studio) pick it up the moment you approve it.
Line-level re-takes
Voice Studio’s defining iteration capability is the line-level re-take: you don’t re-record the whole voice take to fix one line. You annotate the specific line that’s off, write a note about what’s wrong (“this lands too rushed”, “more breath before this one”, “rising inflection at the end”), and Voice Studio re-takes only that line.
The new take splices into the existing audio at the right position. Everything you’d already approved stays exactly as recorded — same voice, same pacing, same room tone. Only the marked line is new.
A typical revision loop:
- Listen to the take.
- Drop a line-level annotation on each line that needs adjustment, with a short note on what’s off.
- Click Request revision in the gate footer. Voice Studio bundles the annotations as feedback and re-takes only the marked lines.
- The unmarked lines remain untouched. Resolved annotations get cleared.
This is the same pattern across all four studios in Production Studio. It’s why iterating on a voice take is cheap: fixing three lines means three surgical re-takes, not a full re-record.
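The splice behaviour described in this section can be pictured with a small sketch. It assumes a take is stored as one audio segment per script line, which is an illustration only and not a description of Voice Studio’s internal format. `LineTake` and `splice_retakes` are hypothetical names, and the byte strings stand in for real audio.

```python
from dataclasses import dataclass, replace
from typing import Dict, List


@dataclass(frozen=True)
class LineTake:
    line_no: int       # position of the line in the script
    text: str          # the script text for this line
    audio: bytes       # stand-in for the recorded audio of this line
    revision: int = 1  # bumped each time the line is re-taken


def splice_retakes(take: List[LineTake], retakes: Dict[int, bytes]) -> List[LineTake]:
    """Return a new take in which only the annotated lines carry new audio.

    Lines that are not in `retakes` are passed through unchanged,
    in their original order.
    """
    return [
        replace(seg, audio=retakes[seg.line_no], revision=seg.revision + 1)
        if seg.line_no in retakes
        else seg
        for seg in take
    ]


take = [
    LineTake(1, "Welcome back.", b"line-1-audio"),
    LineTake(2, "Today we ship.", b"line-2-audio"),
    LineTake(3, "Thanks for listening.", b"line-3-audio"),
]
# Only line 2 was annotated, so only line 2 gets new audio.
revised = splice_retakes(take, {2: b"line-2-slower"})
```

Lines 1 and 3 in `revised` are the very same objects as in `take`, which mirrors the promise that approved lines stay exactly as recorded.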
Standalone uses
Voice Studio is also useful for one-off work, outside any Assignment:
- A quick voice memo to share with someone.
- An audio version of a memo you’ve written.
- A pronunciation test for a brand name or a foreign word.
- An ad-hoc voice clip for a presentation or a meeting.
For one-shot work, just say what you want read and which voice — no Assignment needed.
When the script changes
If the upstream script changes after a voice take has been recorded, that take is now stale. Production Studio tracks this: the Assignment shows the audio is out of date, and Voice Studio offers to re-take with the updated script.
This is the main reason Voice Studio lives inside Production Studio rather than as a one-off tool. The coordination — “the script changed, the voice is stale, re-take?” — is what saves you from shipping work that’s drifted.
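One common way to detect this kind of drift is to store a fingerprint of the script alongside each take and compare it with the current script. The sketch below assumes that approach; this page does not say how Production Studio actually tracks staleness, and `VoiceTake`, `fingerprint`, and `is_stale` are hypothetical names.

```python
import hashlib
from dataclasses import dataclass


def fingerprint(script_text: str) -> str:
    """Hash the script text so that any edit changes the fingerprint."""
    return hashlib.sha256(script_text.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class VoiceTake:
    audio_path: str            # where the recorded take lives
    recorded_fingerprint: str  # fingerprint of the script at recording time


def is_stale(take: VoiceTake, current_script: str) -> bool:
    """A take is stale when the script no longer matches what was recorded."""
    return take.recorded_fingerprint != fingerprint(current_script)


original_script = "Welcome back. Today we ship."
take = VoiceTake("take.mp3", fingerprint(original_script))

unchanged = is_stale(take, original_script)
edited = is_stale(take, "Welcome back. Today we finally ship.")
```

Here `unchanged` is `False` and `edited` is `True`: the take only becomes stale once the script text differs from what was recorded.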
Choosing a voice
The choice of voice carries weight. The same script narrated by two different voices lands differently with the audience: one might come across as warm and conversational, another as precise and authoritative. A few practical notes:
- Pick a default and stick with it for a series. Switching voices between episodes of a podcast or videos in a campaign creates a subtle dissonance for the audience. Consistency reads as professionalism.
- Match the voice to the content. A tutorial benefits from clarity; a story benefits from warmth; a launch announcement benefits from energy.
- Listen to a sample paragraph before committing. Each voice has tics that show up over a longer take.
Related
- Production Studio — the coordination layer Voice Studio lives in.
- Forge — where the scripts that feed Voice Studio originate.
- Video Studio — the most common downstream destination for voice takes.
- Assignments — how voice takes stay in sync with their scripts.