3 min read

7 Best Practices: Writing for Text To Speech Voices


Writing text to speech scripts is a skill, one that requires practice and a clear understanding of the medium. What are the benefits and limitations of text to speech? What makes it different from other kinds of informational scripts and documents?

In general, people tend to write “for reading” rather than “for speaking.” When you generate a text to speech voiceover from this type of writing, the audio sounds stilted. Nobody wants that.

There are a lot of tips and tricks for getting your text to speech voiceover to sound more natural and less robotic. The better the script, the better the voiceover, and the longer your audience will stick around.

Here are Videate’s best practices for writing informational scripts for text to speech videos that people will actually watch:

1. Keep your script SHORT!

(Notice: all-caps, bolded, italicized, underlined, with an exclamation point at the end. This one is kind of important…)

Brevity is the soul of wit. It’s also the lifeblood of your script. As we’ve said before, modern customers want to get in and get out without wasting time. The average user simply doesn’t have the attention span for a 10-minute video. So, make your videos short. And if what you need to say takes longer than a few minutes, consider chopping it up into multi-part, bite-sized lessons. Be more TikTok, less Ken Burns.


2. Use short sentences

Your short script should have short sentences. We’re dealing with text to speech technology; the longer the sentences, the more likely the voiceover will sound unnatural. If we were writing a text to speech script as opposed to writing a blog, we'd trim the last sentence. And that one. Run-on sentences in text to speech voiceover make it glaringly obvious that the "robot voice" never takes a breath. It sounds weird and pulls you out of the experience. "Voice bots" don’t understand natural pauses. Yet. So, keep your sentences simple.
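
If you'd rather not count words by hand, a few lines of Python can flag the sentences most likely to sound breathless. This is only a rough sketch: the 20-word limit and the function names are our own assumptions, not a rule from any text to speech engine, and the sentence splitter is deliberately naive (it will trip on abbreviations like "e.g." and on sentences that end in a closing quote).

```python
import re

# Hypothetical threshold; there is no official number, so tune it by ear.
MAX_WORDS = 20

def split_sentences(script):
    """Split after ., ! or ? when followed by whitespace (naive on purpose)."""
    parts = re.split(r"(?<=[.!?])\s+", script.strip())
    return [p for p in parts if p]

def long_sentences(script, max_words=MAX_WORDS):
    """Return (word_count, sentence) for every sentence over the limit."""
    flagged = []
    for sentence in split_sentences(script):
        count = len(sentence.split())
        if count > max_words:
            flagged.append((count, sentence))
    return flagged
```

Run `long_sentences()` on your draft and trim whatever comes back. Then listen to the voiceover anyway, because your ear is the real test.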


3. You’re showing AND telling

Text to speech videos are both a visual and an audio medium. Sometimes, you let the picture tell the story, and not everything needs to be scripted. For instance, if putting in a username and password is a step in the video, you don’t have to say “put in the username ‘User’ and the password ‘12345.’” Just say “put in a username and password” and show the username and password being typed in. The viewer will see the action taking place on screen and won't be overwhelmed with redundant information in the voiceover.

(Bonus tip: Don’t make your actual password ‘12345’…or ‘password’.)


4. Remove the emotion

You want the text to speech script to sound warm and natural, so it may seem counterintuitive to remove the emotion. However, remember that while text to speech technology sounds more and more human, there is limited (or in some cases, no) control over a voice bot’s intonation. So, it sounds super weird when the TTS voice says “Good morning!” or “Hi, there!” It’s best to remove emotional or colloquial words and phrases, because the result won’t necessarily sound the way you envision it.


5. KISS (Keep It Super Simple)

Your text to speech informational video is made to help people, not to show off, so keep the language simple. Avoid acronyms and industry jargon. Try to convey the information in the simplest, most direct way possible. 


6. Be consistent

Building on #5, find the simplest term for something and then use it consistently. For example, “click” (or “tap” on a mobile application) should be used instead of “select” or “choose” or “pick.” Technical documentation can be intimidating to users, and they may not immediately equate “select” or “choose” or “pick” with a click or a tap. Once you've determined the best vocab word to use, make sure it's used throughout. Don't worry about it feeling redundant. Clarity and consistency are best here.
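
A consistency check like this is easy to automate. Here's a minimal sketch in Python, assuming a small glossary that maps your preferred term to the synonyms you want to avoid. The glossary contents and the function name are hypothetical, and the check only matches exact word forms (it won't catch "selecting", for instance).

```python
import re
from collections import Counter

# Hypothetical glossary: preferred term -> synonyms to avoid.
GLOSSARY = {"click": ["select", "choose", "pick"]}

def inconsistent_terms(script, glossary=GLOSSARY):
    """Count discouraged synonyms, keyed by (synonym, preferred term)."""
    words = Counter(re.findall(r"[a-z']+", script.lower()))
    hits = {}
    for preferred, synonyms in glossary.items():
        for synonym in synonyms:
            if words[synonym]:
                hits[(synonym, preferred)] = words[synonym]
    return hits
```

Anything the function returns is a word to swap for your preferred term before you generate the voiceover.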


7. Pick the right TTS voice

Not all voices are created equal. There are many options to choose from when it comes to text to speech voices. Picking the right text to speech voice is critical to your brand. So, take the time to see what voice works best for your video. Listen to samples from sites like Videate’s voice sampler, and review your final video. If you've got wiggle room in your brand to swap out voices, don't be afraid to do so.


8. Request a demo

Now that you know how to put together an awesome script, why not make your video with Videate? Request a demo and we’ll show you how to plug your script into the platform and generate a software video, complete with voiceover, in minutes.

