
7 Best Practices: Writing for Text To Speech Voices



In general, people tend to write “for reading” rather than “for speaking.” When you generate a text to speech voiceover from this type of writing, the audio sounds stilted. Nobody wants that.

Writing text to speech scripts is a skill.

It takes practice, but also a clear understanding of the medium. What are the benefits and limitations of text to speech (TTS)? What makes a TTS script different from other kinds of informational scripts and documents?

Several text to speech tricks can help your voiceover sound more natural and less like that TikTok "text to speech robot voice." The better the script, the better the voiceover, and the longer your audience will stick around.

Here are Videate’s best practices for writing awesome scripts for text to speech videos that people will actually watch:


1. Keep your script SHORT!

(Notice: all caps, bolded, italicized, underlined, with an exclamation point at the end. This one is kind of important…)

Brevity is the soul of wit. It’s also the lifeblood of your script. As we’ve said before, modern customers want to get in and get out without wasting time. The average user simply doesn’t have the attention span for a 10-minute video. So, make your videos short. And if what you need to say takes longer than a few minutes, consider chopping it up into multi-part, bite-sized lessons. Be more TikTok, less Ken Burns.
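If you want a rough sanity check on length before you generate anything, a few lines of Python can estimate how long a script will run. This is only a sketch: the 150 words per minute rate and the 3 minute limit are assumptions for illustration, not Videate settings, and real TTS voices vary in pace.

```python
# Assumed values for illustration only; tune them to your own voice and audience.
ASSUMED_WORDS_PER_MINUTE = 150
ASSUMED_MAX_MINUTES = 3.0


def estimate_minutes(script: str, words_per_minute: int = ASSUMED_WORDS_PER_MINUTE) -> float:
    """Estimate the voiceover length in minutes from a simple word count."""
    word_count = len(script.split())
    return word_count / words_per_minute


def is_too_long(script: str, max_minutes: float = ASSUMED_MAX_MINUTES) -> bool:
    """Return True when the estimated length exceeds the chosen limit."""
    return estimate_minutes(script) > max_minutes
```

If is_too_long returns True, that is a hint to split the script into multi-part lessons.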


2. Use short sentences

This TTS trick sounds like the previous one, but I promise it's not. It's...similar.

Your short script should have short sentences.

We’re dealing with text to speech technology; the longer the sentences, the more likely the voiceover will sound unnatural. If we were writing a text to speech script as opposed to a blog, we'd trim the last sentence. And that one. Run-on sentences in a text to speech voiceover make it glaringly obvious that the "robot voice" never takes a breath. It sounds weird and pulls you out of the experience. "Voice bots" don’t understand natural pauses. Yet. So, keep your sentences simple.
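One way to catch run-ons before the voice bot does is to flag any sentence over a word limit. The sketch below is a minimal example, not a Videate feature: the 15 word threshold is an assumption, and the sentence splitter is deliberately naive (it will trip on abbreviations like "e.g.").

```python
import re

# Assumed threshold for illustration; pick whatever limit sounds right in your voice.
ASSUMED_MAX_WORDS = 15


def split_sentences(script: str) -> list[str]:
    """Naively split a script after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", script.strip())
    return [part for part in parts if part]


def long_sentences(script: str, max_words: int = ASSUMED_MAX_WORDS) -> list[tuple[int, str]]:
    """Return (word_count, sentence) pairs for sentences over the limit."""
    flagged = []
    for sentence in split_sentences(script):
        word_count = len(sentence.split())
        if word_count > max_words:
            flagged.append((word_count, sentence))
    return flagged
```

Anything this flags is a candidate for splitting into two or three shorter sentences.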




3. You’re showing AND telling

Text to speech videos are both a visual and an audio medium. Sometimes you can let the picture tell the story; not everything needs to be scripted. For instance, if putting in a username and password is a step in the video, you don’t have to say “put in the username ‘User’ and the password ‘12345.’” Just say “put in a username and password” and show the username and password being typed in. The viewer will see the action taking place on screen and won't be overwhelmed with redundant information in the voiceover.

(Bonus tip: Don’t make your actual password ‘12345’…or ‘password’.)


4. Remove the emotion

You want the text to speech script to sound warm and natural, so it may seem counterintuitive to remove the emotion. However, remember that while text to speech technology sounds more and more human, there is limited (or in some cases, no) control over a voice bot’s intonation. So, it sounds super weird when the TTS voice says “Good morning!” or “Hi, there!” It’s best to remove emotional or colloquial words and phrases, because the result won’t necessarily sound the way you envision it.


5. KISS (Keep It Super Simple)

Your text to speech informational video is made to help people, not to show off, so keep the language simple. Avoid acronyms and industry jargon. Try to convey the information in the simplest, most direct way possible. 


6. Be consistent

Building on #5, find the simplest term for something and then use it consistently. For example, “click” (or “tap” on a mobile application) should be used instead of “select” or “choose” or “pick.” Technical documentation can be intimidating to users, and they may not immediately equate “select” or “choose” or “pick” with a click or a tap. Once you've determined the best vocab word to use, make sure it's used throughout. Don't worry about it feeling redundant. Clarity and consistency matter most here.
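A quick script can help you spot terms that drift. The sketch below counts whole-word uses of the synonyms you decided to avoid. The term list is a hypothetical style choice based on the “click” example above, so swap in your own vocabulary.

```python
import re

# Hypothetical style sheet: preferred term -> synonyms to flag.
ASSUMED_PREFERRED_TERMS = {"click": ["select", "choose", "pick"]}


def find_inconsistent_terms(
    script: str,
    preferred: dict[str, list[str]] = ASSUMED_PREFERRED_TERMS,
) -> list[tuple[str, str, int]]:
    """Return (synonym, preferred_term, count) for each flagged synonym found."""
    findings = []
    for term, synonyms in preferred.items():
        for synonym in synonyms:
            # Whole-word, case-insensitive match, so "pick" does not match "picked".
            pattern = re.compile(rf"\b{re.escape(synonym)}\b", re.IGNORECASE)
            count = len(pattern.findall(script))
            if count:
                findings.append((synonym, term, count))
    return findings
```

Each finding tells you which word showed up, which word to use instead, and how many times to fix it.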


7. Pick the right TTS voice

Not all voices are created equal. There are many options to choose from when it comes to text to speech voices. Picking the right text to speech voice is critical to your brand. So, take the time to see what voice works best for your video. Listen to samples from sites like Videate’s voice sampler, and review your final video. If you've got wiggle room in your brand to swap out voices, don't be afraid to do so.


8. Request a demo

Now that you know how to put together an awesome script, why not make your video with Videate? Request a demo and we’ll show you how to plug your script into the platform and generate a software video, complete with voiceover, in minutes.