Writing text to speech scripts is a skill, one that requires practice and a clear understanding of the medium. What are the benefits and limitations of text to speech? What makes it different from other kinds of informational scripts and documents?
In general, people tend to write “for reading” rather than “for speaking.” When you generate a text to speech voiceover from this type of writing, the audio sounds stilted. Nobody wants that.
There are several text to speech tricks to ensure your voiceover sounds more natural and less like that TikTok "text to speech robot voice". The better the script, the better the voiceover, and the longer your audience will stick around.
Here are Videate’s best practices for writing informational scripts for text to speech videos that people will actually watch:
1. Keep your script SHORT!
(Notice: all-caps, bolded, italicized, underlined, with an exclamation at the end. This one is kind of important…)
Brevity is the soul of wit. It’s also the lifeblood of your script. As we’ve said before, modern customers want to get in and get out without wasting time. The average user simply doesn’t have the attention span for a 10-minute video. So, make your videos short. And if what you need to say takes longer than a few minutes, consider chopping it up into multi-part, bite-sized lessons. Be more TikTok, less Ken Burns.
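If you want a rough check before you generate anything, a few lines of Python can estimate how long a script will run and group it into parts. This is only a sketch: the 150 words per minute rate and the 3 minute limit are assumptions for illustration, not measured values for any particular text to speech voice, and the function names are made up for this example.

```python
# Assumption: 150 words per minute is a rough speaking rate, not a measured
# value for any specific text to speech voice. Adjust it for your own voice.
ASSUMED_WORDS_PER_MINUTE = 150


def estimate_minutes(script, words_per_minute=ASSUMED_WORDS_PER_MINUTE):
    """Return a rough voiceover length in minutes, based on word count."""
    return len(script.split()) / words_per_minute


def split_into_parts(paragraphs, max_minutes=3.0,
                     words_per_minute=ASSUMED_WORDS_PER_MINUTE):
    """Group paragraphs into parts that each stay within max_minutes.

    A single paragraph longer than the limit still becomes its own part.
    """
    budget = max_minutes * words_per_minute  # word budget per part
    parts, current, current_words = [], [], 0
    for paragraph in paragraphs:
        words = len(paragraph.split())
        # Start a new part when adding this paragraph would exceed the budget.
        if current and current_words + words > budget:
            parts.append(current)
            current, current_words = [], 0
        current.append(paragraph)
        current_words += words
    if current:
        parts.append(current)
    return parts
```

Treat the estimate as a ballpark. The real length depends on the voice, the punctuation, and how much of the video is on-screen action without narration.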
2. Use short sentences
Your short script should have short sentences. We’re dealing with text to speech technology; the longer the sentences, the more likely the voiceover will sound unnatural. If we were writing a text to speech script as opposed to writing a blog, we'd trim the last sentence. And that one. Run-on sentences in text to speech voiceover make it glaringly obvious that the "robot voice" never takes a breath. It sounds weird and pulls you out of the experience. "Voice bots" don’t understand natural pauses. Yet. So, keep your sentences simple.
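One way to catch long sentences before the voice bot does is to flag them automatically. The sketch below uses a deliberately naive sentence splitter and an arbitrary limit of 20 words. Both are assumptions for illustration, so tune the limit to whatever sounds natural with your voice.

```python
import re

# Assumption: 20 words is an arbitrary illustrative limit, not a rule.
MAX_WORDS = 20


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace.

    It will mis-split abbreviations such as "e.g. this", which is
    acceptable for a quick script check.
    """
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def long_sentences(text, max_words=MAX_WORDS):
    """Return the sentences that contain more than max_words words."""
    return [s for s in split_sentences(text) if len(s.split()) > max_words]


if __name__ == "__main__":
    script = "Click Save. Enter a username and password."
    for sentence in long_sentences(script):
        print("Consider shortening:", sentence)
```

Anything the check flags is a candidate for splitting into two or three shorter sentences.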
3. You’re showing AND telling
Text to speech videos are both a visual and an audio medium. Sometimes, you let the picture tell the story, and not everything needs to be scripted. For instance, if putting in a username and password is a step in the video, you don’t have to say “put in the username ‘User’ and the password ‘12345.’” Just say “put in a username and password” and show the username and password being typed in. The viewer will see the action taking place on screen and won't be overwhelmed with redundant information in the voiceover.
(Bonus tip: Don’t make your actual password ‘12345’…or ‘password’.)
4. Remove the emotion
You want the text to speech script to sound warm and natural, so it may seem counterintuitive to remove the emotion. However, remember that while text to speech technology sounds more and more human, there is limited (or in some cases, no) control over a voice bot’s intonation. So, it sounds super weird when the TTS voice says “Good morning!” or “Hi, there!” It’s best to remove emotional or colloquial words and phrases, because the voiceover won’t necessarily sound the way you envision it.
5. KISS (Keep It Super Simple)
Your text to speech informational video is made to help people, not to show off, so keep the language simple. Avoid acronyms and industry jargon. Try to convey the information in the simplest, most direct way possible.
6. Be consistent
Building on #5, find the simplest term for something and then use it consistently. For example, “click” (or “tap” on a mobile application) should be used instead of “select” or “choose” or “pick.” Technical documentation can be intimidating to users, and they may not immediately equate “select” or “choose” or “pick” with a click or a tap. Once you've determined the best vocab word to use, make sure it's used throughout. Don't worry about it feeling redundant; clarity and consistency are best here.
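Consistency is easy to check by machine. Here is a minimal sketch that counts the discouraged variants from the example above so you can swap them for your preferred term. The word list and the function name are illustrative assumptions, so replace them with your own vocabulary.

```python
import re
from collections import Counter

# Assumption: these variants come from the "click" example above.
# Swap in whatever terms you have decided to avoid.
VARIANTS = ["select", "choose", "pick"]


def find_variants(script, variants=VARIANTS):
    """Count whole-word, case-insensitive uses of each discouraged variant.

    Only exact words are matched, so "selects" or "choosing" are not
    counted. Add those forms to the list if you need them.
    """
    counts = Counter()
    for word in re.findall(r"[a-z']+", script.lower()):
        if word in variants:
            counts[word] += 1
    return dict(counts)


if __name__ == "__main__":
    script = "Select the file, then click Save. Choose a name."
    print(find_variants(script))
```

An empty result means the script sticks to your chosen term.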
Now that you know how to put together an awesome script, why not make your video with Videate? Request a demo and we’ll show you how to plug it into the platform and generate a software video, complete with voiceover, in minutes.