New Learnings about Scaling Video Translation

When we got started more than two years ago, we made a couple of strategic bets. One was that text-to-speech voices would come of age, and they have. Another was that translating software videos could be scalable and sustainable. We are now seeing a sharp increase in the number of customers who want to translate their videos into multiple languages. Here are some of our recent learnings about video translation.

The first scenario was making native-language videos of a native-language user interface. Videate has now been used to produce such videos in Spanish, French (with different versions for France and Canada), and Italian. The videos were reviewed by native speakers on the go-to-market teams in each country, and in each case, at least one of the latest neural (machine learning) voices from our text-to-speech (TTS) providers was approved!

One learning was that some English words that are left untranslated may still need language-specific phonemes. In one example, the phrase “E-Learning” was used, but some TTS voices did not pronounce it correctly. We now know we need to support language-specific as well as global phoneme replacements to accommodate different TTS capabilities. Using Videate and adjusting a few words is much faster and far less expensive than making videos manually for each language.
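To make the idea of global and language-specific phoneme replacements concrete, here is a minimal sketch in Python. It is not Videate's implementation: the rule tables, the function name `apply_phonemes`, and the IPA strings are all hypothetical and only illustrative. It assumes the TTS provider accepts the standard SSML `<phoneme>` element, which not every voice supports.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# Hypothetical rule tables; the IPA strings are illustrative, not vetted.
GLOBAL_PHONEMES = {"E-Learning": "ˈiːˌlɜːnɪŋ"}
LANGUAGE_PHONEMES = {
    "fr-FR": {"E-Learning": "iləʁniŋ"},  # overrides the global rule for French
}


def rules_for(lang):
    """Merge the rules so that language-specific entries win over global ones."""
    rules = dict(GLOBAL_PHONEMES)
    rules.update(LANGUAGE_PHONEMES.get(lang, {}))
    return rules


def apply_phonemes(text, lang):
    """Wrap known terms in SSML <phoneme> tags and XML-escape everything else."""
    rules = rules_for(lang)
    if not rules:
        return escape(text)
    # Try longer terms first so a short term never shadows a longer one.
    terms = sorted(rules, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(term) for term in terms))
    parts, pos = [], 0
    for match in pattern.finditer(text):
        parts.append(escape(text[pos:match.start()]))
        term = match.group(0)
        parts.append(
            f"<phoneme alphabet=\"ipa\" ph={quoteattr(rules[term])}>"
            f"{escape(term)}</phoneme>"
        )
        pos = match.end()
    parts.append(escape(text[pos:]))
    return "".join(parts)


if __name__ == "__main__":
    print(apply_phonemes("Open the E-Learning tab", "en-US"))
    print(apply_phonemes("Ouvrez l'onglet E-Learning", "fr-FR"))
```

Keeping the global table separate from the per-language overrides means a term only needs a language-specific entry when a particular voice mispronounces it.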

The next scenario was creating native-language audio for English user interfaces. Some customers only offer English UIs but still want to produce videos with audio in other languages. Because the translated narration differs in length from the English, Videate generates a new video in which the voice stays naturally synchronized with the navigation. What we learned here was that we had to enhance our document ingestion process to handle special characters in Japanese and right-to-left strings in Arabic. And as expected, the German videos are longer than the English ones!
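One way to picture why regenerating the video keeps voice and navigation in sync is sketched below. This is an assumption-laden illustration, not a description of how Videate works internally: the `Step` type, the `build_timeline` function, and all durations are made up. Each step simply lasts as long as the longer of its on-screen action and its narration in the chosen language.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Step:
    action: str                          # what happens on screen
    action_seconds: float                # minimum time the UI action needs
    narration_seconds: Dict[str, float]  # narration length per language


def build_timeline(steps: List[Step], lang: str) -> List[Tuple[str, float, float]]:
    """Return (action, start, end) tuples so each step waits for its narration."""
    timeline = []
    start = 0.0
    for step in steps:
        # The step cannot end before both the action and the narration finish.
        duration = max(step.action_seconds, step.narration_seconds[lang])
        timeline.append((step.action, start, start + duration))
        start += duration
    return timeline


# Made-up durations for illustration only.
STEPS = [
    Step("open menu", 1.0, {"en": 2.0, "de": 2.5}),
    Step("click save", 1.5, {"en": 1.0, "de": 3.0}),
]

if __name__ == "__main__":
    for lang in ("en", "de"):
        total = build_timeline(STEPS, lang)[-1][2]
        print(lang, total)
```

With these sample numbers the German timeline comes out longer than the English one, which matches the observation above.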

Another scenario was capturing screenshots for language-specific support portals. The customer had translated the words in their documentation into other languages but did not have the resources to update the screenshots in every language. This forced them to use the English screenshots regardless of language. With Videate, we are able to automate the capture of screenshots as we are making videos. This means that their knowledge base articles can have screenshots that are always up to date in the local language.

Another learning from all of this is that the starting point of your source content will likely determine the translation process. Here is the spectrum of source materials we have seen:

(1) technical documentation in structured formats like DITA/XML

(2) help articles written in Google Docs, Word, or HTML formats

(3) video scripts that are written in a storyboard format

(4) transcriptions of existing videos

Videate can make native-language videos and capture screenshots in a sustainable and scalable way regardless of your source content format. Whether you use a sophisticated translation management tool or service or simply Google Translate, the audio quality will be very good because we operate at the sentence or paragraph level. A few adjustments may be needed, since “do not translate” doesn’t mean “do not change the pronunciation,” but overall the process is quite efficient.

People around the world are now used to computer-generated voices. You hear TTS voices in Siri, Alexa, navigation applications, and even radio ads. There is now a viable new way to keep videos up to date in other languages, at a fraction of the cost of hiring a third party or making the videos manually yourself. Schedule a demo with Videate to learn more!