What is text to speech?

What is Text to Speech (TTS)?

Text to Speech (TTS): converts written text into an audible format that you can listen to.

Who uses Text to Speech technology?
TTS is used in various fields for different applications. Below are the 5 most common use cases.

1) helps the visually impaired listen to written content
2) helps children with reading
3) helps older people understand written text
4) helps people listen to online blogs and other content that is usually read
5) helps people with reading disabilities

A Brief History of Text to Speech technology:
In 1968, Noriko Umeda of Japan developed the first general English text-to-speech system. Since then, many individuals and companies have worked on this technology. However, it wasn't until the early 2000s, when companies such as Google could gather large amounts of speech data, that TTS really took off. It was the first time data was available from millions of people saying the same words.

How does Text to Speech technology work?
The process works like this: the text content is first scanned. Next, it goes through a process called "text normalization", which converts symbols such as numbers and abbreviations into written-out words. Phonetic transcriptions are then assigned to each word, and the text is divided and marked into prosodic units (phrases, clauses, sentences).
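
The normalization and prosodic-unit steps described above can be sketched roughly in Python. This is a minimal illustration, not how any particular TTS product works: normalization is taken here to mean expanding abbreviations and small numbers into words, and the abbreviation table and the function names (normalize, split_prosodic_units) are hypothetical. Phonetic transcription is left out.

```python
import re
from typing import List

# Hypothetical abbreviation table; real systems use far larger lexicons.
ABBREVIATIONS = {"Dr.": "Doctor", "St.": "Street", "vs.": "versus"}

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def number_to_words(n: int) -> str:
    """Spell out an integer from 0 to 99."""
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] + ("-" + ONES[ones] if ones else "")


def normalize(text: str) -> str:
    """Expand known abbreviations and standalone one- or two-digit numbers."""
    for abbr, full in ABBREVIATIONS.items():
        text = text.replace(abbr, full)
    # Longer numbers, decimals, dates, etc. are deliberately not handled here.
    return re.sub(r"\b\d{1,2}\b",
                  lambda m: number_to_words(int(m.group())), text)


def split_prosodic_units(text: str) -> List[str]:
    """Split text into rough phrase-level units at punctuation marks."""
    parts = re.split(r"(?<=[.!?,;:])\s+", text.strip())
    return [p for p in parts if p]


if __name__ == "__main__":
    spoken = normalize("Dr. Lee has 42 cats.")
    print(spoken)
    print(split_prosodic_units("Hello there, how are you? I am fine."))
```

Splitting on punctuation is only a crude stand-in for prosodic analysis, but it shows the idea of marking where a voice would naturally pause.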

Next, the output from text normalization is fed into a conversion technology that creates the voice, the audible portion we hear. To further enhance the audio and make the reading sound human, additional modulations are applied. These create the effect of a human transitioning from one word to the next, of the way they end a sentence that is a question, or of breathing during a long sentence.
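
One common way to request such modulations from a speech engine is SSML (Speech Synthesis Markup Language), a W3C standard. The sketch below is only an illustration under stated assumptions: the function name to_ssml, the 300 ms pause, and the +10% pitch raise for questions are arbitrary choices, and individual engines differ in which SSML attributes they honor.

```python
import re
from xml.sax.saxutils import escape


def to_ssml(text: str, pause_ms: int = 300) -> str:
    """Wrap each sentence in SSML, pausing after it and raising pitch on questions."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    parts = []
    for sentence in sentences:
        body = escape(sentence)  # escape &, < and > so the markup stays valid
        if sentence.endswith("?"):
            # Assumed value: a slight pitch raise to suggest a question.
            body = f'<prosody pitch="+10%">{body}</prosody>'
        parts.append(f'<s>{body}</s><break time="{pause_ms}ms"/>')
    return "<speak>" + "".join(parts) + "</speak>"


if __name__ == "__main__":
    print(to_ssml("Hi. Ready?"))
```

The resulting string would be handed to a TTS engine that accepts SSML input; the engine, not this code, produces the actual audio.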

As you can see, it is the final part, with the additional modulations, that truly improves the quality of the audio output. Companies like Google, Amazon, Microsoft and Voicery are leading in this space. The first two, for instance, have access to trillions of human voice interactions from their Alexa and Google Home devices and from phones. This data helps them understand how a sentence is spoken. They can even distinguish between dialects, such as those of a speaker in Malaysia versus one in Singapore.

Which is the best Text to Speech technology?
The quality of voice synthesis (aka text to speech) is judged by how closely it can mimic the human voice, including its various levels of sound and accent nuances.

What are other names for Text to Speech?
You'll find it commonly referred to as TTS, Speech Synthesis, or Text to Voice.

Examples of Text to Speech solutions:
AudiBrow - listen to written web content from blogs and content sites
Audible - listen to written books
