4) It helps people listen to online blogs and other content that is usually read.
In 1968, Noriko Umeda in Japan developed the first general English text-to-speech system. Since then, many individuals and companies have worked on this technology. However, it wasn't until the early 2000s, when large amounts of speech data could be gathered by companies such as Google, that TTS really took off. For the first time, there were recordings of millions of people saying the same words.
How does Text to Speech technology work?
The process works like this: the text content is first scanned. Next, that text is organized using a process called "text normalization". This process assigns a phonetic transcription to each word, then divides and marks the text into prosodic units (phrases, clauses, and sentences).
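As a rough sketch of the steps above, the snippet below walks a short text through normalization, phonetic transcription, and splitting into prosodic units. The expansion table and phoneme dictionary here are made-up illustrative entries, not a real lexicon; production systems use large pronunciation dictionaries and learned models.

```python
import re

# Toy expansion table for text normalization (hypothetical entries).
EXPANSIONS = {"Dr.": "doctor", "10": "ten", "&": "and"}

# Tiny illustrative phoneme dictionary (ARPAbet-style, not a real lexicon).
PHONEMES = {
    "doctor": ["D", "AA", "K", "T", "ER"],
    "smith": ["S", "M", "IH", "TH"],
}

def normalize(text):
    """Expand abbreviations and numbers into plain lowercase words."""
    return [EXPANSIONS.get(w, w).lower() for w in text.split()]

def to_phonemes(words):
    """Look up a phonetic transcription for each normalized word."""
    return [PHONEMES.get(w, ["<unk>"]) for w in words]

def prosodic_units(text):
    """Split text into prosodic units at phrase-final punctuation."""
    return [u.strip() for u in re.split(r"[.,;?!]", text) if u.strip()]

words = normalize("Dr. Smith")   # ['doctor', 'smith']
print(to_phonemes(words))
```

Running this on "Dr. Smith" expands the abbreviation, lowercases the words, and returns one phoneme list per word, with an unknown-word placeholder for anything missing from the dictionary.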
Next, the output from text normalization is fed into a conversion technology that creates the voice, the audible portion we hear. To further enhance the audio and make it sound human, additional modulations are applied that recreate the effects of a person transitioning from one word to the next, raising their voice at the end of a question, or taking a breath during a long sentence.
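The modulation step can be sketched as simple prosody cues attached to each sentence. This is a minimal illustration under assumed rules (a hypothetical 12-word threshold for a breath pause); real systems predict pitch and timing with statistical or neural models rather than hand-written rules.

```python
def prosody_marks(sentence):
    """Attach simple prosody cues: rising pitch for questions,
    falling pitch for statements, and a breath pause in long sentences."""
    marks = []
    if sentence.rstrip().endswith("?"):
        marks.append("pitch:rise")    # question intonation
    else:
        marks.append("pitch:fall")    # statement intonation
    if len(sentence.split()) > 12:    # assumed threshold for illustration
        marks.append("pause:breath")  # breathing effect mid-sentence
    return marks

print(prosody_marks("Is this a question?"))   # rising intonation cue
```

A synthesizer would then use cues like these to shape the pitch contour and insert pauses in the generated audio.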
As you can see, it's the final part, the additional modulations, that truly improves the quality of the audio output. Companies like Google, Amazon, Microsoft, and Voicery are leading in this space. The first two, for instance, have access to trillions of human voice interactions from their Google Home, Alexa, and phone devices. This data helps them understand how a sentence is spoken. They can even distinguish the dialects of a speaker in Malaysia versus one in Singapore.
Which is the best Text to Speech technology?
The quality of voice synthesis (aka text to speech) is judged by how closely it can mimic a human voice, with its various levels of sound and accent nuances.
What are other names for Text to Speech?
You'll find it commonly referred to as TTS, Speech Synthesis, or Text to Voice.