In the battle for virtual assistant excellence, Google seems to be taking a considerable lead over Apple’s Siri and Microsoft’s Cortana. Google Assistant is the next-generation replacement for Google Now, and it comes packed with powerful artificial intelligence features that make it smarter than its competitors. The latest upgrade is WaveNet, a speech synthesis engine that produces the most natural-sounding voice available to the public.
WaveNet is based on a machine learning model that has been steadily improving its text-to-speech output for years. Its voice is the most natural sounding of all virtual assistants, even better than the Google Now female voice. It was developed with artificial intelligence techniques created by DeepMind, a Google division that in recent years built a program capable of beating top human players at the ancient game of Go.
For WaveNet, developers at DeepMind trained a neural network on millions of human voice samples to build a parametric speech engine. With the processing power available these days, WaveNet can process up to 16,000 audio samples per second, which lets it generate a digital voice that has inflection, cadence and accent.
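To put the 16,000 samples-per-second figure in perspective, here is a minimal Python sketch of the arithmetic. The function names and the 2.5-second clip length are made up for illustration and are not part of WaveNet or any Google library; only the sample rate comes from the article.

```python
# Sample rate quoted in the article; everything else here is illustrative.
SAMPLE_RATE_HZ = 16_000


def samples_needed(duration_seconds: float, sample_rate_hz: int = SAMPLE_RATE_HZ) -> int:
    """Return how many audio samples a clip of the given length contains."""
    return round(duration_seconds * sample_rate_hz)


def microseconds_per_sample(sample_rate_hz: int = SAMPLE_RATE_HZ) -> float:
    """Return the time budget for a single sample, in microseconds."""
    return 1_000_000 / sample_rate_hz


if __name__ == "__main__":
    print(samples_needed(1.0))        # 16000
    print(samples_needed(2.5))        # 40000 (hypothetical 2.5-second phrase)
    print(microseconds_per_sample())  # 62.5
```

In other words, even a short spoken phrase amounts to tens of thousands of individual values, which helps explain why this kind of speech synthesis depends on so much computing power.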
The neural network behind WaveNet analyzed speech properties such as intonation, accent and the use of the tongue and lips when humans speak. When WaveNet recordings were tested with human listeners, the speech engine received a mean opinion score of 4.347 on a five-point scale, which is very impressive given that the listeners rated human recordings at 4.667.
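A mean opinion score is simply the average of listener ratings on a scale of 1 to 5. The short Python sketch below shows that calculation and compares the two scores quoted above. The individual ratings in the example are invented for illustration, and `mean_opinion_score` is a hypothetical helper, not the procedure Google actually used.

```python
from statistics import mean


def mean_opinion_score(ratings):
    """Average a list of listener ratings, each on the 1-to-5 scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must be between 1 and 5")
    return mean(ratings)


# Scores reported in the article.
WAVENET_MOS = 4.347
HUMAN_MOS = 4.667

if __name__ == "__main__":
    # Hypothetical ratings from five listeners, for illustration only.
    print(mean_opinion_score([5, 4, 4, 5, 4]))  # 4.4

    gap = HUMAN_MOS - WAVENET_MOS
    ratio = WAVENET_MOS / HUMAN_MOS
    print(f"gap: {gap:.3f}")               # gap: 0.320
    print(f"share of human score: {ratio:.1%}")  # share of human score: 93.1%
```

By this measure, WaveNet lands within about a third of a point of real human speech.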
It is important to note that WaveNet is a purely synthetic, parametric text-to-speech engine, which means that its voice is generated by a model rather than pieced together from recordings; it is a robot voice that just happens to sound very human. The most common text-to-speech approach is concatenative, which involves recording hours upon hours of speech from a professional voice actor. That approach is very expensive, and Google wants to avoid it because it intends to release WaveNet voices in all languages.
For the time being, WaveNet is available in two languages: American English and Japanese. The company plans to release more languages and variants as soon as they are ready. WaveNet is expected to be a major component of Google Home smart speakers, which are part of Google’s push into smart home automation. Google is also working on a voice and speech operating system that will use WaveNet’s libraries and models.