October 8, 2017 weblog

Google leverages WaveNet model's gains, sounds seem more natural

by Nancy Owano, Tech Xplore

(Tech Xplore)—DeepMind's artificial intelligence talents have been working up capabilities for a consumer product. Sam Shead, Senior Technology Reporter for Business Insider UK, said Google applied software developed by DeepMind for use in its virtual assistant.

DeepMind, the AI company, has a version of a WaveNet system for American English and Japanese, according to a blog post published on Wednesday. They said, "we are proud to announce that an updated version of WaveNet is being used to generate the Google Assistant voices for US English and Japanese across all platforms."

"Google has been slow to integrate DeepMind's technology into its products, with just one data centre efficiency project announced so far, albeit on a global scale," said Shead. "Now the company's WaveNet neural network is being used to generate the Google Assistant voices for US English and Japanese."

Google Assistant is a virtual personal assistant developed by Google.

Pocket-lint described Google Assistant as a voice-controlled smart assistant. "It's considered an upgrade or an extension of Google Now - designed to be personal - while expanding on Google's existing 'OK Google' voice controls."

The DeepMind blog post was from Aäron van den Oord, research scientist, Tom Walters, research scientist, and Trevor Strohman, Google Speech software engineer.

The update they describe is the work of the DeepMind WaveNet research and engineering teams, together with the Google Text-to-Speech team.

WaveNet

WaveNet has come a long way in a short time.

Just over a year ago, DeepMind presented WaveNet, a deep neural network that generates raw audio waveforms and is capable of producing speech.

How they built it: A convolutional neural network was trained on a large dataset of speech samples. The goal was more natural-sounding speech than in existing techniques. In their original paper, they said it "creates individual waveforms from scratch, one sample at a time, with 16,000 samples per second and seamless transitions between individual sounds."
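To make "one sample at a time" concrete, here is a minimal Python sketch of sample-by-sample generation at the 16,000 samples-per-second rate quoted above. It is not WaveNet: the `toy_next_sample` function is a hypothetical stand-in for a trained network and simply continues a 440 Hz sine tone. It only illustrates that each new sample is computed from the samples already produced.

```python
import math

SAMPLE_RATE = 16_000  # samples per second, the figure quoted in the article
TONE_HZ = 440         # assumption for illustration only; not from the article
OMEGA = 2 * math.pi * TONE_HZ / SAMPLE_RATE


def toy_next_sample(history):
    """Hypothetical stand-in for a trained network.

    Predicts the next sample from the two previous ones with the fixed
    recurrence sin((n+1)w) = 2*cos(w)*sin(n*w) - sin((n-1)*w).
    A real model would learn this mapping from speech data.
    """
    return 2 * math.cos(OMEGA) * history[-1] - history[-2]


def generate(seconds, next_sample=toy_next_sample):
    """Build a waveform one sample at a time from earlier samples."""
    total = int(seconds * SAMPLE_RATE)
    samples = [0.0, math.sin(OMEGA)]  # two seed samples start the recurrence
    while len(samples) < total:
        samples.append(next_sample(samples))
    return samples[:total]


waveform = generate(0.5)
print(len(waveform))  # sequential prediction steps needed for half a second
```

Because every sample waits on the ones before it, the loop cannot be trivially parallelized, which gives a sense of why generating audio this way can be slow.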

As the blog authors put it, "WaveNet showed promise but was not something we could deploy in the real world." It was "too computationally intensive" for use in consumer products. The team set about improving the model. They said it can now run "at scale and is the first product to launch on Google's latest TPU cloud infrastructure."

Key gains:

"The new, improved WaveNet model still generates a raw waveform but at speeds 1,000 times faster than the original model, meaning it requires just 50 milliseconds to create one second of speech."
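The figures in that quote can be unpacked with a little arithmetic. The Python sketch below takes the quoted numbers at face value (50 milliseconds per second of speech, a 1,000-fold speedup, and the 16,000 samples per second cited for the original model) and derives what they imply. The derived values are back-of-the-envelope estimates, not numbers stated by DeepMind.

```python
# Figures quoted in the article
NEW_MS_PER_AUDIO_SECOND = 50   # new model: 50 ms to create one second of speech
SPEEDUP = 1_000                # "1,000 times faster than the original model"
ORIGINAL_SAMPLE_RATE = 16_000  # samples per second cited for the original model

# Implied cost of the original model for one second of speech, in seconds
original_s_per_audio_second = NEW_MS_PER_AUDIO_SECOND * SPEEDUP / 1000

# How many times faster than real time the new model runs
realtime_factor = 1000 / NEW_MS_PER_AUDIO_SECOND

# Implied average time per sample for the original model, in milliseconds
original_ms_per_sample = NEW_MS_PER_AUDIO_SECOND * SPEEDUP / ORIGINAL_SAMPLE_RATE

print(original_s_per_audio_second)  # 50.0
print(realtime_factor)              # 20.0
print(original_ms_per_sample)       # 3.125
```

On these assumptions, the original model would have needed roughly 50 seconds of computation per second of speech, while the new one runs about 20 times faster than real time.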

Ryan Whitwam in ExtremeTech: "DeepMind promises a full paper soon that will detail how this was accomplished."

The results are also more natural-sounding, according to tests with human listeners, they blogged.

Whitwam remarked on Friday: "The voice model used in Assistant at launch wasn't bad, but Google just rolled a vastly improved version of the voices for English and Japanese."

The blog has some interesting summaries of how far the technology has come.

As for current text-to-speech (TTS) systems, they noted that concatenative TTS not only produces unnatural-sounding voices but is also hard to modify: a new database has to be recorded for each change, such as new emotions or intonations.

To overcome some of these problems, they said an alternative model, parametric TTS, is sometimes used. This approach uses rules and parameters about grammar and mouth movements to generate speech, but the resulting voices do not sound altogether natural.

Then there's WaveNet.

So, DeepMind, what's next? They said this is just the start for WaveNet, and that they were excited about the possibilities that "the power of a voice interface could now unlock for all the world's languages."

More information: deepmind.com/blog/wavenet-laun … es-google-assistant/

Citation: Google leverages WaveNet model's gains, sounds seem more natural (2017, October 8) retrieved 18 April 2024 from https://techxplore.com/news/2017-10-google-leverages-wavenet-gains-natural.html

