Parallel Neural Text-to-Speech

05/21/2019
by Kainan Peng, et al.

In this work, we propose a non-autoregressive seq2seq model that converts text to a spectrogram. It is fully convolutional and achieves a speed-up of about 17.5 times over Deep Voice 3 at synthesis while maintaining comparable speech quality with a WaveNet vocoder. Interestingly, it also makes fewer attention errors than the autoregressive model on challenging test sentences. Furthermore, we build the first fully parallel neural text-to-speech system by using an inverse autoregressive flow (IAF) as the parallel neural vocoder. Our system can synthesize speech from text in a single feed-forward pass. We also explore a novel approach to training the IAF from scratch as a generative model for raw waveforms, which avoids the need for distillation from a separately trained WaveNet.
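To illustrate why an IAF vocoder can generate a waveform in a single feed-forward pass, here is a minimal NumPy sketch of one flow step. It is not the paper's model: the shifted causal convolution with fixed weights (`w_mu`, `w_s`) is a hypothetical stand-in for the network that predicts the shift and log-scale, and all function names are illustrative. The point is only that each output sample x[t] = z[t] * sigma[t] + mu[t] depends on noise samples z[<t], which are all available up front, so every time step can be computed at once.

```python
import numpy as np


def shifted_causal_conv(z, w):
    """Return y with y[t] = sum_k w[k] * z[t-1-k], so y[t] sees only z[<t].

    Assumption: a linear filter stands in for the (much deeper) network
    that would predict the flow parameters in a real vocoder.
    """
    T = len(z)
    full = np.convolve(z, w)  # full[n] = sum_k w[k] * z[n-k]
    # Shift right by one step so that position t never sees z[t] itself.
    return np.concatenate([[0.0], full[: T - 1]])


def iaf_parallel(z, w_mu, w_s):
    """One IAF step computed for all time steps at once."""
    mu = shifted_causal_conv(z, w_mu)
    log_sigma = shifted_causal_conv(z, w_s)
    return z * np.exp(log_sigma) + mu


def iaf_reference_loop(z, w_mu, w_s):
    """Same transformation written step by step, for comparison only."""
    T = len(z)
    x = np.empty(T)
    for t in range(T):
        mu = sum(w_mu[k] * z[t - 1 - k] for k in range(len(w_mu)) if t - 1 - k >= 0)
        log_s = sum(w_s[k] * z[t - 1 - k] for k in range(len(w_s)) if t - 1 - k >= 0)
        x[t] = z[t] * np.exp(log_s) + mu
    return x


# Usage: draw white noise and transform it without any sequential dependency
# on previously generated output samples.
rng = np.random.default_rng(0)
z = rng.standard_normal(16)
w_mu = np.array([0.5, -0.2, 0.1])    # hypothetical filter for the shift
w_s = np.array([0.1, 0.05, -0.05])   # hypothetical filter for the log-scale
x = iaf_parallel(z, w_mu, w_s)
```

The loop version produces the same values, but nothing in it requires the previous output x[t-1]. That is the difference from an autoregressive vocoder, where each sample must wait for the ones generated before it.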


research
07/19/2018

ClariNet: Parallel Wave Generation in End-to-End Text-to-Speech

In this work, we propose an alternative solution for parallel wave gener...
research
05/22/2020

Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search

Recently, text-to-speech (TTS) models such as FastSpeech and ParaNet hav...
research
05/22/2019

FastSpeech: Fast, Robust and Controllable Text to Speech

Neural network based end-to-end text to speech (TTS) has significantly i...
research
12/12/2018

FPUAS : Fully Parallel UFANS-based End-to-End Acoustic System with 10x Speed Up

A lightweight end-to-end acoustic system is crucial in the deployment of...
research
12/07/2020

EfficientTTS: An Efficient and High-Quality Text-to-Speech Architecture

In this work, we address the Text-to-Speech (TTS) task by proposing a no...
research
10/06/2021

Hierarchical prosody modeling and control in non-autoregressive parallel neural TTS

Neural text-to-speech (TTS) synthesis can generate speech that is indist...
research
10/25/2019

Parallel WaveGAN: A fast waveform generation model based on generative adversarial networks with multi-resolution spectrogram

We propose Parallel WaveGAN, a distillation-free, fast, and small-footpr...
