
-
Parallel Tacotron: Non-Autoregressive and Controllable TTS
Although neural end-to-end text-to-speech models can synthesize highly n...
-
Non-Attentive Tacotron: Robust and Controllable Neural TTS Synthesis Including Unsupervised Duration Modeling
This paper presents Non-Attentive Tacotron based on the Tacotron 2 text-...
-
WaveGrad: Estimating Gradients for Waveform Generation
This paper introduces WaveGrad, a conditional model for waveform generat...
-
Fully-hierarchical fine-grained prosody modeling for interpretable speech synthesis
This paper proposes a hierarchical, fine-grained and interpretable laten...
-
Generating diverse and natural text-to-speech samples using a quantized fine-grained VAE and auto-regressive prosody prior
Recent neural text-to-speech (TTS) models with fine-grained latent featu...
-
Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning
We present a multispeaker, multilingual text-to-speech (TTS) synthesis m...
-
LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech
This paper introduces a new speech corpus called "LibriTTS" designed for...
-
Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling
Lingvo is a TensorFlow framework offering a complete solution for collab...
-
Hierarchical Generative Modeling for Controllable Speech Synthesis
This paper proposes a neural end-to-end text-to-speech (TTS) model which...
-
Sample Efficient Adaptive Text-to-Speech
We present a meta-learning approach for adaptive text-to-speech (TTS) wi...
-
Parallel WaveNet: Fast High-Fidelity Speech Synthesis
The recently-developed WaveNet architecture is the current state of the ...
-
WaveNet: A Generative Model for Raw Audio
This paper introduces WaveNet, a deep neural network for generating raw ...
-
Fast, Compact, and High Quality LSTM-RNN Based Statistical Parametric Speech Synthesizers for Mobile Devices
Acoustic models based on long short-term memory recurrent neural network...