Deep Text-to-Speech System with Seq2Seq Model

03/11/2019
by   Gary Wang, et al.

Recent neural-network-based text-to-speech (speech synthesis) pipelines have employed recurrent Seq2seq architectures that can synthesize realistic-sounding speech directly from text characters. These systems, however, have complex architectures and take a substantial amount of time to train. We introduce several modifications to these Seq2seq architectures that allow for faster training and also allow us to reduce the complexity of the model architecture. We show that our proposed model can achieve attention alignment much faster than previous architectures and that good audio quality can be achieved with a much smaller model. Sample audio is available at https://soundcloud.com/gary-wang-23/sets/tts-samples-for-cmpt-419.

research
01/16/2020

SqueezeWave: Extremely Lightweight Vocoders for On-device Speech Synthesis

Automatic speech synthesis is a challenging task that is becoming increa...
research
10/08/2021

Environment Aware Text-to-Speech Synthesis

This study aims at designing an environment-aware text-to-speech (TTS) s...
research
10/13/2021

A Melody-Unsupervision Model for Singing Voice Synthesis

Recent studies in singing voice synthesis have achieved high-quality res...
research
05/31/2021

Byakto Speech: Real-time long speech synthesis with convolutional neural network: Transfer learning from English to Bangla

Speech synthesis is one of the challenging tasks to automate by deep lea...
research
10/12/2018

A Fully Time-domain Neural Model for Subband-based Speech Synthesizer

This paper introduces a deep neural network model for subband-based spee...
research
08/11/2020

Bunched LPCNet : Vocoder for Low-cost Neural Text-To-Speech Systems

LPCNet is an efficient vocoder that combines linear prediction and deep ...
research
10/08/2021

Phone-to-audio alignment without text: A Semi-supervised Approach

The task of phone-to-audio alignment has many applications in speech res...
