Emotional End-to-End Neural Speech Synthesizer

11/15/2017
by   Younggun Lee, et al.
0

In this paper, we introduce an emotional speech synthesizer based on the recent end-to-end neural model, named Tacotron. Despite its benefits, we found that the original Tacotron suffers from the exposure bias problem and irregularity of the attention alignment. Later, we address the problem by utilization of context vector and residual connection at recurrent neural networks (RNNs). Our experiments showed that the model could successfully train and generate speech for given emotion labels.

READ FULL TEXT

page 2

page 4

research
06/13/2019

Adjusting Pleasure-Arousal-Dominance for Continuous Emotional Text-to-speech Synthesizer

Emotion is not limited to discrete categories of happy, sad, angry, fear...
research
01/10/2023

Generative Emotional AI for Speech Emotion Recognition: The Case for Synthetic Emotional Speech Augmentation

Despite advances in deep learning, current state-of-the-art speech emoti...
research
11/05/2019

emotional speech synthesis with rich and granularized control

This paper proposes an effective emotion control method for an end-to-en...
research
12/20/2022

Emotion Selectable End-to-End Text-based Speech Editing

Text-based speech editing allows users to edit speech by intuitively cut...
research
06/26/2019

End-to-End Emotional Speech Synthesis Using Style Tokens and Semi-Supervised Training

This paper proposes an end-to-end emotional speech synthesis (ESS) metho...
research
10/28/2022

Period VITS: Variational Inference with Explicit Pitch Modeling for End-to-end Emotional Speech Synthesis

Several fully end-to-end text-to-speech (TTS) models have been proposed ...
research
10/19/2021

Neural Lexicon Reader: Reduce Pronunciation Errors in End-to-end TTS by Leveraging External Textual Knowledge

End-to-end TTS suffers from high data requirements as it is difficult fo...

Please sign up or login with your details

Forgot password? Click here to reset