emotional speech synthesis with rich and granularized control

11/05/2019
by   Se-Yun Um, et al.
0

This paper proposes an effective emotion control method for an end-to-end text-to-speech (TTS) system. To flexibly control the distinct characteristic of a target emotion category, it is essential to determine embedding vectors representing the TTS input. We introduce an inter-to-intra emotional distance ratio algorithm to the embedding vectors that can minimize the distance to the target emotion category while maximizing its distance to the other emotion categories. To further enhance the expressiveness of a target speech, we also introduce an effective interpolation technique that enables the intensity of a target emotion to be gradually changed to that of neutral speech. Subjective evaluation results in terms of emotional expressiveness and controllability show the superiority of the proposed algorithm to the conventional methods.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
06/13/2019

Adjusting Pleasure-Arousal-Dominance for Continuous Emotional Text-to-speech Synthesizer

Emotion is not limited to discrete categories of happy, sad, angry, fear...
research
03/02/2023

Fine-grained Emotional Control of Text-To-Speech: Learning To Rank Inter- And Intra-Class Emotion Intensities

State-of-the-art Text-To-Speech (TTS) models are capable of producing hi...
research
06/26/2019

End-to-End Emotional Speech Synthesis Using Style Tokens and Semi-Supervised Training

This paper proposes an end-to-end emotional speech synthesis (ESS) metho...
research
11/17/2020

Controllable Emotion Transfer For End-to-End Speech Synthesis

Emotion embedding space learned from references is a straightforward app...
research
06/01/2018

Speech-Driven Expressive Talking Lips with Conditional Sequential Generative Adversarial Networks

Articulation, emotion, and personality play strong roles in the orofacia...
research
11/15/2017

Emotional End-to-End Neural Speech Synthesizer

In this paper, we introduce an emotional speech synthesizer based on the...
research
06/30/2022

Language Model-Based Emotion Prediction Methods for Emotional Speech Synthesis Systems

This paper proposes an effective emotional text-to-speech (TTS) system w...

Please sign up or login with your details

Forgot password? Click here to reset