EG-GAN: Cross-Language Emotion Gain Synthesis based on Cycle-Consistent Adversarial Networks

05/27/2019
by Xiaoqi Jia, et al.

Despite the remarkable contributions of existing emotional speech synthesizers, these methods either depend on a Text-to-Speech system or require aligned speech pairs, and therefore struggle with pure emotion gain synthesis. Moreover, few studies have examined the cross-language generalization of these methods for emotional speech synthesis across languages. We propose a cross-language emotion gain synthesis method, EG-GAN, which learns a language-independent mapping from a source emotion domain to a target emotion domain in the absence of paired speech samples. EG-GAN is built on a cycle-consistent generative adversarial network with a gradient penalty and an auxiliary speaker discriminator. Domain adaptation is introduced to enable rapid transfer and sharing of emotion gains across languages. Experimental results show that our method efficiently synthesizes high-quality emotional speech from any source speech for a given emotion category, without being limited by language differences or aligned speech pairs.
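The objective described above combines three terms: an adversarial loss against a target-emotion critic, a cycle-consistency loss that removes the need for aligned speech pairs, and an auxiliary speaker loss that encourages the conversion to preserve speaker identity. The sketch below is a minimal, hypothetical PyTorch rendering of such an objective, assuming a WGAN-style gradient penalty; the module names (G_s2t, G_t2s, D_t, D_spk), feature dimensions, and loss weights are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

FEAT_DIM = 36      # assumed acoustic feature dimension (e.g. mel-cepstra)
NUM_SPEAKERS = 4   # assumed number of speakers in the corpus

def mlp(in_dim, out_dim):
    # Stand-in network body; a real system would use convolutional or
    # recurrent layers over the time axis.
    return nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU(),
                         nn.Linear(128, out_dim))

G_s2t = mlp(FEAT_DIM, FEAT_DIM)      # source-emotion -> target-emotion
G_t2s = mlp(FEAT_DIM, FEAT_DIM)      # target-emotion -> source-emotion
D_t   = mlp(FEAT_DIM, 1)             # critic for the target emotion domain
D_spk = mlp(FEAT_DIM, NUM_SPEAKERS)  # auxiliary speaker classifier

def gradient_penalty(critic, real, fake, lambda_gp=10.0):
    # WGAN-GP: penalize the critic's gradient norm on samples
    # interpolated between real and generated features.
    eps = torch.rand(real.size(0), 1)
    x_hat = (eps * real + (1 - eps) * fake.detach()).requires_grad_(True)
    score = critic(x_hat).sum()
    grads, = torch.autograd.grad(score, x_hat, create_graph=True)
    return lambda_gp * ((grads.norm(2, dim=1) - 1) ** 2).mean()

def generator_loss(x_src, spk_ids, lambda_cyc=10.0, lambda_spk=1.0):
    fake_t = G_s2t(x_src)
    adv = -D_t(fake_t).mean()                      # fool the target critic
    cyc = F.l1_loss(G_t2s(fake_t), x_src)          # no aligned pairs needed
    spk = F.cross_entropy(D_spk(fake_t), spk_ids)  # keep speaker identity
    return adv + lambda_cyc * cyc + lambda_spk * spk

# Example: one generator step on a toy batch.
x = torch.randn(8, FEAT_DIM)
ids = torch.randint(0, NUM_SPEAKERS, (8,))
loss = generator_loss(x, ids)
loss.backward()
```

In a CycleGAN-style setup, the mirrored losses for the target-to-source direction and the critic update with the gradient penalty would be computed symmetrically; they are omitted here for brevity.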

Related research

- Adjusting Pleasure-Arousal-Dominance for Continuous Emotional Text-to-speech Synthesizer (06/13/2019)
- EMOVIE: A Mandarin Emotion Speech Dataset with a Simple Emotional Text-to-Speech Model (06/17/2021)
- Non-parallel Emotion Conversion using a Deep-Generative Hybrid Network and an Adversarial Pair Discriminator (07/25/2020)
- Emotion transplantation through adaptation in HMM-based speech synthesis (01/09/2021)
- SynthMix: Mixing up Aligned Synthesis for Medical Cross-Modality Domain Adaptation (05/07/2023)
- Cross-speaker Emotion Transfer Based on Speaker Condition Layer Normalization and Semi-Supervised Training in Text-To-Speech (10/08/2021)
- StrengthNet: Deep Learning-based Emotion Strength Assessment for Emotional Speech Synthesis (10/07/2021)
