Mixed Emotion Modelling for Emotional Voice Conversion

10/25/2022
by   Kun Zhou, et al.

Emotional voice conversion (EVC) aims to convert the emotional state of an utterance from one emotion to another while preserving the linguistic content and speaker identity. Current studies mostly focus on modelling the conversion between a few specific emotion types. Synthesizing mixed emotions could help us better imitate human emotional expression and facilitate more natural human-computer interaction. In this work, we formulate and study, for the first time, the problem of mixed emotion synthesis for EVC. We regard emotional styles as a series of emotion attributes that are learnt by a ranking-based support vector machine (SVM). Each attribute measures the degree of relevance between speech recordings belonging to different emotion types. We then incorporate those attributes into a sequence-to-sequence (seq2seq) emotional voice conversion framework. During training, the framework not only learns to characterize the input emotional style, but also quantifies its relevance to other emotion types. At run-time, various emotional mixtures can be produced by manually defining the attributes. We conduct objective and subjective evaluations to validate our idea in terms of mixed emotion synthesis. We further build an emotion triangle as an application of emotion transition. Code and speech samples are publicly available.
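
The ranking-based attribute idea described above can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the acoustic features, the SVM solver, or the emotion set, so the toy 2-D features, the function names (`train_rank_svm`, `attribute`), the hyperparameters, and the emotion labels below are all assumptions made for illustration. The sketch trains a linear ranking function with a pairwise hinge loss (a primal form of a ranking SVM), normalises its score to [0, 1] as an attribute, and then shows how a mixture might be defined by hand at run-time.

```python
import random


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def train_rank_svm(pos, neg, epochs=200, lr=0.1, lam=0.01, seed=0):
    """Learn w so that dot(w, x) ranks `pos` items above `neg` items.

    Pairwise hinge loss with L2 regularisation, optimised by plain
    subgradient descent. All hyperparameters are illustrative assumptions.
    """
    rng = random.Random(seed)
    dim = len(pos[0])
    w = [0.0] * dim
    pairs = [(p, n) for p in pos for n in neg]
    for _ in range(epochs):
        rng.shuffle(pairs)
        for p, n in pairs:
            diff = [a - b for a, b in zip(p, n)]
            violated = dot(w, diff) < 1.0  # margin constraint not yet met
            for k in range(dim):
                grad = lam * w[k] - (diff[k] if violated else 0.0)
                w[k] -= lr * grad
    return w


def attribute(w, x, lo, hi):
    """Min-max normalise the ranking score of x to the range [0, 1]."""
    return min(1.0, max(0.0, (dot(w, x) - lo) / (hi - lo)))


# Hypothetical 2-D features per utterance (e.g. normalised pitch and energy).
happy = [[0.8, 0.7], [0.9, 0.6], [0.7, 0.9]]
others = [[0.2, 0.3], [0.1, 0.4], [0.3, 0.2]]

w = train_rank_svm(happy, others)
scores = [dot(w, x) for x in happy + others]
lo, hi = min(scores), max(scores)

happy_attr = [attribute(w, x, lo, hi) for x in happy]
other_attr = [attribute(w, x, lo, hi) for x in others]

# At run-time, a mixture would be specified by hand as one attribute value
# per emotion. The emotion names and values here are hypothetical.
manual_mixture = {"happy": 0.7, "sad": 0.3}
```

In this toy setting every utterance in `happy` receives a higher attribute value than every utterance in `others`, which is the ordering property the ranking function is trained for. In a full system, an attribute vector such as `manual_mixture` would condition the seq2seq decoder rather than being used on its own.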

research
08/11/2022

Speech Synthesis with Mixed Emotions

Emotional speech synthesis aims to synthesize human voices with various ...
research
01/10/2022

Emotion Intensity and its Control for Emotional Voice Conversion

Emotional voice conversion (EVC) seeks to convert the emotional state of...
research
03/29/2022

An Overview Analysis of Sequence-to-Sequence Emotional Voice Conversion

Emotional voice conversion (EVC) focuses on converting a speech utteranc...
research
01/09/2021

Spanish expressive voices: Corpus for emotion research in Spanish

A new emotional multimedia database has been recorded and aligned. The d...
research
01/14/2021

EmoCat: Language-agnostic Emotional Voice Conversion

Emotional voice conversion models adapt the emotion in speech without ch...
research
09/14/2023

StarGAN-VC++: Towards Emotion Preserving Voice Conversion Using Deep Embeddings

Voice conversion (VC) transforms an utterance to sound like another pers...
research
06/15/2022

Accurate Emotion Strength Assessment for Seen and Unseen Speech Based on Data-Driven Deep Learning

Emotion classification of speech and assessment of the emotion strength ...
