Nonparallel Emotional Speech Conversion

11/03/2018
by   Jian Gao, et al.
0

We propose a nonparallel data-driven emotional speech conversion method. It enables the transfer of emotion-related characteristics of a speech signal while preserving the speaker's identity and linguistic content. Most existing approaches require parallel data and time alignment, which is not available in most real applications. We achieve nonparallel training based on an unsupervised style transfer technique, which learns a translation model between two distributions instead of a deterministic one-to-one mapping between paired examples. The conversion model consists of an encoder and a decoder for each emotion domain. We assume that the speech signal can be decomposed into an emotion-invariant content code and an emotion-related style code in latent space. Emotion conversion is performed by extracting and recombining the content code of the source speech and the style code of the target emotion. We tested our method on a nonparallel corpora with four emotions. Both subjective and objective evaluations show the effectiveness of our approach.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
10/20/2021

Identity Conversion for Emotional Speakers: A Study for Disentanglement of Emotion Style and Speaker Identity

Expressive voice conversion performs identity conversion for emotional s...
research
07/25/2020

Multi-speaker Emotion Conversion via Latent Variable Regularization and a Chained Encoder-Decoder-Predictor Network

We propose a novel method for emotion conversion in speech based on a ch...
research
05/15/2020

Challenges in Emotion Style Transfer: An Exploration with a Lexical Substitution Pipeline

We propose the task of emotion style transfer, which is particularly cha...
research
11/14/2021

Textless Speech Emotion Conversion using Decomposed and Discrete Representations

Speech emotion conversion is the task of modifying the perceived emotion...
research
11/09/2022

A Diffeomorphic Flow-based Variational Framework for Multi-speaker Emotion Conversion

This paper introduces a new framework for non-parallel emotion conversio...
research
06/02/2023

In-the-wild Speech Emotion Conversion Using Disentangled Self-Supervised Representations and Neural Vocoder-based Resynthesis

Speech emotion conversion aims to convert the expressed emotion of a spo...
research
09/14/2023

EMOCONV-DIFF: Diffusion-based Speech Emotion Conversion for Non-parallel and In-the-wild Data

Speech emotion conversion is the task of converting the expressed emotio...

Please sign up or login with your details

Forgot password? Click here to reset