U-Singer: Multi-Singer Singing Voice Synthesizer that Controls Emotional Intensity

03/02/2022
by Sungjae Kim, et al.

We propose U-Singer, the first multi-singer emotional singing voice synthesizer that expresses various levels of emotional intensity. While synthesizing singing voices according to the lyrics, pitch, and duration of the music score, U-Singer reflects singer characteristics and emotional intensity by adding variances in pitch, energy, and phoneme duration according to singer ID and emotional intensity. By representing all attributes as conditional residual embeddings in a single unified embedding space, U-Singer controls mutually correlated style attributes while minimizing interference. Additionally, we apply emotion embedding interpolation and extrapolation techniques that lead the model to learn a linear embedding space and allow it to express emotional intensity levels not included in the training data. In experiments, U-Singer synthesized high-fidelity singing voices reflecting the singer ID and emotional intensity. Visualization of the unified embedding space shows that U-Singer estimates variations in pitch and energy that are highly correlated with the singer ID and emotional intensity level. Audio samples are presented at https://u-singer.github.io.
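The emotion embedding interpolation and extrapolation mentioned in the abstract can be sketched as scaling a residual emotion embedding relative to a neutral reference: interpolation uses intensity factors between 0 and 1, extrapolation uses factors beyond 1. The following Python sketch is only an illustration of that idea under assumed names (neutral_emb, emotion_emb, intensity, the 256-dimensional embedding size), not the authors' implementation.

    import torch

    def scale_emotion_embedding(neutral_emb: torch.Tensor,
                                emotion_emb: torch.Tensor,
                                intensity: float) -> torch.Tensor:
        # Residual between the full-intensity emotion embedding and the
        # neutral embedding; a linear embedding space lets intermediate
        # (0 < intensity < 1) and unseen stronger (intensity > 1) levels
        # be expressed by simple scaling along this direction.
        residual = emotion_emb - neutral_emb
        return neutral_emb + intensity * residual

    # Illustrative usage with random embeddings (dimensions assumed).
    neutral = torch.zeros(256)
    sad_full = torch.randn(256)
    sad_half = scale_emotion_embedding(neutral, sad_full, 0.5)    # interpolation
    sad_strong = scale_emotion_embedding(neutral, sad_full, 1.5)  # extrapolation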

