Emotion-Aware Prosodic Phrasing for Expressive Text-to-Speech

09/21/2023
by Rui Liu, et al.

Prosodic phrasing is crucial to the naturalness and intelligibility of end-to-end Text-to-Speech (TTS). Natural speech carries both linguistic and emotional prosody. Because the study of prosodic phrasing has been linguistically motivated, prosodic phrasing for expressive emotion rendering has not been well studied. In this paper, we propose an emotion-aware prosodic phrasing model, termed EmoPP, to accurately mine the emotional cues of an utterance and predict appropriate phrase breaks. We first conduct objective observations on the ESD dataset to validate the strong correlation between emotion and prosodic phrasing. Objective and subjective evaluations then show that EmoPP outperforms all baselines and achieves remarkable performance in terms of emotional expressiveness. The audio samples and the code are available at <https://github.com/AI-S2-Lab/EmoPP>.