ChatGPT-EDSS: Empathetic Dialogue Speech Synthesis Trained from ChatGPT-derived Context Word Embeddings

05/23/2023
by   Yuki Saito, et al.
We propose ChatGPT-EDSS, an empathetic dialogue speech synthesis (EDSS) method that uses ChatGPT to extract dialogue context. ChatGPT is a chatbot that can deeply understand the content and purpose of an input prompt and respond appropriately to the user's request. We focus on ChatGPT's reading comprehension and apply it to EDSS, the task of synthesizing speech that empathizes with the interlocutor's emotion. Our method first gives the chat history to ChatGPT and asks it to generate three words representing the intention, emotion, and speaking style of each line in the chat. It then trains an EDSS model using the embeddings of the ChatGPT-derived context words as conditioning features. Experimental results demonstrate that our method performs comparably to methods that use emotion labels or neural network-derived context embeddings learned from chat histories. The collected ChatGPT-derived context information is available at https://sarulab-speech.github.io/demo_ChatGPT_EDSS/.
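The two steps described in the abstract (asking for three context words per chat line, then turning those words into conditioning features) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' implementation: the prompt wording, the `N: intention, emotion, style` reply format, the function names `build_prompt`, `parse_reply`, `toy_embedding`, and `conditioning_vector`, the hash-based stand-in for a real word-embedding model, and the choice to concatenate the three embeddings. No ChatGPT call is made; the reply is passed in as a plain string.

```python
import hashlib
from typing import Dict, List, Tuple

# The three context words requested for each chat line (per the abstract).
FIELDS = ("intention", "emotion", "speaking_style")


def build_prompt(chat: List[Tuple[str, str]]) -> str:
    """Format a chat history as a prompt (hypothetical wording)."""
    header = (
        "For each numbered line of the chat below, answer with three words "
        "describing the intention, emotion, and speaking style, "
        "formatted as 'N: intention, emotion, style'.\n"
    )
    body = "\n".join(
        f"{i}. {speaker}: {text}" for i, (speaker, text) in enumerate(chat, 1)
    )
    return header + body


def parse_reply(reply: str) -> Dict[int, Dict[str, str]]:
    """Parse 'N: w1, w2, w3' lines into {line number: {field: word}}."""
    result: Dict[int, Dict[str, str]] = {}
    for raw in reply.splitlines():
        if ":" not in raw:
            continue
        index, words = raw.split(":", 1)
        parts = [w.strip().lower() for w in words.split(",")]
        # Skip lines that do not match the assumed reply format.
        if index.strip().isdigit() and len(parts) == len(FIELDS):
            result[int(index)] = dict(zip(FIELDS, parts))
    return result


def toy_embedding(word: str, dim: int = 8) -> List[float]:
    """Deterministic stand-in for a real word-embedding lookup (assumption)."""
    digest = hashlib.sha256(word.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:dim]]


def conditioning_vector(context: Dict[str, str], dim: int = 8) -> List[float]:
    """Concatenate the three word embeddings (concatenation is an assumption)."""
    vec: List[float] = []
    for field in FIELDS:
        vec.extend(toy_embedding(context[field], dim))
    return vec


if __name__ == "__main__":
    chat = [("User", "I failed my exam."), ("Agent", "That sounds really hard.")]
    print(build_prompt(chat))
    # A hand-written reply standing in for the chatbot's answer.
    reply = "1: confide, sad, quiet\n2: comfort, sympathetic, gentle"
    contexts = parse_reply(reply)
    print(contexts[2])
    print(len(conditioning_vector(contexts[2])))
```

In a real system the hash-based `toy_embedding` would be replaced by a pretrained word-embedding model, and the resulting vector would be fed to the speech synthesis model as a conditioning input alongside the text to be spoken.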

research
06/16/2022

Acoustic Modeling for End-to-End Empathetic Dialogue Speech Synthesis Using Linguistic and Prosodic Contexts of Dialogue History

We propose an end-to-end empathetic dialogue speech synthesis (DSS) mode...
research
06/10/2019

CAiRE_HKUST at SemEval-2019 Task 3: Hierarchical Attention for Dialogue Emotion Classification

Detecting emotion from dialogue is a challenge that has not yet been ext...
research
04/21/2021

MoonGrad at SemEval-2019 Task 3: Ensemble BiRNNs for Contextual Emotion Detection in Dialogues

When reading “I don’t want to talk to you any more”, we might interpret ...
research
03/28/2022

STUDIES: Corpus of Japanese Empathetic Dialogue Speech Towards Friendly Voice Agent

We present STUDIES, a new speech corpus for developing a voice agent tha...
research
09/26/2022

Modeling Content-Emotion Duality via Disentanglement for Empathetic Conversation

The task of empathetic response generation aims to understand what feeli...
research
09/05/2018

Sentylic at IEST 2018: Gated Recurrent Neural Network and Capsule Network Based Approach for Implicit Emotion Detection

In this paper, we present the system we have used for the Implicit WASSA...
research
07/05/2023

Going Retro: Astonishingly Simple Yet Effective Rule-based Prosody Modelling for Speech Synthesis Simulating Emotion Dimensions

We introduce two rule-based models to modify the prosody of speech synth...
