Analysis and Utilization of Entrainment on Acoustic and Emotion Features in User-agent Dialogue

12/07/2022
by Daxin Tan, et al.

Entrainment is the phenomenon by which an interlocutor adapts their speaking style to align with their partner in conversation. It has been found in different dimensions, such as acoustic, prosodic, lexical, or syntactic. In this work, we explore and utilize the entrainment phenomenon to improve spoken dialogue systems for voice assistants. We first examine the existence of entrainment in human-to-human dialogues with respect to acoustic features and then extend the analysis to emotion features. The analysis results show strong evidence of entrainment in terms of both acoustic and emotion features. Based on these findings, we implement two entrainment policies and assess whether integrating the entrainment principle into a Text-to-Speech (TTS) system improves synthesis performance and user experience. We find that integrating the entrainment principle into a TTS system improves performance when acoustic features are considered, while no obvious improvement is observed when emotion features are considered.
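
As a rough illustration of what "evidence of entrainment" on an acoustic feature can look like, the sketch below compares how close two speakers are on adjacent turns against how close they are on non-adjacent pairings. This is a generic proximity-style check and an assumption on our part, not the measure used in the paper; the function name, the per-turn mean pitch values, and the pairing scheme are all hypothetical.

```python
from statistics import mean


def proximity_distances(speaker_a, speaker_b):
    """Return (adjacent, non_adjacent) mean absolute feature distances.

    speaker_a[i] and speaker_b[i] are one feature value (e.g. mean pitch
    in Hz) for the i-th pair of adjacent turns. Hypothetical helper: a
    lower adjacent distance than non-adjacent distance is read here as
    a sign of entrainment on that feature.
    """
    if len(speaker_a) != len(speaker_b):
        raise ValueError("both speakers need the same number of turns")
    n = len(speaker_a)
    if n < 2:
        raise ValueError("need at least two turn pairs")

    # Distance between each turn and the partner turn that follows it.
    adjacent = mean(abs(a - b) for a, b in zip(speaker_a, speaker_b))

    # Baseline: distance to every partner turn that is NOT adjacent.
    non_adjacent = mean(
        abs(speaker_a[i] - speaker_b[j])
        for i in range(n)
        for j in range(n)
        if i != j
    )
    return adjacent, non_adjacent


# Made-up per-turn mean pitch values (Hz), for illustration only.
user_f0 = [180.0, 210.0, 195.0, 230.0]
partner_f0 = [185.0, 205.0, 200.0, 225.0]

adjacent, non_adjacent = proximity_distances(user_f0, partner_f0)
print(f"adjacent: {adjacent:.2f} Hz, non-adjacent: {non_adjacent:.2f} Hz")
```

With these invented values the adjacent distance is smaller than the non-adjacent baseline, which is the pattern a proximity check treats as entrainment. A real analysis would also need a significance test and features extracted from actual recordings.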

