HiGRU: Hierarchical Gated Recurrent Units for Utterance-level Emotion Recognition

04/09/2019
by Wenxiang Jiao, et al.

In this paper, we address three challenges in utterance-level emotion recognition in dialogue systems: (1) the same word can convey different emotions in different contexts; (2) some emotions are rarely seen in general dialogues; (3) long-range contextual information is hard to capture effectively. We therefore propose a hierarchical Gated Recurrent Unit (HiGRU) framework with a lower-level GRU to model the word-level inputs and an upper-level GRU to capture the contexts of utterance-level embeddings. Moreover, we extend the framework to two variants, HiGRU with individual features fusion (HiGRU-f) and HiGRU with self-attention and features fusion (HiGRU-sf), so that the word/utterance-level individual inputs and the long-range contextual information can be sufficiently utilized. Experiments on three dialogue emotion datasets, IEMOCAP, Friends, and EmotionPush, demonstrate that our proposed HiGRU models attain at least 8.7%, 7.5%, and 6.0% improvement over the state-of-the-art methods on each dataset, respectively. In particular, using only the textual feature in IEMOCAP, our HiGRU models gain at least 3.8% improvement over the state-of-the-art conversational memory network (CMN), which uses the trimodal features of text, video, and audio.
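
To make the two-level idea concrete, the following is a minimal sketch of a hierarchical GRU encoder in pure Python. It is not the authors' implementation: the names `GRUCell` and `encode_dialogue`, the toy dimensions, the random weights, the use of unidirectional GRUs without biases, and the choice of the last hidden state as the utterance embedding are all assumptions made for illustration. The feature fusion and self-attention of HiGRU-f and HiGRU-sf, and the emotion classifier, are omitted.

```python
import math
import random


def _matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def _sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class GRUCell:
    """Minimal GRU cell without biases (illustrative, untrained weights)."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size

        def mat(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                    for _ in range(rows)]

        # One input matrix W and one recurrent matrix U per gate:
        # update (z), reset (r), candidate (n).
        self.W = {g: mat(hidden_size, input_size) for g in "zrn"}
        self.U = {g: mat(hidden_size, hidden_size) for g in "zrn"}

    def step(self, x, h):
        """Advance the hidden state h by one input vector x."""
        z = [_sigmoid(a + b) for a, b in
             zip(_matvec(self.W["z"], x), _matvec(self.U["z"], h))]
        r = [_sigmoid(a + b) for a, b in
             zip(_matvec(self.W["r"], x), _matvec(self.U["r"], h))]
        rh = [ri * hi for ri, hi in zip(r, h)]
        n = [math.tanh(a + b) for a, b in
             zip(_matvec(self.W["n"], x), _matvec(self.U["n"], rh))]
        # Interpolate between the previous state and the candidate state.
        return [(1 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]

    def run(self, inputs):
        """Return the hidden state after each element of the sequence."""
        h = [0.0] * self.hidden_size
        states = []
        for x in inputs:
            h = self.step(x, h)
            states.append(h)
        return states


def encode_dialogue(dialogue, word_gru, utt_gru):
    """dialogue: list of utterances, each a list of word vectors."""
    # Lower level: read the words of each utterance, keep the final state
    # as that utterance's embedding (a simplification assumed here).
    utt_embs = [word_gru.run(words)[-1] for words in dialogue]
    # Upper level: read the utterance embeddings in dialogue order so each
    # output carries context from the preceding utterances.
    return utt_gru.run(utt_embs)


rng = random.Random(0)
word_dim, utt_dim, ctx_dim = 4, 3, 2
word_gru = GRUCell(word_dim, utt_dim, rng)
utt_gru = GRUCell(utt_dim, ctx_dim, rng)

# Toy dialogue: three utterances with 2, 5, and 1 random word vectors.
dialogue = [[[rng.uniform(-1, 1) for _ in range(word_dim)] for _ in range(n)]
            for n in (2, 5, 1)]
contexts = encode_dialogue(dialogue, word_gru, utt_gru)
print(len(contexts), len(contexts[0]))  # prints "3 2"
```

Each entry of `contexts` is one contextual vector per utterance; in a full model such a vector would presumably feed a per-utterance emotion classifier.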


