SI-LSTM: Speaker Hybrid Long-short Term Memory and Cross Modal Attention for Emotion Recognition in Conversation

05/04/2023
by Xingwei Liang, et al.

Multimodal Emotion Recognition in Conversation (ERC) is important for a variety of applications, including intelligent healthcare, conversational artificial intelligence, and opinion mining over chat history. The crux of ERC is to model both cross-modality and cross-time interactions throughout the conversation. Previous methods have made progress in learning the temporal information of a conversation but lack the ability to track the distinct emotional states of each speaker. In this paper, we propose a recurrent structure called Speaker Information Enhanced Long-Short Term Memory (SI-LSTM) for the ERC task, in which the emotional states of each distinct speaker are tracked sequentially to enhance the learning of emotion in conversation. Further, to improve the learning of multimodal features in ERC, we use a cross-modal attention component to fuse features across modalities and model the interaction of important information between them. Experimental results on two benchmark datasets show that the proposed SI-LSTM outperforms state-of-the-art baseline methods on the multimodal ERC task.
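As a rough illustration of the cross-modal attention idea mentioned in the abstract, the sketch below applies generic scaled dot-product attention in which one modality supplies the queries and another supplies the keys and values. It is not the authors' implementation: the function name `cross_modal_attention`, the projection matrices, the feature sizes, and the choice of text as the query modality and audio as the context modality are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability before exponentiating.
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def cross_modal_attention(query_feats, context_feats, w_q, w_k, w_v):
    """Generic scaled dot-product attention across two modalities.

    query_feats:   (n_utterances, d_query)   e.g. text features (assumed)
    context_feats: (n_utterances, d_context) e.g. audio features (assumed)
    w_q, w_k, w_v: hypothetical projection matrices into a shared size d.
    """
    q = query_feats @ w_q      # (n, d)
    k = context_feats @ w_k    # (n, d)
    v = context_feats @ w_v    # (n, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])  # (n, n) similarity scores
    weights = softmax(scores, axis=-1)       # each row sums to 1
    fused = weights @ v                      # context features re-weighted per query
    return fused, weights


# Toy data: 5 utterances, 16-dim text features, 12-dim audio features.
rng = np.random.default_rng(0)
text = rng.normal(size=(5, 16))
audio = rng.normal(size=(5, 12))
d = 8
w_q = rng.normal(size=(16, d))
w_k = rng.normal(size=(12, d))
w_v = rng.normal(size=(12, d))

fused, weights = cross_modal_attention(text, audio, w_q, w_k, w_v)
print(fused.shape, weights.shape)
```

In a trained model the projection matrices would be learned parameters rather than random draws, and the fused features would feed the recurrent component; how SI-LSTM wires these together is not specified in the abstract.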


research
07/28/2023

CFN-ESA: A Cross-Modal Fusion Network with Emotion-Shift Awareness for Dialogue Emotion Recognition

Multimodal Emotion Recognition in Conversation (ERC) has garnered growin...
research
09/23/2017

Cross-modal Recurrent Models for Weight Objective Prediction from Multimodal Time-series Data

We analyse multimodal time-series data corresponding to weight, sleep an...
research
02/03/2018

Multi-attention Recurrent Network for Human Communication Comprehension

Human face-to-face communication is a complex multimodal signal. We use ...
research
11/22/2019

Modeling emotion in complex stories: the Stanford Emotional Narratives Dataset

Human emotions unfold over time, and more affective computing research h...
research
07/19/2017

The Role of Conversation Context for Sarcasm Detection in Online Interactions

Computational models for sarcasm detection have often relied on the cont...
research
02/19/2020

Multilogue-Net: A Context Aware RNN for Multi-modal Emotion Detection and Sentiment Analysis in Conversation

Sentiment Analysis and Emotion Detection in conversation is key in a num...
research
10/15/2020

DialogueTRM: Exploring the Intra- and Inter-Modal Emotional Behaviors in the Conversation

Emotion Recognition in Conversations (ERC) is essential for building emp...
