Group Gated Fusion on Attention-based Bidirectional Alignment for Multimodal Emotion Recognition

01/17/2022
by Pengfei Liu, et al.

Emotion recognition is a challenging and actively studied research area that plays a critical role in emotion-aware human-computer interaction systems. In a multimodal setting, temporal alignment between different modalities has not yet been well investigated. This paper presents a new model, the Gated Bidirectional Alignment Network (GBAN), which consists of an attention-based bidirectional alignment network over LSTM hidden states that explicitly captures the alignment relationship between speech and text, and a novel group gated fusion (GGF) layer that integrates the representations of different modalities. We empirically show that the attention-aligned representations significantly outperform the last hidden states of the LSTM, and that the proposed GBAN model outperforms existing state-of-the-art multimodal approaches on the IEMOCAP dataset.
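The abstract names two components without giving their equations, so the following is only a rough illustrative sketch of the general idea, not the authors' formulation. It assumes scaled dot-product attention for the alignment step and softmax gates over tanh projections for the fusion step. The function names `align` and `gated_fusion`, the weight shapes, and the random inputs standing in for LSTM hidden states are all hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def align(queries, keys):
    """For each query time step, return an attention-weighted sum of key states.

    queries: (Tq, d) hidden states of one modality
    keys:    (Tk, d) hidden states of the other modality
    Assumption: scaled dot-product scoring; the paper may score differently.
    """
    scores = queries @ keys.T / np.sqrt(queries.shape[-1])  # (Tq, Tk)
    weights = softmax(scores, axis=-1)                      # each row sums to 1
    return weights @ keys                                   # (Tq, d)

def gated_fusion(reps, W_h, W_z):
    """Fuse n representation vectors with learned gates (assumed form).

    reps: list of n vectors, each of shape (d,)
    W_h:  list of n projection matrices, each of shape (d, k)
    W_z:  gate weights of shape (n * d, n)
    """
    hidden = np.stack([np.tanh(r @ W) for r, W in zip(reps, W_h)])  # (n, k)
    gates = softmax(np.concatenate(reps) @ W_z)                     # (n,)
    return gates @ hidden                                           # (k,)

# Random stand-ins for LSTM hidden states: 50 speech frames, 12 words, d = 8.
rng = np.random.default_rng(0)
speech = rng.standard_normal((50, 8))
text = rng.standard_normal((12, 8))

# Bidirectional alignment: speech aligned to each word, text aligned to each frame.
speech_to_text = align(text, speech)   # shape (12, 8)
text_to_speech = align(speech, text)   # shape (50, 8)

# Mean pooling over time is an assumption made to obtain one vector per direction.
reps = [speech_to_text.mean(axis=0), text_to_speech.mean(axis=0)]
W_h = [rng.standard_normal((8, 4)) for _ in reps]
W_z = rng.standard_normal((16, 2))
fused = gated_fusion(reps, W_h, W_z)   # shape (4,)
```

In this sketch the gates form a convex combination, so each fused value stays within the range of the tanh projections. A trained model would learn `W_h` and `W_z` rather than sample them.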

research
02/20/2023

Knowledge-aware Bayesian Co-attention for Multimodal Emotion Recognition

Multimodal emotion recognition is a challenging research area that aims ...
research
06/22/2021

Key-Sparse Transformer with Cascaded Cross-Attention Block for Multimodal Speech Emotion Recognition

Speech emotion recognition is a challenging and important research topic...
research
05/17/2018

Convolutional Attention Networks for Multimodal Emotion Recognition from Speech and Text Data

Emotion recognition has become a popular topic of interest, especially i...
research
07/25/2022

GA2MIF: Graph and Attention based Two-stage Multi-source Information Fusion for Conversational Emotion Detection

Multimodal Emotion Recognition in Conversation (ERC) plays an influentia...
research
09/05/2023

Leveraging Label Information for Multimodal Emotion Recognition

Multimodal emotion recognition (MER) aims to detect the emotional status...
research
05/22/2018

Multimodal Affective Analysis Using Hierarchical Attention Strategy with Word-Level Alignment

Multimodal affective computing, learning to recognize and interpret huma...
research
11/20/2019

Real-Time Emotion Recognition via Attention Gated Hierarchical Memory Network

Real-time emotion recognition (RTER) in conversations is significant for...
