Leveraging Label Information for Multimodal Emotion Recognition

09/05/2023
by Peiying Wang, et al.

Multimodal emotion recognition (MER) aims to detect the emotional status of a given expression by combining speech and text information. Intuitively, label information should help the model locate the salient tokens and frames relevant to a specific emotion, which in turn facilitates the MER task. Inspired by this, we propose a novel approach to MER that leverages label information. Specifically, we first obtain representative label embeddings for both the text and speech modalities, then learn label-enhanced text and speech representations for each utterance via label-token and label-frame interactions. Finally, we devise a novel label-guided attentive fusion module that fuses the label-aware text and speech representations for emotion classification. Extensive experiments on the public IEMOCAP dataset demonstrate that our approach outperforms existing baselines and achieves new state-of-the-art performance.
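The label-token interaction described above can be pictured as each token attending over a set of emotion-label embeddings and absorbing the attention-weighted label context. The following is a minimal pure-Python sketch of that idea; the function names, the dot-product scoring, and the residual-style addition are illustrative assumptions, not the paper's exact formulation (the same pattern would apply per-frame for the speech modality).

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def label_enhance(tokens, labels):
    """Label-token interaction sketch (assumed form, not the paper's exact one).

    tokens: list of d-dim token embeddings for one utterance
    labels: list of d-dim emotion-label embeddings
    Each token attends over the labels (dot-product + softmax) and the
    attention-weighted label context is added back to the token.
    """
    enhanced = []
    for t in tokens:
        weights = softmax([dot(t, lab) for lab in labels])
        ctx = [sum(w * lab[i] for w, lab in zip(weights, labels))
               for i in range(len(t))]
        # Residual-style combination of token and label context.
        enhanced.append([a + b for a, b in zip(t, ctx)])
    return enhanced
```

In this sketch, tokens that align with a particular label embedding receive a context vector dominated by that label, which is one plausible way label information can highlight emotion-salient tokens before fusion.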
