Key-Sparse Transformer with Cascaded Cross-Attention Block for Multimodal Speech Emotion Recognition

06/22/2021
by Weidong Chen, et al.

Speech emotion recognition (SER) is a challenging and important research topic that plays a critical role in human-computer interaction. Multimodal inputs can improve performance because more emotional information is available for recognition. However, existing studies learn from all the information in a sample, while only a small portion of it relates to emotion. Moreover, under the multimodal framework, the interaction between different modalities is shallow and insufficient. In this paper, a key-sparse Transformer is proposed for efficient SER by focusing only on emotion-related information. Furthermore, a cascaded cross-attention block, specially designed for the multimodal framework, is introduced to achieve deep interaction between different modalities. The proposed method is evaluated on the IEMOCAP corpus, and the experimental results show that it outperforms state-of-the-art approaches.
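The abstract does not say how key sparsity is realized, so the following is only a rough illustration of the general idea of attending to a subset of keys, not the authors' method. It assumes a simple top-k rule: each query keeps its `k` highest-scoring keys and renormalizes the softmax over them. The function name `key_sparse_attention` and the parameter `k` are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def key_sparse_attention(queries, keys, values, k):
    """Single-head scaled dot-product attention restricted to the
    k highest-scoring keys per query (assumed top-k rule, for illustration).

    queries, keys, values are lists of equal-length float vectors;
    keys and values must have the same number of rows.
    """
    scale = math.sqrt(len(keys[0]))
    outputs = []
    for q in queries:
        # Scaled dot-product score of this query against every key.
        scores = [sum(qi * ki for qi, ki in zip(q, key)) / scale for key in keys]
        # Keep only the indices of the k largest scores; drop the rest.
        keep = sorted(range(len(keys)), key=lambda j: scores[j], reverse=True)[:k]
        # Renormalize attention weights over the retained keys only.
        weights = softmax([scores[j] for j in keep])
        out = [0.0] * len(values[0])
        for w, j in zip(weights, keep):
            for d, v in enumerate(values[j]):
                out[d] += w * v
        outputs.append(out)
    return outputs
```

With `k` equal to the number of keys this reduces to ordinary dense attention; smaller `k` discards low-scoring keys entirely, which is one plausible way to ignore frames or tokens that carry little emotional information.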

research
07/08/2020

Temporal aggregation of audio-visual modalities for emotion recognition

Emotion recognition has a pivotal role in affective computing and in hum...
research
06/23/2023

Cross-Language Speech Emotion Recognition Using Multimodal Dual Attention Transformers

Despite the recent progress in speech emotion recognition (SER), state-o...
research
01/17/2022

Group Gated Fusion on Attention-based Bidirectional Alignment for Multimodal Emotion Recognition

Emotion recognition is a challenging and actively-studied research area ...
research
03/03/2023

DWFormer: Dynamic Window transFormer for Speech Emotion Recognition

Speech emotion recognition is crucial to human-computer interaction. The...
research
07/08/2019

Attending to Emotional Narratives

Attention mechanisms in deep neural networks have achieved excellent per...
research
10/12/2017

Multimodal Observation and Interpretation of Subjects Engaged in Problem Solving

In this paper we present the first results of a pilot experiment in the ...
research
05/04/2023

Noise-Resistant Multimodal Transformer for Emotion Recognition

Multimodal emotion recognition identifies human emotions from various da...
