Learning Fine-Grained Multimodal Alignment for Speech Emotion Recognition

10/24/2020
by   Hang Li, et al.

Speech emotion recognition is challenging because emotional expression is complex, multimodal, and fine-grained. In this paper, we propose a novel multimodal deep learning approach to fine-grained emotion recognition from real-life speech. We design a temporal alignment pooling mechanism to capture the subtle, fine-grained emotions implied in each utterance. In addition, we propose a cross-modality excitation module that applies sample-specific activations to the acoustic embedding dimensions and adaptively recalibrates their values using latent semantic features. The proposed model is evaluated on two well-known real-world speech emotion recognition datasets. The results show that our approach outperforms a wide range of baselines in prediction accuracy on multimodal speech utterances. To encourage reproducibility, we make the code publicly available at https://github.com/hzlihang99/icassp2021_CME.git.
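
The abstract does not spell out the two mechanisms, so the following is only an illustrative sketch of one possible reading, not the authors' implementation. It assumes that acoustic frames are mean-pooled over word-aligned frame spans, and that the excitation step is a squeeze-and-excitation style sigmoid gate computed from the text features. The function names, the weight matrices `w1` and `w2`, and all dimensions are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def temporal_mean_pool(frames, word_spans):
    """Average acoustic frames inside each word's [start, end) frame span.

    Assumption: word-level alignment is given as frame index pairs.
    """
    return np.stack([frames[s:e].mean(axis=0) for s, e in word_spans])

def cross_modal_excitation(acoustic, semantic, w1, w2):
    """Rescale each acoustic dimension by a gate in (0, 1) derived from text.

    Assumption: a two-layer bottleneck (ReLU, then sigmoid) produces the gate.
    """
    hidden = np.maximum(semantic @ w1, 0.0)  # ReLU bottleneck on text features
    gate = sigmoid(hidden @ w2)              # one gate per word and acoustic dim
    return acoustic * gate                   # sample-specific recalibration

# Toy data: 10 acoustic frames (8 dims) aligned to 3 words (6 text dims).
rng = np.random.default_rng(0)
frames = rng.normal(size=(10, 8))
word_spans = [(0, 3), (3, 7), (7, 10)]
semantic = rng.normal(size=(3, 6))
w1 = rng.normal(size=(6, 4))  # hypothetical weights; random here, learned in practice
w2 = rng.normal(size=(4, 8))

pooled = temporal_mean_pool(frames, word_spans)
recalibrated = cross_modal_excitation(pooled, semantic, w1, w2)
```

Under these assumptions, `pooled` and `recalibrated` both hold one 8-dimensional acoustic vector per word, and the gate can only shrink the magnitude of each dimension, never amplify it. The released repository remains the reference for how the paper actually defines both components.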

research
09/06/2019

Learning Alignment for Multimodal Emotion Recognition from Speech

Speech emotion recognition is a challenging problem because human convey...
research
06/22/2023

Speech Emotion Diarization: Which Emotion Appears When?

Speech Emotion Recognition (SER) typically relies on utterance-level sol...
research
11/23/2018

Words Can Shift: Dynamically Adjusting Word Representations Using Nonverbal Behaviors

Humans convey their intentions through the usage of both verbal and nonv...
research
05/09/2023

Emolysis: A Multimodal Open-Source Group Emotion Analysis and Visualization Toolkit

Automatic group emotion recognition plays an important role in understan...
research
09/21/2022

Dynamic Time-Alignment of Dimensional Annotations of Emotion using Recurrent Neural Networks

Most automatic emotion recognition systems exploit time-continuous annot...
research
09/20/2022

Data-Centric AI Paradigm Based on Application-Driven Fine-grained Dataset Design

Deep learning has a wide range of applications in industrial scenario, b...
research
09/03/2020

Knowing What to Listen to: Early Attention for Deep Speech Representation Learning

Deep learning techniques have considerably improved speech processing in...
