Temporal aggregation of audio-visual modalities for emotion recognition

07/08/2020
by Andreea Birhala, et al.

Emotion recognition plays a pivotal role in affective computing and human-computer interaction. Current technological developments have increased the possibilities for collecting data about a person's emotional state. In general, human perception of the emotion conveyed by a subject is based on vocal and visual information gathered in the first seconds of interaction with the subject. As a consequence, integrating verbal (i.e., speech) and non-verbal (i.e., image) information seems to be the preferred choice in most current approaches to emotion recognition. In this paper, we propose a multimodal fusion technique for emotion recognition that combines audio-visual modalities from a temporal window, using a different temporal offset for each modality. We show that our proposed method outperforms other methods from the literature as well as human accuracy ratings. The experiments are conducted on the open-access multimodal dataset CREMA-D.
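To make the idea of per-modality temporal offsets concrete, the following is a minimal sketch and not the authors' implementation. It assumes that audio and video features have already been extracted as per-frame vectors at a common frame rate, that each window is aggregated by simple mean pooling, and that fusion is plain concatenation; the names `mean_pool`, `window`, and `fuse_with_offsets` are hypothetical, and the abstract does not specify any of these choices.

```python
from typing import List, Sequence

Vector = List[float]


def mean_pool(frames: Sequence[Sequence[float]]) -> Vector:
    """Average a non-empty sequence of equal-length feature vectors over time."""
    if not frames:
        raise ValueError("window is empty")
    dim = len(frames[0])
    return [sum(f[i] for f in frames) / len(frames) for i in range(dim)]


def window(frames: Sequence[Sequence[float]], start: int, length: int):
    """Return up to `length` frames beginning at `start`, clamped to the sequence bounds."""
    lo = max(0, start)
    hi = min(len(frames), start + length)
    return frames[lo:hi]


def fuse_with_offsets(
    audio: Sequence[Sequence[float]],
    video: Sequence[Sequence[float]],
    start: int,
    length: int,
    audio_offset: int = 0,
    video_offset: int = 0,
) -> Vector:
    """Pool each modality over its own shifted window, then concatenate.

    Assumption: mean pooling and concatenation stand in for whatever
    aggregation and fusion a real model would use.
    """
    pooled_audio = mean_pool(window(audio, start + audio_offset, length))
    pooled_video = mean_pool(window(video, start + video_offset, length))
    return pooled_audio + pooled_video


# Toy features: 10 frames, 1-dim audio and 2-dim video.
audio = [[float(t)] for t in range(10)]
video = [[float(t), float(-t)] for t in range(10)]
fused = fuse_with_offsets(audio, video, start=2, length=4, video_offset=3)
```

In this toy call the audio window covers frames 2 to 5 while the video window is shifted to frames 5 to 8, so the fused vector mixes information from different moments of the clip. In practice the offsets and window length would be tuned or learned rather than fixed by hand.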

research
11/20/2022

Contrastive Regularization for Multimodal Emotion Recognition Using Audio and Text

Speech emotion recognition is a challenge and an important step towards ...
research
06/22/2021

Key-Sparse Transformer with Cascaded Cross-Attention Block for Multimodal Speech Emotion Recognition

Speech emotion recognition is a challenging and important research topic...
research
07/28/2020

Variants of BERT, Random Forests and SVM approach for Multimodal Emotion-Target Sub-challenge

Emotion recognition has become a major problem in computer vision in rec...
research
08/12/2018

Multimodal Local-Global Ranking Fusion for Emotion Recognition

Emotion recognition is a core research area at the intersection of artif...
research
04/28/2023

SGED: A Benchmark dataset for Performance Evaluation of Spiking Gesture Emotion Recognition

In the field of affective computing, researchers in the community have p...
research
07/23/2022

Multimodal Emotion Recognition with Modality-Pairwise Unsupervised Contrastive Loss

Emotion recognition is involved in several real-world applications. With...
research
05/12/2023

Versatile Audio-Visual Learning for Handling Single and Multi Modalities in Emotion Regression and Classification Tasks

Most current audio-visual emotion recognition models lack the flexibilit...
