TACOformer: Token-channel compounded Cross Attention for Multimodal Emotion Recognition

06/23/2023
by Xinda Li, et al.

Emotion recognition based on physiological signals has recently become an intensively researched field. Because multi-modal, multi-channel physiological signals are complementary, using them together has significantly improved the performance of emotion recognition systems. However, effectively integrating emotion-related semantic information from different modalities and capturing inter-modal dependencies remains challenging. Many existing multimodal fusion methods ignore either the token-to-token or the channel-to-channel correlations of multichannel signals from different modalities, which limits the classification capability of the models. In this paper, we propose a comprehensive perspective on multimodal fusion that integrates channel-level and token-level cross-modal interactions. Specifically, we introduce a unified cross attention module called Token-chAnnel COmpound (TACO) Cross Attention to perform multimodal fusion, which simultaneously models channel-level and token-level dependencies between modalities. Additionally, we propose a 2D position encoding method to preserve information about the spatial distribution of EEG signal channels. We also place two transformer encoders ahead of the fusion module to capture long-term temporal dependencies from the EEG signal and the peripheral physiological signal, respectively. Subject-independent experiments on the emotion datasets DEAP and DREAMER demonstrate that the proposed model achieves state-of-the-art performance.
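To make the idea of modeling token-level and channel-level dependencies together more concrete, the following is a minimal NumPy sketch of one possible way to combine the two. It is an illustration under stated assumptions, not the formulation used in the paper: the abstract does not give the equations, so the function name `compound_cross_attention`, the projection matrices `w_q` and `w_k`, the scaling factors, and the way the two attention maps are composed are all hypothetical. The sketch also assumes both modalities have already been cut into the same number of tokens.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def compound_cross_attention(x_q, x_kv, w_q, w_k):
    """Hypothetical token-channel compound cross attention.

    x_q  : (T, C_q)  tokens x channels of the query modality (e.g. EEG)
    x_kv : (T, C_kv) tokens x channels of the other modality
    w_q  : (C_q, d)  assumed projection for queries
    w_k  : (C_kv, d) assumed projection for keys
    Returns the fused features (T, C_q) and both attention maps.
    """
    t = x_q.shape[0]
    d = w_q.shape[1]

    # Token-level attention: how strongly each query token attends
    # to each token of the other modality. Shape (T, T).
    token_attn = softmax((x_q @ w_q) @ (x_kv @ w_k).T / np.sqrt(d), axis=-1)

    # Channel-level attention: how strongly each query channel attends
    # to each channel of the other modality. Shape (C_q, C_kv).
    channel_attn = softmax(x_q.T @ x_kv / np.sqrt(t), axis=-1)

    # Assumed composition: mix the other modality along the token axis,
    # then along the channel axis. Shape (T, C_q).
    fused = token_attn @ x_kv @ channel_attn.T
    return fused, token_attn, channel_attn


# Illustrative shapes only: 6 tokens, 32 query channels, 8 key/value channels.
rng = np.random.default_rng(0)
x_eeg = rng.standard_normal((6, 32))
x_per = rng.standard_normal((6, 8))
w_q = rng.standard_normal((32, 16))
w_k = rng.standard_normal((8, 16))

fused, token_attn, channel_attn = compound_cross_attention(x_eeg, x_per, w_q, w_k)
print(fused.shape, token_attn.shape, channel_attn.shape)
```

In this sketch the fused output keeps the shape of the query modality, so it could be added back to the query features as a residual. Whether the actual model does this is not stated in the abstract.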

