Hybrid Multimodal Feature Extraction, Mining and Fusion for Sentiment Analysis

08/05/2022
by Jia Li, et al.

In this paper, we present our solutions for the Multimodal Sentiment Analysis Challenge (MuSe) 2022, which comprises the MuSe-Humor, MuSe-Reaction and MuSe-Stress sub-challenges. MuSe 2022 focuses on humor detection, emotional reactions and multimodal emotional stress, using different modalities and data sets. In our work, several kinds of multimodal features are extracted, including acoustic, visual, text and biological features. These features are fused by two frameworks: TEMMA and a GRU with a self-attention mechanism. Our contributions are threefold: 1) several new audio features, facial expression features and paragraph-level text embeddings are extracted to improve accuracy; 2) the accuracy and reliability of multimodal sentiment prediction are substantially improved by mining and blending the multimodal features; 3) effective data augmentation strategies are applied in model training to alleviate sample imbalance and to prevent the model from learning biased subject characteristics. For the MuSe-Humor sub-challenge, our model obtains an AUC score of 0.8932. For the MuSe-Reaction sub-challenge, our approach achieves a Pearson's correlation coefficient of 0.3879 on the test set, which outperforms all other participants. For the MuSe-Stress sub-challenge, our approach outperforms the baseline in both arousal and valence on the test set, reaching a final combined result of 0.5151.
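
The abstract reports results as an AUC score and a Pearson's correlation coefficient. As a rough illustration of what these two metrics measure, here is a minimal pure-Python sketch. The function names `pearson` and `roc_auc` and the toy inputs are assumptions made for this example; this is not the challenge's official evaluation code, and the exact aggregation used for each sub-challenge is not described here.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of centered products and squares (the 1/n factors cancel).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def roc_auc(labels, scores):
    """Area under the ROC curve for binary labels (1 = positive, 0 = negative).

    Computed as the fraction of (positive, negative) pairs in which the
    positive sample receives the higher score; ties count as half.
    This is O(P * N), which is fine for a sketch but slow for large sets.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy values chosen only for illustration, not taken from the paper.
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
    print(pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

In this reading, an AUC of 0.8932 would mean that a randomly chosen humorous segment is scored above a randomly chosen non-humorous one about 89% of the time, and a Pearson's correlation of 0.3879 indicates a moderate positive linear relationship between predicted and annotated reaction intensities.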


