Explainable Multimodal Emotion Reasoning

06/27/2023
by Zheng Lian, et al.

Multimodal emotion recognition is an active research topic in artificial intelligence. Its primary objective is to integrate multiple modalities (such as acoustic, visual, and lexical cues) to identify human emotional states. Current works generally assume accurate emotion labels for benchmark datasets and focus on developing more effective architectures. However, due to the inherent subjectivity of emotions, existing datasets often lack high annotation consistency, resulting in potentially inaccurate labels. Consequently, models built on these datasets may struggle to meet the demands of practical applications. To address this issue, it is crucial to enhance the reliability of emotion annotations. In this paper, we propose a novel task called “Explainable Multimodal Emotion Reasoning (EMER)”. In contrast to previous works that primarily focus on predicting emotions, EMER takes a step further by providing explanations for these predictions: a prediction is considered correct as long as the reasoning process behind the predicted emotion is plausible. This paper presents our initial efforts on EMER, where we introduce a benchmark dataset, establish baseline models, and define evaluation metrics. We also observe that EMER requires integrating multi-faceted capabilities, and therefore propose the first multimodal large language model (LLM) for affective computing, called AffectGPT. We aim to tackle the long-standing challenge of label ambiguity and chart a path toward more reliable emotion recognition techniques. Furthermore, EMER offers an opportunity to evaluate the audio-video-text understanding capabilities of recent multimodal LLMs. To facilitate further research, we make the code and data available at: https://github.com/zeroQiaoba/AffectGPT.
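
As a rough illustration of the evaluation idea behind EMER (crediting a prediction whenever the reasoning plausibly supports a correct emotion), the minimal sketch below extracts emotion words from a model's free-form reasoning and measures their overlap with the annotated emotion set. The toy lexicon, the keyword-based extraction, and the set-overlap score are illustrative assumptions for this sketch, not the metric or code released with the paper.

```python
# Hypothetical sketch, not the authors' released evaluation code.
from typing import Set

# Toy emotion lexicon used only for illustration; a real system would map
# synonyms (e.g., "furious" -> "angry") with a richer resource or an LLM.
EMOTION_LEXICON = {
    "happy", "sad", "angry", "surprised", "fearful", "disgusted", "neutral",
}

def extract_emotions(reasoning: str) -> Set[str]:
    """Collect lexicon emotions mentioned in a free-form reasoning string."""
    tokens = {tok.strip(".,!?\"'").lower() for tok in reasoning.split()}
    return tokens & EMOTION_LEXICON

def overlap_score(predicted: Set[str], annotated: Set[str]) -> float:
    """Set-overlap (Jaccard) score between predicted and annotated emotions."""
    if not predicted or not annotated:
        return 0.0
    return len(predicted & annotated) / len(predicted | annotated)

if __name__ == "__main__":
    reasoning = (
        "The speaker raises her voice, frowns, and uses sarcastic wording, "
        "so she is most likely angry."
    )
    predicted = extract_emotions(reasoning)  # -> {"angry"}
    annotated = {"angry"}                    # ground-truth annotation
    print(predicted, overlap_score(predicted, annotated))
```

Under this scoring scheme, any reasoning chain that names a correct emotion receives credit, which reflects the paper's shift from exact-label matching toward plausibility of the explanation.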

