Multimodal Utterance-level Affect Analysis using Visual, Audio and Text Features

05/02/2018
by   Didan Deng, et al.
Affective computing models are essential for human behavior analysis. A promising trend in affective systems is to enhance recognition performance by analyzing contextual information over time and across modalities. To overcome the limitations of instantaneous emotion recognition, the 2018 IJCNN challenge on One-Minute Gradual-Emotion Recognition (OMG-Emotion) encourages participants to address long-term emotion recognition using multimodal data such as facial expressions, audio and language context. Compared with the single-modality models of the baseline method, a multi-modal inference network can leverage the information from each modality and their correlations to improve recognition performance. In this paper, we propose a multi-modal architecture that uses facial, audio and language context features to recognize human sentiment from utterances. Our model outperforms the provided unimodal baseline, achieving concordance correlation coefficients (CCC) of 0.400 on the arousal task and 0.353 on the valence task.
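The concordance correlation coefficient used to score both tasks measures agreement between predicted and ground-truth affect values, penalizing both scale and location shifts. A minimal sketch of how CCC is typically computed (the function name `ccc` and the toy inputs are illustrative, not from the paper):

```python
import numpy as np

def ccc(y_true, y_pred):
    """Concordance correlation coefficient between two 1-D sequences.

    CCC = 2 * cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))^2)
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mean_t, mean_p = y_true.mean(), y_pred.mean()
    var_t, var_p = y_true.var(), y_pred.var()
    # Population covariance between the two sequences
    cov = np.mean((y_true - mean_t) * (y_pred - mean_p))
    return 2.0 * cov / (var_t + var_p + (mean_t - mean_p) ** 2)

# Perfect agreement yields CCC = 1.0; a constant offset lowers it.
print(ccc([0.1, 0.5, 0.9], [0.1, 0.5, 0.9]))  # → 1.0
```

Unlike Pearson correlation, CCC drops below 1 for predictions that are merely linearly related to the targets, which is why it is the standard metric for continuous arousal/valence estimation.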
