Learning Emotion Representations from Verbal and Nonverbal Communication

05/22/2023
by   Sitao Zhang, et al.
Emotion understanding is an essential but highly challenging component of artificial general intelligence. The absence of extensively annotated datasets has significantly impeded progress in this field. We present EmotionCLIP, the first pre-training paradigm to extract visual emotion representations from verbal and nonverbal communication using only uncurated data. Unlike the numerical labels or descriptions used in previous methods, communication naturally contains emotion information, and acquiring emotion representations from communication is more congruent with how humans learn. We guide EmotionCLIP to attend to nonverbal emotion cues through subject-aware context encoding and to verbal emotion cues through sentiment-guided contrastive learning. Extensive experiments validate the effectiveness and transferability of EmotionCLIP. Under only a linear-probe evaluation protocol, EmotionCLIP outperforms state-of-the-art supervised visual emotion recognition methods and rivals many multimodal approaches across various benchmarks. We anticipate that EmotionCLIP will help address the prevailing issue of data scarcity in emotion understanding, thereby fostering progress in related domains. The code and pre-trained models are available at https://github.com/Xeaver/EmotionCLIP.
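The abstract does not spell out the sentiment-guided contrastive objective; as a rough illustration, a CLIP-style symmetric contrastive (InfoNCE) loss with an added sentiment-aware mask might look like the sketch below. The function name, the rule of masking out same-sentiment negatives, and all parameters are assumptions for illustration, not the paper's actual formulation.

```python
import numpy as np

def info_nce_loss(video_emb, text_emb, sentiments=None, temperature=0.07):
    """Symmetric CLIP-style contrastive loss over paired embeddings.

    video_emb, text_emb: (N, D) arrays; row i of each is a positive pair.
    sentiments: optional (N,) label array. When given, negatives that share
    the anchor's sentiment label are removed from the denominator (one
    hypothetical reading of "sentiment-guided" contrastive learning).
    """
    # L2-normalize so the dot product is cosine similarity
    v = video_emb / np.linalg.norm(video_emb, axis=1, keepdims=True)
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    logits = v @ t.T / temperature  # (N, N); diagonal holds positive pairs
    n = logits.shape[0]

    if sentiments is not None:
        # Mask same-sentiment off-diagonal entries so they stop acting
        # as negatives (exp(-inf) = 0 drops them from the softmax sum).
        same = sentiments[:, None] == sentiments[None, :]
        mask = same & ~np.eye(n, dtype=bool)
        logits = np.where(mask, -np.inf, logits)

    def cross_entropy(l):
        # Row-wise softmax cross-entropy with the diagonal as the target
        l = l - l.max(axis=1, keepdims=True)
        log_prob = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_prob))

    # Average the video-to-text and text-to-video directions
    return 0.5 * (cross_entropy(logits) + cross_entropy(logits.T))
```

In this sketch, masking keeps the model from being penalized for placing clips with the same sentiment polarity near each other, while still pulling each clip toward its own paired utterance.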

Related research

10/27/2021 · MEmoBERT: Pre-training Model with Prompt-based Learning for Multimodal Emotion Recognition
Multimodal emotion recognition study is hindered by the lack of labelled...

01/30/2021 · LSSED: a large-scale dataset and benchmark for speech emotion recognition
Speech emotion recognition is a vital contributor to the next generation...

06/22/2021 · Exemplars-guided Empathetic Response Generation Controlled by the Elements of Human Communication
The majority of existing methods for empathetic response generation rely...

05/02/2018 · A Deep Network for Arousal-Valence Emotion Prediction with Acoustic-Visual Cues
In this paper, we comprehensively describe the methodology of our submis...

08/06/2023 · StyleEDL: Style-Guided High-order Attention Network for Image Emotion Distribution Learning
Emotion distribution learning has gained increasing attention with the t...

06/12/2023 · A Weakly Supervised Approach to Emotion-change Prediction and Improved Mood Inference
Whilst a majority of affective computing research focuses on inferring e...

11/07/2021 · Global-Local Attention for Emotion Recognition
Human emotion recognition is an active research area in artificial intel...
