Exploring the Contextual Dynamics of Multimodal Emotion Recognition in Videos

by Prasanta Bhattacharya et al.

Emotional expressions form a key part of user behavior on today's digital platforms. While multimodal emotion recognition techniques are gaining research attention, little is yet understood about how visual and non-visual features help recognize emotions in certain contexts but not in others. This study analyzes the interplay between multimodal emotion features derived from facial expressions, tone, and text and two key contextual factors: 1) the gender of the speaker, and 2) the duration of the emotional episode. Using a large dataset of more than 2,500 manually annotated YouTube videos, we found that while multimodal features consistently outperformed bimodal and unimodal features, their performance varied significantly across emotions and across gender and duration contexts. Multimodal features performed notably better for male than for female speakers in recognizing most emotions, with the exception of fear. Furthermore, multimodal features performed notably better for shorter than for longer videos in recognizing neutral, happiness, and surprise, but not sadness, anger, disgust, or fear. These findings offer new insights toward the development of more context-aware emotion recognition and empathetic systems.
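The fusion of facial, tonal, and textual features described above can be illustrated with a minimal early-fusion sketch: per-modality feature vectors are concatenated into a single vector before classification. All names, dimensionalities, and the toy nearest-centroid classifier below are illustrative assumptions, not the paper's actual pipeline or features.

```python
import numpy as np

# Deterministic toy data; all modality dimensionalities are assumptions.
rng = np.random.default_rng(0)

def fuse(face, audio, text):
    """Early fusion: concatenate per-modality features into one vector."""
    return np.concatenate([face, audio, text])

def make_clip(shift):
    """Simulate one video clip's features (hypothetical, for illustration)."""
    return fuse(rng.normal(shift, 1, 8),   # facial-expression features
                rng.normal(shift, 1, 4),   # tone/prosody features
                rng.normal(shift, 1, 6))   # text/transcript features

# Toy training set: two emotion classes, 20 clips each.
X = np.stack([make_clip(0.0) for _ in range(20)] +
             [make_clip(2.0) for _ in range(20)])
y = np.array([0] * 20 + [1] * 20)

# Simple nearest-centroid classifier over the fused vectors.
centroids = np.stack([X[y == c].mean(axis=0) for c in (0, 1)])

def predict(clip):
    """Return the index of the closest class centroid."""
    return int(np.argmin(np.linalg.norm(centroids - clip, axis=1)))
```

In practice, each modality would come from a dedicated extractor (e.g., facial action units, prosodic statistics, text embeddings), and the classifier would be far stronger; the point here is only that a fused vector carries all three modalities jointly, which is what the unimodal/bimodal/multimodal comparison in the study varies.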



Related papers:

- Controlling for Confounders in Multimodal Emotion Classification via Adversarial Learning
- EmoCo: Visual Analysis of Emotion Coherence in Presentation Videos
- CEFER: A Four Facets Framework based on Context and Emotion embedded features for Implicit and Explicit Emotion Recognition
- Multimodal Emotion Recognition among Couples from Lab Settings to Daily Life using Smartwatches
- K-EmoCon, a multimodal sensor dataset for continuous emotion recognition in naturalistic conversations
- Whose Emotion Matters? Speaker Detection without Prior Knowledge
