M3ER: Multiplicative Multimodal Emotion Recognition Using Facial, Textual, and Speech Cues

11/09/2019
by Trisha Mittal, et al.

We present M3ER, a learning-based method for emotion recognition from multiple input modalities. Our approach combines cues from multiple co-occurring modalities (such as face, text, and speech) and is more robust than other methods to sensor noise in any individual modality. M3ER uses a novel, data-driven multiplicative fusion method to combine the modalities, which learns to emphasize the more reliable cues and suppress the others on a per-sample basis. M3ER gains its robustness to sensor noise from a check step that uses Canonical Correlation Analysis to differentiate between ineffective and effective modalities. M3ER also generates proxy features in place of the ineffectual modalities. We demonstrate the efficiency of our network through experiments on two benchmark datasets, IEMOCAP and CMU-MOSEI. We report a mean accuracy of 82.7 CMU-MOSEI, which, collectively, is an improvement of about 5
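
To make the idea of per-sample multiplicative fusion concrete, the following is a minimal sketch and not the M3ER implementation. It assumes one possible multiplicative weighting: each modality's cross-entropy term is scaled by how unsure the other modalities are about the correct class, so a weak modality is suppressed whenever another modality is confident. The function name `multiplicative_fusion_loss` and the parameter `beta` are hypothetical names chosen for illustration.

```python
import numpy as np


def multiplicative_fusion_loss(probs_true, beta=2.0):
    """Per-sample multiplicative combination of modality losses (illustrative).

    probs_true: array of shape (M, N); entry [i, n] is the probability that
        modality i assigns to the correct class of sample n.
    beta: assumed hyperparameter controlling how strongly a modality is
        down-weighted when the other modalities are confident.
    Returns an array of shape (N,) with one fused loss value per sample.
    """
    p = np.clip(np.asarray(probs_true, dtype=float), 1e-8, 1.0 - 1e-8)
    num_modalities = p.shape[0]
    if num_modalities < 2:
        raise ValueError("need at least two modalities to fuse")

    # log(1 - p_j) for every modality j and sample n.
    log_miss = np.log1p(-p)
    # For each modality i, sum log(1 - p_j) over all other modalities j != i.
    log_miss_others = log_miss.sum(axis=0, keepdims=True) - log_miss
    # Weight of modality i: prod_{j != i} (1 - p_j) ** (beta / (M - 1)).
    weights = np.exp(beta / (num_modalities - 1) * log_miss_others)

    # Weighted cross-entropy terms, summed over modalities.
    return (-weights * np.log(p)).sum(axis=0)


# Two modalities, one sample: the first is confident, the second is not.
probs = np.array([[0.9], [0.2]])
fused = multiplicative_fusion_loss(probs, beta=2.0)
plain = -np.log(probs).sum(axis=0)  # unweighted sum of cross-entropies
```

In this example the unreliable modality contributes little to `fused` because the confident modality drives its weight toward zero, so `fused` is much smaller than `plain`. The Canonical Correlation Analysis check and proxy-feature generation mentioned in the abstract are separate steps and are not modeled here.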

research
09/09/2020

Multi-modal Attention for Speech Emotion Recognition

Emotion represents an essential aspect of human speech that is manifeste...
research
05/17/2020

Impact of multiple modalities on emotion recognition: investigation into 3d facial landmarks, action units, and physiological data

To fully understand the complexities of human emotion, the integration o...
research
11/10/2019

Dynamic Fusion for Multimodal Data

Effective fusion of data from multiple modalities, such as video, speech...
research
03/14/2020

EmotiCon: Context-Aware Multimodal Emotion Recognition using Frege's Principle

We present EmotiCon, a learning-based algorithm for context-aware percei...
research
05/03/2018

Dimensional emotion recognition using visual and textual cues

This paper addresses the problem of automatic emotion recognition in the...
research
08/24/2022

Hybrid Fusion Based Interpretable Multimodal Emotion Recognition with Insufficient Labelled Data

This paper proposes a multimodal emotion recognition system, VIsual Spok...
research
08/06/2020

Learnable Graph Inception Network for Emotion Recognition

Analyzing emotion from verbal and non-verbal behavioral cues is critical...
