Deep Fusion: An Attention Guided Factorized Bilinear Pooling for Audio-video Emotion Recognition

01/15/2019
by   Yuanyuan Zhang, et al.

Automatic emotion recognition (AER) is a challenging task due to the abstract concept and multiple expressions of emotion. Although there is no consensus on a definition, human emotional states can usually be perceived by the auditory and visual systems. Inspired by this cognitive process in human beings, it is natural to utilize audio and visual information simultaneously in AER. However, most traditional fusion approaches, such as feature concatenation and multi-system fusion, only build a linear paradigm, which hardly captures the complex association between audio and video. In this paper, we introduce factorized bilinear pooling (FBP) to deeply integrate the features of audio and video. Specifically, the features are selected through the embedded attention mechanism from the respective modalities to obtain the emotion-related regions. The whole pipeline can be completed in a single neural network. Validated on the AFEW database of the audio-video sub-challenge in EmotiW2018, the proposed approach achieves an accuracy of 62.48%.
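The abstract does not spell out the fusion formula, so the following is only a minimal sketch of how attention-weighted features from two modalities might be combined with factorized bilinear pooling. It assumes the commonly used low-rank formulation (project each modality, multiply element-wise, sum-pool over windows of size k, then apply signed square root and L2 normalization). All names (`attention_pool`, `factorized_bilinear_pooling`, `U`, `V`, `k`) are hypothetical, and the attention scores are taken as given rather than learned.

```python
import math


def attention_pool(frames, scores):
    """Collapse a sequence of frame features into one vector.

    Weights are a softmax over the (assumed precomputed) attention scores.
    """
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # shift for numerical stability
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(frames[0])
    return [sum(w * f[d] for w, f in zip(weights, frames)) for d in range(dim)]


def project(matrix, vector):
    """Compute matrix^T @ vector, with matrix stored as a list of rows."""
    cols = len(matrix[0])
    return [sum(matrix[i][j] * vector[i] for i in range(len(vector)))
            for j in range(cols)]


def factorized_bilinear_pooling(audio, video, U, V, k):
    """Fuse two feature vectors into one vector of length o.

    U has shape (len(audio), k * o) and V has shape (len(video), k * o);
    both are hypothetical projection matrices that would normally be learned.
    """
    joint = [a * v for a, v in zip(project(U, audio), project(V, video))]
    o = len(joint) // k
    # Sum pooling over non-overlapping windows of size k.
    pooled = [sum(joint[i * k:(i + 1) * k]) for i in range(o)]
    # Power normalization (signed square root), then L2 normalization.
    powered = [math.copysign(math.sqrt(abs(z)), z) for z in pooled]
    norm = math.sqrt(sum(z * z for z in powered)) or 1.0
    return [z / norm for z in powered]


# Toy usage: two audio frames attended equally, one video feature.
audio_vec = attention_pool([[2.0, 2.0], [0.0, 2.0]], [0.0, 0.0])
U = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]
V = [[1.0, 1.0, 1.0, 1.0]]
fused = factorized_bilinear_pooling(audio_vec, [3.0], U, V, k=2)
```

Unlike concatenation, every output dimension here depends on products of audio and video terms, which is the kind of interaction a purely linear fusion cannot express.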


research
11/17/2021

Information Fusion in Attention Networks Using Adaptive and Multi-level Factorized Bilinear Pooling for Audio-visual Emotion Recognition

Multimodal emotion recognition is a challenging task in emotion computin...
research
12/27/2020

Exploring Emotion Features and Fusion Strategies for Audio-Video Emotion Recognition

The audio-video based emotion recognition aims to classify a given video...
research
03/28/2016

Audio Visual Emotion Recognition with Temporal Alignment and Perception Attention

This paper focuses on two key problems for audio-visual emotion recognit...
research
03/03/2022

Attention-based Region of Interest (ROI) Detection for Speech Emotion Recognition

Automatic emotion recognition for real-life applications is a challengi...
research
08/06/2020

Learnable Graph Inception Network for Emotion Recognition

Analyzing emotion from verbal and non-verbal behavioral cues is critical...
research
06/25/2019

Emotion Recognition Using Fusion of Audio and Video Features

In this paper we propose a fusion approach to continuous emotion recogni...
research
11/29/2019

Attentive Modality Hopping Mechanism for Speech Emotion Recognition

In this work, we explore the impact of visual modality in addition to sp...
