Audio-video Emotion Recognition in the Wild using Deep Hybrid Networks

02/20/2020
by Xin Guo, et al.

This paper presents a hybrid network for audio-visual emotion recognition. While most previous work focuses on either deep models or hand-engineered features extracted from images, we explore multiple deep models built on both images and audio signals. Specifically, in addition to convolutional neural networks (CNN) and recurrent neural networks (RNN) trained on facial images, the hybrid network also contains one SVM classifier trained on holistic acoustic feature vectors, one long short-term memory network (LSTM) trained on short-term feature sequences extracted from segmented audio clips, and one Inception(v2)-LSTM network trained on image-like maps built from short-term acoustic feature sequences. Experimental results show that the proposed hybrid network outperforms the baseline method by a large margin.
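The abstract does not say how the image-like maps are built from the short-term acoustic feature sequences. As a rough illustration only, the sketch below groups consecutive frame-level feature vectors into fixed-size 2-D blocks (frames by feature dimensions). The function name `frames_to_maps`, the parameter `frames_per_map`, and the toy values are all hypothetical and are not taken from the paper.

```python
def frames_to_maps(frames, frames_per_map):
    """Group a sequence of frame-level feature vectors into 2-D maps.

    frames: list of equal-length feature vectors, one per short-term frame.
    frames_per_map: number of consecutive frames stacked into one map
                    (an assumed parameter, not a value from the paper).
    Returns a list of maps, each a list of `frames_per_map` feature vectors.
    Trailing frames that do not fill a whole map are dropped.
    """
    if frames_per_map <= 0:
        raise ValueError("frames_per_map must be positive")
    n_maps = len(frames) // frames_per_map
    return [
        [list(frame) for frame in frames[i * frames_per_map:(i + 1) * frames_per_map]]
        for i in range(n_maps)
    ]


# Toy sequence: 10 frames, each with 4 made-up feature values.
sequence = [[float(t + d) for d in range(4)] for t in range(10)]
maps = frames_to_maps(sequence, frames_per_map=3)
print(len(maps), len(maps[0]), len(maps[0][0]))  # 3 3 4
```

Each resulting map could in principle be passed to an image model and the sequence of maps to a recurrent model, which is the general shape the abstract describes for the Inception(v2)-LSTM branch. The actual features, map size, and any overlap between segments would have to come from the full text.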


research
04/08/2019

Direct Modelling of Speech Emotion from Raw Speech

Speech emotion recognition is a challenging task and heavily depends on ...
research
11/20/2018

Utterance-Based Audio Sentiment Analysis Learned by a Parallel Combination of CNN and LSTM

Audio Sentiment Analysis is a popular research area which extends the co...
research
03/28/2016

Audio Visual Emotion Recognition with Temporal Alignment and Perception Attention

This paper focuses on two key problems for audio-visual emotion recognit...
research
11/29/2016

Learning Filter Banks Using Deep Learning For Acoustic Signals

Designing appropriate features for acoustic event recognition tasks is a...
research
06/14/2017

Modeling Multimodal Clues in a Hybrid Deep Learning Framework for Video Classification

Videos are inherently multimodal. This paper studies the problem of how ...
research
06/24/2022

An Intensity and Phase Stacked Analysis of Phase-OTDR System using Deep Transfer Learning and Recurrent Neural Networks

Distributed acoustic sensors (DAS) are effective apparatus which are wid...
research
12/26/2021

Novel Dual-Channel Long Short-Term Memory Compressed Capsule Networks for Emotion Recognition

Recent analysis on speech emotion recognition has made considerable adva...
