Spatiotemporal Networks for Video Emotion Recognition

04/03/2017
by Lijie Fan, et al.

Our experiment adapts several popular deep learning methods as well as some traditional methods to the problem of video emotion recognition. We use a CNN-LSTM architecture for visual feature extraction and classification, and traditional methods for audio feature classification. For multimodal fusion, we use a traditional Support Vector Machine. Our experiment yields good results on the AFEW 6.0 dataset.
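The abstract outlines a pipeline of a CNN-LSTM visual branch combined with audio features through an SVM. Below is a minimal, hypothetical sketch of such a pipeline in PyTorch and scikit-learn, not the authors' implementation: the ResNet-18 backbone, the 128-dimensional LSTM state, the 40-dimensional audio descriptors, and the 7 emotion classes are illustrative assumptions.

```python
# Sketch only: CNN-LSTM visual branch plus SVM fusion of visual and audio features.
import torch
import torch.nn as nn
import torchvision.models as models
import numpy as np
from sklearn.svm import SVC

class CNNLSTM(nn.Module):
    """Per-frame CNN features summarized over time by an LSTM."""
    def __init__(self, hidden_size=128, num_classes=7):
        super().__init__()
        backbone = models.resnet18(weights=None)   # any frame-level CNN could stand in here
        backbone.fc = nn.Identity()                # keep the 512-d frame features
        self.cnn = backbone
        self.lstm = nn.LSTM(512, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, clips):                      # clips: (B, T, 3, 224, 224)
        b, t = clips.shape[:2]
        feats = self.cnn(clips.flatten(0, 1)).view(b, t, -1)
        _, (h, _) = self.lstm(feats)               # last hidden state summarizes the clip
        return self.fc(h[-1]), h[-1]               # class scores and clip embedding

# Late fusion with a traditional SVM: concatenate the visual clip embedding
# with a placeholder audio feature vector and classify the pair.
model = CNNLSTM()
clips = torch.randn(4, 16, 3, 224, 224)            # dummy batch: 4 clips of 16 frames
with torch.no_grad():
    _, visual_emb = model(clips)
audio_feats = np.random.randn(4, 40)               # hypothetical audio descriptors
fused = np.hstack([visual_emb.numpy(), audio_feats])
labels = np.array([0, 1, 2, 3])                    # dummy emotion labels
svm = SVC(kernel="rbf").fit(fused, labels)
print(svm.predict(fused))
```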

