Emotion Recognition from Speech

12/22/2019
by Kannan Venkataramanan, et al.

In this work, we conduct an extensive comparison of various approaches to speech-based emotion recognition systems. The analyses were carried out on audio recordings from the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS). After pre-processing the raw audio files, features such as the Log-Mel Spectrogram, Mel-Frequency Cepstral Coefficients (MFCCs), pitch and energy were considered. The significance of these features for emotion classification was compared by applying methods such as Long Short-Term Memory (LSTM) networks, Convolutional Neural Networks (CNNs), Hidden Markov Models (HMMs) and Deep Neural Networks (DNNs). On the 14-class (2 genders x 7 emotions) classification task, an accuracy of 68% was achieved with a 4-layer 2-dimensional CNN using the Log-Mel Spectrogram features. We also observe that, in emotion recognition, the choice of audio features impacts the results much more than the model complexity.
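
As a small illustration of the 14-class setup (2 genders x 7 emotions), the sketch below shows one way a gender label and an emotion label could be combined into a single class index. It is not taken from the paper: the emotion names, their ordering, and the helper names `encode` and `decode` are assumptions made only for this example.

```python
from itertools import product
from typing import Tuple

# Assumed label sets. The abstract only states "2 genders x 7 emotions";
# the specific emotion names and their order are illustrative placeholders.
GENDERS = ("female", "male")
EMOTIONS = ("neutral", "happy", "sad", "angry", "fearful", "disgust", "surprised")

# Enumerate every (gender, emotion) pair and assign it a class index 0..13.
CLASS_INDEX = {pair: i for i, pair in enumerate(product(GENDERS, EMOTIONS))}


def encode(gender: str, emotion: str) -> int:
    """Map a (gender, emotion) pair to a single class id (hypothetical helper)."""
    return CLASS_INDEX[(gender, emotion)]


def decode(class_id: int) -> Tuple[str, str]:
    """Invert encode(): recover (gender, emotion) from a class id."""
    gender_id, emotion_id = divmod(class_id, len(EMOTIONS))
    return GENDERS[gender_id], EMOTIONS[emotion_id]
```

With this encoding, a classifier with a 14-way output layer predicts gender and emotion jointly, and `decode` splits a predicted class id back into its two parts.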

research
06/23/2018

Evaluating Gammatone Frequency Cepstral Coefficients with Neural Networks for Emotion Recognition from Speech

Current approaches to speech emotion recognition focus on speech feature...
research
04/08/2019

Direct Modelling of Speech Emotion from Raw Speech

Speech emotion recognition is a challenging task and heavily depends on ...
research
12/26/2021

Novel Dual-Channel Long Short-Term Memory Compressed Capsule Networks for Emotion Recognition

Recent analysis on speech emotion recognition has made considerable adva...
research
04/27/2017

End-to-End Multimodal Emotion Recognition using Deep Neural Networks

Automatic affect recognition is a challenging task due to the various mo...
research
07/06/2023

Evaluating raw waveforms with deep learning frameworks for speech emotion recognition

Speech emotion recognition is a challenging task in speech processing fi...
research
08/14/2017

Learning spectro-temporal features with 3D CNNs for speech emotion recognition

In this paper, we propose to use deep 3-dimensional convolutional networ...
research
12/23/2019

Learning Transferable Features for Speech Emotion Recognition

Emotion recognition from speech is one of the key steps towards emotiona...
