Deep Learning based Emotion Recognition System Using Speech Features and Transcriptions

06/11/2019
by   Suraj Tripathi, et al.

This paper proposes a speech emotion recognition method based on speech features and speech transcriptions (text). Speech features such as spectrograms and Mel-frequency Cepstral Coefficients (MFCC) help retain emotion-related low-level characteristics of speech, whereas text helps capture semantic meaning; the two contribute to different aspects of emotion detection. We experimented with several Deep Neural Network (DNN) architectures that take different combinations of speech features and text as inputs. The proposed network architectures achieve higher accuracy than state-of-the-art methods on a benchmark dataset. The combined MFCC-Text Convolutional Neural Network (CNN) model proved to be the most accurate at recognizing emotions in IEMOCAP data.
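The abstract does not say how the speech and text branches are combined, so the following is only a minimal sketch of one common option: concatenating a speech feature vector with a text feature vector and classifying the result with a single linear layer and softmax. The label set, the vector sizes, the random weights, and the function names (`dense`, `softmax`, `fuse_and_classify`) are all illustrative assumptions, not details from the paper; in a real system the two input vectors would come from trained CNN branches over MFCCs and transcriptions.

```python
import math
import random

# Assumed label set for illustration only; the abstract does not list the classes.
EMOTIONS = ["angry", "happy", "neutral", "sad"]


def softmax(scores):
    """Turn raw scores into probabilities (shifted by the max for stability)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dense(vector, weights, bias):
    """Linear layer: one weight row and one bias term per output class."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def fuse_and_classify(speech_vec, text_vec, weights, bias):
    """Concatenate the two feature vectors, then classify the fused vector."""
    fused = list(speech_vec) + list(text_vec)
    probs = softmax(dense(fused, weights, bias))
    best = max(range(len(probs)), key=probs.__getitem__)
    return EMOTIONS[best], probs


# Random stand-ins for the outputs of a speech branch and a text branch.
rng = random.Random(0)
speech_vec = [rng.uniform(-1, 1) for _ in range(8)]
text_vec = [rng.uniform(-1, 1) for _ in range(6)]
weights = [[rng.uniform(-0.5, 0.5) for _ in range(14)] for _ in EMOTIONS]
bias = [0.0] * len(EMOTIONS)

label, probs = fuse_and_classify(speech_vec, text_vec, weights, bias)
print(label, [round(p, 3) for p in probs])
```

Concatenation keeps the two modalities independent until the final layer, which makes it easy to compare a speech-only, a text-only, and a combined model by changing only which vectors are passed in.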

research
06/19/2019

Learning Discriminative features using Center Loss and Reconstruction as Regularizer for Speech Emotion Recognition

This paper proposes a Convolutional Neural Network (CNN) inspired by Mul...
research
06/23/2018

Evaluating Gammatone Frequency Cepstral Coefficients with Neural Networks for Emotion Recognition from Speech

Current approaches to speech emotion recognition focus on speech feature...
research
03/04/2022

Deep Learning Neural Networks for Emotion Classification from Text: Enhanced Leaky Rectified Linear Unit Activation and Weighted Loss

Accurate emotion classification for online reviews is vital for business...
research
06/11/2019

Focal Loss based Residual Convolutional Neural Network for Speech Emotion Recognition

This paper proposes a Residual Convolutional Neural Network (ResNet) bas...
research
10/25/2018

Multi-Channel Auto-Encoder for Speech Emotion Recognition

Inferring emotion status from users' queries plays an important role to ...
research
03/28/2022

vTTS: visual-text to speech

This paper proposes visual-text to speech (vTTS), a method for synthesiz...
research
09/21/2023

The Broad Impact of Feature Imitation: Neural Enhancements Across Financial, Speech, and Physiological Domains

Initialization of neural network weights plays a pivotal role in determi...
