End-to-end Triplet Loss based Emotion Embedding System for Speech Emotion Recognition

10/13/2020
by   Puneet Kumar, et al.

In this paper, an end-to-end neural embedding system based on triplet loss and residual learning is proposed for speech emotion recognition. The proposed system learns embeddings from the emotional information in speech utterances. The learned embeddings are used to recognize the emotions portrayed by speech samples of various lengths. The system implements a Residual Neural Network architecture and is trained using softmax pre-training followed by a triplet loss function. The weights between the fully connected and embedding layers of the trained network are used to calculate the embedding values. The embedding representations of the various emotions are mapped onto a hyperplane, and the angles among them are computed using cosine similarity. These angles are used to classify a new speech sample into its appropriate emotion class. The proposed system has demonstrated 91.67 while recognizing emotions for the RAVDESS and IEMOCAP datasets, respectively.
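The two ideas named in the abstract, a triplet loss over embeddings and classification by the angle between embeddings, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration rather than the authors' implementation: the cosine-based hinge form of the loss, the margin of 0.2, the function names, and the two-dimensional toy vectors standing in for learned emotion embeddings.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def angle_degrees(u, v):
    """Angle between two vectors, derived from their cosine similarity."""
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    c = max(-1.0, min(1.0, cosine_similarity(u, v)))
    return math.degrees(math.acos(c))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss on cosine similarity (assumed form).

    The loss is zero once the anchor is more similar to the positive
    (same emotion) than to the negative (different emotion) by at
    least `margin`. The margin value is hypothetical.
    """
    sim_pos = cosine_similarity(anchor, positive)
    sim_neg = cosine_similarity(anchor, negative)
    return max(0.0, sim_neg - sim_pos + margin)


def classify(embedding, class_embeddings):
    """Return the emotion label whose embedding forms the smallest angle."""
    return min(
        class_embeddings,
        key=lambda label: angle_degrees(embedding, class_embeddings[label]),
    )


# Toy reference embeddings (hypothetical, not taken from the paper).
class_embeddings = {"happy": [1.0, 0.0], "sad": [0.0, 1.0]}
sample = [0.9, 0.1]

predicted = classify(sample, class_embeddings)
loss_correct = triplet_loss(sample, class_embeddings["happy"], class_embeddings["sad"])
loss_wrong = triplet_loss(sample, class_embeddings["sad"], class_embeddings["happy"])
```

In this toy setup the sample lies close to the "happy" reference, so it is assigned that label and the triplet loss vanishes when "happy" is the positive; swapping the positive and negative yields a positive loss, which is the signal that would drive training.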

