Speech Emotion Recognition via Contrastive Loss under Siamese Networks

10/23/2019
by Zheng Lian, et al.

Speech emotion recognition is an important aspect of human-computer interaction. Prior work proposes various end-to-end models to improve classification performance. However, most of these models rely on the cross-entropy loss together with softmax as the supervision component, which does not explicitly encourage discriminative feature learning. In this paper, we introduce the contrastive loss function to encourage intra-class compactness and inter-class separability of the learned features. Furthermore, we evaluate multiple feature selection methods and pairwise sample selection methods. To verify the performance of the proposed system, we conduct experiments on the Interactive Emotional Dyadic Motion Capture (IEMOCAP) database, a common evaluation corpus. Experimental results reveal the advantages of the proposed method, which reaches 62.19% unweighted accuracy. It outperforms the baseline system, which is optimized without the contrastive loss function, by 1.14% in accuracy.
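To make the idea of intra-class compactness and inter-class separability concrete, here is a minimal sketch of a pairwise contrastive loss. It assumes the commonly used margin-based form with Euclidean distance; the abstract does not state the exact formulation, distance metric, or margin used in the paper, so the function names, the `margin` parameter, and the toy embeddings are illustrative assumptions only.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(emb_a, emb_b, same_class, margin=1.0):
    """Margin-based contrastive loss for one pair of embeddings.

    Assumed form (not confirmed by the abstract):
      same class:      0.5 * d^2                  (pulls the pair together)
      different class: 0.5 * max(0, margin - d)^2 (pushes the pair apart
                                                   until d >= margin)
    """
    d = euclidean_distance(emb_a, emb_b)
    if same_class:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2


# Toy embeddings, as if produced by the two weight-sharing branches
# of a Siamese network (values are made up for illustration).
angry_1 = [0.9, 0.1]
angry_2 = [0.8, 0.2]
sad_1 = [0.1, 0.9]

loss_same = contrastive_loss(angry_1, angry_2, same_class=True)
loss_diff = contrastive_loss(angry_1, sad_1, same_class=False, margin=2.0)
print(loss_same, loss_diff)
```

In a Siamese setup, both utterances in a pair pass through the same encoder, and a term like this is typically added to (or used alongside) the cross-entropy objective. Pairs from different classes stop contributing once their distance exceeds the margin, which is why the choice of pairs matters.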

01/04/2018

A pairwise discriminative task for speech emotion recognition

Speech emotion recognition is an important task in human-machine interac...
11/11/2018

Improving speech emotion recognition via Transformer-based Predictive Coding through transfer learning

Speech emotion recognition is an important aspect of human-computer inte...
10/13/2020

End-to-end Triplet Loss based Emotion Embedding System for Speech Emotion Recognition

In this paper, an end-to-end neural embedding system based on triplet lo...
06/04/2020

A Siamese Neural Network with Modified Distance Loss For Transfer Learning in Speech Emotion Recognition

Automatic emotion recognition plays a significant role in the process of...
06/19/2019

Learning Discriminative features using Center Loss and Reconstruction as Regularizer for Speech Emotion Recognition

This paper proposes a Convolutional Neural Network (CNN) inspired by Mul...
04/07/2021

Bootstrapping Your Own Positive Sample: Contrastive Learning With Electronic Health Record Data

Electronic Health Record (EHR) data has been of tremendous utility in Ar...