Towards Interpretable and Transferable Speech Emotion Recognition: Latent Representation Based Analysis of Features, Methods and Corpora

05/05/2021
by Sneha Das, et al.

In recent years, speech emotion recognition (SER) has been used in wide-ranging applications, from healthcare to the commercial sector. In addition to signal processing approaches, SER methods now also use deep learning techniques. However, generalizing across languages, corpora and recording conditions remains an open challenge in the field. Furthermore, because deep learning algorithms behave as black boxes, a newer challenge is the lack of interpretability and transparency in the models and in their decision-making process. This is critical when SER systems are deployed in applications that influence human lives. In this work we address this gap by providing an in-depth analysis of the decision-making process of the proposed SER system. To that end, we present low-complexity SER based on undercomplete and denoising autoencoders that achieve an average classification accuracy of over 55% for four-class emotion classification. We then investigate the clustering of emotions in the latent space to understand the influence of the corpora on the model behavior and to obtain a physical interpretation of the latent embedding. Lastly, we explore the contribution of each input feature to the performance of the SER system.
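To make the terms in the abstract concrete, the following is a minimal sketch of an autoencoder that is both undercomplete (the latent layer is narrower than the input) and denoising (it is trained to reconstruct the clean input from a noise-corrupted copy). It is not the authors' model: the single tanh encoder layer, the linear decoder, the Gaussian corruption, the plain gradient-descent loop, and every name and hyperparameter below are assumptions chosen for illustration.

```python
import numpy as np


def train_denoising_autoencoder(X, latent_dim=2, noise_std=0.1,
                                lr=0.1, epochs=300, seed=0):
    """Train a one-hidden-layer undercomplete denoising autoencoder.

    Illustrative sketch only (hypothetical names and settings).
    X is an (n_samples, n_features) array; latent_dim should be smaller
    than n_features so that the bottleneck is undercomplete.
    Returns an `encode` function and the per-epoch reconstruction losses.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W_enc = rng.normal(0.0, 0.1, size=(d, latent_dim))
    b_enc = np.zeros(latent_dim)
    W_dec = rng.normal(0.0, 0.1, size=(latent_dim, d))
    b_dec = np.zeros(d)
    losses = []

    for _ in range(epochs):
        # Denoising: corrupt the input, but score against the clean target.
        X_noisy = X + rng.normal(0.0, noise_std, size=X.shape)
        H = np.tanh(X_noisy @ W_enc + b_enc)   # latent embedding
        X_hat = H @ W_dec + b_dec              # linear reconstruction
        err = X_hat - X
        losses.append(float(np.mean(err ** 2)))

        # Backpropagate the mean squared error through both layers.
        g_out = 2.0 * err / err.size
        g_W_dec = H.T @ g_out
        g_b_dec = g_out.sum(axis=0)
        g_pre = (g_out @ W_dec.T) * (1.0 - H ** 2)  # tanh derivative
        g_W_enc = X_noisy.T @ g_pre
        g_b_enc = g_pre.sum(axis=0)

        W_enc -= lr * g_W_enc
        b_enc -= lr * g_b_enc
        W_dec -= lr * g_W_dec
        b_dec -= lr * g_b_dec

    def encode(X_new):
        """Map inputs to the latent space (no noise at inference time)."""
        return np.tanh(X_new @ W_enc + b_enc)

    return encode, losses
```

In a setup like the one the abstract describes, the rows of `X` would be per-utterance acoustic feature vectors, and the output of `encode` would be the latent embedding whose clusters are then inspected per emotion and per corpus. The abstract does not specify the feature set or the classifier, so none is assumed here.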

research
03/28/2022

Towards Transferable Speech Emotion Representation: On loss functions for cross-lingual latent representations

In recent years, speech emotion recognition (SER) has been used in wide ...
research
12/23/2017

Variational Autoencoders for Learning Latent Representations of Speech Emotion

Latent representation of data in unsupervised fashion is a very interest...
research
03/28/2022

Continuous Metric Learning For Transferable Speech Emotion Recognition and Embedding Across Low-resource Languages

Speech emotion recognition (SER) refers to the technique of inferring th...
research
04/28/2022

Emotion Recognition In Persian Speech Using Deep Neural Networks

Speech Emotion Recognition (SER) is of great importance in Human-Compute...
research
12/23/2017

Variational Autoencoders for Learning Latent Representations of Speech Emotion: A Preliminary Study

Learning the latent representation of data in unsupervised fashion is a ...
research
02/02/2022

Interpretability for Multimodal Emotion Recognition using Concept Activation Vectors

Multimodal Emotion Recognition refers to the classification of input vid...
research
10/30/2018

Transferable Positive/Negative Speech Emotion Recognition via Class-wise Adversarial Domain Adaptation

Speech emotion recognition plays an important role in building more inte...
