Variational Autoencoders for Learning Latent Representations of Speech Emotion: A Preliminary Study

12/23/2017
by Siddique Latif, et al.

Learning latent representations of data in an unsupervised fashion provides relevant features that can improve the performance of a classifier. For speech emotion recognition tasks, generating effective features is crucial. Handcrafted features are currently the most common choice for speech emotion recognition; however, features learned automatically using deep learning have shown strong success in many problems, especially in image processing. In particular, deep generative models such as Variational Autoencoders (VAEs) have been highly successful at generating features for natural images. Inspired by this, we propose using VAEs to derive a latent representation of speech signals and using this representation to classify emotions. To the best of our knowledge, we are the first to propose VAEs for speech emotion classification. Evaluations on the IEMOCAP dataset demonstrate that features learned by VAEs can produce state-of-the-art results for speech emotion classification.
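The abstract does not spell out the model, so the following is only a minimal sketch of the generic VAE ingredients it refers to: sampling a latent vector with the reparameterization trick and scoring it with a reconstruction term plus a KL regularizer. The function names, the two-dimensional latent space, the squared-error reconstruction term, and the example encoder outputs are all assumptions made for illustration, not the authors' configuration.

```python
import math
import random

def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), per latent dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over dimensions."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, log_var))

def negative_elbo(x, x_recon, mu, log_var):
    """Squared-error reconstruction term (an assumed choice) plus the KL term."""
    recon = sum((a - b) ** 2 for a, b in zip(x, x_recon))
    return recon + kl_to_standard_normal(mu, log_var)

# Hypothetical encoder outputs for one speech frame (illustrative values only).
rng = random.Random(0)
mu = [0.5, -1.0]
log_var = [0.0, -2.0]
z = reparameterize(mu, log_var, rng)  # latent vector a classifier could consume
```

In a setup like the one the abstract describes, a vector such as `z` (or the mean `mu`) would serve as the learned feature passed to an emotion classifier; the encoder and decoder networks that produce `mu`, `log_var`, and the reconstruction are omitted here.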


research
04/13/2017

Learning Latent Representations for Speech Generation and Transformation

An ability to model a generative process and learn a latent representati...
research
08/17/2023

Decoding Emotions: A comprehensive Multilingual Study of Speech Models for Speech Emotion Recognition

Recent advancements in transformer-based speech representation models ha...
research
11/15/2015

Learning Representations of Affect from Speech

There has been a lot of prior work on representation learning for speech...
research
04/05/2022

Learning Speech Emotion Representations in the Quaternion Domain

The modeling of human emotion expression in speech signals is an importa...
research
05/05/2021

Towards Interpretable and Transferable Speech Emotion Recognition: Latent Representation Based Analysis of Features, Methods and Corpora

In recent years, speech emotion recognition (SER) has been used in wide ...
research
07/30/2018

CAKE: Compact and Accurate K-dimensional representation of Emotion

Inspired by works from the psychology community, we first study the link...
