Multimodal Speech Emotion Recognition and Ambiguity Resolution

04/12/2019
by Gaurav Sahu, et al.

Identifying emotion from speech is a non-trivial task, owing to the ambiguous definition of emotion itself. In this work, we adopt a feature-engineering-based approach to speech emotion recognition. Formalizing the problem as multi-class classification, we compare the performance of two categories of models. For both, we extract eight hand-crafted features from the audio signal. In the first approach, the extracted features are used to train six traditional machine learning classifiers; the second approach is based on deep learning, wherein a baseline feed-forward neural network and an LSTM-based classifier are trained on the same features. To resolve ambiguity in communication, we also include features from the text domain. We report accuracy, F-score, precision, and recall for each experimental setting in which we evaluated our models. Overall, we show that lighter machine-learning-based models trained on a few hand-crafted features achieve performance comparable to the current deep-learning-based state-of-the-art method for emotion recognition.
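
As an illustration of the four reported metrics in a multi-class setting, the following is a minimal sketch in plain Python. It is not the authors' evaluation code: the function name `macro_report`, the emotion labels, and the choice of macro averaging (an unweighted mean over classes) are all assumptions made for this example, since the abstract does not say how per-class scores are aggregated.

```python
from collections import Counter


def macro_report(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall and F-score.

    Hypothetical helper for illustration; macro averaging is an assumption.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    # Per-class true positives, false positives and false negatives.
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted this class, but it was wrong
            fn[truth] += 1  # missed an instance of the true class

    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f_scores = [], [], []
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        # Undefined ratios (no predictions or no instances) are scored as 0.
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        total = precision + recall
        f_score = 2 * precision * recall / total if total else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f_scores.append(f_score)

    n_classes = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n_classes,
        "recall": sum(recalls) / n_classes,
        "f_score": sum(f_scores) / n_classes,
    }


if __name__ == "__main__":
    # Hypothetical labels, chosen only to exercise the function.
    truth = ["angry", "happy", "sad", "happy", "sad", "neutral"]
    preds = ["angry", "happy", "happy", "happy", "sad", "sad"]
    for name, value in macro_report(truth, preds).items():
        print(f"{name}: {value:.3f}")
```

Macro averaging weights every emotion class equally, which matters when classes are imbalanced; a weighted or micro average would give different numbers for the same predictions.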

