Attentive Convolutional Neural Network based Speech Emotion Recognition: A Study on the Impact of Input Features, Signal Length, and Acted Speech

06/02/2017
by Michael Neumann, et al.

Speech emotion recognition is an important and challenging task in human-computer interaction. Prior work has proposed a variety of models and feature sets for training such systems. In this work, we conduct extensive experiments using an attentive convolutional neural network with a multi-view learning objective function. We compare system performance across different lengths of the input signal, different types of acoustic features, and different types of emotional speech (improvised/scripted). Our experimental results on the Interactive Emotional Dyadic Motion Capture (IEMOCAP) database reveal that recognition performance strongly depends on the type of speech data, independent of the choice of input features. Furthermore, we achieved state-of-the-art results on the improvised speech data of IEMOCAP.
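
The abstract does not spell out how the attention layer works, so the following is only a generic sketch of attention pooling, a common way to make a convolutional model "attentive": each time step of the convolutional output is scored, the scores are normalized with a softmax, and the frames are summed with those weights to give one utterance-level vector. The function names, the dot-product scoring, and the `query` vector are assumptions made for illustration and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(frames, query):
    """Collapse a sequence of frame vectors into one utterance vector.

    frames: list of T feature vectors (each a list of D floats), e.g. the
            per-time-step outputs of a convolutional layer.
    query:  hypothetical learned attention vector of length D (an
            assumption of this sketch, not a detail from the paper).
    """
    # Score each time step by its dot product with the query vector.
    scores = [sum(f * q for f, q in zip(frame, query)) for frame in frames]
    weights = softmax(scores)
    dim = len(query)
    # Weighted sum over time: higher-scoring frames contribute more.
    pooled = [
        sum(w * frame[d] for w, frame in zip(weights, frames))
        for d in range(dim)
    ]
    return pooled, weights
```

With an all-zero `query`, every frame gets the same weight and the result reduces to plain average pooling; a trained `query` would instead let the model emphasize the frames it finds most informative for the emotion label.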

research
10/31/2021

Speech Emotion Recognition Using Quaternion Convolutional Neural Networks

Although speech recognition has become a widespread technology, inferrin...
research
06/07/2017

Characterizing Types of Convolution in Deep Convolutional Recurrent Neural Networks for Robust Speech Emotion Recognition

Deep convolutional neural networks are being actively investigated in a ...
research
02/03/2021

Speech Emotion Recognition with Multiscale Area Attention and Data Augmentation

In Speech Emotion Recognition (SER), emotional characteristics often app...
research
10/23/2019

Speech Emotion Recognition via Contrastive Loss under Siamese Networks

Speech emotion recognition is an important aspect of human-computer inte...
research
04/03/2023

Designing and Evaluating Speech Emotion Recognition Systems: A reality check case study with IEMOCAP

There is an imminent need for guidelines and standard test sets to allow...
research
06/19/2019

Learning Discriminative features using Center Loss and Reconstruction as Regularizer for Speech Emotion Recognition

This paper proposes a Convolutional Neural Network (CNN) inspired by Mul...
research
07/03/2022

A Graph Isomorphism Network with Weighted Multiple Aggregators for Speech Emotion Recognition

Speech emotion recognition (SER) is an essential part of human-computer ...
