
Biologically inspired speech emotion recognition

by Reza Lotfidereshgi, et al.

Conventional feature-based classification methods do not apply well to automatic recognition of speech emotions, mostly because the precise set of spectral and prosodic features required to identify the emotional state of a speaker has not yet been determined. This paper presents a method that operates directly on the speech signal, thus avoiding the problematic step of feature extraction. Furthermore, this method combines the strengths of the classical source-filter model of human speech production with those of the recently introduced liquid state machine (LSM), a biologically inspired spiking neural network (SNN). The source and vocal tract components of the speech signal are first separated and converted into perceptually relevant spectral representations. These representations are then processed separately by two reservoirs of neurons. The output of each reservoir is reduced in dimensionality and fed to a final classifier. This method is shown to provide very good classification performance on the Berlin Database of Emotional Speech (Emo-DB). It appears to be a very promising framework for efficiently solving many other problems in speech processing.
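
The abstract does not say how the source and vocal tract components are separated. One common way to obtain such a split under the source-filter model is linear prediction (LPC) with inverse filtering, and the sketch below illustrates that idea only. It is an assumption for illustration, not the authors' procedure; the function names `lpc_coefficients` and `source_filter_split` and the default order of 12 are hypothetical choices.

```python
import numpy as np

def lpc_coefficients(frame, order):
    """Autocorrelation-method LPC (illustrative, not from the paper).

    Returns a with a[0] == 1, so that e[n] = sum_k a[k] * x[n-k]
    is the prediction residual.
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation at lags 0..order
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)])
    # Toeplitz normal equations R w = r[1:], where w are the predictor weights
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R += 1e-9 * r[0] * np.eye(order)  # small ridge for numerical stability
    w = np.linalg.solve(R, r[1:])
    return np.concatenate(([1.0], -w))

def source_filter_split(frame, order=12):
    """Split one frame into an all-pole filter estimate and a residual.

    a        : coefficients of A(z); 1/A(z) approximates the vocal tract filter
    residual : frame passed through A(z), a rough estimate of the source
    """
    frame = np.asarray(frame, dtype=float)
    a = lpc_coefficients(frame, order)
    # Inverse filtering with the FIR filter A(z), zero initial conditions
    residual = np.convolve(frame, a)[: len(frame)]
    return a, residual
```

Calling `source_filter_split(frame)` on a short windowed speech frame returns the filter coefficients and the residual. In a pipeline like the one described, each of the two would then be converted to a spectral representation before being passed to its own reservoir; that later stage is not sketched here.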




Feature Selection Enhancement and Feature Space Visualization for Speech-Based Emotion Recognition

Robust speech emotion recognition relies on the quality of the speech fe...

Variational Autoencoders for Learning Latent Representations of Speech Emotion

Latent representation of data in unsupervised fashion is a very interest...

DNN-HMM based Speaker Adaptive Emotion Recognition using Proposed Epoch and MFCC Features

Speech is produced when time varying vocal tract system is excited with ...

Analysis of Statistical Parametric and Unit Selection Speech Synthesis Systems Applied to Emotional Speech

We have applied two state-of-the-art speech synthesis techniques (unit s...

Emotional State Categorization from Speech: Machine vs. Human

This paper presents our investigations on emotional state categorization...

Speech Emotion Recognition System by Quaternion Nonlinear Echo State Network

The echo state network (ESN) is a powerful and efficient tool for displa...