Jointly Aligning and Predicting Continuous Emotion Annotations

07/05/2019
by Soheil Khorram, et al.

Time-continuous dimensional descriptions of emotions (e.g., arousal, valence) allow researchers to characterize short-time changes and to capture long-term trends in emotion expression. However, continuous emotion labels are generally not synchronized with the input speech signal because of delays caused by annotator reaction time, which is inherent in human evaluations. To address this challenge, we introduce a new convolutional neural network, the multi-delay sinc network, that simultaneously aligns and predicts labels in an end-to-end manner. The proposed network is a stack of convolutional layers followed by an aligner network that aligns the speech signal and emotion labels. The aligner network is implemented using a new convolutional layer that we introduce, the delayed sinc layer: a time-shifted low-pass (sinc) filter that uses a gradient-based algorithm to learn a single delay. Multiple delayed sinc layers can be used to compensate for a non-stationary delay that is a function of the acoustic space. We test the efficacy of this system on two common emotion datasets, RECOLA and SEWA, and show that this approach obtains state-of-the-art speech-only results by learning time-varying delays while predicting dimensional descriptors of emotions.
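
To make the idea of a time-shifted low-pass (sinc) filter concrete, the following is a minimal NumPy sketch, not the authors' implementation. The function names `delayed_sinc_kernel` and `apply_delay`, the normalized `cutoff` parameter, and the kernel half-width are assumptions chosen for illustration. The kernel is a truncated ideal low-pass impulse response whose peak sits at `delay` samples; because `delay` enters the formula as a continuous value, the kernel is differentiable with respect to it, which is the property a gradient-based learner would rely on.

```python
import numpy as np

def delayed_sinc_kernel(delay, cutoff=0.25, half_width=64):
    """Truncated low-pass sinc kernel shifted by `delay` samples.

    Hypothetical helper for illustration only.
    cutoff is in cycles per sample (0 < cutoff <= 0.5).
    delay may be fractional; the kernel is smooth in `delay`.
    """
    n = np.arange(-half_width, half_width + 1, dtype=float)
    # np.sinc(x) = sin(pi x) / (pi x), so this is the ideal low-pass
    # impulse response with its peak moved from n = 0 to n = delay.
    return 2.0 * cutoff * np.sinc(2.0 * cutoff * (n - delay))

def apply_delay(signal, delay, cutoff=0.25, half_width=64):
    """Low-pass filter `signal` and shift it later in time by `delay` samples."""
    kernel = delayed_sinc_kernel(delay, cutoff, half_width)
    # mode="same" keeps the output aligned with the input time axis,
    # so the only shift comes from the kernel's displaced peak.
    return np.convolve(signal, kernel, mode="same")

# Toy example: a smooth bump centered at sample 100 stands in for a label track.
t = np.arange(400)
bump = np.exp(-0.5 * ((t - 100) / 5.0) ** 2)
shifted = apply_delay(bump, delay=25.0)
print(int(np.argmax(bump)), int(np.argmax(shifted)))
```

In this sketch a positive `delay` moves the bump later in time, and `half_width` bounds the largest delay the kernel can represent. A trainable version would treat `delay` as a parameter inside an automatic-differentiation framework rather than a fixed argument.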

research
10/07/2021

End-to-end label uncertainty modeling for speech emotion recognition using Bayesian neural networks

Emotions are subjective constructs. Recent end-to-end speech emotion rec...
research
11/14/2022

Describing emotions with acoustic property prompts for speech emotion recognition

Emotions lie on a broad continuum and treating emotions as a discrete nu...
research
10/29/2022

Unifying the Discrete and Continuous Emotion labels for Speech Emotion Recognition

Traditionally, in paralinguistic analysis for emotion detection from spe...
research
08/23/2017

Capturing Long-term Temporal Dependencies with Convolutional Networks for Continuous Emotion Recognition

The goal of continuous emotion recognition is to assign an emotion value...
research
01/31/2020

Detecting Emotion Primitives from Speech and their use in discerning Categorical Emotions

Emotion plays an essential role in human-to-human communication, enablin...
research
09/21/2022

Dynamic Time-Alignment of Dimensional Annotations of Emotion using Recurrent Neural Networks

Most automatic emotion recognition systems exploit time-continuous annot...
research
04/03/2018

EmoRL: Continuous Acoustic Emotion Classification using Deep Reinforcement Learning

Acoustically expressed emotions can make communication with a robot more...
