Speaker Attentive Speech Emotion Recognition

04/15/2021
by Clément Le Moine, et al.

Speech Emotion Recognition (SER) has improved significantly in recent years with the advent of Deep Neural Networks (DNNs). However, even the most successful methods still struggle when adaptation to specific speakers and scenarios is needed, which leads to poorer performance than humans achieve. In this paper, we present novel work based on the idea of teaching the emotion recognition network about speaker identity. Our system combines two ACRNN classifiers, dedicated to speaker recognition and emotion recognition respectively. The first informs the second through a Self Speaker Attention (SSA) mechanism that is shown to considerably help the network focus on the emotional information in the speech signal. Experiments on the social attitudes database Att-HACK and the IEMOCAP corpus demonstrate the effectiveness of the proposed method, which achieves state-of-the-art performance in terms of unweighted average recall.
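
For readers unfamiliar with the metric named in the abstract: unweighted average recall (UAR) is the mean of the per-class recalls, so every emotion class counts equally regardless of how many utterances it contains. The sketch below is a minimal illustration only; the function name and the toy labels are assumptions made for this example and are not taken from the paper or its evaluation code.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of per-class recalls (each class weighted equally)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one labelled example is required")

    total = defaultdict(int)    # number of true examples per class
    correct = defaultdict(int)  # number of correctly predicted examples per class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1

    # Recall per class = correct / total; UAR = plain average over classes.
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Hypothetical, imbalanced toy labels (not data from Att-HACK or IEMOCAP).
y_true = ["angry", "angry", "angry", "angry", "sad", "sad"]
y_pred = ["angry", "angry", "angry", "sad", "sad", "angry"]

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"UAR = {uar:.3f}, accuracy = {accuracy:.3f}")
```

On this toy input the recall is 3/4 for "angry" and 1/2 for "sad", so UAR is 0.625 while plain accuracy is 4/6. The gap shows why UAR is often preferred for emotion corpora with unevenly represented classes.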

research
02/02/2022

Speaker Normalization for Self-supervised Speech Emotion Recognition

Large speech emotion recognition datasets are hard to obtain, and small ...
research
01/19/2022

Unsupervised Personalization of an Emotion Recognition System: The Unique Properties of the Externalization of Valence in Speech

The prediction of valence from speech is an important, but challenging p...
research
09/05/2023

Personalized Adaptation with Pre-trained Speech Encoders for Continuous Emotion Recognition

There are individual differences in expressive behaviors driven by cultu...
research
11/04/2022

SPEAKER VGG CCT: Cross-corpus Speech Emotion Recognition with Speaker Embedding and Vision Transformers

In recent years, Speech Emotion Recognition (SER) has been investigated ...
research
05/25/2018

Curriculum Learning for Speech Emotion Recognition from Crowdsourced Labels

This study introduces a method to design a curriculum for machine-learni...
research
06/04/2018

DNN-HMM based Speaker Adaptive Emotion Recognition using Proposed Epoch and MFCC Features

Speech is produced when time varying vocal tract system is excited with ...
research
08/05/2020

Compact Graph Architecture for Speech Emotion Recognition

We propose a deep graph approach to address the task of speech emotion r...
