x-vectors meet emotions: A study on dependencies between emotion and speaker recognition

02/12/2020
by Raghavendra Pappagari, et al.

In this work, we explore the dependencies between speaker recognition and emotion recognition. We first show that knowledge learned for speaker recognition can be reused for emotion recognition through transfer learning. Then, we show the effect of emotion on speaker recognition. For emotion recognition, we show that a simple linear model is enough to obtain good performance on features extracted from pre-trained models such as the x-vector model. We then improve emotion recognition performance by fine-tuning the pre-trained model for emotion classification. We evaluated our experiments on three different types of datasets: IEMOCAP, MSP-Podcast, and Crema-D. By fine-tuning, we obtained 30.40 [...] and Crema-D respectively over the baseline model with no pre-training. Finally, we present results on the effect of emotion on speaker verification. We observed that speaker verification performance is sensitive to changes in the emotion of the test speaker. We found that trials with angry utterances performed worst in all three datasets. We hope our analysis will initiate a new line of research in the speaker recognition community.
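The idea of training a simple linear model on features from a frozen pre-trained network can be illustrated with a minimal sketch. This is not the authors' implementation: the embeddings below are synthetic stand-ins for x-vectors (8 dimensions rather than the several hundred a real x-vector has), the four integer class ids are hypothetical emotion labels, and the classifier is a plain multinomial logistic regression trained by batch gradient descent. All function names (`train_linear_probe`, `make_toy_embeddings`, and so on) are made up for illustration.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def scores(weights, biases, x):
    """Linear layer: one logit per (hypothetical) emotion class."""
    return [sum(w_d * x_d for w_d, x_d in zip(w, x)) + b
            for w, b in zip(weights, biases)]


def train_linear_probe(features, labels, num_classes, epochs=100, lr=0.1):
    """Fit multinomial logistic regression on fixed embeddings.

    Only the linear layer is trained; the embeddings are treated as frozen
    outputs of a pre-trained model.
    """
    dim = len(features[0])
    weights = [[0.0] * dim for _ in range(num_classes)]
    biases = [0.0] * num_classes
    n = len(features)
    for _ in range(epochs):
        grad_w = [[0.0] * dim for _ in range(num_classes)]
        grad_b = [0.0] * num_classes
        for x, y in zip(features, labels):
            probs = softmax(scores(weights, biases, x))
            for c in range(num_classes):
                # Gradient of cross-entropy w.r.t. logit c.
                err = probs[c] - (1.0 if c == y else 0.0)
                grad_b[c] += err
                for d in range(dim):
                    grad_w[c][d] += err * x[d]
        for c in range(num_classes):
            biases[c] -= lr * grad_b[c] / n
            for d in range(dim):
                weights[c][d] -= lr * grad_w[c][d] / n
    return weights, biases


def predict(weights, biases, x):
    """Return the class id with the highest logit."""
    logits = scores(weights, biases, x)
    return max(range(len(logits)), key=lambda c: logits[c])


def accuracy(weights, biases, features, labels):
    """Fraction of examples whose predicted class matches the label."""
    hits = sum(predict(weights, biases, x) == y
               for x, y in zip(features, labels))
    return hits / len(labels)


def make_toy_embeddings(per_class, dim, num_classes, seed):
    """Synthetic stand-in for x-vectors (assumption, not real data).

    Each class is unit Gaussian noise shifted by 4.0 along its own axis,
    so dim must be at least num_classes.
    """
    rng = random.Random(seed)
    features, labels = [], []
    for c in range(num_classes):
        for _ in range(per_class):
            x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
            x[c] += 4.0
            features.append(x)
            labels.append(c)
    return features, labels


if __name__ == "__main__":
    train_x, train_y = make_toy_embeddings(30, 8, 4, seed=0)
    test_x, test_y = make_toy_embeddings(20, 8, 4, seed=1)
    w, b = train_linear_probe(train_x, train_y, num_classes=4)
    print("held-out accuracy:", accuracy(w, b, test_x, test_y))
```

In practice the toy generator would be replaced by embeddings extracted from a pre-trained speaker model, and the hand-written training loop by an off-the-shelf linear classifier; fine-tuning, by contrast, would also update the pre-trained network's weights.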


research
06/23/2020

Meta Transfer Learning for Emotion Recognition

Deep learning has been widely adopted in automatic emotion recognition a...
research
06/04/2018

DNN-HMM based Speaker Adaptive Emotion Recognition using Proposed Epoch and MFCC Features

Speech is produced when time varying vocal tract system is excited with ...
research
10/27/2020

CopyPaste: An Augmentation Method for Speech Emotion Recognition

Data augmentation is a widely used strategy for training robust machine ...
research
09/30/2020

Embedded Emotions – A Data Driven Approach to Learn Transferable Feature Representations from Raw Speech Input for Emotion Recognition

Traditional approaches to automatic emotion recognition are relying on t...
research
12/29/2020

A Hierarchical Transformer with Speaker Modeling for Emotion Recognition in Conversation

Emotion Recognition in Conversation (ERC) is a more challenging task tha...
research
10/26/2022

Effect of different splitting criteria on the performance of speech emotion recognition

Traditional speech emotion recognition (SER) evaluations have been perfo...
research
08/23/2017

Capturing Long-term Temporal Dependencies with Convolutional Networks for Continuous Emotion Recognition

The goal of continuous emotion recognition is to assign an emotion value...
