Deep Implicit Distribution Alignment Networks for Cross-Corpus Speech Emotion Recognition

02/17/2023
by   Yan Zhao, et al.

In this paper, we propose a novel deep transfer learning method called deep implicit distribution alignment networks (DIDAN) to deal with the cross-corpus speech emotion recognition (SER) problem, in which the labeled training (source) and unlabeled testing (target) speech signals come from different corpora. Specifically, DIDAN first adopts a simple deep regression network, consisting of a set of convolutional and fully connected layers, to directly regress the source speech spectrums onto the emotional labels, which gives DIDAN its emotion discriminative ability. This ability is then transferred to the target speech samples, regardless of corpus variance, by means of a well-designed regularization term called implicit distribution alignment (IDA). Unlike the widely used maximum mean discrepancy (MMD) and its variants, the proposed IDA draws on the idea of sample reconstruction to implicitly bridge the distribution gap, which enables DIDAN to learn both emotion discriminative and corpus invariant features from speech spectrums. To evaluate the proposed DIDAN, extensive cross-corpus SER experiments are carried out on widely used speech emotion corpora. Experimental results show that the proposed DIDAN outperforms many recent state-of-the-art methods on the cross-corpus SER tasks.
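
The contrast the abstract draws between explicit MMD-style alignment and reconstruction-based implicit alignment can be illustrated with a small sketch. This is not the paper's formulation, which the abstract does not give. The function names, the softmax-weighted reconstruction, the mean squared error regression loss, and the `trade_off` and `temperature` parameters are all assumptions made for illustration.

```python
import math

def linear_mmd(source, target):
    """Explicit alignment: squared distance between the two feature means
    (MMD with a linear kernel)."""
    dim = len(source[0])
    mean_s = [sum(x[d] for x in source) / len(source) for d in range(dim)]
    mean_t = [sum(x[d] for x in target) / len(target) for d in range(dim)]
    return sum((a - b) ** 2 for a, b in zip(mean_s, mean_t))

def reconstruction_alignment(source, target, temperature=1.0):
    """Hypothetical implicit alignment (an assumption, not the paper's IDA):
    rebuild each target feature as a softmax-weighted mix of source features
    and average the squared reconstruction error."""
    total = 0.0
    for t in target:
        # Negative squared distances serve as similarity scores.
        scores = [-sum((a - b) ** 2 for a, b in zip(t, s)) / temperature
                  for s in source]
        peak = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(v - peak) for v in scores]
        norm = sum(weights)
        recon = [sum(w * s[d] for w, s in zip(weights, source)) / norm
                 for d in range(len(t))]
        total += sum((a - b) ** 2 for a, b in zip(t, recon))
    return total / len(target)

def total_loss(predictions, labels, source_feats, target_feats, trade_off=0.5):
    """Source regression loss (assumed MSE) plus a weighted alignment term."""
    mse = sum((p - y) ** 2 for p, y in zip(predictions, labels)) / len(labels)
    return mse + trade_off * reconstruction_alignment(source_feats, target_feats)
```

In this sketch the alignment term shrinks when target features can be rebuilt well from source features, so minimizing it alongside the source regression loss pushes the two corpora toward a shared feature space without comparing distribution statistics directly.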


research
08/04/2023

Emo-DNA: Emotion Decoupling and Alignment Learning for Cross-Corpus Speech Emotion Recognition

Cross-corpus speech emotion recognition (SER) seeks to generalize the ab...
research
01/19/2018

Cross Corpus Speech Emotion Classification- An Effective Transfer Learning Technique

Cross-corpus speech emotion recognition can be a useful transfer learnin...
research
03/10/2021

EmoNet: A Transfer Learning Framework for Multi-Corpus Speech Emotion Recognition

In this manuscript, the topic of multi-corpus Speech Emotion Recognition...
research
03/28/2019

Barking up the Right Tree: Improving Cross-Corpus Speech Emotion Recognition with Adversarial Discriminative Domain Generalization (ADDoG)

Automatic speech emotion recognition provides computers with critical co...
research
09/09/2021

Accounting for Variations in Speech Emotion Recognition with Nonparametric Hierarchical Neural Network

In recent years, deep-learning-based speech emotion recognition models h...
research
07/18/2022

CTL-MTNet: A Novel CapsNet and Transfer Learning-Based Mixed Task Net for the Single-Corpus and Cross-Corpus Speech Emotion Recognition

Speech Emotion Recognition (SER) has become a growing focus of research ...
