Unsupervised Cross-Lingual Speech Emotion Recognition Using Domain Adversarial Neural Network

12/21/2020
by   Xiong Cai, et al.

Using deep learning approaches, Speech Emotion Recognition (SER) on a single domain has achieved many excellent results. However, cross-domain SER remains challenging due to the distribution shift between source and target domains. In this work, we propose a Domain Adversarial Neural Network (DANN) based approach to mitigate this distribution shift for cross-lingual SER. Specifically, we add a language classifier and a gradient reversal layer after the feature extractor to force the learned representation to be both language-independent and emotion-meaningful. Our method is unsupervised, i.e., labels on the target language are not required, which makes it easier to apply to other languages. Experimental results show that the proposed method provides an average absolute improvement of 3.91% on the arousal and valence classification tasks. Furthermore, we find that batch normalization is beneficial to the performance gain of DANN. Therefore, we also explore the effect of different ways of combining data for batch normalization.
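
The gradient reversal idea in the abstract can be illustrated with a toy example. The following is a minimal sketch, not the authors' implementation: it assumes a scalar input, a one-weight feature extractor, and one-weight binary emotion and language classifiers, with gradients written out by hand. All names (`dann_gradients`, `grl_forward`, `grl_backward`, `lam`) are hypothetical. The gradient reversal layer acts as the identity in the forward pass and multiplies the gradient by `-lam` in the backward pass, so the feature extractor is pushed to make the language harder to predict while the language classifier itself is trained normally. Target-language samples carry no emotion label and contribute only through the language branch.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def bce_grad_logit(logit, label):
    """Gradient of binary cross-entropy w.r.t. the logit of a sigmoid output."""
    return sigmoid(logit) - label


def grl_forward(h):
    # Gradient reversal layer: identity in the forward pass.
    return h


def grl_backward(grad_h, lam):
    # Gradient reversal layer: negate and scale the gradient in the backward pass.
    return -lam * grad_h


def dann_gradients(x, emotion_label, language_label, params, lam):
    """Return (grad_w_f, grad_w_e, grad_w_l) for one sample.

    emotion_label=None marks an unlabeled target-language sample
    (an assumption of this sketch): it adds no emotion loss.
    """
    w_f, w_e, w_l = params
    h = w_f * x                      # feature extractor
    e_logit = w_e * h                # emotion classifier
    l_logit = w_l * grl_forward(h)   # language classifier, fed through the GRL

    g_e = 0.0 if emotion_label is None else bce_grad_logit(e_logit, emotion_label)
    g_l = bce_grad_logit(l_logit, language_label)

    grad_w_e = g_e * h               # emotion head minimizes emotion loss
    grad_w_l = g_l * h               # language head minimizes language loss
    grad_h = g_e * w_e + grl_backward(g_l * w_l, lam)
    grad_w_f = grad_h * x            # extractor sees the reversed language gradient
    return grad_w_f, grad_w_e, grad_w_l


if __name__ == "__main__":
    # One unlabeled target-language sample (language_label=1).
    print(dann_gradients(2.0, None, 1, (0.5, 1.0, -2.0), lam=1.0))
```

With `lam=0` the language branch has no influence on the feature extractor, which gives a plain emotion classifier; increasing `lam` strengthens the pressure toward language-independent features.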

research
07/14/2022

Semi-supervised cross-lingual speech emotion recognition

Speech emotion recognition (SER) on a single language has achieved remar...
research
03/01/2018

Cross-lingual and Multilingual Speech Emotion Recognition on English and French

Research on multilingual speech emotion recognition faces the problem th...
research
07/13/2019

Unsupervised Adversarial Domain Adaptation for Cross-Lingual Speech Emotion Recognition

Cross-lingual speech emotion recognition (SER) is a crucial task for man...
research
09/02/2023

DiCLET-TTS: Diffusion Model based Cross-lingual Emotion Transfer for Text-to-Speech – A Study between English and Mandarin

While the performance of cross-lingual TTS based on monolingual corpora ...
research
04/29/2022

Por Qué Não Utiliser Alla Språk? Mixed Training with Gradient Optimization in Few-Shot Cross-Lingual Transfer

The current state-of-the-art for few-shot cross-lingual transfer learnin...
research
06/15/2022

Exploiting Cross-domain And Cross-Lingual Ultrasound Tongue Imaging Features For Elderly And Dysarthric Speech Recognition

Articulatory features are inherently invariant to acoustic signal distor...
research
12/17/2021

Linguistic and Gender Variation in Speech Emotion Recognition using Spectral Features

This work explores the effect of gender and linguistic-based vocal varia...
