Multilingual Speech Emotion Recognition With Multi-Gating Mechanism and Neural Architecture Search

10/31/2022
by   Zihan Wang, et al.

Speech emotion recognition (SER) classifies audio into emotion categories such as Happy, Angry, Fear, Disgust and Neutral. While SER is a common application for popular languages, it remains a problem for low-resourced languages, i.e., languages with no pre-trained speech-to-text recognition models. This paper first proposes a language-specific model that extracts emotional information from multiple pre-trained speech models, and then designs a multi-domain model that simultaneously performs SER for various languages. Our multi-domain model employs a multi-gating mechanism to generate a unique weighted feature combination for each language, and also searches for a specific neural network structure for each language through a neural architecture search module. In addition, we introduce a contrastive auxiliary loss to build more separable representations for audio data. Our experiments show that our model raises the state-of-the-art accuracy by 3
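As a rough illustration of the per-language weighted feature combination mentioned in the abstract, the sketch below gives each language its own gate over feature vectors from several pre-trained models. It is not the paper's implementation: the names `softmax`, `gated_combination`, `GATES`, `lang_a` and `lang_b` are hypothetical, the gate logits are fixed numbers rather than learned parameters, and the real gate may well depend on the input audio.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def gated_combination(features, gate_logits):
    """Weighted sum of per-model feature vectors using one language's gate.

    features:    list of equal-length vectors, one per pre-trained model.
    gate_logits: one score per model (assumed here to be fixed, not learned).
    """
    weights = softmax(gate_logits)
    dim = len(features[0])
    return [sum(w * f[i] for w, f in zip(weights, features)) for i in range(dim)]


# Hypothetical gates: one logit per pre-trained model, one gate per language.
GATES = {
    "lang_a": [0.0, 0.0],  # weights both models equally
    "lang_b": [2.0, 0.0],  # favours the first model
}

# Toy feature vectors standing in for the outputs of two pre-trained models.
features = [[1.0, 0.0], [0.0, 1.0]]

combined = {lang: gated_combination(features, g) for lang, g in GATES.items()}
```

Because each language has its own gate, the same pair of feature vectors yields a different combined representation per language, which is the behaviour the abstract attributes to the multi-gating mechanism.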

research
11/29/2019

Bimodal Speech Emotion Recognition Using Pre-Trained Language Models

Speech emotion recognition is a challenging task and an important step t...
research
06/08/2021

Efficient Speech Emotion Recognition Using Multi-Scale CNN and Attention

Emotion recognition from speech is a challenging task. Recent advances ...
research
07/03/2022

A Graph Isomorphism Network with Weighted Multiple Aggregators for Speech Emotion Recognition

Speech emotion recognition (SER) is an essential part of human-computer ...
research
08/17/2023

Decoding Emotions: A comprehensive Multilingual Study of Speech Models for Speech Emotion Recognition

Recent advancements in transformer-based speech representation models ha...
research
07/20/2023

Vesper: A Compact and Effective Pretrained Model for Speech Emotion Recognition

This paper presents a paradigm that adapts general large-scale pretraine...
research
05/23/2023

Improving Speech Emotion Recognition Performance using Differentiable Architecture Search

Speech Emotion Recognition (SER) is a critical enabler of emotion-aware ...
research
10/26/2022

Pretrained audio neural networks for Speech emotion recognition in Portuguese

The goal of speech emotion recognition (SER) is to identify the emotiona...
