Dynamic Layer Customization for Noise Robust Speech Emotion Recognition in Heterogeneous Condition Training

10/21/2020
by Alex Wilf, et al.

Robustness to environmental noise is important for creating automatic speech emotion recognition systems that are deployable in the real world. Prior work on noise robustness has assumed that systems would not make use of sample-by-sample training noise conditions, or that they would have access to unlabelled testing data to generalize across noise conditions. We avoid these assumptions and introduce the resulting task as heterogeneous condition training. We show that with full knowledge of the test noise conditions, we can improve performance by dynamically routing samples to specialized feature encoders for each noise condition, and with partial knowledge, we can use known noise conditions and domain adaptation algorithms to train systems that generalize well to unseen noise conditions. We then extend these improvements to the multimodal setting by dynamically routing samples to maintain temporal ordering, resulting in significant improvements over approaches that do not specialize or generalize based on noise type.
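The routing idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the encoders here are hypothetical scaling functions standing in for trained, noise-specific feature encoders, and the names `make_scaling_encoder`, `route_batch`, and the condition labels "clean" and "babble" are assumptions made for illustration. The sketch covers only the full-knowledge setting, where every sample's noise condition is known, and it returns outputs in the same order as the inputs.

```python
from typing import Callable, Dict, List, Sequence

Vector = List[float]
Encoder = Callable[[Vector], Vector]


def make_scaling_encoder(scale: float) -> Encoder:
    """Hypothetical stand-in for a trained, noise-specific feature encoder."""
    def encode(features: Vector) -> Vector:
        return [scale * x for x in features]
    return encode


def route_batch(
    batch: Sequence[Vector],
    noise_conditions: Sequence[str],
    encoders: Dict[str, Encoder],
) -> List[Vector]:
    """Send each sample to the encoder registered for its noise condition.

    Outputs keep the order of the inputs. An unknown condition is an error,
    because this sketch assumes every noise condition is known in advance.
    """
    if len(batch) != len(noise_conditions):
        raise ValueError("batch and noise_conditions must have the same length")
    outputs: List[Vector] = []
    for features, condition in zip(batch, noise_conditions):
        if condition not in encoders:
            raise KeyError(f"no encoder registered for condition {condition!r}")
        outputs.append(encoders[condition](features))
    return outputs


# Illustrative condition labels and encoders (assumptions, not from the paper).
encoders = {
    "clean": make_scaling_encoder(1.0),
    "babble": make_scaling_encoder(0.5),
}

batch = [[2.0, 4.0], [2.0, 4.0], [6.0, 8.0]]
conditions = ["babble", "clean", "babble"]
encoded = route_batch(batch, conditions, encoders)
```

In a real system each entry of `encoders` would be a learned network, and the encoded features would feed a shared emotion classifier; the partial-knowledge setting described in the abstract would need a different mechanism than a strict lookup.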

research
04/18/2021

Best Practices for Noise-Based Augmentation to Improve the Performance of Emotion Recognition "In the Wild"

Emotion recognition as a key component of high-stake downstream applicat...
research
04/06/2018

On the Robustness of Speech Emotion Recognition for Human-Robot Interaction with Deep Neural Networks

Speech emotion recognition (SER) is an important aspect of effective hum...
research
03/02/2021

Investigations on Audiovisual Emotion Recognition in Noisy Conditions

In this paper we explore audiovisual emotion recognition under noisy aco...
research
09/03/2023

Noise robust speech emotion recognition with signal-to-noise ratio adapting speech enhancement

Speech emotion recognition (SER) often experiences reduced performance d...
research
12/24/2020

Wheel-Rail Interface Condition Estimation (W-RICE)

The surface roughness between the wheel and rail has a huge influence on...
research
05/18/2020

Deep Architecture Enhancing Robustness to Noise, Adversarial Attacks, and Cross-corpus Setting for Speech Emotion Recognition

Speech emotion recognition systems (SER) can achieve high accuracy when ...
research
06/01/2018

Machines hear better when they have ears

Deep-neural-network (DNN) based noise suppression systems yield signific...
