Improving Distortion Robustness of Self-supervised Speech Processing Tasks with Domain Adaptation

03/30/2022
by Kuan-Po Huang, et al.

Speech distortions are a long-standing problem that degrades the performance of speech processing models trained with supervised learning. Robustness needs to improve so that models perform well on distorted speech without sacrificing their original performance on clean speech. In this work, we propose to improve the robustness of speech processing models with domain adversarial training (DAT). We conducted experiments based on the SUPERB framework on five different speech processing tasks. Because the distortion types in speech data are not always known, we analyzed both binary-domain and multi-domain settings: the former treats all distorted speech as one domain, and the latter treats each distortion type as a separate domain. Compared with supervised training methods, we obtained promising results in target domains where speech is corrupted by different distortions, including unseen distortions introduced only during testing.
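
The difference between the binary-domain and multi-domain settings comes down to how domain labels are assigned to training utterances. The following is a minimal sketch of that labeling only, not the authors' implementation: the function names, the `"clean"` tag, and the distortion names (`"noise"`, `"reverb"`, `"pitch_shift"`) are hypothetical placeholders, since the abstract does not list the distortions used.

```python
from typing import Optional, Sequence

CLEAN = "clean"  # hypothetical tag for undistorted (source-domain) speech


def binary_domain_label(distortion: Optional[str]) -> int:
    """Binary-domain setting: clean speech is domain 0, any distortion is domain 1."""
    return 0 if distortion in (None, CLEAN) else 1


def multi_domain_label(distortion: Optional[str], known: Sequence[str]) -> int:
    """Multi-domain setting: clean speech is domain 0, each known distortion gets its own index.

    Raises ValueError for a distortion that is not in `known`; unseen
    distortions only occur at test time, where no domain label is needed.
    """
    if distortion in (None, CLEAN):
        return 0
    return list(known).index(distortion) + 1


if __name__ == "__main__":
    # Hypothetical distortion types, chosen only for illustration.
    known_distortions = ["noise", "reverb", "pitch_shift"]
    for d in [CLEAN, "noise", "reverb", "pitch_shift"]:
        print(d, binary_domain_label(d), multi_domain_label(d, known_distortions))
```

In this sketch the binary setting needs no knowledge of the distortion type, only whether an utterance is distorted, which matches the case where distortion types are unknown.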


research
10/14/2022

Improving generalizability of distilled self-supervised speech processing models under distorted settings

Self-supervised learned (SSL) speech pre-trained models perform well acr...
research
03/14/2023

Improving Accented Speech Recognition with Multi-Domain Training

Thanks to the rise of self-supervised learning, automatic speech recogni...
research
03/01/2022

Towards a Common Speech Analysis Engine

Recent innovations in self-supervised representation learning have led t...
research
03/15/2021

DHASP: Differentiable Hearing Aid Speech Processing

Hearing aids are expected to improve speech intelligibility for listener...
research
10/20/2022

Speech Dereverberation with a Reverberation Time Shortening Target

This work proposes a new learning target based on reverberation time sho...
research
07/24/2023

Joint speech and overlap detection: a benchmark over multiple audio setup and speech domains

Voice activity and overlapped speech detection (respectively VAD and OSD...
