Improving generalizability of distilled self-supervised speech processing models under distorted settings

10/14/2022
by Kuan-Po Huang, et al.

Self-supervised learning (SSL) speech pre-trained models perform well across various speech processing tasks. Distilled versions of SSL models have been developed to meet the needs of on-device speech applications. Although these distilled counterparts perform comparably to the original SSL models, they suffer even greater performance degradation in distorted environments than their original versions. This paper proposes applying Cross-Distortion Mapping and Domain Adversarial Training to SSL models during knowledge distillation to alleviate the performance gap caused by the domain mismatch problem. Results show consistent performance improvements under both in-domain and out-of-domain distorted setups for different downstream tasks, while keeping the model size efficient.
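The abstract names two techniques, Cross-Distortion Mapping and Domain Adversarial Training, applied during knowledge distillation. The sketch below illustrates, under stated assumptions, how such a setup can be wired together in PyTorch: the teacher always sees clean speech, the student sees a mix of clean and distorted speech, and a domain classifier behind a gradient reversal layer pushes the student toward domain-invariant representations. This is a minimal sketch, not the authors' released code; the names `GradientReversal`, `DomainClassifier`, and `distillation_step` are hypothetical, and loss weighting and model details are simplified.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class GradientReversal(torch.autograd.Function):
    """Identity in the forward pass; flips the gradient sign in the backward pass."""

    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.clone()

    @staticmethod
    def backward(ctx, grad_output):
        # Reversed gradient for x, no gradient for lambd.
        return -ctx.lambd * grad_output, None


class DomainClassifier(nn.Module):
    """Predicts whether a representation comes from clean or distorted speech."""

    def __init__(self, dim, num_domains=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, num_domains)
        )

    def forward(self, h, lambd=1.0):
        h = GradientReversal.apply(h, lambd)
        return self.net(h)


def distillation_step(student, teacher, domain_clf, clean, student_input, domain_labels, lambd=1.0):
    """One step combining distillation and domain-adversarial training.

    Simplified cross-distortion mapping: the student input mixes clean and
    distorted utterances, but the distillation targets always come from the
    teacher run on the clean versions, so the student learns to map distorted
    speech onto clean-speech representations.
    """
    with torch.no_grad():
        targets = teacher(clean)              # teacher sees clean speech only
    student_out = student(student_input)      # student sees clean or distorted speech

    distill_loss = F.l1_loss(student_out, targets)

    # Domain-adversarial term: the gradient reversal layer inside the domain
    # classifier penalizes representations that reveal the input domain.
    domain_logits = domain_clf(student_out.mean(dim=1), lambd=lambd)
    adv_loss = F.cross_entropy(domain_logits, domain_labels)

    return distill_loss + adv_loss


# Toy usage with random tensors standing in for speech features.
dim = 768
student = nn.Linear(dim, dim)                 # stand-ins for the distilled and full-size encoders
teacher = nn.Linear(dim, dim)
domain_clf = DomainClassifier(dim)

clean = torch.randn(4, 100, dim)
distorted = clean + 0.1 * torch.randn_like(clean)
student_input = torch.cat([clean[:2], distorted[2:]], dim=0)   # first two clean, last two distorted
domain_labels = torch.tensor([0, 0, 1, 1])

loss = distillation_step(student, teacher, domain_clf, clean, student_input, domain_labels)
loss.backward()
```

In practice the distillation and adversarial losses would be weighted against each other, and the linear stand-ins above would be replaced by a distilled SSL encoder and its full-size teacher.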

Related research

03/30/2022 · Improving Distortion Robustness of Self-supervised Speech Processing Tasks with Domain Adaptation
Speech distortions are a long-standing problem that degrades the perform...

02/24/2023 · Ensemble knowledge distillation of self-supervised speech models
Distilled self-supervised models have shown competitive performance and ...

07/08/2021 · Improved Language Identification Through Cross-Lingual Self-Supervised Learning
Language identification greatly impacts the success of downstream tasks ...

09/11/2023 · LeBenchmark 2.0: a Standardized, Replicable and Enhanced Framework for Self-supervised Representations of French Speech
Self-supervised learning (SSL) is at the origin of unprecedented improve...

02/27/2023 · Structured Pruning of Self-Supervised Pre-trained Models for Speech Recognition and Understanding
Self-supervised speech representation learning (SSL) has shown to be eff...

11/17/2022 · MelHuBERT: A simplified HuBERT on Mel spectrogram
Self-supervised models have had great success in learning speech represe...

02/18/2023 · RobustDistiller: Compressing Universal Speech Representations for Enhanced Environment Robustness
Self-supervised speech pre-training enables deep neural network models t...
