Towards domain generalisation in ASR with elitist sampling and ensemble knowledge distillation

03/01/2023
by Rehan Ahmad, et al.

Knowledge distillation has been widely used for model compression and domain adaptation in speech applications. In the presence of multiple teachers, knowledge can easily be transferred to the student by averaging the models' outputs. However, previous research shows that the student does not adapt well to such a combination. This paper proposes an elitist sampling strategy at the output of the ensemble of teacher models, which selects the best-decoded utterance generated by entirely out-of-domain teachers in order to generalize to an unseen domain. The teacher models are trained on AMI, LibriSpeech and WSJ, while the student is adapted to the Switchboard data. The results show that, with the selection strategy based on the individual models' posteriors, the student model achieves a lower WER than all the teachers and baselines, with a minimum absolute improvement of about 8.4 percent. Furthermore, insights into model adaptation with out-of-domain data are also provided via a correlation analysis.
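
As a concrete illustration of the selection step, the sketch below picks, for each utterance, the single most confident teacher and uses its output as the distillation target. This is a minimal reading of the abstract, not the paper's implementation: the function name elitist_sample, the mean best-path posterior as the confidence score, and the frame-level soft target are all assumptions made for the example.

import numpy as np

def elitist_sample(teacher_posteriors):
    """Select the 'elite' teacher for one utterance.

    teacher_posteriors: list with one (num_frames, num_classes) array per
    teacher, holding that teacher's frame-level posteriors for the utterance.
    Returns the index of the most confident teacher and its posteriors,
    which then serve as the soft target for distilling the student.
    """
    # Score each teacher by the mean of its per-frame best-path posteriors:
    # a hypothesis the model decoded with high certainty scores close to 1.
    scores = [float(np.max(p, axis=-1).mean()) for p in teacher_posteriors]
    elite = int(np.argmax(scores))
    return elite, teacher_posteriors[elite]

# Toy example: three out-of-domain teachers scoring one 5-frame utterance.
rng = np.random.default_rng(0)
posteriors = [rng.dirichlet(np.ones(4), size=5) for _ in range(3)]
elite, target = elitist_sample(posteriors)
print(f"elite teacher for this utterance: {elite}")
# `target` would then supervise the student, e.g. via a KL-divergence loss.

The averaging baseline the abstract argues against would instead replace the argmax with a simple mean over teacher_posteriors, blending confident and unconfident teachers alike.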


Related research

- Weight Averaging Improves Knowledge Distillation under Domain Shift (09/20/2023)
- Instance-aware Model Ensemble With Distillation For Unsupervised Domain Adaptation (11/15/2022)
- Ensembled CTR Prediction via Knowledge Distillation (11/08/2020)
- Long-Term Vehicle Localization by Recursive Knowledge Distillation (04/07/2019)
- Sparse Distillation: Speeding Up Text Classification by Using Bigger Models (10/16/2021)
- LILA-BOTI: Leveraging Isolated Letter Accumulations By Ordering Teacher Insights for Bangla Handwriting Recognition (05/23/2022)
- Ensemble diverse hypotheses and knowledge distillation for unsupervised cross-subject adaptation (04/15/2022)
