On Unsupervised Uncertainty-Driven Speech Pseudo-Label Filtering and Model Calibration

11/14/2022
by Nauman Dawalatabad et al.

Pseudo-label (PL) filtering forms a crucial part of Self-Training (ST) methods for unsupervised domain adaptation. Dropout-based Uncertainty-driven Self-Training (DUST) proceeds by first training a teacher model on source domain labeled data. The teacher model is then used to provide PLs for the unlabeled target domain data. Finally, we train a student model on the augmented labeled and pseudo-labeled data. The process is iterative, with the student becoming the teacher for the next DUST iteration. A key step preceding student model training in each DUST iteration is filtering out noisy PLs that could lead the student model astray. In DUST, we proposed a simple, effective, and theoretically sound PL filtering strategy based on the teacher model's uncertainty about its predictions on unlabeled speech utterances. We estimate the model's uncertainty by computing the disagreement among multiple samples drawn from the teacher model during inference, where noise is injected via dropout. In this work, we show that DUST's PL filtering, as originally used, may fail under severe source and target domain mismatch. We suggest several approaches to eliminate or alleviate this issue. Further, we bring insights from research on neural network model calibration to DUST and show that a well-calibrated model correlates strongly with a positive outcome of the DUST PL filtering step.
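The filtering step described above can be sketched as follows. This is an illustrative toy implementation, not the authors' code: the function names, the normalization by reference length, and the threshold value are assumptions. The idea is that the teacher first decodes an utterance deterministically (dropout off), then decodes it several more times with dropout active; if any dropout sample disagrees too much with the deterministic hypothesis (measured here by normalized edit distance), the pseudo-label is deemed uncertain and discarded.

```python
def edit_distance(a, b):
    """Standard Levenshtein distance between two token sequences,
    computed with a single rolling row of the DP table."""
    m, n = len(a), len(b)
    dp = list(range(n + 1))
    for i in range(1, m + 1):
        prev, dp[0] = dp[0], i
        for j in range(1, n + 1):
            cur = dp[j]
            dp[j] = min(
                dp[j] + 1,                      # deletion
                dp[j - 1] + 1,                  # insertion
                prev + (a[i - 1] != b[j - 1]),  # substitution / match
            )
            prev = cur
    return dp[n]


def keep_pseudo_label(reference_hyp, dropout_hyps, tau=0.3):
    """Keep a pseudo-label only if every dropout-sampled hypothesis
    stays within a normalized edit distance `tau` of the deterministic
    (dropout-off) hypothesis. `tau` is a hypothetical tuning knob."""
    ref_len = max(len(reference_hyp), 1)
    disagreements = [
        edit_distance(reference_hyp, h) / ref_len for h in dropout_hyps
    ]
    return max(disagreements) <= tau
```

For example, if the deterministic hypothesis is `["the", "cat", "sat"]` and two dropout decodes are `["the", "cat", "sat"]` and `["the", "cat", "sits"]`, the worst-case normalized disagreement is 1/3, so the utterance is kept at `tau=0.4` but filtered out at `tau=0.2`. In practice the hypotheses would come from multiple stochastic forward passes of the teacher ASR model with dropout left enabled at inference time.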


