Kaizen: Continuously improving teacher using Exponential Moving Average for semi-supervised speech recognition

06/14/2021
by Vimal Manohar, et al.

In this paper, we introduce the Kaizen framework, which uses a continuously improving teacher to generate pseudo-labels for semi-supervised training. The teacher model is updated as the exponential moving average of the student model parameters. This can be seen as a continuous version of the iterative pseudo-labeling approach to semi-supervised training. The approach is applicable to different training criteria, and in this paper we demonstrate it for frame-level hybrid hidden Markov model - deep neural network (HMM-DNN) models and sequence-level connectionist temporal classification (CTC) based models. The proposed approach shows more than 10% word error rate (WER) reduction over standard teacher-student training and more than 50% relative WER reduction over a 10-hour supervised baseline when using large-scale, realistic unsupervised public videos in UK English and Italian.
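To make the teacher update concrete, here is a minimal sketch of an exponential moving average step over model parameters. It is an illustration only, not the authors' implementation: the function name `ema_update`, the dictionary-of-lists parameter layout, the decay value of 0.5, and the choice to update after every student step are all assumptions made for this example.

```python
from typing import Dict, List

# Hypothetical parameter container: parameter name -> flat list of values.
Params = Dict[str, List[float]]


def ema_update(teacher: Params, student: Params, decay: float) -> Params:
    """Return new teacher parameters: decay * teacher + (1 - decay) * student."""
    if not 0.0 <= decay <= 1.0:
        raise ValueError("decay must lie in [0, 1]")
    return {
        name: [
            decay * t + (1.0 - decay) * s
            for t, s in zip(teacher[name], student[name])
        ]
        for name in teacher
    }


# Toy run: the teacher starts as a copy of the student.
student: Params = {"w": [0.0, 0.0]}
teacher: Params = {"w": [0.0, 0.0]}

for _ in range(3):
    # Stand-in for one training step on the student (assumed, not from the paper).
    student = {"w": [v + 1.0 for v in student["w"]]}
    # The teacher follows the student smoothly instead of being replaced outright.
    teacher = ema_update(teacher, student, decay=0.5)

print(teacher["w"])  # [2.125, 2.125]
```

With a decay close to 1 the teacher changes slowly and averages over many student updates; with a decay of 0 the teacher is simply a copy of the current student. In a real system the teacher would also produce the pseudo-labels that the student trains on, which this sketch leaves out.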

research
03/22/2022

Pseudo Label Is Better Than Human Label

State-of-the-art automatic speech recognition (ASR) systems are trained ...
research
10/04/2021

Spatial Ensemble: a Novel Model Smoothing Mechanism for Student-Teacher Framework

Model smoothing is of central importance for obtaining a reliable teache...
research
01/21/2021

Exponential Moving Average Normalization for Self-supervised and Semi-supervised Learning

We present a plug-in replacement for batch normalization (BN) called exp...
research
09/04/2019

Snowball: Iterative Model Evolution and Confident Sample Discovery for Semi-Supervised Learning on Very Small Labeled Datasets

In this work, we develop a joint sample discovery and iterative model ev...
research
07/25/2023

TMR-RD: Training-based Model Refinement and Representation Disagreement for Semi-Supervised Object Detection

Semi-supervised object detection (SSOD) can incorporate limited labeled ...
research
12/12/2020

Teacher-Student Asynchronous Learning with Multi-Source Consistency for Facial Landmark Detection

Due to the high annotation cost of large-scale facial landmark detection...
research
03/17/2020

Teacher-Student chain for efficient semi-supervised histology image classification

Deep learning shows great potential for the domain of digital pathology....