Semi-supervised ASR by End-to-end Self-training

01/24/2020
by Yang Chen, et al.

While deep-learning-based end-to-end automatic speech recognition (ASR) systems have greatly simplified modeling pipelines, they suffer from data sparsity. In this work, we propose a self-training method with an end-to-end system for semi-supervised ASR. Starting from a Connectionist Temporal Classification (CTC) system trained on the supervised data, we iteratively generate pseudo-labels on a mini-batch of unsupervised utterances with the current model, and use the pseudo-labels to augment the supervised data for an immediate model update. Our method retains the simplicity of end-to-end ASR systems and can be seen as performing alternating optimization over a well-defined learning objective. We also empirically investigate the effect of data augmentation, the decoding beam size for pseudo-label generation, and the freshness of pseudo-labels. On a commonly used semi-supervised ASR setting with the WSJ corpus, our method gives a 14.4% relative WER improvement over a carefully trained base system with data augmentation, reducing the performance gap between the base system and the oracle system by 50%.
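The self-training loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `posterior_fn`, `update_fn`, and `self_train` are hypothetical, and pseudo-labels are generated here by best-path (greedy) CTC decoding as a simple stand-in for the beam-search decoding the abstract mentions.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

Posteriors = Sequence[Sequence[float]]  # one row of label scores per frame
Labels = List[int]


def ctc_greedy_decode(posteriors: Posteriors, blank: int = 0) -> Labels:
    """Best-path CTC decoding: argmax per frame, merge repeats, drop blanks."""
    labels: Labels = []
    prev = None
    for frame in posteriors:
        best = max(range(len(frame)), key=frame.__getitem__)
        # Emit a label only when it differs from the previous frame's
        # argmax and is not the blank symbol.
        if best != prev and best != blank:
            labels.append(best)
        prev = best
    return labels


def self_train(
    posterior_fn: Callable[[object], Posteriors],
    update_fn: Callable[[List[Tuple[object, Labels]]], None],
    supervised_batches: Iterable[Sequence[Tuple[object, Labels]]],
    unsupervised_batches: Iterable[Sequence[object]],
) -> int:
    """Alternate pseudo-labeling and model updates, one mini-batch at a time.

    posterior_fn (hypothetical): maps an utterance to per-frame label
        posteriors under the *current* model.
    update_fn (hypothetical): performs one model update on a list of
        (utterance, labels) pairs.
    Returns the number of update steps performed.
    """
    steps = 0
    for sup_batch, unsup_batch in zip(supervised_batches, unsupervised_batches):
        # Step 1: label the unsupervised mini-batch with the current model.
        pseudo = [(x, ctc_greedy_decode(posterior_fn(x))) for x in unsup_batch]
        # Step 2: update immediately on supervised plus pseudo-labeled data,
        # so the next batch is labeled by the refreshed model.
        update_fn(list(sup_batch) + pseudo)
        steps += 1
    return steps
```

In a real system, `posterior_fn` would wrap a forward pass of the CTC acoustic model and `update_fn` would take one gradient step on the CTC loss; both are left abstract here because the abstract does not specify them.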

research
06/16/2021

Momentum Pseudo-Labeling for Semi-Supervised Speech Recognition

Pseudo-labeling (PL) has been shown to be effective in semi-supervised a...
research
07/07/2021

End-to-End Rich Transcription-Style Automatic Speech Recognition with Semi-Supervised Learning

We propose a semi-supervised learning method for building end-to-end ric...
research
10/27/2022

Iterative pseudo-forced alignment by acoustic CTC loss for self-supervised ASR domain adaptation

High-quality data labeling from specific domains is costly and human tim...
research
07/27/2020

Semi-Supervised Learning with Data Augmentation for End-to-End ASR

In this paper, we apply Semi-Supervised Learning (SSL) along with Data A...
research
08/09/2019

Repetitive Reprediction Deep Decipher for Semi-Supervised Learning

Most recent semi-supervised deep learning (deep SSL) methods used a simi...
research
09/19/2019

Self-Training for End-to-End Speech Recognition

We revisit self-training in the context of end-to-end speech recognition...
research
02/18/2022

R2-D2: Repetitive Reprediction Deep Decipher for Semi-Supervised Deep Learning

Most recent semi-supervised deep learning (deep SSL) methods used a simi...
