Censer: Curriculum Semi-supervised Learning for Speech Recognition Based on Self-supervised Pre-training

06/16/2022
by Bowen Zhang, et al.

Recent studies have shown that the benefits provided by self-supervised pre-training and self-training (pseudo-labeling) are complementary. However, semi-supervised fine-tuning strategies under the pre-training framework remain insufficiently studied. Moreover, modern semi-supervised speech recognition algorithms either treat unlabeled data indiscriminately or filter out noisy samples with a confidence threshold; the dissimilarities among different unlabeled samples are often ignored. In this paper, we propose Censer, a semi-supervised speech recognition algorithm based on self-supervised pre-training that maximizes the utilization of unlabeled data. The pre-training stage of Censer adopts wav2vec2.0, and the fine-tuning stage employs an improved semi-supervised learning algorithm derived from slimIPL, which leverages unlabeled data progressively according to the quality of their pseudo labels. We also incorporate a temporal pseudo label pool and an exponential moving average to control the update frequency of pseudo labels and to avoid model divergence. Experimental results on the Libri-Light and LibriSpeech datasets show that our proposed method achieves better performance than existing approaches while being more unified.
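For intuition, the sketch below (a minimal illustration, not the authors' released implementation) shows the two mechanisms named in the abstract: an exponential moving average of the student weights used to generate pseudo labels, and a temporal pseudo-label pool whose sampling threshold drives an easy-to-hard curriculum over the unlabeled data. All class and parameter names (EMATeacher, PseudoLabelPool, decay, quality_floor) are illustrative assumptions.

```python
import copy
import random
from collections import deque

import torch


class EMATeacher:
    """Exponential moving average (EMA) copy of the student model.

    The EMA weights are used to generate pseudo labels; keeping them a
    slowly moving average of the student helps avoid model divergence.
    """

    def __init__(self, student: torch.nn.Module, decay: float = 0.999):
        self.decay = decay
        self.model = copy.deepcopy(student).eval()
        for p in self.model.parameters():
            p.requires_grad_(False)

    @torch.no_grad()
    def update(self, student: torch.nn.Module) -> None:
        # ema_w <- decay * ema_w + (1 - decay) * student_w
        for ema_p, p in zip(self.model.parameters(), student.parameters()):
            ema_p.mul_(self.decay).add_(p, alpha=1.0 - self.decay)


class PseudoLabelPool:
    """Temporal pool of (utterance id, pseudo label, quality score) entries.

    Pseudo labels stay in the pool for several updates before being
    refreshed, which caps how often they change; sampling with a quality
    threshold that is relaxed over time yields an easy-to-hard curriculum.
    """

    def __init__(self, capacity: int = 1000):
        self.entries = deque(maxlen=capacity)

    def add(self, utt_id: str, pseudo_label: list, quality: float) -> None:
        self.entries.append((utt_id, pseudo_label, quality))

    def sample(self, batch_size: int, quality_floor: float) -> list:
        # Curriculum step: only draw entries whose estimated quality
        # (e.g. model confidence) is above the current floor; the caller
        # lowers the floor as training progresses.
        eligible = [e for e in self.entries if e[2] >= quality_floor]
        return random.sample(eligible, min(batch_size, len(eligible)))
```

In a slimIPL-style loop one would typically transcribe fresh unlabeled batches with the EMA teacher, push the transcriptions and their confidence scores into the pool, train the student on a mix of labeled data and pool samples, and call EMATeacher.update after each optimizer step; the exact schedule used by Censer is described in the full paper.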


