Improving Pseudo-label Training For End-to-end Speech Recognition Using Gradient Mask

10/08/2021
by   Shaoshi Ling, et al.

In the recent trend of semi-supervised speech recognition, both self-supervised representation learning and pseudo-labeling have shown promising results. In this paper, we propose a novel approach that combines their ideas for end-to-end speech recognition models. Without any extra loss function, we use the Gradient Mask to optimize the model when training on pseudo-labels. This method forces the speech recognition model to predict from masked input, so that it learns strong acoustic representations and training becomes robust to label noise. In our semi-supervised experiments, the method improves model performance when training on pseudo-labels, and it achieves competitive results compared with other semi-supervised approaches on the Librispeech 100 hours experiments.
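The abstract does not spell out how the Gradient Mask is implemented. One plausible reading is that some input frames are masked and only the gradients at those masked positions are kept when training on pseudo-labels. The sketch below illustrates that reading on toy data in plain Python. The function names, the span-sampling scheme, and the parameters `span_len` and `start_prob` are assumptions made for illustration and are not taken from the paper.

```python
import random


def sample_span_mask(num_frames, span_len=3, start_prob=0.2, seed=0):
    """Return one boolean per frame; True marks a masked frame.

    Span sampling is a hypothetical choice, not the paper's recipe.
    """
    rng = random.Random(seed)
    mask = [False] * num_frames
    for start in range(num_frames):
        if rng.random() < start_prob:
            for t in range(start, min(start + span_len, num_frames)):
                mask[t] = True
    return mask


def apply_input_mask(frames, mask):
    """Replace masked frames with zero vectors; copy the others unchanged."""
    return [[0.0] * len(f) if m else list(f) for f, m in zip(frames, mask)]


def apply_gradient_mask(frame_grads, mask):
    """Keep per-frame gradients only at masked positions; zero the rest."""
    return [list(g) if m else [0.0] * len(g) for g, m in zip(frame_grads, mask)]


# Toy input: 8 frames with 2 features each.
frames = [[float(t), float(t) + 0.5] for t in range(8)]
mask = sample_span_mask(len(frames))
masked_frames = apply_input_mask(frames, mask)

# Stand-in for the per-frame gradients a real backward pass would produce.
fake_grads = [[1.0, -1.0] for _ in frames]
masked_grads = apply_gradient_mask(fake_grads, mask)
```

In a real model the gradient masking would be applied to the encoder's per-frame gradients during backpropagation rather than to a hand-written list; the toy gradients here only show which positions would contribute to the update.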

research
09/19/2019

Self-Training for End-to-End Speech Recognition

We revisit self-training in the context of end-to-end speech recognition...
research
12/05/2022

SoftCTC – Semi-Supervised Learning for Text Recognition using Soft Pseudo-Labels

This paper explores semi-supervised training for sequence tasks, such as...
research
10/20/2022

Improving Semi-supervised End-to-end Automatic Speech Recognition using CycleGAN and Inter-domain Losses

We propose a novel method that combines CycleGAN and inter-domain losses...
research
04/02/2022

Leveraging Phone Mask Training for Phonetic-Reduction-Robust E2E Uyghur Speech Recognition

In Uyghur speech, consonant and vowel reduction are often encountered, e...
research
05/27/2022

Contrastive Siamese Network for Semi-supervised Speech Recognition

This paper introduces contrastive siamese (c-siam) network, an architect...
research
08/12/2023

Alternative Pseudo-Labeling for Semi-Supervised Automatic Speech Recognition

When labeled data is insufficient, semi-supervised learning with the pse...
research
03/29/2022

Improving Mispronunciation Detection with Wav2vec2-based Momentum Pseudo-Labeling for Accentedness and Intelligibility Assessment

Current leading mispronunciation detection and diagnosis (MDD) systems a...
