Log In Sign Up

Pseudo-Labeling for Massively Multilingual Speech Recognition

by   Loren Lugosch, et al.

Semi-supervised learning through pseudo-labeling has become a staple of state-of-the-art monolingual speech recognition systems. In this work, we extend pseudo-labeling to massively multilingual speech recognition with 60 languages. We propose a simple pseudo-labeling recipe that works well even with low-resource languages: train a supervised multilingual model, fine-tune it with semi-supervised learning on a target language, generate pseudo-labels for that language, and train a final model using pseudo-labels for all languages, either from scratch or by fine-tuning. Experiments on the labeled Common Voice and unlabeled VoxPopuli datasets show that our recipe can yield a model with better performance for many languages that also transfers well to LibriSpeech.


page 1

page 2

page 3

page 4


Semi-Supervised Learning Based on Reference Model for Low-resource TTS

Most previous neural text-to-speech (TTS) methods are mainly based on su...

Towards Lifelong Learning of Multilingual Text-To-Speech Synthesis

This work presents a lifelong learning approach to train a multilingual ...

Filter and evolve: progressive pseudo label refining for semi-supervised automatic speech recognition

Fine tuning self supervised pretrained models using pseudo labels can ef...

SoftCTC x2013 Semi-Supervised Learning for Text Recognition using Soft Pseudo-Labels

This paper explores semi-supervised training for sequence tasks, such as...