Pseudo-Labeling for Massively Multilingual Speech Recognition

10/30/2021
by   Loren Lugosch, et al.
0

Semi-supervised learning through pseudo-labeling has become a staple of state-of-the-art monolingual speech recognition systems. In this work, we extend pseudo-labeling to massively multilingual speech recognition with 60 languages. We propose a simple pseudo-labeling recipe that works well even with low-resource languages: train a supervised multilingual model, fine-tune it with semi-supervised learning on a target language, generate pseudo-labels for that language, and train a final model using pseudo-labels for all languages, either from scratch or by fine-tuning. Experiments on the labeled Common Voice and unlabeled VoxPopuli datasets show that our recipe can yield a model with better performance for many languages that also transfers well to LibriSpeech.

READ FULL TEXT
POST COMMENT

Comments

There are no comments yet.

Authors

page 1

page 2

page 3

page 4

05/19/2020

Iterative Pseudo-Labeling for Speech Recognition

Pseudo-labeling has recently shown promise in end-to-end automatic speec...
10/09/2021

Towards Lifelong Learning of Multilingual Text-To-Speech Synthesis

This work presents a lifelong learning approach to train a multilingual ...
01/02/2021

VoxPopuli: A Large-Scale Multilingual Speech Corpus for Representation Learning, Semi-Supervised Learning and Interpretation

We introduce VoxPopuli, a large-scale multilingual corpus providing 100K...
03/09/2021

Contrastive Semi-supervised Learning for ASR

Pseudo-labeling is the most adopted method for pre-training automatic sp...
08/31/2019

Small and Practical BERT Models for Sequence Labeling

We propose a practical scheme to train a single multilingual sequence la...
11/20/2009

Likelihood-based semi-supervised model selection with applications to speech processing

In conventional supervised pattern recognition tasks, model selection is...
04/01/2021

Multiview Pseudo-Labeling for Semi-supervised Learning from Video

We present a multiview pseudo-labeling approach to video learning, a nov...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.