Large-Scale Self- and Semi-Supervised Learning for Speech Translation

04/14/2021
by   Changhan Wang, et al.
0

In this paper, we improve speech translation (ST) through effectively leveraging large quantities of unlabeled speech and text data in different and complementary ways. We explore both pretraining and self-training by using the large Libri-Light speech audio corpus and language modeling with CommonCrawl. Our experiments improve over the previous state of the art by 2.6 BLEU on average on all four considered CoVoST 2 language pairs via a simple recipe of combining wav2vec 2.0 pretraining, a single iteration of self-training and decoding with a language model. Different to existing work, our approach does not leverage any other supervision than ST data. Code and models will be publicly released.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
11/17/2021

XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale

This paper presents XLS-R, a large-scale model for cross-lingual speech ...
research
11/17/2020

Neural Semi-supervised Learning for Text Classification Under Large-Scale Pretraining

The goal of semi-supervised learning is to utilize the unlabeled, in-dom...
research
09/02/2021

Better Self-training for Image Classification through Self-supervision

Self-training is a simple semi-supervised learning approach: Unlabelled ...
research
12/15/2021

Textless Speech-to-Speech Translation on Real Data

We present a textless speech-to-speech translation (S2ST) system that ca...
research
09/30/2021

Semi-Supervised Text Classification via Self-Pretraining

We present a neural semi-supervised learning model termed Self-Pretraini...
research
05/22/2023

Duplex Diffusion Models Improve Speech-to-Speech Translation

Speech-to-speech translation is a typical sequence-to-sequence learning ...
research
10/29/2021

Combining Unsupervised and Text Augmented Semi-Supervised Learning for Low Resourced Autoregressive Speech Recognition

Recent advances in unsupervised representation learning have demonstrate...

Please sign up or login with your details

Forgot password? Click here to reset