Self-Supervised Representations Improve End-to-End Speech Translation

06/22/2020
by Anne Wu, et al.

End-to-end speech-to-text translation can provide a simpler and smaller system but faces the challenge of data scarcity. Pre-training methods can leverage unlabeled data and have been shown to be effective in data-scarce settings. In this work, we explore whether self-supervised pre-trained speech representations can benefit the speech translation task in both high- and low-resource settings, whether they transfer well to other languages, and whether they can be effectively combined with other common methods that help improve low-resource end-to-end speech translation, such as using a pre-trained high-resource speech recognition system. We demonstrate that self-supervised pre-trained features consistently improve translation performance, and that cross-lingual transfer allows the approach to extend to a variety of languages with little or no tuning.
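The core idea is to replace standard filterbank inputs with features from a self-supervised speech model and feed them to an end-to-end translation encoder. Below is a minimal sketch of that pipeline, assuming torchaudio's wav2vec 2.0 bundle as a stand-in for the paper's self-supervised features and a generic PyTorch encoder-decoder; the file name and model choices are illustrative assumptions, not the authors' released code.

```python
# Minimal sketch: swap filterbank inputs for frozen self-supervised speech
# features before an end-to-end speech translation model.
# Assumptions: torchaudio's WAV2VEC2_BASE stands in for the paper's
# wav2vec-style features; nn.Transformer is a generic placeholder model;
# "speech.wav" is a hypothetical input file.
import torch
import torchaudio

bundle = torchaudio.pipelines.WAV2VEC2_BASE       # self-supervised, pre-trained
ssl_model = bundle.get_model().eval()             # used as a frozen feature extractor

waveform, sr = torchaudio.load("speech.wav")
waveform = waveform.mean(dim=0, keepdim=True)     # downmix to mono: (1, time)
if sr != bundle.sample_rate:                      # the bundle expects 16 kHz audio
    waveform = torchaudio.functional.resample(waveform, sr, bundle.sample_rate)

with torch.no_grad():
    feats, _ = ssl_model.extract_features(waveform)  # list of per-layer features
    encoder_input = feats[-1]                        # (batch, frames, 768)

# A generic seq2seq translation model consumes these features in place of
# filterbanks; only the encoder's input dimension changes.
st_model = torch.nn.Transformer(d_model=768, batch_first=True)
dummy_targets = torch.zeros(1, 10, 768)           # placeholder target embeddings
out = st_model(encoder_input, dummy_targets)
print(encoder_input.shape, out.shape)
```

In this setup the self-supervised extractor is trained only on unlabeled audio, so the scarce parallel speech-translation data is needed only for the (much smaller) translation model, which is what makes the approach attractive in low-resource settings.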

Related research

05/04/2022
ON-TRAC Consortium Systems for the IWSLT 2022 Dialect and Low-resource Speech Translation Tasks
This paper describes the ON-TRAC Consortium translation systems develope...

06/02/2023
DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model
Multilingual self-supervised speech representation models have greatly e...

12/16/2021
Lacuna Reconstruction: Self-supervised Pre-training for Low-Resource Historical Document Transcription
We present a self-supervised pre-training approach for learning rich vis...

06/04/2020
CSTNet: Contrastive Speech Translation Network for Self-Supervised Speech Representation Learning
More than half of the 7,000 languages in the world are in imminent dange...

06/10/2021
PARP: Prune, Adjust and Re-Prune for Self-Supervised Speech Recognition
Recent work on speech self-supervised learning (speech SSL) demonstrated...

06/04/2019
Exploring Phoneme-Level Speech Representations for End-to-End Speech Translation
Previous work on end-to-end translation from speech has primarily used f...

04/05/2022
Combining Spectral and Self-Supervised Features for Low Resource Speech Recognition and Translation
Self-Supervised Learning (SSL) models have been successfully applied in ...
