One-To-Many Multilingual End-to-end Speech Translation

10/08/2019
by Mattia Antonino Di Gangi, et al.

Training end-to-end neural models for spoken language translation (SLT) still has to contend with extreme data scarcity. The existing SLT parallel corpora are orders of magnitude smaller than those available for the closely related tasks of automatic speech recognition (ASR) and machine translation (MT), which usually comprise tens of millions of instances. To cope with this data paucity, in this paper we explore the effectiveness of transfer learning in end-to-end SLT by presenting a multilingual approach to the task. Multilingual solutions are widely studied in MT and usually rely on “target forcing”, in which multilingual parallel data are combined to train a single model by prepending to each input sequence a language token that specifies the target language. However, our experiments show that MT-like target forcing, used as is in speech translation, is not effective in discriminating among the target languages. We therefore propose a variant that uses target-language embeddings to shift the input representations into different portions of the space according to the language, so as to better support the production of output in the desired target language. Our experiments on end-to-end SLT from English into six languages show notable improvements when translating into similar languages, especially when these are supported by scarce data. Further improvements (up to +2.5 BLEU points) are obtained when using English ASR data as an additional language.
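The two conditioning strategies contrasted in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the `<2xx>` token format, the three-dimensional frames, and the fixed embedding values are all assumptions made for illustration, and the shift is modeled here as simple element-wise addition of one vector per target language to every input frame.

```python
from typing import Dict, List

Frame = List[float]  # one acoustic feature vector (toy dimensionality)


def prepend_language_token(tokens: List[str], target_lang: str) -> List[str]:
    """MT-style target forcing: prepend a token naming the target language.

    The "<2xx>" token format is an illustrative assumption.
    """
    return [f"<2{target_lang}>"] + tokens


def shift_frames(frames: List[Frame], lang_embedding: Frame) -> List[Frame]:
    """Shift every input frame by the same target-language embedding.

    Element-wise addition is an assumption about how the shift is applied.
    """
    if any(len(frame) != len(lang_embedding) for frame in frames):
        raise ValueError("frame and embedding dimensions must match")
    return [[x + e for x, e in zip(frame, lang_embedding)] for frame in frames]


# Hypothetical fixed embeddings; in a trained model these would be learned.
LANG_EMBEDDINGS: Dict[str, Frame] = {
    "de": [1.0, 0.0, 0.0],
    "it": [0.0, 1.0, 0.0],
}

# Text input: the language is signaled by a single extra token.
tokens = prepend_language_token(["hello", "world"], "it")

# Speech input: the language is signaled in every frame of the sequence.
frames = [[0.5, 0.5, 0.5], [0.25, 0.75, 1.0]]
shifted_de = shift_frames(frames, LANG_EMBEDDINGS["de"])
shifted_it = shift_frames(frames, LANG_EMBEDDINGS["it"])
```

In the token-based variant the language signal occupies a single position in the input, whereas in the shifting variant the same utterance is moved to a different region of the space for each target language, and the signal is present in every frame.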


research
10/01/2019

Multilingual End-to-End Speech Translation

In this paper, we propose a simple yet effective framework for multiling...
research
07/13/2021

Zero-shot Speech Translation

Speech Translation (ST) is the task of translating speech in one languag...
research
06/08/2023

KIT's Multilingual Speech Translation System for IWSLT 2023

Many existing speech translation benchmarks focus on native-English spee...
research
04/10/2020

Scalable Multilingual Frontend for TTS

This paper describes progress towards making a Neural Text-to-Speech (TT...
research
11/11/2019

Data Efficient Direct Speech-to-Text Translation with Modality Agnostic Meta-Learning

End-to-end Speech Translation (ST) models have several advantages such a...
research
09/20/2021

MeetDot: Videoconferencing with Live Translation Captions

We present MeetDot, a videoconferencing system with live translation cap...
research
10/30/2019

ON-TRAC Consortium End-to-End Speech Translation Systems for the IWSLT 2019 Shared Task

This paper describes the ON-TRAC Consortium translation systems develope...
