Speech Translation and the End-to-End Promise: Taking Stock of Where We Are

04/14/2020
by Matthias Sperber et al.

Over its three-decade history, speech translation has experienced several shifts in its primary research themes, moving from loosely coupled cascades of speech recognition and machine translation, to exploring questions of tight coupling, and finally to end-to-end models that have recently attracted much attention. This paper provides a brief survey of these developments, along with a discussion of the main challenges of traditional approaches, which stem from committing to intermediate representations from the speech recognizer and from training cascaded models separately towards different objectives. Recent end-to-end modeling techniques promise a principled way of overcoming these issues by allowing joint training of all model components and removing the need for explicit intermediate representations. However, a closer look reveals that many end-to-end models fall short of solving these issues because of compromises made to address data scarcity. This paper provides a unifying categorization and nomenclature that covers both traditional and recent approaches and that may help researchers by highlighting both trade-offs and open research questions.
