Listen and Translate: A Proof of Concept for End-to-End Speech-to-Text Translation

12/06/2016
by   Alexandre Berard, et al.

This paper presents a first attempt at building an end-to-end speech-to-text translation system that uses no source language transcription during learning or decoding. We propose a model for direct speech-to-text translation, which gives promising results on a small French-English synthetic corpus. Relaxing the need for source language transcription would drastically change the data collection methodology in speech translation, especially in under-resourced scenarios. For instance, in the former DARPA TRANSTAC project (speech translation from spoken Arabic dialects), a large effort was devoted to collecting speech transcripts, and obtaining transcripts often first required a detailed transcription guide for languages with little standardized spelling. If end-to-end approaches to speech-to-text translation prove successful, one might instead collect data by asking bilingual speakers to utter speech in the source language directly from target language text utterances. Such an approach has the advantage of being applicable to any unwritten (source) language.

research
02/12/2018

End-to-End Automatic Speech Translation of Audiobooks

We investigate end-to-end speech-to-text translation on a corpus of audi...
research
11/04/2018

Towards Unsupervised Speech-to-Text Translation

We present a framework for building speech-to-text translation (ST) syst...
research
11/11/2022

Speech-to-Speech Translation For A Real-world Unwritten Language

We study speech-to-speech translation (S2ST) that translates speech from...
research
11/18/2022

Dialogs Re-enacted Across Languages

To support machine learning of cross-language prosodic mappings and othe...
research
07/09/2023

Towards cross-language prosody transfer for dialog

Speech-to-speech translation systems today do not adequately support use...
research
09/21/2020

SDST: Successive Decoding for Speech-to-text Translation

End-to-end speech-to-text translation (ST), which directly translates th...
research
03/29/2022

Representing `how you say' with `what you say': English corpus of focused speech and text reflecting corresponding implications

In speech communication, how something is said (paralinguistic informati...