Towards Fluent Translations from Disfluent Speech

11/07/2018
by Elizabeth Salesky, et al.

When translating from speech, special consideration for conversational speech phenomena such as disfluencies is necessary. Most machine translation training data consists of well-formed written texts, causing issues when translating spontaneous speech. Previous work has introduced an intermediate step between speech recognition (ASR) and machine translation (MT) to remove disfluencies, making the data better-matched to typical translation text and significantly improving performance. However, with the rise of end-to-end speech translation systems, this intermediate step must be incorporated into the sequence-to-sequence architecture. Further, though translated speech datasets exist, they are typically news or rehearsed speech without many disfluencies (e.g. TED), or the disfluencies are translated into the references (e.g. Fisher). To generate clean translations from disfluent speech, cleaned references are necessary for evaluation. We introduce a corpus of cleaned target data for the Fisher Spanish-English dataset for this task. We compare how different architectures handle disfluencies and provide a baseline for removing disfluencies in end-to-end translation.
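To make the cascaded setup concrete, the intermediate cleaning step between ASR and MT can be pictured as a function from a disfluent transcript to a cleaned one. The sketch below is a toy, rule-based stand-in that only drops filler tokens and collapses immediate word repetitions; it is not the method used in the previous work or in this paper. The filler inventory `FILLERS` and the function name `remove_disfluencies` are assumptions made for illustration.

```python
# Toy disfluency filter for an ASR -> clean -> MT cascade.
# Assumption: fillers and immediate repetitions are the only
# phenomena handled; real disfluencies are considerably richer.

# Hypothetical filler inventory, chosen for illustration only.
FILLERS = {"uh", "um", "eh", "mm", "hmm"}

PUNCT = ",.?!"


def remove_disfluencies(transcript: str) -> str:
    """Drop filler tokens and collapse immediate word repetitions."""
    cleaned = []
    for token in transcript.lower().split():
        word = token.strip(PUNCT)
        # Skip filler words such as "uh" or "um".
        if word in FILLERS:
            continue
        # Skip a word that immediately repeats the previous kept word.
        if cleaned and cleaned[-1].strip(PUNCT) == word:
            continue
        cleaned.append(token)
    return " ".join(cleaned)


if __name__ == "__main__":
    print(remove_disfluencies("uh I I want um to to go"))
```

In a cascade, the output of such a step would be passed to the MT system in place of the raw ASR transcript. An end-to-end model has no such text interface, which is why the abstract argues that the cleaning has to be learned inside the sequence-to-sequence model from cleaned references.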


