Assessing the Tolerance of Neural Machine Translation Systems Against Speech Recognition Errors

04/24/2019
by Nicholas Ruiz, et al.

Machine translation systems are conventionally trained on textual resources that do not model phenomena that occur in spoken language. While the evaluation of neural machine translation systems on textual inputs is actively researched in the literature, little is known about the complexities of translating spoken language data with neural models. We introduce and motivate problems that arise when translating automatic speech recognition (ASR) outputs with neural machine translation (NMT) systems. We test the robustness of sentence encoding approaches for NMT encoder-decoder modeling, focusing on word-based over byte-pair encoding. We compare the translation of utterances containing ASR errors in state-of-the-art NMT encoder-decoder systems against a strong phrase-based machine translation baseline, in order to better understand which phenomena present in ASR outputs are better represented under the NMT framework than under approaches that represent translation as a linear model.

research
10/21/2020

Sentence Boundary Augmentation For Neural Machine Translation Robustness

Neural Machine Translation (NMT) models have demonstrated strong state o...
research
10/15/2018

Robust Neural Machine Translation with Joint Textual and Phonetic Embedding

Neural machine translation (NMT) is notoriously sensitive to noises, but...
research
09/25/2019

Breaking the Data Barrier: Towards Robust Speech Translation via Adversarial Stability Training

In a pipeline speech translation system, automatic speech recognition (A...
research
10/22/2019

Robust Neural Machine Translation for Clean and Noisy Speech Transcripts

Neural machine translation models have shown to achieve high quality whe...
research
05/31/2023

How Does Pretraining Improve Discourse-Aware Translation?

Pretrained language models (PLMs) have produced substantial improvements...
research
08/27/2021

Secoco: Self-Correcting Encoding for Neural Machine Translation

This paper presents Self-correcting Encoding (Secoco), a framework that ...
research
04/25/2018

On the Evaluation of Semantic Phenomena in Neural Machine Translation Using Natural Language Inference

We propose a process for investigating the extent to which sentence repr...
