DeepAI AI Chat
Log In Sign Up

Assessing the Tolerance of Neural Machine Translation Systems Against Speech Recognition Errors

by   Nicholas Ruiz, et al.
Fondazione Bruno Kessler

Machine translation systems are conventionally trained on textual resources that do not model phenomena that occur in spoken language. While the evaluation of neural machine translation systems on textual inputs is actively researched in the literature , little has been discovered about the complexities of translating spoken language data with neural models. We introduce and motivate interesting problems one faces when considering the translation of automatic speech recognition (ASR) outputs on neural machine translation (NMT) systems. We test the robustness of sentence encoding approaches for NMT encoder-decoder modeling, focusing on word-based over byte-pair encoding. We compare the translation of utterances containing ASR errors in state-of-the-art NMT encoder-decoder systems against a strong phrase-based machine translation baseline in order to better understand which phenomena present in ASR outputs are better represented under the NMT framework than approaches that represent translation as a linear model.


page 1

page 2

page 3

page 4


Sentence Boundary Augmentation For Neural Machine Translation Robustness

Neural Machine Translation (NMT) models have demonstrated strong state o...

Robust Neural Machine Translation with Joint Textual and Phonetic Embedding

Neural machine translation (NMT) is notoriously sensitive to noises, but...

Breaking the Data Barrier: Towards Robust Speech Translation via Adversarial Stability Training

In a pipeline speech translation system, automatic speech recognition (A...

Robust Neural Machine Translation for Clean and Noisy Speech Transcripts

Neural machine translation models have shown to achieve high quality whe...

How Does Pretraining Improve Discourse-Aware Translation?

Pretrained language models (PLMs) have produced substantial improvements...

Secoco: Self-Correcting Encoding for Neural Machine Translation

This paper presents Self-correcting Encoding (Secoco), a framework that ...

On the Evaluation of Semantic Phenomena in Neural Machine Translation Using Natural Language Inference

We propose a process for investigating the extent to which sentence repr...