On the Impact of Various Types of Noise on Neural Machine Translation

05/31/2018
by Huda Khayrallah, et al.

We examine how various types of noise in the parallel training data affect the quality of neural machine translation systems. We create five types of artificial noise and analyze how they degrade performance in neural and statistical machine translation. We find that neural models are generally harmed more by noise than statistical models. For one especially egregious type of noise, the neural models learn to simply copy the input sentence.

research
06/12/2017

Six Challenges for Neural Machine Translation

We explore six challenges for neural machine translation: domain mismatc...
research
07/29/2017

Curriculum Learning and Minibatch Bucketing in Neural Machine Translation

We examine the effects of particular orderings of sentence pairs on the ...
research
02/28/2020

Robust Unsupervised Neural Machine Translation with Adversarial Training

Unsupervised neural machine translation (UNMT) has recently attracted gr...
research
10/19/2018

Impact of Corpora Quality on Neural Machine Translation

Large parallel corpora that are automatically obtained from the web, doc...
research
09/11/2020

Robust Neural Machine Translation: Modeling Orthographic and Interpunctual Variation

Neural machine translation systems typically are trained on curated corp...
research
05/31/2021

Beyond Noise: Mitigating the Impact of Fine-grained Semantic Divergences on Neural Machine Translation

While it has been shown that Neural Machine Translation (NMT) is highly ...
research
02/23/2019

Augmenting Neural Machine Translation with Knowledge Graphs

While neural networks have been used extensively to make substantial pro...
