Neural Machine Translation of Text from Non-Native Speakers

08/19/2018
by Alison Lui et al.

Neural Machine Translation (NMT) systems are known to degrade when confronted with noisy input, especially when trained only on clean data. In this paper, we show that augmenting the training data with sentences containing artificially introduced grammatical errors makes the system more robust to such errors. In combination with an automatic grammar error correction system, we recover 1.5 of the 2.4 BLEU points lost due to grammatical errors. We also present a set of Spanish translations of the JFLEG grammar error correction corpus, which allows testing NMT robustness against real grammatical errors.
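The augmentation idea above can be sketched as a noising pipeline: apply rule-based "learner-style" errors to clean source sentences and pair each noised source with the original clean target. The error rules below (article deletion, preposition substitution) are illustrative assumptions for common non-native errors, not the paper's exact error taxonomy or probabilities.

```python
import random

# Hypothetical error rules approximating common non-native mistakes.
ARTICLES = {"a", "an", "the"}
PREPOSITIONS = {"in", "on", "at", "of", "for", "to", "with"}

def drop_articles(tokens, p=0.3, rng=random):
    """Randomly delete articles, a frequent learner error."""
    return [t for t in tokens if not (t.lower() in ARTICLES and rng.random() < p)]

def swap_prepositions(tokens, p=0.3, rng=random):
    """Randomly replace a preposition with a different one."""
    out = []
    for t in tokens:
        if t.lower() in PREPOSITIONS and rng.random() < p:
            out.append(rng.choice(sorted(PREPOSITIONS - {t.lower()})))
        else:
            out.append(t)
    return out

def add_noise(sentence, rng=random):
    """Apply the error rules to one clean source sentence."""
    tokens = sentence.split()
    tokens = drop_articles(tokens, rng=rng)
    tokens = swap_prepositions(tokens, rng=rng)
    return " ".join(tokens)

def augment(parallel_pairs, rng=random):
    """Keep the clean pairs and add one noised copy of each,
    pairing the noised source with the original clean target."""
    augmented = list(parallel_pairs)
    for src, tgt in parallel_pairs:
        augmented.append((add_noise(src, rng=rng), tgt))
    return augmented
```

Training an NMT model on `augment(data)` instead of `data` exposes it to the same error distribution at training time that it will face at test time; a seeded `random.Random` instance can be passed as `rng` for reproducible corpora.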

Related research

- An Analysis of Source-Side Grammatical Errors in NMT (05/24/2019): The quality of Neural Machine Translation (NMT) has been shown to signif...
- Improving Translation Robustness with Visual Cues and Error Correction (03/12/2021): Neural Machine Translation models are brittle to input noise. Current ro...
- From the Paft to the Fiiture: a Fully Automatic NMT and Word Embeddings Method for OCR Post-Correction (05/25/2020): A great deal of historical corpora suffer from errors introduced by the ...
- Recent Trends in the Use of Deep Learning Models for Grammar Error Handling (09/04/2020): Grammar error handling (GEH) is an important topic in natural language p...
- Grammatical Error Correction: A Survey of the State of the Art (11/09/2022): Grammatical Error Correction (GEC) is the task of automatically detectin...
- Data Weighted Training Strategies for Grammatical Error Correction (08/07/2020): Recent progress in the task of Grammatical Error Correction (GEC) has be...
- Grammatical Error Generation Based on Translated Fragments (04/20/2021): We perform neural machine translation of sentence fragments in order to ...
