Context-Aware Monolingual Repair for Neural Machine Translation

09/03/2019
by Elena Voita, et al.

Modern sentence-level NMT systems often produce plausible translations of isolated sentences. However, when put in context, these translations may end up being inconsistent with each other. We propose a monolingual DocRepair model to correct inconsistencies between sentence-level translations. DocRepair performs automatic post-editing on a sequence of sentence-level translations, refining the translations of sentences in the context of one another. For training, the DocRepair model requires only monolingual document-level data in the target language. It is trained as a monolingual sequence-to-sequence model that maps inconsistent groups of sentences into consistent ones. The consistent groups come from the original training data; the inconsistent groups are obtained by sampling round-trip translations for each isolated sentence. We show that this approach successfully imitates the inconsistencies we aim to fix: using contrastive evaluation, we show large improvements in the translation of several contextual phenomena in an English-Russian translation task, as well as improvements in the BLEU score. We also conduct a human evaluation and show that annotators strongly prefer the corrected translations over the baseline ones. Moreover, we analyze which discourse phenomena are hard to capture using monolingual data only.
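The way training pairs are built (consistent groups taken from the monolingual data, inconsistent groups obtained by round-tripping each sentence in isolation) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `make_training_pair`, the toy translators, and the `<sep>` separator string are hypothetical names introduced here, and a real setup would plug in sampling-based MT systems for both translation directions.

```python
import random
from typing import Callable, List, Tuple

# A translator is any function mapping one sentence to one sentence.
Translator = Callable[[str], str]

SEP = " <sep> "  # hypothetical separator between sentences in a group


def make_training_pair(
    group: List[str],
    to_source: Translator,
    to_target: Translator,
) -> Tuple[str, str]:
    """Return (inconsistent_input, consistent_output) for one sentence group.

    `group` holds consecutive target-language sentences from one document.
    Each sentence is round-tripped on its own, so no translation step sees
    its neighbours; that is what introduces cross-sentence inconsistency.
    """
    round_tripped = [to_target(to_source(sentence)) for sentence in group]
    return SEP.join(round_tripped), SEP.join(group)


def toy_to_source(sentence: str) -> str:
    """Stand-in for target-to-source MT (assumption: identity mapping)."""
    return sentence


def make_toy_to_target(rng: random.Random) -> Translator:
    """Stand-in for sampled source-to-target MT.

    It picks a variant per word at random, imitating how sampling can
    choose different renderings of the same word in different sentences.
    """
    variants = {"you": ["you", "thou"]}

    def to_target(sentence: str) -> str:
        words = sentence.split()
        return " ".join(rng.choice(variants.get(w, [w])) for w in words)

    return to_target


if __name__ == "__main__":
    rng = random.Random(0)
    document_group = ["are you ready ?", "you look tired ."]
    noisy, clean = make_training_pair(
        document_group, toy_to_source, make_toy_to_target(rng)
    )
    print("input :", noisy)
    print("output:", clean)
```

A repair model would then be trained to map the first element of each pair to the second. Because both sides are in the target language, only monolingual document-level data is needed to create the pairs.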


