Leveraging Discourse Rewards for Document-Level Neural Machine Translation

10/08/2020
by Inigo Jauregi Unanue, et al.

Document-level machine translation focuses on the translation of entire documents from a source to a target language. It is widely regarded as a challenging task, since the translations of the individual sentences must together retain the discourse properties of the document as a whole. However, document-level translation models are usually not trained to explicitly ensure discourse quality. In this paper we therefore propose a training approach that explicitly optimizes two established discourse metrics, lexical cohesion (LC) and coherence (COH), by using a reinforcement learning objective. Experiments over four language pairs and three translation domains show that our training approach achieves more cohesive and coherent document translations than other competitive approaches, without compromising faithfulness to the reference translation. For the Zh-En language pair, our method improves LC by 2.46 percentage points (pp) and COH by 1.17 pp over the runner-up, while also improving the BLEU score by 0.63 pp and F_BERT by 0.47 pp.
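
To make the idea of optimizing a discourse metric with a reinforcement learning objective more concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. The abstract does not define the metrics or the objective, so `lexical_cohesion_proxy` is a deliberately simplified stand-in for LC (it only counts repeated content words), `STOP_WORDS` is a hypothetical toy list, and `reinforce_loss` is the generic REINFORCE estimator with a baseline, which may differ from the objective actually used in the paper.

```python
import math

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"the", "a", "an", "of", "to", "in", "and", "is", "it", "on"}


def lexical_cohesion_proxy(sentences):
    """Simplified stand-in for a lexical cohesion score (an assumption).

    Returns the fraction of content words in the document that repeat a
    content word seen earlier in the same document.
    """
    seen = set()
    repeated = 0
    total = 0
    for sentence in sentences:
        for token in sentence.lower().split():
            if token in STOP_WORDS:
                continue
            total += 1
            if token in seen:
                repeated += 1
            seen.add(token)
    return repeated / total if total else 0.0


def reinforce_loss(token_log_probs, reward, baseline):
    """Generic REINFORCE loss for one sampled translation.

    loss = -(reward - baseline) * sum of token log-probabilities.
    Minimizing it raises the probability of samples whose reward
    exceeds the baseline and lowers it otherwise.
    """
    return -(reward - baseline) * sum(token_log_probs)


# Usage: score a sampled document translation, then form the loss.
sampled_doc = ["the cat sat on the mat", "the cat slept"]
reward = lexical_cohesion_proxy(sampled_doc)
log_probs = [math.log(0.5), math.log(0.5)]  # made-up token log-probabilities
loss = reinforce_loss(log_probs, reward, baseline=0.1)
```

In a real system the log-probabilities would come from the translation model for a sampled output, and the reward could combine several metrics (for example LC, COH, and a reference-based score) so that discourse quality does not come at the expense of faithfulness.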
