Capturing Longer Context for Document-level Neural Machine Translation: A Multi-resolutional Approach

10/18/2020 · by Zewei Sun, et al.

Discourse context has been proven useful for translating documents, yet incorporating long document context into prevailing neural machine translation models such as the Transformer remains a challenge. In this paper, we propose multi-resolutional (MR) Doc2Doc, a method for training a neural sequence-to-sequence model for document-level translation. The trained model can translate sentence by sentence as well as an entire document at once. We evaluate our method and several recent approaches on nine document-level datasets and two sentence-level datasets across six languages. Experiments show that MR Doc2Doc outperforms sentence-level models and previous methods on a comprehensive set of metrics, including BLEU, four lexical indices, three newly proposed auxiliary linguistic indicators, and human evaluation.
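The core idea of multi-resolutional training is to expose one model to the same parallel document at several granularities, from single sentences up to the whole document. The sketch below illustrates one plausible way to build such training pairs by doubling the segment size at each pass; the function name, the `</s>` separator, and the doubling schedule are illustrative assumptions, not taken from the paper's released code.

```python
# Hypothetical sketch of multi-resolutional (MR) training-data
# construction: a parallel document is split into segments at doubling
# resolutions (1 sentence, 2, 4, ..., the full document), and every
# aligned segment pair is emitted as a training example.

def multi_resolution_pairs(src_sents, tgt_sents, sep=" </s> "):
    """Yield (source, target) pairs at doubling segment sizes."""
    assert len(src_sents) == len(tgt_sents)
    n = len(src_sents)
    size = 1
    while size <= n:
        for i in range(0, n, size):
            # join consecutive sentences into one longer training unit
            yield (sep.join(src_sents[i:i + size]),
                   sep.join(tgt_sents[i:i + size]))
        if size == n:
            break  # the whole document was just emitted as one pair
        size = min(size * 2, n)

# A 4-sentence document yields pairs at sizes 1, 2, and 4.
pairs = list(multi_resolution_pairs(["s1", "s2", "s3", "s4"],
                                    ["t1", "t2", "t3", "t4"]))
```

Because sentence-level pairs remain in the mix, the same trained model can still be used for ordinary sentence-by-sentence decoding, while the longer segments teach it to use cross-sentence context.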





