Long-Short Term Masking Transformer: A Simple but Effective Baseline for Document-level Neural Machine Translation

09/19/2020
by Pei Zhang, et al.

Many document-level neural machine translation (NMT) systems have explored the utility of context-aware architectures, usually at the cost of more parameters and higher computational complexity. However, little attention has been paid to the baseline model. In this paper, we extensively study the pros and cons of the standard transformer in document-level translation, and find that its auto-regressive property brings both the advantage of consistency and the disadvantage of error accumulation. We therefore propose a surprisingly simple long-short term masking self-attention on top of the standard transformer, which both effectively captures long-range dependencies and reduces error propagation. We evaluate our approach on two publicly available document-level datasets, achieving strong BLEU scores and capturing discourse phenomena.
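
Since the abstract only sketches the mechanism, the following minimal PyTorch sketch illustrates one plausible reading of a long-short term mask for self-attention: under the short-term mask a token attends only within its own sentence, while under the long-term mask it attends only to the surrounding context sentences, and the two masks are assigned to disjoint groups of attention heads. The function names (long_short_term_masks, apply_masks), the even head split, and the self-attention fallback on the diagonal are illustrative assumptions, not the paper's exact construction.

    import torch

    def long_short_term_masks(sent_ids: torch.Tensor):
        """Build boolean attention masks from per-token sentence ids.

        sent_ids: (batch, seq_len) integers; tokens belonging to the same
        sentence of a document share an id. Returns (short_mask, long_mask),
        each (batch, seq_len, seq_len), where True marks key positions a
        query token may attend to.
        """
        same_sent = sent_ids.unsqueeze(2) == sent_ids.unsqueeze(1)  # (B, L, L)
        short_mask = same_sent    # short term: current sentence only
        long_mask = ~same_sent    # long term: context sentences only
        # Always let a token attend to itself under the long-term mask so
        # no query row is left without any attendable key (an assumption).
        eye = torch.eye(sent_ids.size(1), dtype=torch.bool, device=sent_ids.device)
        return short_mask, long_mask | eye

    def apply_masks(scores, short_mask, long_mask):
        """Route half of the attention heads through each mask (an assumed
        even split). scores: (batch, num_heads, seq_len, seq_len) logits."""
        h = scores.size(1)
        mask = torch.cat([
            short_mask.unsqueeze(1).expand(-1, h // 2, -1, -1),
            long_mask.unsqueeze(1).expand(-1, h - h // 2, -1, -1),
        ], dim=1)
        return scores.masked_fill(~mask, float("-inf"))

On the decoder side, both masks would additionally be intersected with the usual causal (lower-triangular) mask to preserve the auto-regressive property discussed above.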

