English-Czech Systems in WMT19: Document-Level Transformer

07/30/2019
by Martin Popel, et al.

We describe our NMT systems submitted to the WMT19 shared task in English-Czech news translation. Our systems are based on the Transformer model implemented in either the Tensor2Tensor (T2T) or the Marian framework. We aimed to improve the adequacy and coherence of translated documents by enlarging the context of the source and target. Instead of translating each sentence independently, we split the document into possibly overlapping multi-sentence segments. In the case of the T2T implementation, this "document-level"-trained system achieves a +0.6 BLEU improvement (p<0.05) relative to the same system applied to isolated sentences. To assess the potential effect document-level models might have on lexical coherence, we performed a semi-automatic analysis, which revealed that only a few sentences improved in this respect. Thus, we cannot draw any conclusions from this weak evidence.
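The idea of splitting a document into possibly overlapping multi-sentence segments can be illustrated with a minimal sketch. This is not the authors' procedure: the function name `split_into_segments` and the parameters `max_sentences` and `overlap` are hypothetical, and segment length is measured in sentences purely for simplicity (the abstract does not say how segment boundaries or lengths are chosen).

```python
from typing import List


def split_into_segments(
    sentences: List[str],
    max_sentences: int = 3,  # hypothetical parameter: sentences per segment
    overlap: int = 1,        # hypothetical parameter: sentences shared by neighbours
) -> List[List[str]]:
    """Split a document into consecutive, possibly overlapping segments.

    Each segment holds at most `max_sentences` sentences, and adjacent
    segments share `overlap` sentences of context.
    """
    if max_sentences < 1:
        raise ValueError("max_sentences must be at least 1")
    if not 0 <= overlap < max_sentences:
        raise ValueError("overlap must satisfy 0 <= overlap < max_sentences")

    step = max_sentences - overlap  # how far the window advances each time
    segments: List[List[str]] = []
    start = 0
    while start < len(sentences):
        segments.append(sentences[start:start + max_sentences])
        # Stop once the current window already reaches the end of the document.
        if start + max_sentences >= len(sentences):
            break
        start += step
    return segments


if __name__ == "__main__":
    document = ["S1.", "S2.", "S3.", "S4.", "S5."]
    for segment in split_into_segments(document, max_sentences=3, overlap=1):
        print(" ".join(segment))
```

With `overlap=0` the sketch falls back to non-overlapping segments. A complete system would also need to decide how to combine the translations of sentences that appear in more than one segment, which this sketch leaves open.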


