The LMU Munich System for the WMT 2020 Unsupervised Machine Translation Shared Task

by   Alexandra Chronopoulou, et al.

This paper describes the submission of LMU Munich to the WMT 2020 unsupervised shared task, in two language directions, German<->Upper Sorbian. Our core unsupervised neural machine translation (UNMT) system follows the strategy of Chronopoulou et al. (2020), using a monolingual pretrained language generation model (on German) and fine-tuning it on both German and Upper Sorbian, before initializing a UNMT model, which is trained with online backtranslation. Pseudo-parallel data obtained from an unsupervised statistical machine translation (USMT) system is used to fine-tune the UNMT model. We also apply BPE-Dropout to the low resource (Upper Sorbian) data to obtain a more robust system. We additionally experiment with residual adapters and find them useful in the Upper Sorbian->German direction. We explore sampling during backtranslation and curriculum learning to use SMT translations in a more principled way. Finally, we ensemble our best-performing systems and reach a BLEU score of 32.4 on German->Upper Sorbian and 35.2 on Upper Sorbian->German.


page 1

page 2

page 3

page 4


CUNI Systems for the Unsupervised and Very Low Resource Translation Task in WMT20

This paper presents a description of CUNI systems submitted to the WMT20...

SJTU-NICT's Supervised and Unsupervised Neural Machine Translation Systems for the WMT20 News Translation Task

In this paper, we introduced our joint team SJTU-NICT 's participation i...

Unsupervised Translation of German–Lower Sorbian: Exploring Training and Novel Transfer Methods on a Low-Resource Language

This paper describes the methods behind the systems submitted by the Uni...

Unsupervised Statistical Machine Translation

While modern machine translation has relied on large parallel corpora, a...

Incorporating Word and Subword Units in Unsupervised Machine Translation Using Language Model Rescoring

This paper describes CAiRE's submission to the unsupervised machine tran...

Lingua Custodia at WMT'19: Attempts to Control Terminology

This paper describes Lingua Custodia's submission to the WMT'19 news sha...

Reassessing Claims of Human Parity and Super-Human Performance in Machine Translation at WMT 2019

We reassess the claims of human parity and super-human performance made ...