The LMU Munich System for the WMT 2020 Unsupervised Machine Translation Shared Task

10/25/2020
by   Alexandra Chronopoulou, et al.
1

This paper describes the submission of LMU Munich to the WMT 2020 unsupervised shared task, in two language directions, German<->Upper Sorbian. Our core unsupervised neural machine translation (UNMT) system follows the strategy of Chronopoulou et al. (2020), using a monolingual pretrained language generation model (on German) and fine-tuning it on both German and Upper Sorbian, before initializing a UNMT model, which is trained with online backtranslation. Pseudo-parallel data obtained from an unsupervised statistical machine translation (USMT) system is used to fine-tune the UNMT model. We also apply BPE-Dropout to the low resource (Upper Sorbian) data to obtain a more robust system. We additionally experiment with residual adapters and find them useful in the Upper Sorbian->German direction. We explore sampling during backtranslation and curriculum learning to use SMT translations in a more principled way. Finally, we ensemble our best-performing systems and reach a BLEU score of 32.4 on German->Upper Sorbian and 35.2 on Upper Sorbian->German.

READ FULL TEXT

page 1

page 2

page 3

page 4

10/22/2020

CUNI Systems for the Unsupervised and Very Low Resource Translation Task in WMT20

This paper presents a description of CUNI systems submitted to the WMT20...
10/11/2020

SJTU-NICT's Supervised and Unsupervised Neural Machine Translation Systems for the WMT20 News Translation Task

In this paper, we introduced our joint team SJTU-NICT 's participation i...
09/24/2021

Unsupervised Translation of German–Lower Sorbian: Exploring Training and Novel Transfer Methods on a Low-Resource Language

This paper describes the methods behind the systems submitted by the Uni...
09/04/2018

Unsupervised Statistical Machine Translation

While modern machine translation has relied on large parallel corpora, a...
08/16/2019

Incorporating Word and Subword Units in Unsupervised Machine Translation Using Language Model Rescoring

This paper describes CAiRE's submission to the unsupervised machine tran...
07/10/2019

Lingua Custodia at WMT'19: Attempts to Control Terminology

This paper describes Lingua Custodia's submission to the WMT'19 news sha...
05/12/2020

Reassessing Claims of Human Parity and Super-Human Performance in Machine Translation at WMT 2019

We reassess the claims of human parity and super-human performance made ...