Multilingual Translation with Extensible Multilingual Pretraining and Finetuning

by Yuqing Tang, et al.

Recent work demonstrates the potential of multilingual pretraining to create one model that can be used for various tasks in different languages. Previous work in multilingual pretraining has demonstrated that machine translation systems can be created by finetuning on bitext. In this work, we show that multilingual translation models can be created through multilingual finetuning. Instead of finetuning on one direction, a pretrained model is finetuned on many directions at the same time. Compared to multilingual models trained from scratch, starting from pretrained models incorporates the benefits of large quantities of unlabeled monolingual data, which is particularly important for low-resource languages where bitext is not available. We demonstrate that pretrained models can be extended to incorporate additional languages without loss of performance. We double the number of languages in mBART to support multilingual machine translation models of 50 languages. Finally, we create the ML50 benchmark, covering low-, mid-, and high-resource languages, to facilitate reproducible research by standardizing training and evaluation data. On ML50, we demonstrate that multilingual finetuning improves by 1 BLEU on average over the strongest baselines (either multilingual from scratch or bilingual finetuning) while improving by 9.3 BLEU on average over bilingual baselines trained from scratch.
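As a rough illustration of finetuning on many directions at the same time, the sketch below pools toy bitext from several translation directions, tags each sentence with a language token, and draws a mixed batch. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the bracketed tag format, the toy corpora, and the temperature-smoothed sampling rule (a common way to upweight low-resource directions) are all illustrative and are not taken from the abstract.

```python
import random


def direction_probs(sizes, temperature=5.0):
    """Smoothed sampling probabilities: p_i proportional to (n_i / N) ** (1 / T).

    Assumption: temperature smoothing is used here only as a common heuristic;
    the abstract does not specify how directions are balanced.
    """
    total = sum(sizes.values())
    weights = {d: (n / total) ** (1.0 / temperature) for d, n in sizes.items()}
    norm = sum(weights.values())
    return {d: w / norm for d, w in weights.items()}


def tag_example(src, tgt, src_lang, tgt_lang):
    """Prefix each side with a (hypothetical) language tag so one model serves all directions."""
    return f"[{src_lang}] {src}", f"[{tgt_lang}] {tgt}"


def sample_batch(corpora, batch_size, temperature=5.0, seed=0):
    """Draw one mixed batch covering several translation directions."""
    rng = random.Random(seed)
    probs = direction_probs({d: len(pairs) for d, pairs in corpora.items()}, temperature)
    directions = list(probs)
    weights = [probs[d] for d in directions]
    batch = []
    for _ in range(batch_size):
        src_lang, tgt_lang = rng.choices(directions, weights=weights, k=1)[0]
        src, tgt = rng.choice(corpora[(src_lang, tgt_lang)])
        batch.append(tag_example(src, tgt, src_lang, tgt_lang))
    return batch


# Toy bitext: one high-resource and one low-resource direction (made-up data).
corpora = {
    ("de_DE", "en_XX"): [("Hallo", "Hello")] * 1000,
    ("gu_IN", "en_XX"): [("namaste", "Hello")] * 10,
}

probs = direction_probs({d: len(p) for d, p in corpora.items()})
batch = sample_batch(corpora, batch_size=8)
```

With a temperature above 1, the low-resource direction is sampled more often than its raw share of the data, which is one way a single finetuning run can cover every direction without the largest corpora dominating each batch.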



