
Bilingual Dictionary-based Language Model Pretraining for Neural Machine Translation

by Yusen Lin, et al.

Recent studies have demonstrated a perceivable improvement in the performance of neural machine translation by applying cross-lingual language model pretraining (Lample and Conneau, 2019), especially the Translation Language Modeling (TLM) objective. To alleviate TLM's need for expensive parallel corpora, in this work we incorporate translation information from bilingual dictionaries into the pretraining process and propose a novel Bilingual Dictionary-based Language Model (BDLM). We evaluate our BDLM on Chinese, English, and Romanian. For Chinese-English, we obtain a 55.0 BLEU on WMT-News19 (Tiedemann, 2012) and a 24.3 BLEU on WMT20 news-commentary, outperforming the Vanilla Transformer (Vaswani et al., 2017) by more than 8.4 BLEU and 2.3 BLEU, respectively. According to our results, the BDLM also has advantages in convergence speed and in predicting rare words. The increase in BLEU for WMT16 Romanian-English also shows its effectiveness in low-resource language translation.
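The abstract does not spell out the pretraining objective, but the core idea of replacing TLM's parallel-sentence supervision with dictionary supervision can be sketched as follows. This is a hypothetical, minimal illustration (the function `bdlm_mask`, the masking rate, and the exact target scheme are assumptions, not the paper's actual method): tokens covered by a bilingual dictionary are masked, and the model is trained to predict their dictionary translations instead of the original tokens, so cross-lingual signal comes from the dictionary rather than from a parallel corpus.

```python
import random

def bdlm_mask(tokens, bilingual_dict, mask_prob=0.15,
              mask_token="[MASK]", seed=0):
    """Hypothetical sketch of dictionary-based masking for pretraining.

    Tokens found in the bilingual dictionary are masked with probability
    `mask_prob`; the training target at a masked position is the token's
    dictionary translation. Unmasked positions carry no target (None),
    i.e. they contribute no loss. The real BDLM objective may differ.
    """
    rng = random.Random(seed)
    inputs, targets = [], []
    for tok in tokens:
        if tok in bilingual_dict and rng.random() < mask_prob:
            inputs.append(mask_token)
            targets.append(bilingual_dict[tok])  # predict the translation
        else:
            inputs.append(tok)
            targets.append(None)  # no prediction target here
    return inputs, targets

# Toy Chinese-English example (dictionary entries are illustrative):
zh_en = {"我": "I", "爱": "love", "你": "you"}
inputs, targets = bdlm_mask(["我", "爱", "你"], zh_en, mask_prob=1.0)
# With mask_prob=1.0 every covered token is masked:
# inputs  -> ['[MASK]', '[MASK]', '[MASK]']
# targets -> ['I', 'love', 'you']
```

Under this reading, only monolingual text plus a dictionary is needed at pretraining time, which is what lets BDLM sidestep TLM's requirement for sentence-aligned parallel data.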




Cross-lingual Language Model Pretraining

Recent studies have demonstrated the efficiency of generative pretrainin...

Improving the Lexical Ability of Pretrained Language Models for Unsupervised Neural Machine Translation

Successful methods for unsupervised neural machine translation (UNMT) em...

Multi-Agent Cross-Translated Diversification for Unsupervised Machine Translation

Recent unsupervised machine translation (UMT) systems usually employ thr...

WeChat Neural Machine Translation Systems for WMT20

We participate in the WMT 2020 shared news translation task on Chinese t...

Reusing a Pretrained Language Model on Languages with Limited Corpora for Unsupervised NMT

Using a language model (LM) pretrained on two languages with large monol...

Dynamic Fusion: Attentional Language Model for Neural Machine Translation

Neural Machine Translation (NMT) can be used to generate fluent output. ...

Linguistically-driven Multi-task Pre-training for Low-resource Neural Machine Translation

In the present study, we propose novel sequence-to-sequence pre-training...