Multilingual Neural Machine Translation with Knowledge Distillation

02/27/2019
by Xu Tan et al.

Multilingual machine translation, which translates multiple languages with a single model, has attracted much attention due to its efficiency in offline training and online serving. However, a multilingual model usually yields lower accuracy than individual models trained for each language pair, due to language diversity and limited model capacity. In this paper, we propose a distillation-based approach to boost the accuracy of multilingual machine translation. Specifically, individual models are first trained and regarded as teachers, and then the multilingual model is trained to fit the training data and to match the outputs of the individual models simultaneously through knowledge distillation. Experiments on the IWSLT, WMT and Ted Talk translation datasets demonstrate the effectiveness of our method. In particular, we show that one model is enough to handle multiple languages (up to 44 languages in our experiments), with comparable or even better accuracy than individual models.
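To illustrate the idea, the sketch below shows one way the combined objective described above might look in PyTorch: the multilingual student is trained with standard cross-entropy on the reference translations plus a distillation term that matches the per-token output distribution of the corresponding individual teacher model. This is a minimal sketch under stated assumptions; the names alpha, temperature and pad_id, and the exact weighting scheme, are illustrative and not taken from the paper.

```python
import torch
import torch.nn.functional as F

def multilingual_kd_loss(student_logits, teacher_logits, targets,
                         pad_id=0, alpha=0.5, temperature=1.0):
    """Cross-entropy on the reference plus a knowledge-distillation term.

    student_logits, teacher_logits: (batch, seq_len, vocab)
    targets: (batch, seq_len) reference token ids
    alpha, temperature, pad_id are illustrative hyper-parameters,
    not values reported in the paper.
    """
    vocab = student_logits.size(-1)

    # Standard negative log-likelihood on the training data.
    nll = F.cross_entropy(student_logits.view(-1, vocab),
                          targets.view(-1),
                          ignore_index=pad_id)

    # Distillation term: cross-entropy between the teacher's and the
    # student's per-token output distributions.
    teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
    student_logp = F.log_softmax(student_logits / temperature, dim=-1)
    kd = -(teacher_probs * student_logp).sum(dim=-1)   # (batch, seq_len)

    # Average the KD term over non-padding positions only.
    mask = targets.ne(pad_id).float()
    kd = (kd * mask).sum() / mask.sum()

    return (1.0 - alpha) * nll + alpha * kd
```

In this reading, each language pair contributes a loss of this form, with teacher_logits coming from the individual model trained on that pair; the multilingual model is updated on the sum of these losses across language pairs.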

