OmniKnight: Multilingual Neural Machine Translation with Language-Specific Self-Distillation

05/03/2022
by Yichong Huang, et al.

Although all-in-one-model multilingual neural machine translation (MNMT) has achieved remarkable progress in recent years, its selected best overall checkpoint fails to achieve the best performance simultaneously in all language pairs. This is because the best checkpoints for the individual language pairs (i.e., language-specific best checkpoints) are scattered across different epochs. In this paper, we present a novel training strategy dubbed Language-Specific Self-Distillation (LSSD) for bridging the gap between language-specific best checkpoints and the overall best checkpoint. In detail, we regard each language-specific best checkpoint as a teacher to distill the overall best checkpoint. Moreover, we systematically explore three variants of our LSSD, which perform distillation statically, selectively, and adaptively. Experimental results on two widely used benchmarks show that LSSD obtains consistent improvements across all language pairs and achieves the state-of-the-art
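
The idea of language-specific best checkpoints acting as teachers can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not give the loss or the details of the static, selective, and adaptive variants, so the code below assumes a standard knowledge-distillation objective (cross-entropy mixed with a KL term). The function names, the `alpha` weight, and the dev scores are hypothetical and chosen only for illustration.

```python
import math


def update_language_best(best, epoch, dev_scores):
    """Record, per language pair, the epoch with the highest dev score so far."""
    for pair, score in dev_scores.items():
        if pair not in best or score > best[pair][1]:
            best[pair] = (epoch, score)
    return best


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]


def distillation_loss(student_logits, teacher_logits, target, alpha=0.5):
    """Assumed objective: alpha * CE(student, target) + (1 - alpha) * KL(teacher || student)."""
    p_s = softmax(student_logits)
    p_t = softmax(teacher_logits)
    ce = -math.log(p_s[target])
    kl = sum(t * math.log(t / s) for t, s in zip(p_t, p_s) if t > 0)
    return alpha * ce + (1 - alpha) * kl


# Toy dev scores per epoch (made-up numbers, not results from the paper).
history = {
    1: {"en-de": 25.0, "en-fr": 33.0},
    2: {"en-de": 26.5, "en-fr": 32.0},
    3: {"en-de": 26.0, "en-fr": 32.8},
}

best = {}
for epoch, scores in history.items():
    update_language_best(best, epoch, scores)

# Overall best checkpoint: the epoch with the highest average dev score.
overall_best = max(history, key=lambda e: sum(history[e].values()) / len(history[e]))

print(best)          # each pair peaks at a different epoch
print(overall_best)  # the overall best epoch matches neither pair's peak
```

In this toy setting the overall best epoch differs from each pair's best epoch, which is the gap the abstract describes. In a real training loop, the teacher logits for a batch would come from the checkpoint recorded in `best` for that batch's language pair.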

