On the Off-Target Problem of Zero-Shot Multilingual Neural Machine Translation

05/18/2023
by Liang Chen, et al.

While multilingual neural machine translation has achieved great success, it suffers from the off-target issue, where the translation is produced in the wrong language. This problem is more pronounced in zero-shot translation tasks. In this work, we find that failing to encode a discriminative target-language signal leads to off-target translation, and that a closer lexical distance (measured by KL-divergence) between two languages' vocabularies is associated with a higher off-target rate. We also find that simply isolating the vocabularies of different languages in the decoder can alleviate the problem. Motivated by these findings, we propose Language Aware Vocabulary Sharing (LAVS), a simple and effective algorithm for constructing the multilingual vocabulary that greatly alleviates the off-target problem of the translation model by increasing the KL-divergence between languages. We conduct experiments on a multilingual machine translation benchmark covering 11 languages. Experiments show that the off-target rate across 90 translation tasks is reduced from 29% to 8%, while the overall BLEU score improves by an average of 1.9 points, without extra training cost and without sacrificing performance in the supervised directions. We release the code at https://github.com/chenllliang/Off-Target-MNMT for reproduction.
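To make the notion of lexical distance concrete, the following is a minimal sketch of a KL-divergence between the token distributions of two languages over a shared vocabulary. It is an illustration under stated assumptions, not the authors' implementation: the function names, the additive smoothing constant `eps`, and the toy token lists are all hypothetical, and the language-tagging step at the end is only a crude stand-in for vocabulary isolation, not the LAVS algorithm itself.

```python
import math
from collections import Counter


def token_distribution(tokens, vocab, eps=1e-6):
    """Smoothed unigram distribution of `tokens` over `vocab`.

    `eps` is an assumed additive-smoothing constant that keeps every
    probability strictly positive so the KL-divergence stays finite.
    """
    counts = Counter(tokens)
    total = sum(counts[t] for t in vocab) + eps * len(vocab)
    return {t: (counts[t] + eps) / total for t in vocab}


def kl_divergence(p, q):
    """KL(p || q) for two distributions defined over the same keys."""
    return sum(p[t] * math.log(p[t] / q[t]) for t in p)


def lexical_kl(tokens_a, tokens_b):
    """KL-divergence between the token distributions of two corpora."""
    vocab = sorted(set(tokens_a) | set(tokens_b))
    p = token_distribution(tokens_a, vocab)
    q = token_distribution(tokens_b, vocab)
    return kl_divergence(p, q)


# Toy token lists (hypothetical data); "mat" is shared by both languages.
tokens_en = ["the", "cat", "sat", "on", "the", "mat"]
tokens_de = ["die", "katze", "sass", "auf", "der", "mat"]

shared_kl = lexical_kl(tokens_en, tokens_de)

# Tagging each token with its language makes the two vocabularies disjoint,
# which can only push the distributions further apart.
isolated_kl = lexical_kl(
    [f"{t}@en" for t in tokens_en],
    [f"{t}@de" for t in tokens_de],
)

print(f"shared vocabulary KL:   {shared_kl:.3f}")
print(f"isolated vocabulary KL: {isolated_kl:.3f}")
```

In this toy setting the isolated vocabularies yield a larger divergence than the shared one, which matches the direction of the effect described in the abstract. The absolute values depend heavily on the smoothing constant and should not be read as meaningful.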


research
11/21/2017

Effective Strategies in Zero-Shot Neural Machine Translation

In this paper, we proposed two strategies which can be applied to a mult...
research
03/06/2022

Focus on the Target's Vocabulary: Masked Label Smoothing for Machine Translation

Label smoothing and vocabulary sharing are two widely used techniques in...
research
08/13/2018

Rapid Adaptation of Neural Machine Translation to New Languages

This paper examines the problem of adapting neural machine translation s...
research
08/14/2022

Fast Vocabulary Projection Method via Clustering for Multilingual Machine Translation on GPU

Multilingual Neural Machine Translation has been showing great success u...
research
09/15/2023

How Transferable are Attribute Controllers on Pretrained Multilingual Translation Models?

Customizing machine translation models to comply with fine-grained attri...
research
05/20/2022

Understanding and Mitigating the Uncertainty in Zero-Shot Translation

Zero-shot translation is a promising direction for building a comprehens...
research
06/30/2022

Building Multilingual Machine Translation Systems That Serve Arbitrary X-Y Translations

Multilingual Neural Machine Translation (MNMT) enables one system to tra...
