Improved Zero-shot Neural Machine Translation via Ignoring Spurious Correlations

06/04/2019
by Jiatao Gu, et al.

Zero-shot translation, translating between language pairs on which a Neural Machine Translation (NMT) system has never been trained, is an emergent property when training the system in multilingual settings. However, naive training for zero-shot NMT easily fails and is sensitive to hyper-parameter settings. Its performance typically lags far behind the more conventional pivot-based approach, which translates twice using a third language as a pivot. In this work, we address the degeneracy problem due to capturing spurious correlations by quantitatively analyzing the mutual information between language IDs of the source and decoded sentences. Inspired by this analysis, we propose to use two simple but effective approaches: (1) decoder pre-training; (2) back-translation. These methods show significant improvement (4 to 22 BLEU points) over the vanilla zero-shot translation on three challenging multilingual datasets, and achieve similar or better results than the pivot-based approach.
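
The mutual-information analysis mentioned above can be illustrated with a small sketch. This is not the authors' implementation: the function name `mutual_information` and the assumption that a language ID is already available for each source sentence and each decoded sentence (for example, from a language identifier) are hypothetical choices made for illustration. The sketch computes the empirical mutual information between the two paired label sequences; a value near zero suggests the decoded language is independent of the source language, while a high value suggests the decoded language is largely predictable from the source language.

```python
import math
from collections import Counter


def mutual_information(source_langs, decoded_langs):
    """Empirical mutual information (in nats) between paired language IDs.

    source_langs[i] is the language ID of the i-th source sentence and
    decoded_langs[i] is the language ID of its decoded output. Both lists
    are assumed inputs for this illustration.
    """
    if len(source_langs) != len(decoded_langs):
        raise ValueError("both sequences must have the same length")
    n = len(source_langs)
    if n == 0:
        raise ValueError("sequences must be non-empty")

    joint = Counter(zip(source_langs, decoded_langs))  # counts of (src, dec) pairs
    src_counts = Counter(source_langs)                 # marginal counts of source IDs
    dec_counts = Counter(decoded_langs)                # marginal counts of decoded IDs

    mi = 0.0
    for (src, dec), count in joint.items():
        p_joint = count / n
        # p(src, dec) / (p(src) * p(dec)) simplifies to count * n / (c_src * c_dec)
        ratio = count * n / (src_counts[src] * dec_counts[dec])
        mi += p_joint * math.log(ratio)
    return mi


# Decoded language fully determined by the source language
dependent = mutual_information(["en", "en", "fr", "fr"], ["de", "de", "es", "es"])

# Decoded language independent of the source language
independent = mutual_information(["en", "en", "fr", "fr"], ["de", "es", "de", "es"])
```

In the first toy call the decoded language is a deterministic function of the source language, so the estimate equals log 2; in the second the two are independent and the estimate is zero. With real data the estimate depends on the quality of the language identification and on sample size.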

