The Missing Ingredient in Zero-Shot Neural Machine Translation

by Naveen Arivazhagan, et al.

Multilingual Neural Machine Translation (NMT) models are capable of translating between multiple source and target languages. Despite various approaches to training such models, they have difficulty with zero-shot translation: translating between language pairs that were not seen together during training. In this paper we first diagnose why state-of-the-art multilingual NMT models that rely purely on parameter sharing fail to generalize to unseen language pairs. We then propose auxiliary losses on the NMT encoder that impose representational invariance across languages. Our simple approach vastly improves zero-shot translation quality without regressing on supervised directions. For the first time, on WMT14 English-French/German, we achieve zero-shot performance that is on par with pivoting. We also demonstrate the easy scalability of our approach to multiple languages on the IWSLT 2017 shared task.



