Consistency by Agreement in Zero-shot Neural Machine Translation

04/04/2019
by Maruan Al-Shedivat, et al.

Generalization and reliability of multilingual translation often depend heavily on the amount of parallel data available for each language pair of interest. In this paper, we focus on zero-shot generalization, a challenging setup that tests models on translation directions they have not been optimized for at training time. To address the problem, we (i) reformulate multilingual translation as probabilistic inference, (ii) define the notion of zero-shot consistency and show why standard training often results in models unsuitable for zero-shot tasks, and (iii) introduce a consistent agreement-based training method that encourages the model to produce equivalent translations of parallel sentences in auxiliary languages. We test our multilingual NMT models on multiple public zero-shot translation benchmarks (IWSLT17, UN corpus, Europarl) and show that agreement-based learning often yields a 2-3 BLEU improvement on zero-shot directions over strong baselines, without any loss in performance on supervised translation directions.

research
11/04/2018

Improving Zero-Shot Translation of Low-Resource Languages

Recent work on multilingual neural machine translation reported competit...
research
08/10/2023

Exploring Linguistic Similarity and Zero-Shot Learning for Multilingual Translation of Dravidian Languages

Current research in zero-shot translation is plagued by several issues s...
research
06/24/2019

Evaluating the Supervised and Zero-shot Performance of Multi-lingual Translation Models

We study several methods for full or partial sharing of the decoder para...
research
09/21/2021

Multilingual Document-Level Translation Enables Zero-Shot Transfer From Sentences to Documents

Document-level neural machine translation (DocNMT) delivers coherent tra...
research
05/20/2022

Understanding and Mitigating the Uncertainty in Zero-Shot Translation

Zero-shot translation is a promising direction for building a comprehens...
research
06/30/2022

Building Multilingual Machine Translation Systems That Serve Arbitrary X-Y Translations

Multilingual Neural Machine Translation (MNMT) enables one system to tra...
research
02/01/2022

Examining Scaling and Transfer of Language Model Architectures for Machine Translation

Natural language understanding and generation models follow one of the t...
