Rethinking Zero-shot Neural Machine Translation: From a Perspective of Latent Variables

09/10/2021
by Weizhi Wang, et al.

Zero-shot translation, directly translating between language pairs unseen in training, is a promising capability of multilingual neural machine translation (NMT). However, under the maximum likelihood training objective, the model tends to capture spurious correlations between the output language and language-invariant semantics, which leads to poor transfer performance on zero-shot directions. In this paper, we introduce a denoising autoencoder objective based on the pivot language into the traditional training objective to improve translation accuracy on zero-shot directions. A theoretical analysis from the perspective of latent variables shows that our approach implicitly maximizes the probability distributions for zero-shot directions. On two benchmark machine translation datasets, we demonstrate that the proposed method effectively eliminates the spurious correlations and significantly outperforms state-of-the-art methods. Our code is available at https://github.com/Victorwz/zs-nmt-dae.
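As a rough illustration of how a pivot-language denoising autoencoder term can sit alongside the usual translation loss, the sketch below corrupts a pivot sentence and combines two per-batch negative log-likelihoods. It is not taken from the authors' repository: the noise types (token dropping and masking), the probabilities, the `<2en>` target-language tag, the `dae_weight` parameter, and all function names are assumptions chosen for illustration only.

```python
import random

MASK = "<mask>"  # hypothetical mask symbol, not from the paper


def add_noise(tokens, drop_prob=0.1, mask_prob=0.1, seed=None):
    """Corrupt a pivot-language sentence by randomly dropping or masking tokens."""
    rng = random.Random(seed)
    noisy = []
    for tok in tokens:
        r = rng.random()
        if r < drop_prob:
            continue  # drop this token entirely
        if r < drop_prob + mask_prob:
            noisy.append(MASK)  # replace this token with the mask symbol
        else:
            noisy.append(tok)  # keep this token unchanged
    # Avoid an empty encoder input if every token was dropped.
    return noisy or [MASK]


def make_dae_example(pivot_tokens, pivot_tag="<2en>", seed=None):
    """Build a (source, target) pair: noisy pivot sentence -> clean pivot sentence.

    The tag tells a multilingual model which language to generate; using
    English as the pivot here is an assumption.
    """
    source = [pivot_tag] + add_noise(pivot_tokens, seed=seed)
    target = list(pivot_tokens)
    return source, target


def combined_loss(translation_nll, denoising_nll, dae_weight=1.0):
    """Sum the supervised translation loss and the weighted denoising loss."""
    return translation_nll + dae_weight * denoising_nll
```

In a training loop, `make_dae_example` would supply extra pivot-to-pivot pairs next to the ordinary parallel data, and `combined_loss` would be applied to the two loss values computed by the model for each batch.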


research
05/12/2023

Improving Zero-shot Multilingual Neural Machine Translation by Leveraging Cross-lingual Consistency Regularization

The multilingual neural machine translation (NMT) model has a promising ...
research
01/20/2019

Mixed Formal Learning: A Path to Transparent Machine Learning

This paper presents Mixed Formal Learning, a new architecture that learn...
research
06/04/2019

Improved Zero-shot Neural Machine Translation via Ignoring Spurious Correlations

Zero-shot translation, translating between language pairs on which a Neu...
research
12/30/2020

Improving Zero-Shot Translation by Disentangling Positional Information

Multilingual neural machine translation has shown the capability of dire...
research
05/16/2023

Exploring the Impact of Layer Normalization for Zero-shot Neural Machine Translation

This paper studies the impact of layer normalization (LayerNorm) on zero...
research
05/20/2022

Understanding and Mitigating the Uncertainty in Zero-Shot Translation

Zero-shot translation is a promising direction for building a comprehens...
research
10/12/2020

Controllable Paraphrasing and Translation with a Syntactic Exemplar

Most prior work on exemplar-based syntactically controlled paraphrase ge...
