TPDM: Selectively Removing Positional Information for Zero-shot Translation via Token-Level Position Disentangle Module

05/31/2023
by Xingran Chen, et al.

Because multilingual neural machine translation (MNMT) models are capable of zero-shot translation, much work has been devoted to fully exploiting this potential. It is often hypothesized that positional information may hinder the MNMT encoder from producing representations robust enough for zero-shot decoding. However, previous approaches treat all positional information equally and thus cannot selectively remove only the harmful parts. In sharp contrast, this paper investigates how to learn to selectively preserve useful positional information. We describe, from a linguistic perspective, the specific mechanism by which positional information influences MNMT at the token level. Based on this explanation, we design a Token-level Position Disentangle Module (TPDM) framework that disentangles positional information at the token level. Our experiments demonstrate that the framework improves zero-shot translation by a large margin while reducing the performance loss in supervised directions compared with previous works.
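The abstract does not spell out how token-level disentanglement is implemented. As a purely illustrative sketch (not the paper's actual TPDM), one way to make the removal of positional information selective per token is to scale the positional encoding added to each token embedding by a learned sigmoid gate computed from that token; the gate parameters `w_gate` and `b_gate` below are hypothetical:

```python
import numpy as np

def sinusoidal_positions(seq_len, d_model):
    """Standard sinusoidal positional encodings (Vaswani et al., 2017)."""
    pos = np.arange(seq_len)[:, None]
    i = np.arange(d_model)[None, :]
    angle = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
    # Even dimensions use sin, odd dimensions use cos.
    return np.where(i % 2 == 0, np.sin(angle), np.cos(angle))

def gated_positional_mix(token_emb, w_gate, b_gate):
    """Add positional encodings scaled by a per-token gate in (0, 1),
    so positional information can be suppressed for some tokens while
    being preserved for others.

    token_emb: (seq_len, d_model) token embeddings
    w_gate:    (d_model,) hypothetical learned gate weights
    b_gate:    scalar gate bias
    """
    seq_len, d_model = token_emb.shape
    pos_emb = sinusoidal_positions(seq_len, d_model)
    # One gate value per token, computed from the token's own embedding.
    gate = 1.0 / (1.0 + np.exp(-(token_emb @ w_gate + b_gate)))
    return token_emb + gate[:, None] * pos_emb
```

With zero gate parameters the gate is sigmoid(0) = 0.5 for every token, so each token receives exactly half of its positional encoding; training would push the gate toward 0 for tokens whose positional signal hurts zero-shot transfer.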

