XDLM: Cross-lingual Diffusion Language Model for Machine Translation

07/25/2023
by Linyao Chen, et al.

Recently, diffusion models have excelled in image generation tasks and have also been applied to natural language processing (NLP) for controllable text generation. However, the application of diffusion models in a cross-lingual setting remains largely unexplored. Additionally, while pretraining with diffusion models has been studied within a single language, the potential of cross-lingual pretraining remains understudied. To address these gaps, we propose XDLM, a novel Cross-lingual Diffusion Language Model for machine translation, consisting of pretraining and fine-tuning stages. In the pretraining stage, we propose TLDM, a new training objective for mastering the mapping between different languages; in the fine-tuning stage, we build the translation system on top of the pretrained model. We evaluate XDLM on several machine translation benchmarks, where it outperforms both diffusion and Transformer baselines.
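The abstract gives no implementation details, so as a rough illustration only, the sketch below shows what a cross-lingual denoising objective for a continuous diffusion language model might look like: the source sentence is kept clean as conditioning while Gaussian noise is added to the target-side embeddings, and the model is trained to recover the clean target embeddings. All names (`tldm_step`, `model`, `embed`) and the linear noise schedule are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn.functional as F

def tldm_step(model, embed, src_ids, tgt_ids, num_steps=1000):
    """One hypothetical training step of a cross-lingual diffusion objective.

    model    -- a Transformer that predicts clean target embeddings from
                (noised target embeddings, timestep, clean source embeddings)
    embed    -- shared embedding layer mapping token ids to vectors
    src_ids  -- source-language token ids, shape (batch, src_len)
    tgt_ids  -- target-language token ids, shape (batch, tgt_len)
    """
    x0 = embed(tgt_ids)  # clean target-side embeddings
    t = torch.randint(0, num_steps, (x0.size(0),), device=x0.device)

    # Simple linear noise schedule (an assumption; the paper may differ).
    alpha_bar = 1.0 - (t.float() + 1) / num_steps
    alpha_bar = alpha_bar.view(-1, 1, 1)  # broadcast over (batch, len, dim)

    # Forward diffusion: interpolate between clean embeddings and noise.
    noise = torch.randn_like(x0)
    x_t = alpha_bar.sqrt() * x0 + (1 - alpha_bar).sqrt() * noise

    # Condition on the clean source sentence so the model learns the
    # cross-lingual mapping while denoising the target side.
    x0_pred = model(x_t, t, cond=embed(src_ids))
    return F.mse_loss(x0_pred, x0)
```

At inference time, translation would then start from pure noise on the target side and iteratively denoise it while conditioning on the clean source sentence, which is where a pretrained cross-lingual mapping would pay off.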
