Code-switching pre-training for neural machine translation

09/17/2020
by   Zhen Yang, et al.

This paper proposes a new pre-training method, called Code-Switching Pre-training (CSP for short), for Neural Machine Translation (NMT). Unlike traditional pre-training methods, which randomly mask some fragments of the input sentence, the proposed CSP randomly replaces some words in the source sentence with their translation words in the target language. Specifically, we first perform lexicon induction with unsupervised word embedding mapping between the source and target languages, and then randomly replace some words in the input sentence with their translation words according to the extracted translation lexicons. CSP adopts the encoder-decoder framework: its encoder takes the code-mixed sentence as input, and its decoder predicts the replaced fragment of the input sentence. In this way, CSP is able to pre-train the NMT model by explicitly making the most of the cross-lingual alignment information extracted from the source and target monolingual corpora. Additionally, CSP relieves the pretrain-finetune discrepancy caused by artificial symbols such as [mask]. To verify the effectiveness of the proposed method, we conduct extensive experiments on unsupervised and supervised NMT. Experimental results show that CSP achieves significant improvements over baselines without pre-training or with other pre-training methods.
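The following is a minimal sketch of the code-switching replacement step described in the abstract, assuming a source-to-target translation lexicon has already been induced (e.g. via unsupervised word embedding mapping). The function name `code_switch`, the `replace_prob` parameter, and the toy lexicon are illustrative and not taken from the paper; the actual CSP implementation operates on subword units and feeds the outputs into an encoder-decoder model.

```python
import random


def code_switch(sentence, lexicon, replace_prob=0.15, rng=None):
    """Randomly replace source words with their target-language translations.

    `lexicon` maps source words to target words (assumed to come from an
    induced bilingual dictionary). Returns the code-mixed encoder input and
    the list of (position, original word) pairs that the decoder is trained
    to predict, i.e. the replaced fragment of the input sentence.
    """
    rng = rng or random.Random()
    tokens = sentence.split()
    mixed, replaced = [], []
    for i, tok in enumerate(tokens):
        if tok in lexicon and rng.random() < replace_prob:
            mixed.append(lexicon[tok])   # switch to the target-language word
            replaced.append((i, tok))    # decoder must recover the original word
        else:
            mixed.append(tok)
    return " ".join(mixed), replaced


if __name__ == "__main__":
    # Toy English->German lexicon standing in for the induced dictionary.
    toy_lexicon = {"house": "Haus", "cat": "Katze", "drinks": "trinkt", "milk": "Milch"}
    src = "the cat drinks milk in the house"
    encoder_input, decoder_targets = code_switch(
        src, toy_lexicon, replace_prob=0.5, rng=random.Random(0)
    )
    print(encoder_input)    # e.g. "the Katze trinkt milk in the Haus"
    print(decoder_targets)  # e.g. [(1, 'cat'), (2, 'drinks'), (6, 'house')]
```

Because the encoder input contains real target-language words rather than an artificial [mask] token, pre-training and fine-tuning see the same vocabulary, which is the source of the reduced pretrain-finetune discrepancy the abstract mentions.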


