VECO: Variable Encoder-decoder Pre-training for Cross-lingual Understanding and Generation

10/30/2020
by Fuli Luo, et al.

Recent studies on learning multilingual representations have achieved significant performance gains across a wide range of downstream cross-lingual tasks. They train either an encoder-only Transformer, mainly for understanding tasks, or an encoder-decoder Transformer, specifically for generation tasks, ignoring the correlation between the two kinds of tasks and frameworks. In contrast, this paper presents a variable encoder-decoder (VECO) pre-training approach that unifies the two mainstreams in both model architecture and pre-training tasks. VECO splits the standard Transformer block into several sub-modules trained with both inner-sequence and cross-sequence masked language modeling, and correspondingly reorganizes certain sub-modules for understanding and generation tasks during inference. Such a workflow not only ensures that only the most streamlined parameters necessary for each kind of task are trained, but also enables the two to boost each other by sharing common sub-modules. As a result, VECO delivers new state-of-the-art results on various cross-lingual understanding tasks of the XTREME benchmark, covering text classification, sequence labeling, question answering, and sentence retrieval. For generation tasks, VECO also outperforms all existing cross-lingual models and state-of-the-art Transformer variants on the WMT14 English-to-German and English-to-French translation datasets, with gains of up to 1∼2 BLEU.
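The abstract describes splitting each Transformer block into shared sub-modules and recombining them at inference time, either as an encoder-only stack for understanding tasks or as an encoder-decoder for generation. The sketch below is a minimal PyTorch illustration of that recombination idea under assumed details; the sub-module decomposition (self-attention, cross-attention, feed-forward), the layer count, and the hyperparameters are hypothetical and do not reflect the released VECO implementation.

```python
# Minimal sketch (an assumption, not the authors' code) of recombining shared
# Transformer sub-modules for understanding (encoder-only) vs. generation
# (encoder-decoder) inference.
import torch
import torch.nn as nn


class VecoBlock(nn.Module):
    """One Transformer block split into reusable sub-modules."""

    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        # Sub-modules shared across both modes (hypothetical decomposition).
        self.self_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ffn = nn.Sequential(
            nn.Linear(d_model, 4 * d_model), nn.GELU(), nn.Linear(4 * d_model, d_model)
        )
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.norm3 = nn.LayerNorm(d_model)

    def forward(self, x, memory=None, causal=False):
        # Inner-sequence (self-) attention, bidirectional or causal.
        mask = None
        if causal:
            n = x.size(1)
            mask = torch.triu(torch.ones(n, n, dtype=torch.bool), diagonal=1)
        h, _ = self.self_attn(x, x, x, attn_mask=mask)
        x = self.norm1(x + h)
        # Cross-sequence attention is only activated when an encoder memory is
        # supplied, i.e. when the block is plugged in as a decoder layer.
        if memory is not None:
            h, _ = self.cross_attn(x, memory, memory)
            x = self.norm2(x + h)
        return self.norm3(x + self.ffn(x))


# The same blocks serve both modes.
blocks = nn.ModuleList([VecoBlock(512, 8) for _ in range(4)])
src = torch.randn(2, 16, 512)

# Understanding mode: encoder-only stack, no memory, bidirectional attention.
enc = src
for blk in blocks:
    enc = blk(enc)

# Generation mode: reuse the blocks as decoder layers with causal
# self-attention and cross-attention over the encoder output.
tgt = torch.randn(2, 10, 512)
dec = tgt
for blk in blocks:
    dec = blk(dec, memory=enc, causal=True)
```

The point of the sketch is that the same parameters serve both modes: cross-attention is simply skipped when no encoder memory is supplied, and causal masking is enabled only for generation.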


Related research

11/09/2022 · ERNIE-UniX2: A Unified Cross-lingual Cross-modal Framework for Understanding and Generation
Recent cross-lingual cross-modal works attempt to extend Vision-Language...

04/18/2021 · mT6: Multilingual Pretrained Text-to-Text Transformer with Translation Pairs
Multilingual T5 (mT5) pretrains a sequence-to-sequence model on massive ...

10/07/2020 · Cross-Thought for Sentence Encoder Pre-training
In this paper, we propose Cross-Thought, a novel approach to pre-trainin...

08/31/2019 · Explicit Cross-lingual Pre-training for Unsupervised Machine Translation
Pre-training has proven to be effective in unsupervised machine translat...

04/17/2023 · VECO 2.0: Cross-lingual Language Model Pre-training with Multi-granularity Contrastive Learning
Recent studies have demonstrated the potential of cross-lingual transfer...

05/27/2022 · GIT: A Generative Image-to-text Transformer for Vision and Language
In this paper, we design and train a Generative Image-to-text Transforme...

05/29/2019 · A Generalized Framework of Sequence Generation with Application to Undirected Sequence Models
Undirected neural sequence models such as BERT have received renewed int...
