mT6: Multilingual Pretrained Text-to-Text Transformer with Translation Pairs

04/18/2021
by Zewen Chi, et al.

Multilingual T5 (mT5) pretrains a sequence-to-sequence model on massive monolingual texts, which has shown promising results on many cross-lingual tasks. In this paper, we improve the multilingual text-to-text transfer Transformer with translation pairs (mT6). Specifically, we explore three cross-lingual text-to-text pre-training tasks, namely machine translation, translation pair span corruption, and translation span corruption. In addition, we propose a partially non-autoregressive objective for text-to-text pre-training. We evaluate the methods on seven multilingual benchmark datasets, covering sentence classification, named entity recognition, question answering, and abstractive summarization. Experimental results show that the proposed mT6 improves cross-lingual transferability over mT5.
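To make the three cross-lingual tasks concrete, the sketch below shows how each one could turn a parallel sentence pair into an encoder input and a decoder target, assuming the T5-style sentinel-token format (<extra_id_0>, <extra_id_1>, ...) that mT5 inherits. The function names and the simplified corrupt_spans masking heuristic are illustrative, not the paper's implementation.

```python
import random

# T5-style sentinel tokens that mark corrupted spans (assumption: mT6
# follows the mT5/T5 span-corruption input/target format).
SENTINELS = [f"<extra_id_{i}>" for i in range(100)]

def corrupt_spans(tokens, noise_density=0.15, span_len=3):
    """Replace random spans with sentinels; return (encoder_input, decoder_target)."""
    budget = max(1, int(len(tokens) * noise_density))    # tokens left to mask
    inp, tgt, i, sid = [], [], 0, 0
    while i < len(tokens):
        if budget > 0 and sid < len(SENTINELS) and random.random() < noise_density:
            span = min(span_len, budget, len(tokens) - i)
            inp.append(SENTINELS[sid])                    # span hidden in the input
            tgt += [SENTINELS[sid]] + tokens[i:i + span]  # span recovered in the target
            sid += 1
            budget -= span
            i += span
        else:
            inp.append(tokens[i])
            i += 1
    return inp, tgt

def machine_translation(src, tgt):
    """MT: encode the source sentence, decode its translation."""
    return src, tgt

def translation_pair_span_corruption(src, tgt):
    """TPSC: corrupt the concatenated pair, so recovered spans may come from either language."""
    return corrupt_spans(src + tgt)

def translation_span_corruption(src, tgt):
    """TSC: corrupt only one language; the other side stays intact as cross-lingual context."""
    corrupted, target = corrupt_spans(tgt)
    return src + corrupted, target
```

For example, translation_span_corruption("Thank you very much".split(), "Vielen herzlichen Dank".split()) keeps the English sentence intact and asks the decoder to recover only the masked German spans. The partially non-autoregressive objective mentioned in the abstract would then divide such a decoder target into groups of sentinel-marked spans, each predicted without attending to tokens in the other groups.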


Related research

XLM-K: Improving Cross-Lingual Language Model Pre-Training with Multilingual Knowledge (09/26/2021)
Cross-lingual pre-training has achieved great successes using monolingua...

XLM-T: Scaling up Multilingual Machine Translation with Pretrained Cross-lingual Transformer Encoders (12/31/2020)
Multilingual machine translation enables a single model to translate bet...

VECO: Variable Encoder-decoder Pre-training for Cross-lingual Understanding and Generation (10/30/2020)
Recent studies about learning multilingual representations have achieved...

Co-Driven Recognition of Semantic Consistency via the Fusion of Transformer and HowNet Sememes Knowledge (02/21/2023)
Semantic consistency recognition aims to detect and judge whether the se...

mLongT5: A Multilingual and Efficient Text-To-Text Transformer for Longer Sequences (05/18/2023)
We present our work on developing a multilingual, efficient text-to-text...

Bridging Cross-Lingual Gaps During Leveraging the Multilingual Sequence-to-Sequence Pretraining for Text Generation (04/16/2022)
For multilingual sequence-to-sequence pretrained language models (multil...

Adversarial Alignment of Multilingual Models for Extracting Temporal Expressions from Text (05/19/2020)
Although temporal tagging is still dominated by rule-based systems, ther...
