DeltaLM: Encoder-Decoder Pre-training for Language Generation and Translation by Augmenting Pretrained Multilingual Encoders

06/25/2021
by Shuming Ma, et al.

While pretrained encoders have achieved success in various natural language understanding (NLU) tasks, there is a gap between these pretrained encoders and natural language generation (NLG). NLG tasks are often based on the encoder-decoder framework, where a pretrained encoder can benefit only part of the model. To reduce this gap, we introduce DeltaLM, a pretrained multilingual encoder-decoder model that regards the decoder as the task layer of off-the-shelf pretrained encoders. Specifically, we augment the pretrained multilingual encoder with a decoder and pre-train it in a self-supervised way. To take advantage of both large-scale monolingual data and bilingual data, we adopt span corruption and translation span corruption as the pre-training tasks. Experiments show that DeltaLM outperforms various strong baselines on both natural language generation and translation tasks, including machine translation, abstractive text summarization, data-to-text, and question generation.
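To make the two objectives concrete, the following is a minimal Python sketch of how the training pairs could be constructed. The sentinel-token format, corruption rate, and span length are illustrative assumptions, not DeltaLM's published configuration.

```python
# Minimal sketch of the two pre-training objectives, assuming a
# T5-style sentinel scheme. Sentinel format, corruption rate, and
# span length are illustrative assumptions.
import random

def span_corrupt(tokens, corruption_rate=0.15, span_len=3):
    """Span corruption: replace random spans of the input with
    sentinel tokens; the target reconstructs each masked span,
    prefixed by its sentinel."""
    budget = max(1, int(len(tokens) * corruption_rate))
    source, target = [], []
    i = masked = sentinel_id = 0
    while i < len(tokens):
        if masked < budget and random.random() < corruption_rate:
            sentinel = f"<extra_id_{sentinel_id}>"  # hypothetical sentinel token
            span = tokens[i:i + span_len]
            source.append(sentinel)
            target.append(sentinel)
            target.extend(span)
            i += len(span)
            masked += len(span)
            sentinel_id += 1
        else:
            source.append(tokens[i])
            i += 1
    return source, target

def translation_span_corrupt(src_tokens, tgt_tokens, **kwargs):
    """Translation span corruption: concatenate a bilingual sentence
    pair and apply the same corruption to the joint sequence, so the
    model can recover masked spans using the other language."""
    return span_corrupt(src_tokens + ["</s>"] + tgt_tokens, **kwargs)

# Monolingual example:
src, tgt = span_corrupt("the quick brown fox jumps over the lazy dog".split())
# Bilingual example:
src2, tgt2 = translation_span_corrupt("hello world".split(), "hallo Welt".split())
```

Under either objective, the encoder, initialized from the off-the-shelf pretrained multilingual encoder, reads the corrupted sequence, while the newly added decoder learns to reconstruct the masked spans.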


