ERNIE-GEN: An Enhanced Multi-Flow Pre-training and Fine-tuning Framework for Natural Language Generation

01/26/2020
by Dongling Xiao, et al.

Current pre-training work in natural language generation pays little attention to the problem of exposure bias on downstream tasks. To address this issue, we propose ERNIE-GEN, an enhanced multi-flow sequence-to-sequence pre-training and fine-tuning framework that bridges the discrepancy between training and inference with an infilling generation mechanism and a noise-aware generation method. To make generation closer to human writing patterns, the framework introduces a span-by-span generation flow that trains the model to predict semantically complete spans consecutively rather than word by word. Unlike existing pre-training methods, ERNIE-GEN incorporates multi-granularity target sampling to construct pre-training data, which enhances the correlation between encoder and decoder. Experimental results demonstrate that ERNIE-GEN achieves state-of-the-art results with much less pre-training data and fewer parameters on a range of language generation tasks, including abstractive summarization (Gigaword and CNN/DailyMail), question generation (SQuAD), dialogue generation (Persona-Chat), and generative question answering (CoQA).
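For intuition, the sketch below illustrates one of the ideas mentioned above, noise-aware generation: during training, a fraction of the target-side tokens fed to the decoder is randomly corrupted, so the model learns to keep generating correctly from imperfect histories, which is how the approach mitigates exposure bias. This is a minimal sketch under stated assumptions; the function name, noise rate, and vocabulary size are illustrative, not the paper's exact implementation.

```python
import random

def corrupt_target(token_ids, vocab_size, noise_rate=0.2, seed=None):
    """Randomly replace a fraction of target tokens with random vocabulary ids.

    The corrupted sequence mimics the noisy histories the decoder sees at
    inference time, which is the intuition behind noise-aware generation.
    The noise_rate value here is a placeholder, not a reported hyperparameter.
    """
    rng = random.Random(seed)
    return [
        rng.randrange(vocab_size) if rng.random() < noise_rate else tok
        for tok in token_ids
    ]

# Toy usage: the corrupted sequence is fed to the decoder as its input,
# while the original (clean) sequence remains the prediction target.
target = [101, 2023, 2003, 1037, 7099, 102]
decoder_input = corrupt_target(target, vocab_size=30522, seed=0)
print(decoder_input)
```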
