Improving Non-autoregressive Generation with Mixup Training

10/21/2021
by Ting Jiang, et al.

While pre-trained language models have achieved great success on various natural language understanding tasks, how to effectively leverage them in non-autoregressive generation tasks remains a challenge. To address this problem, we present a non-autoregressive generation model based on pre-trained transformer models. To bridge the gap between autoregressive and non-autoregressive models, we propose a simple and effective iterative training method called MIx Source and pseudo Target (MIST). Unlike other iterative decoding methods, which sacrifice inference speed to achieve better performance through multiple decoding iterations, MIST works in the training stage and has no effect on inference time. Our experiments on three generation benchmarks, including question generation, summarization, and paraphrase generation, show that the proposed framework achieves new state-of-the-art results among fully non-autoregressive models. We also demonstrate that our method can be applied to a variety of pre-trained models. For instance, MIST based on a small pre-trained model achieves performance comparable to that of seq2seq models.
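The abstract only names the idea, so the following is a minimal PyTorch sketch of what MIST-style training could look like: alongside the standard non-autoregressive (masked-target) objective, the model generates pseudo targets from its own predictions and is trained again conditioned on them. The model signature, the mixing rule, and all helper names here are assumptions for illustration, not the paper's exact algorithm.

```python
import torch
import torch.nn.functional as F

def mist_step(model, src_ids, tgt_ids, mask_id, optimizer):
    """One hypothetical MIST training step.

    Assumes `model(src_ids, tgt_ids) -> logits` of shape (B, T, V),
    where the second argument is the (possibly masked) target sequence
    the decoder conditions on.
    """
    # --- standard NAR loss: predict all gold tokens in parallel
    #     from the source plus a fully masked target ---
    masked_tgt = torch.full_like(tgt_ids, mask_id)
    logits = model(src_ids, masked_tgt)                     # (B, T, V)
    nar_loss = F.cross_entropy(logits.transpose(1, 2), tgt_ids)

    # --- generate pseudo targets with the current model (no gradients) ---
    with torch.no_grad():
        pseudo_tgt = model(src_ids, masked_tgt).argmax(-1)  # (B, T)

    # --- "mix source and pseudo target": condition on the model's own
    #     predictions instead of the all-mask sequence, and train it to
    #     recover the gold target (the real mixing rule may differ) ---
    logits_mix = model(src_ids, pseudo_tgt)
    mist_loss = F.cross_entropy(logits_mix.transpose(1, 2), tgt_ids)

    loss = nar_loss + mist_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The appeal of this arrangement, as the abstract describes it, is that the model gets exposed to its own (imperfect) outputs during training, yet inference remains a single parallel decoding pass, so the speed advantage of fully non-autoregressive generation is preserved.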


