Non-Autoregressive Text Generation with Pre-trained Language Models

02/16/2021
by Yixuan Su, et al.

Non-autoregressive generation (NAG) has recently attracted great attention due to its fast inference speed. However, the generation quality of existing NAG models still lags behind their autoregressive counterparts. In this work, we show that BERT can be employed as the backbone of a NAG model to greatly improve performance. Additionally, we devise mechanisms to alleviate the two common problems of vanilla NAG models: the inflexibility of a pre-fixed output length and the conditional independence of individual token predictions. Lastly, to further increase the speed advantage of the proposed model, we propose a new decoding strategy, ratio-first, for applications where the output length can be approximately estimated beforehand. For a comprehensive evaluation, we test the proposed model on three text generation tasks: text summarization, sentence compression, and machine translation. Experimental results show that our model significantly outperforms existing non-autoregressive baselines and achieves competitive performance with many strong autoregressive models. We also conduct extensive analysis experiments to reveal the effect of each proposed component.
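To make the ideas above concrete, here is a minimal sketch of one-shot non-autoregressive decoding on top of a masked language model, combined with a ratio-first style length heuristic. It is an illustrative approximation rather than the authors' implementation: the ratio alpha, the choice of HuggingFace's BertForMaskedLM, and the source-plus-[MASK] input layout are all assumptions made for the example.

```python
# Illustrative sketch (NOT the authors' code): one-shot non-autoregressive
# decoding with a BERT masked language model, plus a ratio-first style
# length heuristic. The ratio `alpha`, the use of BertForMaskedLM, and the
# source+[MASK] input layout are assumptions made for this example.

import torch
from transformers import BertForMaskedLM, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()


def ratio_first_generate(source: str, alpha: float = 1.2) -> str:
    """Predict every target token in a single parallel forward pass.

    Rather than decoding up to some large maximum length, the number of
    target positions is fixed up front as int(alpha * source_length),
    mirroring the idea behind ratio-first decoding (alpha is a
    hypothetical hand-tuned ratio here).
    """
    src_ids = tokenizer(source, return_tensors="pt").input_ids
    tgt_len = int(alpha * src_ids.size(1))

    # Append a block of [MASK] placeholders after the source. All masked
    # positions are filled simultaneously and conditionally independently
    # of one another, which is the source of NAG's speed advantage (and of
    # its quality problems, since tokens cannot see each other's outputs).
    masks = torch.full((1, tgt_len), tokenizer.mask_token_id, dtype=torch.long)
    input_ids = torch.cat([src_ids, masks], dim=1)

    with torch.no_grad():
        logits = model(input_ids=input_ids).logits

    # Greedy (argmax) choice at each masked position, all in one step.
    pred_ids = logits[0, src_ids.size(1):].argmax(dim=-1)
    return tokenizer.decode(pred_ids, skip_special_tokens=True)


print(ratio_first_generate("non-autoregressive decoding predicts tokens in parallel"))
```

Off-the-shelf bert-base-uncased was never trained for this input layout, so the paper fine-tunes the pre-trained backbone for generation; the sketch only shows why a single parallel pass over roughly alpha times the source length is cheaper than autoregressive decoding or exhaustive search over candidate output lengths.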

Related research

04/24/2023 · Directed Acyclic Transformer Pre-training for High-quality Non-autoregressive Text Generation
Non-AutoRegressive (NAR) text generation models have drawn much attention...

11/25/2019 · Non-autoregressive Transformer by Position Learning
Non-autoregressive models are promising on various text generation tasks...

08/26/2022 · Nearest Neighbor Non-autoregressive Text Generation
Non-autoregressive (NAR) models can generate sentences with less computation...

06/01/2020 · Cascaded Text Generation with Markov Transformers
The two dominant approaches to neural text generation are fully autoregressive...

08/30/2019 · Autoregressive Text Generation Beyond Feedback Loops
Autoregressive state transitions, where predictions are conditioned on past predictions...

12/29/2020 · A Theoretical Analysis of the Repetition Problem in Text Generation
Text generation tasks, including translation, summarization, language modeling...

02/15/2023 · Big Little Transformer Decoder
The recent emergence of Large Language Models based on the Transformer architecture...
