Directed Acyclic Transformer Pre-training for High-quality Non-autoregressive Text Generation

04/24/2023
by Fei Huang, et al.

Non-AutoRegressive (NAR) text generation models have drawn much attention because of their significantly faster decoding speed and good generation quality in machine translation. However, in a wider range of text generation tasks, existing NAR models lack proper pre-training, leaving them far behind pre-trained autoregressive models. In this paper, we propose the Pre-trained Directed Acyclic Transformer (PreDAT) and a novel pre-training task to promote prediction consistency in NAR generation. Experiments on five text generation tasks show that our PreDAT remarkably outperforms existing pre-trained NAR models (+4.2 scores on average) and even achieves better results than pre-trained autoregressive baselines on n-gram-based metrics, along with a 17x speedup in throughput. Further analysis shows that PreDAT benefits from an unbiased prediction order that alleviates the error accumulation problem of autoregressive generation, providing new insights into the advantages of NAR generation.
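The throughput gain comes from the decoding pattern rather than model size: an autoregressive decoder needs one forward pass per output token, while a NAR decoder predicts all target positions in a single pass. The sketch below is only a toy illustration of that difference, not the paper's PreDAT implementation; `ar_step` and `nar_forward` are hypothetical stubs standing in for real model forward passes.

```python
# Toy contrast between autoregressive (AR) and non-autoregressive (NAR) decoding.
# The "models" here are hypothetical stubs; they only illustrate why NAR decoding
# yields higher throughput, not how PreDAT itself works.
import random
import time

VOCAB = ["the", "cat", "sat", "on", "mat", "<eos>"]

def ar_step(prefix):
    """Hypothetical AR model: predicts ONE next token given the current prefix."""
    time.sleep(0.01)  # stand-in for one full forward pass
    return random.choice(VOCAB)

def nar_forward(src, max_len):
    """Hypothetical NAR model: predicts ALL target positions in one pass."""
    time.sleep(0.01)  # a single forward pass, regardless of output length
    return [random.choice(VOCAB) for _ in range(max_len)]

def decode_ar(src, max_len=20):
    out = []
    for _ in range(max_len):          # one forward pass per generated token
        tok = ar_step(out)
        if tok == "<eos>":
            break
        out.append(tok)
    return out

def decode_nar(src, max_len=20):
    toks = nar_forward(src, max_len)  # one forward pass for the whole sequence
    return [t for t in toks if t != "<eos>"]

print(decode_ar("a source sentence"))
print(decode_nar("a source sentence"))
```

Decoding 20 tokens costs roughly 20 stub calls in `decode_ar` but a single call in `decode_nar`, which mirrors, at a qualitative level, why NAR models can deliver large throughput speedups such as the 17x reported above.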


Related research

10/24/2022 · ELMER: A Non-Autoregressive Pre-trained Language Model for Efficient and Effective Text Generation
We study the text generation task under the approach of pre-trained lang...

02/16/2021 · Non-Autoregressive Text Generation with Pre-trained Language Models
Non-autoregressive generation (NAG) has recently attracted great attenti...

10/21/2021 · Improving Non-autoregressive Generation with Mixup Training
While pre-trained language models have achieved great success on various...

06/01/2020 · Cascaded Text Generation with Markov Transformers
The two dominant approaches to neural text generation are fully autoregr...

04/20/2022 · A Survey on Non-Autoregressive Generation for Neural Machine Translation and Beyond
Non-autoregressive (NAR) generation, which is first proposed in neural m...

11/24/2021 · Octree Transformer: Autoregressive 3D Shape Generation on Hierarchically Structured Sequences
Autoregressive models have proven to be very powerful in NLP text genera...

06/01/2023 · EEL: Efficiently Encoding Lattices for Reranking
Standard decoding approaches for conditional text generation tasks typic...
