Summarize, Outline, and Elaborate: Long-Text Generation via Hierarchical Supervision from Extractive Summaries

by Xiaofei Sun, et al.

Long-text generation remains a challenge. Generating coherent long texts is difficult because existing models focus overwhelmingly on local word prediction: they cannot make high-level plans about what to generate, nor capture the high-level discourse dependencies between chunks of text. Inspired by how humans write, where a list of bullet points or a catalog is outlined first and each bullet point is then expanded to form the whole article, we propose SOE, a pipelined system for long-text generation that consists of summarizing, outlining, and elaborating: the model first outlines the summaries for the different segments of a long text, and then elaborates on each bullet point to generate the corresponding segment. To avoid the labor-intensive process of soliciting summaries, we propose the reconstruction strategy, which extracts segment summaries in an unsupervised manner by selecting the most informative part of each segment, that is, the part that best reconstructs the segment. The proposed generation system has the following merits: (1) the summary provides high-level guidance for text generation and avoids the local minimum of individual word predictions; (2) the high-level discourse dependencies are captured in the conditional dependencies between summaries and are preserved during the summary expansion process; and (3) we are able to consider significantly more context by representing contexts as concise summaries. Extensive experiments demonstrate that SOE produces long texts of significantly better quality, along with faster convergence.
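The reconstruction strategy and the outlining step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how "most informative" is scored, so this example assumes a simple word-coverage proxy (the fraction of the segment's tokens whose word type appears in the candidate sentence). All function names here (`tokenize`, `split_sentences`, `reconstruction_score`, `extract_summary`, `outline`) are hypothetical and chosen for illustration only.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def split_sentences(segment):
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", segment.strip())
    return [p for p in parts if p]


def reconstruction_score(candidate, segment_counts):
    """Assumed proxy for informativeness: the share of the segment's tokens
    whose word type also occurs in the candidate sentence."""
    total = sum(segment_counts.values())
    if total == 0:
        return 0.0
    candidate_types = set(tokenize(candidate))
    covered = sum(n for tok, n in segment_counts.items() if tok in candidate_types)
    return covered / total


def extract_summary(segment):
    """Pick the sentence that best 'reconstructs' the segment under the proxy."""
    sentences = split_sentences(segment)
    if not sentences:
        return ""
    counts = Counter(tokenize(segment))
    return max(sentences, key=lambda s: reconstruction_score(s, counts))


def outline(segments):
    """Build the outline: one extractive summary (bullet point) per segment."""
    return [extract_summary(s) for s in segments]


if __name__ == "__main__":
    segments = [
        "The cat sat. The cat sat on the mat near the door. Dogs bark.",
        "Only one sentence here.",
    ]
    for bullet in outline(segments):
        print("-", bullet)
```

In a full system along the lines described above, the extracted summaries would serve as training targets for a model that generates the outline, and a second model would elaborate each bullet point into its segment; those generation steps are omitted here because the abstract gives no further detail about them.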




