Generating Wikipedia by Summarizing Long Sequences

01/30/2018
by Peter J. Liu, et al.

We show that generating English Wikipedia articles can be approached as a multi-document summarization of source documents. We use extractive summarization to coarsely identify salient information and a neural abstractive model to generate the article. For the abstractive model, we introduce a decoder-only architecture that can scalably attend to very long sequences, much longer than typical encoder-decoder architectures used in sequence transduction. We show that this model can generate fluent, coherent multi-sentence paragraphs and even whole Wikipedia articles. When given reference documents, we show it can extract relevant factual information as reflected in perplexity, ROUGE scores and human evaluations.
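
The two-stage setup in the abstract (a coarse extractive pass, then an abstractive model over the selected text) can be illustrated with a minimal sketch of the first stage. The abstract does not say which extractive method is used, so the tf-idf ranking of source paragraphs against the article title below is an assumption made for illustration. The names `tokenize`, `rank_paragraphs` and `build_model_input` are hypothetical, and the decoder-only abstractive model is not shown.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_paragraphs(query, paragraphs):
    """Sort paragraphs by descending tf-idf overlap with the query.

    Assumption: tf-idf ranking stands in for the unspecified
    extractive step described in the abstract.
    """
    docs = [Counter(tokenize(p)) for p in paragraphs]
    n = len(docs)
    query_terms = set(tokenize(query))

    def idf(term):
        # Smoothed inverse document frequency over the source paragraphs.
        df = sum(1 for d in docs if term in d)
        return math.log((1 + n) / (1 + df)) + 1.0

    weights = {t: idf(t) for t in query_terms}

    def score(doc):
        # Term frequency in the paragraph times the idf weight of each query term.
        return sum(doc[t] * weights[t] for t in query_terms)

    order = sorted(range(n), key=lambda i: score(docs[i]), reverse=True)
    return [paragraphs[i] for i in order]


def build_model_input(title, paragraphs, max_tokens):
    """Concatenate the title and ranked paragraphs, truncated to max_tokens.

    The truncated token sequence is what a long-sequence abstractive
    model would condition on in this sketch.
    """
    tokens = tokenize(title)
    for p in rank_paragraphs(title, paragraphs):
        tokens.extend(tokenize(p))
    return tokens[:max_tokens]


if __name__ == "__main__":
    sources = [
        "The transformer decoder attends to long sequences.",
        "Bananas are yellow.",
        "Decoder decoder models generate text.",
    ]
    for p in rank_paragraphs("transformer decoder", sources):
        print(p)
```

Ranking before truncation means that, when the source material exceeds the model's input length, the paragraphs most related to the title are the ones that survive the cut.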


