Learning to Summarize Passages: Mining Passage-Summary Pairs from Wikipedia Revision Histories

04/06/2020
by Qingyu Zhou, et al.

In this paper, we propose a method for automatically constructing a passage-to-summary dataset by mining Wikipedia page revision histories. In particular, the method mines main-body passages and introduction sentences that are added to a page simultaneously. The constructed dataset contains more than one hundred thousand passage-summary pairs. Our quality analysis suggests that the dataset is promising as a training and validation set for passage summarization. We validate and analyze the performance of various summarization systems on the proposed dataset. The dataset will be available online at https://res.qyzhou.me.
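
The mining idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how revisions are parsed, aligned, or filtered. The sketch assumes "added simultaneously" means added in the same revision, and that each revision has already been split into introduction sentences and main-body passages. The names `Revision`, `added`, and `mine_pairs` are hypothetical, and exact string matching stands in for whatever diffing the real pipeline uses.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Revision:
    """One snapshot of a page, pre-split into units (hypothetical schema)."""
    intro_sentences: list[str]
    body_passages: list[str]


def added(old: list[str], new: list[str]) -> list[str]:
    """Return the items of `new` that do not appear in `old`, in order."""
    seen = set(old)
    return [item for item in new if item not in seen]


def mine_pairs(revisions: list[Revision]) -> list[tuple[str, str]]:
    """Collect (passage, summary) pairs from consecutive revisions.

    Assumption: a pair is kept only when one revision adds both body
    text and introduction text; all other revisions are skipped.
    """
    pairs = []
    for prev, curr in zip(revisions, revisions[1:]):
        new_intro = added(prev.intro_sentences, curr.intro_sentences)
        new_body = added(prev.body_passages, curr.body_passages)
        if new_intro and new_body:
            pairs.append(("\n".join(new_body), " ".join(new_intro)))
    return pairs


if __name__ == "__main__":
    # Toy history: the second revision adds to both sections,
    # the third revision adds body text only and yields no pair.
    history = [
        Revision(["A is a town."], ["History of A."]),
        Revision(["A is a town.", "It has a castle."],
                 ["History of A.", "The castle was built in 1200."]),
        Revision(["A is a town.", "It has a castle."],
                 ["History of A.", "The castle was built in 1200.",
                  "Population data."]),
    ]
    for passage, summary in mine_pairs(history):
        print(summary, "<-", passage)
```

A real pipeline would likely need fuzzier matching than exact string equality, since editors often reword existing sentences rather than add new ones, and it would need some quality filtering of the resulting pairs.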

