NEWTS: A Corpus for News Topic-Focused Summarization

05/31/2022
by Seyed Ali Bahrainian, et al.

Text summarization models are approaching human levels of fidelity. Existing benchmarking corpora provide concordant pairs of full and abridged versions of Web, news, or professional content. To date, all summarization datasets operate under a one-size-fits-all paradigm that may not reflect the full range of organic summarization needs. Several recently proposed models (e.g., plug and play language models) have the capacity to condition the generated summaries on a desired range of themes. These capacities remain largely unused and unevaluated because there is no dedicated dataset that supports the task of topic-focused summarization. This paper introduces the first topical summarization corpus, NEWTS, which is based on the well-known CNN/DailyMail dataset and annotated via online crowd-sourcing. Each source article is paired with two reference summaries, each focusing on a different theme of the source document. We evaluate a representative range of existing techniques and analyze the effectiveness of different prompting methods.
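To make the pairing described in the abstract concrete, here is a minimal Python sketch of how one article with two topic-focused reference summaries might be represented, and how a topic prompt could be attached to the input. Everything in it is an assumption made for illustration: the class names, the `topic_words` field, and the `build_prompt` scheme are hypothetical and are not taken from the NEWTS release or from the paper's prompting methods.

```python
from dataclasses import dataclass
from typing import Iterator, Tuple


@dataclass(frozen=True)
class TopicSummary:
    """One reference summary together with the theme it focuses on."""
    topic_words: Tuple[str, ...]  # hypothetical field: words describing the theme
    summary: str


def build_prompt(article: str, topic_words: Tuple[str, ...]) -> str:
    """Prepend the topic words to the article (one simple, assumed prompt format)."""
    return f"Topic: {', '.join(topic_words)}\n\n{article}"


@dataclass(frozen=True)
class TopicFocusedExample:
    """A source article paired with two topic-focused reference summaries."""
    article: str
    first: TopicSummary
    second: TopicSummary

    def pairs(self) -> Iterator[Tuple[str, str]]:
        """Yield (prompted_input, reference_summary) once per theme."""
        for ref in (self.first, self.second):
            yield build_prompt(self.article, ref.topic_words), ref.summary
```

Under these assumptions, a single article yields two training or evaluation pairs that share the same source text and differ only in the topic prefix and the target summary. That is the property a topic-conditioned summarizer would need to exploit.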


