On the Trade-off between Redundancy and Local Coherence in Summarization

05/20/2022
by Ronald Cardenas, et al.

Extractive summarization systems are known to produce text that is poorly coherent and, unless redundancy is explicitly accounted for, highly redundant. In this work, we tackle the problem of summary redundancy in unsupervised extractive summarization of long, highly redundant documents. To do so, we leverage a psycholinguistic theory of human reading comprehension that directly models local coherence and redundancy. Implementing this theory, our system operates at the proposition level and exploits properties of human memory representations to assign similar ranks to content units that are coherent and non-redundant, thereby encouraging the extraction of less redundant final summaries. Because summary length affects automatic measures, we control for it by formulating content selection as an optimization problem with soft constraints on the budget of information retrieved. Using summarization of scientific articles as a case study, extensive experiments demonstrate that the proposed systems extract consistently less redundant summaries across increasing levels of document redundancy, while maintaining performance comparable to strong unsupervised baselines in terms of relevancy and local coherence, according to automated evaluations.
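As a rough illustration of what content selection under a soft length budget can look like, the sketch below greedily picks scored content units and penalizes, rather than forbids, exceeding the budget. It is not the authors' formulation: the function name `select_units`, the linear `penalty` term, the greedy ordering by score per token, and the toy scores are all assumptions made for this example.

```python
from typing import List, Tuple

# Hypothetical unit format: (text, relevance_score, length_in_tokens).
Unit = Tuple[str, float, int]


def select_units(units: List[Unit], budget: int, penalty: float = 0.1) -> List[str]:
    """Greedy selection with a soft budget (illustrative assumption only).

    Objective: sum of scores minus penalty * (tokens over budget).
    Going over budget is allowed when the gain in score outweighs the penalty.
    Lengths are assumed to be positive.
    """

    def objective(total_score: float, total_len: int) -> float:
        return total_score - penalty * max(0, total_len - budget)

    chosen: List[str] = []
    total_score, total_len = 0.0, 0
    # Visit units in order of score per token, highest first.
    for text, score, length in sorted(units, key=lambda u: u[1] / u[2], reverse=True):
        candidate = objective(total_score + score, total_len + length)
        # Keep the unit only if it improves the penalized objective.
        if candidate > objective(total_score, total_len):
            chosen.append(text)
            total_score += score
            total_len += length
    return chosen


units = [("a", 1.0, 5), ("b", 0.9, 5), ("c", 0.2, 5)]
strict = select_units(units, budget=10, penalty=0.1)
lenient = select_units(units, budget=10, penalty=0.01)
```

With the larger penalty, the low-scoring unit `c` is left out because it would push the selection past the budget; with the smaller penalty it is admitted. That difference is what separates a soft constraint from a hard cutoff.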


research
04/16/2021

Unsupervised Extractive Summarization by Human Memory Simulation

Summarization systems face the core challenge of identifying and selecti...
research
06/12/2017

Extract with Order for Coherent Multi-Document Summarization

In this work, we aim at developing an extractive summarizer in the multi...
research
01/12/2019

What comes next? Extractive summarization by next-sentence prediction

Existing approaches to automatic summarization assume that a length limi...
research
09/04/2019

An Entity-Driven Framework for Abstractive Summarization

Abstractive summarization systems aim to produce more coherent and conci...
research
05/24/2023

AWESOME: GPU Memory-constrained Long Document Summarization using Memory Mechanism and Global Salient Content

Long document summarization systems are critical for domains with length...
research
05/31/2019

Improving the Similarity Measure of Determinantal Point Processes for Extractive Multi-Document Summarization

The most important obstacles facing multi-document summarization include...
research
11/30/2020

Systematically Exploring Redundancy Reduction in Summarizing Long Documents

Our analysis of large summarization datasets indicates that redundancy i...
