Text Summarization with Oracle Expectation

09/26/2022
by Yumo Xu, et al.

Extractive summarization produces summaries by identifying and concatenating the most important sentences in a document. Since most summarization datasets do not come with gold labels indicating whether document sentences are summary-worthy, different labeling algorithms have been proposed to extrapolate oracle extracts for model training. In this work, we identify two flaws with the widely used greedy labeling approach: it delivers suboptimal and deterministic oracles. To alleviate both issues, we propose a simple yet effective labeling algorithm that creates soft, expectation-based sentence labels. We define a new learning objective for extractive summarization which incorporates learning signals from multiple oracle summaries and prove it is equivalent to estimating the oracle expectation for each document sentence. Without any architectural modifications, the proposed labeling scheme achieves superior performance on a variety of summarization benchmarks across domains and languages, in both supervised and zero-shot settings.
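The contrast between greedy labels and expectation-based labels can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how candidate oracles are generated or scored, so the sketch assumes exhaustive enumeration of small sentence subsets, a unigram-overlap F1 as a rough stand-in for ROUGE, and a softmax with a temperature to turn scores into a distribution over candidate oracles. All function names and parameters (unigram_f1, greedy_labels, expectation_labels, max_sentences, temperature) are hypothetical.

```python
from collections import Counter
from itertools import combinations
import math


def unigram_f1(candidate_tokens, reference_tokens):
    """Unigram-overlap F1; an assumed stand-in for a ROUGE-style metric."""
    if not candidate_tokens or not reference_tokens:
        return 0.0
    overlap = sum((Counter(candidate_tokens) & Counter(reference_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(candidate_tokens)
    recall = overlap / len(reference_tokens)
    return 2 * precision * recall / (precision + recall)


def score_extract(sentences, indices, reference):
    """Score the concatenation of the selected (tokenized) sentences."""
    tokens = [tok for i in indices for tok in sentences[i]]
    return unigram_f1(tokens, reference)


def greedy_labels(sentences, reference, max_sentences=2):
    """Greedy labeling: repeatedly add the sentence with the best score,
    stopping when no addition improves it. Yields one hard 0/1 oracle."""
    selected, best = [], 0.0
    while len(selected) < max_sentences:
        options = [(score_extract(sentences, selected + [i], reference), i)
                   for i in range(len(sentences)) if i not in selected]
        if not options:
            break
        top_score, top_idx = max(options)
        if top_score <= best:
            break
        selected.append(top_idx)
        best = top_score
    return [1.0 if i in selected else 0.0 for i in range(len(sentences))]


def expectation_labels(sentences, reference, max_sentences=2, temperature=0.1):
    """Soft labels: for each sentence, the total probability of the candidate
    extracts that contain it, under a softmax over extract scores (assumed)."""
    candidates = [c for k in range(1, max_sentences + 1)
                  for c in combinations(range(len(sentences)), k)]
    if not candidates:
        return []
    scores = [score_extract(sentences, c, reference) for c in candidates]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp((s - top) / temperature) for s in scores]
    total = sum(weights)
    labels = [0.0] * len(sentences)
    for cand, weight in zip(candidates, weights):
        for i in cand:
            labels[i] += weight / total
    return labels
```

Both functions take a document as a list of tokenized sentences and a tokenized reference summary. The greedy variant commits to a single extract, so every sentence outside it gets a label of exactly 0, while the expectation variant gives partial credit to sentences that appear in other high-scoring extracts. Exhaustive enumeration is only practical for short documents; a real system would need a cheaper way to propose candidate oracles.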


