
Bottom-Up Abstractive Summarization

by   Sebastian Gehrmann, et al.
Harvard University

Neural network-based methods for abstractive summarization produce outputs that are more fluent than other techniques, but can be poor at content selection. This work proposes a simple technique for addressing this issue: use a data-efficient content selector to over-determine phrases in a source document that should be part of the summary. We use this selector as a bottom-up attention step to constrain the model to likely phrases. We show that this approach improves the ability to compress text while still generating fluent summaries. This two-step process is both simpler and higher performing than other end-to-end content selection models, leading to significant improvements on ROUGE for both the CNN-DM and NYT corpora. Furthermore, the content selector can be trained with as little as 1,000 sentences, making it easy to transfer a trained summarizer to a new domain.
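The bottom-up attention step can be pictured as masking the summarizer's copy attention so that only tokens the content selector deems likely remain candidates, then renormalizing. The sketch below is an illustration of that idea, not the paper's implementation; the function name, threshold value, and fallback behavior are assumptions for the example.

```python
import numpy as np

def bottom_up_mask(copy_attention, selection_probs, threshold=0.5):
    """Constrain copy attention to tokens the content selector marked likely.

    copy_attention: (src_len,) attention over source tokens at one decoding step
    selection_probs: (src_len,) selector probability that each token belongs
                     in the summary
    threshold: hypothetical cutoff for treating a token as selected
    """
    mask = selection_probs >= threshold
    masked = np.where(mask, copy_attention, 0.0)
    total = masked.sum()
    if total == 0.0:
        # Assumed fallback: if no token passes, leave attention unconstrained
        return copy_attention
    return masked / total  # renormalize to a valid distribution

# Toy example: the 2nd token is attended to but not selected, so it is masked out.
attn = np.array([0.1, 0.4, 0.3, 0.2])
sel = np.array([0.9, 0.2, 0.8, 0.6])
print(bottom_up_mask(attn, sel))
```

Because the selector only gates which source tokens are copyable, it can be trained separately (and on little data), which is what makes the two-step pipeline simple to transfer to new domains.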



