PRIMER: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization

10/16/2021
by Wen Xiao, et al.

Recently proposed pre-trained generation models achieve strong performance on single-document summarization benchmarks. However, most of them are pre-trained with general-purpose objectives and mainly aim to process single-document inputs. In this paper, we propose PRIMER, a pre-trained model for multi-document representation with a focus on summarization that reduces the need for dataset-specific architectures and large amounts of labeled fine-tuning data. Specifically, we adopt the Longformer architecture, with a suitable input transformation and global attention to handle multi-document inputs, and we use the Gap Sentence Generation objective with a new strategy, called Entity Pyramid, for selecting sentences that are salient for the whole cluster, teaching the model to select and aggregate information across a cluster of related documents. In extensive experiments on 6 multi-document summarization datasets from 3 different domains, under zero-shot, few-shot, and fully supervised settings, PRIMER outperforms current state-of-the-art models in most of these settings by large margins. Code and pre-trained models are released at https://github.com/allenai/PRIMER
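To make the input transformation concrete, below is a minimal sketch, not the authors' released implementation, of how a cluster of documents might be packed into a single Longformer-style input: the documents are concatenated with a special separator token, and global attention is assigned to the separator tokens (plus the leading token) so those positions can attend across document boundaries. The <doc-sep> token name, the allenai/led-base-16384 checkpoint, and the pack_cluster helper are illustrative assumptions, not details stated in the abstract.

    # Minimal sketch of multi-document input packing for a Longformer-style
    # encoder. All names here (<doc-sep>, the LED checkpoint, pack_cluster)
    # are illustrative assumptions, not the paper's released code.
    from transformers import AutoTokenizer

    DOC_SEP = "<doc-sep>"  # assumed special token marking document boundaries

    tokenizer = AutoTokenizer.from_pretrained("allenai/led-base-16384")
    tokenizer.add_special_tokens({"additional_special_tokens": [DOC_SEP]})

    def pack_cluster(documents, max_len=4096):
        """Concatenate a cluster of related documents into one sequence and
        build a global-attention mask that is 1 on every separator token
        (and on the first token, per the usual Longformer convention)."""
        text = DOC_SEP.join(documents)
        enc = tokenizer(text, truncation=True, max_length=max_len,
                        return_tensors="pt")
        sep_id = tokenizer.convert_tokens_to_ids(DOC_SEP)
        global_attention_mask = (enc["input_ids"] == sep_id).long()
        global_attention_mask[:, 0] = 1  # global attention on the first token
        return enc["input_ids"], enc["attention_mask"], global_attention_mask

    if __name__ == "__main__":
        ids, attn, global_attn = pack_cluster(
            ["First article about the event.", "Second article, same event."])
        print(ids.shape, global_attn.sum().item())

The design point this sketch illustrates is that the separator tokens double as aggregation points: with global attention on them, information can flow between documents without paying the quadratic cost of full attention over the whole concatenated input.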


Related research

04/04/2020
STEP: Sequence-to-Sequence Transformer Pre-training for Document Summarization
Abstractive summarization aims to rewrite a long document to its shorter...

08/01/2022
Multi-Document Summarization with Centroid-Based Pretraining
In multi-document summarization (MDS), the input is a cluster of documen...

07/05/2022
Improving the Faithfulness of Abstractive Summarization via Entity Coverage Control
Abstractive summarization systems leveraging pre-training language model...

09/10/2023
Multi-document Summarization: A Comparative Evaluation
This paper is aimed at evaluating state-of-the-art models for Multi-docu...

08/17/2022
An Efficient Coarse-to-Fine Facet-Aware Unsupervised Summarization Framework based on Semantic Blocks
Unsupervised summarization methods have achieved remarkable results by i...

12/01/2021
Controlling Conditional Language Models with Distributional Policy Gradients
Machine learning is shifting towards general-purpose pretrained generati...

10/12/2021
HETFORMER: Heterogeneous Transformer with Sparse Attention for Long-Text Extractive Summarization
To capture the semantic graph structure from raw text, most existing sum...
