PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization

12/18/2019
by Jingqing Zhang, et al.

Recent work pre-training Transformers with self-supervised objectives on large text corpora has shown great success when fine-tuned on downstream NLP tasks, including text summarization. However, pre-training objectives tailored for abstractive text summarization have not been explored. Furthermore, there is a lack of systematic evaluation across diverse domains. In this work, we propose pre-training large Transformer-based encoder-decoder models on massive text corpora with a new self-supervised objective. In PEGASUS, important sentences are removed or masked from an input document and are generated together as one output sequence from the remaining sentences, similar to an extractive summary. We evaluated our best PEGASUS model on 12 downstream summarization tasks spanning news, science, stories, instructions, emails, patents, and legislative bills. Experiments demonstrate that it achieves state-of-the-art performance on all 12 downstream datasets, as measured by ROUGE scores. Our model also shows surprising performance on low-resource summarization, surpassing previous state-of-the-art results on 6 datasets with only 1,000 examples.
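
The gap-sentence objective described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how "important" sentences are chosen, so the sketch assumes a simple unigram-overlap F1 between each sentence and the rest of the document as a stand-in importance score. The names `split_sentences`, `unigram_f1`, `make_gap_sentence_example`, and the `<mask>` placeholder are hypothetical, and the regex sentence splitter is deliberately naive.

```python
import re
from collections import Counter

MASK_TOKEN = "<mask>"  # hypothetical placeholder; not taken from the abstract


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def unigram_f1(candidate, reference):
    """Unigram-overlap F1 between two strings (assumed proxy for importance)."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def make_gap_sentence_example(document, num_gaps=1):
    """Build one (source, target) pre-training pair.

    The highest-scoring sentences are replaced by MASK_TOKEN in the source
    and concatenated, in document order, to form the target sequence.
    """
    sentences = split_sentences(document)
    # Score each sentence against the remainder of the document.
    scores = []
    for i, sentence in enumerate(sentences):
        rest = " ".join(sentences[:i] + sentences[i + 1:])
        scores.append(unigram_f1(sentence, rest))
    # Pick the top-scoring sentence indices, then restore document order.
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    selected = sorted(ranked[:num_gaps])
    chosen = set(selected)
    source = " ".join(
        MASK_TOKEN if i in chosen else s for i, s in enumerate(sentences)
    )
    target = " ".join(sentences[i] for i in selected)
    return source, target


if __name__ == "__main__":
    doc = (
        "The cat sat on the mat. "
        "The cat chased the mouse on the mat. "
        "Dogs bark."
    )
    src, tgt = make_gap_sentence_example(doc, num_gaps=1)
    print("source:", src)
    print("target:", tgt)
```

An encoder-decoder model would then be trained to produce the target from the source. Swapping in a different importance score only requires replacing `unigram_f1`.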

research
09/09/2021

ARMAN: Pre-training with Semantically Selecting and Reordering of Sentences for Persian Abstractive Summarization

Abstractive text summarization is one of the areas influenced by the eme...
research
06/11/2019

Self-Supervised Learning for Contextualized Extractive Summarization

Existing models for extractive summarization are usually trained from sc...
research
09/15/2023

Structural Self-Supervised Objectives for Transformers

This thesis focuses on improving the pre-training of natural language mo...
research
03/09/2022

Text-DIAE: Degradation Invariant Autoencoders for Text Recognition and Document Enhancement

In this work, we propose Text-Degradation Invariant Auto Encoder (Text-D...
research
07/30/2020

Leverage Unlabeled Data for Abstractive Speech Summarization with Self-Supervised Learning and Back-Summarization

Supervised approaches for Neural Abstractive Summarization require large...
research
10/20/2020

Topic-Aware Abstractive Text Summarization

Automatic text summarization aims at condensing a document to a shorter ...
research
01/17/2023

Transformer Based Implementation for Automatic Book Summarization

Document Summarization is the procedure of generating a meaningful and c...
