Improving Zero and Few-Shot Abstractive Summarization with Intermediate Fine-tuning and Data Augmentation

by Alexander R. Fabbri, et al.

Models pretrained with self-supervised objectives on large text corpora achieve state-of-the-art performance on text summarization tasks. However, these models are typically fine-tuned on hundreds of thousands of data points, an infeasible requirement when applying summarization to new, niche domains. In this work, we introduce a general method, called WikiTransfer, for fine-tuning pretrained models for summarization in an unsupervised, dataset-specific manner that makes use of characteristics of the target dataset, such as the length and abstractiveness of the desired summaries. We achieve state-of-the-art zero-shot abstractive summarization performance on the CNN-DailyMail dataset and demonstrate the effectiveness of our approach on three additional, diverse datasets. The models fine-tuned in this unsupervised manner are more robust to noisy data and also achieve better few-shot performance with 10 and 100 training examples. We perform ablation studies on the components of our unsupervised fine-tuning data, and we analyze the performance of these models in few-shot scenarios, together with data augmentation techniques, using both automatic and human evaluation.
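To make the idea of dataset-specific, unsupervised fine-tuning data more concrete, the following is a minimal Python sketch of how pseudo (source, summary) pairs could be built so that they match a target summary length and a target range of abstractiveness. Everything in it is an assumption made for illustration and is not taken from the abstract: the use of an article's leading sentences as the pseudo-summary, the novel-unigram ratio as the abstractiveness proxy, whitespace tokenization, and the names `novel_unigram_ratio` and `make_pseudo_pair`.

```python
from typing import List, Optional, Tuple


def novel_unigram_ratio(source: str, summary: str) -> float:
    """Fraction of summary tokens that never appear in the source.

    A crude abstractiveness proxy (assumption): 0.0 means every summary
    token is copied from the source, 1.0 means none of them are.
    """
    source_tokens = set(source.lower().split())
    summary_tokens = summary.lower().split()
    if not summary_tokens:
        return 0.0
    novel = sum(1 for tok in summary_tokens if tok not in source_tokens)
    return novel / len(summary_tokens)


def make_pseudo_pair(
    sentences: List[str],
    summary_len: int,
    min_novelty: float,
    max_novelty: float,
) -> Optional[Tuple[str, str]]:
    """Build one pseudo (source, summary) pair from an article.

    Hypothetical recipe: the first `summary_len` sentences act as the
    pseudo-summary and the remaining sentences act as the source. The
    pair is kept only if its abstractiveness falls inside the range
    observed for the target dataset.
    """
    if len(sentences) <= summary_len:
        return None  # not enough text left to serve as a source
    summary = " ".join(sentences[:summary_len])
    source = " ".join(sentences[summary_len:])
    ratio = novel_unigram_ratio(source, summary)
    if min_novelty <= ratio <= max_novelty:
        return source, summary
    return None
```

In this sketch, `summary_len` and the novelty bounds would be chosen per target dataset, for example from statistics of a handful of reference summaries, so that the resulting pairs resemble the desired output in length and in how much they copy from the source.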



News Summarization and Evaluation in the Era of GPT-3

The recent success of zero- and few-shot prompting with models like GPT-...

Challenges in leveraging GANs for few-shot data augmentation

In this paper, we explore the use of GAN-based few-shot data augmentatio...

Abstractive Summarization as Augmentation for Document-Level Event Detection

Transformer-based models have consistently produced substantial performa...

Improving the Faithfulness of Abstractive Summarization via Entity Coverage Control

Abstractive summarization systems leveraging pre-training language model...

Leveraging Pretrained Models for Automatic Summarization of Doctor-Patient Conversations

Fine-tuning pretrained models for automatically summarizing doctor-patie...

CUED at ProbSum 2023: Hierarchical Ensemble of Summarization Models

In this paper, we consider the challenge of summarizing patients' medica...

Using ChatGPT for Entity Matching

Entity Matching is the task of deciding if two entity descriptions refer...
