Improving Zero and Few-Shot Abstractive Summarization with Intermediate Fine-tuning and Data Augmentation

by Alexander R. Fabbri et al.

Models pretrained with self-supervised objectives on large text corpora achieve state-of-the-art performance on text summarization tasks. However, these models are typically fine-tuned on hundreds of thousands of data points, an infeasible requirement when applying summarization to new, niche domains. In this work, we introduce a general method, called WikiTransfer, for fine-tuning pretrained models for summarization in an unsupervised, dataset-specific manner which makes use of characteristics of the target dataset such as the length and abstractiveness of the desired summaries. We achieve state-of-the-art, zero-shot abstractive summarization performance on the CNN-DailyMail dataset and demonstrate the effectiveness of our approach on three additional, diverse datasets. The models fine-tuned in this unsupervised manner are more robust to noisy data and also achieve better few-shot performance using 10 and 100 training examples. We perform ablation studies on the effect of the components of our unsupervised fine-tuning data and analyze the performance of these models in few-shot scenarios along with data augmentation techniques using both automatic and human evaluation.
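The core idea of constructing unsupervised, dataset-specific fine-tuning data can be sketched in a few lines. The snippet below is a hypothetical illustration, not the authors' released code: it assumes an article arrives as a list of sentences and treats the lead sentences as a pseudo-summary (article leads tend to summarize the body), with the desired summary length and a minimum abstractiveness threshold chosen to match the target dataset.

```python
def make_pseudo_pair(sentences, summary_len=3):
    """Split an article into a (document, pseudo-summary) pair.

    The first `summary_len` sentences act as the pseudo-summary;
    the remainder is treated as the source document. `summary_len`
    is chosen to match the target dataset's typical summary length.
    """
    if len(sentences) <= summary_len:
        raise ValueError("article too short to split")
    summary = " ".join(sentences[:summary_len])
    document = " ".join(sentences[summary_len:])
    return document, summary


def novelty(summary, document):
    """Fraction of summary tokens absent from the document.

    A crude proxy for abstractiveness: pairs can be filtered so that
    their novelty roughly matches the target dataset's summaries.
    """
    doc_tokens = set(document.lower().split())
    sum_tokens = summary.lower().split()
    if not sum_tokens:
        return 0.0
    novel = sum(1 for t in sum_tokens if t.lower() not in doc_tokens)
    return novel / len(sum_tokens)
```

Pairs produced this way can then be used for intermediate fine-tuning of a pretrained seq2seq model before zero- or few-shot transfer to the target dataset.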

