
Does Pretraining for Summarization Require Knowledge Transfer?

09/10/2021
by Kundan Krishna, et al.

Pretraining techniques leveraging enormous datasets have driven recent advances in text summarization. While folk explanations suggest that knowledge transfer accounts for pretraining's benefits, little is known about why it works or what makes a pretraining task or dataset suitable. In this paper, we challenge the knowledge transfer story, showing that by pretraining on documents consisting of character n-grams selected at random, we can nearly match the performance of models pretrained on real corpora. This work holds the promise of eliminating upstream corpora, which may alleviate some concerns over offensive language, bias, and copyright issues. To see whether the small residual benefit of using real data could be accounted for by the structure of the pretraining task, we design several tasks motivated by a qualitative study of summarization corpora. However, these tasks confer no appreciable benefit, leaving open the possibility of a small role for knowledge transfer.
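To make the synthetic-pretraining idea concrete, the sketch below shows one plausible way to assemble a nonsense corpus from character n-grams selected at random. The abstract does not spell out the paper's exact sampling procedure, so the n-gram length, vocabulary size, and document length used here are illustrative assumptions, not the authors' settings.

```python
import random
import string

def build_ngram_vocab(n=4, vocab_size=5000, seed=0):
    """Sample a fixed vocabulary of random character n-grams (assumed sizes)."""
    rng = random.Random(seed)
    alphabet = string.ascii_lowercase + " "
    return ["".join(rng.choice(alphabet) for _ in range(n)) for _ in range(vocab_size)]

def make_document(vocab, num_ngrams=200, rng=None):
    """Compose one synthetic 'document' by concatenating randomly drawn n-grams."""
    rng = rng or random.Random()
    return "".join(rng.choice(vocab) for _ in range(num_ngrams))

if __name__ == "__main__":
    # Build a tiny example corpus; a real pretraining run would generate far more text.
    vocab = build_ngram_vocab()
    rng = random.Random(42)
    corpus = [make_document(vocab, rng=rng) for _ in range(3)]
    for doc in corpus:
        print(doc[:80], "...")
```

Such a corpus contains no real-world knowledge by construction, which is what lets the paper test whether pretraining's benefit comes from knowledge transfer or from the pretraining task itself.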


Related research

Multi-stage Pretraining for Abstractive Summarization (09/23/2019)
Neural models for abstractive summarization tend to achieve the best per...

Downstream Datasets Make Surprisingly Good Pretraining Corpora (09/28/2022)
For most natural language processing tasks, the dominant practice is to ...

Compositional generalization in semantic parsing with pretrained transformers (09/30/2021)
Large-scale pretraining instills large amounts of knowledge in deep neur...

Adapting Pretrained Text-to-Text Models for Long Text Sequences (09/21/2022)
We present an empirical study of adapting an existing pretrained text-to...

Masking Orchestration: Multi-task Pretraining for Multi-role Dialogue Representation Learning (02/27/2020)
Multi-role dialogue understanding comprises a wide range of diverse task...

The Inductive Bias of In-Context Learning: Rethinking Pretraining Example Design (10/09/2021)
Pretraining Neural Language Models (NLMs) over a large corpus involves c...

POLITICS: Pretraining with Same-story Article Comparison for Ideology Prediction and Stance Detection (05/02/2022)
Ideology is at the core of political science research. Yet, there still ...