Multi-stage Pretraining for Abstractive Summarization

09/23/2019
by Sebastian Goodman, et al.

Neural models for abstractive summarization tend to achieve the best performance in the presence of highly specialized, summarization-specific modeling add-ons such as pointer-generator networks, coverage modeling, and inference-time heuristics. We show here that pretraining can complement such modeling advancements to yield improved results in both short-form and long-form abstractive summarization, using two key concepts: full-network initialization and multi-stage pretraining. Our method allows the model to transitively benefit from multiple pretraining tasks, moving from generic language tasks to a specialized summarization task and then to an even more specialized one, such as bullet-based summarization. Using this approach, we demonstrate improvements of 1.05 ROUGE-L points on the Gigaword benchmark and 1.78 ROUGE-L points on the CNN/DailyMail benchmark, compared to a randomly-initialized baseline.
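The abstract describes the approach only at a high level. As a rough illustration, the sketch below shows what multi-stage pretraining with full-network initialization could look like in PyTorch; the model, the synthetic data loader, and the three-stage schedule (generic language task, headline summarization, bullet-based summarization) are hypothetical stand-ins, not the authors' implementation.

```python
# Hypothetical sketch of multi-stage pretraining with full-network
# initialization. The model, data, and stage schedule are illustrative
# stand-ins, not the paper's actual architecture or corpora.
import torch
import torch.nn as nn


class Seq2SeqModel(nn.Module):
    """Toy encoder-decoder standing in for the full summarization network."""

    def __init__(self, vocab_size=1000, d_model=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.encoder = nn.GRU(d_model, d_model, batch_first=True)
        self.decoder = nn.GRU(d_model, d_model, batch_first=True)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, src, tgt):
        _, h = self.encoder(self.embed(src))       # encode the source
        dec, _ = self.decoder(self.embed(tgt), h)  # decode conditioned on it
        return self.out(dec)                       # per-token vocabulary logits


def make_loader(num_batches, batch_size=8, src_len=32, tgt_len=16, vocab=1000):
    """Yield synthetic (source, target-in, target-out) batches for one stage."""
    for _ in range(num_batches):
        src = torch.randint(0, vocab, (batch_size, src_len))
        tgt = torch.randint(0, vocab, (batch_size, tgt_len + 1))
        yield src, tgt[:, :-1], tgt[:, 1:]


def train_stage(model, loader, lr=1e-4):
    """Run one pretraining/fine-tuning stage and return the updated model."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for src, tgt_in, tgt_out in loader:
        logits = model(src, tgt_in)
        loss = loss_fn(logits.reshape(-1, logits.size(-1)), tgt_out.reshape(-1))
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model


# Each stage starts from *all* weights produced by the previous stage
# (full-network initialization), so the most specialized task transitively
# benefits from every earlier, more generic one.
stages = [
    ("generic_language_task", 20),   # stage 1: generic language pretraining
    ("headline_summarization", 10),  # stage 2: specialized summarization task
    ("bullet_summarization", 5),     # stage 3: even more specialized task
]
model = Seq2SeqModel()
for name, num_batches in stages:
    model = train_stage(model, make_loader(num_batches))
    torch.save(model.state_dict(), f"{name}.pt")  # checkpoint carried forward
```

The point the sketch tries to capture is that nothing is reinitialized between stages: the same full set of parameters flows through the whole schedule, in contrast to pipelines that reuse only an encoder or word embeddings.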


Related research

09/10/2021 · Does Pretraining for Summarization Require Knowledge Transfer?
Pretraining techniques leveraging enormous datasets have driven recent a...

12/20/2022 · Socratic Pretraining: Question-Driven Pretraining for Controllable Summarization
In long document controllable summarization, where labeled data is scarc...

08/08/2022 · Investigating Efficiently Extending Transformers for Long Input Summarization
While large pretrained Transformer models have proven highly capable at ...

12/20/2022 · Pretraining Without Attention
Transformers have been essential to pretraining success in NLP. Other ar...

10/03/2022 · Probing of Quantitative Values in Abstractive Summarization Models
Abstractive text summarization has recently become a popular approach, b...

01/25/2023 · An Experimental Study on Pretraining Transformers from Scratch for IR
Finetuning Pretrained Language Models (PLM) for IR has been de facto the...

06/07/2023 · IUTEAM1 at MEDIQA-Chat 2023: Is simple fine tuning effective for multilayer summarization of clinical conversations?
Clinical conversation summarization has become an important application ...
