Make Lead Bias in Your Favor: A Simple and Effective Method for News Summarization

12/25/2019
by Chenguang Zhu, et al.

Lead bias is a common phenomenon in news summarization, where the early parts of an article often contain the most salient information. While many algorithms exploit this fact in summary generation, it has a detrimental effect on teaching the model to discriminate and extract important information. We propose that lead bias can be leveraged in our favor, in a simple and effective way, to pretrain abstractive news summarization models on large-scale unlabeled corpora: predicting the leading sentences using the rest of an article. Via careful data cleaning and filtering, our transformer-based pretrained model, without any finetuning, achieves remarkable results over various news summarization tasks. With further finetuning, our model outperforms many competitive baseline models. For example, the pretrained model without finetuning achieves state-of-the-art results on the DUC-2003 and DUC-2004 datasets. The finetuned model obtains 3.2 and 2.1 [...] Human evaluations further show the effectiveness of our method.
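The pretraining objective described in the abstract (predict the leading sentences from the rest of the article) can be illustrated with a minimal sketch of how a single training pair might be built from an unlabeled article. The function names, the regex-based sentence splitter, and the default of three lead sentences are assumptions made for illustration only; the abstract does not specify them, and the data cleaning and filtering steps it mentions are not shown.

```python
import re


def split_sentences(text):
    """Naive sentence splitter (an assumption for this sketch):
    break after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p.strip() for p in parts if p.strip()]


def make_pretraining_pair(article, num_lead=3):
    """Build one (source, target) pair from an unlabeled article.

    target: the first `num_lead` sentences (the "lead").
    source: the remaining sentences of the article.
    `num_lead=3` is a hypothetical default, not taken from the abstract.
    Returns None when the article is too short to leave any source text.
    """
    sentences = split_sentences(article)
    if len(sentences) <= num_lead:
        return None
    target = " ".join(sentences[:num_lead])
    source = " ".join(sentences[num_lead:])
    return source, target
```

A sequence-to-sequence model would then be trained to generate `target` from `source`, so that no human-written summaries are needed at the pretraining stage.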


research
05/29/2021

Demoting the Lead Bias in News Summarization via Alternating Adversarial Learning

In news articles the lead bias is a common phenomenon that usually domin...
research
01/03/2020

TED: A Pretrained Unsupervised Summarization Model with Theme Modeling and Denoising

Text summarization aims to extract essential information from a piece of...
research
04/08/2020

Diverse, Controllable, and Keyphrase-Aware: A Corpus and Method for News Multi-Headline Generation

News headline generation aims to produce a short sentence to attract rea...
research
09/08/2019

Countering the Effects of Lead Bias in News Summarization via Multi-Stage Training and Auxiliary Losses

Sentence position is a strong feature for news summarization, since the ...
research
10/22/2022

Salience Allocation as Guidance for Abstractive Summarization

Abstractive summarization models typically learn to capture the salient ...
research
05/20/2020

Examining the State-of-the-Art in News Timeline Summarization

Previous work on automatic news timeline summarization (TLS) leaves an u...
research
11/24/2021

Knowledge Enhanced Sports Game Summarization

Sports game summarization aims at generating sports news from live comme...
