Towards Proactively Forecasting Sentence-Specific Information Popularity within Online News Documents

12/31/2022
by Sayar Ghosh Roy, et al.

Multiple studies have focused on predicting the prospective popularity of an online document as a whole, without paying attention to the contributions of its individual parts. We introduce the task of proactively forecasting popularities of sentences within online news documents solely utilizing their natural language content. We model sentence-specific popularity forecasting as a sequence regression task. For training our models, we curate InfoPop, the first dataset containing popularity labels for over 1.7 million sentences from over 50,000 online news documents. To the best of our knowledge, this is the first dataset automatically created using streams of incoming search engine queries to generate sentence-level popularity annotations. We propose a novel transfer learning approach involving sentence salience prediction as an auxiliary task. Our proposed technique coupled with a BERT-based neural model exceeds nDCG values of 0.8 for proactive sentence-specific popularity forecasting. Notably, our study presents a non-trivial takeaway: though popularity and salience are different concepts, transfer learning from salience prediction enhances popularity forecasting. We release InfoPop and make our code publicly available: https://github.com/sayarghoshroy/InfoPopularity
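The abstract reports results in terms of nDCG (normalized discounted cumulative gain), which measures how well a ranking induced by predicted scores agrees with the ranking implied by the true labels. The following is a minimal sketch of one common nDCG formulation, assuming linear gains and a log2 position discount; the exact variant, any cutoff k, and the label scale used in the paper are not stated here, and the function names `dcg` and `ndcg` are illustrative only.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain with linear gain and log2 discount.

    The item at 0-based position `rank` is discounted by log2(rank + 2),
    so the top item has a discount of log2(2) = 1.
    """
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))


def ndcg(true_scores, predicted_scores, k=None):
    """nDCG of the ranking induced by `predicted_scores`.

    `true_scores` are the gold popularity labels for the sentences of one
    document (assumed non-negative); `predicted_scores` are model outputs
    for the same sentences. `k` is an optional cutoff (an assumption, not
    taken from the source).
    """
    if len(true_scores) != len(predicted_scores):
        raise ValueError("true_scores and predicted_scores must have equal length")

    # Order sentences by predicted score, highest first.
    order = sorted(
        range(len(predicted_scores)),
        key=lambda i: predicted_scores[i],
        reverse=True,
    )
    ranked = [true_scores[i] for i in order]
    ideal = sorted(true_scores, reverse=True)

    if k is not None:
        ranked, ideal = ranked[:k], ideal[:k]

    ideal_dcg = dcg(ideal)
    if ideal_dcg == 0:
        # No sentence has positive popularity; define nDCG as 0.
        return 0.0
    return dcg(ranked) / ideal_dcg


if __name__ == "__main__":
    gold = [3.0, 2.0, 1.0]
    print(ndcg(gold, [0.9, 0.5, 0.1]))  # predicted order matches gold order
    print(ndcg(gold, [0.1, 0.5, 0.9]))  # predicted order is reversed
```

Under this formulation a score of 1.0 means the predicted ordering of sentences matches the ideal ordering, and lower values indicate that popular sentences were ranked too low.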

research
11/28/2019

KPTimes: A Large-Scale Dataset for Keyphrase Generation on News Documents

Keyphrase generation is the task of predicting a set of lexical units th...
research
07/17/2018

To Post or Not to Post: Using Online Trends to Predict Popularity of Offline Content

Predicting the popularity of online content has attracted much attention...
research
02/06/2020

Attractive or Faithful? Popularity-Reinforced Learning for Inspired Headline Generation

With the rapid proliferation of online media sources and published news,...
research
05/17/2018

Content-based Popularity Prediction of Online Petitions Using a Deep Regression Model

Online petitions are a cost-effective way for citizens to collectively e...
research
05/12/2021

UIUC_BioNLP at SemEval-2021 Task 11: A Cascade of Neural Models for Structuring Scholarly NLP Contributions

We propose a cascade of neural models that performs sentence classificat...
research
08/05/2021

Spotify Danceability and Popularity Analysis using SAP

Our analysis reviews and visualizes the audio features and popularity of...
research
09/08/2019

Countering the Effects of Lead Bias in News Summarization via Multi-Stage Training and Auxiliary Losses

Sentence position is a strong feature for news summarization, since the ...