Examining the Role of Clickbait Headlines to Engage Readers with Reliable Health-related Information

11/25/2019 ∙ by Sima Bhowmik, et al. ∙ University of Maryland ∙ The University of Mississippi

Clickbait headlines are frequently used to attract readers to articles. Although this headline type has proven to be a technique for engaging readers with misleading items, it is still unknown whether the technique can be used to attract readers to reliable pieces. This study tests its efficacy in engaging readers with reliable health articles. A set of online surveys would be conducted to test readers' engagement with, and perception of, clickbait headlines attached to reliable articles. After that, we would design an automated system to generate clickbait headlines that maximize user engagement.





Ordinary people increasingly turn to the Internet for health-related information. A Pew Research Center study found that 72% of adult internet users in the United States searched online for information about a range of health issues [11]. Both true and false information vie for readers’ attention on online platforms, and research has found that misleading health-related information attracts more attention than evidence-based reports [10, 20].

When ordinary people come across misleading health-related information, many of them are likely to believe it and act on it [23], a problem that health professionals constantly face. This trend calls for more research on strategies to engage readers with reliable health-related information.

To this end, this paper proposes a novel approach: examining the use of clickbait headlines to engage readers with reliable health news. Clickbait, a writing and linguistic technique for crafting headlines that aim to trick readers into clicking links [6], has been widely used in unreliable health-related articles. For example, [20] found that clickbait headlines played a vital role in attracting readers to click and share content on social media. The author analyzed the credibility of the 100 most popular health articles shared on social media in 2018 and found that three quarters of the top 10 shared articles were either misleading or included some false information. Clickbait headlines were a prominent feature of these viral articles. Similarly, [8] found that unreliable outlets used more clickbait headlines than reliable outlets to spread health news online.

Although clickbait headlines have predominantly been used in unreliable articles, legacy news media have also been using the technique to engage readers. Clickbait itself as a practice may not necessarily be objectionable; however, when it is used to lure readers to something that fails to keep its promise, then its usage becomes questionable.

As the objective of this paper is to increase reader engagement with reliable health-related information, it will examine how audiences respond to clickbait when it is used with reliable articles. To test engagement with health messages and promotion, previous studies have examined the use of social networking sites such as Facebook and Twitter [16, 28, 17, 3], but we found no studies that applied the clickbait technique to examine whether it helps increase engagement with reliable health-related information.

In this study, engagement is treated as a two-way process between an outlet and its audiences, in which readers share content, spend time on it, and comment on it [15].

To examine whether clickbait can increase engagement, this study would employ a three-phase research design. In the first phase, engagement with clickbait would be tested through an online experiment survey. In the second phase, participants would be asked about their perception of clickbait. In the last phase, automation would be applied to generate clickbait headlines.

The results of this study would contribute to increasing engagement with reliable health-related information. They could be helpful for health agencies such as the Centers for Disease Control and Prevention (CDC), which aim to engage more people with reliable health-related information online to help them make informed decisions.

The results could also potentially be applied to other domains such as the environment and politics.

Research Plan

In order to understand how clickbait headlines work to engage readers with reliable health-related information, we would employ the following three steps.

First Step: Engagement with clickbaits

In this step we would conduct an online experiment survey in which we would provide participants with health-related articles whose headlines appear in both traditional and clickbait formats, in order to measure engagement. [6, 7] studied different stylometric techniques used in clickbait articles, which we would incorporate when designing different clickbait headlines for a single article. Moreover, we would explore the different clickbait types identified by [4] to find the most effective type for better engagement. Before providing participants with articles, we would conduct a pre-experiment survey to collect demographic information such as age, education, and political views. In the post-experiment survey, after participants have read the articles, we would ask whether they would like to click, share, react to, or discuss the articles with others, both online and offline.

Participants in the experiment survey would be recruited from the United States through the online crowdsourcing website Amazon Mechanical Turk (MTurk). The topics of the health-related articles for the experiment would be chosen carefully to avoid confirmation bias; for example, topics related to vaccination would not be included, because many participants are likely to hold prior views on them. Instead, we would include topics such as nutrition, obesity, smoking, and tobacco use.
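The comparison described above could be organized as in the following minimal sketch. It assumes a between-subjects design in which each participant sees only one headline format, which is one possible reading of the plan. The condition labels, function names, and response format are hypothetical and are not taken from the study.

```python
import random
from collections import defaultdict

# Hypothetical condition labels for the two headline formats.
CONDITIONS = ("traditional", "clickbait")


def assign_conditions(participant_ids, seed=0):
    """Randomly split participants into headline conditions of near-equal size."""
    rng = random.Random(seed)  # fixed seed keeps the assignment reproducible
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    # Alternate through the shuffled list so group sizes differ by at most one.
    return {pid: CONDITIONS[i % len(CONDITIONS)] for i, pid in enumerate(shuffled)}


def engagement_rates(assignment, responses):
    """Share of participants per condition who reported each intended action.

    `responses` maps participant id -> {action name: bool},
    e.g. {"click": True, "share": False} (assumed post-survey format).
    """
    counts = defaultdict(lambda: defaultdict(int))
    totals = defaultdict(int)
    for pid, answers in responses.items():
        cond = assignment[pid]
        totals[cond] += 1
        for action, said_yes in answers.items():
            counts[cond][action] += int(said_yes)
    return {
        cond: {action: n / totals[cond] for action, n in actions.items()}
        for cond, actions in counts.items()
    }
```

Comparing the per-condition rates for actions such as clicking and sharing would give a first descriptive view of engagement before any statistical testing.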

Second Step: Perception about clickbaits

In this step we would measure how participants perceive news articles with clickbait headlines. We would ask readers whether they believe the articles and which emotions the articles evoke, such as trust, fear, distrust, curiosity, excitement, or disgust. We would also ask participants about the usefulness of the articles.

Third Step: Automated clickbait headline generation

Our third step will focus on automatically generating, from the content of an article, a clickbait headline that can ensure better user engagement. We plan to explore deep learning-based natural text generation (NTG) models for headline generation. Based on the survey results, we will identify the types and characteristics of clickbait headlines that can ensure more engagement. The effective types of clickbait may vary with the topic of the health news, users’ demographic properties, and temporal and local dependencies. The factors that are influential in determining an ideal clickbait headline can therefore be considered as features for the generative models. There are several challenges involved in this task.

Clickbait type and characteristic selection

Biyani et al. [4] identified eight types of clickbait, not all of which may be appropriate for health news. For example, ambiguous or factually wrong headlines might mislead readers, which can have a negative impact. Moreover, not all clickbait characteristics contribute equally to making a headline effective [13]. Finding the suitable types and characteristics of clickbait will therefore be challenging, because the curiosity aroused by a headline may vary from topic to topic and may also depend on users’ demographic properties. Temporal and locality characteristics may likewise affect how well clickbait headlines engage users. Along with the textual content, the ideal solution should incorporate these local, temporal, and user demographic properties as features to generate an effective headline that attracts more user attention. We are hopeful that the user study data collected in the first two phases will be useful in deciding these factors.
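One way such features might be passed to a generative model is as control tokens prepended to the article text. The sketch below only illustrates that idea under our own assumptions: the field names, the bucket values, the token format, and the allowed style labels are all hypothetical, and the actual feature set would depend on the survey results.

```python
from dataclasses import dataclass

# Hypothetical style labels; the real set would be chosen from the user study.
ALLOWED_TYPES = {"teasing", "question", "listicle"}


@dataclass(frozen=True)
class HeadlineContext:
    topic: str           # e.g. "nutrition"
    clickbait_type: str  # one of ALLOWED_TYPES
    age_group: str       # coarse demographic bucket (assumed encoding)
    region: str          # locality signal (assumed encoding)
    month: int           # temporal signal, 1-12


def to_control_prefix(ctx: HeadlineContext) -> str:
    """Serialize the context as control tokens to prepend to the article text."""
    if ctx.clickbait_type not in ALLOWED_TYPES:
        raise ValueError(f"unsupported clickbait type: {ctx.clickbait_type}")
    if not 1 <= ctx.month <= 12:
        raise ValueError("month must be in 1..12")
    fields = {
        "topic": ctx.topic,
        "type": ctx.clickbait_type,
        "age": ctx.age_group,
        "region": ctx.region,
        "month": f"{ctx.month:02d}",
    }
    return " ".join(f"<{key}={value}>" for key, value in fields.items())
```

Rejecting unsupported types at this stage keeps inappropriate styles, such as ambiguous or factually wrong headlines, from ever being requested of the generator.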

Generating non-misleading clickbait headlines

The contextual gap between the generated headline and the main content should not be wide, so that users find immediate and relevant satisfaction after visiting the landing page. Otherwise, the news site may suffer a high bounce rate, which puts the organization’s reputation at risk. Even if the generated headlines are attractive, it is unlikely that all readers who come across an article while browsing social media will click through to the full content. Because most people take in news by skimming headlines alone and even share articles before reading the full content [12], we need to make sure that the generated headline does not give users a misconception about the content. It is therefore important that the generated headline properly reflect the original content. Although the task is challenging, Shu et al. [25] provided an outline for generating stylized and synthetic headlines that preserve certain information from the documents using a deep generative model.
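As a rough illustration of how the gap between a headline and its article might be screened, the sketch below measures the fraction of a headline's content words that also occur in the article. This lexical overlap is our own simplification and not the method of [25]; it cannot detect semantic distortion, and the stopword list and the 0.6 threshold are arbitrary assumptions.

```python
import re

# Small illustrative stopword list; a real study would use a fuller one.
STOPWORDS = {
    "a", "an", "the", "of", "to", "in", "on", "for", "and", "or", "is",
    "are", "you", "your", "this", "that", "what", "why", "how", "will", "with",
}


def content_words(text):
    """Lowercased alphabetic tokens with stopwords removed."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def headline_coverage(headline, article):
    """Fraction of the headline's content words that also occur in the article."""
    head = content_words(headline)
    if not head:
        return 0.0
    return len(head & content_words(article)) / len(head)


def is_supported(headline, article, threshold=0.6):
    """Flag headlines whose wording is mostly grounded in the article text."""
    return headline_coverage(headline, article) >= threshold
```

A generated headline that falls below the threshold could be discarded or sent for manual review before it is shown to readers.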

Model Selection

Because we plan to generate the headline from the content, the task can be viewed as a sequence generation problem [26]. Although deep learning-based encoder-decoder models can generate a sequence of text from a source text, the models need to be tuned significantly to achieve good performance. To overcome this problem, researchers have proposed other text generation methods based on reinforcement learning and adversarial training, but all of these models seem to have some limitations (e.g., exposure bias, vanishing gradients, and mode collapse) [14]. Finding a well-performing model that fits our problem can therefore be challenging. To begin with, we are particularly interested in trying the Recurrent Neural Network (RNN) based Variational Auto-Encoder (VAE) proposed by Shen et al. [24], who developed a cross-aligned auto-encoder that can transfer the style of sentences while preserving the content of the original sentences. As our main purpose is to generate engaging headlines that exhibit different clickbait styles and reflect the original content, this model is worth trying.

Dataset Preparation

To train our model, we need a complete dataset containing clickbait of all types. Moreover, the clickbait headlines should be rich in prominent clickbait features so that the generator model can learn them well. To the best of our knowledge, no such complete dataset is currently available. The Clickbait Challenge dataset contains a sufficient number of clickbait and non-clickbait articles, but it covers all types of news rather than health-related articles in particular, and it does not include clickbait types. Dhoju et al. [8] curated a dataset of health news articles that also records the clickbait nature of each headline, but this dataset likewise lacks clickbait type information. Even so, it can be a good starting point: we can manually label the clickbait type for a small subset and explore the style transfer process outlined by Shu et al. [25] to generate synthesized headlines and build our desired training set.
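The manual labeling step could use a record structure along the lines of the sketch below. The record fields and helper names are hypothetical. The eight type names reflect our reading of the categories in Biyani et al. [4] and should be verified against that paper, and excluding the ambiguous and wrong types is an assumption based on the concern about misleading readers.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

# Type names as we read them from Biyani et al. [4]; verify before use.
CLICKBAIT_TYPES = (
    "exaggeration", "teasing", "inflammatory", "formatting",
    "graphic", "bait-and-switch", "ambiguous", "wrong",
)
# Types flagged as risky for health news (assumed exclusion rule).
EXCLUDED_FOR_HEALTH = {"ambiguous", "wrong"}


@dataclass
class HeadlineRecord:
    headline: str
    is_clickbait: bool                    # already available in a dataset like [8]
    clickbait_type: Optional[str] = None  # filled in by manual labeling


def label(record, clickbait_type):
    """Attach a manually chosen clickbait type to a clickbait headline."""
    if not record.is_clickbait:
        raise ValueError("only clickbait headlines receive a type")
    if clickbait_type not in CLICKBAIT_TYPES:
        raise ValueError(f"unknown clickbait type: {clickbait_type}")
    record.clickbait_type = clickbait_type
    return record


def training_seed(records):
    """Labelled clickbait records whose type is acceptable for health news."""
    return [
        r for r in records
        if r.is_clickbait
        and r.clickbait_type is not None
        and r.clickbait_type not in EXCLUDED_FOR_HEALTH
    ]


def type_counts(records):
    """Number of labelled records per clickbait type."""
    return Counter(r.clickbait_type for r in records if r.clickbait_type)
```

Checking the per-type counts early would show which styles are underrepresented in the labelled subset and therefore most in need of synthesized headlines.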

Related Work

Previous literature has examined the practice of using clickbait in mainstream news media. For example, [18] examined the content of four online sections of the Spanish newspaper El Pais and identified various linguistic techniques used in the headlines of these articles, such as orality markers and interaction (e.g., direct appeal to the reader), vocabulary and word games (e.g., informal language, generic words or buzzwords), and morphosyntax (e.g., simple structures).

Chartbeat, an analytics firm that provides market intelligence to media organizations, tested 10,000 headlines from over 100 websites for their effectiveness in engaging users with content. The study examined 12 ‘common tropes’ in headlines, a majority of which are considered clickbait techniques, and found that some of these tropes are more effective than others [5].

Some studies have also examined the role of clickbait in engaging readers. [21] examined the context of clickbait and non-clickbait articles and found that clickbait headlines generated more engagement than non-clickbait headlines. This study used headlines on different topics such as politics, sports, and the environment, but not specifically health.

[29] found that cognitive, affective, and pragmatic elements are significantly related to healthcare personnel's intention to click clickbait headlines.

However, [22] found that question-based headlines lead to negative attitudes toward the headline. Although this study did not consider health-related articles, its findings would nevertheless help our study test the efficacy of this headline type in the context of health-related articles.

[2] investigated why certain pieces of content, such as advertisements, videos, and news articles, go more viral than others. The results indicate that positive content is more viral than negative content, but the relationship between emotion and social transmission is more complex than valence alone. Virality is partially driven by physiological arousal. Content that evokes high-arousal emotions, whether positive or negative, is more viral. Conversely, content that evokes low-arousal, or deactivating, emotions is less viral.

[9] presents an automated clickbait generation tool that uses an RNN model trained on two million headlines collected from Buzzfeed, Gawker, Jezebel, Huffington Post, and Upworthy. The model is then used to produce new clickbait headlines. [25] used deep generative models to generate synthetic headlines with specific style labels and explored their utility in improving clickbait detection.

Some other studies have examined the automated detection of clickbait headlines using natural language processing [6, 1, 19, 27].


  • [1] A. Anand, T. Chakraborty, and N. Park (2017) We used neural networks to detect clickbaits: you won’t believe what happened next! In European Conference on Information Retrieval, pp. 541–547.
  • [2] J. Berger and K. L. Milkman (2012) What makes online content viral? Journal of Marketing Research 49 (2), pp. 192–205.
  • [3] S. Bhattacharya, P. Srinivasan, and P. Polgreen (2017) Social media engagement analysis of US federal health agencies on Facebook. BMC Medical Informatics and Decision Making 17 (1), pp. 49.
  • [4] P. Biyani, K. Tsioutsiouliklis, and J. Blackmer (2016) “8 amazing secrets for getting more clicks”: detecting clickbaits in news streams using article informality. In Thirtieth AAAI Conference on Artificial Intelligence.
  • [5] C. Breaux (2015) You’ll never guess how Chartbeat’s data scientists came up with the single greatest headline. Chartbeat. Retrieved from http://blog.chartbeat.com/2015/11/20/youllnever …
  • [6] A. Chakraborty, B. Paranjape, S. Kakarla, and N. Ganguly (2016) Stop clickbait: detecting and preventing clickbaits in online news media. In 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), pp. 9–16.
  • [7] Y. Chen, N. J. Conroy, and V. L. Rubin (2015) Misleading online content: recognizing clickbait as false news. In Proceedings of the 2015 ACM on Workshop on Multimodal Deception Detection, pp. 15–19.
  • [8] S. Dhoju, M. Main Uddin Rony, M. Ashad Kabir, and N. Hassan (2019) Differences in health news from reliable and unreliable media. In Companion Proceedings of The 2019 World Wide Web Conference, pp. 981–987.
  • [9] L. Eidnes (2015) Auto-generating clickbait with recurrent neural networks. https://larseidnes.com/2015/10/13/auto-generating-clickbait-with-recurrent-neural-networks
  • [10] K. Forster (2017) REVEALED: how dangerous fake health news conquered Facebook. [Online]. Available from: https://www.independent.co.uk/life-style/health-and-families/health-news/fake-news-health-facebook-cruel-damaging-social-media-mike-adams-natural-health-ranger-conspiracy-a7498201.html Accessed 10 August 2019.
  • [11] S. Fox (2018) The social life of health information. [Online]. Available from: http://www.pewresearch.org/fact-tank/2014/01/15/the-social-life-of-health-information/ Accessed 22 July 2019.
  • [12] M. Gabielkov, A. Ramachandran, A. Chaintreau, and A. Legout (2016) Social clicks: what and who gets read on Twitter? ACM SIGMETRICS Performance Evaluation Review 44 (1), pp. 179–192.
  • [13] J. Kuiken, A. Schuth, M. Spitters, and M. Marx (2017) Effective headlines of newspaper articles in a digital environment. Digital Journalism 5 (10), pp. 1300–1314.
  • [14] S. Lu, Y. Zhu, W. Zhang, J. Wang, and Y. Yu (2018) Neural text generation: past, present and beyond. arXiv preprint arXiv:1803.07133.
  • [15] H. Maksimainen and H. Michaelmas (2017) Improving the quality of health journalism: when reliability meets engagement. Reuters Institute Fellowship Paper, pp. 2017–09.
  • [16] B. L. Neiger, R. Thackeray, S. H. Burton, C. G. Giraud-Carrier, and M. C. Fagen (2013) Evaluating social media’s capacity to develop engaged audiences in health promotion settings: use of Twitter metrics as a case study. Health Promotion Practice 14 (2), pp. 157–162.
  • [17] B. L. Neiger, R. Thackeray, S. H. Burton, C. R. Thackeray, and J. H. Reese (2013) Use of Twitter among local health departments: an analysis of information sharing, engagement, and action. Journal of Medical Internet Research 15 (8), pp. e177.
  • [18] D. Palau-Sampio (2016) Reference press metamorphosis in the digital context: clickbait and tabloid strategies in elpais.com. Communication & Society 29 (2).
  • [19] M. Potthast, S. Köpsel, B. Stein, and M. Hagen (2016) Clickbait detection. In European Conference on Information Retrieval, pp. 810–817.
  • [20] R. Raphael (2019) A shockingly large majority of health news shared on Facebook is fake or misleading. [Online]. Available from: https://www.fastcompany.com/90301427/a-shockingly-large-majority-of-health-news-shared-on-facebook-is-fake Accessed 22 July 2019.
  • [21] M. M. U. Rony, N. Hassan, and M. Yousuf (2017) Diving deep into clickbaits: who use them to what extents in which topics with what effects? In Proceedings of the 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2017, pp. 232–239.
  • [22] J. M. Scacco and A. Muddiman (2016) Investigating the influence of “clickbait” news headlines. Engaging News Project Report.
  • [23] N. Shapiro (2018) The fake news epidemic in health. [Online]. Available from: https://www.thedailybeast.com/the-fake-news-epidemic-in-health?ref=author Accessed 10 August 2019.
  • [24] T. Shen, T. Lei, R. Barzilay, and T. Jaakkola (2017) Style transfer from non-parallel text by cross-alignment. In Advances in Neural Information Processing Systems, pp. 6830–6841.
  • [25] K. Shu, S. Wang, T. Le, D. Lee, and H. Liu (2018) Deep headline generation for clickbait detection. In 2018 IEEE International Conference on Data Mining (ICDM), pp. 467–476.
  • [26] I. Sutskever, O. Vinyals, and Q. V. Le (2014) Sequence to sequence learning with neural networks. In Advances in Neural Information Processing Systems, pp. 3104–3112.
  • [27] A. Thakur (2016) Identifying clickbaits using machine learning.
  • [28] J. Thrul, A. B. Klein, and D. E. Ramo (2015) Smoking cessation intervention on Facebook: which content generates the best engagement? Journal of Medical Internet Research 17 (11), pp. e244.
  • [29] N. S. Y. Vincent, A. Pal, and A. Y. Chua (2018) Studying healthcare personnel’s intention to click clickbaits. In Proceedings of the International MultiConference of Engineers and Computer Scientists, Vol. 1.