Learning from pandemics: using extraordinary events can improve disease now-casting models

by Sara Mesquita, et al.

Online searches have been used to study different health-related behaviours, including monitoring disease outbreaks. An obvious caveat is that many different reasons can motivate individuals to seek information online, and models that are blind to people's motivations are of limited use and can even mislead. This is particularly true during extraordinary public health crises, such as the ongoing pandemic, when fear, curiosity and many other reasons can lead individuals to search for health-related information, masking the disease-driven searches. However, health crises can also offer an opportunity to disentangle the different drivers and learn about human behaviour. Here, we focus on the two pandemics of the 21st century (2009-H1N1 flu and Covid-19) and propose a methodology to discriminate between search patterns linked to general information seeking (media driven) and search patterns possibly more associated with actual infection (disease driven). We show that by learning from such pandemic periods, with high anxiety and media hype, it is possible to select online searches and improve model performance in both pandemic and seasonal settings. Moreover, and despite the common claim that more data is always better, our results indicate that a lower volume of the right data can be better than large volumes of apparently similar data, especially in the long run. Our work provides a general framework that can be applied beyond specific events and diseases, and argues that algorithms can be improved simply by using less (better) data. This has important consequences, for example, for addressing the accuracy-explainability trade-off in machine learning.


