Faster indicators of dengue fever case counts using Google and Twitter

by Giovanni Mizzi, et al.

Dengue is a major threat to public health in Brazil, the world's sixth biggest country by population, with over 1.5 million cases recorded in 2019 alone. Official data on dengue case counts is delivered incrementally and, for many reasons, is often subject to delays of weeks. In contrast, data on dengue-related Google searches and Twitter messages is available in full with no delay. Here, we describe a model which uses online data to deliver improved weekly estimates of dengue incidence in Rio de Janeiro. We address a key shortcoming of previous online data disease surveillance models by explicitly accounting for the incremental delivery of case count data, to ensure that our approach can be used in practice. We also draw on data from Google Trends and Twitter in tandem, and demonstrate that this leads to slightly better estimates than a model using only one of these data streams. Our results provide evidence that online data can be used to improve both the accuracy and precision of rapid estimates of disease incidence, even where the underlying case count data is subject to long and varied delays.



