Are official confirmed cases and fatalities counts good enough to study the COVID-19 pandemic dynamics? A critical assessment through the case of Italy

by Krzysztof Bartoszek, et al.

As the COVID-19 outbreak develops, the two most frequently reported statistics seem to be the raw confirmed case counts and the case fatality counts. Focusing on Italy, one of the hardest hit countries, we look at how these two values could be put in perspective to reflect the dynamics of the virus spread. In particular, we find that considering the confirmed case counts alone would be very misleading. The number of daily tests grows, while the daily fraction of confirmed cases to total tests has a change point. Depending on the region, this fraction generally increases with strong fluctuations until around 15th-22nd March and then decreases linearly. Because the number of daily tests also trends upward, the raw confirmed case counts are confounded with the sampling effort and are not representative of the situation. We observe this when regressing the logged fraction of positive tests on time and, for comparison, the logged raw confirmed count on time. Hence, model parameters for this virus's dynamics should not be calibrated on confirmed case counts alone (without rescaling by the number of tests), but should also take into account fatality and hospitalization counts, which are not prone to distortion by testing effort. Furthermore, statistics reported at the national level say little about the dynamics of the disease, which take place at the regional level. These findings are based on the official total death counts up to 15th April 2020 released by ISTAT and on the number of cases up to 10th May 2020. In this work we do not fit models; rather, we investigate whether this task is possible at all. This work also presents the COVID-19 Data Hub and a new tool for collecting and harmonizing official statistics from different sources, in the form of a package for the R statistical environment.
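The confounding described in the abstract can be illustrated with a minimal sketch. This is not the authors' analysis and uses no Italian data: the series below are synthetic, the exponential trends and their rates are assumptions chosen for illustration, and `ols_slope` is a hypothetical helper. It regresses the logged positive fraction and the logged raw confirmed count on time and compares the slopes.

```python
import math

def ols_slope(x, y):
    """Least-squares slope of y regressed on x (model with intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx

# Synthetic, illustrative series only (assumed rates, not real data).
days = list(range(30))
tests = [1000.0 * math.exp(0.05 * t) for t in days]   # testing effort grows
frac = [0.30 * math.exp(-0.03 * t) for t in days]     # positive fraction falls
confirmed = [n * p for n, p in zip(tests, frac)]      # raw confirmed count

slope_tests = ols_slope(days, [math.log(n) for n in tests])
slope_frac = ols_slope(days, [math.log(p) for p in frac])
slope_conf = ols_slope(days, [math.log(c) for c in confirmed])

# The raw count trends upward even though the positive fraction trends down.
print(f"log(tests) slope:     {slope_tests:+.3f}")
print(f"log(fraction) slope:  {slope_frac:+.3f}")
print(f"log(confirmed) slope: {slope_conf:+.3f}")
```

Under these assumptions the slope of the logged confirmed count is the sum of the other two slopes, so a rising raw count can coexist with a falling positive fraction whenever testing grows fast enough.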







Bayesian imputation of COVID-19 positive test counts for nowcasting under reporting lag

Obtaining up to date information on the number of UK COVID-19 regional i...

Estimating SARS-CoV-2 Infections from Deaths, Confirmed Cases, Tests, and Random Surveys

There are many sources of data giving information about the number of SA...

Estimating SARS-CoV-2-positive Americans using deaths-only data

We fit a Bayesian model to data on the number of deaths attributable to ...

The role of swabs in modeling the COVID-19 outbreak in the most affected regions of Italy

The daily fluctuations in the released number of Covid-19 cases played a...

Estimating Active Cases of COVID-19

Having accurate and timely data on confirmed active COVID-19 cases is ch...

The Hypothesis of Testing: Paradoxes arising out of reported coronavirus case-counts

Many statisticians, epidemiologists, economists and data scientists have...

Bayesian Modeling of COVID-19 Positivity Rate – the Indiana experience

In this short technical report we model, within the Bayesian framework, ...