When to Lift the Lockdown? Global COVID-19 Scenario Planning and Policy Effects using Compartmental Gaussian Processes

by Zhaozhi Qian, et al.
University of Cambridge

The coronavirus disease 2019 (COVID-19) outbreak has led government officials and policy makers to rely on mathematical compartmental models for estimating the potential magnitude of COVID-19 patient volume, particularly at the local peak of the epidemic, in order to make containment and resource planning decisions. Now that the pandemic is past its peak in many of the hardest-hit countries, policy makers are trying to determine the best policies for gradually easing the lockdown to resume economic and social activity while protecting public health. In this paper, we develop a model for predicting the effects of government policies on COVID-19 fatalities. The model (1) flexibly handles model misspecification in a data-driven fashion, (2) quantifies the uncertainty in its forecasts, and (3) captures the effect of interventions on these forecasts. Our model is Bayesian: we use a susceptible, exposed, infected and recovered (SEIR) model as a prior belief on the pandemic curve, and then update the posterior belief based on observed data using a Gaussian process. We incorporate the effects of policies on the future course of the pandemic by calibrating the priors using global data from many countries.




1 Introduction

The COVID-19 global pandemic poses a threat not only to public health, but also to the stability of the healthcare infrastructure and economies around the world. Forecasting the spread of the pandemic is crucial for informing governmental response (containment strategies and social distancing measures). Models that are capable of anticipating the different phases of the pandemic in a timely manner can be used to guide these decisions and inform future policy direction. Most existing models rely on mathematical compartmental approaches (e.g., the SEIR model) for estimating the potential magnitude of COVID-19 patient volume. However, these models are sensitive to starting assumptions, so different models produce considerably different forecasts and the overall outlook remains highly uncertain. Moreover, most of the existing mathematical models have failed to accurately forecast peaks in deaths or cases and subsequent declines because of their rigid assumptions and their inability to account for changes in government-mandated societal interventions over time.

In this paper, we develop a Bayesian model for forecasting COVID-19 cases and deaths over time. Our model uses a Gaussian process with a mean function defined through a compartmental model as a prior belief on how the pandemic curve will unfold, and then updates its posterior belief on the future forecasts based on current evidence from global disease tracking data. Our model is global in the sense that it incorporates the expected effects of different policies on the pandemic curve by jointly modeling these effects across all countries affected by the pandemic. This is achieved through a hierarchical Gaussian process (GP) model where parameters used to define a nation-specific pandemic curve are shared across all nations based on country-specific indicators. Compared to existing models, our model (1) flexibly handles model misspecification in a data-driven fashion, (2) quantifies uncertainty in its forecasts, and (3) captures the effect of interventions on these forecasts. Comparisons with existing models are provided in Table 1.

Figure 1: Exemplary illustration for policy effect forecast and scenario analysis using our model.
Approach | Uncertainty | Interventions | Sample efficiency
Compartmental | None | Not modeled | Fitted for one
Curve fitting | Frequentist | Not modeled | Fitted for one
Our model (data-driven with model prior) | Bayesian | Modeled | Joint model for all locations
Table 1: Comparison between our model and existing models.

Most of the widely-used models for forecasting the COVID-19 pandemic are based on either of the two modeling approaches highlighted in Table 1. For instance, the Institute for Health Metrics and Evaluation (IHME) model in [1] relies on the curve fitting approach to forecast cumulative number of deaths over time, whereas the model in [7] relies on a variant of the SIR model. However, these models do not allow for analyzing counterfactual scenarios on how the COVID-19 fatalities would change under different possible policies for easing the lockdown.

The key objective of our model is to assist policy makers in assessing the potential impact of various policies for imposing or relaxing lockdowns on the future number of COVID-19 fatalities. The model is fed with data on daily reported COVID-19-related deaths from all countries affected by the pandemic, along with the timeline of the government policies in each of these countries. Using this data along with economic, social, demographic, environmental and public health indicators for each country, the model predicts the effect of different future policies on the expected number of new fatalities, as illustrated in Figure 1. In addition to the point predictions provided by the model, uncertainty intervals are also presented to the decision-maker in order to obtain upper and lower bounds on the fatalities associated with the different policies.

Economic Indicators: GDP per capita, GNI per capita, Income share held by lowest 20%
Social and Demographic Indicators: Population, Life expectancy, Birth rate, Death rate, Infant mortality rate, Land Area, % People with basic hand-washing facilities including soap and water, Smoking prevalence, Prevalence of undernourishment, Prevalence of overweight, Urban population, Population density, Population ages 65 and above, Access to electricity (% of population), UHC service coverage index, Total alcohol consumption per capita, Air transport (passengers carried)
Environmental Indicators: Forest Area, PM2.5 air pollution (mean annual exposure in micrograms per cubic meter)
Public Health Indicators: Immunization for measles, % deaths by communicable diseases, Current health expenditure, Current health expenditure per capita, Diabetes prevalence, Immunization for DPT, Immunization for HepB3, Incidence of HIV, Incidence of malaria, Incidence of tuberculosis, % deaths by CVD/cancer/diabetes/CRD, % deaths due to household and ambient air pollution, % deaths due to unsafe water/unsafe sanitation/lack of hygiene, Physicians (per 1,000 people)
Table 2: Economic, social, demographic, environmental and health indicators for each country considered in our analysis. Data on these indicators was obtained from the World Bank (https://data.worldbank.org/).

2 Problem Setup: Forecasting the COVID-19 Pandemic

Let $Y_i[t]$ be the number of reported COVID-19-related deaths in a given geographical area $i$ on the $t$-th day since the beginning of the outbreak. Throughout this paper, we assume that a geographical area corresponds to a country, and consider a set of $N$ countries. Each country $i$ is characterized by a feature vector $X_i$ comprising economic, social, demographic, environmental and public health indicators (all listed in Table 2). Because the number of confirmed COVID-19 cases depends greatly on the testing rates and testing strategy in each country, we use the reported daily deaths as a more reliable indicator of disease spread.

2.1 Modeling Objectives

Our key objective is to forecast the future number of COVID-19 deaths across all countries under different levels of policy stringency, i.e., the extent to which the government containment measures are restrictive. By using these forecasts to conduct scenario analyses and examining the effects of different possible future policies on the expected number of future deaths, policy makers can decide how to ease lockdown and containment measures over time while keeping mortality rates low.

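Our model is trained on a data set $\mathcal{D}$ for $N$ countries covering a period of $T$ days, i.e.,

$$\mathcal{D} = \big\{ X_i,\ \{ Y_i[t],\, P_i[t] \}_{t=1}^{T} \big\}_{i=1}^{N}, \tag{1}$$

where $P_i[t]$ is a quantitative measure of the stringency of the policy applied in country $i$ at time $t$. A precise description of the data pertaining to the features $X_i$, the policy stringency $P_i[t]$ and the reported deaths $Y_i[t]$ is provided in Section 2.2.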

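For each country $i$, our goal is to forecast the expected number of new COVID-19 deaths $Y_i[T+\tau]$ at a future time horizon $T+\tau$ for given future policy measures $P_i[T+1], \ldots, P_i[T+\tau]$, conditioned on the training data $\mathcal{D}$ in (1), i.e.,

$$\hat{Y}_i[T+\tau] = \mathbb{E}\big[\, Y_i[T+\tau] \;\big|\; P_i[T+1], \ldots, P_i[T+\tau],\ \mathcal{D} \,\big]. \tag{2}$$

In addition to the point prediction in (2), we also estimate uncertainty intervals that cover the true number of future deaths, $Y_i[T+\tau]$, with high probability. By examining different settings of the future policy variables $P_i[T+1], \ldots, P_i[T+\tau]$, policy makers can use the predicted fatalities and the associated uncertainty measures to inform future policy direction.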

The prediction in (2) is made for each country by conditioning on data for all countries. Thus, the model transfers knowledge about COVID-19 trends and policy effects across different countries based on their similarity with respect to the country-level feature vector $X_i$.

2.2 Data Description

In this section, we describe the data pertaining to the variables $X_i$ (country-specific features), $P_i[t]$ (policy stringency), and $Y_i[t]$ (reported deaths) in (1).

Country-specific features. We characterize each country $i$ with the feature vector $X_i$, which comprises a total of 35 economic, social, demographic, environmental and public health indicators. The list of these indicators is provided in Table 2. Data on these indicators was collated from statistical reports published by the World Bank (https://data.worldbank.org/).

COVID-19 mortality data. Data on daily reported COVID-19 deaths was collected from the COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University [6], which draws on information from local governments, national governments, WHO websites, and third-party aggregators to identify confirmed COVID-19 deaths by day of death at the first administrative level.

Policy stringency index. We consider government policies through containment and closure indicators recorded by the Oxford Covid-19 Government Response Tracker (OxCGRT), which collects systematic information on which governments have taken which measures, and when. This data was collated by the Blavatnik School of Government at Oxford University [14].

School closure
Stay-at-home
Restrictions on gathering size
Workplace closure
Restrictions on domestic or internal movement
Public transport
Cancellation of public events
Restrictions on international travel
Public information
Table 3: Individual policy measures used to evaluate policy stringency.

A single Stringency Index is constructed by aggregating the nine containment and closure indicators listed in Table 3. The index reports a number between 0 and 100 that reflects the overall stringency of the government's response over time. This is a measure of how many of these nine indicators (mostly around social isolation) a government has acted upon, and to what degree. This index is used to model the policy stringency variable $P_i[t]$ in (1).
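The aggregation described above can be sketched as follows. This is only an illustration under stated assumptions, not the OxCGRT formula itself: it assumes each indicator is recorded on an ordinal scale, rescales each to 0-100, and takes a plain average. The function name `stringency_index` and the `MAX_LEVELS` scales are hypothetical.

```python
def stringency_index(levels, max_levels):
    """Average of per-indicator scores, each rescaled to the range 0-100.

    levels[j] is the recorded ordinal level of indicator j and
    max_levels[j] is the highest level that indicator can take.
    """
    if len(levels) != len(max_levels):
        raise ValueError("levels and max_levels must have the same length")
    for value, top in zip(levels, max_levels):
        if not 0 <= value <= top:
            raise ValueError("each level must lie between 0 and its maximum")
    scores = [100.0 * value / top for value, top in zip(levels, max_levels)]
    return sum(scores) / len(scores)


# Hypothetical ordinal scales for the nine indicators (assumed, not from the paper).
MAX_LEVELS = [3, 3, 4, 3, 2, 2, 2, 4, 2]

no_measures = stringency_index([0] * 9, MAX_LEVELS)
full_lockdown = stringency_index(MAX_LEVELS, MAX_LEVELS)
partial = stringency_index([3, 1, 2, 0, 1, 0, 2, 4, 2], MAX_LEVELS)

print("no measures:", no_measures)
print("all measures at maximum:", full_lockdown)
print("partial response:", round(partial, 1))
```

An unweighted average keeps the index interpretable as "how many indicators were acted upon, and to what degree"; any weighting scheme would be a further assumption.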

Figure 2: Comparison between policy directions and COVID-19 fatalities in Scandinavian countries.

Figure 2 compares the policy directions and the number of reported daily deaths $Y_i[t]$ in four Scandinavian countries (Sweden, Norway, Denmark and Finland). Compared to the lockdowns and shuttered businesses in countries across the world, Sweden is an outlier: officials have advised citizens to work from home and avoid travel, but most schools and businesses have remained open. Thus, the stringency index of Sweden over the months of March, April and May has been slowly increasing towards a maximum of 60, which is significantly lower than in the other Scandinavian countries, which have adopted a stringency of 80-90 since February and March. Since Scandinavian countries have comparable country-specific features $X_i$, these data provide us with a natural experiment for the effect of policy stringency on the spread of the disease.

Figure 3: Policy directions and COVID-19 fatalities in different countries.

3 Compartmental Gaussian Processes

We propose a Bayesian model that jointly captures COVID-19 fatalities and the mitigating effect of policy stringency over time across different countries. The key idea is to use a two-layer Gaussian process: the first layer models country-specific COVID-19 fatalities, and the second layer shares parameters across all countries.

Hierarchical Gaussian process model. We model the daily deaths $Y_i[t]$ in country $i$ using a Gaussian process, with country-specific mean functions $f_i(t)$ and a kernel function $K_\theta(t, t')$. The input to the Gaussian process is the time dimension and the output is the number of deaths. The parameters $\beta_i(t)$ of the mean function are modeled through another Gaussian process as follows:

$$Y_i[t] \sim \mathcal{GP}\big(f_i(t),\ K_\theta(t, t')\big), \qquad \beta_i(t) = g(X_i, P_i[t]), \quad g \sim \mathcal{GP}\big(m_\phi,\ K_\phi\big). \tag{3}$$

The mean function $f_i$ shares parameters across different countries through the country-specific features $X_i$ and the policy stringency $P_i[t]$. The parameter $\beta_i(t)$ determines our prior information on how the pandemic will spread in country $i$, given its features and policy, based on the spread in other “similar” countries with “similar” levels of policy stringency.
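To make the first layer concrete, the following is a minimal sketch of Gaussian process regression with a non-zero mean function, together with the uncertainty intervals mentioned in Section 2.1. Everything here is an assumption for illustration: the squared-exponential kernel, its hyperparameters, the noise level, and the toy bell-shaped `prior_mean` curve that stands in for a compartmental forecast. The second layer (sharing parameters across countries) is not shown.

```python
import math


def rbf_kernel(t1, t2, length_scale=5.0, variance=1.0):
    """Squared-exponential kernel (an assumed choice of kernel)."""
    return variance * math.exp(-0.5 * ((t1 - t2) / length_scale) ** 2)


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def gp_posterior(train_t, train_y, test_t, mean_fn, noise_var=1e-2):
    """Posterior mean and latent variance of a GP with prior mean mean_fn."""
    gram = [[rbf_kernel(a, b) + (noise_var if i == j else 0.0)
             for j, b in enumerate(train_t)] for i, a in enumerate(train_t)]
    # The GP only has to explain what the prior mean curve misses.
    residuals = [y - mean_fn(t) for t, y in zip(train_t, train_y)]
    alpha = solve(gram, residuals)
    results = []
    for ts in test_t:
        k_star = [rbf_kernel(ts, a) for a in train_t]
        mean = mean_fn(ts) + sum(k * a for k, a in zip(k_star, alpha))
        weights = solve(gram, k_star)
        var = rbf_kernel(ts, ts) - sum(k * w for k, w in zip(k_star, weights))
        results.append((mean, max(var, 0.0)))
    return results


def prior_mean(t):
    """Toy bell-shaped curve standing in for a compartmental forecast."""
    return 10.0 * math.exp(-((t - 20.0) / 8.0) ** 2)


train_t = [float(t) for t in range(10)]
train_y = [prior_mean(t) + 2.0 for t in train_t]  # observations run above the prior
(near_mean, near_var), (far_mean, far_var) = gp_posterior(
    train_t, train_y, [5.0, 60.0], prior_mean)

for label, mean, var in (("day 5", near_mean, near_var), ("day 60", far_mean, far_var)):
    half_width = 1.96 * math.sqrt(var)
    print(f"{label}: {mean:.2f} +/- {half_width:.2f}")
```

Near the observed days the posterior follows the data and the interval is narrow; far from the data the forecast falls back to the prior mean curve with the full prior variance, which is the behavior that lets a compartmental prior guide long-horizon forecasts.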

Incorporating prior information. We model the mean functions $f_i(t)$ using a baseline compartmental model. In particular, we model the mean functions through a Susceptible, Infectious, and Recovered (SIR) model [9] with time-dependent parameters as follows:

$$f_i(t) = \mathrm{SIR}\big(t;\ \beta_i(t),\, \gamma,\, \mu\big), \tag{4}$$

where the contact rate $\beta_i(t)$, the recovery rate $\gamma$ and the mortality rate $\mu$ are the SIR model parameters. For a population of size $n_i$, the SIR model comprises three compartments: $S_i(t)$ is the number of people susceptible on day $t$, $I_i(t)$ is the number of people infected on day $t$, and $R_i(t)$ is the number of people recovered on day $t$. The SIR model describes the evolution of these quantities through the following differential equations:

$$\frac{dS_i}{dt} = -\beta_i(t)\, \frac{S_i(t)\, I_i(t)}{n_i}, \qquad \frac{dI_i}{dt} = \beta_i(t)\, \frac{S_i(t)\, I_i(t)}{n_i} - \gamma\, I_i(t), \qquad \frac{dR_i}{dt} = \gamma\, I_i(t). \tag{5}$$

The model in (5) specifies our prior on the disease spread curve; the parameters of the model are learned jointly for all countries. The Gaussian process posterior further refines our belief on the disease forecast based on observed data at each new time step.
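The standard SIR dynamics with a time-dependent contact rate can be simulated in a few lines. The sketch below uses forward-Euler integration; the function name `simulate_sir`, the step size, and all parameter values (population, rates, the day-30 policy change) are assumptions chosen for illustration and are not estimates from the paper.

```python
def simulate_sir(beta_fn, gamma, population, infected0, days, steps_per_day=10):
    """Forward-Euler integration of the SIR equations.

    beta_fn(t) returns the contact rate at time t (in days); gamma is the
    recovery rate. Returns a list of (S, I, R) tuples, one per day.
    """
    s, i, r = float(population - infected0), float(infected0), 0.0
    dt = 1.0 / steps_per_day
    trajectory = [(s, i, r)]
    for day in range(days):
        for step in range(steps_per_day):
            t = day + step * dt
            new_infections = beta_fn(t) * s * i / population * dt
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        trajectory.append((s, i, r))
    return trajectory


POPULATION = 1_000_000
GAMMA = 0.1  # assumed recovery rate (mean infectious period of 10 days)


def beta_open(t):
    """Assumed constant contact rate with no intervention."""
    return 0.3


def beta_lockdown(t):
    """Assumed contact rate that drops when a lockdown starts on day 30."""
    return 0.3 if t < 30 else 0.05


free = simulate_sir(beta_open, GAMMA, POPULATION, infected0=10, days=200)
locked = simulate_sir(beta_lockdown, GAMMA, POPULATION, infected0=10, days=200)

print("ever infected, no intervention:", round(free[-1][2]))
print("ever infected, lockdown from day 30:", round(locked[-1][2]))
```

Passing the contact rate as a function of time is what allows a policy timeline to reshape the curve; a constant-parameter SIR model corresponds to a `beta_fn` that ignores its argument.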

Incorporating policy effects. Unlike the standard SIR model with constant parameters, our model adopts a time-dependent contact rate parameter $\beta_i(t)$, which is modulated by policy effects over time. Since the basic reproduction number is $R_0 = \beta_i(t)/\gamma$, where $\gamma$ is the recovery rate, our model can learn how the policy changes $R_0$ over time, as illustrated in Figure 2.
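The link between stringency, contact rate and reproduction number can be illustrated with a toy mapping. In the paper this relationship is learned from data by the second-layer Gaussian process; the linear map below, its parameters (`beta_free`, `max_reduction`, `gamma`) and the function names are purely hypothetical stand-ins.

```python
def contact_rate(stringency, beta_free=0.3, max_reduction=0.8):
    """Hypothetical linear map from a stringency index (0-100) to a contact rate."""
    if not 0.0 <= stringency <= 100.0:
        raise ValueError("stringency must lie between 0 and 100")
    return beta_free * (1.0 - max_reduction * stringency / 100.0)


def reproduction_number(stringency, gamma=0.1):
    """Reproduction number implied by the contact rate and recovery rate."""
    return contact_rate(stringency) / gamma


# Reproduction number under three illustrative stringency levels.
r0_by_policy = {p: reproduction_number(p) for p in (0, 60, 90)}

# Smallest integer stringency at which the epidemic stops growing (R0 < 1).
critical_stringency = next(p for p in range(101) if reproduction_number(p) < 1.0)

for policy, r0 in r0_by_policy.items():
    print(f"stringency {policy}: R0 = {r0:.2f}")
print("critical stringency:", critical_stringency)
```

Under these assumed numbers, a moderate stringency leaves the reproduction number above one while a strict one pushes it below one, which is the kind of threshold effect a scenario analysis is meant to expose.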

Country | Our model | SIR model | IHME model
United Kingdom | 488 | 629 | 682
United States | 1,590 | 1,803 | 723
Italy | 335 | 462 | 383
Spain | 291 | 358 | 304
Germany | 175 | 198 |
Russia | 56 | 57 |
Turkey | 83 | 102 |
France | 233 | 270 | 393
Brazil | 291 | 316 |
Table 4: Comparison between our model and other baseline models (RMSE by country).
Figure 4: Predicted effect of the UK lockdown lifting policy on COVID-19 fatalities.

4 Preliminary Results

We fit our model on data for the first 70% of the time since the reporting of the first COVID-19 deaths, validated it on data for the following 7 days, and tested its performance on the remaining data. The results were evaluated based on the root mean squared error (RMSE) of the different methods in 11 different countries with a significant number of COVID-19 cases. Results are provided in Table 4.
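The evaluation protocol can be sketched as a chronological split followed by an RMSE computation. The split below reflects one reading of the description above (first 70% of days for fitting, next 7 days for validation, the rest for testing); the function names and the toy series are assumptions.

```python
import math


def chronological_split(series, train_frac=0.7, val_days=7):
    """Split a daily series in time order into train, validation and test parts."""
    n_train = int(len(series) * train_frac)
    train = series[:n_train]
    val = series[n_train:n_train + val_days]
    test = series[n_train + val_days:]
    return train, val, test


def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Toy daily death counts for 100 days (made-up numbers for illustration).
deaths = [float(day) for day in range(100)]
train, val, test = chronological_split(deaths)

# A naive baseline that repeats the last validation value over the test period.
naive_forecast = [val[-1]] * len(test)
print("train/val/test sizes:", len(train), len(val), len(test))
print("naive baseline RMSE:", round(rmse(test, naive_forecast), 2))
```

Splitting by time rather than at random matters here: a forecast is only meaningful if every test day comes after all the days used for fitting.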

4.1 Evaluating the lockdown lifting policy in the UK

In Figure 4, we plot the predicted daily number of COVID-19 deaths under three possible policies: (1) the lockdown being abruptly lifted, (2) the lockdown continuing, and (3) the announced UK policy for gradual lockdown lifting. We evaluated the stringency index corresponding to these three policies and plotted the forecasted daily deaths from May 13th to July 1st. The forecasts indicate that a sharp lifting of the lockdown would result in a second temporary rise in the number of deaths, with around 200 more deaths each day compared to the announced gradual lockdown lifting policy.


References

[1] IHME COVID-19 health service utilization forecasting team and Christopher J. Murray. “Forecasting the impact of the first wave of the COVID-19 pandemic on hospital demand and deaths for the USA and European Economic Area countries.” medRxiv, 2020.

[2] R. Li, S. Pei, B. Chen, et al. “Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (SARS-CoV2).” Science, 2020.

[3] N. M. Ferguson, D. Laydon, G. Nedjati-Gilani, et al. “Impact of non-pharmaceutical interventions (NPIs) to reduce COVID-19 mortality and healthcare demand.” Imp Coll COVID-19 Response Team, 2020.

[4] A. J. Kucharski, T. W. Russell, C. Diamond, et al. “Early dynamics of transmission and control of COVID-19: a mathematical modelling study.” Lancet Infect Dis, 2020.

[5] J. T. Wu, et al. “Nowcasting and forecasting the potential domestic and international spread of the 2019-nCoV outbreak originating in Wuhan, China: a modelling study.” The Lancet, pp. 689–697, 2020.

[6] JHU CSSE. 2019 Novel Coronavirus COVID-19 (2019-nCoV) Data Repository by Johns Hopkins CSSE. GitHub. 2020 (https://github.com/CSSEGISandData/COVID-19).

[7] J. Lourenço, et al. “Fundamental principles of epidemic spread highlight the immediate need for large-scale serological surveys to assess the stage of the SARS-CoV-2 epidemic.” medRxiv, 2020.

[8] C. C. McCluskey. “Complete global stability for an SIR epidemic model with delay—distributed or discrete.” Nonlinear Analysis: Real World Applications, pp. 55-59, 2010.

[9] W. O. Kermack, and A. G. McKendrick. “A contribution to the mathematical theory of epidemics.” Proceedings of the Royal Society of London, pp. 700-721, 1927.

[10] H. W. Hethcote. “The mathematics of infectious diseases.” SIAM Review, pp. 599-653, 2000.

[11] R. Lemonnier, K. Scaman, and N. Vayatis. “Tight bounds for influence in diffusion networks and application to bond percolation and epidemiology.” Advances in Neural Information Processing Systems (NeurIPS), 2014.

[12] D. B. Neill, and A. W. Moore. “A fast multi-resolution method for detection of significant spatial disease clusters.” Advances in Neural Information Processing Systems (NeurIPS), 2004.

[13] D. B. Neill, and A. W. Moore. “A fast multi-resolution method for detection of significant spatial disease clusters.” Advances in Neural Information Processing Systems (NeurIPS), 2004.

[14] T. Hale, A. Petherick, T. Phillips, and S. Webster. “Variation in government responses to COVID-19.” Blavatnik School of Government Working Paper, 2020.

[15] P. Teles. “A time-dependent SEIR model to analyse the evolution of the SARS-covid-2 epidemic outbreak in Portugal.” Bull World Health Organ, 2020.