Simulator for the spatiotemporal model for Covid-19
We introduce a novel modeling framework for studying epidemics that is specifically designed to make use of fine-grained spatiotemporal data. Motivated by the current COVID-19 outbreak and the availability of data from contact or location tracing technologies, our model uses marked temporal point processes to represent individual mobility patterns and the course of the disease for each individual in a population. We design an efficient sampling algorithm for our model that can be used to predict the spread of infectious diseases such as COVID-19 under different testing and tracing strategies, social distancing measures, and business restrictions, given location or contact histories of individuals. Building on this algorithm, we use Bayesian optimization to estimate the risk of exposure of each individual at the sites they visit, the percentage of symptomatic individuals, and the difference in transmission rate between asymptomatic and symptomatic individuals from historical longitudinal testing data. Experiments using measured COVID-19 data and mobility patterns from Tübingen, a town in the southwest of Germany, demonstrate that our model can be used to quantify the effects of tracing, testing, and containment strategies at an unprecedented spatiotemporal resolution. To facilitate research and informed policy-making, particularly in the context of the current COVID-19 outbreak, we are releasing an open-source implementation of our framework at https://github.com/covid19-model.
The novel coronavirus disease COVID-19 has spread from Wuhan, China to the rest of the world in a matter of months [27, 35]. As a result, an increasing number of countries have introduced travel restrictions, imposed social distancing measures, and even confined their citizens in their houses to slow and ideally halt the spread of COVID-19.
Against this backdrop, we see burgeoning efforts to introduce contact tracing technologies, which aim to identify and characterize disease-spreading human interactions, ultimately allowing both individuals and policymakers to make more effective decisions. Amongst them, we can distinguish between location-based technologies (e.g., http://www.zerobase.io/, http://safepaths.mit.edu/) such as QR code "check-ins", GPS, or WiFi triangulation, and proximity-based technologies (e.g., https://www.tracetogether.gov.sg/, https://www.pepp-pt.org/) such as Bluetooth. The promise of these contact tracing technologies is that automated and fine-grained contact monitoring of individuals may allow for:
More accurate predictions: New data sources may allow us to predict the spread of COVID-19 at an unprecedented spatiotemporal resolution. This includes when and where new individual infections may happen and how likely these events are to occur.
More effective containment and mitigation: Tracing technologies may help us to design more effective strategies to slow down or even prevent the spread of COVID-19, thus allowing authorities to gradually lift the most restrictive measures with more precision and confidence.
Data-driven insights into disease parameters: Accurate contact tracing may yield insights into the relative importance of different modalities of disease transmission and allow inference of the unknown parameters of the transmission and course of COVID-19.
To fulfill this promise, we need to utilize data-driven models and inference algorithms designed to use and benefit from contact tracing data of individuals. Unfortunately, most of the classical epidemiological literature [2, 7, 15, 25, 24] has primarily focused on developing models for general population dynamics rather than the infectious state of any given individual in the population.
More recently, there has been research on modeling individual dynamics of epidemics [1, 4, 5, 31, 30]. However, this work typically resorts to mean-field theory and thus does not characterize the dynamic infectious state of each individual over time. The work closest in spirit to ours may be that of Ferguson et al. [10, 9], who use a stochastic model of individuals situated at random locations in space to analyze strategies for preventing and containing influenza. However, they rely on ad-hoc assumptions about site visits and, as a result, cannot make use of the individual contact tracing data generated by modern technologies. Finally, in the context of the current COVID-19 outbreak, there has been a flurry of work contemporary to ours [8, 11, 18, 32], including research in which Ferguson et al. re-use the above-mentioned influenza models [10, 9]. However, as in Ferguson et al. [10, 9], none of these papers can make use of individual contact tracing data or characterize the fine-grained effects of testing, contact tracing, and containment interventions over time at a local level. The present work attempts to fill this gap.
We introduce a novel modeling framework that is expressive enough to make use of the fine-grained spatiotemporal data produced by contact tracing technologies. Our model uses marked temporal point processes to represent events in which:
individuals are exposed, asymptomatic, presymptomatic, symptomatic, hospitalized, recovered, or deceased (Epidemiology).
individuals check in at different points of interest, such as supermarkets, pharmacies, or testing site locations, and meet (and possibly infect) each other (Mobility).
health authorities test individuals, individuals receive the outcome of a test, and contacts of positively tested individuals are tracked down (Testing & Tracing).
individuals follow social distancing measures, including quarantines and curfews (Social distancing).
points of interest implement hygienic and capacity-limiting measures or closures (Business restrictions).
Within this paradigm, our model can be fully defined by way of a set of conditional intensity functions, or hazard functions. Our model is agnostic to the particular choice of conditional intensity function for mobility patterns. We further introduce a new intensity function that characterizes the influence that mobility patterns, as well as social distancing measures and business restrictions, have on the risk that each individual with the virus poses to their community. Moreover, we follow the recent literature on COVID-19 [11, 18] to define the specific functional form of the intensity functions that characterize the times when individuals become asymptomatic, presymptomatic, symptomatic, hospitalized, recovered, or deceased, and specify their parameter values accordingly. For testing & tracing, social distancing, and business restrictions, our model can express a variety of real-world interventions.
Given a fixed set of individual mobility patterns as collected by contact tracing technologies, we design an efficient sampling algorithm for our model, summarized in Algorithm 1, that is able to predict the spread of COVID-19 under different testing & tracing strategies, social distancing measures, and business restrictions using Monte Carlo roll-outs. One of the key components of our sampling algorithm is the use of the superposition principle and thinning to generate faithful simulations. Ultimately, this allows us to accurately estimate the effect of a variety of interventions and what-if scenarios given a fixed set of location traces.
Moreover, when historical aggregate longitudinal testing data are available in addition to the set of individual mobility patterns, we introduce an inference procedure that uses our sampling algorithm and Bayesian Optimization (BO) to estimate: (i) the parameters of the intensity function that characterizes the risk of exposure of each individual; (ii) the percentage of asymptomatic individuals among those infected; and (iii) the relative difference in transmission rate between asymptomatic and (pre)symptomatic individuals.
Finally, we validate our modeling framework using COVID-19 data from Tübingen, a town in the southwest of Germany, and demonstrate that it can be used to estimate the effect of testing & tracing, social distancing, and business restrictions at an unprecedented spatiotemporal resolution. To enable further research in this area as well as to facilitate scientifically informed policymaking, we release an open-source implementation of our framework at https://github.com/covid19-model.
Given a set of individuals, we track the current state of each individual using a collection of state variables, which determine her mobility pattern, epidemiological condition, and degree of testing & tracing under different social distancing and business restriction measures. We use stochastic differential equations (SDE) to realistically model (i) the stochastic nature of mobility patterns and infection events, (ii) events in continuous time, i.e., not in aggregate over days, and (iii) discrete transitions between different states: an individual either does or does not get infected, visit a site, or get selected for quarantine. To ease the exposition, we describe the different types of state variables separately.
In this section, we introduce a mobility model for location-based technologies such as site check-ins, as facilitated by QR codes, GPS, or WiFi triangulation. Refer to Appendix A for a mobility model for proximity-based technologies.
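Let $\mathcal{S}$ be the set of sites individuals can check in at. For each individual $i$, let the indicator $P_{i,k}(t) = 1$ if the individual is at site $k \in \mathcal{S}$ at time $t$ and $P_{i,k}(t) = 0$ otherwise. We characterize the value of the states $P_{i,k}(t)$ using the following stochastic differential equation (SDE) with jumps:
$$dP_{i,k}(t) = dU_{i,k}(t) - dV_{i,k}(t),$$
where $U_{i,k}(t)$ and $V_{i,k}(t)$ are counting processes indicating when individual $i$ checks in and checks out at site $k$, respectively. Moreover, we define the intensity (or rates) of these counting processes as follows:
$$\mathbb{E}[dU_{i,k}(t) \mid \mathcal{H}(t)] = \lambda_{i,k}(t) \prod_{l \in \mathcal{S}} \bigl(1 - P_{i,l}(t)\bigr)\, dt, \qquad \mathbb{E}[dV_{i,k}(t) \mid \mathcal{H}(t)] = v_k\, P_{i,k}(t)\, dt,$$
where $\mathcal{H}(t)$ denotes the history up to time $t$, $\lambda_{i,k}(t)$ is the (arbitrary) rate at which the individual checks in at site $k$, and $1/v_k$ is the average duration of a check-in at site $k$.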
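As a rough illustration of these check-in and check-out dynamics, the following sketch simulates the visits of one individual to a single site. It assumes a constant check-in rate and exponentially distributed visit durations; the function and parameter names (`simulate_visits`, `checkin_rate`, `mean_duration`) are hypothetical and are not taken from the released implementation.

```python
import random

def simulate_visits(checkin_rate, mean_duration, horizon, rng):
    """Return a list of (check_in, check_out) intervals for one individual at one site.

    Assumptions (illustrative only): constant check-in rate while the individual
    is away from the site, exponential visit durations with the given mean.
    """
    visits = []
    t = 0.0
    while True:
        # Waiting time until the next check-in; only drawn while not at the site.
        t += rng.expovariate(checkin_rate)
        if t >= horizon:
            break
        # Visit duration; a check-out rate of 1 / mean_duration gives this mean.
        duration = rng.expovariate(1.0 / mean_duration)
        check_out = min(t + duration, horizon)
        visits.append((t, check_out))
        t = check_out
    return visits

visits = simulate_visits(checkin_rate=0.5, mean_duration=1.0,
                         horizon=100.0, rng=random.Random(0))
```

Because a new check-in is only sampled after the previous check-out, the resulting intervals never overlap, which mirrors the requirement that an individual is at a site or away from it at any given time.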
We build on recent variations of the Susceptible-Exposed-Infected-Resistant (SEIR) compartment models, which have been introduced in the context of COVID-19 modeling [18, 11]. More specifically, we define the epidemiological condition of each individual using a set of indicator state variables, whose meaning is specified in Table 1.
|is asymptomatic, has mild course of disease||✓||✓||-|
|is pre-symptomatic, progresses to later on||✓||✓||-|
|is recovered and resistant||-||-||-|
Then, we characterize their values and state transitions using the following stochastic differential equations (SDE) with jumps:
Here, three further indicator variables specify whether an infected individual is asymptomatic, whether a symptomatic individual eventually requires hospitalization, and whether a symptomatic individual eventually dies. The counting processes indicate when an individual transitions from susceptible to exposed, from exposed to infected, from pre-symptomatic infected to symptomatic infected, from asymptomatic infected to resistant, from symptomatic infected to resistant, and from symptomatic infected to deceased.
Then, we define the conditional intensity functions of all counting processes other than the exposure process following the recent literature on COVID-19 modeling [18, 11]. More specifically, we take the functional form of these intensity functions to be that of log-normal time-to-event distributions, each shifted to start at the time the state variable that triggers the transition becomes one. Table 2 gives more details about each of these distributions, including the specific study to which we refer for the distribution parameters.
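A shifted log-normal transition time can be sampled as in the sketch below. It uses the standard moment-matching identities for the log-normal distribution to obtain the parameters of the underlying normal from a reported mean and standard deviation. The helper names and the numbers in the usage line are placeholders, not values from Table 2.

```python
import math
import random

def lognormal_params(mean, std):
    """Moment-match a log-normal: return (mu, sigma) of the underlying normal."""
    sigma2 = math.log(1.0 + (std / mean) ** 2)
    mu = math.log(mean) - 0.5 * sigma2
    return mu, math.sqrt(sigma2)

def sample_transition_time(start_time, mean, std, rng):
    """Sample the time of the next transition, shifted to begin at `start_time`."""
    mu, sigma = lognormal_params(mean, std)
    return start_time + rng.lognormvariate(mu, sigma)

# Placeholder values: a delay with mean 5 days and standard deviation 2 days,
# starting when the triggering state was entered on day 3.
t_next = sample_transition_time(start_time=3.0, mean=5.0, std=2.0,
                                rng=random.Random(0))
```

When a study reports a median instead of a mean, the same idea applies with `mu = log(median)`, since the median of a log-normal is `exp(mu)`.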
Finally, we define the central conditional intensity function of the exposure counting process, which governs when a susceptible individual becomes exposed. External factors can be characterized by adding a base rate to this intensity. The intensity involves the following quantities:
the transmission rate due to any type of infectious individual at a given site. Depending on the availability of labeled and unlabeled data, one may consider different settings, such as all sites sharing the same parameter or each type of site sharing its own parameter;
the relative difference in transmission rate between asymptomatic and (pre)symptomatic individuals; and
a term that accounts for environmental transmission, i.e., for the fact that SARS-CoV-2 may survive for some period of time on surfaces or in the air after an infected individual has left a site.
Note that this conditional intensity function depends on the location-based mobility model. However, the definition can be adapted to any contact tracing technology by redefining the corresponding contact indicator. Refer to Appendix A for a conditional intensity function of a proximity-based mobility model.
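To make the environmental transmission term more concrete, the sketch below computes the exposure rate that a single infectious visit contributes to a susceptible individual at the same site. It assumes an exponentially decaying kernel truncated at a fixed window, which is one plausible reading of the "decay of infectiousness" and "window of non-contact contamination" parameters and not necessarily the exact form used in the model. All names (`beta`, `gamma`, `delta`) are hypothetical.

```python
import math

def contamination_window(gamma, threshold=0.2):
    """Time after which exp(-gamma * s) has dropped to `threshold` (here 20%)."""
    return math.log(1.0 / threshold) / gamma

def exposure_rate(t, visit_start, visit_end, beta, gamma, delta):
    """Exposure rate at time t caused by an infectious visit [visit_start, visit_end].

    Integrates the assumed kernel gamma * exp(-gamma * (t - tau)) over the part
    of the visit that lies within the last `delta` time units before t.
    """
    lo = max(visit_start, t - delta)
    hi = min(visit_end, t)
    if hi <= lo:
        return 0.0
    return beta * (math.exp(-gamma * (t - hi)) - math.exp(-gamma * (t - lo)))

gamma = 2.0
delta = contamination_window(gamma)
rate_during = exposure_rate(0.5, 0.0, 1.0, beta=1.0, gamma=gamma, delta=delta)
rate_shortly_after = exposure_rate(1.5, 0.0, 1.0, beta=1.0, gamma=gamma, delta=delta)
rate_long_after = exposure_rate(2.0, 0.0, 1.0, beta=1.0, gamma=gamma, delta=delta)
```

Under these assumptions the rate is positive while the infectious individual is present, decays after they leave, and is exactly zero once the contamination window has passed.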
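In Section 3, we will introduce a procedure based on Bayesian optimization (BO) to estimate the transmission rate, the proportion of asymptomatic infections, and the relative transmission rate of asymptomatic individuals from historical aggregate testing data.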
Whenever literature results on COVID-19 were only reported using mean or median estimates of times with confidence intervals, we use log-normal distributions with corresponding normal parameters to define an approximate distribution, often consulting various sources. The log-normal is commonly used to model event times in this context [17, 19].
|Counting process|Starts when|
|d (incubation period, here corrected not to encompass the estimated time of pre-symptomatic infectiousness)|
|[36, 12, 34]|
|[36, 12, 34]|
|decay of infectiousness at sites|
|window of non-contact contamination (for computational purposes, set by the time when the rate of infection drops below 20% after leaving a site)|
This section describes the testing and tracing procedure implemented in our model, which is chosen to match reality as closely as possible but can in principle be varied. We assume there exists a health authority that maintains a priority queue of individuals to be tested. Over time, it decides who to add to the queue according to a testing policy, e.g., adding only symptomatic people to the queue. It tests individuals from the queue at an arbitrary rate. Moreover, every time an individual is tested, the outcome of the test is only known after a reporting delay.
We record the number of known test outcomes by any given time and, for each individual, the number of times the individual has tested positive and negative, respectively, by that time. We characterize the corresponding counting processes using an SDE with jumps, in which an indicator specifies whether an individual is tested at a given time according to the testing policy.
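The testing procedure can be sketched with a priority queue as follows. This is a simplified discrete-day version with a fixed daily test capacity, whereas the model itself operates in continuous time. The function name, the tuple layout of `requests`, and the example data are all hypothetical.

```python
import heapq

def run_testing(requests, tests_per_day, reporting_delay, infected, horizon_days):
    """Simulate a testing queue.

    requests: list of (day_added, priority, person); a lower priority value is
    tested first. Returns {person: (day_result_known, is_positive)}.
    """
    pending = sorted(requests)  # ordered by the day each person joins the queue
    queue, results, idx = [], {}, 0
    for day in range(horizon_days):
        # Add everyone who has requested a test up to and including today.
        while idx < len(pending) and pending[idx][0] <= day:
            _, priority, person = pending[idx]
            heapq.heappush(queue, (priority, person))
            idx += 1
        # Test as many queued individuals as today's capacity allows.
        for _ in range(tests_per_day):
            if not queue:
                break
            _, person = heapq.heappop(queue)
            # The outcome only becomes known after the reporting delay.
            results[person] = (day + reporting_delay, person in infected)
    return results

results = run_testing(
    requests=[(0, 1, "a"), (0, 0, "b"), (1, 0, "c")],
    tests_per_day=1, reporting_delay=2, infected={"b"}, horizon_days=5,
)
```

In this toy run, individual "a" joins the queue on day 0 but is only tested on day 2, because higher-priority individuals use up the daily capacity first.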
Every time an individual is tested positive, the health authority may decide to implement contact tracing, i.e., to track down her contacts over a preceding time window. To this end, it identifies the set of individuals who visited a location around the same time as the positively tested individual did within that window before the test.
Once it has identified these contacts, it may decide to apply different contact tracing policies. In this work, we consider two different policies:
Basic contact tracing policy: The authority picks a number of individuals from this set of contacts at random and decides to either add them to the testing queue or to isolate them for a period of time.
Advanced contact tracing policy: The authority picks the top-ranked individuals from this set of contacts, ordered by lowest empirical survival probability, and decides to either add them to the testing queue or to isolate them for a period of time. Thus, this policy ranks contacts by the lowest probability of not having been exposed to COVID-19, i.e., by the highest probability of exposure. Disregarding second-order effects and inaccuracies in estimating the empirical survival probability, advanced contact tracing can be interpreted as a greedy allocation of tests under limited resources.
Note that the health authority cannot observe when an individual became infected, and thus the empirical survival probability does not match the true one. In future work, it would be interesting to consider more sophisticated contact tracing policies. This could include imposing or lifting business restrictions in different neighborhoods based on the empirical survival probabilities of individuals who live in those neighborhoods.
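The two tracing policies can be contrasted in a few lines. The sketch assumes that, for each contact, the authority has an estimate of the exposure intensity accumulated during the tracing window, and it uses the standard point-process identity that the probability of no event equals the exponential of the negative integrated intensity. The function names and the example numbers are hypothetical.

```python
import math
import random

def empirical_survival(cumulative_exposure):
    """Probability of not having been exposed, given the integrated intensity."""
    return math.exp(-cumulative_exposure)

def basic_tracing(contact_exposure, k, rng):
    """Pick k contacts uniformly at random."""
    contacts = sorted(contact_exposure)
    return rng.sample(contacts, min(k, len(contacts)))

def advanced_tracing(contact_exposure, k):
    """Pick the k contacts with the lowest empirical survival probability."""
    ranked = sorted(contact_exposure,
                    key=lambda c: empirical_survival(contact_exposure[c]))
    return ranked[:k]

# Hypothetical integrated exposure intensities for three contacts.
exposure = {"a": 0.1, "b": 2.0, "c": 0.5}
selected = advanced_tracing(exposure, k=2)
random_pick = basic_tracing(exposure, k=2, rng=random.Random(0))
```

Since the survival probability decreases monotonically with the accumulated exposure, ranking by lowest survival is the same as ranking by highest accumulated exposure.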
Continuing the setting from above, the corresponding governmental authority may decide to implement a variety of social distancing measures, from less restrictive (e.g., isolate individuals who have tested positive) to more restrictive (e.g., bring the entire population into a state of “lock-down” via curfews). In our model, the effect of social distancing can be faithfully characterized at an individual level by modifying the individual mobility models introduced in Section 2.1.
More specifically, if the authority decides to isolate a group, for example based on age, for a period of time, then it is straightforward to either set the check-in intensities of the individuals in the group to zero or set their site indicators to zero for that period.
By doing so, we modify implicitly or explicitly, respectively, the conditional intensity function of the exposure counting process, given by Eq. 4, which in turn influences the spread of COVID-19 over time. Moreover, by setting the site indicators rather than the check-in intensities to zero, we can quantify the potential effect that social distancing may have on the spread of COVID-19 given a set of historical mobility traces, bypassing the need to repeatedly sample mobility traces to quantify the effect of different measures. This key insight is used in our sampling algorithm in Section 3.1.
Furthermore, our model can faithfully characterize the effect of business restrictions that the governmental authority may also decide to impose. For instance, if the authority asks supermarkets to implement hygienic measures to reduce the probability of customers infecting each other, it is straightforward to reduce the value of the corresponding transmission rates in the conditional intensity function of the exposure counting process, given by Eq. 4. Alternatively, if the authority asks bars to shut down for a period of time, we set their corresponding transmission rates to zero for the duration of the closure.
Finally, note that for both social distancing and business restrictions, the start times and durations of the measures may be set dynamically based on other events of interest or according to various cyclic schedules. For example, the start time may be the time when an individual is tested positive, and the duration may be the time the authority decides positively tested individuals should stay in isolation.
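Applying an isolation measure to a fixed set of mobility traces amounts to filtering the recorded visits, as in the sketch below. It makes the simplifying assumption that a visit is dropped entirely if it overlaps the isolation window, rather than being truncated. The data layout and the function name are hypothetical.

```python
def apply_isolation(visits, isolation):
    """Remove visits that overlap an individual's isolation window.

    visits: list of (person, site, start, end) tuples from historical traces.
    isolation: {person: (window_start, window_end)}, e.g. starting at the time
    of a positive test and lasting for the prescribed isolation period.
    """
    kept = []
    for person, site, start, end in visits:
        window = isolation.get(person)
        if window is not None and start < window[1] and end > window[0]:
            continue  # the person stays home, so this visit never happens
        kept.append((person, site, start, end))
    return kept

traces = [
    ("anna", "supermarket", 1.0, 1.5),
    ("anna", "office", 5.0, 9.0),
    ("ben", "office", 5.0, 9.0),
]
# Anna is isolated from time 4 to time 18; Ben is unaffected.
remaining = apply_isolation(traces, {"anna": (4.0, 18.0)})
```

The traces themselves are never resampled, so several candidate measures can be compared on exactly the same underlying mobility data.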
In this section, we first introduce an efficient algorithm for generating simulations of our model under a given set of parameters. This sampler adheres to our model definition in Section 2 and efficiently simulates the spread of COVID-19 given a fixed set of individual mobility patterns. This allows us to estimate the efficacy of various strategies of testing, contact tracing, and containment using repeated Monte Carlo roll-outs. Afterwards, we present an inference procedure that uses our sampling algorithm and Bayesian Optimization [26, 14, 3] to fit the model parameters given a set of mobility traces and longitudinal testing data.
Given a fixed set of general individual mobility patterns and an initial set of exposed individuals, our algorithm simulates the state of each individual in the population over a time window of interest under a given testing and tracing strategy, social distancing measures, and business restrictions.
The challenge of simulating realizations from our model is the generation of valid samples from the exposure processes given the location patterns of each individual, which may dynamically change due to social distancing measures and business restrictions. In particular, once an individual becomes infectious, their state changes to one of the infectious states, thereby changing the intensities of the exposure processes of other individuals who have contact with them in the future. This means that previously sampled exposure times might become invalid as the intensities of the point processes change.
As a result, sound simulations have to apply the principles of superposition and thinning to generate valid exposure samples as states change over time. Algorithms 1 and 2 implement these principles efficiently in a global context by using a priority queue of temporal events for all individuals in the model. Refer to Appendix B for a detailed description of the overall procedure.
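The thinning principle can be illustrated in isolation with the sketch below, which samples the first event of a point process whose intensity is bounded from above. It is a generic thinning routine, not a reproduction of Algorithms 1 and 2; the function name and the example intensity are hypothetical.

```python
import random

def sample_first_event(intensity, upper_bound, t_start, t_end, rng):
    """Thinning: first event time of a point process on [t_start, t_end).

    `intensity(t)` must satisfy 0 <= intensity(t) <= upper_bound.
    Returns None if no event occurs before t_end.
    """
    t = t_start
    while True:
        # Propose the next candidate from a homogeneous process at the bound.
        t += rng.expovariate(upper_bound)
        if t >= t_end:
            return None
        # Accept the candidate with probability intensity(t) / upper_bound.
        if rng.random() * upper_bound < intensity(t):
            return t

def window_intensity(t):
    """Example: exposure is only possible between time 5 and time 10."""
    return 0.8 if 5.0 <= t <= 10.0 else 0.0

rng = random.Random(0)
events = [sample_first_event(window_intensity, 0.8, 0.0, 20.0, rng)
          for _ in range(200)]
```

In the full simulation, the intensity of a susceptible individual changes whenever someone they meet becomes infectious, so candidate exposure times have to be re-validated or resampled against the updated intensity.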
Given a fixed set of individual mobility patterns and historical testing data, we use Bayesian optimization (BO) [26, 22] to infer the model parameters that best fit the historical testing data. BO techniques are among the most efficient approaches in terms of the number of function evaluations required for optimization [14, 3], making them well suited to the large-scale simulations used in our model.
Given a number of Monte Carlo roll-outs of our model, we consider two types of loss functions to be minimized in the context of point process modeling for epidemics. Their applicability depends on the availability of either aggregate or fine-grained testing data. More specifically:
Cumulative Daily Squared Error: This loss compares the cumulative number of real positive cases by each day with the corresponding cumulative number of positive tests in the simulated roll-outs. Given a time horizon in days, the loss sums the squared differences between the two over all days in the horizon (Eq. 9).
Total Personalized Reporting Distance: This loss compares the set of individuals for whom we have real positive testing reports with the set of individuals that tested positive according to each Monte Carlo roll-out, together with the times at which the respective positive tests were reported. Inspired by previous likelihood-free inference approaches in the temporal point process literature, and given a time horizon in days, the loss measures the distance between real and simulated reporting times at the level of individuals (Eq. 10).
In the absence of more fine-grained longitudinal testing data for our experiments, we use the cumulative daily squared error definition in Eq. 9 and point to the second loss function for applications with contact tracing data. Furthermore, note that the BO literature typically considers the setting of function maximization, which is why we equivalently maximize the negative of the loss.
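A cumulative daily squared error can be computed as in the following sketch. It assumes that the simulated cumulative case counts are averaged over the roll-outs before being compared with the real counts; the exact way of aggregating roll-outs is an assumption here, and the function name and example data are hypothetical.

```python
def cumulative_daily_squared_error(real_cumulative, simulated_cumulative):
    """Sum over days of (real cases - mean simulated cases) squared.

    real_cumulative: cumulative real positive cases per day.
    simulated_cumulative: one list of cumulative positive cases per roll-out,
    each with the same number of days as `real_cumulative`.
    """
    n_rollouts = len(simulated_cumulative)
    loss = 0.0
    for day, real in enumerate(real_cumulative):
        mean_sim = sum(run[day] for run in simulated_cumulative) / n_rollouts
        loss += (real - mean_sim) ** 2
    return loss

real = [1, 3, 6]
rollouts = [[1, 2, 5], [1, 4, 9]]
loss = cumulative_daily_squared_error(real, rollouts)
objective = -loss  # Bayesian optimization maximizes the negative loss
```

With these toy numbers, the mean simulated trajectory is 1, 3, 7, so only the last day contributes to the loss.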
Bayesian optimization is a search method that models the objective function as being sampled from a Gaussian process (GP) prior, using function evaluations as observations to update the posterior of the objective. At each iteration of the procedure, an acquisition function defined over the current GP posterior is cheaply optimized to determine the next point of function evaluation. A batch of random realizations of our model for the proposed settings is then generated in a distributed fashion, and the observed loss from Equation 9 or 10 is used to update the GP posterior with an additional observation. The acquisition function guides the search for an optimum and is typically defined in such a way that a high acquisition value corresponds to potential improvement of the objective, either via areas of high uncertainty or areas of high objective values. For inference of the model parameters, we used the upper confidence bound (UCB) heuristic with a fixed exploration parameter.
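The role of the UCB acquisition function can be seen in a small example. The sketch takes the GP posterior mean and standard deviation at a set of candidate parameter values as given, since fitting the GP is outside its scope, and returns the candidate that maximizes the upper confidence bound. The names and numbers are hypothetical, and `kappa` stands for the exploration parameter.

```python
def ucb_next_point(candidates, posterior_mean, posterior_std, kappa):
    """Return the candidate maximizing mean + kappa * std, and its score."""
    scores = [m + kappa * s for m, s in zip(posterior_mean, posterior_std)]
    best = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best], scores[best]

# Hypothetical posterior over three candidate parameter values.
candidates = [0.1, 0.5, 0.9]
mean = [-3.0, -2.0, -4.0]   # predicted negative loss
std = [0.1, 0.2, 2.0]       # posterior uncertainty

exploit, _ = ucb_next_point(candidates, mean, std, kappa=0.0)
explore, _ = ucb_next_point(candidates, mean, std, kappa=2.0)
```

With `kappa = 0` the rule picks the candidate with the best predicted objective, while a larger `kappa` shifts the choice towards the candidate with the highest uncertainty.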
In this section, we showcase our modeling framework using real COVID-19 data from Tübingen, a town in the southwest of Germany, and demonstrate that it can be used to estimate the effect of testing & tracing, social distancing, and business restrictions at an unprecedented spatiotemporal resolution.
To facilitate the analysis of data from other locations around the world, the open-source implementation of our modeling framework includes a collection of auxiliary functions and notebooks that can be used to mimic our experimental setup from publicly available data at any desired location.
In the absence of personalized check-in data available to us as of today, we develop a realistic mobility model based on the population density within a town and availability of sites. Note that our proposed framework is readily generalizable to more complex mobility models and explicitly suitable for data from contact tracing technologies. All analyses in this Section are made using Tübingen as a case study for the effects resulting from various tracing, testing, and containment strategies related to COVID-19. However, the conclusions drawn can be used to guide recommendations in regions beyond this specific case.
Demographics: We use data provided by Facebook that is part of a Disease Prevention Map (https://dataforgood.fb.com/tools/disease-prevention-maps/) to obtain high-resolution population density data. The map contains temporal values of population density per geographical map tile both during the spread of COVID-19 and 45 days before as a baseline. For our experiments, we use measurements of density per tile from nighttime snapshots before the spread of the pandemic and average over a period of six days to determine the density of home locations in each map tile. We randomly distribute the individuals of each tile to six age groups according to the real demographics of the province, thereby matching the age groups of the COVID-19 case data collected by the national authorities in Germany. The generated spatial population distribution in the town is presented in Figure 3(a).
Site locations: Using the openly available API of OpenStreetMap, we extract available sites of specific types for Tübingen. Specifically, the sites we consider in our experiments are divided into educational institutions (e.g., schools, universities), social places (e.g., restaurants, bars, cafes), bus stops, offices, and supermarkets. Note that our model is not constrained to this setting and could be arbitrarily generalized by practitioners if desired. The available sites of each type can be seen in Figure 3(b). For each type of site, we set a mean duration of visit reflecting the exposure time during that visit (e.g., two hours for schools, 30 minutes for supermarkets) and a rate of visits per week, which also depends on the age group (e.g., educational sites five times a week for children, social places three times a week for seniors). Given a time horizon for each individual and the aforementioned rates, we iteratively sample the types, timings, and durations of visits from exponential distributions. Each individual visits a constrained set of sites per type (e.g., 1 school, 2 supermarkets), each chosen with probability inversely proportional to its distance from home, reflecting the fact that people have formed habits and keep revisiting places.
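Choosing an individual's habitual sites with probability inversely proportional to distance from home can be sketched as follows. The sketch samples distinct sites one at a time and renormalizes after each draw, which is one straightforward way to obtain a fixed-size set; site identifiers, distances, and function names are hypothetical, and all distances are assumed to be strictly positive.

```python
import random

def site_probabilities(distances):
    """Probabilities proportional to 1 / distance (distances must be > 0)."""
    weights = [1.0 / d for d in distances]
    total = sum(weights)
    return [w / total for w in weights]

def pick_favorite_sites(site_ids, distances, n_sites, rng):
    """Sample n_sites distinct sites; closer sites are more likely to be chosen."""
    ids, dists, chosen = list(site_ids), list(distances), []
    for _ in range(min(n_sites, len(ids))):
        probs = site_probabilities(dists)
        idx = rng.choices(range(len(ids)), weights=probs)[0]
        chosen.append(ids.pop(idx))
        dists.pop(idx)
    return chosen

# Three supermarkets at 1 km, 2 km, and 4 km from home; the individual uses two.
probs = site_probabilities([1.0, 2.0, 4.0])
favorites = pick_favorite_sites(["s1", "s2", "s3"], [1.0, 2.0, 4.0],
                                n_sites=2, rng=random.Random(0))
```

Fixing this small set once per individual, and then drawing every visit from it, reproduces the habit of returning to the same few places.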
COVID-19 data: For our experiments, we use reported daily cases of COVID-19 in the region of Tübingen as publicly released by the national health authorities. Furthermore, we assume a mortality rate per age group that follows the empirical mortality data for COVID-19 in the studied region of Tübingen. We fix the proportion of hospitalized cases based on estimates in previous work on COVID-19.
Scaling: For the following experiments, we chose to use a smaller representation of Tübingen for efficiency reasons, where population and sites were down-sampled by factors of 20x and 10x, respectively. To account for an urban accumulation of COVID-19 cases, we down-sampled real case numbers by 10x.
|Value|Description|
|1.1383|Rate of exposure at locations (because we only had synthetically generated traces, we optimized a single parameter for all site types)|
|0.3224|Proportion of asymptomatic infections|
|0.2072|Relative infectiousness of asymptomatic infected|
Predictions by our model using the parameters optimized to match COVID-19 case data before the introduction of restrictive measures. The simulation continued until April 12, 2020 with no further restrictions other than the isolation of the positively tested. In panel (a), line and shading represent mean and standard deviation across 40 independent simulations. In panels (b) and (c), lines represent means and error bars represent the standard deviation of total infections as given by the top line.
We use Bayesian optimization to estimate the epidemic parameters using real COVID-19 case data from Tübingen in a time window when the spread of COVID-19 occurred at a comparably uncontrolled rate. We select the period from March 10, 2020 until March 25, 2020 under the assumption that, except for the isolation of positive cases, neither social distancing measures nor business closures had a measurable effect on case counts within the first two days after their introduction. Officially, restrictions on free movement and business closures within the country came into effect on March 23, 2020.
Following the inference procedure outlined in Section 3.2, we ran 100 batches of twenty random realizations each until convergence was observed, proposing new parameters in BO after each batch of simulations. Each realization also used freshly sampled synthetic mobility traces and infection seeds. We set the seed counts to approximately match the observed COVID-19 cases in Tübingen on March 10, 2020. Table 3 summarizes the inferred epidemic parameters. The estimated relative infectiousness of 0.2072 indicates that the observed COVID-19 cases in Tübingen are likely to have followed an epidemic in which asymptomatic individuals were significantly less infectious than symptomatic individuals. This approximately aligns with other recent estimates in the range of 0.1 to 0.55 [18, 11]. In addition, the optimization indicates that asymptomatic infections account for roughly a third (0.3224) of all infections. This is slightly below recent estimates in the range of 0.4 to 0.5 [21, 8, 11]. Unless stated otherwise, all experiments in the remainder of this work are conducted using the optimized parameters listed in Table 3.
Worst-case projections for the observed period of March 23 to April 12, 2020: The above conclusions were drawn as a result of inferring the parameters that likely underlie the current COVID-19 epidemic. When simulating 40 random realizations of the model using the inferred parameters but continuing the simulation until April 12, i.e., the present day, the mean number of infections observed at the end of the simulation is 1,465. With an average of only 76 infected individuals on March 23, this is an approximate increase by a factor of 20 in a 20-day period. The variance in these estimates is significant, with standard deviations of 579 and 57 people, respectively. These deviations can be expected to shrink when the randomness introduced by the synthetic mobility traces is eliminated, for instance by using real data from contact tracing technologies.
Scaled from the down-sampled version of the city used for inference to its real size, this would imply approximately 29,000 total infections in Tübingen on April 12. In addition, we note that the number of positively tested individuals in this setting would not have deviated as extremely as the underlying true infection counts, matching the observed COVID-19 data. This worst-case scenario is illustrated in Figure 7.
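The growth factor and the rescaled infection count quoted for the worst-case projection follow from simple arithmetic on the reported simulation means and the 20x population down-sampling. The variable names below are illustrative only.

```python
mean_infections_mar23 = 76      # mean simulated infections on March 23
mean_infections_apr12 = 1465    # mean simulated infections on April 12
population_downsampling = 20    # the population was down-sampled by 20x

# Relative growth over the roughly 20-day window.
growth_factor = mean_infections_apr12 / mean_infections_mar23

# Rescale the simulated count back to the real size of the town.
full_scale_infections = mean_infections_apr12 * population_downsampling
```

The ratio comes out at about 19.3, i.e., close to the quoted factor of 20, and the rescaled count is 29,300, which rounds to the quoted 29,000.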
In the next sections, we will focus on the period of time after March 23, 2020 and investigate how our modeling framework can be used to quantify the effect of different social distancing measures, business restrictions, and testing & tracing strategies. In all experiments, we performed 40 independent simulations and also randomized over the initially seeded infections as well as the synthetic mobility traces.
Starting on March 23, 2020, highly restrictive measures were implemented by the German government all across Germany and specifically also in Tübingen by its local authorities. These include the closure of schools, universities, and social sites, as well as several sanitary measures in public locations such as supermarkets. In addition, individuals were asked to follow a series of social distancing measures, e.g., that only groups of two people or members of the same household were allowed to meet in public. As a result, the growth in new confirmed COVID-19 cases has slowed down up to the present day, as shown in panel (a) of the accompanying figure. The so-called “lockdown” was at least effective in flattening the curve of infections.
In our model, the above measures can be implemented in a straightforward manner. In particular, by (i) setting the transmission rate to zero for any social place or educational institution, (ii) reducing the inferred transmission rate by 50% for all remaining sites, and (iii) reducing the check-in intensities by an (unknown) percentage for all individuals in the population, the effects of the implemented state measures after March 23 can be estimated from Monte Carlo simulations.
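Measures (i) to (iii) translate into two small operations on the model inputs, sketched below. Reducing check-in intensities by a given percentage is approximated here by independently dropping each recorded visit with that probability, which is an assumption of this sketch; site type labels and function names are likewise hypothetical.

```python
import random

CLOSED_TYPES = {"social", "education"}  # site types that are shut down

def restricted_transmission_rate(site_type, beta, reduction=0.5):
    """Transmission rate of a site under the restrictions (i) and (ii)."""
    if site_type in CLOSED_TYPES:
        return 0.0                     # (i) closed sites cannot transmit
    return beta * (1.0 - reduction)    # (ii) hygiene measures halve the rate

def thin_visits(visits, p, rng):
    """(iii) Drop each visit independently with probability p."""
    return [v for v in visits if rng.random() >= p]

beta = 1.1383  # inferred rate of exposure at locations
rates = {t: restricted_transmission_rate(t, beta)
         for t in ["social", "education", "supermarket", "office"]}
kept = thin_visits(list(range(1000)), p=0.3, rng=random.Random(0))
```

Sweeping `p` over a grid and comparing the simulated positive cases with the reported ones is then one way to look for the reduction in mobility that best matches the data.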
If we simulate the epidemic with the estimated parameters in Section 4.2 under the implemented restrictions (i)-(iii), one would intuitively expect that there is a value of this percentage such that the number of simulated positive cases matches the real positive cases also after March 23, 2020, until today. Figure 11 confirms this intuition, showing that the closest match occurs for . However, maintaining business closures and social distancing measures for an extended period of time may have undesirable economic and social consequences. Using our model simulations, we now attempt to answer several questions. How long should the current measures continue to be in place to halt the spread of COVID-19? Can less restrictive, time-varying measures that only apply to parts of the population succeed at halting the spread?
Projections of current restrictions for various periods starting April 12, 2020. Figure 13 illustrates four scenarios, where the above-mentioned restrictions, which match the course of the epidemic in Tübingen until April 12, are extended for one, two, four, or eight weeks into the future. We observe that the measures in place up until April 12 were effective in slowing down the case counts to a point where few new true infections occurred in the population in the immediately subsequent weeks. Notably, when extending the measures currently in place for only one or two weeks, a significant resurgence in infections starts to occur approximately one month after lifting the measures. In this scenario, infection counts would start growing again at the beginning of May, reaching peak infection counts in June 2020. Most importantly, we also note that when extending the current restrictions for at least one or two additional months, another surge in infections is unlikely to be observed across simulations. That being said, one should carefully consider the possibility that new imported cases from other locations, which are not accounted for by the specific instance of the model we used, would initiate a resurgence in infections. In the following sections, we investigate to what extent two different social distancing strategies that only apply to parts of the population can achieve the same or similar effects as those that apply to everyone.
Social distancing of the elderly population. Since people older than 60 years of age are expected to suffer more complications from COVID-19, it has been suggested that only the people who are most endangered should be isolated, allowing the rest of the population to slowly develop herd immunity. Motivated by this suggestion, we only implement social distancing measures for people in the age groups older than 60 years, and lift all restrictions previously found to have matched the observed case development between March 23 and April 12, 2020, as used, for example, in the above experiments and Figures 11 and 13. More specifically, we enforce that the check-in activity at sites of the elderly population is reduced by % for a period of either one or two months. The results are depicted in Figure 16 and compared to a set of baseline simulation results without restrictions.
We observe that this concept of containment can be very effective in avoiding a large number of hospitalizations and fatalities across the entire population, in particular when the social distancing measures for older citizens are enforced for at least eight weeks. In this simulated setting, the peak number of hospitalizations was approximately halved and fatalities were reduced by at least two-thirds across the entire population, showing the promise of the strategy (Figure 16(a)). In addition, we find that the number of overall infections does not change significantly under these strategies (Figure 16(b)). This implies that herd immunity could be developed at a pace that is qualitatively comparable to an uncontrolled pandemic. At the same time, the peak and overall burden on the health care system are significantly lower, making social distancing for the elderly an effective mitigation strategy.
Alternating curfews for random subgroups. Rather than implementing measures for specific demographic groups, we analyze an alternative approach of dividing the population into equal subgroups with the goal of reducing both intra- and inter-group exposure events. Specifically, we consider a setting where the population is randomly split into $K$ disjoint subsets. For instance, for $K = 2$, the population could be divided into groups of even and odd birth dates. Every day, the containment strategy prescribes curfews to $K - 1$ of the random groups and only one group is allowed to follow its usual daily activities. After $K$ days, every group will have been allowed to follow a regular schedule exactly once.
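The rotation described above can be sketched in a few lines. This is a minimal sketch under our own assumptions: the function names and the round-robin rule `day % k` are hypothetical and are not taken from the released implementation.

```python
import random


def assign_groups(n_people, k, seed=0):
    """Randomly split n_people into k disjoint groups of (nearly) equal size."""
    rng = random.Random(seed)
    order = list(range(n_people))
    rng.shuffle(order)
    groups = [0] * n_people
    for rank, person in enumerate(order):
        groups[person] = rank % k
    return groups


def free_group(day, k):
    """Index of the single group that follows its usual schedule on a given day."""
    return day % k


def under_curfew(person, day, groups, k):
    """True if the person's group is one of the k - 1 groups under curfew that day."""
    return groups[person] != free_group(day, k)
```

With this rule, each individual follows a regular schedule on exactly one day out of every `k` consecutive days.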
Our findings are summarized in Figure 17. The scale is chosen to highlight a resurgence in observed daily infected individuals that occurs when the population is split into only two random groups. Conversely, we note that for the settings of three and four disjoint groups, the goals of significantly reducing both the intra-group contact exposures between individuals at sites and the inter-group exposures were achieved. We do not observe a resurgence in infection counts over a twelve-week window of simulations in these settings.
While Section 4.3 was used to analyze a variety of social distancing strategies for subgroups of people, we will now explore the use of contact tracing to facilitate even more granular social distancing. More specifically, starting the simulation on April 12, 2020 from the average population states observed in simulations of Figure 11, we lift all business restrictions and isolate individuals who have tested positive, as done in all experiments of this work. In addition, we then also isolate individuals who have been in contact with positively tested individuals and have been selected according to the policies implemented by two different contact tracing strategies. Refer to Section 2.3 and Equation 8 for the definitions of basic contact tracing and advanced contact tracing that we use in these experiments.
In both settings, for each positively tested individual in the population, 10 or 25 individuals get selected for isolation according to the respective tracing policy and are subsequently kept in isolation for seven days. The window considered for contacts between individuals and the positively tested was three days. The policy was implemented over the entire twelve weeks of simulation. We analyze these contact tracing concepts independently of the aggregate restriction measures in Section 4.3 to avoid any obfuscation in estimating the true effect of this approach.
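The mechanics of this setup (a three-day contact window, a cap of 10 or 25 isolated contacts, and seven days of isolation) can be sketched as follows. The sketch does not reproduce the basic or advanced tracing policies of Section 2.3: as a stand-in, it ranks contacts by their cumulative contact time within the window, which is purely our assumption, and all names are hypothetical.

```python
from collections import defaultdict

DAY = 24.0  # hours


def select_for_isolation(contacts, positive_id, t_test, n_isolate,
                         window=3 * DAY, isolation=7 * DAY):
    """Pick up to n_isolate contacts of a positively tested individual.

    contacts: iterable of (person_a, person_b, t_start, t_end), times in hours.
    Returns {person: (isolation_start, isolation_end)}.
    """
    score = defaultdict(float)
    for a, b, t_start, t_end in contacts:
        if positive_id not in (a, b) or a == b:
            continue
        other = b if a == positive_id else a
        # Time in contact that falls inside [t_test - window, t_test].
        overlap = min(t_end, t_test) - max(t_start, t_test - window)
        if overlap > 0:
            score[other] += overlap
    # Stand-in ranking (assumption): longest cumulative contact first.
    ranked = sorted(score, key=lambda person: (-score[person], person))
    return {person: (t_test, t_test + isolation) for person in ranked[:n_isolate]}
```

Swapping the ranking key for a different score is the only change needed to compare selection policies under the same window and isolation period.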
Figure (a)a summarizes the results for simulations that implement basic or advanced contact tracing. When comparing the panels on the left and right with 10 and 25 isolated individuals per identified positive case, respectively, we observe that:
Both basic and advanced contact tracing show a significant reduction in total infected individuals compared to the baseline.
Contact tracing is effective even when the number of isolated contacts is low. Refer to the left panel of Figure (a)a, which depicts the scenario in which only 10 individuals with exposure contact are isolated.
Advanced contact tracing based on the empirical exposure probability of individuals is particularly effective when the number of isolated people is low, indicating that greedy allocation can be favorable. (This observation can guide the assignment of tests to individuals based on the advanced contact tracing strategy.)
Effect of compliance on the efficacy of contact tracing. We conclude by investigating to what extent the use of contact tracing for social distancing is robust to the individuals’ compliance with the use of the prescribed contact tracing technologies. We simulate various scenarios assuming that a given percentage of the population elects not to comply with the implemented contact tracing system, so that these individuals cannot be isolated using contact tracing and their contacts cannot be traced if they test positive. The results in Figure (b)b show that a high level of compliance is necessary for contact tracing to remain an effective tool, given that no other measures are implemented during the time of simulation. That being said, the considerable flattening of the curve in a nearly uncontrolled course of the pandemic by means of contact tracing and isolation alone is an impressive proof of concept for the need for individual tracing technologies. Finally, when combined with aggregate social distancing and business restrictions, we expect contact tracing to help reduce the total number of infections even with low levels of compliance.
Due to the time-sensitive nature of COVID-19 research, we have decided to downsample the population, sites, and the real case numbers of Tübingen in our experiments. However, we believe it will be possible to reproduce our results without downsampling as well as to apply our framework to other towns or cities, given enough computational resources. To enable this, our open-source implementation includes scripts to automate the generation of the data that our modeling framework needs from external data sources.
Apart from a few notable exceptions such as South Korea and Singapore, contact tracing technology is not yet widely deployed. Therefore, we have used high-resolution population density data, site locations, and some assumptions on the weekly frequency of check-ins by individuals from different age groups to generate realistic individual mobility traces. As a result, we have decided not to showcase the use of our framework to identify sites, or more generally specific urban areas, with higher risk of infection. However, once contact tracing technology becomes more popular, we believe it will be possible to use our framework to identify areas with higher risk of infection in real time.
Beyond legal compliance and gaining societal acceptance, the use of epidemic models with high spatiotemporal resolution such as ours, as well as contact tracing technology, should respect each individual’s privacy. In this context, it is important to highlight that, both during inference and contact tracing, we only need to compute the duration of contact that each individual had with an infected person. The identity of the infected person is not required. As a result, there are reasons to believe that such computations can be made in a decentralized and privacy-preserving manner [20, 28].
Finally, the predictions made by our model should only be interpreted with the high variance observed across random realizations in mind. More specifically, we advise against the implementation of testing & tracing strategies, social distancing measures, or business closures based solely on the predictions made by our model.
Motivated by the rapid development of contact tracing technology and the current COVID-19 outbreak, we have introduced a spatiotemporal epidemic model that uses marked temporal point processes to represent individual mobility patterns and the course of the disease for each individual in a population. Moreover, through a detailed use case using real COVID-19 data and mobility patterns of Tübingen, Germany, we have demonstrated that our model can be used to predict the spread of COVID-19 under a variety of testing & tracing strategies, social distancing measures and business restrictions at an unprecedented spatiotemporal resolution.
We hope that the release of an easy-to-use open-source implementation of our modeling framework will further facilitate research and informed policy making in the context of the current COVID-19 outbreak and help prevent the emergence of pandemics in the future. Although our model has significantly greater spatial resolution than many of those currently in use, we recommend exercising caution when interpreting or using its results, and we emphasize the high variance observed across random realizations.
The authors would like to thank the Robert Koch Institute, OpenStreetMap and Facebook for providing data to make this work possible. The work presented in this paper was supported in part by the Swiss National Science Foundation under grant number 200021-182407.
A tutorial on Bayesian optimization of expensive cost functions, with application to active user modeling and hierarchical reinforcement learning.
Practical Bayesian optimization of machine learning algorithms. In Advances in Neural Information Processing Systems, pp. 2951–2959.
For each pair of individuals, let an indicator take the value 1 if the two individuals are in contact (i.e., physically close to each other) at a given time and 0 otherwise. Then, we characterize the value of these states using the following stochastic differential equation (SDE) with jumps:
where and are counting processes indicating when individuals and start and stop being close to each other, respectively. Moreover, we define the intensity (or rates) of these counting processes as follows:
where is the intensity (or rate) at which individuals and meet each other and is the average duration of a meeting.
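To make the contact process concrete, the following minimal sketch simulates the on/off contact indicator for a single pair of individuals. It assumes a constant meeting rate and exponentially distributed meeting durations with the stated average; the names `meet_rate`, `mean_duration` and both function names are hypothetical.

```python
import random


def simulate_contacts(meet_rate, mean_duration, t_max, seed=0):
    """Simulate contact intervals for one pair of individuals on [0, t_max].

    While the pair is not in contact, a meeting starts at rate meet_rate;
    each meeting then lasts an exponential time with mean mean_duration.
    Returns a list of (start, end) intervals during which the indicator is 1.
    """
    rng = random.Random(seed)
    t, intervals = 0.0, []
    while True:
        t += rng.expovariate(meet_rate)  # waiting time until the pair meets
        if t >= t_max:
            break
        end = min(t + rng.expovariate(1.0 / mean_duration), t_max)
        intervals.append((t, end))
        t = end  # the next meeting can only start after this one ends
    return intervals


def in_contact(intervals, t):
    """Value of the contact indicator at time t (True means in contact)."""
    return any(start <= t < end for start, end in intervals)
```

Each start time corresponds to a jump of the first counting process and each end time to a jump of the second, so the indicator alternates between 0 and 1.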
We define the conditional intensity function of the counting process as follows:
is the transmission rate due to a direct contact with a pre-symptomatic (or a symptomatic) infected individual; and,
is the relative difference in transmission rate between asymptomatic and (pre)symptomatic individuals
In the above, note that the infection intensity of each individual only depends on the individual’s contacts, not others’ contacts. Moreover, similarly as in the case of location-based technologies, one could characterize external sources of case data by adding an additional baseline rate to the intensity .
The algorithms presented in this section simulate the state of each individual in the population over a time window of interest under a given testing and tracing strategy, social distancing measures and business restrictions. As introduced in the main body, given the location patterns of each individual, the challenge of simulating realizations of our model is the generation of valid samples from the exposure processes. Once an individual becomes infectious, their change in state affects the rates of exposure of other individuals who will have contact with them in the future. This can invalidate exposure times that were previously sampled for those individuals.
Algorithms 1 and 2 implement the principles of superposition and thinning by using a priority queue of temporal events for all individuals. For simplicity, we omit the details about the function Interventions, which applies thinning due to social distancing and business restrictions, and point the reader to our publicly available implementation for details (https://github.com/covid19-model).
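The two principles can be illustrated with a generic sketch. It is not a transcription of Algorithms 1 and 2: it samples the first exposure time of each individual by thinning a dominating homogeneous Poisson process, then merges all individuals' events in one priority queue. It does not resample exposure times when other individuals change state, and all names (`sample_by_thinning`, `first_exposures`, `rate_max`) are hypothetical.

```python
import heapq
import random


def sample_by_thinning(intensity, rate_max, t_start, t_end, rng):
    """First event time in [t_start, t_end) of a process with rate intensity(t).

    Requires intensity(t) <= rate_max for all t. Returns None if no event occurs.
    """
    t = t_start
    while True:
        t += rng.expovariate(rate_max)  # candidate from the dominating process
        if t >= t_end:
            return None
        # Accept the candidate with probability intensity(t) / rate_max.
        if rng.random() * rate_max < intensity(t):
            return t


def first_exposures(intensities, rate_max, t_end, seed=0):
    """Merge the first exposure events of all individuals in time order."""
    rng = random.Random(seed)
    queue = []
    for person, intensity in intensities.items():
        t = sample_by_thinning(intensity, rate_max, 0.0, t_end, rng)
        if t is not None:
            heapq.heappush(queue, (t, person))  # superposition of all event streams
    ordered = []
    while queue:
        ordered.append(heapq.heappop(queue))  # earliest event first
    return ordered
```

In a full simulator, popping an infection event would update the exposure rates of that individual's future contacts and push freshly sampled events for them back onto the queue.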