Bayesian uncertainty quantification for epidemic spread on networks

10/22/2017 ∙ by Zhizhong Chen, et al. ∙ 0

While there exist a number of mathematical approaches to modeling the spread of disease on a network, analyzing such systems in the presence of uncertainty introduces significant complexity. In scenarios where system parameters must be inferred from limited observations, general approaches to uncertainty quantification can generate approximate distributions of the unknown parameters, but these methods often become computationally expensive if the underlying disease model is complex. In this paper, we apply the recent massively parallelizable Bayesian uncertainty quantification framework Π4U to a model of a disease spreading on a network of communities, showing that the method can accurately and tractably recover system parameters and select optimal models in this setting.



There are no comments yet.


page 8

page 9

page 10

page 12

page 15

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

A variety of mathematical approaches have been used to model the spread and virulence of a disease on a network, especially in the deterministic setting. Steady-state and bifurcation analysis has been done on the man-environment-man and SIR models [4, 12], while other work has used Lyapunov functions to determine endemic equilibria for SIRS and SEIR models [14, 10] or has considered a mean-field approach [2, 9, 16, 18]. Some work has additionally focused on how the network structure influences the spread of disease via the initial conditions and network topologies [6, 7] or on how epidemics spread on random networks [13]. Analytical results have also been obtained for the case of two competing (or promoting) diseases on a network [12, 15]. Moreover, many of the models in question have been used to design intervention policies or allocate vaccines via optimal control [10, 8] or randomized interventions [17].

Despite the wide range of applications and results, most of these models are not readily applied to scenarios which incorporate uncertainty in the system parameters or in the model output: propagating uncertain parameters through the model (or performing inference in the presence of noisy outputs) requires a more robust probabilistic treatment. While there exist approaches to uncertainty quantification (UQ) in a general setting, the complexity of these disease models (especially for large networks with intricate interdependencies) may prove these approaches infeasible. Here, we consider a recent framework for Bayesian uncertainty quantification which offsets this computational complexity through its massive parallelizability.

The specific system which will be considered is the SIR model on a graph (see Section 2), an extension of the classic SIR model which is widely studied due to its simplicity and predictiveness for several common diseases, in the case when some (or all) system parameters are unknown. Since the considered networks and corresponding models are quite complex (50 distinct communities), traditional Bayesian UQ approaches are computationally expensive. Here, we use the highly parallelizable framework 4U to approximately sample from the posterior distributions of the unknown parameters given a small set of noisy model outputs [1, 11]

. The parallelizability derives from the sampling method used, known as transitional Markov chain Monte Carlo (TMCMC), which allows many instances of the model to be run in parallel; TMCMC and its role in the

4U method are detailed in Section 3.

In Section 4, the parallelized TMCMC approach is applied to three example community networks with fixed system parameters. Using only a small set of noisy outputs generated by the models, the method is able to recover approximate distributions of the parameters which both correspond to the reference values and capture elements of the system dynamics. The 4U framework is also convenient for model selection; we additionally consider the problem of discerning the origin of an epidemic by identifying each possible starting location as a unique model. In both scenarios, the 4U approach is effective at providing accurate and computationally feasible results.

2 SIR Model

There are a number of approaches to mathematically modeling the spread of a disease through a population; here, we consider the Susceptible-Infective-Removed (SIR) model, an ordinary differential equation model which approximates the dynamics of an epidemic over continuous time. The SIR model decomposes the population into three eponymous groups: hosts who are

susceptible to the disease, hosts who are infected and contagious (the infective group), and hosts who neither susceptible nor infected (the removed group), either via gained immunity from recovery or due to a vaccine, quarantine policies, or disease-related death.

2.1 Single Population Model

Let , , and denote the size of the susceptible, infective, and removed groups, respectively, as a function of a continuous time . The SIR model makes three main assumptions. First, since the timescale on which the disease evolves is assumed to be much shorter than the scale on which the population may evolve via, e.g., births or natural deaths, the population is assumed constant, and so

for all . (Note that individuals killed by the disease are considered part of the removed group.) Second, members of the population are assumed to come into contact uniformly at random and at a constant rate – this parameter governs the rate at which an infection can spread. Finally, the infective population recovers (or is otherwise removed from the infectives via, e.g., death) at a constant rate . The general flow can thus be visualized as

yielding the following set of ordinary differential equations:


Namely, at a particular time , susceptibles and infectives come into contact at a rate , yielding transitions from susceptible to infective (implicitly assuming that contact with an infective immediately infects a susceptible – if this assumption is not desired, the chance of disease transfer can be incorporated in ). Meanwhile, infectives are removed at a rate , yielding transitions from infective to removed.

2.2 Epidemic Model on Graphs

The SIR model is easily generalized to a directed graph with vertices. Namely, let each node be a distinct population whose dynamics evolve according to (1); the directed edges are a convenient framework to dictate transfer between populations. Since each population itself has three groups (susceptible, infective, removed), three quantities are needed to describe movement. Here, we use , , and to describe the rate of movement from node to node on the susceptible, infective, and removed groups, respectively; identifying each transition rate as the weight of edge connecting to , these rates are naturally written as weighted adjacency matrices, here denoted , , and . The SIR model on a network, now a system of models corresponding to each population , can then be written as

or more succinctly in matrix form:



is a vector of ones which simplifies the notation.) It should be emphasized that

, , and are 1 vectors whose element corresponds to the population. Note that if for all , i.e., there is no movement between populations, each model reduces to the single population model (1).

3 Bayesian Uncertainty Quantification

Here, we will look at the epidemic network model of Section 2.2

in two probabilistic contexts: parameter estimation, i.e., the estimation of parameter distributions from noisy data, and model selection, i.e., the comparison of distinct models using noisy data. Both contexts require an explicit model of observational noise; we assume that the observed data

are related to the output of model and parameter set by the model prediction equation


where is a deterministic function from (the output of the epidemic network model) and

is a prediction error. The posterior distribution of the parameters given the data is then given by Bayes’ Theorem as


in terms of the prior , likelihood , and evidence of the model class, given by the multi-dimensional integral

In the context of model selection, the model is one of many models in a parametrized class

. The probability that the observed data were generated by a particular model

is also given by Bayes’ Theorem:


In particular, under the assumption of a uniform prior on models, is directly proportional to the evidence , and so model selection is “free” when the evidence is already calculated for parameter estimation [3, 19].

In order to calculate the likelihood needed for (4), we make the additional simplifying assumption that the model error

is normally distributed with zero mean and covariance matrix

, which may include additional unknown parameters. Since the model outputs are deterministic, it follows that is also normally distributed. The likelihood of the observed data is thus given by


is the weighted measure of fit between the model predictions and the measured data, denotes determinant, and the parameter set is augmented to include parameters that are involved in the structure of the covariance matrix .

3.1 Transitional Markov Chain Monte Carlo

The main computational barrier in calculating the posterior distribution of parameters given by (4) is the complex forward problem (the epidemic network model) which appears in the fitness . The 4U approach has two advantages in this respect: first, it approximately samples the posterior via transitional Markov chain Monte Carlo (TMCMC), which is massively parallelizable, and second, it leverages an efficient parallel architecture for task sharing described in Appendix Appendix A: High-performance implementations.

The TMCMC algorithm used by 4U functions by transitioning to the target distribution (the posterior ) from the prior . To accomplish this, a series of intermediate distribution are constructed iteratively:

The explicit algorithm is summarized below in Algorithm 1. It begins by taking samples from the prior distribution . For each stage of the algorithm, the current samples are used to compute the plausibility weights as

Recent literature suggests that , which determines how smoothly the intermediate distributions transition to the posterior, should be taken to make the covariance of the plausibility weights at stage smaller than a tolerance covariance value, often 1.0 [5, 11].

1:procedure TMCMC Ref. [11]
2:     BEGIN, SET
3:     Generate from prior and compute likelihood for each sample.
4:     loop:
5:     WHILE DO:
6:     Analyze samples to determine , weights , covariance , and estimator of .
7:     Resample based on samples available in stage in order to generate samples for stage and compute likelihood for each.
8:     if  then
9:          BREAK,
10:     else
12:          goto loop.      
13:     end
14:     END
Algorithm 1 TMCMC

Next, the algorithm calculates the average of the plausibility weights, the normalized plausibility weights , and the scaled covariance of the samples , which is used to produce the next generation of samples :

is calculated using the sample mean of the samples and a scaling factor , usually [5, 11].

The algorithm then generates samples by randomly selecting from the previous generations of samples such that with probability . These samples are selected independently at random, so any parameter can be selected multiple times – call the number of times is selected. Each unique sample is used as the starting point of an independent Markov chain of length generated using the Metropolis algorithm with target distribution and a Gaussian proposal distribution with covariance centered at the current value.

Finally, the samples are generated for the Markov chains, with samples are drawn from the chain starting at , yielding total samples. Then the algorithm either moves forward to generation or terminates if .

4 Results

Here, we combine the methodologies of the previous two sections by applying 4U to a selection of network SIR models. The deterministic forward model, given by (2) and solved with a 4-stage Runge-Kutta method with time step , plays the role of in the model prediction equation (3), with an output space of dimension , i.e., three sizes (of the susceptible, infective, recovered populations) for each of nodes. Each output is corrupted by Gaussian noise as

where is the population data in the position of the vector,

is a zero-mean, unit-variance Gaussian variable, and

is the level of the noise. Namely, we are assuming that the model prediction error covariance is a diagonal matrix whose nonzero entries all have the same magnitude . In order that the signal-to-noise ratio be high enough for meaningful estimation, we choose to be a fraction (or sometimes

) of the standard deviation

of all model outputs.

In the following results, we use 4U to estimate parameters (via the generation of samples from the posterior) for the SIR model on three networks: a simple 5-node tree, a two-community network modeled by two complete 20-node graphs connected by a single edge, and a loosely-connected three community network. For ease of comparison, numerical results will be presented in terms of the rescaled parameters , given by , i.e., the ratio between the estimated value and the true value. Accurate estimation will thus result in scaled parameters close to 1. The prior is assumed uniform on in the scaled parameter space.

4.1 The 5-Node Tree

The first network examined is a loosely connected community, modeled with the 5-node tree shown in Figure 1.





Figure 1: 5-node tree.

300 individuals are distributed across the network according to the following initial vectors:

Namely, the infection begins at node 1, where 100 of 110 individuals are infected. Individuals move between the nodes according to the weighted adjacency matrices , , and , describing the transition rates between nodes of the susceptible, infective, and removed populations, respectively:

The disease itself has infection rate and recovery rate . The observed data are the predictions of the nominal model at time corrupted by 1% Gaussian noise, i.e., .

Figure 2: Parameter estimation results for infection rate , recovery rate , time since infection , and noise level . Histograms for each parameter are displayed along the main diagonal of the figure. Sub-figures below the diagonal show the marginal joint density functions for each pair of parameters, while sub-figures above the diagonal show the samples used in the final stage of TMCMC. Colors correspond to probabilities, with yellow likely and blue unlikely.

The results are displayed in Figure 2, which shows strong pairwise correlations between the infection rate , the recovery rate , and the time since infection . and are positively correlated, i.e., similar outputs can be achieved by simultaneously raising both the infection rate and the recovery rate. Intuitively, a faster-spreading disease must be counteracted by quicker recovery in order for the dynamics to remain consistent. Similarly, both and are negatively correlated with ; a more infectious disease or quicker recovery would increase the speed of the system dynamics, meaning similar outputs would be observed earlier.

The recovered mean parameter values of were , respectively. To quantify the degree of uncertainty for each parameter’s posterior distribution, we compute the ratio of its standard deviation to its mean (denoting the results , , , and ); here, the system parameters , , and were recovered with high certainty, with (, , , and ) = (, , , , respectively. The true value of each system parameter was within one standard deviation of the estimated mean.

4.2 Two Complete Graphs, Loosely Connected

The second network studied is a barbell graph – two complete 20 node graphs connected by a single edge, illustrated in Figure 3. This models two populations that each have many highly interacting sub-communities; however, the two populations only mix via a single route, modeled by the one connecting edge. In this case, we impose uniform transition rates between adjacent vertices of 0.02, 0.3, and 0.05 for the susceptible, infective, and recovered populations, respectively. The infection begins at node 8 with the configuration , , , and all other nodes are fully susceptible with configuration , , .

Figure 3: Double complete graph that is loosely connected.

Unlike the previous section, wherein the observable data included noisy information about the population at every node, we will consider here the case of having information only from a limited subset of nodes; by placing the “sensors” at different locations (i.e., observing different subsets of nodes), we can test how the sensor configuration influences the parameter estimation procedure and corresponding uncertainties.

We consider three sensor configurations: the first experiment places two sensors at nodes 3 and 12, which are members of the same complete subgraph as node 8, the origin of the epidemic (see Figure 3). The second experiment gathers data at the bridge between the two complete graphs by placing sensors at nodes 20 and 21. Finally, the third experiment focuses on nodes 24 and 27, which are part of the initially healthy complete subgraph. As in Section 4.1, observable population data is noisy, with the true values corrupted by 1% Gaussian noise to approximate observation error. The same parameters , , and are used, but these experiments will additionally consider the cases of and .

(a) Nodes 3 and 12,
(b) Nodes 3 and 12,
(c) Nodes 20 and 21,
(d) Nodes 20 and 21,
(e) Nodes 24 and 27,
(f) Nodes 24 and 27,
Figure 4: Parameter estimation results for the barbell graph with infection rate , recovery rate , time since infection , and noise level . In each experiment, noisy data from two nodes were used to track the epidemic. For each pair of nodes, the experiment was run for (left), (right), and (shown in Figure 6). See Figure 2 for description of subfigures.

The results are summarized in Table 1 and displayed in Figures (a)a, (c)c, and (e)e. In all three cases, the true values for , , and are within one standard deviation of the posterior mean values. Furthermore, there exists a strong positive correlation between and , as well as a negative correlation between and both and , as in the case of the 5-node tree (Figure 2).

Node Pair (%) (%) (%) (%)
3 and 12 1.0659 14.99 1.0419 16.96 0.9773 13.48 2.5581 67.96
20 and 21 1.0263 6.31 1.0229 5.40 0.9846 5.09 1.2426 78.97
24 and 27 1.0372 13.61 1.0338 12.93 0.9833 9.36 1.6122 81.42
Table 1: Results from parameter estimation on the double complete graph at time . Reported values are the scaled parameters, e.g., , and not the parameters themselves, e.g., (see Section 4 intro).

The recovered noise standard deviation was found to have comparatively large uncertainty compared to the system parameters , , and . In particular, the uncertainty of when observing nodes 3 and 12 was much larger than in the other two experiments (see and in Table 1). Despite this comparatively large uncertainty, the true noise value was recovered to within one standard deviation in all three experiments.

Figure 5 shows the deterministic populations of the susceptible, infected, and recovered groups as a function of time for a selection of nodes involved in the experiments (since, e.g., node 12 is identical to node 3, only one is shown). Nodes 21 and 27, both contained in the initially susceptible subgraph, reach peak infective population around time . Nodes on the side of the infection origin, conversely, achieve peak infective population at . The results of Table 1, which used the true observation time , suggest that observing nodes around the time when the infective population peaks improves the accuracy of the recovered parameters. This effect can also be seen in Figure 4 in the joint marginals of the system parameters , , and : nodes 3 and 12, which peak at , have much more smeared marginals in Figure (a)a, which was sampled at the true time , than in Figure (b)b, sampled at ; meanwhile nodes 24 and 27, which peak at , have sharper distributions in Figure (e)e, with observations taken at , than in Figure (f)f, which has observations at . The distributions for nodes 20 and 21, shown in Figures (c)c and (d)d, follow well-defined trend curves in both cases, suggesting that placing one sensor in each subgraph leverages information from both sides.

(a) Susceptible Population
(b) Infective Population
(c) Removed Population
(d) Total Population
Figure 5: Time evolution of the susceptible, infective, recovered, and total populations at nodes 3, 20, 21, and 27. Nodes from different complete subgraphs have different trends and peak times.
(a) Nodes 3 and 12,
(b) Nodes 3 and 12, ,
(c) Nodes 20 and 21,
(d) Nodes 20 and 21, ,
(e) Nodes 24 and 27,
(f) Nodes 24 and 27, ,
Figure 6: Parameter estimation results for the barbell graph with infection rate , recovery rate , time since infection , and noise level at time (left) and with increased noise level at time (right). See Figure 2 for description of subfigures.

The parameter estimation results at , shown in the left column of Figure 6

, corroborate this conclusion. Though all nodes in the graph are well past peak infective population at this time, using information on two different timescales (the two subgraphs) yields much sharper marginals (see, e.g., the joint distribution of

and in Figure (c)c as compared to Figures (a)a and (e)e).

Numerical values for the and appear in Tables 2 and 3. When , the true parameters are no longer within one standard deviation of the recovered mean parameters from observations at nodes 24 and 27 (which have peak infective population at ). At the later time , nodes 3 and 12 fail to recover and , while nodes 24 and 27 fail to recover to within one standard deviation. The experiment with sensors at nodes 20 and 21 was accurate to within one standard deviation for all system parameters at all three sampled times.

Node Pair (%) (%) (%) (%)
3 and 12 1.1251 20.77 1.1358 21.76 0.9312 18.92 2.8916 57.95
20 and 21 1.0107 7.14 1.0157 6.94 1.0001 5.62 1.6263 72.04
24 and 27 1.3396 18.23 1.4500 18.21 0.8273 15.39 1.1248 47.93
Table 2: Results from parameter estimation on the double complete graph at time .
Node Pair (%) (%) (%) (%)
3 and 12 1.4207 14.07 0.7911 19.50 1.1088 10.09 1.3572 64.41
20 and 21 1.0966 19.19 1.0674 12.34 0.9506 12.18 2.3077 58.34
24 and 27 1.2645 20.77 1.1391 15.39 0.8675 14.25 3.1236 55.38
Table 3: Results from parameter estimation on the double complete graph at time .

To test the robustness of the parameter estimation to observation errors, a final set of experiments was run using time and an increased noise level , five times the previously-used . The results appear in the right column of Figure 6 and in Table 4. Again, the experiments using simulated data from nodes 20 and 21 (Figure (d)d) and nodes 24 and 27 (Figure (f)f), both of which contain nodes in the subgraph which peaks at , recover the parameters with comparatively lower uncertainty and greater accuracy. Compared to Table 1, all three experiments had significant increases in uncertainty for all parameters. Nonetheless, parameters were again recovered to within one standard deviation, and so we conclude that the Bayesian uncertainty quantification approach to SIR models on the double complete graph has significant robustness to observational noise.

Node Pair (%) (%) (%) (%)
3 and 12 1.2062 22.54 1.0189 28.87 0.9985 23.00 0.9900 36.09
20 and 21 1.0755 16.64 1.0687 13.98 0.9702 12.94 0.8687 43.74
24 and 27 1.2105 24.03 1.2163 22.88 0.8994 21.48 0.9522 38.23
Table 4: Results from parameter estimation on the double complete graph with increased noise level at time .

4.3 Three Group Network

The final network considered is a 44-node graph comprising three large sub-networks with limited interaction (Figure 7). Each sub-network has a distinct topological structure and set of nonuniform transition rates (explicit values appear in Appendix B). We again consider three sensor configurations: a 7-node set (nodes 1, 2, 3, 4, 20, 31, and 34), a 23-node set (nodes 1–7, 20–28, and 31–37), and a 35-node set (nodes 1–28 and 31–37), each in the presence of observational noise .

Figure 7: 44-node network. Group I (yellow) comprises nodes 1-19, Group II (purple) 21-30, and Group III (green) 31-44. Groups are sparsely connected.

4.3.1 Parameter Estimation

First, we attempt to recover , , and (true values , , , respectively) for a disease which starts at node 34 with , , . All other nodes are fully susceptible, i.e., , , for all .

(a) 7-node Subset, Estimating
(b) 7-node Subset, Estimating , , and
(c) 23-node Subset, Estimating
(d) 23-node Subset, Estimating , , and
(e) 35-node Subset, Estimating
(f) 35-node Subset, Estimating , , and
Figure 8: Three group parameter estimation results for estimating , , (reference values , , , respectively; left column) and , , , , and (reference values , , , , , respectively; right column). See Figure 2 for description of subfigures.
Data (%) (%) (%) (%)
7 nodes 1.0031 0.73 1.0155 0.84 1.0006 0.54 1.0460 17.25
23 nodes 1.0007 0.38 1.0010 0.34 1.0010 0.31 1.0200 6.73
35 nodes 1.0034 0.37 1.0012 0.36 0.9985 0.28 0.9612 6.86
Table 5: Numerical results for estimation of , , and for the three group network.

The results are shown in Figures (a)a, (c)c, and (e)e and Table 5. Sensors at the 7-node subset recover and

to within one standard deviation, while larger subsets recover all parameters to one standard deviation accuracy. Larger observational subsets have the additional effect of reducing uncertainty: the 7-node 95% confidence interval for the scaled infection rate

is [0.9885,1.0177], which narrows significantly to [0.9931, 1.0083] in the 23-node case. The sample stratification with respect to merits a brief remark: results were re-generated for a range of sample sizes between and , with the behavior persisting in each case.

A second experiment in this context augmented the parameter set with the initial population vector , , and of the initially infected node, but took as known the observation time . In order that the reference values of all parameters be positive, the initial population vector at node 34 was altered to , , . The scaled parameter set used a uniform prior on .

The results are shown in Figures (b)b, (d)d, and (f)f and appear numerically in Table 6. Compared to estimation of , , and , correlations between parameters are generally weaker in this context, although there do exist clear relationships (e.g., larger necessitates smaller for the infection to spread at the same absolute rate). While the 7-node sensor configuration recovers all parameters to within one standard deviation, the 27-node subset fails to recover , , and , while the 35-node subset fails to recover and ; however, in the latter two cases, the estimated mean is relatively accurate on an absolute scale, differing by less than from the true value in the 35-node case.

Data (%) (%) (%)
7 nodes 1.0009 0.50 1.0018 0.65 1.1139 15.76
23 nodes 0.9963 0.27 0.9987 0.32 0.9243 6.89
35 nodes 1.0018 0.16 1.0019 0.18 0.9896 5.36

Data (%) (%) (%)
7 nodes 1.3544 70.97 0.9749 6.64 1.3622 56.07
23 nodes 1.0965 59.62 1.0456 3.81 0.4575 94.16
35 nodes 1.1405 51.03 0.9822 2.15 1.2623 43.21
Table 6: Numerical results for estimation of , , , , and for the three group network.

4.3.2 Origin of Disease Identification

Finally, we use the model selection framework of Section 3 to attempt to identify the origin of the epidemic (recall that all observations are at the future time , and so the origin may not be clear even when included in the set of observed nodes). We initialize the disease at node 1 with the standard initial configuration , and , with all other nodes fully susceptible with , . Other parameters are identical to the previous section; as in the first set of results, , , and are not known explicitly and must be estimated from observations. Defining the model as the model under which the disease originated from node with the given initial vector, the log evidence for each model can be generated from (5).

Results are summarized in Table 7. Model , corresponding to the correct origin of the disease at node 1, was found to have a significantly larger log evidence than all other models considered. Models which placed the origin at increasingly distant points generated increasingly less accurate and more uncertain results; models and , which originate the disease in Group III, found the estimated noise to be two orders of magnitude larger than the reference value. If the correct model is not considered, the probabilities shift to for , for , and all other probabilities or smaller, suggesting that topographic proximity to the true origin is the dominant factor in the evidence.

Log Evidence (%) (%) (%)
1 -25.026643 1.0000e+00 1.0003 0.03 1.0008 0.10 0.9734 5.46
8 -440.618914 3.2402e-181 1.7582 1.65 1.0571 6.03 62.7082 5.15
14 -443.621677 1.6087e-182 0.9339 2.00 0.9297 6.15 63.2783 5.21
21 -490.980001 4.3554e-203 0.9209 3.84 0.9767 11.47 102.3819 5.73
32 -516.383502 4.0404e-214 6.7646 22.12 0.5087 11.64 134.5893 5.52
43 -515.936779 6.3159e-214 7.3225 17.19 0.5711 13.19 133.7376 5.61
Table 7: Subset of model selection results for the three-population network.

5 Discussion

In most cases considered, Bayesian uncertainty quantification via TMCMC effectively recovered SIR network model parameters such as the infection rate and recovery rate using only noisy observations from a limited set of nodes. When applied to small networks with simple topologies such as the 5-node tree of Section 4.1, the method was particularly effective, with system parameters recovered accurately (within one standard deviation of the reference value) and with little uncertainty (). The estimated noise level was comparatively less certain, partially because uncertainty in system parameters (giving rise to uncertainty in the system output) is easily conflated with observational noise.

The double complete graph, wherein only pairs of nodes were observed, provided some insight into the optimal placement of sensors in disease networks. Sensors which were close together, e.g., nodes 3 and 12 (whose connectivity was identical), produced similar noisy data, thereby affording less information about the underlying dynamics. In contrast, placing sensors on both sides of the graph to gain information about dynamics on different time scales yielded significantly less uncertainty. The 4U approach also proved robust to an increased noise level , though uncertainty in the recovered posterior means increased greatly.

Results for the three group network showed the applicability of the method to widely varying network topologies and transition rates. It additionally confirmed that increasing the number of sensors improves parameter estimation results: recovered values for the 35-node case had reduced uncertainty compared to the 7- and 23-node cases. The disease origin was also identified with near certainty via selection among models corresponding to potential starting points; as the quantities required for this procedure are intermediate values of the 4U approach, model selection in the Bayesian UQ setting is essentially free.

Since the 4U method does not makes reference to a particular model, it can function as a black-box method for highly-parallelized Bayesian uncertainty quantification in a number of other contexts previously considered computationally infeasible. It has been applied to molecular and structural dynamics models [1, 11] and ongoing work identifying aneurysms and stenoses in arterial networks. Future work will extend the framework to other epidemic and endemic models.


Part of this research was conducted using computational resources and services at the Center for Computation and Visualization, Brown University. CB was partially supported by the NSF through grant DMS-1148284. ZC, KL, and AM were partially supported by the NSF through grants DMS-1521266 and DMS-1552903.

Appendix A: High-performance implementations

[11] is a platform-agnostic task-based UQ framework that supports nested parallelism and automatic load balancing in large scale computing architectures. The software is open-source and includes HPC implementations for both multicore and GPU clusters of algorithms such as Transitional Markov Chain Monte Carlo (TMCMC) and Approximate Bayesian Computational Subset-simulation. The irregular, dynamic and multi-level task-based parallelism of the algorithms (Fig. 9, left) is expressed and fully exploited by means of the TORC runtime library [1]. TORC is a software library for programming and running unaltered task-parallel programs on both shared and distributed memory platforms. TORC orchestrates the scheduling of function evaluations on the cluster nodes (Fig. 9, right). The parallel framework includes multiple features, most prominently the inherent load balancing, fault-tolerance and high reusability. The TMCMC method within U is able to achieve an overall parallel efficiency of more than 90% on 1024 compute nodes of Piz Daint running hybrid MPI+GPU molecular simulation codes with highly variable time-to-solution between simulations with different interaction parameters.

Figure 9: Task graph of the TMCMC algorithm (left) and parallel architecture of the TORC library (right).

Appendix B: Transition Matrices for Three Group Network

To demonstrate the robustness of the method, each group has different transition rates between the subpopulations. The susceptible, infective, and removed populations have the same transition rate of 0.4 between Group I and Group II (corresponding to the edge connecting nodes 14 and nodes 21); the transition rates between Group II and Group III (connecting nodes 25 and 31 and nodes 26 and 31) are 0.2 for all populations. The rates between edges in Group I and in Group II are selected randomly from the sets and , respectively. Finally, Group III has uniform rates of 0.05.


  • [1] P. Angelikopoulos, C. Papadimitriou, and P. Koumoutsakos. Bayesian uncertainty quantification and propagation in molecular dynamics simulations: a high performance computing framework. J. Chem. Phys., 137:144103, 2012.
  • [2] Benjamin Armbruster and Ekkehard Beck. Elementary proof of convergence to the mean-field model for the sir process. J. Math. Biol. DOI: 10.1007/s00285-016-1086-1.
  • [3] J.L. Beck and K.V. Yuen. Model selection using response measurements: Bayesian probabilistic approach. J. Eng. Mech., 130:192–203, 2004.
  • [4] V. Capasso and R.E. Wilson. Analysis of a reaction-diffusion system modeling man–environment–man epidemics. SIAM J. Appl. Math., 57(2):327–346, 1997.
  • [5] J.Y. Ching and Y.C. Chen. Transitional markov chain monte carlo method for bayesian model updating, model class selection, and model averaging. J. Eng. Mech., 133:816–832, 2007.
  • [6] M. Draief, A. Ganesh, and L. Massoulié. Thresholds for virus spread on networks. Ann. Appl. Probab., 18(2):359–378, 2008.
  • [7] M. Draief and L. Massoulié. Epidemics and Rumours in Complex Networks. Cambridge University Press, 2010.
  • [8] E. Duijzer, W. van Jaarsveld, J. Wallinga, and R. Dekker. The most efficient critical vaccination coverage and its equivalence with maximizing the herd effect. Math Biosci., 282:68–81, 2016.
  • [9] S.C. Ferreira, C. Castellano, and R. Pastor-Satorras. Epidemic thresholds of the susceptible-infected-susceptible model on networks: A comparison of numerical and theoretical results. Phys. Rev. E, 86(4):041125, 2012.
  • [10] H. Guo, M.Y. Li, and Z. Shuai. A graph-theoretic approach to the method of global Lyapunov functions. Proc. Amer. Math. Soc., 136:2793–2802, 2008.
  • [11] P.E. Hadjidoukas, P. Angelikopoulos, C. Papadimitriou, and P. Koumoutsakos. 4U: A high performance computing framework for bayesian uncertainty quantification of complex models. J. Comput. Phys., 284:1–21, 2015.
  • [12] B. Karrer and M.E.J. Newman. Competing epidemics on complex networks. Phys. Rev. E, 84:036106, 2011.
  • [13] M.J Keeling and K.T.D. Eames. Networks and epidemic models. J. R. Soc. Interface, 2(4):295–307., 2005.
  • [14] A. Kumar, P.K. Srivastava, and Y. Takeuchi. Modeling the role of information and limited optimal treatment on disease prevalence. Journal of Theoretical Biology, 414:103–119, 2017.
  • [15] M.E.J. Newman and C.R. Ferrario. Interacting epidemics and coinfection on contact networks. PLoS ONE, 8(9):e71321, 2013.
  • [16] P. Shu, W. Wang, M. Tang, and Y. Do. Numerical identification of epidemic thresholds for susceptible-infected-recovered model on finite-size networks. Chaos, 25(6):063104, 2015.
  • [17] G. Tanaka, C. Urabe, and K. Aihara. Random and targeted interventions for epidemic control in metapopulation models. Scientific Reports, 4:5522, 2014.
  • [18] J. Truscott and N.M. Ferguson. Evaluating the adequacy of gravity models as a description of human mobility for epidemic modelling. PLoS Comput Biol, 8(10):e1002699, 2012.
  • [19] K.-V. Yuen. Bayesian Methods for Structural Dynamics and Civil Engineering, 1st edition. Wiley-Vch Verlag, 2010.