DeepAI

# Probabilistic predictions of SIS epidemics on networks based on population-level observations

We predict the future course of ongoing susceptible-infected-susceptible (SIS) epidemics on regular, Erdős-Rényi and Barabási-Albert networks. It is known that the contact network influences the spread of an epidemic within a population. Therefore, observations of an epidemic, in this case at the population-level, contain information about the underlying network. This information, in turn, is useful for predicting the future course of an ongoing epidemic. To exploit this in a prediction framework, the exact high-dimensional stochastic model of an SIS epidemic on a network is approximated by a lower-dimensional surrogate model. The surrogate model is based on a birth-and-death process; the effect of the underlying network is described by a parametric model for the birth rates. We demonstrate empirically that the surrogate model captures the intrinsic stochasticity of the epidemic once it reaches a point from which it will not die out. Bayesian parameter inference allows for uncertainty about the model parameters and the class of the underlying network to be incorporated directly into probabilistic predictions. An evaluation of a number of scenarios shows that in most cases the resulting prediction intervals adequately quantify the prediction uncertainty. As long as the population-level data is available over a long-enough period, even if not sampled frequently, the model leads to excellent predictions where the underlying network is correctly identified and prediction uncertainty mainly reflects the intrinsic stochasticity of the spreading epidemic. For predictions inferred from shorter observational periods, uncertainty about parameters and network class dominate prediction uncertainty. The proposed method relies on minimal data and is numerically efficient, which makes it attractive either as a standalone inference and prediction scheme or in conjunction with other methods.

• 2 publications
• 3 publications
• 3 publications
• 4 publications
• 3 publications
12/13/2021

### Limits of epidemic prediction using SIR models

The Susceptible-Infectious-Recovered (SIR) equations and their extension...
09/20/2022

### Seq2Seq Surrogates of Epidemic Models to Facilitate Bayesian Inference

Epidemic models are powerful tools in understanding infectious disease. ...
03/04/2015

### Quantifying Uncertainty in Stochastic Models with Parametric Variability

We present a method to quantify uncertainty in the predictions made by s...
02/15/2021

### A stochastic SIR model for the analysis of the COVID-19 Italian epidemic

We propose a stochastic SIR model, specified as a system of stochastic d...
03/04/2021

### Quantifying changes in the British cattle movement network

The Cattle Tracing System database is an online recording system for cat...
01/24/2022

### Uniformly Ergodic Data-Augmented MCMC for Fitting the General Stochastic Epidemic Model to Incidence Data

Stochastic epidemic models provide an interpretable probabilistic descri...
07/09/2020

### Fixed-time descriptive statistics underestimate extremes of epidemic curve ensembles

Across the world, scholars are racing to predict the spread of the novel...

## 1 Introduction

Mathematical models of the dynamics of directly transmitted infectious diseases can provide predictions about the future course of an ongoing epidemic and hence aid in decision making and epidemic control (e.g. siettos2013mathematical). Statistical time series methods are utilised predominantly for epidemic nowcasting, i.e., shortest-term predictions and current state assessment under not yet complete data (e.g. bastos2019modelling; mcgough2020nowcasting) and epidemic surveillance, i.e., early identification of emerging epidemics (unkel2012statistical). Mechanistic/state-space models, which are based on a mathematical description of the spreading process, allow one to make predictions about the future course of an epidemic (shaman2012forecasting; tizzoni2012real; nsoesie2013forecasting) as well as simulating intervention strategies (chao2011planning; di2021optimal; di2020impact; van2020covid). Many such models rely on a compartmentalisation of a population of individuals according to the individual’s disease status. In diseases for which there is no immunity upon recovery, each individual is either susceptible (S) or infected/infectious (I) at any given time.

One assumption common to many compartmental models is that of random mixing of individuals, or of/within subgroups of a population (e.g., kermack1927contribution; jacquez1993stochastic). While this assumption can be adequate in some instances (e.g., within households; goeyvaerts2018household), it is known that populations do not mix at random in general. Rather individuals have a finite set of contacts to whom they can pass on an infection. It is well-established that the contact network of a population significantly impacts epidemic dynamics (e.g., shirley2005impacts; yin2017impact). The importance of contact structure for epidemic dynamics has led to a close interaction between network science and mathematical epidemiology (keeling2005networks; danon2011networks; pastor2015epidemic; kiss2017mathematics) whereby the spread of an epidemic within a population is understood and modelled as a stochastic process on a network. In such a model, each individual in the population corresponds to a node in the network, and a contact that represents a potential route for disease transmission between two individuals is a link in the network.

Drawbacks of network epidemiological models typically include their high dimensionality and the inaccessibility of the exact contact network of a population. Consider a SIS-epidemic on an undirected, unweighted network with

nodes. At any given time, each node is either susceptible or infected/infectious. If the exact contact network is static and known, a complete description of the SIS dynamics is given by a continuous-time Markov-chain of dimension

(one equation for each possible network state; e.g., simon2011exact). Such a Markov-chain model is exact, but also high-dimensional even for modest values . Hence, the numerical integration of the system of equations becomes unfeasible for most real-life networks. Consequently, analytical results based on the exact system are mostly out of reach, and existing results typically rely on mean-field approximations (e.g., mata2013pair; cota2018robustness). Further, the exact contact network of a population is rarely accessible, but usually needs to be approximated either from limited observations and/or based on theoretical network models (e.g., della2020network; xue2020data).

In this study, we explore the suitability of a computationally inexpensive model to describe the stochastic process of an SIS epidemic spreading on Regular (Reg), Erdős–Rényi (ER) and Barábasi-Albert (BA) networks. The surrogate model utilised in this study was first introduced in di2020network and further expanded to include more network classes and consider the large limit in di2020pde. The core idea of the approach is a dimension reduction of the state space. In the surrogate model, the state of the epidemic at any given time is defined by the total number of infected nodes in the population. The effect of the contact structure on the spreading of the epidemic is accounted for by the model parameters. The continuous-time Markov-chain describing the SIS dynamics on the reduced state space takes the form of a Birth-and-Death (BD) process and is of dimension ; that is, it is linear in and thus feasible also for large . The parameters of the BD model correspond to recovery and infections rates. The recovery rate is network-independent and here assumed to be known. The rate at which new infections occur depends on the network, but for particular network classes it can be well described by a three-parameter model, reducing the number of free parameters of the BD model from to only three. di2020network utilised the finding that different types of networks are associated with distinct regions in the space spanned by the three parameters to infer the type of network from population-level observations. To solve this inverse problem di2020network

set up a Bayesian inference procedure and built network class specific prior distributions which then allow to identify the most likely network class from the posterior.

Here, we utilise the BD model to forecast the evolution of an on-going epidemic. We address the following questions:

• How well does the BD model capture the intrinsic stochasticity of an epidemic spreading on a network?

• How uncertain are model parameters when inferred from the kind of time-censored observations typically available in a realistic prediction scenario, and how does this uncertainty translate into prediction uncertainty?

• Can we use the BD model and Bayesian inference to provide epidemic forecasts with meaningful uncertainty information?

The manuscript is structured as follows: Section 2 introduces the BD model and Bayesian inference. We then outline the generation and evaluation of predictions using the BD model. We consider nine different combinations of networks and epidemic parameters: a small, a medium and a large epidemic on a network from each of the three aforementioned network classes (Section 3). An empirical validation of the BD model based on these nine cases is provided in Section 4.1. In Section 4.2, we evaluate the predictions obtained with the BD model for all nine cases. In particular, we study the sensitivity of network class and parameter inference on the number and timing of observations in realistic prediction scenarios and how uncertainty about network class and model parameters translates into prediction uncertainty. We conclude with a discussion including limitations of this work and future directions.

## 2 Methods

### 2.1 SIS epidemics on networks

We consider the standard SIS epidemic on a population of individuals whose contact structure is described by an undirected and unweighted network defined by its adjacency matrix with and if individuals (nodes) and are connected and otherwise. If two nodes and are connected, the disease can be transmitted from one to the other. In an SIS epidemic, each node is, at any given time, either susceptible (S) or infected/infectious (I). Thus, the epidemic state of the network at time

is described by a Boolean vector

with where if node is susceptible and if node is infected at time . Hence, there exists a total of distinct network states. The state of the network changes through two types of events: the recovery or the infection of a node. Infection and recovery are Markovian and act as homogeneous Poisson point processes with constant per-link infection rate and constant recovery rate . An infectious node can spread the infection only to neighbouring susceptible nodes. Infection dynamics thus depends on the network structure, while recovery is network-independent. A complete description of the SIS-dynamics on a given network corresponds to a Markov-chain over a state-space of dimension for which numerical integration becomes intractable even for modest values of . However, given network adjacency , epidemic parameters and , and initial conditions , realisations of the stochastic process can be readily obtained using the Gillespie algorithm (e.g., gillespie1977exact; kiss2017mathematics). This is computationally comparatively inexpensive and provides us with i.i.d. samples of the true stochastic process. Such samples serve as a reference for the validation of the BD model and for the evaluation of predictions. More precisely, we make use of the aggregated number of infected nodes in the network, i.e., , which describes the epidemic at population level.

### 2.2 BD model

Birth-and-death processes are intuitively linked to the population-level dynamics of SIS epidemics (e.g., ganesh2005effect; nagy2014approximate; devriendt2017unified). In this view, an increase in the number of infected individuals () corresponds to the ’birth of an infection’; a decrease () to the ’death of an infection’. Accordingly, the epidemic state is defined by the number of infected nodes in the network. The resulting model is, like the exact formulation (Sec. 2.1), a continuous-time Markov chain, but on a state space of dimension only.

The Kolmogorov (or Master) equation of a standard BD process is given by

 ∀k∈{0,…,N}, ˙pk(t)=ak−1 pk−1(t)−(ak+ck) pk(t)+ck+1 pk+1(t), (1)

where

denotes the probability of observing

infected nodes at time , and and denote population-level infection and recovery rate, respectively. We note that . The population-level recovery rate can be directly obtained from the node recovery rate as . The population-level infection rate

however depends on the number of links between susceptible and infected nodes (S-I links) present in the network in its current state and is thus a random variable depending on the precise network. Following

di2020network, is represented in the BD model by its expectation the time-averaged number of S-I links over the network states with infected nodes. di2020network further demonstrated that for Regular, Erdős–Rényi and Barabási-Albert networks can be well represented by a three-parameter model of the form

 ∀k∈{0,…,N}, ak(C,α,p)=C kp (N−k)p(α(k−N2)+N), (2)

with serving as a general scaling parameter, allowing to shift the peak of the curve with respect to and adjusting the flatness of the curve. The shape of the curves is network class-specific (Fig. 1). Whilst the peak is located near the centre () for Erdős–Rényi networks, it is shifted to the right for Regular networks, and to the left for Barabási-Albert networks. Accordingly, the -triplets for the different network types cluster in different regions in the three-dimensional parameter space. This observation is central to the network class inference of di2020network.

For a given , a given -triplet, and initial conditions if and otherwise, where denotes the number of infected nodes at , we can numerically integrate Eq. 1 to obtain predictions . In our experiments, we assume that the recovery rate is known. Initial conditions are provided by the last observation available. Thus, the remaining task is the inference of

from the available observational data, here, the number of infected nodes at a set of discrete times. We use Bayesian inference to estimate the posterior distribution over the parameters

given observations. The required priors were derived from extensive Gillespie simulations on different networks and with different epidemic parameters. The particular challenge for making predictions lies in the limited observational period available for inference in a realistic prediction scenario, in which observations exist only up to the current state of the epidemic.

#### 2.2.1 Bayesian inference

The detail of the inference framework can be found in di2020network. Here, we only recall the main ideas and steps of the inference procedure. We denote the population level observations by where with denotes the number of infected individuals at times . For brevity, we use to denote . We further denote the set of candidate network classes as .

In order to make predictions, we require the posterior over given the observations, i.e., , which we can write as

 π(u|y,s)=∑Θπθ(u|y,s) π(θ|y,s).

In di2020network, the goal was network class inference, i.e., obtaining the posterior over given observations , . To this end, di2020network generated network class specific priors . Precisely, they carried out a large number of Gillespie simulations during which they kept track of the number of infected nodes , the number of S-I links in the respective network states as well as the time spent in the various states. The parametric model from Eq. 2 was then fitted to the curves from the Gillespie simulations by a least-squares fit using a particle swarm algorithm (kennedy1995particle). The resulting

triplets were used to infer Gaussian kernel density estimators

(pedregosa2011scikit) for the priors . Assuming a non informative, uniform prior for network class , the prior distribution over and is given by

 π0(u,θ)=13π0,θ(u).

Employing Bayes’ rule we obtain the network class specific posterior(s) over the parameter space as

 πθ(u|y,s)∝Lu(y,s) π0,θ(u). (3)

and the posterior over the network classes as

 π(θ|y,s)=∫π(u,θ|y,s)du∝∫Lu(y,s)π0,θ(u)du,

where denotes the likelihood of the observations under the forward model from Eq. 1 which is given by

 Lu(y,s)=n−1∏i=1puki,ki+1(ti+1−ti). (4)

Following di2020network, the terms are computed using the algorithm from crawford2014estimation. The Python implementation routine estimating is available at https://github.com/BayIAnet/NetworkInferenceFromPopulationLevelData. To estimate , we draw samples from using the Metropolis–Hastings algorithm, making use of Eqs. 3 and 4.

#### 2.2.2 Prediction and uncertainty

To obtain predictions one needs to integrate Eq. 1 with the parameters obtained from the posterior . We generate and evaluate two different types of predictions. The first variant incorporates information on prediction uncertainty as encoded in and . The second variant is based on a point estimate of . The Python routine for generating the predictions will be made available at https://github.com/tzerenner/EpidemicPredictionsFromPopulationLevelData.

For brevity, in the following, we neglect the dependence on and instead consider some fixed point in time. We denote by the pushforward measure of under the forward solution operator defined by Eq. 1. The density of is then related to that of through . The conditional mean of is given by

 mθk=∫pk νθ,k(pk) dpk=∫Gk(u) πθ(u) du. (5)

When additionally integrating over all network classes , we can further obtain the conditional mean of , the pushforward measure of , as

 mk=∫pk (∑θ∈Θπ(θ) νθ,k(pk)) dpk=∫Gk(u) (∑θ∈Θπ(θ)πθ(u)) du. (6)

We estimate from samples which we draw from the posterior distributions , using the Metropolis–Hastings algorithm. Since integrating the Master equation with a large number of parameter combinations is computationally demanding, we thin the samples by including only every -th draw, such that the auto-correlation between subsequent draws is . To choose an appropriate size of the (thinned) sample, we consider its multivariate effective sample size (mESS), which is estimated by

 ˆmESS=n(det(Λn)det(Σn))1/3,

where denotes the sample covariance and

denotes the multivariate batch mean estimator of the covariance matrix in the Markov chain central limit theorem

(roy2020convergence; vats2020analyzing). The sample size is chosen to be the minimum such that which ensures a confidence level of and a tolerance level of for the expectation in the three parameter -space (vats2019multivariate). To compute the mESS and the threshold for the desired confidence and tolerance levels, we used the Python implementation available at https://github.com/Gabriel-p/multiESS.

We then proceed to integrate the Master equation with each -triplet to obtain and finally approximate the conditional mean by

 mk=∑θ∈Θ⎛⎝π(θ) ∑ipθi,kn⎞⎠, (7)

as well as the respective cumulative density over by

 Mk=∑x≤kmx. (8)

From the latter we obtain equal-tailed credible intervals for the predicted number of infected nodes. Equal-tailed intervals are defined such that the probability of being below the interval is as high as being above the interval and thus can be directly obtained from the quantiles of the cumulative density as

 Q(x)=inf{k∈{0,…,1000}:x≤Mk}. (9)

The interval for example corresponds to the 90% equal-tailed interval. When obtained from (Eq. 6), such intervals incorporate both prediction uncertainty arising from uncertainty about parameters and network class as encoded in and , respectively, as well as prediction uncertainty arising from the intrinsic stochasticity of the epidemic spreading.

The second prediction variant is based on a point estimate, i.e., a single inferred from the posterior . We first identify the most likely network class, i.e, the mode of ,

 ^θMAP=argmaxθ{π(θ|y,s)}, (10)

where MAP stands for maximum a-posteriori, and then estimate the mode of ,

 ^u^θ,MAP=argmaxu{π^θ(u|y,s)}, (11)

using a combination of global and local optimisation routines (di2020network). The predictions are obtained by integrating the Master equation with , i.e.,

 pMAP,k=Gk(^u^θ,MAP). (12)

Again, we compute the respective cumulative density over as

 PMAP,k=∑x≤kpMAP,x, (13)

from which we can obtain quantiles and equal-tailed prediction intervals. Such intervals are not credible intervals in the Bayesian sense, but solely represent the intrinsic stochasticity of the epidemic spreading. They are thus systematically narrower than the credible intervals discussed above. An illustration of the two prediction variants for one single point in time is provided in Fig. 2.

To compare uncertainty in the -space for the two prediction variants we further consider the covariance of the pushforward , . We denote the mean of the pushforward of under the forward solution operator by (Eq. 6). Its covariance is given by the outer product

 C=∫(G(u)−m)(G(u)−m)T πθ(u) du.

We estimate from the discrete samples (Eq. 7) as

 Ckl=∑θ∈Θ(1n−1n∑i=1(pk,i−mk)(pl,i−ml) π(θ)),  k=0,…N, l=0…N.

We can then evaluate

 |C|=∣∣∣∫(G(u)−m)(G(u)−m)T πθ(u) du∣∣∣, (14)

where denotes the Euclidean norm in . To accordingly evaluate the uncertainty of the predictions based on the point estimator from Eq. 11, we further evaluate

 |CMAP|=∣∣∣∫(G(u)−pMAP)(G(u)−pMAP)T πθ(u) du∣∣∣, (15)

where are the predictions from Eq. 12.

### 2.3 Performance assessment

We generate a reference for evaluating the BD model by carrying out a set of Gillespie simulations which provides us with an i.i.d. sample of the stochastic spreading of an SIS epidemic on a given network. We chose a sample size of 1000 and denote the set of 1000 epidemic trajectories obtained from the sample as . To empirically validate the BD model, we compare the obtained from the numerical integration of Eq. 1 against the reference. We evaluate the difference in expectation, i.e.,

 Δ^I(t)=ˆIs(t)−ˆIr(t)=N∑k=0k pk(t)−1000∑i=1Ir,i(t)1000, (16)

where denotes the mean over the reference at time and denotes the mean number of infected nodes at time predicted by the BD model. We further evaluate the cumulative densities from the BD model in comparison to our reference using the integrated quadratic distance (IQD) which is given by

 IQD(t)=∫∞−∞(Fs,t(x)−Fr,t(x))2dx. (17)

Here denotes the cumulative density at time predicted by the BD model,

 Fs,t(x)=∑k≤xpk(t),

and denotes the reference empirical cumulative density,

 Fe,t(x)=110001000∑i=11Ir,i(t)≤x,

where is an indicator function equal to if condition is true and 0 otherwise. A small IQD is obtained when not only the mean but also the intrinsic stochasticity of the epidemic spreading is adequately represented by the BD model. The goodness of fit between the cumulative densities is further illustrated by quantile-quantile plots. The quantile function is the inverse of the cumulative density, i.e., , and hence given by

 Qs/r,t(P)=inf{x∈{1,…,1000}:P≤Fs/r,t(x)}, (18)

for the BD model (s), and the reference (r), respectively.

## 3 Data

We consider nine different network and epidemic parameter combinations. The network parameters, network class and mean degree , and the epidemic parameters, per link infection rate and recovery rate , are summarised in Table 1. Gillespie simulations were performed on a network of nodes and initialised with five infected nodes selected at random. The time is an approximate value of the time span between initialising the simulation and reaching quasi-steady state and in the following serves as a universal time scale which allows to plot the different cases onto the same time axis. Time is unit-free here. The simulated data can be re-scaled to physically-meaningful time scales by applying an appropriate multiplicative factor to the simulation time . The parameters for the nine cases were chosen such that we obtained one large epidemic with of the population infected at quasi-steady state, one medium epidemic with to of the population infected at quasi-steady state, and one small epidemic with of the population infected at quasi-steady state for each network class.

In this study, we consider the epidemics at population-level, that is, we aim to predict the future course of the number of infected nodes. Network class and parameters of the BD model are inferred from population-level observations of the number of infected nodes at a set of discrete points in time.

## 4 Results

### 4.1 Validation of the BD model

We carry out a set Gillespie simulations during which we keep track of the number of infected nodes , the number of S-I links over time and the time spent in the observed states. For each case, we carry out in total 200 simulations half of which are initialised with five infected nodes and half with 1000 infected nodes. With this choice of initial conditions, we obtain realisations of the random variable ( #S-I links) for each , from which we compute the expectations following di2020network as

 ^ak=τ∑ii tik∑itik,   k=1,…,N, (19)

where denotes the total lifetime of all network states with infected nodes and S-I links.

We then proceed to fit the parameters of the parametric model from Eq. 2 to the curves by a least-squares fit using a particle swarm algorithm. Table 2 lists the resulting -triples for each case along with the root mean square error (RMSE) between parametric curves and .

Figure 1 shows the empirical curves as well as the fitted curves for the nine cases. The top panels illustrate the good agreement between the parametric model and . The bottom panels of Fig. 1 show relative errors. It is not surprising that the relative error is largest for close to zero or close to , i.e., when the number of S-I links is small either because only very few nodes are infected or because almost the entire population is infected. The relative errors are lowest for the ER network class followed by the Regular networks. For the BA network class, we obtain larger errors.

We simulate each case by integrating the Master equation (Eq. 1) with the empirical as well as the parametric . We initialise simulations with and 80 infected nodes at time , the latter two values corresponding to two and four cycles of doubling from the initially five infected nodes. A set of 1000 Gillespie simulations of each case serves as a reference. Figure 3 shows the difference in the expected number of infected nodes between BD model and reference. Figure 4 shows the integrated quadratic distance (IQD) between the cumulative densities from BD model and reference. The errors remain small throughout the simulations with the empirical (top panels), which confirms the suitability of the BD model to describe population-level infection rates in SIS epidemics on Reg, ER and BA networks. Not only is the expectation well captured by simulations with , but also the intrinsic stochasticity of the epidemic despite the mean-field approximation. We observe the largest errors for the BA network class and during the growth phase when simulations are initialised with (Fig. 4

a). The BA network class exhibits a higher degree of heterogeneity than Regular and ER networks. Hence, a larger variance in the number of S-I links at a given

is expected. Therefore, the mean-field approximation might be less well suited for that type of network than for Regular and ER networks.

The simulations with the BD model with the parametric (bottom panels) exhibit larger errors compared to the simulations with . Ranking the different cases studied, the BD model achieves the lowest errors for the ER network class, followed by Regular networks and the BA networks. Again, errors are largest when the simulations are initialised with which appears to be caused by the larger relative errors of the parametric -model for small . The overestimation of at small by the parametric model for the BA network class (Fig. 1c) causes the number of infected nodes to increase too fast in the BD simulations, which leads to an over-estimation of the number of infected nodes during the growth phase (Figs. 3d, 5h). Conversely, the underestimation of for small for the Regular networks (Fig. 1b) causes the number of infected nodes to increase too slowly in the BD model simulations, and hence an under-estimation of the number of infected nodes during the growth phase (Figs. 3d, 5a). The errors peak during the growth phase and then decay until reaching an approximately constant value in the quasi-steady state.

For the majority of the cases, the quasi-steady state is well captured in both mean and variation around the mean, with the only exception being the small epidemics on Regular and BA networks. When the BD model is initialised at , it starts from with a state from which some of the small epidemics will eventually die out and only some will eventually converge to the quasi-steady state. Due to the over-estimation of for small for the BA networks in the parametric model, the probability of an epidemic to proceed to the quasi-steady state from is over-estimated in the BD model. Hence, the expected number of infected nodes in the BD model is too large. For Regular networks, the opposite holds and the expected number of infected nodes is too small. When initialised with the errors are smaller, but the temporal pattern of the errors persists, i.e., errors peak during the growth phase and then decay until the quasi-steady state is reached (Figs. 3e, 4e). When initialised with , we find that both the growth phase as well as the quasi-steady state is well captured by the BD model (Figs. 3f, 4f, 5c,g,j).

### 4.2 Predictions

#### 4.2.1 Network class inference

di2020network demonstrated that one can reliably recover the class of the underlying network from population-level observations, when observations of the full epidemic trajectory from an early stage up to quasi-steady state are available. When aiming to predict the future evolution of an ongoing epidemic, the inference of network class and parameters can only utilise observations of the epidemic up to its current state. Therefore, the question arises as to when one has sufficient information during an epidemic to reliably predict its further course. We therefore carry out a sensitivity analysis by inferring the posterior distribution from observation data sets covering different time windows during the evolution of the epidemic and incorporating different numbers of observations. For this analysis, we consider the medium epidemics (Table 1).

The results are summarised in Fig. 6

. As expected, in general, the longer the observational period, the higher the (average) posterior probability of the true underlying network class is. When the observational period ranges from an early stage of the epidemic up to quasi-steady state (50 to quasi-steady state), ten out of ten realisations on the BA network and seven out of ten realisations on Regular and ER networks are classified correctly. While BA networks can be clearly separated when sufficient data is available, distinguishing between Regular and ER networks appears challenging. For some realisations, the respective trajectories largely overlap (Fig.

7a).

When the observation time span is shortened, the rate of correct classifications decreases (Fig. 6). As demonstrated in Fig. 7b, epidemics spreading on networks from the different classes may exhibit a similar shape during the earlier stage and only diverge later on. Thus, inferring the underlying network classes from population-level observations of a single realisation requires a sufficient observational time span. Increasing the number of observations from 10 to 100 does not have any visible effect on the classification. Ten observations provide a sufficient description of the epidemic trajectory.

We note that for the BA networks, classification accuracy increases when the very early stage of the epidemic up to is excluded from the observational data set. In Fig. 6e,f, this is most obvious when comparing the posterior probabilities obtained for the observation intervals , and with those obtained for , and , respectively. We believe this to be caused by the relatively large error of the parametric -model for small for the BA network class (Fig. 1c). For small , the average number of S-I links and hence are over-estimated by the model. Hence, the initial spreading of the epidemic on the network is expected to proceed significantly slower than the BD model with optimal -triplet would suggest. Further, we note that because Regular and ER networks are comparably close in the -space, confusion between Regular and ER networks is comparably likely, but predictions are also expected to be comparably robust to confusion between Regular and ER network classes.

#### 4.2.2 Epidemic trajectories

Figure 8 shows the predictions incorporating the uncertainty encoded in the posterior distribution(s). Shown are three realisations of the medium epidemics on Regular, ER and BA networks. Additional Figures for all ten realisations of the nine cases from Table 1 are provided in the Supplementary Material. The dots indicate the observations . The grey-shaded areas indicate the predictions. Trajectories of 100 realisations of Gillespie simulations initialised at the network state associated with the last observational data point serve as reference.

For the majority of cases and realisations, the 90%-credible interval (CI) contains the reference. For some cases/realisations, the reference lies just outside 90%-CI (e.g., Fig. 8a realisation 9). For predictions of the medium epidemics initialised after five cycles of doubling (), we find the reference to lie just outside the 90%-CI for one out of ten realisations for the ER network, while for the Reg and BA networks, it lies partly outside of the 90%-CI for 3 out of 10 realisations each. The 90%-CI spans a range of up to and tends to be larger for the small and medium epidemics than for the large epidemics.

When parameters and network class are inferred from observations up to five cycles of doubling (), prediction uncertainty is dominated by uncertainty about model parameters and network class. When inference is based on observations up to and including the quasi-steady state, uncertainty on parameters and network class is negligible and the uncertainty on the future course of the epidemic is dominated by the intrinsic stochasticity of the process (dotted brown lines in Fig. 8). Hence, the magnitude of the prediction uncertainty is sensitive to the observational time span available for inference. In any realistic setting, observations are of course not available beyond the point from which one aims to predict the future course of an epidemic. As illustrated in Fig. 8b,d and f, parameter and network class uncertainty is, as expected, reduced when a longer observational time span is available. For predictions initialised at 6 cycles of doubling (), prediction uncertainty due to intrinsic stochasticity and parameter/network class uncertainty is similar in magnitude and it depends on the particular case and realisation if prediction uncertainty is dominated by either one.

For the predictions of epidemics on BA networks, the reference tends to lie in the lower half of the 90%-CI whereas for Regular networks, it tends to lie in the upper half of the 90%-CI. Relatively large credible intervals are associated with relatively large uncertainty about the network class. As illustrated in Fig. 7b, epidemics on different networks with a similar trajectory during the early stage eventually diverge, with the epidemics on the BA networks converging to the lowest level of infection during quasi-steady state followed by ER and Regular networks. Thus, credible intervals obtained from observations of the beginning of one of the BA trajectories from Fig. 7b are expected to contain the true BA trajectories at their lower end. This behaviour can also be understood from the corresponding -curves. Curves for networks from different classes that are similar for small will diverge for larger , with the curve corresponding to the BA network having the smallest peak, typically followed by ER and finally the Regular network with the highest peak.

Figure 9 shows the point estimate-based predictions. The prediction intervals here do not account for network class and parameter uncertainty, but only represent the intrinsic stochasticity of the epidemic spreading. Accordingly, the prediction intervals are systematically narrower than the credible intervals. The width of the prediction intervals is consistent with the spread of the trajectories from the Gillespie simulations. For some cases and realisations, the point estimate-based predictions provide a near perfect fit to the reference (e.g., Fig. 9a realisation 1, e realisation 1). However for some cases/realisations prediction and reference differ by up to (e.g., Fig. 9c realisation 6). For epidemics on BA networks, the number of infected nodes is over-estimated in the point estimate-based predictions if the network is falsely identified as ER or Regular network. For epidemics on Regular networks, the number of infected nodes is under-estimated if the network is falsely identified as ER or BA. The reason for this is the same as for the tendencies of the reference to occur in different parts of the credible intervals for the different network classes discussed in the above paragraph. When inference is based on observations up to six cycles of doubling (), the errors of the point estimate-based prediction are visibly reduced (see also the Supplementary Material). Hence, when longer observation time spans are available also point estimate-based predictions are potentially useful.

Finally, in Fig. 10 we consider the uncertainty in the -space as described by the covariance of the pushforward measure around the two different predictions and . Shown is the medium epidemic on the ER network. As the predictions based on the conditional mean incorporate the uncertainty about the predicted that stems from uncertainty about network class and model parameters it is systematically wider (Fig. 2). This width reflects the width of the pushforward and thus leads to lower values of compared to the point-estimate based predictions. Further, the predictions incorporating uncertainty exhibit less variation of among the different realisations than the point estimate-based predictions. As already discussed alongside Figs. 8 and 9, we find the uncertainty to be systematically lower when predictions are based on longer observational periods. The longer the available observation period, the narrower the posterior and the smaller the difference between the two types of predictions and their respective uncertainty in the -space.

## 5 Discussion

We have explored a modelling and inference framework for forecasting SIS epidemics spreading on networks. The surrogate model is based on a BD process. The effect of the contact structure has been condensed into a birth-rate parameter, which is proportional to the average number of SI-links for a given number of infected nodes. Our empirical validation has confirmed that the BD model is well suited to describe the evolution of an SIS epidemic on a network (Figs. 3 d-f, 4 d-f). Both the expectation and the intrinsic stochasticity of the epidemic trajectories are well reproduced even though our model formulation contains a mean-field approximation. The parametric model for the number of SI-links, which has been introduced to enable the inference of network class and model parameters, is suitable for the range (Fig. 1). Hence, simulations with the BD model with the parametric should not be initialised with fewer than infected nodes (Fig. 3 a-c, 4 a-c).

Network class and epidemic parameters can be reliably inferred when observations are available from an early stage of the epidemic up to the quasi-steady state. However, in realistic prediction scenarios, observations are only available up to the current state of the epidemic. The accuracy of the network class inference is sensitive to the observational time span (Fig. 6). Uncertainty increases as observational time span is reduced. This is because epidemics, though spreading on networks from distinct classes, can exhibit very similar trajectories through their earlier stages and only diverge when approaching the quasi-steady state (Fig. 7). As discussed in allen2021predicting for instance, the uncertainty of the future course of an emerging epidemic during its early stages is dominated by the intrinsic stochasticity of disease transmission. It is thus no surprise that observations from an early stage of the epidemic appear not to contain sufficient information about the network class.

In predictions based on observations up to and initialised at , the prediction uncertainty is dominated by the uncertainty of the parameters/network classes. In predictions based on observations up to and initialised at , the prediction uncertainty stems in about equal proportions from parameter/network class uncertainty and intrinsic stochasticity of the epidemic spreading (Fig. 8). Thus, and especially for shorter observational periods and hence predictions initialised early during the epidemic, considering parameter uncertainty is crucial for providing meaningful information about prediction uncertainty (see also castro2020turning; wilke2020predicting). The results suggest that for most cases the credible intervals obtained provide reliable uncertainty information for the epidemic forecasts (see also the Supplementary Material). If longer observational time spans are available, point estimate-based predictions are potentially useful as well (Fig. 9).

Our study differs from other approaches in network inference in so far as our aim here is not to infer the existence, or otherwise, of links but rather to infer the most likely network class that led to the observed population-level data resulting from an epidemic spreading on it. As a result, the data needed for inference does not contain node- or link-level information. There are both advantages and disadvantages to such an approach. On the one hand, the computation of the likelihood in our case is more straightforward and the data needed for inference is modest. On the other hand, if more detailed data is available, the proposed model will not be able to capture it nor benefit from it. However, more complex models will need large quantities of detailed data (i.e., in the case of cascades, the data needs to contain cascades starting from, or involving, as many nodes as possible, gomez2012inferring) to produce acceptable results with large computational burden. The choice of model and inference will depend on the context.

There are many directions in which the current model and inference scheme can be developed. First, we only explored three network classes where the key difference was degree heterogeneity. However, networks displaying degree-degree correlations, clustering, spatial structure or some type of meso-scale structure, such as communities, may be of interest as they are more representative of real-world scenarios. Equally, from a theoretical viewpoint, lattices could be considered. This is a non-trivial task and depending on which network property or combination of properties we choose to model, it may turn out that the birth-rates of the BD process will no longer be parabola-like and the proposed parametric -model may no longer provide a satisfactory fit. However, we expect that more complex models will be able to capture the birth-rates in the BD process resulting from such more exotic networks. Another natural extension would be to consider more complex epidemic models, such as SIR, where the corresponding BD model will now have equations and the birth-rates of the BD process will define a surface rather than a curve. However, and perhaps more interestingly, the excellent agreement between the exact and surrogate model, leads us to believe that a rigorous proof that quantifies the error between the exact and BD models may be possible. For example, it is clear that as a Regular network becomes more densely connected, and in the limit of number of links going to , the BD model becomes exact.

## Acknowledgements

All authors acknowledge support from the Leverhulme Trust through the Research Project Grant RPG-2017-370.