Stochastic Epidemic Models inference and diagnosis with Poisson Random Measure Data Augmentation

by   Benjamin Nguyen-Van-Yen, et al.

We present a new Bayesian inference method for compartmental models that takes into account the intrinsic stochasticity of the process. We show how to formulate a SIR-type Markov jump process as the solution of a stochastic differential equation with respect to a Poisson Random Measure (PRM), and how to simulate the process trajectory deterministically from a parameter value and a PRM realisation. This forms the basis of our Data Augmented MCMC, which consists in augmenting parameter space with the unobserved PRM value. The resulting simple Metropolis-Hastings sampler acts as an efficient simulation-based inference method, that can easily be transferred from model to model. Compared with a recent Data Augmentation method based on Gibbs sampling of individual infection histories, PRM-augmented MCMC scales much better with epidemic size and is far more flexible. PRM-augmented MCMC also yields a posteriori estimates of the PRM, that represent process stochasticity, and which can be used to validate the model. If the model is good, the posterior distribution should exhibit no pattern and be close to the PRM prior distribution. We illustrate this by fitting a non-seasonal model to some simulated seasonal case count data. Applied to the Zika epidemic of 2013 in French Polynesia, our approach shows that a simple SEIR model cannot correctly reproduce both the initial sharp increase in the number of cases as well as the final proportion of seropositive. PRM-augmentation thus provides a coherent story for Stochastic Epidemic Model inference, where explicitly inferring process stochasticity helps with model validation.



There are no comments yet.


page 1

page 2

page 3

page 4


Uniformly Ergodic Data-Augmented MCMC for Fitting the General Stochastic Epidemic Model to Incidence Data

Stochastic epidemic models provide an interpretable probabilistic descri...

A Point Mass Proposal Method for Bayesian State-Space Model Fitting

State-space models (SSMs) are often used to model time series data where...

Likelihood-based Inference for Partially Observed Epidemics on Dynamic Networks

We propose a generative model and an inference scheme for epidemic proce...

Stochastic Modeling of an Infectious Disease Part III-A: Analysis of Time-Nonhomogeneous Models

We extend our BDI (birth-death-immigration) process based stochastic mod...

Pólygamma Data Augmentation to address Non-conjugacy in the Bayesian Estimation of Mixed Multinomial Logit Models

The standard Gibbs sampler of Mixed Multinomial Logit (MMNL) models invo...

Exact and computationally efficient Bayesian inference for generalized Markov modulated Poisson processes

Statistical modeling of point patterns is an important and common proble...

On Data Augmentation in Point Process Models Based on Thinning

Many models for point process data are defined through a thinning proced...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.


We present a new Bayesian inference method for compartmental models that takes into account the intrinsic stochasticity of the process. We show how to formulate a SIR-type Markov jump process as the solution of a stochastic differential equation with respect to a Poisson Random Measure (PRM), and how to simulate the process trajectory deterministically from a parameter value and a PRM realisation.

This forms the basis of our Data Augmented MCMC, which consists in augmenting parameter space with the unobserved PRM value. The resulting simple Metropolis-Hastings sampler acts as an efficient simulation-based inference method, that can easily be transferred from model to model.

Compared with a recent Data Augmentation method based on Gibbs sampling of individual infection histories, PRM-augmented MCMC scales much better with epidemic size and is far more flexible.

PRM-augmented MCMC also yields a posteriori estimates of the PRM, that represent process stochasticity, and which can be used to validate the model. If the model is good, the posterior distribution should exhibit no pattern and be close to the PRM prior distribution. We illustrate this by fitting a non-seasonal model to some simulated seasonal case count data.

Applied to the Zika epidemic of in French Polynesia, our approach shows that a simple SEIR model cannot correctly reproduce both the initial sharp increase in the number of cases as well as the final proportion of seropositive.

PRM-augmentation thus provides a coherent story for Stochastic Epidemic Model inference, where explicitly inferring process stochasticity helps with model validation.

1 Introduction

Stochasticity plays an important role in infectious disease dynamics, but statistical inference of stochastic models is difficult. When trying to estimate the parameters of a complex stochastic epidemic model from some data , the data are typically an incomplete observation of the process, and the observed data likelihood is intractable [1]

. To work around this problem, one can decide to not use the likelihood (ABC, deep learning), to approximate the model (linear noise approximations), to estimate the likelihood (PMCMC

[2, 3]), or to rephrase the problem.

Data Augmentation (DA) introduces the alternative problem of estimating the joint posterior , instead of , where is some latent variable chosen so that the joint likelihood is tractable. The original problem then finds its solution by marginalizing over , . Data Augmentation has proven effective with MCMC [4, 5, 6], for dealing with granular, subject-level data, and complex models.

An important difficulty with DA is to explore the high dimensional augmented state space effectively with the MCMC proposal. Each class of models requires designing specific proposals, and typically it is necessary to manually tune the proposal for the specific problem at hand. For instance in BEAST [5, 7]

, for which the latent state is the reconstructed phylogeny, many different moves are implemented to explore tree space effectively, and one is chosen at every iteration following certain probabilities. For a given model and dataset, one should manually tune those probabilities to achieve good mixing.

Additionally, it is challenging to design a proposal such that mixing speed scales well with population size, which becomes necessary when dealing with even moderately large epidemics.

We propose a new data augmentation scheme to fit simple models to low granularity data – like incidence or prevalence data. This scheme is directly applicable to the large class of Markov jump processes.

Pure jump Markov processes are widely used for modelling infectious diseases, ecological and evolutionary systems. It is possible to represent them as solutions of Poisson-driven stochastic differential equations, which are integral equations with respect to a Poisson Random Measure (PRM). This indicates a natural data augmentation parametrization with the parameters of the process, and the discrete measure that the process is integrated against. A value of and a realisation of the PRM are enough to deterministically simulate the trajectory of the process, and so the joint likelihood of and is easily computed as the probability of observing given the trajectory . Throughout the article, we will call this method PRM augmentation.

Different methods have already been proposed for the inference of Markov jump processes, some specific to epidemic models [4, 8, 9, 10, 11], and some more general [12, 13, 14].

In this article, we present PRM augmentation, used with Metropolis-Hastings sampling [15, 16], and discuss its advantages and drawbacks, in terms of ease of use, speed, and insights, and illustrate this on some synthetic and real datasets.

PRM augmentation relies on a generalization of uniformization [17, 12, 14] that is directly applicable to any Markov jump process, with writing the Stochastic Differential Equation (SDE) being the only mathematical work needed. As such, it is a simulation-based method, which can be used easily to compare different models.

We show how PRM augmentation compares to a recent data augmentation method [11], and exhibits better complexity with respect to epidemic size.

Additionally, the natural distinction that is made between the mechanistic part, estimated through , and the stochastic part, estimated through , provides a way to evaluate model fit and to propose model improvements, similarly to bayesian latent residuals [18], via the posterior estimates.

To demonstrate this, we apply the method to simulated data, and to 2013-2014 Zika data from French Polynesia.

2 Theory

2.1 Integration of a Markov pure jump process w.r.t. a Poisson Random Measure

Continuous time Markov chains, or Markov pure jump processes, are stochastic processes whose state is constant in between jumps, and jump only at exponential times, and that have the Markov property. The trajectories of such processes are right continuous, left limit, and piecewise constant. A complete presentation can be found in

[19]. Such processes are widely used in stochastic modelling, and notably for instance in pomp [20] or BEAST [5].

Poisson random measures (PRM) are a generalization of Poisson processes to more general spaces, as random variables that take discrete measure values. For a PRM, the number of points found in any measurable subset

follows a Poisson distribution, and the number of points found in two disjoint measurable subsets are independent.

A Markov pure jump process can be formulated as the solution to a Poisson driven stochastic differential equation (SDE). This is an established result [21], and can be seen as a generalization of uniformization [17]. We provide a proof in the appendix LABEL:app:proof that solutions to the equation that interests us here are càdlàg and Markov, and have the right infinitesimal generator.

The SDE hints at a natural deterministic algorithm to simulate a trajectory of the process exactly, given a realisation of the PRM. Examine the points of ordered by time ; for each successive point, compute the rate of the event corresponding to that point ; if the value of the point is below the rate, then accept the event and update the state accordingly, and if not reject the event (the state remains the same). This iteratively draws a trajectory, with jumps that can only be positioned at points of , as illustrated in sim_exact_prm, and following algorithm 1.

A value of and a value of uniquely determine the trajectory of the process, and thus are enough to compute the likelihood of observing the data .

The a priori independence of and helps mixing in the case of weakly informative data [22], provided we are able to simulate the process efficiently.

2.2 Simulation of a Markov Jump Process

We consider a Markov Jump Process (MJP) with a state space , and a finite number of events . The -th event happens with a rate , and when it happens, the state changes from to . Let be discrete measures on . integr is the Poisson-driven SDE defining the process.


This equation specifies a way to simulate the process. Each point of the is a potential event point. Consider all points of the ordered by time, and for each one, compare the event rate to the coordinate of the point. If the point is below the rate, then the event happens, and we update the state of the system and the rates. Otherwise nothing happens, and we continue on to the next point. This yields the simplified algorithm 1.

input :  The discrete measures , the rates , the increments , and the initial condition
output : Trajectory
Initialize , ;
Iterate over the points of ordered by time, ;
for  = to  do
       if  then
       end if
end for
Algorithm 1 Simplified algorithm for exact simulation of a MJP

There is an obvious problem with algorithm 1 however. The have an infinite number of points on every time interval, so we cannot order all of their points by time. Instead we split into rectangles , and define each time column. In each rectangle , the number of points of is finite almost surely, and can be written , with the points ordered by time. Equation integr:pieces is equivalent to integr


We never need the rectangles that are above the rate. We can use this fact to simulate the process with finite memory, by lazily drawing points for the rectangles only when they are needed. This exact algorithm is given in the appendix LABEL:app:implem, as algorithm LABEL:algo:app:exact.

Figure 1: Exact simulation of a SIR process with respect to an infection and a recovery measure. The  crosses are the points of the infection and recovery measures respectively. The  curves are the rates of infection and recovery respectively. When a point is under the rate curve, then the event happens. When an infection happens, (and the rates) increase, and when a recovery happens, (and the rates) decrease. Initially there is one infected and the whole population is susceptible.

This exact algorithm is in the number of points that are considered. As a consequence, an important factor in its efficiency is the proportion of points that are considered but rejected. This is also true of classical uniformization, for which a constant upper bound must be chosen in advance. In contrast, here, the algorithm automatically adapts the upper bound to the variations of the rates, so that the rejection rate can be kept small.

What’s more, we can further reduce the complexity by using an approximate simulation algorithm . To do that, we take inspiration from tau-leaping algorithms, and treat the rates as if they were constant on each time interval , . We then count the points below the rate in each interval. We store the number of points present in each for each event to make this faster. The only points left to consider one by one are the ones from the single such that is in . This approximate algorithm is written out in the appendix LABEL:app:implem, as algorithm LABEL:algo:app:approx.

Once equipped with efficient simulations, we can use this PRM augmentation scheme for parameter inference with MCMC.

2.3 PRM-augmented MCMC

We want to estimate the posterior distribution of the parameter values and of the discrete measure of the model , given the data , that is

is the prior distribution placed on , and is the PRM distribution.

We can use the Metropolis-Hastings algorithm [15, 16], in the same way as for a deterministic system. At every iteration of the MCMC, one draws new values and from the proposal, simulates the corresponding trajectory with or , , computes the likelihood to observe the data , then accepts or rejects the move according to the Metropolis-Hastings ratio.

For a PRM, points in disjoint subsets are independent. Thus, if is a PRM sample, we can choose a subset of , erase its points and redraw new points from the PRM process, and obtain a new (correlated) PRM sample. This proposal is reversible with respect to the PRM prior, and thus can be used for MCMC. More details are given in the appendix LABEL:app:mcmc. can be used with any proposal for the parameters, with density to form a Metropolis-Hastings proposal, as shown in algorithm 2.

input : ,
output : ,
(or );
if  then
       return ;
       return ;
end if
Algorithm 2 PRM Metropolis-Hastings proposal

2.4 Stochastic differential equation for a SIR model

To illustrate our method, we show how to apply it to the classical SIR model with case count data, and give both the deterministic sir:ode and stochastic sir:prm formulations of the model here, to show the correspondence between them.

is the number of susceptible hosts, the number of infected hosts, the number of removed hosts, and the total number of cases, observed upon infection, up to time .

is the effective contact rate, and is the rate of recovery (inverse duration of infection), while is the reporting probability, that is the probability that when an infection happens, we observe it.


To make the notations less heavy in the SDE, we define the total rate of infection . Also let and be independent PRMs on with intensity the Lebesgue measure, for the events of infections and recoveries respectively.

The equations in sir:ode and sir:prm and their terms are written in the same order so as to show the correspondence between them.


We also use more complex versions of the SIR model, by making the effective contact rate seasonal, by adding host demography, immigration, immunity loss (SIRS) or a latent class (SEIR). To obtain the corresponding SDE, we only need to add a new PRM for every event, and adapt the event rates.

For example, to add host demography, we add the four events of birth of susceptible, death of susceptible, death of infectious, and death of removed, with rates , , , and , and the four independent PRMs, , , respectively.

To make the model seasonal, we would replace by .

The equations for these different models are given in the appendix LABEL:app:results.

To perform inference on the model, we need an observation model to define the likelihood. For incidence data, the observations are the number of cases in each time interval

, and for inference we consider that the case count in each interval is negative binomial. The additional degree of freedom compared to the more natural Poisson distribution helps the fitting on real data. For prevalence data, we consider that the prevalence observed at time

follows the binomial distribution

. In both cases, the observations are independent conditionally on the trajectory, and the full likelihood is simply the product of the likelihoods.

We will also apply PRM-augmented MCMC to the simple SIR model with no seasonality, host vitality, or immigration, with prevalence data, in which case the observations follow the binomial distribution .

3 Materials and Methods

3.1 OCaml implementation

The algorithm is implemented in the OCaml language [23], and the project repository is at

3.2 Comparison of inference methods

For the comparisons, we simulate daily prevalence data with the simple SIR model over a one month duration, with a population size taking values , , , and . We estimate the base effective contact rate , and the initial conditions , and .

We start the MCMC from the target parameter values, to not take into account differences in convergence speed between methods.

The different methods we compare are the PRM-augmented MCMC with exact simulations, the PRM-augmented MCMC with approximate simulations, and a subject-level data augmentation MCMC with Gibbs sampling presented in [11].

For the PRM-augmented MCMC, we first sample from its prior for iterations, then continue with our custom proposal for iterations.

For the Gibbs sampler we perform iterations. Its moves in process space consist of redrawing the infectious history of one subject, conditional on the data and of the histories of the other subjects. To maintain mixing as increases, we keep the number of histories redrawn per iteration proportional to for and for .

As a measure of computational complexity for the different estimation procedures, we compute the amount of computational time it takes to obtain one effective sample from the posterior.

The multi-dimensional effective sample size [24] is defined as

where is the number of samples, the number of dimensions estimated, the determinant, the covariance structure of the posterior, and the asymptotic covariance matrix of the Markov chain. can be estimated with the sample covariance matrix, and with the batch means covariance matrix.

The space we explore is infinite dimensional so this definition is not directly applicable. We use the values the process takes at the datapoints , instead. The trajectories of the simple SIR model live in the 2-simplex, so for datapoints, the total dimension is then for , the initial conditions and the .

3.3 SEIRS seasonal model simulation and inference

years of data are generated with a deterministic SEIRS model with seasonality and vitality. On infection, the hosts go through the Susceptible, Exposed, Infectious, Removed, and then back to the Susceptible class. The equations of the model and the parameter values used are given in the appendix LABEL:app:diagnosis. Parameter values are chosen so that the data displays a clear periodic signal of period year, from the attractor of the system. We also use a constant SEIRS model, that is, the same model, but where is constant ().

The data is then inferred with the stochastic seasonal model, and with the stochastic constant model by PRM-augmented MCMC.

3.4 French Polynesia Zika data

The French Polynesia Zika dataset is composed of weekly case counts for Moorea island and of seroprevalence data from a cohort of people from the archipelago. The case count data has been made available in [25, 26], and the seroprevalence data in [27, 26].

We fit a SEIR model with immigration to the data with PRM-augmented MCMC. The seroprevalence likelihood is binomial with probability the removed proportion.

The details of the model and the prior distributions are given in the appendix LABEL:app:zika.

The PRM-augmented MCMC is run for iterations for convergence and adaptation of the covariance matrix [28], then for iterations, from which samples are kept.

4 Results

4.1 Efficiency of PRM-augmented MCMC

Figure 2: Comparison of execution time for exact and approximate simulations. Execution time (in minutes) as a function of host population size, on a log-log scale. In , inference with the approximate simulations, and in , with the exact simulations.

The comparison between exact and approximate simulations in complex:exact shows that both simulation schemes scale in the same order on population size, but that the approximate scheme is much more economical. In this example, for a population of , a million iterations took a bit less than hours with the approximate method, but close to hours with the exact method. As population size increases, stochasticity plays a less important role, and it becomes increasingly advantageous to use the approximate simulations. For small populations and with high time resolution however, it might be better to use exact simulations to avoid the bias caused by approximation.

As both schemes rely on the same data structure for , it is easy to switch from one to the other, and would be straightforward to implement switching when required during estimation.

Figure 3: Comparison of complexity as a function of population size of PRM-augmentation and subject level augmentation. In , PRM-augmentation, and in , subject level augmentation. (a) Complexity measured as number of iterations per effective sample (b) Complexity measured as CPU time in minutes per effective sample

The PRM-augmented MCMC also proved much more efficient than the subject level data augmentation method described in [11], as is shown in benchmark. The Gibbs sampler from [11] mixes much better than PRM-augmented MCMC when only looking at the mESS 3.2 by iteration, yielding around effective sample every iterations, against every iterations. However, every one of its iterations is much more costly. In theory, the complexity for an iteration of subject level augmentation scales with the square of the population size. Resampling a subject’s history grows linearly with the number of events, and the number of histories to redraw per iteration also grows linearly with population size to maintain mixing. This quickly becomes problematic, even with moderate population sizes.

In contrast, the mixing in the case of PRM-augmentation doesn’t vary much with population size, only the cost of an iteration does. In practice, a linear regression indicates an exponent of

for subject level augmentation, and for PRM-augmentation.

Example MCMC estimation for a population of hosts is presented in the Appendix LABEL:app:compare and app:benchmark:estim.

4.2 Model diagnosis

To see the Markov jump process as the solution of an SDE with respect to a PRM is to create a separation between the noise driving the process, contained in the PRM, and the mechanism of the process, described by the parameters of the process. By inferring and jointly, then, we are hoping to capture the mechanistic part in , and the noise part in , provided that the process explains the data well.

act as Bayesian latent residuals [18]. If the posterior distribution of is very different from its prior distribution of a standard PRM (intensity one), then our hypotheses about the noise in the model are wrong. A pattern found in the posterior for means that the pattern in the data is not entirely captured by the mechanism proposed, or said another way, that the model is underfitting the data. The nature of the pattern in the posterior for can then provide clues to improve the model to capture the pattern.

The posterior samples for can thus be used both to evaluate model fit, and to improve the model.

We illustrate this with an example on the SEIRS model, in a seasonal and a constant version (see 3.3, page 3.3). We generate some data with the seasonal model, then compare the fit of this true seasonal model to the data, to the fit of the constant model to the data.

We can see in , (a) that we are able to fit the constant model to the seasonal data. To evaluate the fit, we can look at the posterior samples of . We define the point density of for event and for a time slice as the number of points by unit of volume, . The posterior estimates (diagnosis, (b)) for the true seasonal model are very similar to the prior. However for the constant model, the infection point density rises at the start of every epidemic season, and falls at the ends. For the constant model to reproduce the data, it is necessary to have more infections than expected at the start of the epidemic season, and less than expected at the end. That is, the seasonal signal (diagnosis, (c)) is captured in the posterior for , for us to notice and then include into the model.

Figure 4: Comparison of the fit of the seasonal and the constant model to simulated seasonal data. In , the simulated daily case count data. In , in the first column, simulations with the true value, and sampled from its prior distribution. In , in the second column, posterior samples for the seasonal model. In , in the last column, posterior samples for the constant model. (a) Daily case counts in the data (), and from MCMC samples. (b) Number of points by unit volume by slice of time for infection events in the posterior samples for . The solid line is the mean, and the envelope the square deviation of the distribution. (c) Auto-correlation function of the function in (b) (Mean, and square deviation envelope). The average effective contact rate , the initial conditions , and the discrete measure are estimated. For both models, samples are kept for estimation, out of iterations after convergence.

This example is artificial, since the seasonal signal is already very obvious in the data and would be included in the model from the start.

However, it leads to two interesting remarks. First, even though the constant model explains the seasonal data very badly, we are still able to fit it to the seasonal data. This shows that the MCMC is able to reach a priori very unlikely discrete measure values, and so that we are able to explore the discrete measure space well. Second, if we look at the estimates for the other events, we can see that recovery is nearly the symmetric of infection. The model doesn’t really make a difference between more infections and less recoveries, or variation in infection rate, and variation in recovery rate. This shows that this procedure is not a replacement for actually including time-varying parameters [29], with a prior chosen to reflect how likely we expect those variations to be.

4.3 French Polynesia Zika epidemic

During the Zika epidemic in French Polynesia, in 2013-2014, the case counts indicate that the epidemic progressed very fast, but seroprevalence studies show that only around of the population got infected.

Figure 5:

Fit of the SEIR model to the Zika data from Moorea island, 2013-2014. In , the data. In , posterior samples for the stochastic model. The first panel represents the weekly number of cases, in the data, and in the posterior samples. The following panels represent the average and standard deviation of the density of points by unit of volume, in the

posterior samples, for the events of infection, , leaving the exposed class, , and recovery, . samples were kept from iterations after convergence.

The fit of a simple SEIR model with PRM-augmented MCMC shows that the model cannot correctly capture the data, if we restricut ourselves to realistic incubation and infectious periods, as shown in moorea. The posterior distribution contains peaks of point density at the beginning of the epidemic, most notably for the event of becoming infectious, and for the event of infection. This means that trajectories of the model that fit to the data have more infections and shorter incubation periods than expected by intrinsic noise, only at the start of the epidemic. There is no reason to expect that this could happen. Said another way, in the initial period, cases happen faster than the model can follow. More realistically, this could also be explained by an increase in the reporting probability as the epidemic becomes noticed, or also by population structure [26, 30].

5 Discussion

Epidemiological data is ever more abundant and complex, and can help us better understand the dynamics of infectious diseases, answer important theoretical questions and address public health problems. Existing inference methods, however, are quickly becoming a limiting factor, because of a prohibitive computational cost, a difficulty in applying the method to arbitrary models, or both. The search for new methods progresses in different directions, to build methods that are both faster and more easily transferable to different models.

For MCMC, one such advance has been the development of adaptive methods [31, 32, 28]. They can relieve the user from the burden of designing proposals taking into account the particular structure of the model, as the adaptive method aims to discover that structure automatically.

The adoption of these methods for complex models takes time. In the case of stochastic processes, the problem is complicated by the very large (infinite dimensional) latent variable space to explore.

As a consequence, a large part of the data augmentation methods for stochastic processes tend to concern themselves with Gibbs sampling, or Metropolis-Hastings component-wise updates when Gibbs sampling cannot be achieved.

For Markov jump processes, an efficient sampler based on uniformization [17] was proposed in [12, 14]. They make a clever use of uniformization to apply a method from Markov chains to MJPs. This method does not scale well to large state spaces however, and so it is not practical in the case of stochastic epidemic models.

Instead, data augmentation methods targeted specifically at stochastic epidemic models have also been developed these last 20 years [4, 8, 6, 10, 11]. They typically use component-wise updates, the effectiveness of which depends crucially on the degree of posterior independence of the different components. In these methods, the latent variables represent the subject-level disease histories. This is a centered parametrization [22], as the latent variables are a sufficient statistic for . By opposition, one can also use a non-centered parametrization, in which the latent variables and are a priori independent, or phrased another way, is an ancillary statistic for .

In the centered case, when the data brings little information, and are very correlated and separate updates for and must be very conservative. In that case, non-centering can be much more effective, as and will be nearly independent. If instead the data is very informative, then the reverse is true, and the centered parametrization will mix better than the non-centered one.

This question of centering and non-centering is crucial for mixing speed in the case of component-wise updates, and has already been discussed in the literature around stochastic epidemic models [8].

For typical stochastic epidemic models, it is often possible to re-parametrize the model in terms of uniform random variates, for instance, and thus to obtain a non-centered parametrization. This can be the basis for non-centered simulation-based MCMC [10]. However, the parametrization is ad hoc and must be found by the modeller. In constrast, we believe PRM-augmentation provides a canonical non-centered parametrization, and thus it is a good choice for low-granularity data like we have shown here. For the case when the data is more informative, it should be possible to design joint adaptive proposals that could move reasonably far in trajectory space.

Indeed, the increased efficiency compared with the subject-level data augmentation method from [11] can be in part attributed to this. The subject-level augmentation is a centered parametrization, and the cost of using Gibbs sampling to obtain new histories is that only one history can be resampled at a time. Another method of conditional resimulation that should scale better with population size is however presented in [33].

PRM-augmented MCMC presents an interesting tradeoff for moderately large population sizes, and a possible alternative to linear noise approximations [34] or PMCMC [2]. At the same time, the method is easily used in practice, as it is akin to a simulation-based method, that one can directly transpose to new models, with no lengthy and error-prone mathematical derivations or implementation.

It would be interesting to make a quantitative comparison with PMCMC, but it is difficult to do this in a fair manner, as the efficiency of PMCMC crucially depends on the number of particles used.

However, we would like to argue that the main advantage of PRM-augmentation is that it is in a sense a natural parametrization of the model, which makes our assumptions about the nature of the stochasticity very clear. It leads to a clear separation between the process mechanism, desribed by , and the process noise, described by , that facilitates interpretation and is very useful for model diagnosis (section 4.2). The procedure is identical to the one presented in [18, 35], with two differences. First, the samples are obtained as part of the inference, and not after. That way, we don’t need to be able to compute (and in our case, is not injective). Second, and most importantly, we have a clear meaning for what the latent residuals represent, so that it is easy to interpret deviations from the prior.

In conclusion, PRM-augmented MCMC is a practical method for the inference of moderate epidemics where stochasticity cannot be ignored. As a simulation-based method, one can easily apply it to many different models, and the PRM samples obtained make it possible to diagnose, compare, and propose better models.

Declaration of Competing Interest

The authors declare no conflict of interest.


BN and BC have been supported by a grant from Agence Nationale de la Recherche for the PANIC project (Targeting PAthogen’s NIChe: a new approach for infectious diseases control in low-income countries: ANR-14-CE02-0015-01). BN is a doctoral student at the Centre de Recherche Interdisciplinaire, in the Université de Paris.


  • [1] Philip D. O’Neill. A tutorial introduction to Bayesian inference for stochastic epidemic models using Markov chain Monte Carlo methods. Mathematical Biosciences, 180(1-2):103–114, November 2002.
  • [2] Christophe Andrieu, Arnaud Doucet, and Roman Holenstein. Particle Markov chain Monte Carlo methods: Particle Markov Chain Monte Carlo Methods. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 72(3):269–342, June 2010.
  • [3] Pierre Del Moral, Ajay Jasra, Anthony Lee, Christopher Yau, and Xiaole Zhang. The Alive Particle Filter and Its Use in Particle Markov Chain Monte Carlo. Stochastic Analysis and Applications, 33(6):943–974, November 2015. Publisher: Taylor & Francis _eprint:
  • [4] S. Cauchemez, F. Carrat, C. Viboud, A. J. Valleron, and P. Y. Boëlle. A Bayesian MCMC approach to study transmission of influenza: application to household longitudinal data. Statistics in Medicine, 23(22):3469–3487, November 2004.
  • [5] Alexei J Drummond and Andrew Rambaut. BEAST: Bayesian evolutionary analysis by sampling trees. BMC evolutionary biology, 7(1):214, 2007.
  • [6] Chris P. Jewell, Theodore Kypraios, Peter Neal, and Gareth O. Roberts. Bayesian analysis for emerging infectious diseases. Bayesian Analysis, 4(3):465–496, September 2009.
  • [7] Remco Bouckaert, Joseph Heled, Denise Kühnert, Tim Vaughan, Chieh-Hsi Wu, Dong Xie, Marc A Suchard, Andrew Rambaut, and Alexei J Drummond. BEAST 2: a software platform for Bayesian evolutionary analysis. PLoS Comput Biol, 10(4):e1003537, 2014.
  • [8] Peter Neal and Gareth Roberts. A case study in non-centering for data augmentation: Stochastic epidemics. Statistics and Computing, 15(4):315–327, October 2005.
  • [9] Fei Xiang and Peter Neal. Efficient MCMC for temporal epidemics via parameter reduction. Computational Statistics & Data Analysis, 80:240–250, December 2014.
  • [10] Peter Neal and Chien Lin Terry Huang. Forward Simulation Markov Chain Monte Carlo with Applications to Stochastic Epidemic Models. Scandinavian Journal of Statistics, 42(2):378–396, 2015.
  • [11] Jonathan Fintzi, Xiang Cui, Jon Wakefield, and Vladimir N. Minin. Efficient Data Augmentation for Fitting Stochastic Epidemic Models to Prevalence Data. Journal of Computational and Graphical Statistics, 26(4):918–929, October 2017.
  • [12] Vinayak Rao and Yee Whye Teh. Fast MCMC Sampling for Markov Jump Processes and Extensions.

    The Journal of Machine Learning Research

    , 14:26, 2013.
  • [13] Andrew Golightly and Darren J. Wilkinson. Bayesian inference for Markov jump processes with informative observations. Statistical Applications in Genetics and Molecular Biology, 14(2):169–188, 2015.
  • [14] Boqian Zhang and Vinayak Rao. Efficient parameter sampling for Markov jump processes. arXiv:1704.02369 [stat], April 2017. arXiv: 1704.02369.
  • [15] Nicholas Metropolis, Arianna W. Rosenbluth, Marshall N. Rosenbluth, Augusta H. Teller, and Edward Teller. Equation of State Calculations by Fast Computing Machines. The Journal of Chemical Physics, 21(6):1087–1092, June 1953. Publisher: American Institute of Physics.
  • [16] W. K. Hastings. Monte Carlo sampling methods using Markov chains and their applications. Biometrika, 57(1):97–109, April 1970. Publisher: Oxford Academic.
  • [17] Arne Jensen. Markoff chains as an aid in the study of Markoff processes. Scandinavian Actuarial Journal, 1953(sup1):87–91, January 1953.
  • [18] M. S. Y. Lau, G. Marion, G. Streftaris, and G. J. Gibson. New model diagnostics for spatio-temporal systems in epidemiology and ecology. Journal of The Royal Society Interface, 11(93):20131093–20131093, February 2014.
  • [19] Stewart N. Ethier and Thomas G. Kurtz. Markov Processes: Characterization and Convergence. John Wiley & Sons, September 2009.
  • [20] Aaron A. King, Dao Nguyen, and Edward L. Ionides. Statistical Inference for Partially Observed Markov Processes via the R Package pomp. Journal of Statistical Software, 69(1):1–43, March 2016. Number: 1.
  • [21] Nobuyuki Ikeda and Shinzo Watanabe. Stochastic differential Equations and diffusion processes. Number 24 in North-Holland mathematical Library. North-Holland [u.a.], Amsterdam, 2. ed edition, 1989. OCLC: 20080337.
  • [22] Omiros Papaspiliopoulos, Gareth O. Roberts, and Martin Sköld. A General Framework for the Parametrization of Hierarchical Models. Statistical Science, 22(1):59–73, February 2007.
  • [23] OCaml, an industrial strength programming language.
  • [24] Dootika Vats, James M. Flegal, and Galin L. Jones. Multivariate Output Analysis for Markov chain Monte Carlo. arXiv:1512.07713 [math, stat], December 2015. arXiv: 1512.07713.
  • [25] H. P. Mallet, Anne-Laure Vial, and Didier Musso. Bilan de l’epidemie a virus Zika en Polynesie Francaise, 2013–2014. Bulletin d’information sanitaires, épidémiologiques et statistiques, pages 20–21, 2015.
  • [26] Clara Champagne, David Georges Salthouse, Richard Paul, Van-Mai Cao-Lormeau, Benjamin Roche, and Bernard Cazelles. Structure in the variability of the basic reproductive number (R0) for Zika epidemics in the Pacific islands. eLife, 5:e19874, November 2016.
  • [27] Maite Aubry, Anita Teissier, Michael Huart, Sébastien Merceron, Jessica Vanhomwegen, Claudine Roche, Anne-Laure Vial, Sylvianne Teururai, Sébastien Sicard, Sylvie Paulous, Philippe Desprès, Jean-Claude Manuguerra, Henri-Pierre Mallet, Didier Musso, Xavier Deparis, and Van-Mai Cao-Lormeau. Zika Virus Seroprevalence, French Polynesia, 2014–2015. Emerging Infectious Diseases, 23(4):669–672, April 2017.
  • [28] Matti Vihola. Robust adaptive Metropolis algorithm with coerced acceptance rate. Statistics and Computing, 22(5):997–1008, September 2012.
  • [29] Bernard Cazelles, Clara Champagne, and Joseph Dureau. Accounting for non-stationarity in epidemiology by embedding time-varying parameters in stochastic models. PLOS Computational Biology, 14(8):e1006211, August 2018.
  • [30] Adam J. Kucharski, Sebastian Funk, Rosalind M. Eggo, Henri-Pierre Mallet, W. John Edmunds, and Eric J. Nilles. Transmission Dynamics of Zika Virus in Island Populations: A Modelling Analysis of the 2013–14 French Polynesia Outbreak. PLOS Neglected Tropical Diseases, 10(5):e0004726, May 2016.
  • [31] Heikki Haario, Eero Saksman, and Johanna Tamminen. An adaptive Metropolis algorithm. Bernoulli, 7(2):223–242, April 2001.
  • [32] Gareth O. Roberts and Jeffrey S. Rosenthal. Examples of Adaptive MCMC. Journal of Computational and Graphical Statistics, 18(2):349–367, January 2009.
  • [33] Pooley C. M., Bishop S. C., and Marion G. Using model-based proposals for fast parameter inference on discrete state space, continuous-time Markov processes. Journal of The Royal Society Interface, 12(107):20150225, June 2015.
  • [34] Jonathan Fintzi, Jon Wakefield, and Vladimir N. Minin. A linear noise approximation for stochastic epidemic models fit to partially observed incidence counts. arXiv:2001.05099 [q-bio, stat], January 2020. arXiv: 2001.05099.
  • [35] Max S. Y. Lau, Bryan T. Grenfell, Colin J. Worby, and Gavin J. Gibson. Model diagnostics and refinement for phylodynamic models. PLOS Computational Biology, 15(4):e1006955, April 2019.