We present a new Bayesian inference method for compartmental models that takes into account the intrinsic stochasticity of the process. We show how to formulate a SIR-type Markov jump process as the solution of a stochastic differential equation with respect to a Poisson Random Measure (PRM), and how to simulate the process trajectory deterministically from a parameter value and a PRM realisation.
This forms the basis of our Data Augmented MCMC, which consists in augmenting the parameter space with the unobserved PRM value. The resulting simple Metropolis-Hastings sampler acts as an efficient simulation-based inference method that can easily be transferred from model to model.
Compared with a recent Data Augmentation method based on Gibbs sampling of individual infection histories, PRM-augmented MCMC scales much better with epidemic size and is far more flexible.
PRM-augmented MCMC also yields a posteriori estimates of the PRM, which represent process stochasticity and can be used to validate the model. If the model is good, the posterior distribution should exhibit no pattern and be close to the PRM prior distribution. We illustrate this by fitting a non-seasonal model to simulated seasonal case count data.
Applied to the 2013-2014 Zika epidemic in French Polynesia, our approach shows that a simple SEIR model cannot correctly reproduce both the initial sharp increase in the number of cases and the final proportion of seropositive hosts.
PRM-augmentation thus provides a coherent story for Stochastic Epidemic Model inference, where explicitly inferring process stochasticity helps with model validation.
Stochasticity plays an important role in infectious disease dynamics, but statistical inference of stochastic models is difficult. When trying to estimate the parameters of a complex stochastic epidemic model, the data are typically an incomplete observation of the process, and the observed-data likelihood is intractable. To work around this problem, one can decide not to use the likelihood (ABC, deep learning), to approximate the model (linear noise approximations), to estimate the likelihood (PMCMC [2, 3]), or to rephrase the problem.
Data Augmentation (DA) introduces the alternative problem of estimating the joint posterior of the parameters and a latent variable, where the latent variable is chosen so that the joint likelihood is tractable. The original problem then finds its solution by marginalizing over the latent variable. Data Augmentation has proven effective with MCMC [4, 5, 6] for dealing with granular, subject-level data and complex models.
An important difficulty with DA is to explore the high-dimensional augmented state space effectively with the MCMC proposal. Each class of models requires designing specific proposals, and typically the proposal must be tuned manually for the specific problem at hand. For instance in BEAST [5, 7], for which the latent state is the reconstructed phylogeny, many different moves are implemented to explore tree space effectively, and one is chosen at every iteration with certain probabilities. For a given model and dataset, those probabilities must be tuned manually to achieve good mixing.
Additionally, it is challenging to design a proposal such that mixing speed scales well with population size, which becomes necessary when dealing with even moderately large epidemics.
We propose a new data augmentation scheme to fit simple models to low granularity data – like incidence or prevalence data. This scheme is directly applicable to the large class of Markov jump processes.
Pure jump Markov processes are widely used for modelling infectious diseases and ecological and evolutionary systems. It is possible to represent them as solutions of Poisson-driven stochastic differential equations, which are integral equations with respect to a Poisson Random Measure (PRM). This indicates a natural data augmentation parametrization consisting of the parameters of the process and the discrete measure that the process is integrated against. A parameter value and a PRM realisation are enough to deterministically simulate the trajectory of the process, and so the joint likelihood of the parameters and the PRM is easily computed as the probability of observing the data given the trajectory. Throughout the article, we will call this method PRM augmentation.
In this article, we present PRM augmentation used with Metropolis-Hastings sampling [15, 16], discuss its advantages and drawbacks in terms of ease of use, speed, and insight, and illustrate it on synthetic and real datasets.
PRM augmentation relies on a generalization of uniformization [17, 12, 14] that is directly applicable to any Markov jump process, with writing the Stochastic Differential Equation (SDE) being the only mathematical work needed. As such, it is a simulation-based method, which can be used easily to compare different models.
We show how PRM augmentation compares to a recent data augmentation method, and exhibits better complexity with respect to epidemic size.
Additionally, the natural distinction that is made between the mechanistic part, estimated through the parameters, and the stochastic part, estimated through the PRM, provides a way to evaluate model fit and to propose model improvements via the posterior PRM estimates, similarly to Bayesian latent residuals.
To demonstrate this, we apply the method to simulated data, and to 2013-2014 Zika data from French Polynesia.
2.1 Integration of a Markov pure jump process w.r.t. a Poisson Random Measure
Continuous-time Markov chains, or Markov pure jump processes, are stochastic processes with the Markov property whose state is constant between jumps, which occur at exponentially distributed times. The trajectories of such processes are right continuous with left limits, and piecewise constant. A complete presentation can be found in standard references. Such processes are widely used in stochastic modelling, notably in pomp or BEAST.
Poisson random measures (PRM) generalize Poisson processes to more general spaces, as random variables taking discrete measures as values. For a PRM, the number of points found in any measurable subset follows a Poisson distribution, and the numbers of points found in disjoint measurable subsets are independent.
A Markov pure jump process can be formulated as the solution to a Poisson-driven stochastic differential equation (SDE). This is an established result, and can be seen as a generalization of uniformization. We provide a proof in the appendix LABEL:app:proof that solutions to the equation of interest here are càdlàg and Markov, and have the right infinitesimal generator.
The SDE hints at a natural deterministic algorithm to simulate a trajectory of the process exactly, given a realisation of the PRM. Examine the points of the PRM ordered by time; for each successive point, compute the rate of the event corresponding to that point; if the height of the point is below the rate, then accept the event and update the state accordingly, and if not, reject the event (the state remains the same). This iteratively draws a trajectory, with jumps that can only be positioned at points of the PRM, as illustrated in sim_exact_prm, and following algorithm 1.
A parameter value and a PRM value uniquely determine the trajectory of the process, and thus are enough to compute the likelihood of observing the data.
The a priori independence of the parameters and the PRM helps mixing in the case of weakly informative data, provided we are able to simulate the process efficiently.
2.2 Simulation of a Markov Jump Process
We consider a Markov Jump Process (MJP) with a discrete state space and a finite number of event types. Each event happens with a state-dependent rate, and when it happens, the state jumps to a new value. The driving noises are discrete measures, one independent PRM per event type. integr is the Poisson-driven SDE defining the process.
This equation specifies a way to simulate the process. Each point of the PRMs is a potential event point. Consider all points of the PRMs ordered by time, and for each one, compare the event rate to the height coordinate of the point. If the point is below the rate, then the event happens, and we update the state of the system and the rates. Otherwise nothing happens, and we continue on to the next point. This yields the simplified algorithm 1.
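As a concrete illustration, here is a minimal sketch of this accept/reject loop in Python (the paper's implementation is in OCaml; all names here are hypothetical). For simplicity it draws candidate points under a fixed upper bound, as in classical uniformization, whereas the algorithm described below adapts the bound via rectangles:

```python
import random

def simulate_mjp(x0, events, t_max, bound, seed=0):
    """Simulate a Markov jump process by thinning PRM points.

    `events` is a list of (rate, jump) pairs: rate(x) gives the event's
    current rate, jump(x) the state after the event.  Candidate points for
    each event are drawn on [0, t_max] x [0, bound]; a point whose height
    falls below the current rate is accepted as a jump.  `bound` must
    dominate every rate along the trajectory.
    """
    rng = random.Random(seed)
    points = []
    for e in range(len(events)):
        t = 0.0
        while True:  # candidate times: Poisson process of intensity `bound`
            t += rng.expovariate(bound)
            if t > t_max:
                break
            points.append((t, rng.uniform(0.0, bound), e))
    points.sort()  # examine all candidate points ordered by time
    x = x0
    trajectory = [(0.0, x)]
    for t, height, e in points:
        rate, jump = events[e]
        if height < rate(x):        # accept: the event happens at time t
            x = jump(x)
            trajectory.append((t, x))
        # otherwise reject: the state is unchanged
    return trajectory
```

For an SIR model, `events` would hold the infection and recovery rates and jumps. Crucially, the same seed (standing in for a fixed PRM realisation) always reproduces the same trajectory, which is the determinism the augmentation relies on.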
There is an obvious problem with algorithm 1, however. The PRMs have an infinite number of points on every time interval, so we cannot order all of their points by time. Instead we split the domain into rectangles, stacked within each time column. In each rectangle, the number of points of the PRM is finite almost surely, and the points can be ordered by time. Equation integr:pieces is equivalent to integr
We never need the rectangles that are above the rate. We can use this fact to simulate the process with finite memory, by lazily drawing points for the rectangles only when they are needed. This exact algorithm is given in the appendix LABEL:app:implem, as algorithm LABEL:algo:app:exact.
This exact algorithm runs in time linear in the number of points that are considered. As a consequence, an important factor in its efficiency is the proportion of points that are considered but rejected. This is also true of classical uniformization, for which a constant upper bound must be chosen in advance. In contrast, here, the algorithm automatically adapts the upper bound to the variations of the rates, so that the rejection rate can be kept small.
What's more, we can further reduce the complexity by using an approximate simulation algorithm. To do that, we take inspiration from tau-leaping algorithms, and treat the rates as if they were constant on each time interval of the grid. We then count the points below the rate in each interval. We store the number of points present in each rectangle for each event to make this faster. The only points left to consider one by one are those from the single rectangle whose height band contains the rate. This approximate algorithm is written out in the appendix LABEL:app:implem, as algorithm LABEL:algo:app:approx.
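A minimal Python sketch of this counting idea, for a single event whose rate is frozen on each column (hypothetical names; the real algorithm keeps per-rectangle counts for every event):

```python
import math
import random

def poisson_sample(rng, lam):
    """Knuth's Poisson sampler; adequate for moderate lam in a sketch."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def approx_event_counts(rate_fn, t_max, dt, band, seed=0):
    """Count accepted events per time column [t, t + dt) x [0, band].

    Each column holds a Poisson(band * dt) number of PRM points with
    independent uniform heights; with the rate frozen at rate_fn(t), each
    point is accepted independently with probability rate / band, so the
    points below the rate can be counted without examining them one by one.
    """
    rng = random.Random(seed)
    counts = []
    t = 0.0
    while t < t_max:
        n_points = poisson_sample(rng, band * dt)
        p_accept = min(rate_fn(t), band) / band
        counts.append(sum(1 for _ in range(n_points)
                          if rng.random() < p_accept))
        t += dt
    return counts
```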
Once equipped with efficient simulations, we can use this PRM augmentation scheme for parameter inference with MCMC.
2.3 PRM-augmented MCMC
We want to estimate the posterior distribution of the parameter values and of the discrete measure of the model, given the data. The prior factorizes into the prior distribution placed on the parameters and the PRM distribution.
We can use the Metropolis-Hastings algorithm [15, 16], in the same way as for a deterministic system. At every iteration of the MCMC, one draws new parameter and PRM values from the proposal, simulates the corresponding trajectory with the exact or approximate algorithm, computes the likelihood of observing the data, then accepts or rejects the move according to the Metropolis-Hastings ratio.
For a PRM, points in disjoint subsets are independent. Thus, given a PRM sample, we can choose a subset of its domain, erase its points and redraw new points from the PRM distribution, and obtain a new (correlated) PRM sample. This proposal is reversible with respect to the PRM prior, and thus can be used for MCMC. More details are given in the appendix LABEL:app:mcmc. This move can be combined with any proposal for the parameters to form a Metropolis-Hastings proposal, as shown in algorithm 2.
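Such a sampler can be sketched as follows in Python (hypothetical names; here the PRM sample is a list of per-column point sets, the parameter move is a symmetric random walk under a flat prior, and a toy log-likelihood stands in for the simulate-then-observe likelihood of the paper). Because redrawing columns from the prior is reversible with respect to the PRM prior, the prior density of the discrete measure cancels from the acceptance ratio:

```python
import math
import random

def poisson_sample(rng, lam):
    """Knuth's Poisson sampler, fine for small lam."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def draw_column(rng, width=1.0, band=2.0):
    """Draw one time column of a unit-intensity PRM from its prior."""
    n = poisson_sample(rng, width * band)
    return [(rng.uniform(0.0, width), rng.uniform(0.0, band)) for _ in range(n)]

def prm_mh(loglik, n_cols, n_iter, step=1.0, n_redraw=2, seed=0):
    """PRM-augmented Metropolis-Hastings sketch.

    Proposal: a symmetric random-walk move on theta, combined with erasing
    and redrawing the points of `n_redraw` randomly chosen columns from the
    PRM prior.  The acceptance ratio then reduces to the likelihood ratio
    (flat theta prior assumed).
    """
    rng = random.Random(seed)
    theta = 0.0
    xi = [draw_column(rng) for _ in range(n_cols)]
    current = loglik(theta, xi)
    samples = []
    for _ in range(n_iter):
        theta_new = theta + rng.gauss(0.0, step)
        xi_new = list(xi)
        for j in rng.sample(range(n_cols), n_redraw):
            xi_new[j] = draw_column(rng)      # erase and redraw this column
        proposed = loglik(theta_new, xi_new)
        if math.log(rng.random()) < proposed - current:
            theta, xi, current = theta_new, xi_new, proposed
        samples.append(theta)
    return samples
```

In the actual method, `loglik` would simulate the trajectory deterministically from (theta, xi) and score the data against it.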
2.4 Stochastic differential equation for a SIR model
To illustrate our method, we show how to apply it to the classical SIR model with case count data, and give both the deterministic sir:ode and stochastic sir:prm formulations of the model here, to show the correspondence between them.
S is the number of susceptible hosts, I the number of infected hosts, and R the number of removed hosts; the model also tracks the total number of cases, observed upon infection, up to the current time.
The effective contact rate sets the rate of infection, the recovery rate is the inverse of the duration of infection, and the reporting probability is the probability that when an infection happens, we observe it.
To make the notation lighter in the SDE, we define the total rate of infection. We also let two independent PRMs on the positive quadrant, with the Lebesgue measure as intensity, drive the events of infection and recovery respectively.
The equations in sir:ode and sir:prm and their terms are written in the same order so as to show the correspondence between them.
We also use more complex versions of the SIR model, by making the effective contact rate seasonal, by adding host demography, immigration, immunity loss (SIRS) or a latent class (SEIR). To obtain the corresponding SDE, we only need to add a new PRM for every event, and adapt the event rates.
For example, to add host demography, we add the four events of birth of a susceptible, death of a susceptible, death of an infectious, and death of a removed host, each with its rate and its own independent PRM.
To make the model seasonal, we would replace the constant effective contact rate by a periodic function of time.
The equations for these different models are given in the appendix LABEL:app:results.
To perform inference on the model, we need an observation model to define the likelihood. For incidence data, the observations are the numbers of cases in each time interval, and for inference we consider that the case count in each interval follows a negative binomial distribution. The additional degree of freedom compared with the more natural Poisson distribution helps the fitting on real data. For prevalence data, we consider that the prevalence observed at each observation time follows a binomial distribution. In both cases, the observations are independent conditionally on the trajectory, and the full likelihood is simply the product of the individual likelihoods.
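As an illustration, the incidence part of the likelihood can be sketched as follows (Python, hypothetical names; the paper does not state its exact parametrization, so we assume the common mean/size parametrization of the negative binomial, where the size parameter r supplies the extra degree of freedom and the distribution tends to the Poisson as r grows):

```python
import math

def nbinom_logpmf(k, mean, size):
    """Negative binomial log-pmf with the given mean and size r;
    variance = mean + mean**2 / r, so a small r means overdispersion."""
    r = size
    return (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
            + r * math.log(r / (r + mean))
            + k * math.log(mean / (r + mean)))

def incidence_loglik(observed, simulated_cases, reporting_prob, size):
    """Log-likelihood of observed interval case counts given the new cases
    of the simulated trajectory: one independent negative binomial term per
    interval, with mean (reporting probability) * (simulated new cases)."""
    return sum(nbinom_logpmf(y, max(reporting_prob * c, 1e-12), size)
               for y, c in zip(observed, simulated_cases))
```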
We will also apply PRM-augmented MCMC to the simple SIR model with no seasonality, host vitality, or immigration, with prevalence data, in which case the observations follow a binomial distribution.
3 Materials and Methods
3.1 OCaml implementation
3.2 Comparison of inference methods
For the comparisons, we simulate daily prevalence data with the simple SIR model over a one-month duration, for four increasing population sizes. We estimate the base effective contact rate and the initial conditions.
We start the MCMC from the target parameter values, so as not to confound the comparison with differences in convergence speed between methods.
The different methods we compare are the PRM-augmented MCMC with exact simulations, the PRM-augmented MCMC with approximate simulations, and a subject-level data augmentation MCMC with Gibbs sampling.
For the PRM-augmented MCMC, we first sample from the prior for a fixed number of iterations, then continue with our custom proposal.
The moves of the Gibbs sampler in process space consist of redrawing the infectious history of one subject, conditional on the data and on the histories of the other subjects. To maintain mixing as the population size increases, we keep the number of histories redrawn per iteration proportional to the population size.
As a measure of computational complexity for the different estimation procedures, we compute the amount of computational time it takes to obtain one effective sample from the posterior.
The multi-dimensional effective sample size [24] is defined as

mESS = n ( det(Λ) / det(Σ) )^(1/d)

where n is the number of samples, d the number of dimensions estimated, det the determinant, Λ the covariance structure of the posterior, and Σ the asymptotic covariance matrix of the Markov chain. Λ can be estimated with the sample covariance matrix, and Σ with the batch means covariance matrix.
The space we explore is infinite-dimensional, so this definition is not directly applicable. Instead, we use the values the process takes at the datapoints. The trajectories of the simple SIR model live in the 2-simplex, so each datapoint contributes two dimensions, to which we add the estimated parameters and initial conditions.
3.3 SEIRS seasonal model simulation and inference
Multiple years of data are generated with a deterministic SEIRS model with seasonality and vitality. Upon infection, hosts go through the Susceptible, Exposed, Infectious, and Removed classes, and then back to the Susceptible class. The equations of the model and the parameter values used are given in the appendix LABEL:app:diagnosis. Parameter values are chosen so that the data display a clear periodic signal with a period of one year, from the attractor of the system. We also use a constant SEIRS model, that is, the same model but with a constant contact rate.
The data are then fitted with the stochastic seasonal model and with the stochastic constant model, by PRM-augmented MCMC.
3.4 French Polynesia Zika data
The French Polynesia Zika dataset is composed of weekly case counts for Moorea island and of seroprevalence data from a cohort of people from the archipelago. The case count data has been made available in [25, 26], and the seroprevalence data in [27, 26].
We fit a SEIR model with immigration to the data with PRM-augmented MCMC. The seroprevalence likelihood is binomial with probability the removed proportion.
The details of the model and the prior distributions are given in the appendix LABEL:app:zika.
The PRM-augmented MCMC is first run for convergence and for adaptation of the proposal covariance matrix, then for a longer run from which samples are kept.
4.1 Efficiency of PRM-augmented MCMC
The comparison between exact and approximate simulations in complex:exact shows that both simulation schemes scale with population size in the same order, but that the approximate scheme is much more economical: in this example, at the largest population size, a million iterations took markedly less wall-clock time with the approximate method than with the exact method. As population size increases, stochasticity plays a less important role, and it becomes increasingly advantageous to use the approximate simulations. For small populations and with high time resolution, however, it might be better to use exact simulations to avoid the bias caused by the approximation.
As both schemes rely on the same data structure for the PRM points, it is easy to switch from one to the other, and it would be straightforward to switch between them when required during estimation.
The PRM-augmented MCMC also proved much more efficient than the subject-level data augmentation method, as shown in benchmark. The Gibbs sampler mixes much better than PRM-augmented MCMC when looking only at the mESS (section 3.2) per iteration, yielding far more effective samples per iteration. However, every one of its iterations is much more costly. In theory, the complexity of an iteration of subject-level augmentation scales with the square of the population size: resampling a subject's history grows linearly with the number of events, and the number of histories to redraw per iteration also grows linearly with population size to maintain mixing. This quickly becomes problematic, even with moderate population sizes.
In contrast, the mixing of PRM-augmentation does not vary much with population size; only the cost of an iteration does. In practice, a linear regression of computational cost against population size indicates a larger exponent for subject-level augmentation than for PRM-augmentation.
An example MCMC estimation is presented in the appendix LABEL:app:compare and app:benchmark:estim.
4.2 Model diagnosis
To see the Markov jump process as the solution of an SDE with respect to a PRM is to create a separation between the noise driving the process, contained in the PRM, and the mechanism of the process, described by its parameters. By inferring the parameters and the PRM jointly, then, we hope to capture the mechanistic part in the former, and the noise part in the latter, provided that the process explains the data well.
The PRM posterior samples act as Bayesian latent residuals. If the posterior distribution of the PRM is very different from its prior distribution, a standard PRM of intensity one, then our hypotheses about the noise in the model are wrong. A pattern found in the PRM posterior means that a pattern in the data is not entirely captured by the proposed mechanism, or, said another way, that the model is underfitting the data. The nature of the pattern in the PRM posterior can then provide clues for improving the model to capture it.
The posterior samples of the PRM can thus be used both to evaluate model fit and to improve the model.
We illustrate this with an example on the SEIRS model, in a seasonal and a constant version (see section 3.3). We generate data with the seasonal model, then compare the fit of the true seasonal model to the data with the fit of the constant model.
We can see in diagnosis (a) that we are able to fit the constant model to the seasonal data. To evaluate the fit, we can look at the posterior samples of the PRM. We define the point density for an event and a time slice as the number of points per unit of volume. The posterior estimates (diagnosis (b)) for the true seasonal model are very similar to the prior. For the constant model, however, the infection point density rises at the start of every epidemic season and falls at its end. For the constant model to reproduce the data, there must be more infections than expected at the start of the epidemic season, and fewer than expected at the end. That is, the seasonal signal (diagnosis (c)) is captured in the PRM posterior, for us to notice and then include in the model.
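The point density itself is a simple count per volume; a minimal Python sketch (hypothetical names), recalling that a unit-intensity PRM has expected density one everywhere:

```python
def point_density(points, t0, t1, band):
    """Point density of a PRM sample in the time slice [t0, t1),
    restricted to heights below `band`: the number of points divided by
    the volume of the slice.  Under the unit-intensity prior this has
    expectation 1; systematic posterior deviations flag model misfit."""
    n = sum(1 for t, height in points if t0 <= t < t1 and height < band)
    return n / ((t1 - t0) * band)
```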
This example is artificial, since the seasonal signal is already very obvious in the data and would be included in the model from the start.
However, it leads to two interesting remarks. First, even though the constant model explains the seasonal data very badly, we are still able to fit it to these data. This shows that the MCMC is able to reach a priori very unlikely discrete measure values, and thus that we explore the discrete measure space well. Second, if we look at the estimates for the other events, we can see that the recovery point density is nearly the mirror image of the infection point density. The model does not really distinguish between more infections and fewer recoveries, that is, between variation in the infection rate and variation in the recovery rate. This shows that this procedure is not a replacement for actually including time-varying parameters, with a prior chosen to reflect how likely we expect those variations to be.
4.3 French Polynesia Zika epidemic
During the Zika epidemic in French Polynesia, in 2013-2014, the case counts indicate that the epidemic progressed very fast, but seroprevalence studies show that only part of the population got infected.
The fit of a simple SEIR model with PRM-augmented MCMC shows that the model cannot correctly capture the data if we restrict ourselves to realistic incubation and infectious periods, as shown in moorea. The PRM posterior distribution contains peaks of point density at the beginning of the epidemic, most notably for the event of becoming infectious, and for the event of infection. This means that the trajectories of the model that fit the data have more infections and shorter incubation periods than expected from intrinsic noise, but only at the start of the epidemic. There is no reason to expect this to happen. Said another way, in the initial period, cases happen faster than the model can follow. More realistically, this could also be explained by an increase in the reporting probability as the epidemic becomes noticed, or by population structure [26, 30].
Epidemiological data is ever more abundant and complex, and can help us better understand the dynamics of infectious diseases, answer important theoretical questions and address public health problems. Existing inference methods, however, are quickly becoming a limiting factor, because of a prohibitive computational cost, a difficulty in applying the method to arbitrary models, or both. The search for new methods progresses in different directions, to build methods that are both faster and more easily transferable to different models.
For MCMC, one such advance has been the development of adaptive methods [31, 32, 28]. They can relieve the user from the burden of designing proposals taking into account the particular structure of the model, as the adaptive method aims to discover that structure automatically.
The adoption of these methods for complex models takes time. In the case of stochastic processes, the problem is complicated by the very large (infinite dimensional) latent variable space to explore.
As a consequence, most data augmentation methods for stochastic processes rely on Gibbs sampling, or on component-wise Metropolis-Hastings updates when Gibbs sampling cannot be achieved.
For Markov jump processes, an efficient sampler based on uniformization was proposed in [12, 14]. They make clever use of uniformization to apply a method for Markov chains to MJPs. This method does not scale well to large state spaces, however, and so it is not practical for stochastic epidemic models.
Instead, data augmentation methods targeted specifically at stochastic epidemic models have been developed over the last 20 years [4, 8, 6, 10, 11]. They typically use component-wise updates, whose effectiveness depends crucially on the degree of posterior independence of the different components. In these methods, the latent variables represent the subject-level disease histories. This is a centered parametrization, as the latent variables are a sufficient statistic for the parameters. By opposition, one can also use a non-centered parametrization, in which the latent variables and the parameters are a priori independent, or, phrased another way, in which the latent variable is an ancillary statistic for the parameters.
In the centered case, when the data bring little information, the parameters and latent variables are very correlated, and separate updates for each must be very conservative. In that case, non-centering can be much more effective, as the parameters and latent variables will be nearly independent a posteriori. If instead the data are very informative, the reverse is true, and the centered parametrization will mix better than the non-centered one.
This question of centering and non-centering is crucial for mixing speed in the case of component-wise updates, and has already been discussed in the literature on stochastic epidemic models.
For typical stochastic epidemic models, it is often possible to re-parametrize the model in terms of uniform random variates, for instance, and thus to obtain a non-centered parametrization. This can be the basis for non-centered simulation-based MCMC. However, the parametrization is ad hoc and must be found by the modeller. In contrast, we believe PRM-augmentation provides a canonical non-centered parametrization, and thus a good choice for low-granularity data such as used here. For the case when the data are more informative, it should be possible to design joint adaptive proposals that can move reasonably far in trajectory space.
Indeed, the increased efficiency compared with the subject-level data augmentation method can in part be attributed to this. The subject-level augmentation is a centered parametrization, and the cost of using Gibbs sampling to obtain new histories is that only one history can be resampled at a time. Another method of conditional resimulation that should scale better with population size has, however, been presented in the literature.
PRM-augmented MCMC presents an interesting tradeoff for moderately large population sizes, and a possible alternative to linear noise approximations or PMCMC. At the same time, the method is easy to use in practice, as it is akin to a simulation-based method that one can directly transpose to new models, with no lengthy and error-prone mathematical derivation or implementation.
It would be interesting to make a quantitative comparison with PMCMC, but it is difficult to do this in a fair manner, as the efficiency of PMCMC crucially depends on the number of particles used.
However, we would like to argue that the main advantage of PRM-augmentation is that it is, in a sense, a natural parametrization of the model, which makes our assumptions about the nature of the stochasticity very clear. It leads to a clear separation between the process mechanism, described by the parameters, and the process noise, described by the PRM, which facilitates interpretation and is very useful for model diagnosis (section 4.2). The procedure is identical to the one presented in [18, 35], with two differences. First, the PRM samples are obtained as part of the inference, and not after; that way, we do not need to invert the map from noise to trajectory (and in our case, that map is not injective). Second, and most importantly, we have a clear meaning for what the latent residuals represent, so that deviations from the prior are easy to interpret.
In conclusion, PRM-augmented MCMC is a practical method for the inference of moderate epidemics where stochasticity cannot be ignored. As a simulation-based method, one can easily apply it to many different models, and the PRM samples obtained make it possible to diagnose, compare, and propose better models.
Declaration of Competing Interest
The authors declare no conflict of interest.
BN and BC have been supported by a grant from Agence Nationale de la Recherche for the PANIC project (Targeting PAthogen’s NIChe: a new approach for infectious diseases control in low-income countries: ANR-14-CE02-0015-01). BN is a doctoral student at the Centre de Recherche Interdisciplinaire, in the Université de Paris.
-  Philip D. O’Neill. A tutorial introduction to Bayesian inference for stochastic epidemic models using Markov chain Monte Carlo methods. Mathematical Biosciences, 180(1-2):103–114, November 2002.
-  Christophe Andrieu, Arnaud Doucet, and Roman Holenstein. Particle Markov chain Monte Carlo methods. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 72(3):269–342, June 2010.
-  Pierre Del Moral, Ajay Jasra, Anthony Lee, Christopher Yau, and Xiaole Zhang. The Alive Particle Filter and Its Use in Particle Markov Chain Monte Carlo. Stochastic Analysis and Applications, 33(6):943–974, November 2015.
-  S. Cauchemez, F. Carrat, C. Viboud, A. J. Valleron, and P. Y. Boëlle. A Bayesian MCMC approach to study transmission of influenza: application to household longitudinal data. Statistics in Medicine, 23(22):3469–3487, November 2004.
-  Alexei J Drummond and Andrew Rambaut. BEAST: Bayesian evolutionary analysis by sampling trees. BMC evolutionary biology, 7(1):214, 2007.
-  Chris P. Jewell, Theodore Kypraios, Peter Neal, and Gareth O. Roberts. Bayesian analysis for emerging infectious diseases. Bayesian Analysis, 4(3):465–496, September 2009.
-  Remco Bouckaert, Joseph Heled, Denise Kühnert, Tim Vaughan, Chieh-Hsi Wu, Dong Xie, Marc A Suchard, Andrew Rambaut, and Alexei J Drummond. BEAST 2: a software platform for Bayesian evolutionary analysis. PLoS Comput Biol, 10(4):e1003537, 2014.
-  Peter Neal and Gareth Roberts. A case study in non-centering for data augmentation: Stochastic epidemics. Statistics and Computing, 15(4):315–327, October 2005.
-  Fei Xiang and Peter Neal. Efficient MCMC for temporal epidemics via parameter reduction. Computational Statistics & Data Analysis, 80:240–250, December 2014.
-  Peter Neal and Chien Lin Terry Huang. Forward Simulation Markov Chain Monte Carlo with Applications to Stochastic Epidemic Models. Scandinavian Journal of Statistics, 42(2):378–396, 2015.
-  Jonathan Fintzi, Xiang Cui, Jon Wakefield, and Vladimir N. Minin. Efficient Data Augmentation for Fitting Stochastic Epidemic Models to Prevalence Data. Journal of Computational and Graphical Statistics, 26(4):918–929, October 2017.
-  Vinayak Rao and Yee Whye Teh. Fast MCMC Sampling for Markov Jump Processes and Extensions. The Journal of Machine Learning Research, 14:26, 2013.
-  Andrew Golightly and Darren J. Wilkinson. Bayesian inference for Markov jump processes with informative observations. Statistical Applications in Genetics and Molecular Biology, 14(2):169–188, 2015.
-  Boqian Zhang and Vinayak Rao. Efficient parameter sampling for Markov jump processes. arXiv:1704.02369 [stat], April 2017.
-  Nicholas Metropolis, Arianna W. Rosenbluth, Marshall N. Rosenbluth, Augusta H. Teller, and Edward Teller. Equation of State Calculations by Fast Computing Machines. The Journal of Chemical Physics, 21(6):1087–1092, June 1953.
-  W. K. Hastings. Monte Carlo sampling methods using Markov chains and their applications. Biometrika, 57(1):97–109, April 1970. Publisher: Oxford Academic.
-  Arne Jensen. Markoff chains as an aid in the study of Markoff processes. Scandinavian Actuarial Journal, 1953(sup1):87–91, January 1953.
-  M. S. Y. Lau, G. Marion, G. Streftaris, and G. J. Gibson. New model diagnostics for spatio-temporal systems in epidemiology and ecology. Journal of The Royal Society Interface, 11(93):20131093–20131093, February 2014.
-  Stewart N. Ethier and Thomas G. Kurtz. Markov Processes: Characterization and Convergence. John Wiley & Sons, September 2009.
-  Aaron A. King, Dao Nguyen, and Edward L. Ionides. Statistical Inference for Partially Observed Markov Processes via the R Package pomp. Journal of Statistical Software, 69(1):1–43, March 2016. Number: 1.
-  Nobuyuki Ikeda and Shinzo Watanabe. Stochastic differential Equations and diffusion processes. Number 24 in North-Holland mathematical Library. North-Holland [u.a.], Amsterdam, 2. ed edition, 1989. OCLC: 20080337.
-  Omiros Papaspiliopoulos, Gareth O. Roberts, and Martin Sköld. A General Framework for the Parametrization of Hierarchical Models. Statistical Science, 22(1):59–73, February 2007.
-  OCaml, an industrial strength programming language.
-  Dootika Vats, James M. Flegal, and Galin L. Jones. Multivariate Output Analysis for Markov chain Monte Carlo. arXiv:1512.07713 [math, stat], December 2015. arXiv: 1512.07713.
-  H. P. Mallet, Anne-Laure Vial, and Didier Musso. Bilan de l'épidémie à virus Zika en Polynésie Française, 2013–2014. Bulletin d'information sanitaires, épidémiologiques et statistiques, pages 20–21, 2015.
-  Clara Champagne, David Georges Salthouse, Richard Paul, Van-Mai Cao-Lormeau, Benjamin Roche, and Bernard Cazelles. Structure in the variability of the basic reproductive number (R0) for Zika epidemics in the Pacific islands. eLife, 5:e19874, November 2016.
-  Maite Aubry, Anita Teissier, Michael Huart, Sébastien Merceron, Jessica Vanhomwegen, Claudine Roche, Anne-Laure Vial, Sylvianne Teururai, Sébastien Sicard, Sylvie Paulous, Philippe Desprès, Jean-Claude Manuguerra, Henri-Pierre Mallet, Didier Musso, Xavier Deparis, and Van-Mai Cao-Lormeau. Zika Virus Seroprevalence, French Polynesia, 2014–2015. Emerging Infectious Diseases, 23(4):669–672, April 2017.
-  Matti Vihola. Robust adaptive Metropolis algorithm with coerced acceptance rate. Statistics and Computing, 22(5):997–1008, September 2012.
-  Bernard Cazelles, Clara Champagne, and Joseph Dureau. Accounting for non-stationarity in epidemiology by embedding time-varying parameters in stochastic models. PLOS Computational Biology, 14(8):e1006211, August 2018.
-  Adam J. Kucharski, Sebastian Funk, Rosalind M. Eggo, Henri-Pierre Mallet, W. John Edmunds, and Eric J. Nilles. Transmission Dynamics of Zika Virus in Island Populations: A Modelling Analysis of the 2013–14 French Polynesia Outbreak. PLOS Neglected Tropical Diseases, 10(5):e0004726, May 2016.
-  Heikki Haario, Eero Saksman, and Johanna Tamminen. An adaptive Metropolis algorithm. Bernoulli, 7(2):223–242, April 2001.
-  Gareth O. Roberts and Jeffrey S. Rosenthal. Examples of Adaptive MCMC. Journal of Computational and Graphical Statistics, 18(2):349–367, January 2009.
-  C. M. Pooley, S. C. Bishop, and G. Marion. Using model-based proposals for fast parameter inference on discrete state space, continuous-time Markov processes. Journal of The Royal Society Interface, 12(107):20150225, June 2015.
-  Jonathan Fintzi, Jon Wakefield, and Vladimir N. Minin. A linear noise approximation for stochastic epidemic models fit to partially observed incidence counts. arXiv:2001.05099 [q-bio, stat], January 2020. arXiv: 2001.05099.
-  Max S. Y. Lau, Bryan T. Grenfell, Colin J. Worby, and Gavin J. Gibson. Model diagnostics and refinement for phylodynamic models. PLOS Computational Biology, 15(4):e1006955, April 2019.