Probabilistic Model Checking for Continuous Time Markov Chains via Sequential Bayesian Inference

Probabilistic model checking for systems with large or unbounded state space is a challenging computational problem in formal modelling and its applications. Numerical algorithms require an explicit representation of the state space, while statistical approaches require a large number of samples to estimate the desired properties with high confidence. Here, we show how model checking of time-bounded path properties recast exactly as a Bayesian inference problem. In this novel formulation the problem can be efficiently approximated using techniques from machine learning. Our approach is inspired by a recent result in statistical physics which derived closed form differential equations for the first-passage time distribution of stochastic processes. We show on a number of non-trivial case studies that our method achieves both high accuracy and significant computational gains compared to statistical model checking.

Authors

• 10 publications
• 19 publications
• 5 publications
• The Bouquet Algorithm for Model Checking Unbounded Until

The problem of verifying the "Unbounded Until" fragment in temporal logi...
11/24/2019 ∙ by Shiraj Arora, et al. ∙ 0

• Central Limit Model Checking

We consider probabilistic model checking for continuous-time Markov chai...
04/23/2018 ∙ by Luca Bortolussi, et al. ∙ 0

• Model Checking Markov Population Models by Stochastic Approximations

Many complex systems can be described by population models, in which a p...
11/10/2017 ∙ by Luca Bortolussi, et al. ∙ 0

• STAMINA: STochastic Approximate Model-checker for INfinite-state Analysis

Stochastic model checking is a technique for analyzing systems that poss...
06/01/2019 ∙ by Thakur Neupane, et al. ∙ 0

• Efficient Parametric Model Checking Using Domain Knowledge

We introduce an efficient parametric model checking (ePMC) method for th...
12/24/2018 ∙ by Radu Calinescu, et al. ∙ 0

• Statistical Model Checking of Human-Robot Interaction Scenarios

Robots are soon going to be deployed in non-industrial environments. Bef...
07/23/2020 ∙ by Livia Lestingi, et al. ∙ 0

• Symblicit Exploration and Elimination for Probabilistic Model Checking

Binary decision diagrams can compactly represent vast sets of states, mi...
01/08/2020 ∙ by Ernst Moritz Hahn, et al. ∙ 0

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

Probabilistic model checking of temporal logic formulae is a central problem in formal modelling, both from a theoretical and an applicative perspective [Hansson1994, Aziz1996, Aziz2000, Baier2000, Baier2002, Baier2003]. Classical algorithms based on matrix exponentiation and uniformisation are well understood, and form the core routines of mature software tools such as PRISM [kwiatkowska_prism_2011], MRMC [katoen_markov_2005] and UPPAAL [Behrmann2006]. Nevertheless, the need to explicitly represent the state space makes their application to large systems problematic, or, indeed, theoretically impossible in the case of systems with unbounded state spaces, which appear frequently in biological applications.

Statistical model checking (SMC) approaches [Younes2006, Zuliani2010]

have emerged in recent years as a powerful alternative to exact techniques. Such methods provide a Monte Carlo estimate of the desired probability by repeatedly sampling trajectories from the model. SMC can also provide probabilistic guarantees on the estimated probabilities, and, by choosing the number of simulations to be suitably large, one can reduce the uncertainty over the estimates arbitrarily.

While SMC offers a practical and flexible solution in many scenarios, its reliance on repeated simulation of the system makes it naturally computationally intensive. While SMC can be trivially parallelized, the approach can still be computationally onerous for systems which are intrinsically expensive to simulate, such as systems with large agent counts or exhibiting stiff dynamics.

In this paper, we propose an alternative approach to solving the probabilistic model checking problem which draws on a recently proposed technique from statistical physics [Schnoerr2017_arxiv]. We show that the model checking problem is equivalent to a sequential Bayesian computation of the marginal likelihood of an auxiliary observation process. This marginal likelihood yields the desired time-bounded reachability probability, which is closely related to the eventually and globally temporal operators. We also expand the methodology to the case of the time-bounded until operator, thus covering a wide range of properties for temporal logics such as CSL [Aziz1996, Aziz2000, Baier2000, Baier2002, Baier2003]. The formulation of the model checking problem as a Bayesian inference method allows us to utilise efficient and accurate approximation methodologies from the machine learning community. In particular, we combine Assumed Density Filtering (ADF) [Maybeck1982, Minka2001]

with a moment-closure approximation scheme, which enables us to approximate the entire cumulative distribution function (CDF) of the

first

time that a time-bounded until property is satisfied by solving a small set of closed ordinary differential equations and low-dimensional integrals.

The rest of the paper is organised as follows. We discuss the related work in Section 2 and we provide some background material on Markov chains and model checking in Section 3. We then describe our new approach, highlighting both the links and differences to the recently proposed statistical physics method of [Schnoerr2017_arxiv] in Section 4. To illustrate the performance of the method, we consider four non-linear example systems of varying size and stiffness in Section 5, showing that the method is highly accurate and often considerably faster than SMC.

2 Related Work

In recent years, the computational challenges of probabilistic model checking have motivated the development of approaches that rely on stochastic approximations as an alternative to both classical methods and SMC. In one of the earliest attempts, passage-time distributions were approximated by means of fluid analysis [Hayden2012]. This framework was later extended to more general properties expressed as stochastic probes [Hayden2013]. Fluid approximation has also been used to verify CSL properties for individual agents for large population models [Bortolussi2012, Bortolussi2015]. In [Bortolussi2013], a Linear Noise Approximation (LNA) was employed to verify not only local properties of individuals, but also global ones, which are given as the fraction of agents that satisfy a certain local specification. The verification of such local and global properties has been recently generalised for a wider class of stochastic approximations, including moment closure [Bortolussi2017].

Regarding our work, one key difference with respect to these earlier approaches is that we consider global time-bounded until properties that characterise the behaviour of the system at the population level. In that sense, our approach is mostly related to [Bortolussi2014, Bortolussi2016], which rely on the LNA to approximate the probability of global reachability properties. In particular, the LNA is used to obtain a Gaussian approximation for the distribution of the hitting time to the absorbing set [Bortolussi2014]. The methodology is different in [Bortolussi2016], where it is shown that the LNA can be abstracted as a time-inhomogeneous discrete-time Markov chain which can be used to estimate time-bounded reachability properties. However, this method approximates the unconstrained process, and needs to subsequently resort to space and time discretisation to approximate the desired probability.

3 Background

A Continuous-Time Markov Chain (CTMC) is a Markovian (i.e. memoryless) stochastic process that takes values on a countable state space and evolves in continuous time [Durrett2012]. More formally:

Definition 1

A stochastic process is a Continuous-Time Markov Chain if it satisfies the Markov property, i.e. for any :

 p(Xt+h=j∣Xt=i,{Xτ:0≤τ≤t})=p(Xt+h=j∣Xt=i) (1)

A CTMC is fully characterised by its generator matrix , whose entries denote the transition rate from state to state , for any [Norris1997]. The dynamics of a CTMC are fully described by the master equation, which is a system of coupled ordinary differential equations that describe how the probability mass changes over time for each of the states of the system. For a CTMC with generator matrix , the master equation will be:

 dP(t)dt=P(t)Q (2)

where is the transition probability matrix at time ; the quantity denotes the probability to transition from state at time to state at time . The master equation is solved subject to initial conditions .

Throughout this work, we shall consider CTMCs that admit a population

structure, so that we can represent the state of a CTMC as a vector of non-negative integer-valued variables

, that represent population counts for different interacting entities.

3.1 Moment Closure Approximation

For most systems, no analytic solutions to the master equation in (2) are known. If the state space is finite, (2) constitutes a finite system of ordinary differential equations and can be solved by matrix exponentiation. For many systems of practical interest however, is either infinite, or so large that the computational costs of matrix exponentiation become prohibitive.

Moment closure methods constitute an efficient class of approximation methods for certain types of master equations, namely if the elements of the generator matrix are polynomials in the state . This is for example the case for population CTMC of mass action type which are frequently used to model chemical reaction networks [Gardiner2009]. In this case, one can derive ordinary differential equations for the moments of the distribution of the process. Unless the are all polynomials in of order one or smaller, the equation for a moment of a certain order will depend on higher order moments, which means one has to deal with an infinite system of coupled equations. Moment closure methods close this infinite hierarchy of equations by truncating to a certain order. A popular class of moment closure methods does so by assuming to have a certain parametric form [schnoerr2017approximation]. This then allows to express all moments above a certain order in terms of lower order moments and thus to close the equations for these lower order moments.

In this paper, we utilise the so-called normal moment closure

which approximates the solution of the master equation by a multi-variate normal distribution by setting all cumulants of order greater than two to zero

[Goodman1953, schnoerr2014validity, schnoerr2015comparison]. This class of approximations was recently used within a formal modelling context in [Feng2016].

3.2 Probabilistic Model Checking

The problem of probabilistic model checking of CTMCs is defined in the literature as the verification of a CTMC against Continuous Stochastic Logic (CSL) [Aziz1996, Aziz2000, Baier2000, Baier2002, Baier2003]. A CSL expression is evaluated over the states of a CTMC. In the original specification [Aziz1996], the syntax of a CSL formula is described by the grammar:

 ϕ::=tt | α | ¬ϕ | ϕ1∧ϕ2 | P⋈p(Φ)

where is a state-formula, and is a path-formula, i.e. it is evaluated over a random trajectory of the Markov chain. An atomic proposition identifies a subset of the state space; in this paper, we consider atomic propositions to be linear inequalities on population variables. The probabilistic operator allows reasoning about the probabilities of a path-formula :

 Φ::=Xϕ | ϕ1Uϕ2 | ϕ1U[t1,t2]ϕ2

asserts whether the probability that is satisfied meets a certain bound expressed as , where and . In order to evaluate the probabilistic operator, we need to calculate the satisfaction probability for a path-formula , which involves one of three temporal operators: next , unbounded until , and time-bounded until .

For a finite CTMC, it is well-known that evaluating the probability of is reduced to matrix/vector multiplication, while evaluating the unbounded until requires solving a system of linear equations [Baier2000]. The time-bounded until operator can also be evaluated numerically via an iterative method that relies on uniformisation [Baier2000]. This process may have a prohibitive computational cost if the size of the state space is too large. For systems with unbounded state space, the only option to estimate the time-bounded until probabilities is by the means of stochastic simulation [Younes2006, Zuliani2010], which also has a high computational cost.

Other temporal operators can be expressed as special cases of the until operator. For the time-bounded eventually operator we have: , while for the globally operator we have: . The latter two operators formally describe the problem of time-bounded reachability.

4 Methodology

Assuming a property of the form , our goal is to approximate the cumulative probability of being satisfied for the first time at time , that is, the cumulative distribution function for the first-passage time of .

4.1 Time-bounded Reachability as Bayesian Inference

Before discussing the until operator, we shall consider the problem of reachability, which is closely related to the eventually temporal operator . The globally operator can also be formulated as the negation of a reachability problem, as shown in Section 3.2. If denotes the set of states that satisfy the formula , then we are interested in the probability that is reached for the first time; this quantity is also known in the literature as first-passage time.

Building upon [Cseke2013, Cseke2016] Schnoerr et al [Schnoerr2017_arxiv] have recently formulated time-bounded reachability as a Bayesian inference problem. Using this formulation, they proposed a method where the entire distribution of first-passage times can be approximated by taking advantage of some well-established methodologies in the Bayesian inference and statistical physics literature. In the current section, we revise the approach of Schnoerr et al [Schnoerr2017_arxiv] for reachability, while in Section 4.2 we expand the method to the more general case of time-bounded until.

In the Markov chain literature [Norris1997], the states in the set are often called the absorbing states. Let denote the set of non-absorbing states. The cumulative probability for the system to reach an absorbing state at or before time is clearly equal to 1 minus the probability of the system having remained in until . Schnoerr et al’s insight was to formulate this probability in terms of a Bayesian computation problem. Consider an auxiliary binary observation process which evaluates to 1 whenever the system is in the non-absorbing set . The pair constitutes a hidden Markov model (HMM) in continuous time; the required cumulative probability would then correspond to the marginal likelihood of observing a string of all 1s as output of the HMM. Computing this marginal likelihood is a central and well studied problem in machine learning and statistics.

Even in this novel formulation, the problem is generally still intractable. To make progress, we first discretise the time interval into time points with spacing . For the process at time being in we thus have the observation model if and zero otherwise. Note that is the distribution of the observation process , i.e. . The marginal likelihood of having remained in for all factorises as

 Z[0,t]=p(Ct0)N∏i=1p(Cti|C

where we introduced the notation . The factors of the rhs in (3) can be computed iteratively as follows. Let be the initial condition of the process. Suppose that the system did not transition into the absorbing set until time (that is, the process remained in ), and that the state distribution conditioned on this observations is . We can solve the system forward in time up to time to obtain the predictive distribution , which will serve as a prior, and combine it with the likelihood term that the process has remained in at time .

We can then define a posterior over the state space by simply applying the Bayes rule as follows:

 p(xti|C≤ti,x0)=p(Cti|xti)p(xti|C

The likelihood term represents the probability that the process does not leave at time . The prior denotes the state space probability considering that the process had remained in for time . The posterior then will be the state space distribution after observing that the Markov process has remained in at the current step.

Note that the evidence in (4) is just a factor in the rhs of (3). It can be easily obtained by marginalising the joint probability over :

 p(Cti|C

The process described above is a Bayesian formulation for the introduction of absorbing states. By multiplying by the likelihood, we essentially remove the probability mass of transitioning to a state in ; the remaining probability mass (the evidence) is simply the probability of remaining in . Therefore, the probability of transitioning to for the first time at time is the complement of the evidence:

 (6)

Thus, Equation (6) calculates the first-passage time probability for any . Note that this approach neglects the possibility of the process leaving from and returning to region within on time step. The time spacing thus needs to be chosen small enough for this to be a good approximation.

Schnoerr et al [Schnoerr2017_arxiv] further approximated the binary observation likelihood

by a soft, continuous loss function. This allowed them to take the continuum limit of vanishing time steps which in turn allows to approximate the evidence

by solving a set of ODEs. In this work, we keep the binary, discontinuous observation process and keep time discrete, which allows us to extend the framework from [Schnoerr2017_arxiv] to the time-bounded until operator.

4.2 The Time-bounded Until Operator

Consider the time-bounded property which will be satisfied if a state in is reached up to time and the stochastic process has remained in until then. Assuming that is satisfied up to , there are three distinct possibilities regarding the satisfaction of the until property:

• it evaluated as false if we have and simultaneously,

• the property is evaluated as true if ,

• otherwise the satisfaction of the property is undetermined up to time .

These possibilities correspond to three non-overlapping sets of states: , and accordingly, as seen in Figure 1.

In order to calculate the first-passage time probabilities for any time , we assume that the property has not been determined before . That means that the Markov process has remained in the set , which is marked as the grey area in Figure 1. The Bayesian formulation of reachability discussed in Section 4.1 can be naturally applied to the problem of reaching the union . The prior term denotes the state distribution given that the property remained undetermined before . The likelihood term indicates whether the Markov process has transitioned within the non-absorbing set at . Finally, the posterior given by (4) will be the state space distribution after observing that the property has remained undetermined at the last step.

In contrast with the reachability problem however, once the absorbing set is reached, we only know that the formula has been determined, but we do not know whether it has been evaluated as true or false. More specifically, the evidence as given by Equation (5) represents the probability that the satisfaction has remained undetermined at time . Although the negation of the evidence was sufficient to resolve the reachability probability as in Equation (6), now we are interested only in a subset of the absorbing states. At a particular time we have to calculate the probability of reaching explicitly, which is given by the overlap mass of the prior process and probability of transitioning into :

 p(Sϕ2ti|C

Considering the fact that the likelihood is actually a truncation of the state space, as it will be if and otherwise, the first-passage probability at time is given as follows:

 p(Sϕ2ti|C

Considering a Gaussian approximation for , as we discuss in the next section, and given that the state formula is a conjunction of linear inequalities, Equation (8) can be easily calculated by numerical routines.

The Bayesian formulation that we introduce has essentially the same effect as the traditional probabilistic model checking methods [Baier2002]. The probability of the until operator is usually evaluated by first introducing the set of absorbing states , and then calculating the probability of reaching the set , which is a subset of the absorbing states. The advantage of this sequential Bayesian inference formulation is that it allows us to leverage well-established machine learning methodologies, as we see in the section that follows.

4.3 Gaussian Approximation via Assumed Density Filtering

The Bayesian formulation as described in the previous section does not involve any approximation. In fact for a discrete-state system, both the prior and the likelihood terms (i.e.  and equivalently) will be discrete distributions in (4). Therefore, quantities such as the evidence in (5) and the probability of reaching in Equation (7) can be calculated exactly, as the integrals reduce to summations. However, if the size of the state space is too large or unbounded, this process can be computationally prohibitive. The formulation presented above allows us to derive an efficient approximation method that relies on approximating the discrete process by a continuous one.

We adopt a moment closure approximation scheme where all cumulants of order three or larger are set to zero, which corresponds to approximating the single-time distribution of the process by a Gaussian distribution. As described in Section

3.1, the moment closure method results in a system of ODEs that describe the evolution of the expected values and the covariances of the population variables in a given CTMC. At any time , the state distribution is approximated by a Gaussian with mean and covariance :

 p(xti|C

The evidence is the probability mass of non-absorbing states; i.e. it is observed that the process has remained within . Since is identified by linear inequalities on the population variables, both the evidence in Equation (5) and the probability mass in the target set in (7) can be estimated by numerically solving the integral in (8). There are many software routines readily available to calculate the CDF of multivariate Gaussian distributions by numerical means.

Nevertheless, the posterior in Equation (4) is not Gaussian, thus we have to introduce a Gaussian approximation. It is proven that ADF minimises the KL divergence between the true posterior and the approximating distribution, subject to the constraint that the approximating distribution is Gaussian [Maybeck1982, Minka2001]. Considering the prior , the ADF updates [Cseke2016] will be:

 ~μti =μti+Σti∂μtilogZti (9) ~Σti =Σti+Σti∂2μ2tilogZtiΣti (10)

where the evidence is equal to the mass of the truncated Gaussian that corresponds to the non-absorbing states . The dimensionality of the Gaussians is equal to the number of distinct populations in the system; this is generally small, meaning that computations of truncated Gaussian integrals can be carried out efficiently. A detailed exposition can be found in Appendix 0.A.

4.4 Algorithm

Algorithm 1 is an instantiation of model checking via sequential Bayesian inference (MC-SBI). The algorithm evaluates the probability that a property is satisfied for a sequence of time points , thus approximating the CDF of the first time that is satisfied.

In the beginning of each iteration at line 5, we calculate the probability that is satisfied at . At lines 68, we calculate the posterior state distribution, assuming that has not been determined at the current step. Finally, the state distribution is propagated by the moment closure ODEs; the new state probabilities will serve as the prior in the next iteration.

It is useful at this stage to pause and consider the differences from the first-passage time algorithm proposed in [Schnoerr2017_arxiv]: both papers share the same insight that reachability properties can be computed via Bayesian inference. However, the resulting algorithms are quite different. The crucial technical difficulty when considering formulae involving an until operator is the need to evaluate the probability of transitioning into the region identified by the second formula . It is unclear how to incorporate such a computation within the continuous-time differential equations approach of [Schnoerr2017_arxiv], which dictates the choice of pursuing a time discretisation approach here. The time discretisation however brings the additional benefit that we can evaluate exactly the moments of the Bayesian update in step 8 of Algorithm 1, thus removing one of the sources of error in [Schnoerr2017_arxiv] (at a modest computational cost, as the solution of ODEs is generally faster than the iterative approach proposed here).

5 Examples

In this section, we demonstrate the potential of our approach on a number of examples. More specifically, we report for each example the calculated CDF for the time that a formula is first satisfied. Additionally for each until property, we also report the CDF of the first-passage time to the absorbing set; this corresponds to the eventually formula , following the discussion of Section 4.2.

As a baseline reference, we use the PRISM Model Checker [kwiatkowska_prism_2011], which is a well-established tool in the literature. For a time-bounded until property , PRISM is capable of estimating its satisfaction probability by considering the following variation of the probabilistic operator . The result of denotes the probability that has been satisfied at any , thus it can be directly compared to our approach. In particular, PRISM offers numerical verification of time-bounded until properties that relies on the uniformisation method [Baier2000]. We make use of numerical verification when possible, but for more complex models we resort to SMC, as an explicit representation of their state space is practically not possible.

5.1 An epidemiology model

We first consider a SIR model of spreading a contagious disease. The system state is described by a vector of three variables that represent the number of susceptible (), infected (), and recovered () individuals in a population of fixed size. The dynamics of the model are described by the following reactions:

, with rate function ;

, with rate function ;

Considering initial state , the reachable state space as reported by PRISM involves 1271 states and 2451 transitions, which is a number small enough to allow the use of numerical verification.

We consider two properties: the first property states whether the infected population remains under a certain threshold until the extinction of the epidemic:

 φ1=XI<30U[0,t1]XI=0 (11)

where . Also, we consider a property that involves more than one species:

 φ2=XS>1U[0,t2]XI

where

. It is well known that the random variables

will follow a joint Gaussian distribution.

We have used Algorithm 1 to approximate the CDF of the time that and are first satisfied on a sequence of 200 time-points. We have also used the hybrid engine of PRISM in order to produce accurate estimates of the satisfaction probabilities of and , for an respectively.

The calculated CDFs for are summarised in Figure (b)b, while in Figure (a)a we report the CDFs of the first-passage time into its absorbing set. Similarly, the CDFs for are reported in (d)d, and the CDFs of the corresponding absorbing set can be found in Figure (c)c. In both cases the distribution functions calculated by our approach (MC-SBI) is very close to the numerical solutions of PRISM.

5.2 LacZ - A model of prokaryotic gene expression

As a more complicated example, we consider the model of LacZ protein synthesis in E. coli that first appeared in [Kierzek2002] and has been used before as a model checking benchmark [bortolussi:smoothed16]. The full model specification can be found in the appendix.

We are interested in three variables: for the population of ribosomes, which represents the population of translated sequences, and representing the molecules of protein produced. The following property:

 φ3=(XRibosome>0∧XTrRbsLacZ<200)U[0,500]XLacZ>150 (13)

monitors whether both and satisfy certain conditions until the LacZ protein produced reaches a specified threshold (i.e. ). A randomly sampled trajectory can be seen in Figure (a)a.

We have attempted to explore the reachable state space of the model using the hybrid engine of PRISM; that involved more than 26 trillions of states and 217 trillions of transitions. The state space exploration alone lasted nearly six hours and consumed more that 60 GB of memory in a computing cluster. It is fair to state that numerical methodologies can be ruled out for this example. Thus we compare our approach against SMC as implemented in PRISM. We note that 1000 samples were used by the SMC approach; the confidence interval for the results that follow

0.039, based on 99.0% confidence level.

Figure 3 summarises the calculated first-passage time CDFs evaluated on a sequence 200 time-points. In Figure (b)b we see that the moment closure method resulted in a particularly accurate approximation of the first-passage time distribution for the absorbing states. Regarding the distribution of , the results of MC-SBI and PRISM’s SMC seem to be in agreement (Figure (c)c); however that our method overestimates the final probability of satisfying .

5.3 A stiff viral model

Stiffness is a common computational issue in many chemical reaction systems. The problem of stiffness arises when a small number reactions in the system occur much more frequently than others. This small group of fast reactions dominates the computational time, and thus renders simulation particularly expensive.

As an example of a stiff system, we consider the model of viral infection in [Haseltine2002]. The model state is described by four variables: the population of viral template , the viral genome , the viral structural protein , and that captures the number of viruses produced. For the initial state we have , and the rest of the variables are equal to zero. The reactions and the kinetic laws that determine the dynamics of the model can be found in the appendix. An interesting aspect of this model is that its state space is not bounded, therefore we resort to the statistical model checking capabilities of PRISM to evaluate our approach. The SMC used 1000 samples, resulting in confidence interval 0.038, based on 99.0% confidence level.

Figure (a)a depicts a random trajectory that shows the evolution of the viral genome and the virus population over time. We see that slowly increases until it apparently reaches a steady-state and fluctuates around the value , while continues to increase at a non-constant rate. In this example, we shall monitor whether the viral genome remains under the value of until the virus population reaches a certain threshold:

 φ4=XG<200U[0,200]XV>500 (14)

The results of Figure 4 show that our method did not capture the distributions functions as well as in the previous two examples. However, considering that our method is four orders of magnitude faster than statistical model-checking (cf. Table 1), it still gives a reasonably good approximation, particularly in the case of the eventually value. Again, we have considered a sequence of length 200.

5.4 A genetic oscillator

As a final example, we consider the model of a genetic oscillator in [Azunre2011]. The original model is defined in terms of concentrations; in order to properly convert the model specification in terms of molecular populations, we consider a volume . The full model specification can be found in the appendix. We consider an initial state where , and the rest of the variables are equal to . As we can see in the random trajectory in Figure (a)a, the populations of , and approach or exceed the value of . Therefore, we have a system whose state space is simply too large to apply traditional model checking methods that rely on uniformisation.

We shall turn out interest on the variables and ; the following property monitors whether remains under until exceeds the value of :

 φ5=X7<19000U[0,50]X9>24000 (15)

In this example, the CDF has been evaluated on a sequence of length 2000. As a comparison baseline for this example, we use the SMC algorithm in PRISM using 1000 samples, which resulted in confidence interval 0.030, based on 99.0% confidence level. The results in Figure 5 show an accurate approximation of the rather unusual first-passage time distribution functions for both the absorbing states (Figure (b)b) and the property (Figure (c)c).

5.5 A note on the execution times

Table 1 summarises the execution times for our method (MC-SBI) and statistical model checking (SMC). We have used numerical verification as implemented in PRISM for the SIR model only. For the other examples, the state space is simply too large to allow the use of a method that relies on explicit representation, so we report the simulation running times only. Of course, the numerical approach is much faster when this is applicable. However, the computational savings for MC-SBI are obvious for the more complicated examples, in particular the viral model and the genetic oscillator.

In order to derive the moment closure approximations automatically, we have used StochDynTools [HespanhaDec06]. We note that the CDFs have been evaluated on a sequence of 200 time-points for all models except from the genetic oscillator, where 2000 points were used instead.

6 Conclusions

Probabilistic model checking remains one of the central problems in formal methods. As the applications of quantitative modelling extend to more complex systems, scalable techniques for accurate approximation will increasingly play a central role in the deployment of formal methods to practical systems.

Here we presented a novel approach to the classical model checking problem based on a reformulation as a sequential Bayesian inference problem. This reformulation is exact: it was originally suggested in [Schnoerr2017_arxiv] for reachability problems, and was extended in the present work to general CSL formulae including time-bounded Until operators. Apart from its conceptual appeal, this reformulation is important because it enables us to obtain an approximate solution using efficient and highly accurate tools from machine learning. Our method leverages a class of analytical approximations to CTMCs known as moment closures, which enable an efficient computation of the process marginal statistics.

We have shown on a number of diverse case studies that our method achieves excellent accuracy with much reduced computational costs compared to SMC. Nevertheless, our algorithm requires some approximations to the underlying stochastic process. The first approximation is the adoption of a time discretisation; this is a controllable approximation and can be rendered arbitrarily precise by reducing the time step (at a computational cost that grows linearly with the number of steps). The second approximation consists in propagating forward the first two moments of the process via a moment closure approximation. The quality of the approximation in this case is system dependent. Several studies have examined the problem of convergence of moment closure approximations [schnoerr2014validity, schnoerr2015comparison], however, to the best of our knowledge, error bounds for such approximations are an open problem in the mathematics of stochastic processes. Despite such issues, we believe that the reformulation of model checking problems in terms of Bayesian inference has the potential to open the door to a new class of approximate algorithms to attack this classic problem in computer science.

Appendix 0.A Partial derivatives of Gaussian likelihood

In assumed density filtering, the mean and variance updates for a Gaussian approximate distribution are given as follows:

 ~μ =μ+Σ∂μlogZ (16) ~Σ =Σ+Σ∂2μ2logZΣ (17)

The posterior mean and variance are given as function of the prior and the derivative of the logarithm of the likelihood with respect to the prior mean . We show that the derivatives of can be expressed as a sum of partial derivatives of Gaussian distribution functions, which can be easily evaluated either analytically or numerically.

Without loss of generality, we consider the two-dimensional case, i.e. , where we have:

 ∂μlogZ=[∂μxlogZ∂μylogZ] (18)
 ∂2μ2logZ=⎡⎣∂2μ2xlogZ∂2μxμylogZ∂2μyμxlogZ∂2μ2ylogZ⎤⎦ (19)

Therefore, the quantities of interest will be , and , for which we have:

 ∂μxlogZ=∂μxZZ (20)
 ∂2μ2xlogZ=Z∂2μ2xZ−(∂μxZ)2Z2 (21)
 ∂2μxμylogZ=Z∂2μxμyZ−∂μxZ∂μyZZ2 (22)

Thus we have to calculate the likelihood partial derivatives: , and .

For a normal distribution , the evidence is always be a sum of multivariate CDFs. For example, consider a constraint of the form and for the two-dimensional case:

 Z =∫byay∫bxaxN(x,y;μ,Σ)dxdy =F(bx,by;μ,Σ)−F(bx,ay;μ,Σ)−F(ax,by;μ,Σ)+F(ax,ay;μ,Σ)

Finally in order to calculate the ADF updates, we need to calculate the following partial derivatives for the Gaussian CDF: , and .

First-order derivatives

For first-order derivatives of the form we have:

 ∂xF(x,y)=∫y−∞f(x,y)dy=f(x)∫y−∞f(y|x)dy=f(x)F(y|x) (23)

Note that we need the derivative with respect to (rather than ). For a univariate normal distribution we have:

 ∂μxF(x;μx,Σx)=−∂xF(x;μx,Σx)=−f(x;μx,Σx) (24)

Therefore for the bivariate case we have:

 ∂μxF(x,y)=−f(x)F(y|x) (25)

Second-order derivatives

In the case of a bivariate Gaussian distribution, the second-order derivative with respect to both and can be evaluated analytically:

 ∂2μxμyF(x,y) =∂μy(−f(x)F(y|x))=−f(x)∂μyF(y|x) =−f(x)(−f(y|x))=f(x,y)

In the more general case of a multivariate Gaussian distribution, the derivative can also be evaluated analytically:

 ∂2μxμyF(x,y,z) =∂μy(−f(x)F(y,z|x))=−f(x)∂μyF(y,z|x) =−f(x)(−f(y|x)F(z|y,x))=f(x,y)F(z|x,y)

However, there may not always be an analytical form for the second-order derivative with respect to .

 ∂2μ2xF(x,y) =∂μx(−f(x)F(y|x))=−∂μxf(x)F(y|x)−f(x)∂μxF(y|x) (26) =−(x−μx)Σ−1xf(x)F(y|x)−f(x)∂μxF(y|x) (27)

If the random variable is univariate, then it is easy to show that the derivative of its CDF will be:

 ∂μxF(y|x)=ΣxyΣ−1xf(y|x) (28)

In a different case, there is not analytical expression available. Nevertheless, it is reasonable to approximate by means of numerical differentiation.