Distributed systems with Markovian interactions can be modelled as continuous-time Markov chains . Examples include randomised population protocols , genetic regulatory networks  and biochemical systems evolving in a spatially homogeneous environment, at constant volume and temperature [31, 28]. For such systems, stochastic modelling is necessary to describe stochastic fluctuations for low/medium population counts that deterministic fluid techniques cannot capture .
A versatile programming language for modelling the behaviour of Markovian distributed systems is that of Stochastic Reaction Networks (SRNs)
, which induce CTMCs under certain mild restrictions. Computing the probability distributions of the species of a SRN over time is achieved by solving the Kolmogorov Equation, also known in the biochemical literature as the Chemical Master Equation (CME). Unfortunately, classical numerical solution methods for computing transient probability based on uniformisation  are often infeasible because of the state space explosion problem, that is, the number of states of the resulting Markov chain grows exponentially with respect to the number of species and may be infinite. A more scalable transient analysis can be achieved by employing simulations combined with statistical inference , but to obtain good accuracy large numbers of simulations are needed, which for some systems can be very time consuming.
A promising approach, which we explore in this paper, is to instead approximate the CTMC induced by a Stochastic Reaction Network as a continuous-space stochastic process by means of the Central Limit Approximation (CLA) , also known in statistical physics as the Linear Noise Approximation (LNA). That is, a Gaussian process is derived to approximate the original CTMC . As the marginals of a Gaussian process are fully determined by its expectation and covariances, its solution requires solving a number of differential equations that is quadratic in the number of species and independent of the population size. As a consequence, the CLA is generally much more scalable than a discrete-state stochastic representation and has been successfully used for analysis of large Stochastic Reaction Networks [22, 18, 23, 21]. However, none of these works enables the computation of complex temporal properties such as global probabilistic reachability properties, which quantify the probability of reaching a particular region of the state space in a particular time interval. This property is fundamental for verification of more complex temporal logic properties, for example probabilistic until properties, where the probability of reaching a certain region within a certain time bound while remaining in another region is quantified. Such properties can be expressed in Continuous Stochastic Logic (CSL)  or Linear Temporal Logic (LTL) , whose formulae are verified by reduction to the computation of the reachability properties .
We derive fast and scalable approximate probabilistic model checking algorithms for CTMCs induced by Stochastic Reaction Networks against a time-bounded fragment of CSL extended with reward operators. Our model checking algorithms are numerical and explore a continuous-space approximation of the CTMC in terms of a Gaussian process. One of our key results is a novel scalable algorithm for computing probabilistic reachability for Gaussian processes over target regions of the state space that are assumed to be convex polytopes, i.e. intersections of a finite set of linear inequalities. More specifically, for a CTMC approximated as a Gaussian process, the resulting algorithm computes the probability that the system falls in the target region within a specified time interval. Given a set of
linear inequalities, and relying on the fact that a linear combination of the components of a Gaussian distribution is still Gaussian, we discretize time and space for the-dimensional stochastic process defined by the particular linear combinations. This allows us to derive an abstraction in terms of a time-inhomogeneous discrete-time Markov chain (DTMC), whose dimension is independent of the number of species, since a linear combination is a uni-dimensional entity. The method ensures scalability, as in general we are interested in a small number, i.e., one or at most two, of linear inequalities. This abstraction is then used to perform model checking of time-bounded CSL properties [37, 9]. To compute such an abstraction, the most delicate aspect is to derive equations for the transition kernel of the resulting DTMC. This is formulated as the conditional probability at the next discrete time step given the system in a particular state. Reachability probabilities are then computed by making the target set absorbing. We then extend CSL with the reward operators as in . We derive approximate reward measures for such operators using the CLA, and prove the convergence in distribution of our algorithms to the original measures when the size of the system (number of molecules) tends to infinity. We show the effectiveness of our approach on a set of case studies taken from the biological literature, also in cases where existing numerical model checking techniques are infeasible.
A preliminary version of this work has appeared in . This paper extends  in several aspects. While in  we only consider probabilistic reachability, here we generalise our algorithms to the time-bounded fragment of CSL, which we also extend with reward operators. Furthermore, we prove weak convergence of our algorithms and significantly extend the experimental evaluation.
1.0.2 Related work.
Algorithms for model checking CSL properties for continuous-time Markov chains have been introduced and then improved with techniques based on uniformization  (essentially a discretisation of the original CTMC), and reward computation . The analysis typically involves computing the transient probability of the system residing in a state at a given time, or, for a model annotated with rewards, the expected reward that can be obtained. Despite improvements such as symmetry reduction , sliding window  and fast adaptive uniformisation , their practical use for Stochastic Reaction Networks is severely hindered by state space explosion , which in a SRN grows exponentially with the number of molecules when finite, and may be infinite, in which case finite projection methods have to be used 
. As a consequence, approximate but faster algorithms are appealing. The mainstream solution is to rely on simulations combined with statistical inference to obtain estimates[38, 20]. These methods, however, are still computationally expensive. A recent trend of works explored as an alternative whether estimates could be obtained by relying on approximations of the stochastic process based on mean-field  or linear noise [19, 18, 22]. However, CSL and some classes of reward properties, like those considered here, are very challenging. In fact, most approaches consider either local properties of individual molecules , or properties obtained by observing the behaviour of individual molecules and restricting the target region to an absorbing subspace of the (modified) model . The only approach dealing with more general subsets, , imposes restrictions on the behaviour of the mean-field approximation, whose trajectory has to enter the reachability region in a finite time. Another interesting approach has been developed in [47, 42]
, where model checking of time-bounded properties for CTMCs is expressed as a Bayesian inference problem, and approximated model checking algorithms are derived. However, no guarantees on the convergence of the resulting algorithms is given.
Our approach differs in that it is based on the CLA and considers regions defined by polytopes, which encompasses most properties of practical interest. The simplest idea would be to consider the CLA and compute reachability probabilities for this stochastic process, invoking convergence theorems for the CLA to prove the asymptotic correctness. Unfortunately, there is no straightforward way to do this, since dealing with a continuous space and continuous time diffusion process, e.g., Gaussian, is computationally hard, and computing reachability is challenging (see ). As a consequence, discrete abstractions are appealing.
Stochastic Reaction Networks. A Stochastic Reaction Network (SRN) is a pair of finite sets, where is a set of species, denotes its size, and is a set of reactions. Species interact according to the reactions in . A reaction is a triple , where is the reactant complex, is the product complex and is the reaction rate associated to . and represent the stoichiometry of reactants and products. Given a reaction , where
is the transpose of a vector, we often refer to it as. The state change associated to a reaction is defined by . For example, for as above, we have . A configuration or state of the system is given by a vector of the number of molecules of each species. Given a configuration then represents the number of molecules of in the configuration and is the concentration or density of in the same configuration, where is the population system size, which for molecular systems may represent the volume of the solution, and otherwise it is typically the total population count.
Stochastic Reaction Networks are a versatile programming language used to model stochastic evolution of populations of indistinguishable agents, where the species represent the states of the agents. They are relevant not only for modelling of biochemical systems, such as genetic regulatory networks, molecular signalling pathways and DNA computing circuits, but also certain classes of stochastic distributed systems due to their equivalence to Petri nets , Vector Addition Systems (VAS)  and distributed population protocols .
As a running example we consider the following simple model of gene expression , where the mRNA is produced by an always active promoter, and then catalyzes the production of the protein. We have and the following set of reactions :
2.1 Stochastic Semantics of Stochastic Reaction Networks
Under the well-mixed assumption , a Stochastic Reaction Network induces a discrete-state Markov process. For a reaction , is also called the propensity rate of reaction and is a function of the current configuration of the system, such that is the probability that a reaction event occurs in the next time interval . For instance, in case of mass action kinetics, , where is the factorial of , and is the component of vector relative to species . In this paper we assume is a real analytic function , that is, a function that locally coincides with its Taylor expansion. This is not restrictive, as it includes all the more commonly used kinetics such as mass action or Hill. We also require that the SRN satisfies the density dependent rate condition111Note that this condition is not strictly necessary for our results, but guarantees a simpler form for equations ., that is, for any , there exists a function such that for it holds that where represents the concentration of the species in in configuration . Consequently, a SRN is modelled in terms of a time-homogeneous continuous-time Markov chain (CTMC)  with state space given by the set of possible configurations of the system, where in we made explicit the dependence on the system size . Thus, is a random vector describing the population count of each species at time . Given , we denote by the CTMC describing the evolution of the species in in terms of concentrations. The transient evolution of , and consequently also of the concentrations , is described by the Kolmogorov equations, also called the Chemical Master Equation (CME), namely, a set of differential equations describing the transient evolution of the reachable states .
(Kolmogorov Equations) Let be the initial configuration of . For , we define . describes the transient evolution of
, and is the solution of the following system of ordinary differential equations (ODEs):
Solving Eqn (1) requires computing the solution of a differential equation for each reachable state. The size of the reachable state space is exponential in the number of the species, and may be infinite. As a consequence, solving the CME is generally feasible only for SRNs with very few species and small molecular populations. This is the so-called state space explosion problem, which strongly limits the applicability of the CME in practice. Finite projection methods have been developed to numerically solve Eqn (1) when the state space is not finite . However, they still suffer from the state space explosion problem and are limited to SRNs with few species and moderate population counts.
Often, Eqn (1) is a approximated with a deterministic model using fluid techniques , where the concentrations of the species are approximated over time as the solution of the following set of ODEs, the so-called rate equations:
where in case of mass action kinetics we have , for the i-th component of vector raised to the power of , i-th component of vector . The initial condition is . Eqn (2) converges to when the system size, tends to infinity . However, Eqn (2) completely neglects the stochastic fluctuations, which may be essential to understand the behaviour of the system being modelled .
Consider the SRN introduced in Example 1. Then, for , we have that
is a random variable describing the number of molecules in the system at time. Given an initial condition , the state space of is given by the set of states reachable from . That is, for any there is a sequence of reactions such that Note that the presence of the reaction implies that, in this example, is not finite. Thus, most of the techniques commonly used for model checking CTMCs would not be directly applicable in this case . describes the evolution of mRNA and Pro in terms of concentrations.
2.2 Central Limit Approximation
The Central Limit Approximation (CLA), also called the Linear Noise Approximation (LNA), is a continuous-space
approximation of the CTMC in terms of a Gaussian process based on the Central Limit theorem[50, 28].
The CLA at time approximates the distribution of with the distribution of the random vector such that:
where is a random vector, independent of the system size , representing the stochastic fluctuations at time around , the solution of Eqn (2). The probability distribution of is given by the solution of a linear Fokker-Planck equation . As a consequence, for any time instant ,
has a multivariate normal distribution whose expected valueand covariance matrix are the solution of the following differential equations:
where is the Jacobian of , its transpose, and the th component of . We assume with probability ; as a consequence and , which implies for every . The following theorem illustrates the nature of the approximation using the CLA.
Theorem 2.1 ()
In the above, indicates convergence in distribution as the system size parameter increases . The CLA is exact in the limit of high populations, but has also been successfully used in many different scenarios showing surprisingly good results [32, 51]. To compute the CLA it is necessary to solve first order differential equations, and the complexity is independent of the initial number of molecules of each species. Therefore, one can avoid the exploration of the state space that methods based on uniformization rely upon, taking an important step towards scalable stochastic analysis of reaction systems.
By Eqn (3), we have that the distribution of is Gaussian with expected value and covariance matrix given by:
Then, the following standard proposition guarantees that a set of linear combinations of the components of is still Gaussian.
Proposition 1 ()
Let be a matrix and a dimensional Gaussian process. Then, is a m-dimensional Gaussian process. For any , we have that is characterized by the following mean and covariance:
Thus, represents the time evolution of linear combinations of the population counts of the species defined by over time. Importantly, is still a Gaussian process, and hence completely characterized by its mean and covariance matrix. Note also that the distribution of (concentrations) depends on only via its mean and covariance, which are obtained by solving ODEs in Eqns (4) and (5). This is a key feature that we will use to obtain an effective dimensionality reduction in our model checking algorithms.
3 Continuous Stochastic Logic (CSL)
Temporal properties of continuous time Markov chains can be expressed using Continuous Time Stochastic Logic (CSL) , which can thus be used for the CTMC induced from a SRN . We will develop approximate model checking algorithms for CSL based on the CLA. Since the CLA is correct in the limit of diverging system size , we will define CSL for the normalized process as introduced in the previous section. Therefore, we will be working in terms of concentrations instead of population counts. This is not a limitation: if we are interested in a fixed value of , population counts can always be rescaled to population densities, and vice versa, by dividing or multiplying them by . In the following, we will thus refer to states and concentrations interchangeably without loss of generality.
Given a SRN a path of the induced CTMC is defined as where and for all there exists such that and , where is the density dependent rate. That is, is an alternating sequence of states (equivalently, concentrations) and residence times in those states. Let be the set of all paths of and the set of all paths of starting from . Call the state of the path at time , i.e. where . Then, a probability measure, Prob, for can be defined using cylinder sets of paths . For further details on the measure-theoretic properties we refer to .
Since takes values in , we will work with predicates over concentrations, similarly to how real-time signals are verified in Signal Temporal Logic (STL) , instead of the conventional atomic propositions defined in states of the Markov chain .
(Convex Predicate). Let be a predicate. We call a convex predicate if there exist , such that for it holds that:
Hence, convex predicates are true for concentration belonging to a, not necessarily bounded, convex polytope. We denote by the set of all convex predicates with domain in .
We now define the time-bounded fragment of CSL for SRNs as follows. We do not consider time-unbounded properties because of the nature of the convergence of CLA, which is guaranteed just for finite time. In Section 7 we extend this fragment with reward operators.
(CSL Syntax) Given a SRN , and the induced CTMC , we define the syntax of CSL as:
where , , and .
The above definition slightly differs from the usual definition of CSL in that the reachability () and until () operators work directly with predicates over concentrations, rather state labels. Note also that, in Definition 2, we do not allow nesting of CSL properties, and we restrict predicates to sets that are convex polytopes. This latter point does not limit the expressivity of the logic. However, it is a fundamental requirement for our model checking algorithms, which allows us to obtain an exponential speed up compared to existing algorithms.
Given the SRN of Example 1 for , the property ”is the probability that the concentration of Pro remains below until there is a concentration of mRNA of at least in the first time units greater than ?” can be expressed as:
where with an abuse of notation we call and the components of vector relative to species and . Obviously, this property is equivalent to the following one, but expressed on the rescaled process :
(Semantics of CSL) Let be the CTMC induced by SRN . Given , the semantics of CSL is defined as follows:
Note that the reachability operator can be expressed with the until. For example, is equivalent to . Similarly to classical CSL, can be replaced with in the style of quantitative model checking, indicating the probability of satisfaction .
Model checking procedures for CTMCs against CSL specifications are well known [37, 10]. They reduce to computing the probability of reaching a given set, and hence to solving Eqn (1), albeit resulting in the well known state space explosion problem. Here, we explore the usage of the CLA to derive approximate model checking procedures that converge to the original CTMC values but do not suffer from the state space explosion problem, therefore enabling fast stochastic characterization of the system.
4 The Reachability Operator
In this section we define our CLA-based algorithm to verify the probabilistic reachability operator which is the key procedure for model checking of more complex CSL properties. As is a convex predicate, in order to check this property, for a convex polytope defined as where , we need to compute:
where is the set of paths of starting from as defined in Section 3. We will compute such a probability for , the CLA of expressed in terms of concentrations, and then show how the computed measure converges to the original process , but guaranteeing much greater scalability. Computing the reachability probability for is not straightforward, because the system evolves in continuous time and analytic solutions cannot be derived in general. As a consequence, we need to devise numerical algorithms and prove their correctness. Here, we derive a scalable numerical algorithm based on time and space discretization of linear projections of , and, using properties of Gaussian processes, we then prove the convergence of the algorithm to the original measure.
In order to exploit the CLA, we first discretize time for the Gaussian process given by the CLA, with a fixed (or adaptive) step size , which we can do effectively owing to the Markov property and the knowledge of its mean and covariance. As a result, we obtain a discrete-time, continuous-space, Markov process with a Gaussian transition kernel. Then, by resorting to state space discretization with parameter , we compute the reachability probability on this new process, obtaining an approximation in terms of time-inhomogeneous discrete-time Markov chain (DTMC) converging to the CLA approximation uniformly, when and go to . At first sight, there seems to be little gain, as we now have to deal with a
-dimensional continuous state space. Indeed, for general regions this can be the case. However, if we restrict to regions defined by intersections of linear inequalities (i.e. polytopes), we can exploit properties of Gaussian distributions (i.e. their closure with respect to linear combinations), reducing the dimension of the continuous space to the number of different linear combinations used in the definition of the linear inequalities (in fact, the same hyperplane can be used to fix both an upper and a lower bound). As we are generally interested only in one or few projections, the complexity will then be dramatically reduced.
4.1 Time Discretization Scheme
Given , the CLA of expressed in terms of concentrations, and matrix we introduce an exact time discretization scheme for . For simplicity we assume but all the results extend to Fix a small time step . By sampling at step and invoking the Markov property,222The Gaussian process obtained by the Linear Noise Approximation is Markovian, as it is the solution of a linear Fokker-Planck equation (stochastic differential equation) . we obtain a discrete-time Markov process (DTMP) on continuous space. Applying the linear projection mapping to , and leveraging its Gaussian nature, we obtain a process which is also a DTMP, though with a kernel depending on time through the mean and variance of .
A (time-inhomogeneous) discrete-time Markov process (DTMP) is uniquely defined by a triple , where is a measurable space and is a transition kernel such that, for any , and , is the probability that conditioned on .
From Definition 5, it follows that, for , is a discrete-time stochastic process defined on the sample space given by the product space , endowed with the sigma-algebra, , generated by the product topology, and with a probability measure , which is uniquely defined by the transition kernel and the initial condition .
Thus, in order to characterize , we need to compute its transition kernel, . This is equivalent to computing , i.e. the density function of given the event .
Consider the joint distribution, which is Gaussian. Its projected counterpart is thus also Gaussian, with covariance function:
where is the covariance function of at times and . It follows by the closure properties of Gaussian processes that is Gaussian too, and thus fully characterized by its mean and variance. Hence, we need to derive . From now on, we denote and . Following , we introduce the following matrix differential equation:
with and initial condition , where
is the identity matrix of dimension. Then, as illustrated in , we have:
where is the matrix introduced in Eqn (5). This is an integral equation, which has to be computed numerically. To simplify this task, we derive an equivalent representation in terms of differential equations. This is given by the following lemma.
Solution of Eqn (10) is given by the solution of the following differential equations:
with initial condition computed as the solution of:
can be computed by solving Eqn (9). Knowledge of allows us to directly compute:
Then, by using the law for conditional expectation of a Gaussian distribution, we finally have:
As the kernel is Gaussian, it is completely determined by its expectation and covariance matrix over time. Note that the resulting kernel is time-inhomogeneous. The dependence on time is via the mean and covariance of , which are functions of time and define completely the distribution of . The following result, which is a corollary of Theorem 3 in , guarantees the correctness of the approximation.
Given vector , , measurable set and process with initial condition , call
where is the Gaussian probabilisty measure of the process . Further, let be the DTMP obtained by discretizing at time step Then, for , it holds that
4.2 Space Discretization
In order to compute the reachability probability for the DTMP , we discretize its continuous state space into a countable set of non-overlapping cells (regions) of constant size (except for at most regions of measure , i.e. the boundaries of the cells), obtaining an abstraction in terms of a discrete-time Markov chain with state space . Specifically, given , the state space of , the target set for , we into a grid of cells of length , where defines how fine our space discretization is. For each of the resulting regions we consider a representative point, given by the median of the set. We call the set of representative points . Then, we have where is the state representing the target set. Theorem 4.2 guarantees that for the error introduced by the space discretization tends to zero. However, for a fixed a possible choice of is , which means that the rescaled process takes values in . Nevertheless, when the population is of the order of hundreds or thousands, it can be beneficial to consider , since a coarser state space aggregation is reasonable.
Similarly to the previous section (see Definition 5), as is a discrete-time stochastic process, given we can associate to a probability space with sample space given by the product space , and with a probability measure uniquely defined by , the transition kernel of which is defined as follows. For , is defined as:
where is the discrete time step, assumed to be fixed to simplify the notation. For , we have:
and for the last case, we have:
That is, is made absorbing. Finally, we define: