Parametric model checking (PMC) is a formal technique for the analysis of Markov chains with transition probabilities specified as rational functions over a set of continuous variables. When the analysed Markov chains model software systems, these variables represent configurable parameters of the software or environment parameters unknown until runtime. The properties of Markov chains analysed by PMC are formally expressed in probabilistic computation tree logic (PCTL) extended with rewards, and the results of the analysis are algebraic expressions over the same variables.
In software engineering, Markov chains are used to model the stochastic nature of software aspects including user inputs, execution paths and component failures, and the expressions generated by PMC correspond to reliability, performance and other quality-of-service (QoS) properties of the analysed software. The availability of algebraic expressions for these key QoS properties has multiple applications. First, evaluating the expressions for different parameter values enables the fast comparison of alternative system designs, e.g., in software product lines [30, 31]. Second, self-adaptive software can efficiently evaluate the expressions at runtime, when the unknown environment parameters can be measured and suitable new values for the configuration parameters need to be selected. Third, PMC expressions allow the algebraic calculation of parameter values such that a QoS property satisfies a given constraint. Finally, they enable the precise analysis of the sensitivity of QoS properties to changes in the system parameters.
PMC is supported by the model checkers PARAM, PRISM and Storm. However, despite significant advances in recent years [20, 34, 36], the current PMC techniques (which these model checkers implement) are computationally very expensive, generate expressions that are often extremely large and inefficient to evaluate, and do not support the analysis of parametric Markov chains modelling important classes of software systems.
Our work addresses these major limitations of existing PMC techniques and tools. To this end, we introduce an efficient parametric model checking (ePMC) method that exploits domain-specific modelling patterns, i.e., “fragments” of parametric Markov chains occurring frequently in models of software systems from a domain of interest, and corresponding to typical ways of architecting software components within that domain.
As shown in Fig. 1, ePMC comprises two stages. The first stage is performed only once for each domain that ePMC is applied to. This stage uses domain-expert input to identify modelling patterns for components of systems from the considered domain, and precomputes closed-form expressions for key QoS properties of these patterns.
For example, the modelling patterns for the service-based systems domain (described in detail in Section 6) correspond to different ways in which functionally-equivalent services can be used to execute an operation of the system. One option is to invoke the services sequentially, such that service 1 is always invoked, and each subsequent service is only invoked if the invocations of all preceding services have failed. The component modelling pattern at the top of Fig. 1 depicts this option. The graphical representation of the pattern shows the invocations of the services as a sequence of states, and the successful and failed completion of the operation as states labelled with a tick '✓' and a cross '✗', respectively. QoS properties such as the probability of reaching the success state and the expected execution time and cost of the operation for this pattern can be computed as
where the parameters of the resulting expressions are the probability of successful invocation, the execution time and the cost of each service. As illustrated in Fig. 1, these calculations can be carried out using an existing probabilistic model checker or manually. The resulting expressions are stored in a domain-specific repository, and are used in the next ePMC stage.
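To make the pattern's closed-form expressions concrete, the following minimal Python sketch (our own illustration, with hypothetical parameter lists p, t and c for the services' success probabilities, execution times and costs) computes the three QoS values for a sequential pattern with any number of services:

```python
def seq_pattern_qos(p, t, c):
    """QoS of the sequential pattern: service i+1 is invoked only if
    services 1..i have all failed. Returns (success probability,
    expected execution time, expected cost)."""
    succ = 0.0   # probability the operation succeeds
    time = 0.0   # expected execution time
    cost = 0.0   # expected cost
    reach = 1.0  # probability that the current service is actually invoked
    for pi, ti, ci in zip(p, t, c):
        time += reach * ti
        cost += reach * ci
        succ += reach * pi
        reach *= (1.0 - pi)  # all services so far failed
    return succ, time, cost
```

For two services with success probabilities 0.9 and 0.8, for example, the success probability evaluates to 0.9 + 0.1 · 0.8 = 0.98.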
The second ePMC stage is performed for each structurally different variant of a system and QoS property under analysis. The stage involves the PMC of a parametric Markov chain that models the interactions between the system components. This Markov model can be provided by software engineers with PMC expertise, or can be generated from more general software models, such as UML activity diagrams annotated with probabilities as in [6, 27, 19]. The model states associated with system components are labelled with pattern instances that specify the modelling pattern used for each component and its parameters. For instance, the pattern instance from Fig. 1 labels a component implemented using the sequential pattern described earlier, with services whose success probabilities, costs and mean execution times are supplied as parameters. The pattern-annotated Markov model is analysed by a model checker with pattern manipulation capabilities. The result of the analysis is a set of formulae comprising:
A formula for the system-level QoS property, specified as a function over the component-level QoS property values. This formula is obtained by applying standard PMC to the pattern-annotated Markov model;
Formulae for the relevant component-level QoS properties. These formulae are obtained by instantiating the appropriate closed-form expressions from the domain-specific repository produced in the first ePMC stage.
All ePMC formulae are rational functions that can be efficiently evaluated for any combinations of parameter values, e.g., using tools such as Matlab and Octave.
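For instance, a design-space sweep over such an expression reduces to plain arithmetic. In the Python sketch below, reliability is a hypothetical rational function of two parameters (illustrative only, not one of the paper's expressions):

```python
from itertools import product

# A hypothetical ePMC output: a system's success probability as a
# rational function of a service success probability p and a retry
# probability r (names and formula are illustrative).
def reliability(p, r):
    return p / (1 - r * (1 - p))

# Runtime use: sweep candidate configurations and keep those meeting a
# reliability constraint, with no re-run of the model checker.
candidates = [(p, r)
              for p, r in product((0.8, 0.9, 0.95), (0.0, 0.25, 0.5))
              if reliability(p, r) >= 0.9]
```

Each candidate check is a handful of arithmetic operations, which is what makes runtime self-adaptation over these expressions practical.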
The main contributions of our paper are:
A theoretical foundation for the ePMC method.
An open-source tool that automates the application of the method, and is freely available from our project website https://www.cs.york.ac.uk/tasp/ePMC/.
Repositories of modelling patterns for the service-based systems and multi-tier software architecture domains.
An extensive evaluation which shows that ePMC is several orders of magnitude faster and produces much smaller algebraic expressions compared to the PMC techniques currently implemented by the leading model checkers PARAM, PRISM and Storm, in addition to supporting the analysis of parametric Markov chains that are too large for these model checkers.
These contributions build on our preliminary work, extending it with a theoretical foundation, tool support, repositories of modelling patterns for two domains, and a significantly larger evaluation.
The rest of the paper is structured as follows. Section 2 provides a brief introduction to the model checking of parametric Markov chains. Section 3 describes a simple service-based system that we then use as a running example when presenting the ePMC theoretical foundation in Section 4. Section 5 covers the implementation of the ePMC tool, while Sections 6 and 7 detail the application of ePMC to the service-based systems and multi-tier software architectures domains, respectively. Section 8 presents our experimental results, and Section 9 compares our method with related work. Finally, Section 10 provides a brief summary and discusses our plans for future work.
2.1 Parametric Markov chains
Markov chains (MCs) are finite state transition systems used to model the stochastic behaviour of real-world systems. MC states correspond to relevant configurations of the modelled system, and are labelled with atomic propositions which hold in those states. State transitions model all possible transitions between states, and are annotated with probabilities as specified by the following definition.
A Markov chain over a set of atomic propositions $AP$ is a tuple
$$\mathcal{M} = (S, s_0, \mathbf{P}, L)$$
where $S$ is the finite set of MC states; $s_0 \in S$ is the initial state; $\mathbf{P} : S \times S \to [0,1]$ is a transition probability matrix where, for any states $s, s' \in S$, $\mathbf{P}(s, s')$ is the probability of transitioning to state $s'$ from state $s$, and $\sum_{s' \in S} \mathbf{P}(s, s') = 1$ for all $s \in S$; and $L : S \to 2^{AP}$ is the state labelling function.
A state $s \in S$ of a Markov chain is an absorbing state if $\mathbf{P}(s, s) = 1$ and $\mathbf{P}(s, s') = 0$ for all $s' \neq s$, and a transient state otherwise. A path $\pi$ over $\mathcal{M}$ is a possibly infinite sequence of states from $S$ such that for any adjacent states $s$ and $s'$ in $\pi$, $\mathbf{P}(s, s') > 0$. The $i$-th state on a path $\pi$, $i \geq 1$, is denoted $\pi(i)$. For any state $s$, $Paths^{\mathcal{M}}(s)$ represents the set of all infinite paths over $\mathcal{M}$ that start with state $s$. Finally, we assume that every state is reachable from the initial state, i.e., for every $s \in S$ there exists a path $\pi \in Paths^{\mathcal{M}}(s_0)$ such that $\pi(i) = s$ for some $i \geq 1$.
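A Markov chain of this kind has a direct programmatic representation. The sketch below (our own encoding, with integer state indices) checks the absorbing-state condition and computes the probability of a finite path prefix, i.e., of its cylinder set:

```python
# A minimal Markov chain encoding, assuming states are indexed 0..n-1
# and each row of the transition matrix P sums to 1 (our notation,
# not the paper's).
MC = {
    "init": 0,
    "P": [[0.0, 0.5, 0.5],
          [0.0, 1.0, 0.0],   # state 1: absorbing ("succ")
          [0.0, 0.0, 1.0]],  # state 2: absorbing ("fail")
    "labels": {1: {"succ"}, 2: {"fail"}},
}

def is_absorbing(mc, s):
    # s is absorbing iff P(s, s) = 1 (hence all other row entries are 0)
    return mc["P"][s][s] == 1.0

def path_prob(mc, path):
    """Probability measure of the cylinder set of a finite path prefix:
    the product of the transition probabilities along the prefix."""
    prob = 1.0
    for s, t in zip(path, path[1:]):
        prob *= mc["P"][s][t]
    return prob
```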
where $Paths^{\mathcal{M}}(s_1 s_2 \dots s_n)$ is the set of all infinite paths that start with the prefix $s_1 s_2 \dots s_n$ (i.e., the cylinder set of this prefix). Further details about this probability measure and its properties are available from [39, 2].
To allow the verification of a broader set of QoS properties, MC states can be annotated with nonnegative values termed rewards. These values are interpreted as "costs" (e.g. energy used) or "gains" (e.g. requests processed).
A reward structure over a Markov chain $\mathcal{M}$ is a function $rwd : S \to \mathbb{R}_{\geq 0}$. For any state $s \in S$, $rwd(s)$ represents the reward "earned" on leaving state $s$.
2.2 Property specification
The properties of Markov chains are formally expressed in probabilistic variants of temporal logic. In our work we use probabilistic computation tree logic (PCTL) [18, 35] extended with rewards, which is supported by all leading probabilistic model checkers. Rewards-extended PCTL allows the specification of probabilistic and reward properties using the probabilistic operator $\mathcal{P}_{\bowtie p}[\cdot]$ and the reward operator $\mathcal{R}_{\bowtie r}[\cdot]$, respectively, where $p \in [0,1]$ is a probability bound, $r \geq 0$ is a reward bound, and $\bowtie \in \{<, \leq, \geq, >\}$ is a relational operator. Formally, a state formula $\Phi$ and a path formula $\Psi$ in PCTL are defined by the grammar:
$$\Phi ::= true \;\mid\; a \;\mid\; \Phi \wedge \Phi \;\mid\; \neg\Phi \;\mid\; \mathcal{P}_{\bowtie p}[\Psi] \qquad\qquad \Psi ::= X\,\Phi \;\mid\; \Phi\; U^{\leq k}\, \Phi \;\mid\; \Phi\; U\, \Phi$$
and a reward state formula is defined by the grammar:
$$\Phi^R ::= \mathcal{R}_{\bowtie r}[I^{=k}] \;\mid\; \mathcal{R}_{\bowtie r}[C^{\leq k}] \;\mid\; \mathcal{R}_{\bowtie r}[F\,\Phi] \;\mid\; \mathcal{R}_{\bowtie r}[S]$$
where $k \in \mathbb{N}$ is a timestep bound and $a \in AP$ is an atomic proposition.
The PCTL semantics is defined using a satisfaction relation $\models$ over the states $s \in S$ and the paths $\pi \in Paths^{\mathcal{M}}(s)$, $s \in S$, of a Markov chain (1). Given a state $s$ and a path $\pi$ of the Markov chain, $s \models \Phi$ means "$\Phi$ holds in state $s$", $\pi \models \Psi$ means "$\Psi$ holds for path $\pi$", and we have:
$s \models true$ for all $s \in S$;
$s \models \Phi_1 \wedge \Phi_2$ iff $s \models \Phi_1$ and $s \models \Phi_2$;
the next path formula $X\,\Phi$ holds for path $\pi$ iff $\Phi$ holds in the second state of $\pi$;
the time-bounded until path formula $\Phi_1\, U^{\leq k}\, \Phi_2$ holds for path $\pi$ iff $\Phi_2$ holds in the $i$-th path state for some $i \leq k$, and $\Phi_1$ holds in the first $i-1$ path states, i.e., $\pi(i) \models \Phi_2$ and $\pi(j) \models \Phi_1$ for all $j < i$;
the unbounded until formula $\Phi_1\, U\, \Phi_2$ removes the bound $k$ from the time-bounded "until" formula.
The notation $F\,\Phi \equiv true\; U\; \Phi$ is used when the first part of an until formula is $true$. Thus, the reachability property $\mathcal{P}_{\bowtie p}[F\,\Phi]$ holds if the probability of reaching a state where $\Phi$ is true satisfies $\bowtie p$. Finally, the reward state formulae specify the expected values for: the instantaneous reward at timestep $k$, $\mathcal{R}_{\bowtie r}[I^{=k}]$; the cumulative reward up to timestep $k$, $\mathcal{R}_{\bowtie r}[C^{\leq k}]$; the reachability reward cumulated until reaching a state that satisfies a property $\Phi$, $\mathcal{R}_{\bowtie r}[F\,\Phi]$; and the steady-state reward in the long run, $\mathcal{R}_{\bowtie r}[S]$. For a detailed description of the PCTL semantics, see [18, 35, 1].
2.3 Parametric model checking
Probabilistic model checkers including MRMC, PRISM and Storm support the verification of PCTL properties of Markov chains. To verify whether a formula $\mathcal{P}_{\bowtie p}[\Psi]$ holds in a state $s$, these tools first compute the probability $p'$ that $\Psi$ holds for MC paths starting at $s$, and then compare $p'$ to the bound $p$. The actual probability $p'$ can also be returned (for the outermost $\mathcal{P}$ operator of a formula), so PCTL was extended to include the formula $\mathcal{P}_{=?}[\Psi]$ denoting this probability. Likewise, the extended-PCTL formulae $\mathcal{R}_{=?}[I^{=k}]$, $\mathcal{R}_{=?}[C^{\leq k}]$, $\mathcal{R}_{=?}[F\,\Phi]$ and $\mathcal{R}_{=?}[S]$ denote the actual values of the expected rewards from (4).
Parametric model checking (PMC) represents the verification of quantitative PCTL properties without nested probabilistic operators, and of reward properties, of parametric Markov chains, using algorithms such as [20, 34, 36]. The PMC verification result is a rational function of the variables used to define the transition probabilities of the verified parametric Markov chain. PMC is supported by verification tools including the dedicated model checker PARAM, the latest versions of PRISM, and the recently released model checker Storm.
3 Running Example
We will illustrate the theoretical aspects and the application of our ePMC method using a service-based system that implements the simple workflow from Fig. 2. This workflow handles user requests by first performing an initial operation. Depending on the result of this operation, its execution is followed by the execution of one of two alternative operations. The execution of the first alternative completes the workflow, while after the execution of the second alternative the workflow may terminate or may need to re-execute the initial operation. The outgoing branches of the decision nodes from Fig. 2 are annotated with their unknown probabilities of execution.
We suppose that multiple functionally-equivalent services can be used to perform each operation, and that these services have known probabilities of successful invocation, expected response times and invocation costs. Accordingly, the workflow can be implemented using different system architectures and service combinations. Our running example considers the implementation where:
One operation is executed by invoking two services sequentially, such that the first service is always invoked, and the second service is only invoked if the invocation of the first fails (i.e., times out or returns an error). As a result, the operation completes successfully whenever either service invocation is successful, and fails when the invocations of both services fail.

A second operation is executed using two services probabilistically, such that each service is invoked with a given probability, and these probabilities sum to 1.

A third operation is executed by invoking two services sequentially with retry. This involves invoking the two services sequentially (as for the first strategy above) and, if both service invocations fail, retrying the execution of the operation by using the same strategy with a given retry probability.
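Assuming two services per operation with success probabilities p1 and p2, a branch probability x1 and a retry probability r (our own parameter names), the success probabilities of the three strategies have simple closed forms; the sequential-with-retry form follows from solving s = seq + (1 - seq) · r · s:

```python
def seq(p1, p2):
    """Sequential: service 2 is invoked only if service 1 fails."""
    return p1 + (1 - p1) * p2

def prob(x1, p1, p2):
    """Probabilistic: service 1 invoked with probability x1,
    service 2 with probability 1 - x1."""
    return x1 * p1 + (1 - x1) * p2

def seq_r(p1, p2, r):
    """Sequential with retry: if both services fail, the whole operation
    is retried with probability r. Solving s = seq + (1 - seq) * r * s
    yields the closed form below."""
    s = seq(p1, p2)
    return s / (1 - r * (1 - s))
```

Retrying can only help: for r = 0 the sequential-with-retry form collapses to the plain sequential one, and for r > 0 it is strictly larger whenever seq(p1, p2) < 1.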
The parametric Markov chain from Fig. 3a models this implementation of the workflow. For instance, the MC states and (labelled ‘’) model the execution of operation by first invoking service (state ) and, if this service fails (which happens with probability ), also invoking (state ). The invocation of fails with probability , in which case the system transitions to state and then to the ‘fail’ state . If either or succeeds (state ), there is a probability that operation (modelled by states –) is executed next, and a probability that the next operation is (modelled by states –).
To model the execution of operation , the MC includes transitions with probabilities and from to state (which corresponds to the invocation of service ) and to state (which corresponds to the invocation of service ), respectively. The successful execution of or results in state being reached and in the successful completion of the workflow (state , labelled ‘succ’), while a failed invocation results in state being reached and in the failure of the workflow (state ).
Finally, states – model the execution of operation similarly to how is modelled by –, except that a successful execution of is followed by (with probability ) or the successful end of the workflow (with probability ), and failed invocations of lead to a retry of the operation (with probability ) or to the failure of and thus of the entire workflow (with probability ).
Parametric model checking applied to the MC from Fig. 3a can compute closed-form expressions for a wide range of QoS properties of the system. These properties can then be evaluated very efficiently for different combinations of services with different parameters. For our running example, we assume that the software engineers developing the system are interested in analysing the following properties:
The probability that the workflow implemented by the system completes successfully;
The probability that the workflow fails due to a failed execution of operations or ;
The expected execution time of the workflow;
The expected cost of executing the workflow.
Table I presents these properties formalised in PCTL, and their closed-form expressions computed using the probabilistic model checker Storm and significantly simplified through manual factorisation. Although PMC is feasible for the simple model from Fig. 3, the complex expressions from Table I already suggest that this might not be true for larger systems and models. The experimental results presented later in Section 8 confirm that indeed the PMC techniques implemented by current model checkers do not scale to much larger systems than the one from Fig. 2—a limitation addressed by our ePMC method described next.
Table I. Columns: Prop. | PCTL formula | PMC expression. (One expression, similar to the PMC expression for another property, is not included for brevity, but is available on the project website.)
4 ePMC Theoretical Foundation
ePMC patterns are recurring "fragments" of parametric MCs with a single entry state and one or several output states. We formally define these concepts below.
A fragment of a parametric Markov chain $\mathcal{M} = (S, s_0, \mathbf{P}, L)$ is a tuple $F = (Z, z_0, Z_{out})$ where:
$Z \subseteq S$ is a subset of transient MC states;
$z_0 \in Z$ is the (only) entry state of $F$, i.e., $\mathbf{P}(s, z) = 0$ for all $s \in S \setminus Z$ and $z \in Z \setminus \{z_0\}$;
$Z_{out} \subseteq Z$ is the non-empty set of output states of $F$, and all outgoing transitions from the output states are to states outside $Z$, i.e., $\mathbf{P}(z, z') = 0$ for all $z \in Z_{out}$ and $z' \in Z$.
The shaded areas of the parametric MC from Fig. 3a (each corresponding to an operation of the workflow from our running example) contain the three MC fragments:
As shown by this example, MC fragments may or may not contain cycles.
Given a fragment of a parametric MC , ePMC performs parametric model checking by separately analysing two parametric MCs determined by , and combining the results of the two analyses. As each of the two parametric MCs has fewer states and transitions than , the overall result can be obtained in a fraction of the time required to analyse the original model . The first of these parametric MCs is defined below.
The Markov chain associated with a fragment $F = (Z, z_0, Z_{out})$ of a parametric MC $\mathcal{M}$ is a Markov chain over the states $Z$ together with an additional, "end" state $z_{end}$, whose transition probability matrix is given by
and the atomic propositions for state are given by
where and are atomic propositions that hold in state and state , respectively.
adding transitions of probability 1 from the output states in , and to additional states , and , respectively;
labelling the output states with the additional atomic propositions and , and , and and , and the end states with the new atomic propositions to .
The second parametric MC determined by a fragment and analysed by ePMC is obtained from the original MC by replacing all states from with a single state.
Given a fragment $F = (Z, z_0, Z_{out})$ of a parametric Markov chain $\mathcal{M} = (S, s_0, \mathbf{P}, L)$, the abstract MC induced by $F$ is $\mathcal{M}' = (S', s'_0, \mathbf{P}', L')$ where:

The state set $S' = (S \setminus Z) \cup \{z\}$, where $z$ is a new, abstract state that stands for all the states from $Z$;

The initial state $s'_0 = s_0$ if $z_0$ is not the initial state of $\mathcal{M}$ (i.e., $z_0 \neq s_0$), and $s'_0 = z$ otherwise;
The transition probability between states is
is a reachability property calculated over the parametric Markov chain associated with the fragment $F$, for all output states of $F$. (Note that the outgoing transition probabilities of the abstract state sum to 1, as required.)
The labelling function $L'$ coincides with $L$ for the states from the original MC, and maps the new state $z$ to the set of atomic propositions common to all states from $Z$.
Finally, for every reward structure defined over the Markov chain $\mathcal{M}$, the state $z$ from the induced Markov chain is annotated with a reward calculated over the parametric MC associated with $F$. Thus, this reward represents the cumulative reward to reach the end state of the MC associated with $F$.
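The output-state reachability probabilities that define the abstract state's outgoing transitions can be obtained by first-step analysis over the fragment: for each transient state s, x_s equals the one-step probability of reaching the target plus the probability-weighted sum of x over the fragment's other states. The sketch below solves this linear system numerically with Gaussian elimination (our own formulation; the PMC tools solve the same system symbolically to obtain rational functions):

```python
def reach_prob(P, transient, target):
    """First-step analysis: for each transient state s,
    x_s = P(s, target) + sum over transient t of P(s, t) * x_t.
    Solves (I - Q) x = b by Gaussian elimination with partial pivoting.
    P is a dict-of-dicts of transition probabilities (a sketch, not the
    ePMC tool's implementation)."""
    idx = {s: i for i, s in enumerate(transient)}
    n = len(transient)
    # Build A = I - Q (Q restricted to transient states) and rhs b
    A = [[(1.0 if i == j else 0.0)
          - P.get(transient[i], {}).get(transient[j], 0.0)
          for j in range(n)] for i in range(n)]
    b = [P.get(s, {}).get(target, 0.0) for s in transient]
    # Forward elimination
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # Back substitution
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (b[r] - sum(A[r][c] * x[c] for c in range(r + 1, n))) / A[r][r]
    return {s: x[idx[s]] for s in transient}
```

For a cyclic sequential-with-retry fragment (service 1 with success probability 0.9, service 2 with 0.8, retry probability 0.5), the computed entry-state value agrees with the closed form 0.98/0.99.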
Consider again the parametric Markov chain from our running example (Fig. 3a). The corresponding abstract MC induced by fragment from Example 1 is shown in Fig. 3c(i). This abstract MC is obtained by replacing all the states from with the single abstract state , and by using the rules from Definition 6 to find the outgoing transition probabilities and atomic propositions for . For example, the transition probability from to is calculated as:
where and are reachability properties calculated over the parametric MC associated with fragment (cf. Fig. 3b). As the two output-state reachability probabilities (5) for fragment can be expressed in terms of a single probability, we use the notation for this probability in Fig. 3c(i). The transition probabilities from to and are calculated similarly, and the transition probability from to is simply (since the entry state of is ). All other transition probabilities from to other states and from other states to are zero. State is labelled with the atomic proposition , which is the only label common to all states from the fragment . Finally, is annotated with the rewards and computed over the parametric MC associated with .
Fig. 3c(ii) shows the abstract Markov chain obtained after all three fragments – from Example 1 were used to simplify the initial MC from Fig. 3a. Note how even for the small MC from our running example, the abstract MC from Fig. 3c(ii) is much simpler than the initial MC from Fig. 3a; the abstract MC has only 5 states and 10 transitions, compared to 15 states and 25 transitions for the initial MC.
The ePMC computation of unbounded until properties (and thus also of reachability properties ) is underpinned by the following result, whose proof is provided in Appendix A.
Let $F = (Z, z_0, Z_{out})$ be a fragment of a parametric Markov chain $\mathcal{M}$, and $\Phi_1$ and $\Phi_2$ two PCTL state formulae over $\mathcal{M}$. If every atomic proposition in $\Phi_1$ and $\Phi_2$ either holds in all states from $Z$ or holds in no state from $Z$, then the PMC of the until PCTL formula $\mathcal{P}_{=?}[\Phi_1\; U\; \Phi_2]$ over $\mathcal{M}$ yields an expression equivalent to that produced by the PMC of the same formula over the abstract Markov chain induced by $F$.
The repeated application of Theorem 1 reduces the computation of until properties of a parametric MC with multiple fragments to computing:
the output-state reachability probabilities (5) for the parametric MCs associated with these fragments;
for the parametric MC induced by the fragments
and combining the results from the two ePMC stages into a set of algebraic formulae over the parameters of the original MC. The parametric MCs from these stages are typically much simpler than the original, "monolithic" MC, and much faster to analyse. In addition, ePMC focuses on frequently used domain-specific fragments, and thus stage 1 only needs to be executed once for a domain. Note that a result similar to Theorem 1 is not available for bounded until properties because the abstract MC induced by a set of fragments does not preserve the path lengths of the original MC.
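The combination step is substitution: the stage-2 system-level formula is evaluated with the stage-1 fragment-level expressions plugged in for the abstract-state quantities. A minimal sketch with a hypothetical two-fragment system where both fragments must succeed (all names are illustrative):

```python
# Stage-1 results: per-fragment closed-form expressions, as callables.
def frag_success(p1, p2):
    # e.g. a sequential fragment with two services
    return p1 + (1 - p1) * p2

# Stage-2 result: a system-level formula over the fragment-level
# quantities x1 and x2 (hypothetical: both fragments must succeed).
def system_success(x1, x2):
    return x1 * x2

# Combining the stages: substitute the stage-1 expressions into the
# stage-2 formula to obtain a formula over the original parameters.
def system_qos(params):
    return system_success(frag_success(params["a1"], params["a2"]),
                          frag_success(params["b1"], params["b2"]))
```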
We use the above two-stage method to compute properties and from our running example (cf. Table I).
for the MC associated with , ;
for the MC associated with , ;
for the MC associated with , .
We computed these algebraic expressions manually, based on the MCs from Fig. 3b. However, they can also be obtained using one of the model checkers mentioned earlier (i.e., PARAM, PRISM and Storm), or can be taken directly from our ePMC repository of such expressions for the service-based systems domain (see Section 6 later in the paper).
In stage 2, we use a probabilistic model checker (we used Storm) to compute and over the induced parametric MC from Fig. 3c. The shaded formulae from Table II show the expressions obtained for and , preceded by the results from the first ePMC stage.
Expectedly, the set of formulae from Table II is much simpler than the “monolithic” and formulae from Table I. As we will show in Section 8, this difference is even more significant for larger models, making the computation and evaluation of “monolithic” formulae challenging for existing PMC techniques.
Table II. Top half: output-state reachability formulae computed in stage 1 of ePMC. Bottom half: property-specific formulae computed in stage 2 of ePMC. Columns: Prop. | PCTL formula | ePMC set of formulae.
The final result from this section allows the efficient parametric model checking of reachability reward properties.
Let $F = (Z, z_0, Z_{out})$ be a fragment of a parametric Markov chain $\mathcal{M}$, and $T$ a set of states from $\mathcal{M}$. If $T$ includes no state from $Z$, then the PMC of the PCTL reachability reward formula (i.e., the cumulative reward to reach a state from $T$) over $\mathcal{M}$ yields an expression equivalent to that produced by the PMC of the same formula over the abstract MC induced by $F$.
Analogous to Theorem 1, this result (which we prove in Appendix A) reduces the computation of cumulative reachability reward properties of a parametric MC with multiple fragments to computing:
the per-fragment cumulative reachability reward properties (6) for the MCs associated with these fragments;
for the MC induced by these fragments
and combining the results from the two stages into a set of algebraic formulae over the parameters of the original MC. Note that results similar to Theorem 2 are not available for instantaneous, cumulative and steady-state reward formulae because the abstract MC induced by the fragments does not preserve the path lengths and the reward structures of the original MC.
We use ePMC to calculate properties and from our running example (cf. Table I), starting with the cumulative reachability reward properties (6) for fragments –, i.e., and for . The resulting formulae, which we obtained manually (but which can also be obtained using a PMC tool) are shown in the top half of Table II. For the second ePMC stage, we used the model checker Storm to obtain the algebraic expressions for and from the lower half of Table II.
As in Example 4, ePMC produced a set of formulae that is far simpler than the “monolithic” and from Table I. Note that we do not compare the analysis time of our ePMC method with that of existing PMC here or in Example 4 because for the simple system from our running example the two analysis times are similar. However, we do provide an extensive comparison of these analysis times for larger systems in our evaluation of ePMC from Section 8.
We developed a pattern-aware parametric model checker that implements the theoretical results from the previous section. This tool, which is freely available on our project website https://www.cs.york.ac.uk/tasp/ePMC, automates the second stage of ePMC. As shown in Fig. 1, the ePMC tool uses a domain-specific repository of QoS-property expressions to analyse PCTL-specified QoS properties of a parametric Markov chain annotated with pattern instances.
The domain-specific repository comprises entries with the general format:
Each such entry defines algebraic expressions for the reachability properties (5) and the reachability reward properties (6) of a parametric MC fragment commonly used within the domain of interest, i.e., a modelling pattern.
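One possible in-memory form for such a repository is sketched below, with the success-probability expressions of the SEQ, PROB and SEQ_R patterns as Python callables (the actual tool stores algebraic expressions; the dictionary layout and function names are our assumptions):

```python
# A sketch of a domain-specific repository: each entry maps a pattern
# name to a closed-form QoS expression over the pattern's parameters.
REPOSITORY = {
    "SEQ":   lambda p1, p2: p1 + (1 - p1) * p2,
    "PROB":  lambda x1, p1, p2: x1 * p1 + (1 - x1) * p2,
    "SEQ_R": lambda p1, p2, r: (p1 + (1 - p1) * p2)
                               / (1 - r * (1 - (p1 + (1 - p1) * p2))),
}

def instantiate(pattern, *params):
    """Instantiate a pattern's success-probability expression with
    concrete parameter values -- no model checking is performed."""
    return REPOSITORY[pattern](*params)
```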
Table III shows a part of the ePMC repository for the service-based systems domain. This part includes the three patterns used by the operations from our running example (SEQ, PROB and SEQ_R), such that the formulae from the top half of Table II can be obtained (without any calculations) by instantiating the relevant patterns: