Counterexample-Driven Synthesis for Probabilistic Program Sketches

04/28/2019 ∙ by Milan Ceska, et al.

Probabilistic programs are key to deal with uncertainty in e.g. controller synthesis. They are typically small but intricate. Their development is complex and error prone requiring quantitative reasoning over a myriad of alternative designs. To mitigate this complexity, we adopt counterexample-guided inductive synthesis (CEGIS) to automatically synthesise finite-state probabilistic programs. Our approach leverages efficient model checking, modern SMT solving, and counterexample generation at program level. Experiments on practically relevant case studies show that design spaces with millions of candidate designs can be fully explored using a few thousand verification queries.




1 Introduction

With the ever tighter integration of computing systems with their environment, quantifying (and minimising) the probability of encountering an anomaly or unexpected behaviour becomes crucial. This insight has led to a growing interest in probabilistic programs and models in the software engineering community. Henzinger [43] for instance argues that “the Boolean partition of software into correct and incorrect programs falls short of the practical need to assess the behaviour of software in a more nuanced fashion.” In [61], Rosenblum advocates taking a more probabilistic approach in software engineering. Concrete examples include quantitative analysis of software product lines [40, 68, 60, 32, 67], synthesis of probabilities for adaptive software [23, 19], and probabilistic model checking at runtime to support verifying dynamic reconfigurations [20, 37].

Synthesis of probabilistic programs. The development of systems under uncertainty is intricate. Probabilistic programs are a prominent formalism to deal with uncertainty, but their development is complex and error prone, requiring quantitative reasoning over many alternative designs. One remedy is the exploitation of probabilistic model checking [6] using a Markov chain as the operational model of a program. One may then apply model checking on each design, or some suitable representation thereof [32, 27]. Techniques such as parameter synthesis [42, 26, 58] and model repair [9, 31] have been successful, but they only allow transition probabilities to be amended or inferred, whereas the control structure (the topology of the probabilistic model) is fixed.

Counter-Example-Guided Inductive Synthesis. This paper aims to overcome this limitation by adapting the paradigm of CounterExample-Guided Inductive Synthesis (CEGIS, cf. Fig. 1) [65, 3, 64, 1] to finite-state probabilistic models and programs. Program synthesis amounts to automatically providing an instantiated probabilistic program satisfying all properties, or reporting that no such realisation exists. This syntax-based approach starts with a sketch, a program with holes, and iteratively searches for good (or even optimal) realisations, i.e., instantiated programs. Rather than checking all realisations, the design space is pruned by potentially ruling out many realisations (dashed area) at once. From every realisation that was verified and rejected, a counterexample (CE) is derived, e.g., a program run violating the specification.




Figure 1: CEGIS for synthesis. (Diagram not reproduced; surviving labels: "reject +", "no instance", "synthesised program".)

An SMT (satisfiability modulo theories)-based synthesiser uses the CE to prune programs that also violate the specification. These programs are safely removed from the design space. The synthesis and verification steps are repeated until either a satisfying program is found or the entire design space is pruned, implying the non-existence of such a program.

Problem statement and program-level approach. This paper tailors and generalises CEGIS to probabilistic models and programs. The input is a sketch (a probabilistic program with holes, where each hole can be replaced by finitely many options), a set of quantitative properties that the program needs to fulfil, and a budget. All possible realisations have a certain cost and the synthesis provides a realisation that fits within the budget. Programs are represented in the PRISM modelling language [50] and properties are expressed in PCTL (Probabilistic Computation Tree Logic) extended with rewards, as standard in probabilistic model checking [50, 34]. Program sketches succinctly describe the design space of the system by providing the program-level structure but leaving some parts (e.g., command guards or variable assignments) unspecified.

Outcomes. To summarise, this paper presents a novel synthesis framework for probabilistic programs that adhere to a given set of quantitative requirements and a given budget. We use families of Markov chains to formalise our problem, and then formulate a CEGIS-style algorithm on these families. Here, CEs are subgraphs of the Markov chains. In the second part, we then generalise the approach to reason on probabilistic programs with holes. While similar in spirit, we rely on program-level CEs [33, 71], and allow for a more flexible sketching language. To the best of our knowledge, this is the first lifting of CEGIS to probabilistic programs. The CEGIS approach is sound and complete: either an admissible program exists and is computed, or no such program exists and the algorithm reports this. We provide a prototype implementation built on top of the model checker Storm [34] and the SMT-tool Z3 [56]. Experiments with different examples demonstrate scalability: design spaces with millions of realisations can be fully explored by a few thousand verification queries, resulting in a speedup of orders of magnitude.

Related work. We build on the significant body of research that employs formal methods to analyse quality attributes of alternative designs, e.g. [8, 16, 10, 38, 66, 72]. Enumerative approaches based on Petri nets [54], stochastic models [19, 62] and timed automata [44, 52], and the corresponding tools for simulation and verification (e.g. Palladio [10], PRISM [50], UPPAAL [44]) have long been used.

For non-probabilistic systems, CEGIS can find programs for a variety of challenging problems [64, 63]. Meta-sketches and the optimal and quantitative synthesis problem in a non-probabilistic setting have been proposed [25, 30, 17].

A prominent representation of sets of alternative designs are modal transition systems [53, 5, 49]. In particular, parametric modal transition systems [11] and synthesis therein [12] allow for similar dependencies that occur in program-level sketches. Probabilistic extensions are considered in, e.g. [35], but not in conjunction with synthesis. Recently [36] proposed to exploit relationships between model and specification, thereby reducing the number of model-checking instances.

In the domain of quantitative reasoning, sketches and likelihood computation are used to find probabilistic programs that best match available data [57]. The work closest to our approach synthesises probabilistic systems from specifications and parametric templates [39]. The principal difference to our approach is the use of counterexamples. The authors leverage evolutionary optimisation techniques without pruning. Therefore, the completeness is only achieved by exploring all designs, which is practically infeasible. An extension to handle parameters affecting transition probabilities (rates) has been integrated into the evolutionary-driven synthesis [21, 23] and is available in RODES [22]. Some papers have considered the analysis of sets of alternative designs within the quantitative verification of software product lines [40, 68, 60]. The typical approach is to analyse all individual designs (product configurations) or build and analyse a single (so-called all-in-one) Markov decision process describing all the designs simultaneously. Even with symbolic methods, this hardly scales to large sets of alternative designs. These techniques have recently been integrated into ProFeat [32] and QFLan [67]. An abstraction-refinement scheme has recently been explored in [27]. It iteratively analyses an abstraction of a (sub)set of designs; it is an orthogonal and slightly restricted approach to the inductive method presented here (detailed differences are discussed later). An incomplete method in [45] employs abstraction targeting a particular case study. SMT-based encodings for synthesis in Markov models have been used in, e.g. [46, 24]. These encodings are typically monolithic: they do not prune the search space via CEs. Probabilistic CEs have been recently used to ensure that controllers obtained via learning from positive examples meet given safety properties [74]. In contrast, we leverage program-level CEs that can be used to prune the design space.

2 Preliminaries and Problem Statement

We start with the basics of probabilistic model checking (for details, see [7, 6]) and then formalise families of Markov chains. Finally, we define some synthesis problems.

Probabilistic models and specifications. A probability distribution over a finite set $X$ is a function $\mu\colon X \to [0,1]$ with $\sum_{x \in X} \mu(x) = 1$. Let $Distr(X)$ denote the set of all distributions on $X$.

Definition 1 (MC)

A discrete-time Markov chain (MC) $D$ is a tuple $(S, s_0, P)$ with finite set $S$ of states, initial state $s_0 \in S$, and transition probabilities $P\colon S \to Distr(S)$. We write $P(s,t)$ to denote $P(s)(t)$.

For , the set denotes the successor states of . A path of an MC is an (in)finite sequence , where , and for all .

Definition 2 (sub-MC)

Let MC and critical states with . The sub-MC of is the MC with for , for , and

Specifications. For simplicity, we focus on reachability properties for a set of goal states, threshold , and comparison relation . The interpretation of on MC is as follows. Let denote the probability to reach from ’s initial state. Then, if . A specification is a set of properties, and if . Upper-bounded properties (with ) are safety properties, lower-bounded properties are liveness properties. Extensions to expected rewards or ω-regular properties are rather straightforward.
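To make the notion of a reachability property concrete, the following minimal sketch approximates the probability of reaching a goal set in a small MC and compares it against a threshold. The dictionary encoding of the MC, the function names and the four-state chain are assumptions made for illustration; they are not the paper's example, and the value iteration used here merely stands in for a real model checker.

```python
def reach_probability(P, init, goal, iters=10_000, tol=1e-12):
    """Approximate the probability of eventually reaching `goal` from `init`.

    P maps each state to a dict {successor: probability}.
    Value iteration starts from 0 for non-goal states and only increases.
    """
    x = {s: (1.0 if s in goal else 0.0) for s in P}
    for _ in range(iters):
        delta = 0.0
        for s in P:
            if s in goal:
                continue
            new = sum(p * x[t] for t, p in P[s].items())
            delta = max(delta, abs(new - x[s]))
            x[s] = new
        if delta < tol:
            break
    return x[init]


def satisfies(P, init, goal, relation, threshold):
    """Check a reachability property: Pr(reach goal) `relation` threshold."""
    prob = reach_probability(P, init, goal)
    return {"<": prob < threshold, "<=": prob <= threshold,
            ">": prob > threshold, ">=": prob >= threshold}[relation]


# Hypothetical 4-state chain (illustration only).
P = {
    0: {1: 0.5, 2: 0.5},
    1: {3: 0.8, 0: 0.2},
    2: {2: 1.0},   # sink that never reaches the goal
    3: {3: 1.0},   # goal state, absorbing
}
prob = reach_probability(P, 0, {3})
```

For this chain the fixed point solves x0 = 0.5·x1 and x1 = 0.8 + 0.2·x0, so the safety property with threshold 0.5 and relation "<=" holds, while the corresponding liveness property with ">=" does not.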

Families of Markov chains. We recap an explicit representation of a family of MCs using a parametric transition function, as in [27].

Definition 3 (Family of MCs)

A family of MCs is a tuple with , as before, a finite set of parameters where the domain for each parameter is , and transition probability function .

The transition probability function of MCs maps states to distributions over successor states. For families, this function maps states to distributions over parameters. Instantiating each parameter with a value from its domain yields a “concrete” MC, called a realisation.

Definition 4 (Realisation)

A realisation of a family is a function where . A realisation yields an MC , where is the transition probability matrix in which each in is replaced by . Let denote the set of all realisations for .

As a family has finite parameter domains, the number of family members (i.e. realisations from ) of is finite, but exponential in . While all MCs share their state space, their reachable states may differ.
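The relation between a family and its realisations can be sketched as follows. The encoding (parameters as string keys, states as integers) and the tiny family are assumptions for illustration, not the family of the paper's running example. Enumerating the Cartesian product of the parameter domains shows directly why the number of family members grows exponentially with the number of parameters.

```python
from itertools import product

# Hypothetical family: each state maps to a distribution over
# parameters (strings) or concrete states (ints).
domains = {"X": [1, 2], "Y": [0, 3]}   # parameter -> allowed successor states
family = {
    0: {"X": 0.5, "Y": 0.5},
    1: {3: 1.0},
    2: {2: 1.0},
    3: {3: 1.0},
}


def realisations(domains):
    """Yield every assignment of one domain value to each parameter."""
    names = sorted(domains)
    for values in product(*(domains[n] for n in names)):
        yield dict(zip(names, values))


def instantiate(family, r):
    """Replace each parameter by its value under r, merging equal targets."""
    mc = {}
    for s, dist in family.items():
        row = {}
        for target, p in dist.items():
            t = r.get(target, target)      # parameters are looked up, states kept
            row[t] = row.get(t, 0.0) + p
        mc[s] = row
    return mc


all_mcs = [instantiate(family, r) for r in realisations(domains)]
```

With two parameters of two values each, `all_mcs` contains four MCs over the same state space; which states are reachable differs from one realisation to the next.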

Example 1

Consider the family of MCs where , , and with , , , , and , and given by:

Figure 2: The four different realisations of family (panels (a) to (d), one realisation each).

Fig. 2 shows the four MCs of . Unreachable states are greyed out.

The function assigns realisation costs. Attaching costs to realisations is a natural way to distinguish preferable realisations. We stress the difference with rewards in MCs; the latter impose a cost structure on paths in MCs.

Synthesis problems. Let be a family, and be a set of properties, and a budget. Consider the synthesis problems:

  1. Feasibility synthesis: Find a realisation with and .

  2. Max synthesis: For given , find with

The problem in feasibility synthesis is to determine a realisation satisfying all , or return that no such realisation exists. This problem is NP-complete [27]. The problem in max synthesis is to find a realisation that maximises the probability of reaching . It can analogously be defined for minimising such probabilities. As families are finite, such optimal realisations always exist. It is beneficial to consider a variant of the max-synthesis problem in which the realisation is not required to achieve the maximal reachability probability, but it suffices to be close to it. This notion of ε-maximal synthesis for a given ε amounts to finding a realisation with .
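As a point of reference, both synthesis problems can be solved by plain enumeration, which is the baseline that CEGIS tries to beat. The sketch below assumes that model checking is abstracted into callables (`check` and `prob`); these names, the cost function and the toy numbers are hypothetical.

```python
from itertools import product


def enumerate_realisations(domains):
    names = sorted(domains)
    for values in product(*(domains[n] for n in names)):
        yield dict(zip(names, values))


def feasibility(domains, cost, budget, check):
    """Return the first realisation within budget that passes `check`, else None."""
    for r in enumerate_realisations(domains):
        if cost(r) <= budget and check(r):
            return r
    return None


def max_synthesis(domains, cost, budget, prob):
    """Return (realisation, probability) maximising `prob` within budget."""
    best, best_prob = None, -1.0
    for r in enumerate_realisations(domains):
        if cost(r) <= budget:
            p = prob(r)
            if p > best_prob:
                best, best_prob = r, p
    return best, best_prob


# Toy stand-ins for realisation cost and for a model-checking call.
domains = {"X": [0, 1, 2], "Y": [0, 1]}
cost = lambda r: r["X"] + 2 * r["Y"]
prob = lambda r: 0.2 * r["X"] + 0.3 * r["Y"]

feasible = feasibility(domains, cost, 3, lambda r: prob(r) >= 0.4)
best, best_prob = max_synthesis(domains, cost, 3, prob)
```

Every realisation triggers one call to the (expensive) verification step here, so the effort is linear in the size of the design space.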

Problem statement and structure. In this paper, we propose novel synthesis algorithms for probabilistic systems that are based on two concepts, CEGIS [64] and syntax-guided synthesis [3]. To simplify the presentation, we start with CEGIS in Sect. 3 and adapt it to MCs and the feasibility problem. In Sect. 4, we lift and tune CEGIS, in particular towards probabilistic program sketches.

3 CEGIS for Markov Chain Families

We follow the typical separation of concerns as in oracle-guided inductive synthesis [4, 39, 41]: a synthesiser selects single realisations that have not been considered before, and a verifier checks whether the MC satisfies the specification (cf. Fig. 1). If a realisation violates the specification, the verifier returns a conflict representing the core part of the MC causing the violation.

3.1 Conflicts and synthesiser

To formalise conflicts, a partial realisation of a family is a function such that . For any partial realisations , , let iff for all .

Definition 5 (Conflict)

Let be a realisation with for . A partial realisation is a conflict for the property iff for each realisation . A set of conflicts is called a conflict set.

To explore all realisations, the synthesiser starts with and picks some realisation . Either and we immediately return , or a conflict is found: then is pruned by removing all conflicts that the verifier found. If is empty, we are done: each realisation violates a property .

Footnote 1: We focus on program-level synthesis, and refrain from discussing important implementation aspects (like how to represent ) here.
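The pruning step can be illustrated with an explicit set representation. In this sketch a conflict is a partial realisation stored as a dictionary, and a realisation is pruned when it agrees with some conflict on every parameter that the conflict fixes. The explicit list of candidates and the function names are assumptions for illustration.

```python
def extends(r, conflict):
    """True iff realisation r agrees with the partial realisation `conflict`
    on every parameter the conflict assigns."""
    return all(r[k] == v for k, v in conflict.items())


def prune(candidates, conflicts):
    """Keep only the candidates that do not extend any known conflict."""
    return [r for r in candidates
            if not any(extends(r, c) for c in conflicts)]


# Hypothetical design space with two parameters.
candidates = [{"X": x, "Y": y} for x in (1, 2) for y in (0, 3)]
remaining = prune(candidates, [{"X": 1}])
```

A conflict that mentions only X removes every realisation with that value of X in one step, regardless of the value chosen for Y; the fewer parameters a conflict fixes, the more realisations it rules out.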

3.2 Verifier

Definition 6

A verifier is sound and complete, if for family , realisation , and specification , the verifier terminates, the returned conflict set is empty iff , and if it is not empty, it contains a conflict for some .

Algorithm 1 outlines a basic verifier. It uses an off-the-shelf probabilistic model-checking procedure Check(, ) to determine which (if any) are violated. The algorithm then iterates over the violated and computes critical sets of that induce sub-MCs such that (line 6). The critical sets for safety properties can be obtained via standard methods [2], support for liveness properties is discussed at the end of the section.

Example 2

Reconsider from Ex. 1 with . Assume the synthesiser picks realisation . The verifier builds and determines . Observe that the verifier does not need the full realisation to refute . In fact, the paths in the fragment of in Fig. 3(a) (ignoring the outgoing transitions of states and ) suffice to show that the probability to reach state exceeds . Formally, the fragment in Fig. 3(b) is a sub-MC with critical states .

Figure 3: Fragment and corresponding sub-MC that suffices to refute . (a) Fragment of . (b) Sub-MC of with .

The essential property is [70]:

If a sub-MC of an MC refutes a safety property , then refutes too.

Observe that is part of too. Formally, the sub-MC of is isomorphic to and therefore also violates . Thus, .
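The reasoning behind this property can be checked numerically on a small chain. The sketch below assumes a sub-MC construction in which critical states keep their transitions and all other states become absorbing self-loops; this matches the usual reading of Definition 2 but is an assumption here, as are the five-state chain and the function names.

```python
def reach_probability(P, init, goal, iters=10_000, tol=1e-12):
    """Value iteration for the probability of reaching `goal` from `init`."""
    x = {s: (1.0 if s in goal else 0.0) for s in P}
    for _ in range(iters):
        delta = 0.0
        for s in P:
            if s in goal:
                continue
            new = sum(p * x[t] for t, p in P[s].items())
            delta = max(delta, abs(new - x[s]))
            x[s] = new
        if delta < tol:
            break
    return x[init]


def sub_mc(P, critical):
    """Keep transitions of critical states; turn all others into self-loops."""
    return {s: (dict(P[s]) if s in critical else {s: 1.0}) for s in P}


# Hypothetical chain: state 3 is the goal, state 4 is a bad sink.
P = {
    0: {1: 0.5, 2: 0.5},
    1: {3: 1.0},
    2: {3: 0.5, 4: 0.5},
    3: {3: 1.0},
    4: {4: 1.0},
}
full = reach_probability(P, 0, {3})
partial = reach_probability(sub_mc(P, {0, 1}), 0, {3})
```

The sub-MC only removes paths, so `partial` is a lower bound on `full`. If the lower bound already exceeds the threshold of a safety property (say 0.4), the full MC must exceed it as well.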

Finally, the verifier translates the obtained critical set for realisation to a conflict and stores it in the conflict set (line 7). The procedure generateConflict identifies the subset of parameters  that occur in the sub-MCs and returns the corresponding partial realisation. The proposition below clarifies the relation between critical sets and conflicts.

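1: function Verify(family 𝔇, realisation r, specification Φ)
2:    Violated ← ∅; D_r ← generateMC(𝔇, r); Conflicts ← ∅
3:    for all φ ∈ Φ do
4:       if not Check(D_r, φ) then Violated ← Violated ∪ {φ}
5:    for all φ ∈ Violated do
6:       C ← ComputeCriticalSet(D_r, φ)
7:       Conflicts ← Conflicts ∪ {generateConflict(𝔇, r, C)}
8:    return Conflicts
Algorithm 1 Verifier

Proposition 1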

If is a critical set for and , then is also a critical set for each , , and .

Example 3

Recall from Ex. 2 that . This can be concluded without constructing . Just considering , and suffices: First, take all parameters occurring in for any . This yields . The partial realisation is a conflict. The values for the other parameters do not affect the shape of the sub-MC induced by . Realisation only varies from in the value of , but , i.e., is not included in the conflict. This suffices to conclude .

Conflicts for liveness properties. To support liveness properties such as , we first consider a (standard) dual safety property , where is the set of all states that do not have a path to . Observe that can be efficiently computed using graph algorithms. We have to be careful, however.

Example 4

Consider , and let . . Then, , which is refuted with critical set as before. Although is again isomorphic to , we have . The problem here is that state is in for as , but not in for , as .

To prevent the problem above, we ensure that the states in cannot reach in other realisations, by including in the critical set of : Let be the critical set for the dual safety property . We define as critical states for . Together, we reach states with a critical probability mass, and never leave .

Footnote 2: A good implementation takes a subset of by considering the .

Example 5

In , we compute critical states , preventing the erroneous reasoning from the previous example. For , we compute as critical states, and as is isomorphic to , we obtain that .

4 Syntax-Guided Synthesis for Probabilistic Programs

Probabilistic models are typically specified by means of a program-level modelling language, such as PRISM [50], PIOA [73], JANI [18], or MODEST [15]. We propose a sketching language based on the PRISM modelling language. A sketch, a syntactic template, defines a high-level structure of the model and represents a-priori knowledge about the system under development. It effectively restricts the size of the design space and also allows constraints and costs to be added concisely to its members. The proposed language is easily supported by model checkers and in particular by methods for generating CEs [33, 71]. Below, we describe the language, and adapt CEGIS from state level to program level. In particular, we employ so-called program-level CEs, rather than CEs on the state level.

4.1 A program sketching language

Let us briefly recap how the model-based concepts translate to language concepts in the PRISM guarded-command language. A PRISM program consists of one or more reactive modules that may interact with each other. Consider a single module. This is not a restriction: every PRISM program can be flattened into this form. A module has a set of bounded variables spanning its state space. Transitions between states are described by guarded commands of the form:

$\mathit{guard} \rightarrow p_1 : \mathit{update}_1 + \dots + p_n : \mathit{update}_n$

The guard is a Boolean expression over the module’s variables. If the guard evaluates to true, the module can evolve into a successor state by updating its variables. An update is chosen according to the probability distribution given by the expressions $p_1, \dots, p_n$. In every state enabling the guard, the evaluations of $p_1, \dots, p_n$ must sum up to one. Overlapping guards yield non-determinism and are disallowed here. Roughly, a program thus is a tuple of variables and commands. For a program , the underlying MC is ’s semantics. We lift specifications: Program satisfies a specification , iff , etc.
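The semantics of guarded commands can be mimicked in a few lines. The sketch below is not PRISM and does not use any PRISM API: commands are encoded as Python pairs of a guard function and a list of (probability, update) pairs, all of which are assumptions for illustration. It explores the reachable states from an initial valuation and rejects overlapping guards, as the text requires.

```python
def freeze(state):
    """Turn a variable valuation (dict) into a hashable state."""
    return tuple(sorted(state.items()))


def successors(state, commands):
    """Distribution over successor states of `state` (a dict of variables)."""
    enabled = [updates for guard, updates in commands if guard(state)]
    if len(enabled) > 1:
        raise ValueError("overlapping guards are disallowed")
    if not enabled:
        return {}                          # deadlock: no command applies
    updates = enabled[0]
    if abs(sum(p for p, _ in updates) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum up to one")
    dist = {}
    for p, update in updates:
        nxt = freeze({**state, **update(state)})
        dist[nxt] = dist.get(nxt, 0.0) + p
    return dist


def build_mc(init, commands):
    """Explore all states reachable from `init` and collect their transitions."""
    mc, todo = {}, [freeze(init)]
    while todo:
        s = todo.pop()
        if s in mc:
            continue
        mc[s] = successors(dict(s), commands)
        todo.extend(mc[s])
    return mc


# Hypothetical module with a single variable s.
commands = [
    (lambda v: v["s"] == 0, [(0.5, lambda v: {"s": 1}), (0.5, lambda v: {"s": 2})]),
    (lambda v: v["s"] == 1, [(1.0, lambda v: {"s": 3})]),
    (lambda v: v["s"] >= 2, [(1.0, lambda v: {})]),   # self-loop
]
mc = build_mc({"s": 0}, commands)
```

The resulting dictionary is the underlying MC of this toy program: four reachable states, with a fair random choice in the state where s equals 0.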

A sketch is a program that contains holes. Holes are the program’s open parts and can be replaced by one of finitely many options. Each option can optionally be named and associated with a cost. They are declared as:

where is the hole identifier, is the option name, is an expression over the program variables describing the option, and are the cost, given as expressions over natural numbers. A hole can be used in commands in a similar way as a constant, and may occur multiple times within multiple commands, in both guards and updates. The option names can be used to describe constraints on realisations. These propositional formulae over option names restrict realisations, e.g.,

requires that whenever the options or are taken for some (potentially different) holes, option is also to be taken.

Definition 7 (Program sketch)

A (PRISM program) sketch is a pair where is a program with a set of holes with options , are constraints over , and option-costs.

Figure 4: Running example. (a) Program sketch. (b) Instance.

Example 6

We consider a small running example to illustrate the main concepts. Fig. 4(a) depicts the program sketch with holes . For X, the options are . The constraint forbids XA and XB both being one; it ensures a non-trivial random choice in state s=0.

Remark 1

Below, we formalise notions previously used on families. Due to flexibility of sketching (in particular in combination with multiple modules), it is not straightforward to provide family semantics to sketches, but the concepts are analogous. In particular: holes and parameters are similar, parameter domains are options, and family realisations and sketch realisations both yield concrete instances from a family/sketch. The synthesis problems carry over naturally.

Definition 8 (Realisations of sketches)

Let be a sketch, a sketch realisation on holes is a function with and that satisfies all constraints in . The sketch instance for realisation is the program (without holes) in which each hole in is replaced by . The cost is the sum of the cost of the selected options, .
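Definition 8 can be paraphrased in code: a sketch realisation picks one option per hole, must satisfy the constraints, and has a cost equal to the sum of the costs of the chosen options. The hole names, option names, costs and the constraint below are hypothetical and do not reproduce the sketch of Fig. 4.

```python
from itertools import product

# Hypothetical sketch data: hole -> {option name: (value, cost)}.
holes = {
    "X": {"XA": (1, 3), "XB": (2, 0)},
    "Y": {"YA": (1, 0), "YB": (3, 0)},
}


def constraint(chosen):
    """Propositional constraint over option names: XA and YA exclude each other."""
    return not ("XA" in chosen and "YA" in chosen)


def sketch_realisations(holes, constraint):
    """Yield every hole-to-option mapping that satisfies the constraint."""
    names = sorted(holes)
    for opts in product(*(sorted(holes[h]) for h in names)):
        if constraint(set(opts)):
            yield dict(zip(names, opts))


def cost(holes, r):
    """Sum of the costs of the selected options."""
    return sum(holes[h][opt][1] for h, opt in r.items())


valid = list(sketch_realisations(holes, constraint))
```

Of the four syntactic combinations, one violates the constraint, so three realisations remain; only those choosing the option named XA have a non-zero cost.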

Example 7

We continue Ex. 6. The program in Fig. 4(b) reflects for realisation , with as and all other options have cost zero. For realisation , . The assignment violates the constraint and is not a realisation. In total, represents programs and their underlying MCs.

4.2 A program-level synthesiser

Feasibility synthesis.

The synthesiser follows the steps in Alg. 2. During the synthesis process, the synthesiser stores and queries the set of realisations not yet pruned. These remaining realisations are represented by (the satisfying assignments of) the first-order formula over hole-assignments. Iteratively extending with conjunctions thus prunes the remaining design space.

1: function Synthesis(program sketch 𝔖, specification Φ, budget B)
2:    ψ ← Initialise(𝔖, B)
3:    R ← GetRealisation(ψ)
4:    while R ≠ Unsat do
5:       C ← Verify(𝔖, R, Φ)
6:       if C = ∅ then return R
7:       ψ ← LearnFromConflict(ψ, C)
8:       R ← GetRealisation(ψ)
9:    return Unsat
Algorithm 2 Synthesiser (feasibility synthesis)

We give a brief overview, before detailing the steps. Initialise constructs such that it represents all sketch realisations that satisfy the constraints in the sketch within the budget . GetRealisation() exploits an SMT-solver for linear (bounded) integer arithmetic to obtain a realisation consistent with , or Unsat if no such realisation exists. As long as new realisations are found, the verifier analyses them (line 5) and returns a conflict set . If , the satisfies the specification and the search is terminated. Otherwise, the synthesiser updates based on the conflicts (line 7). is always pruned.
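The control flow of Alg. 2 can be sketched without an SMT solver. In the following, `get_realisation` is a naive stand-in for the SMT query (it scans the design space for a realisation within budget that extends no learned conflict), and `verify` is a user-supplied callback that returns a list of conflicts. All names, the toy verifier and the three-hole design space are assumptions for illustration, not the prototype's implementation.

```python
from itertools import product


def get_realisation(domains, conflicts, cost, budget):
    """Stand-in for the SMT query: a realisation consistent with all
    learned conflicts and the budget, or None (playing the role of Unsat)."""
    names = sorted(domains)
    for values in product(*(domains[n] for n in names)):
        r = dict(zip(names, values))
        if cost(r) > budget:
            continue
        if any(all(r[h] == v for h, v in c.items()) for c in conflicts):
            continue
        return r
    return None


def synthesise(domains, cost, budget, verify):
    """CEGIS loop: returns (realisation or None, number of verifier calls)."""
    conflicts, iterations = [], 0
    r = get_realisation(domains, conflicts, cost, budget)
    while r is not None:
        iterations += 1
        new_conflicts = verify(r)
        if not new_conflicts:
            return r, iterations           # r satisfies the specification
        conflicts.extend(new_conflicts)    # prune the design space
        r = get_realisation(domains, conflicts, cost, budget)
    return None, iterations


def toy_verify(r):
    """Hypothetical verifier: X = 0 is always bad, and so is Y = Z."""
    if r["X"] == 0:
        return [{"X": 0}]
    if r["Y"] == r["Z"]:
        return [{"Y": r["Y"], "Z": r["Z"]}]
    return []


domains = {"X": [0, 1], "Y": [0, 1], "Z": [0, 1]}
result, calls = synthesise(domains, lambda r: 0, 0, toy_verify)
unsat, unsat_calls = synthesise(domains, lambda r: 0, 0,
                                lambda r: [{"X": r["X"]}])
```

Termination relies on the verifier being sound in the sense of Definition 6: every returned conflict must cover the realisation that was just checked, otherwise the same realisation would be selected again. In the second call, two verifier calls suffice to rule out all eight realisations.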

Initialise: Let hole have (ordered) options . To encode realisation , we introduce integer-valued meta-variables with the semantics that whenever hole has value , i.e., . We set , where ensures that each hole is assigned to some option, ensures that the sketch’s constraints are satisfied, and ensures that the budget is respected. These sub-formulae are:

where denotes that in every constraint we replace each option name for an option with , and are fresh variables storing the cost for the selected option at hole .

Example 8

For sketch in Fig. 4(a), we obtain (with slight simplifications)

GetRealisation: To obtain a realisation , we check satisfiability of . The solver either returns Unsat indicating that the synthesiser is finished, or Sat, together with a satisfying assignment . The assignment uniquely identifies a realisation by . The sum over gives .

Verify: invokes any sound and complete verifier, e.g., an adaption of the verifier from Sect. 3.2 as presented in Sect. 4.3.

LearnFromConflict: For a conflict (see footnote 3), we add the formula

that excludes realisations . Intuitively, the formula states that the realisations remaining in the design space (encoded by the updated ) must have different valuations of holes w.r.t.  (for holes where ).

Footnote 3: As in Sect. 3.1: A partial realisation for is a function s.t. . For partial realisations , let iff . Let be a realisation s.t.  for . Partial realisation is a conflict for iff .

Example 9

Consider from Ex. 8. The satisfying assignment (for ) represents , from Ex. 6. Consider . The verifier (for now, magically) constructs a conflict set with . The synthesiser updates (recall that encodes ). A satisfying assignment for encodes from Ex. 7. As , the verifier reports no conflict.

Optimal synthesis. We adapt the synthesiser to support max synthesis, cf. Alg. 3.

1:function Synthesis(, , , goal predicate , tolerance )
2:   , , Initialise
3:    getRealisation
4:   while  Unsat do
5:       OptimiseVerify
6:      if   then
7:       LearnFromConflict
8:       getRealisation    
9:   return
Algorithm 3 Synthesiser (max synthesis)

Recall that the problem aims at maximising the probability of reaching states described by a predicate , w.r.t. the tolerance . Algorithm 3 stores in the maximal probability among all considered realisations , and this in . In each iteration, an optimising verifier is invoked (line 5) on realisation . If and , it returns an empty conflict set and . Otherwise, it reports a conflict set for .

4.3 A program-level verifier

We now adapt the state-level verifier from Sect. 3.2 in Alg. 1 to use program-level counterexamples [71] for generating conflicts. The appendix contains more details.

generateMC: This procedure first constructs the instance , i.e., a program without holes, from and , as in Fig. 4(b): Constraints in the sketch are removed, as they are handled by the synthesiser. This approach allows us to use any model checker supporting PRISM programs. The realisation is passed separately, the sketch is parsed once and then appropriately instantiated. The instance is then translated into the underlying MC via standard procedures, with transitions annotated with their generating commands.

Figure 5: CEs for (a) and (b) . (a) CE for upper bound. (b) CE for lower bound.

ComputeCriticalSet computes program-level CEs as analogue of critical sets. They are defined on commands rather than on states. Let be a program with commands . Let denote the restriction of to (with variables and initial states as in ). Building may introduce deadlocks in (just like a critical set introduces deadlocks). To remedy this, we use the standard operation , which takes a program and adds commands that introduce self-loops for states without enabled guard.
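Restriction to a subset of commands and the subsequent deadlock fix can be sketched as follows. Commands are again encoded as Python pairs of a guard function and a list of (probability, update) pairs; this encoding and the names `restrict` and `fix_deadlocks` are assumptions for illustration and do not correspond to a PRISM or Storm API.

```python
def restrict(commands, keep):
    """Keep only the commands whose index is in `keep`."""
    return [c for i, c in enumerate(commands) if i in keep]


def fix_deadlocks(commands):
    """Add one command that is enabled exactly where no other guard holds
    and leaves all variables unchanged (a self-loop)."""
    guards = [guard for guard, _ in commands]

    def no_guard_enabled(v):
        return not any(g(v) for g in guards)

    return commands + [(no_guard_enabled, [(1.0, lambda v: {})])]


# Hypothetical program over a single variable s.
commands = [
    (lambda v: v["s"] == 0, [(0.5, lambda v: {"s": 1}), (0.5, lambda v: {"s": 2})]),
    (lambda v: v["s"] == 1, [(1.0, lambda v: {"s": 3})]),
]
fixed = fix_deadlocks(restrict(commands, {0}))
loop_guard, loop_updates = fixed[-1]
```

After restricting to the first command, the state with s equal to 1 has no enabled guard, so the added command applies there and keeps the state unchanged; it is not enabled where s equals 0, so no overlapping guards are introduced.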

Definition 9

For program and specification with , a program-level CE is a set of commands, such that for all (non-overlapping) programs with (i.e., extending ), .

Example 10

Reconsider . Figure 5(a) shows a CE for in Fig. 4(b). The probability to reach s=3 in the underlying MC is .

For safety properties, program-level CEs coincide with high-level CEs proposed in [71]; their extension to liveness properties follows the ideas on families. The program-level CEs are computed by an extension of the MaxSat [14] approach from [33]. The appendix contains details and further extensions.

GenerateConflict generates conflicts from commands: we map commands in to the commands from , i.e., we restore the information about the critical holes corresponding to the part of the design space that can be pruned by CE . Formally, for all that appear in restriction .

Proposition 2

If is a CE for , then is also a CE for each , .

Example 11

The CEs in Fig. 5(a) contain commands which depend on the realisations for holes X and Y. For these fixed values, the program violates the specification independent of the value for Z, so Z is not in the conflict .

5 Experimental Evaluation and Discussion

Implementation. We evaluate the synthesis framework with a prototype using the SMT-solver Z3 [56], and (an extension of) the model checker Storm [34].

Case studies. We consider the following three case studies:

Dynamic power management (DPM). The goal of this adapted DPM problem [13] is to trade off power consumption for performance. We sketch a controller that decides based on the current workload, inspired by [39]. The fixed environment contains no holes. The goal is to synthesise the guards and updates to satisfy a specification with properties such as : the expected number of lost requests is below , and : the expected energy consumption is below .

Intrusion describes a network (adapted from [51]), in which the controller tries to infect a target node via intermediate nodes. A failed attack makes a node temporarily harder to intrude. We sketched a partial strategy aiming to minimise the expected time to intrusion. Constraints encode domain specific knowledge.

Grid is based on a classical benchmark for solving partially observable MDPs (POMDPs) [48]. To solve POMDPs, the task is to find an observation-based strategy, which is undecidable for the properties we consider. Therefore, we resort to finding a deterministic -state strategy [55] s.t. in expectation, the strategy requires less than steps to the target. This task is still hard: finding a memoryless, observation-based strategy is already NP-hard [69, 29]. We create a family describing all -state strategies (for some fixed ) for the POMDP. Like in [47] actions are reflected by parameters, while parameter dependencies ensure that the strategy is observation-based.

Evaluation. We compare against an enumerative baseline, whose runtime depends linearly on the number of realisations and on the underlying MCs’ size. We focus on sketches where all realisations are explored, as relevant for optimal synthesis. For concise presentation we use Unsat variants of feasibility synthesis, where methods perform mostly independently of the order of exploring realisations. We evaluate results for DPM, and summarise further results. All results are obtained on a MacBook MF839LL/A, within 3 hours and using less than 8 GB RAM.

DPM has 9 holes with 260K realisations, and MCs have 5K (reachable) states on average, ranging from 2K to 8K states. The performance of CEGIS significantly depends on the specification, namely, on the thresholds appearing in the properties. Fig. 6(a) shows how the number of iterations (left axis, green circle) and the runtime in seconds (right axis, blue) change for varying for property (stars and crosses are explained later). We obtain a speedup of over the baseline for , dropping to for , where is the minimal probability over all realisations. The strong dependency between performance and “unsatisfiability” is not surprising. The more unsatisfiable, the smaller the conflicts (as in [33]). Small conflicts have a double beneficial effect. First, the prototype uses an optimistic verifier searching for minimal conflicts; small conflicts are found faster than large ones. Second, small conflicts prune more realisations. A slightly higher number of small conflicts yields a severe decrease in iterations. Thus the further the threshold from the optimum, the better the performance.

Reconsider Fig. 6(a): crosses and stars correspond to a variant in which we have blown up the state space of the underlying MCs by a factor B-UP. Observe that performance degrades similarly for the baseline and our algorithm, which means that the speedup w.r.t. the baseline is not considerably affected by the size of the underlying MCs. This observation holds for various models and specifications.















(a) Performance for varying threshold (series: B-UP = 1, B-UP = 2, B-UP = 5; axis: time in seconds)
(b) Performance for variants of the sketch (axis: time in seconds)
Figure 6: Performance (runtime and iterations) on DPM

Varying the sketch tremendously affects performance, cf. Fig. 6(b) for the performance of variants of the original sketch with some hole substituted. The framework performs significantly better on sketches with holes that lie in local regions of the MC. Holes relating to states all over the MC are harder to prune. Finally, our prototype generally performs better with specifications that have multiple (conflicting) properties: some realisations can be effectively pruned by conflicts w.r.t. one property, whereas other realisations are easily pruned by conflicts w.r.t. another property.

Intrusion has 26 holes and 6800K realisations; the underlying MCs have only 500 states on average. We observe an even more significant effect of the property thresholds on the performance, as the number of holes is larger (recall the optimistic verifier). We obtain a speedup of factor , and over the baseline, for thresholds , and , respectively. For , many conflicts contain only 8 holes. Blowing up the model does not affect the obtained speedups. Differences among variants are again significant, albeit less extreme.

Grid is structurally different: only 6 holes in 3 commands and 1800 realisations, but MCs having 100K states on average. Observe that reaching the targets on expectation below some threshold implies that the goal must almost surely be reached. The MCs’ topology and the few commands make pruning hard: our algorithm needs more than iterations. Still, we obtain a speedup for . Pruning mostly follows from reasoning about realisations that do not reach the target almost surely. Therefore, the speedup is mostly independent of the relation between and .

Discussion. Optimistic verifiers search for a minimal CE and thus solve an NP-hard problem [28, 71]. In particular, we observed a lot of overhead when the smallest conflict is large, and any small CE that can be cheaply computed might be better for the performance (much like the computation of unsatisfiable cores in SMT solvers). Likewise, reusing information about holes from previous runs might benefit the performance. Improvements in concise sketching, and exploiting the additional structure, will also improve performance.

Sketching. Families are simpler objects than sketches, but their explicit usage of states makes them inadequate for modelling. Families can be lifted to a (restricted) sketching class, as in [27]. However, additional features like conflicts significantly ease the modelling process. Consider intrusion: Without constraints, the number of realisations grows to realisations. Put differently, the constraint allows us to discard over of the realisations up front. Moreover, constraints can exclude realisations that would yield unsupported programs, e.g., programs with infinite state spaces. When modelling concise sketches with small underlying MCs, it may be hard to avoid such invalid realisations without the use of constraints.
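The effect of constraints on the design space can be illustrated with a minimal sketch. It assumes that a constraint is simply a predicate over complete hole assignments. The three holes and the ordering constraint below are hypothetical and do not come from the intrusion sketch.

```python
from itertools import product

def realisations(options, constraint=lambda r: True):
    """Yield every assignment of options to holes that satisfies `constraint`."""
    holes = sorted(options)
    for values in product(*(options[h] for h in holes)):
        realisation = dict(zip(holes, values))
        if constraint(realisation):
            yield realisation

# Hypothetical sketch: three holes with values 0..3 and an ordering constraint.
options = {"x": range(4), "y": range(4), "z": range(4)}
unconstrained = sum(1 for _ in realisations(options))
constrained = sum(
    1 for _ in realisations(options, lambda r: r["x"] < r["y"] < r["z"])
)
print(unconstrained, constrained)
```

In this toy setting the constraint removes most of the candidate space before any verification query is issued, which mirrors the role constraints play in the paragraph above.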

Comparison with CEGAR. We also compared with our CEGAR-prototype [27], which applies an abstraction-refinement loop towards, e.g., feasibility synthesis. In particular, the abstraction aggregates multiple realisations in a single model to effectively reason about a set of designs. This approach does not support multi-objective specifications (as of now), and, more importantly, the algorithm is conceptually not capable of handling constraints. For DPM with a single objective, the CEGAR-prototype is drastically faster, while on Grid, it is typically (often significantly) slower. Comparing two prototypes is intricate, but there is a CEGIS strength and weakness that explains the different characteristics.

Weakness: Upon invocation, the CEGIS verifier gets exactly one realisation, and is (as of now) unaware of other options for the holes. The verifier constructs a CE which is valid for all possible extensions (cf. Definition 9), even for extensions which do not correspond to any realisation. It would be better to compute CEs that are (only) valid for all possible realisations. The following example (exaggerating DPM) illustrates that considering multiple realisations at once may be helpful: Consider a family with a parametric transition (hole) from the initial state and a specification that requires reaching the failure state with probability smaller than 0.1. Assume that all 100 options lead to the failure state with probability 1. CEGIS never prunes this hole as it is relevant for every realisation. Knowing that all options lead to the failure state, however, makes the hole trivially not relevant. Thus, all corresponding options can be safely pruned.
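The example can be replayed in a few lines. This is a minimal sketch, assuming a single hole whose options are summarised by their failure probability. The two functions are illustrative stand-ins: the first mimics a verifier that sees one realisation at a time, the second a family-level check that bounds the failure probability over all options at once. Neither reflects the actual prototype code.

```python
def cegis_iterations(fail_prob, threshold):
    """One-realisation-at-a-time loop: each conflict fixes the hole's option."""
    remaining = set(fail_prob)
    iterations = 0
    while remaining:
        option = remaining.pop()      # pick a candidate and remove it
        iterations += 1
        if fail_prob[option] < threshold:
            return iterations, option  # satisfying realisation found
        # The conflict contains the hole, so only this one option is pruned.
    return iterations, None

def family_is_feasible(fail_prob, threshold):
    """Family-level check: the best case over all options decides feasibility."""
    return min(fail_prob.values()) < threshold

# All 100 options reach the failure state with probability 1.
fail_prob = {option: 1.0 for option in range(100)}
iterations, witness = cegis_iterations(fail_prob, 0.1)
print(iterations, witness, family_is_feasible(fail_prob, 0.1))
```

The per-realisation loop needs one verifier call per option, whereas the family-level bound rejects all options with a single check.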

Strength: The weakness is related to its strength: the verifier works with one concrete MC. An extreme example (exaggerating Grid) is a sketch with holes , where hole has options , and option makes holes irrelevant (by the model topology). CEGIS considers a realisation, say , that violates the specification. As holes with are not relevant, CEGIS finds a conflict . Indeed, for every realisation, it is able to prune all but two holes. However, if the verifier considered many realisations for , it might (without advanced reasoning) generate much larger conflicts. Thus, considering a single realisation naturally fixes the context of the selected options and makes it clearer which holes are not relevant.
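One possible reading of this example can be simulated directly. The sketch below assumes a selector hole h0 with options 1..k, where option i makes every hole other than h0 and hi irrelevant. It further assumes that each of h1..hk has m options and that every realisation violates the specification. These parameters and the structure are assumptions made for illustration, since the concrete symbols are not given above.

```python
from itertools import product

def cegis_on_selector_sketch(k, m):
    """Count CEGIS iterations when every realisation violates the specification.

    Hole h0 selects one of the holes h1..hk; only the selected hole matters,
    so each verifier call yields a conflict over exactly two holes.
    """
    conflicts = set()   # each conflict is a pair (selected hole, its option)
    iterations = 0
    total = 0
    for h0 in range(1, k + 1):
        for rest in product(range(m), repeat=k):
            total += 1
            conflict = (h0, rest[h0 - 1])
            if conflict in conflicts:
                continue            # realisation already pruned
            conflicts.add(conflict)
            iterations += 1         # one verifier call, one new conflict
    return total, iterations

total, iterations = cegis_on_selector_sketch(k=4, m=3)
print(total, iterations)
```

With these assumptions the number of verifier calls grows with k times m, while the number of realisations grows with k times m to the power k.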


  • [1] Abate, A., David, C., Kesseli, P., Kroening, D., Polgreen, E.: Counterexample guided inductive synthesis modulo theories. In: CAV (1). LNCS, vol. 10981, pp. 270–288. Springer (2018)
  • [2] Ábrahám, E., Becker, B., Dehnert, C., Jansen, N., Katoen, J.P., Wimmer, R.: Counterexample Generation for Discrete-Time Markov Models: An Introductory Survey, pp. 65–121. Springer (2014)
  • [3] Alur, R., Bodík, R., Dallal, E., Fisman, D., Garg, P., Juniwal, G., Kress-Gazit, H., Madhusudan, P., Martin, M.M.K., Raghothaman, M., Saha, S., Seshia, S.A., Singh, R., Solar-Lezama, A., Torlak, E., Udupa, A.: Syntax-guided synthesis. In: Dependable Software Systems Engineering, NATO Science for Peace and Security Series, vol. 40, pp. 1–25. IOS Press (2015)
  • [4] Alur, R., Singh, R., Fisman, D., Solar-Lezama, A.: Search-based program synthesis. Commun. ACM 61(12), 84–93 (2018)
  • [5] Antonik, A., Huth, M., Larsen, K.G., Nyman, U., Wasowski, A.: 20 years of modal and mixed specifications. Bulletin of the EATCS 95, 94–129 (2008)
  • [6] Baier, C., de Alfaro, L., Forejt, V., Kwiatkowska, M.: Model checking probabilistic systems. In: Handbook of Model Checking, pp. 963–999. Springer (2018)
  • [7] Baier, C., Katoen, J.P.: Principles of Model Checking. MIT Press (2008)
  • [8] Balsamo, S., Di Marco, A., Inverardi, P., Simeoni, M.: Model-based performance prediction in software development: A survey. IEEE Trans. Softw. Eng. 30(5), 295–310 (2004)
  • [9] Bartocci, E., Grosu, R., Katsaros, P., Ramakrishnan, C.R., Smolka, S.A.: Model repair for probabilistic systems. In: TACAS. LNCS, vol. 6605, pp. 326–340. Springer (2011)
  • [10] Becker, S., Koziolek, H., Reussner, R.: The Palladio component model for model-driven performance prediction. J. Syst. & Softw. 82(1) (2009)
  • [11] Benes, N., Kretínský, J., Larsen, K.G., Møller, M.H., Sickert, S., Srba, J.: Refinement checking on parametric modal transition systems. Acta Inf. 52(2-3), 269–297 (2015)
  • [12] Benes, N., Křetínský, J., Larsen, K.G., Møller, M.H., Srba, J.: Dual-priced modal transition systems with time durations. In: LPAR. LNCS, vol. 7180, pp. 122–137. Springer (2012)
  • [13] Benini, L., Bogliolo, A., Paleologo, G., Micheli, G.D.: Policy optimization for dynamic power management. IEEE Trans. on CAD of Integrated Circuits and Systems 8(3), 299–316 (2000)
  • [14] Biere, A., Heule, M., van Maaren, H., Walsh, T. (eds.): Handbook of Satisfiability, Frontiers in Artificial Intelligence and Applications, vol. 185. IOS Press (2009)
  • [15] Bohnenkamp, H.C., D’Argenio, P.R., Hermanns, H., Katoen, J.P.: MODEST: A compositional modeling formalism for hard and softly timed systems. IEEE Trans. Software Eng. 32(10), 812–830 (2006)
  • [16] Bondy, A.B.: Foundations of Software and System Performance Engineering. Addison Wesley (2014)
  • [17] Bornholt, J., Torlak, E., Grossman, D., Ceze, L.: Optimizing synthesis with metasketches. In: POPL. pp. 775–788. ACM (2016)
  • [18] Budde, C.E., Dehnert, C., Hahn, E.M., Hartmanns, A., Junges, S., Turrini, A.: JANI: quantitative model and tool interaction. In: TACAS. LNCS, vol. 10206, pp. 151–168 (2017)
  • [19] Calinescu, R., Ghezzi, C., Johnson, K., et al.: Formal verification with confidence intervals to establish quality of service properties of software systems. IEEE Trans. Rel. 65(1), 107–125 (2016)
  • [20] Calinescu, R., Ghezzi, C., Kwiatkowska, M.Z., Mirandola, R.: Self-adaptive software needs quantitative verification at runtime. Commun. ACM 55(9), 69–77 (2012)
  • [21] Calinescu, R., Češka, M., Gerasimou, S., Kwiatkowska, M., Paoletti, N.: Designing robust software systems through parametric Markov chain synthesis. In: ICSA. pp. 131–140. IEEE (2017)
  • [22] Calinescu, R., Češka, M., Gerasimou, S., Kwiatkowska, M., Paoletti, N.: RODES: A robust-design synthesis tool for probabilistic systems. In: QEST. pp. 304–308. Springer (2017)
  • [23] Calinescu, R., Češka, M., Gerasimou, S., Kwiatkowska, M., Paoletti, N.: Efficient synthesis of robust models for stochastic systems. J. Syst. & Softw. 143, 140–158 (2018)
  • [24] Cardelli, L., Češka, M., Fränzle, M., et al.: Syntax-guided optimal synthesis for chemical reaction networks. In: CAV. pp. 375–395. Springer (2017)
  • [25] Černý, P., Chatterjee, K., Henzinger, T.A., Radhakrishna, A., Singh, R.: Quantitative synthesis for concurrent programs. In: CAV. Springer (2011)
  • [26] Češka, M., Dannenberg, F., Paoletti, N., Kwiatkowska, M., Brim, L.: Precise parameter synthesis for stochastic biochemical systems. Acta Inf. 54(6), 589–623 (2017)
  • [27] Češka, M., Jansen, N., Junges, S., Katoen, J.P.: Shepherding hordes of Markov chains. In: TACAS. LNCS, vol. 11428. Springer (2019)
  • [28] Chadha, R., Viswanathan, M.: A counterexample-guided abstraction-refinement framework for markov decision processes. ACM Trans. Comput. Log. 12(1), 1:1–1:49 (2010)
  • [29] Chatterjee, K., Chmelik, M., Davies, J.: A symbolic SAT-based algorithm for almost-sure reachability with small strategies in POMDPs. In: AAAI. pp. 3225–3232. AAAI Press (2016)
  • [30] Chaudhuri, S., Clochard, M., Solar-Lezama, A.: Bridging boolean and quantitative synthesis using smoothed proof search. In: POPL. ACM (2014)
  • [31] Chen, T., Hahn, E.M., Han, T., Kwiatkowska, M.Z., Qu, H., Zhang, L.: Model repair for Markov decision processes. In: TASE. pp. 85–92. IEEE (2013)
  • [32] Chrszon, P., Dubslaff, C., Klüppelholz, S., Baier, C.: ProFeat: feature-oriented engineering for family-based probabilistic model checking. Formal Asp. Comput. 30(1), 45–75 (2018)
  • [33] Dehnert, C., Jansen, N., Wimmer, R., Ábrahám, E., Katoen, J.P.: Fast debugging of PRISM models. In: ATVA. LNCS, vol. 8837, pp. 146–162. Springer (2014)
  • [34] Dehnert, C., Junges, S., Katoen, J.P., Volk, M.: A storm is coming: A modern probabilistic model checker. In: CAV. LNCS, vol. 10427, pp. 592–600. Springer (2017)
  • [35] Delahaye, B., Katoen, J.P., Larsen, K.G., Legay, A., Pedersen, M.L., Sher, F., Wasowski, A.: Abstract probabilistic automata. Inf. Comput. 232, 66–116 (2013)
  • [36] Dureja, R., Rozier, K.Y.: More scalable LTL model checking via discovering design-space dependencies. In: TACAS (1). LNCS, vol. 10805, pp. 309–327. Springer (2018)
  • [37] Filieri, A., Tamburrelli, G., Ghezzi, C.: Supporting self-adaptation via quantitative verification and sensitivity analysis at run time. IEEE Trans. Software Eng. 42(1), 75–99 (2016)
  • [38] Fiondella, L., Puliafito, A.: Principles of Performance and Reliability Modeling and Evaluation. Springer Series in Reliability Engineering (2016)
  • [39] Gerasimou, S., Tamburrelli, G., Calinescu, R.: Search-based synthesis of probabilistic models for quality-of-service software engineering. In: ASE. pp. 319–330. IEEE Computer Society (2015)
  • [40] Ghezzi, C., Sharifloo, A.M.: Model-based verification of quantitative non-functional properties for software product lines. Information & Software Technology 55(3), 508–524 (2013)
  • [41] Gulwani, S., Polozov, O., Singh, R.: Program synthesis. Foundations and Trends in Programming Languages 4(1-2), 1–119 (2017)
  • [42] Hahn, E.M., Hermanns, H., Zhang, L.: Probabilistic reachability for parametric markov models. Software Tools for Technology Transfer 13(1), 3–19 (2011)
  • [43] Henzinger, T.A.: Quantitative reactive modeling and verification. Computer Science - R&D 28(4), 331–344 (2013)
  • [44] Hessel, A., Larsen, K.G., Mikucionis, M., et al.: Testing real-time systems using UPPAAL. In: Formal Methods and Testing, pp. 77–117. Springer (2008)
  • [45] Jansen, N., Humphrey, L.R., Tumova, J., Topcu, U.: Structured synthesis for probabilistic systems. CoRR abs/1807.06106 (2018), to appear at NFM’19
  • [46] Junges, S., Jansen, N., Dehnert, C., Topcu, U., Katoen, J.P.: Safety-constrained reinforcement learning for MDPs. In: TACAS. LNCS, vol. 9636, pp. 130–146. Springer (2016)
  • [47] Junges, S., Jansen, N., Wimmer, R., Quatmann, T., Winterer, L., Katoen, J.P., Becker, B.: Finite-state controllers of POMDPs using parameter synthesis. In: UAI. pp. 519–529. AUAI Press (2018)
  • [48] Kaelbling, L.P., Littman, M.L., Cassandra, A.R.: Planning and acting in partially observable stochastic domains. Artif. Intell. 101(1-2), 99–134 (1998)
  • [49] Kretínský, J.: 30 years of modal transition systems: Survey of extensions and analysis. In: Models, Algorithms, Logics and Tools. LNCS, vol. 10460, pp. 36–74. Springer (2017)
  • [50] Kwiatkowska, M., Norman, G., Parker, D.: Prism 4.0: Verification of probabilistic real-time systems. In: CAV. LNCS, vol. 6806, pp. 585–591. Springer (2011)
  • [51] Kwiatkowska, M.Z., Norman, G., Parker, D., Vigliotti, M.G.: Probabilistic mobile ambients. Theor. Comput. Sci. 410(12-13), 1272–1303 (2009)
  • [52] Larsen, K.G.: Verification and performance analysis of embedded and cyber-physical systems using UPPAAL. In: MODELSWARD’14. pp. IS–11–IS–11 (2014)
  • [53] Larsen, K.G., Thomsen, B.: A modal process logic. In: LICS. pp. 203–210. IEEE Computer Society (1988)
  • [54] Lindemann, C.: Performance modelling with deterministic and stochastic Petri nets. Perf. Eval. Review 26(2), 3 (1998)
  • [55] Meuleau, N., Kim, K.E., Kaelbling, L.P., Cassandra, A.R.: Solving POMDPs by searching the space of finite policies. In: UAI. pp. 417–426. Morgan Kaufmann Publishers Inc. (1999)
  • [56] de Moura, L., Bjørner, N.: Z3: An efficient SMT solver. In: TACAS’08. pp. 337–340. Springer (2008)
  • [57] Nori, A.V., Ozair, S., Rajamani, S.K., Vijaykeerthy, D.: Efficient synthesis of probabilistic programs. In: PLDI. pp. 208–217. ACM (2015)
  • [58] Quatmann, T., Dehnert, C., Jansen, N., Junges, S., Katoen, J.P.: Parameter synthesis for markov models: Faster than ever. In: ATVA. LNCS, vol. 9938, pp. 50–67 (2016)
  • [59] Quatmann, T., Jansen, N., Dehnert, C., Wimmer, R., Ábrahám, E., Katoen, J.P., Becker, B.: Counterexamples for expected rewards. In: FM. LNCS, vol. 9109, pp. 435–452. Springer (2015)
  • [60] Rodrigues, G.N., Alves, V., Nunes, V., Lanna, A., Cordy, M., Schobbens, P., Sharifloo, A.M., Legay, A.: Modeling and verification for probabilistic properties in software product lines. In: HASE. pp. 173–180. IEEE (2015)
  • [61] Rosenblum, D.S.: The power of probabilistic thinking. In: ASE. p. 3. ACM (2016)
  • [62] Sharma, V.S., Trivedi, K.S.: Quantifying software performance, reliability and security: An architecture-based approach. J. Syst. & Softw. 80(4), 493–509 (2007)
  • [63] Solar-Lezama, A., Jones, C.G., Bodik, R.: Sketching concurrent data structures. In: PLDI. pp. 136–148. ACM (2008)
  • [64] Solar-Lezama, A., Rabbah, R.M., Bodík, R., Ebcioglu, K.: Programming by sketching for bit-streaming programs. In: PLDI. pp. 281–294. ACM (2005)
  • [65] Solar-Lezama, A., Tancau, L., Bodik, R., Seshia, S., Saraswat, V.: Combinatorial sketching for finite programs. In: ASPLOS. pp. 404–415. ACM (2006)
  • [66] Stewart, W.J.: Probability, Markov Chains, Queues, and Simulation: the Mathematical Basis of Performance Modeling. Princeton Univ. Press (2009)
  • [67] Vandin, A., ter Beek, M.H., Legay, A., Lluch-Lafuente, A.: Qflan: A tool for the quantitative analysis of highly reconfigurable systems. In: FM. LNCS, vol. 10951, pp. 329–337. Springer (2018)
  • [68] Varshosaz, M., Khosravi, R.: Discrete time Markov chain families: modeling and verification of probabilistic software product lines. In: SPLC Workshops. pp. 34–41. ACM (2013)
  • [69] Vlassis, N., Littman, M.L., Barber, D.: On the computational complexity of stochastic controller optimization in POMDPs. ACM Trans. on Computation Theory 4(4), 12:1–12:8 (2012)
  • [70] Wimmer, R., Jansen, N., Ábrahám, E., Becker, B., Katoen, J.P.: Minimal critical subsystems for discrete-time Markov models. In: TACAS. LNCS, vol. 7214, pp. 299–314. Springer (2012)
  • [71] Wimmer, R., Jansen, N., Vorpahl, A., Ábrahám, E., Katoen, J.P., Becker, B.: High-level counterexamples for probabilistic automata. Logical Methods in Computer Science 11(1) (2015)
  • [72] Woodside, M., Petriu, D., Merseguer, J., Petriu, D., Alhaj, M.: Transformation challenges: from software models to performance models. J. Softw. & Syst. Modeling 13(4), 1529–1552 (2014)
  • [73] Wu, S., Smolka, S.A., Stark, E.W.: Composition and behaviors of probabilistic I/O automata. Theor. Comput. Sci. 176(1-2), 1–38 (1997)
  • [74] Zhou, W., Li, W.: Safety-aware apprenticeship learning. In: CAV’18. pp. 662–680. Springer (2018)

Appendix A Program-level Counterexamples for CEGIS

This section reports on the relevant steps to instantiate an efficient implementation of the syntax-guided synthesis framework. First, available implementations of CE generation are too restricted in the variety of properties they support. Moreover, the embedding into a CEGIS-loop changes the focus of the CE generation. We motivate and report on a selection of changes. Finally, when moving from the analysis of a single model to a family of models, the well-foundedness criteria need reviewing, which we exemplify for three particularly important criteria.

A.1 Better CEs for CEGIS

Before diving into any changes, we recap the technique of [33] for computing traditional program-level CEs:

Definition 10 (High-level CEs [33])

Given a program and property s.t. , is a high-level CE if .

We have the following connection between high-level and program-level CEs.

Proposition 3

For safety properties, high-level and program-level CEs coincide.

Each program-level CE is trivially a high-level CE. A high-level CE is a program-level CE, as any additional command can only be enabled in unreachable or deadlock states. Otherwise, the program would contain overlapping commands, violating Def. 9. Recall that assuming only non-overlapping programs is crucial. Consider adding the command s=0 -> s’=2 to the program in Fig. (a)a. The probability to reach s=3 is now reduced to resulting in the extended program no longer violating . Unreachable states are irrelevant to any property and, intuitively speaking, adding transitions to deadlocks cannot decrease the probability to reach .

The technique in [33] computes minimal high-level CEs (i.e. the smallest set of commands) violating a given reachability property with an upper bound . The set is not unique in general. It reduces the computation to repeatedly solving MaxSat instances over two sets of propositional formulae, and . encodes a selection of commands and (via a negation) the MaxSat solver minimises the size of the command set. encodes required (but not sufficient) constraints on valid CEs, e.g., it states that a command enabled in the initial state must be selected. In this way, the MaxSat solver returns a candidate set of commands. Using standard procedures, it is then checked whether is sufficient to exceed . If so, by construction is a minimal CE. Otherwise, is strengthened to exclude and (possibly) other candidate sets.
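The candidate-and-check loop can be illustrated on a toy program. The following is a minimal sketch, not the technique of [33]: the MaxSat solver is replaced by brute-force enumeration of command sets in order of increasing size, and no strengthening of formulae takes place. The four commands, the initial state 0 and the target state 3 are hypothetical. The check approximates the reachability probability from below by repeated sweeps, which suffices for this small example.

```python
from itertools import combinations

# Hypothetical program: command name -> (guard state, [(probability, successor), ...])
COMMANDS = {
    "c0": (0, [(0.5, 1), (0.5, 2)]),
    "c1": (1, [(1.0, 3)]),
    "c2": (2, [(0.2, 3), (0.8, 4)]),
    "c3": (4, [(1.0, 0)]),
}

def reach_prob(selected, init=0, target=3, sweeps=1000):
    """Approximate Pr(reach target) from below in the sub-program of `selected`.

    States without a selected command are deadlocks and keep value 0.
    """
    states = {init, target}
    for name in selected:
        guard, succs = COMMANDS[name]
        states.add(guard)
        states.update(s for _, s in succs)
    value = {s: 1.0 if s == target else 0.0 for s in states}
    for _ in range(sweeps):
        for name in selected:
            guard, succs = COMMANDS[name]
            if guard != target:
                value[guard] = sum(p * value[s] for p, s in succs)
    return value[init]

def minimal_counterexample(bound):
    """Smallest command set whose reachability probability exceeds `bound`."""
    names = sorted(COMMANDS)
    for size in range(1, len(names) + 1):          # smallest candidates first
        for candidate in combinations(names, size):
            if reach_prob(candidate) > bound:      # candidate suffices: minimal CE
                return set(candidate)
    return None                                    # the program satisfies the bound

print(minimal_counterexample(0.4))
print(minimal_counterexample(0.55))
```

Because candidates are tried in order of increasing size, the first set that exceeds the bound is minimal. As the text notes, such a set need not be unique, and a tighter bound forces a larger set of commands.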

A.1.1 More properties

We provide support for program-level CEs beyond upper bounds on reachability probabilities. We deem support for liveness properties essential. Consider a sketch for a controller that moves a robot. A sketch that includes the option to wait together with the single objective that something bad may only happen with a small probability might simply never make any progress. While not doing anything typically induces safety, the synthesised controller is not useful. Likewise, performance criteria such as expected time to completion or expected energy consumption are widespread, and typically expressed as expected rewards.

Liveness properties

Proposition 3 is crucial for the correctness of our approach, but it does not