Hybrid Compositional Reasoning for Reactive Synthesis from Finite-Horizon Specifications

11/19/2019 ∙ by Suguman Bansal, et al. ∙ 0

LTLf synthesis is the automated construction of a reactive system from a high-level description, expressed in LTLf, of its finite-horizon behavior. So far, the conversion of LTLf formulas to deterministic finite-state automata (DFAs) has been identified as the primary bottleneck to the scalability of synthesis. Recent investigations have also shown that the size of the DFA state space plays a critical role in synthesis. Therefore, effective resolution of the bottleneck requires the conversion to be time and memory performant, and to prevent state-space explosion. Current conversion approaches, however, which are based either on explicit-state representation or symbolic-state representation, fail to address these necessities adequately at scale: Explicit-state approaches generate minimal DFAs but are slow due to expensive DFA minimization. Symbolic-state representations can be succinct, but due to the lack of DFA minimization they generate such large state spaces that even their symbolic representations cannot compensate for the blow-up. This work proposes a hybrid representation approach for the conversion. Our approach utilizes both explicit and symbolic representations of the state space, and effectively leverages their complementary strengths. In doing so, we offer an LTLf-to-DFA conversion technique that addresses all three necessities, hence resolving the bottleneck. A comprehensive empirical evaluation on conversion and synthesis benchmarks supports the merits of our hybrid approach.

1 Introduction

Reactive synthesis is the automated construction, from a high-level description of its desired behavior, of a reactive system that continuously interacts with an uncontrollable external environment [Church1957]. This declarative paradigm holds the promise of simplifying the task of designing provably correct reactive systems.

This work looks into the development of reactive synthesis from specifications in Linear Temporal Logic over finite traces (LTLf), or LTLf synthesis, for short. LTLf is a specification language that expresses rich and complex temporal behaviors over a finite time horizon [De Giacomo and Vardi2013]. This formalism has found application in specifying task plans in robotics [He et al.2017, Lahijanian et al.2015], safety-critical objectives [Zhu et al.2017a], business processes [Pesic, Bosnacki, and van der Aalst2010], and the like.

Seminal results have established that LTLf synthesis is 2EXPTIME-complete [De Giacomo and Vardi2015]. Since then, several undertakings have led to algorithmic solutions for synthesis [De Giacomo and Vardi2015, Camacho et al.2018]. The current state-of-the-art reduces synthesis to a reachability game played on a deterministic finite-state automaton, or DFA [Zhu et al.2017b]. The DFA is obtained by converting the input LTLf specification into a DFA that recognizes the same language. This conversion has been identified as a primary scalability bottleneck in synthesis [Zhu et al.2017b]. This is not surprising as the DFA is known to be double-exponential in the size of the specification in the worst case [Kupferman and Vardi1999]. In order to be effective for synthesis the conversion must, in addition to being time and memory performant, also prevent state-space explosion, as recent investigations have discovered that the efficiency of solving the game on a DFA is strongly affected by the size of the state space [Tabajara and Vardi2019]. This work contributes towards the development of LTLf-to-DFA conversion techniques that are aimed at advancing the scalability of LTLf synthesis.

Prior works on LTLf-to-DFA conversion have led to two contrasting algorithmic approaches. In the first approach [Zhu et al.2017b], the state space of the DFA is represented explicitly, the construction is syntax driven, and the DFA is aggressively minimized. This approach first converts the LTLf formula to an equivalent first-order-logic formula and then constructs a DFA for this formula using the tool MONA [Henriksen et al.1995]. The algorithm first produces the binary syntax tree of the specification, then traverses the tree bottom-up while constructing the minimal DFA at each node. Consequently, it constructs the final DFA at the root of the tree in its canonical minimal form. Aggressive minimization can often prevent state-space explosion, as for many specifications arising from real-life situations the minimal DFAs are rarely more than exponential in the size of the specification, as opposed to double exponential [Tabakov, Rozier, and Vardi2012]. Yet, an exponential DFA might still be too large if the set of states is represented explicitly, and the overhead caused by aggressive DFA minimization grows rapidly with specification size.

The second approach, inspired by [Tabajara and Vardi2019], represents the DFA state space symbolically, uses a compositional construction, and avoids minimizing the DFAs. In compositional constructions, the specification is decomposed into multiple smaller sub-specifications for which explicit DFA conversion is tractable. These intermediate DFAs are then composed to get the final DFA. The symbolic representation encodes the state space of a DFA in a logarithmic number of bits, potentially achieving a polynomial representation even for an exponential-sized DFA, depending on the complexity of the DFA’s structure. The existing compositional approach takes advantage of this by representing the intermediate DFAs symbolically. In this case, the DFAs are composed by simply taking the symbolic product without performing minimization. The problem with this, however, is that each symbolic product results in a DFA with a larger state space than its minimal DFA, as no minimization is performed. When the number of symbolic products is large, the overhead in the size of the state space magnifies. Because of this, this approach ultimately produces a state space that is so enlarged that not even the succinct symbolic representation can compensate for the blow-up.

The key issue with both approaches is that their critical operation is effective at small scale but becomes inhibitory at large scale. Explicit approaches aggressively perform minimization, which is efficient on small DFAs but expensive on larger ones. Meanwhile, symbolic approaches perform symbolic products without minimization. While few symbolic products are manageable, too many products may lead to a large blow-up in the size of the state space.

This work proposes a novel compositional approach that is able to overcome the drawbacks of both existing approaches. Our approach utilizes a hybrid state-space representation, i.e., at different times it uses both the explicit and symbolic state representations for the intermediate DFAs. The core idea is to use explicit-state representation for the intermediate DFAs as long as minimization is not prohibitively expensive, and to switch over to symbolic-state representation as soon as that occurs. This way, our hybrid-representation approach applies explicit-state representation to small DFAs, and also delays the point at which the switch-over to symbolic representation occurs, thus ensuring that fewer symbolic products have to be performed to generate the final DFA. Therefore, by finding a balance between the two representations, our hybrid approach is able to extract their benefits and mitigate their weaknesses.

We have implemented our LTLf-to-DFA conversion algorithm, and its extension to LTLf synthesis via reachability games, in tools called Lisa and LisaSynt, respectively. A comprehensive empirical analysis reveals the merits of the proposed hybrid compositional approach on both DFA conversion and LTLf synthesis, as each tool outperforms the current state-of-the-art in runtime and memory consumption. In addition, the DFAs generated from Lisa have size comparable to the minimal DFA and significantly smaller than those obtained from pure symbolic-state methods.

2 Preliminaries

2.1 Linear Temporal Logic over finite traces

Linear Temporal Logic over finite traces (LTLf) [Baier and McIlraith2006, De Giacomo and Vardi2013] extends propositional logic with finite-horizon temporal operators. In effect, LTLf is a variant of LTL [Pnueli1977] that is interpreted over a finite rather than infinite trace. The syntax of an LTLf formula over a finite set of propositions $Prop$ is identical to LTL, and defined as $\varphi := a \in Prop \mid \neg\varphi \mid \varphi_1 \land \varphi_2 \mid \varphi_1 \lor \varphi_2 \mid X\varphi \mid \varphi_1 U \varphi_2 \mid F\varphi \mid G\varphi$. Here $X$ (Next), $U$ (Until), $F$ (Eventually), $G$ (Always) are temporal operators. The semantics of LTLf can be found in [De Giacomo and Vardi2013]. W.l.o.g., we assume that every LTLf formula $\varphi$ is written as a conjunction of LTLf subformulas, i.e., $\varphi = \bigwedge_{i=1}^{n} \varphi_i$. The language of an LTLf formula $\varphi$, denoted by $L(\varphi)$, is the set of finite words over $2^{Prop}$ that satisfy $\varphi$.
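To make the finite-trace reading of the temporal operators concrete, the following is a minimal sketch of an LTLf evaluator in Python. The tuple encoding of formulas and the function name `holds` are assumptions made for illustration and are not taken from the paper or its tools. The sketch follows the standard LTLf semantics, in which Next is strong (it requires a successor position) and traces are non-empty.

```python
from typing import FrozenSet, Sequence

# Hypothetical encoding: an atom is a string; compound formulas are tuples such as
# ("not", f), ("and", f, g), ("or", f, g), ("X", f), ("U", f, g), ("F", f), ("G", f).
Trace = Sequence[FrozenSet[str]]


def holds(f, trace: Trace, i: int = 0) -> bool:
    """Return True iff the finite, non-empty trace satisfies f at position i."""
    if isinstance(f, str):
        return f in trace[i]
    op = f[0]
    if op == "not":
        return not holds(f[1], trace, i)
    if op == "and":
        return holds(f[1], trace, i) and holds(f[2], trace, i)
    if op == "or":
        return holds(f[1], trace, i) or holds(f[2], trace, i)
    if op == "X":  # strong next: fails at the last position of the trace
        return i + 1 < len(trace) and holds(f[1], trace, i + 1)
    if op == "F":
        return any(holds(f[1], trace, j) for j in range(i, len(trace)))
    if op == "G":
        return all(holds(f[1], trace, j) for j in range(i, len(trace)))
    if op == "U":  # f[2] holds at some j >= i and f[1] holds at every position before j
        return any(
            holds(f[2], trace, j)
            and all(holds(f[1], trace, k) for k in range(i, j))
            for j in range(i, len(trace))
        )
    raise ValueError(f"unknown operator: {op!r}")
```

For instance, on the three-step trace `{a}, {a}, {b}` the formula `("U", "a", "b")` holds, while `("X", "b")` evaluated at the last position does not, because no successor position exists.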

LTLf synthesis is formally defined as follows:

Definition 1 (LTLf synthesis).

Let $\varphi$ be an LTLf formula over $Prop = I \cup O$ where the set of input variables $I$ and output variables $O$ are two disjoint sets of propositions. We say $\varphi$ is realizable if there exists a strategy $f : (2^{I})^{+} \to 2^{O}$ such that for every infinite sequence $\lambda = I_0, I_1, \ldots \in (2^{I})^{\omega}$ of interpretations over $I$, there exists $m \geq 0$ such that $\rho = (I_0 \cup f(I_0)), (I_1 \cup f(I_0, I_1)), \ldots, (I_m \cup f(I_0, \ldots, I_m))$ satisfies $\varphi$. The problem of LTLf synthesis is to decide whether a given $\varphi$ is realizable and to construct such a strategy if so.

Intuitively, LTLf synthesis can be perceived as a game between an external environment and the desired system that take turns to assign values to input and output propositions, respectively. The system responds to the environment inputs using the strategy $f$. The game is won by the system if its strategy is able to guarantee that the resultant input-output sequence will satisfy formula $\varphi$ after a finite number of turns. In our formulation of LTLf synthesis, like in [Tabajara and Vardi2019], the environment plays first. Alternatively, the system may play first [Zhu et al.2017b]. Solving the alternative formulation requires only slight changes to the algorithm presented in (§ 5). We adhere to the formulation in Definition 1 in this paper as our benchmarks assume that formulation and all tools being compared support it.

2.2 DFA and its representations

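A deterministic finite automaton (DFA) [Thomas, Wilke, and others2002] is a tuple $D = (\Sigma, S, \iota, F, \delta)$ where $\Sigma$ is a finite set of symbols (called an alphabet), $S$ is a finite set of states, $\iota \in S$ is the initial state, $F \subseteq S$ is the set of accepting states and $\delta : S \times \Sigma \to S$ is the transition function. A finite word $w = a_0 a_1 \ldots a_{n-1} \in \Sigma^{*}$ has a run $\rho = s_0 s_1 \ldots s_n \in S^{+}$ in $D$ if for all $i \in \{0, \ldots, n-1\}$ we have that $s_{i+1} = \delta(s_i, a_i)$ and $s_0 = \iota$. A run $\rho$ is an accepting run in $D$ if $s_n \in F$. A word $w$ is in the language of $D$, $L(D)$, if $w$ has an accepting run in $D$. A DFA is said to be minimal if the language represented by that DFA cannot be represented by another DFA with fewer states.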
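As a small illustration of the definitions of run and acceptance, the sketch below stores an explicit-state DFA in a Python dataclass. The class layout, the helper names `run` and `accepts`, and the example automaton `eventually_a` (a DFA for "eventually a" over a single proposition) are assumptions made for this example only.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Hashable, Iterable, List, Tuple


@dataclass
class DFA:
    alphabet: FrozenSet[Hashable]
    states: FrozenSet[Hashable]
    initial: Hashable
    accepting: FrozenSet[Hashable]
    # Total transition function: one entry per (state, symbol) pair.
    delta: Dict[Tuple[Hashable, Hashable], Hashable]


def run(dfa: DFA, word: Iterable[Hashable]) -> List[Hashable]:
    """Return the sequence of states visited while reading `word` from the initial state."""
    visited = [dfa.initial]
    for symbol in word:
        visited.append(dfa.delta[(visited[-1], symbol)])
    return visited


def accepts(dfa: DFA, word: Iterable[Hashable]) -> bool:
    """A word is accepted iff its (unique) run ends in an accepting state."""
    return run(dfa, word)[-1] in dfa.accepting


# Example over one proposition a; the symbol True means "a holds", False means "a does not hold".
eventually_a = DFA(
    alphabet=frozenset({True, False}),
    states=frozenset({"wait", "seen"}),
    initial="wait",
    accepting=frozenset({"seen"}),
    delta={
        ("wait", True): "seen", ("wait", False): "wait",
        ("seen", True): "seen", ("seen", False): "seen",
    },
)
```

Because the automaton is deterministic and its transition function is total, every word has exactly one run, so acceptance is decided by a single left-to-right pass.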

Every LTLf formula $\varphi$ over $Prop$ can be converted into a DFA $D$ with alphabet $\Sigma = 2^{Prop}$ [De Giacomo and Vardi2013] such that $L(D) = L(\varphi)$. If this DFA is constructed in a form that explicitly enumerates all DFA states, we call it an explicit-state representation. A DFA over the alphabet $2^{Prop}$ can also be compactly represented symbolically, by also encoding the state space using a logarithmic number of propositions. The symbolic-state representation of a DFA $D = (2^{Prop}, S, \iota, F, \delta)$ is a tuple $(S(Z), T(Z, Prop, Z'), F(Z))$. In this representation, $Z = \{z_1, \ldots, z_n\}$ are propositions encoding the state space $S$, with $n = \lceil \log_2 |S| \rceil$, and their primed counterparts $Z' = \{z'_1, \ldots, z'_n\}$ encode the next state. Each state $s \in S$ corresponds to an interpretation over propositions $Z$. When representing the next state of the transition function, the same encoding is used for an interpretation over $Z'$. Then, $S(Z)$, $T(Z, Prop, Z')$ and $F(Z)$ are Boolean formulas representing $\iota$, $\delta$ and $F$, respectively. $S(Z)$ is satisfied only by the interpretation of the initial state $\iota$ over $Z$. $T(Z, Prop, Z')$ is satisfied by interpretations $z \in 2^{Z}$, $a \in 2^{Prop}$ and $z' \in 2^{Z'}$ iff $\delta(s, a) = s'$, where $s$ and $s'$ are the states corresponding to $z$ and $z'$. Lastly, $F(Z)$ is satisfied by the interpretation over $Z$ corresponding to state $s$ iff $s \in F$. The intersection of two DFAs $D_1 = (S_1(Z_1), T_1(Z_1, Prop, Z'_1), F_1(Z_1))$ and $D_2 = (S_2(Z_2), T_2(Z_2, Prop, Z'_2), F_2(Z_2))$, denoted $D_1 \land D_2$, is given by $(S_1 \land S_2, T_1 \land T_2, F_1 \land F_2)$. In this paper, all Boolean formulas, including $S$, $T$ and $F$ of a symbolic DFA, will be encoded using Reduced Ordered Binary Decision Diagrams (BDDs) [Bryant1986].
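The intersection of two DFAs can be illustrated in explicit-state form with the classical product construction. The sketch below is an assumption-laden illustration, not the paper's implementation: the `DFA` tuple layout and the helpers `intersect`, `states_of` and `state_bits` are hypothetical names. It builds only the reachable part of the product and reports how many Boolean state variables a logarithmic encoding of the result would need.

```python
from collections import deque
from math import ceil, log2
from typing import Dict, FrozenSet, Hashable, NamedTuple, Tuple


class DFA(NamedTuple):
    alphabet: FrozenSet[Hashable]
    initial: Hashable
    accepting: FrozenSet[Hashable]
    delta: Dict[Tuple[Hashable, Hashable], Hashable]  # total transition function


def states_of(dfa: DFA) -> set:
    """Collect every state mentioned by the DFA."""
    return {dfa.initial} | {s for s, _ in dfa.delta} | set(dfa.delta.values())


def intersect(d1: DFA, d2: DFA) -> DFA:
    """Product construction restricted to states reachable from the initial pair."""
    assert d1.alphabet == d2.alphabet, "both DFAs must share one alphabet"
    start = (d1.initial, d2.initial)
    delta, seen, queue = {}, {start}, deque([start])
    while queue:
        state = queue.popleft()
        s1, s2 = state
        for a in d1.alphabet:
            nxt = (d1.delta[(s1, a)], d2.delta[(s2, a)])
            delta[(state, a)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    # A pair is accepting iff both components are accepting.
    accepting = frozenset(
        s for s in seen if s[0] in d1.accepting and s[1] in d2.accepting
    )
    return DFA(d1.alphabet, start, accepting, delta)


def state_bits(dfa: DFA) -> int:
    """Number of Boolean variables needed to encode the state space (at least one)."""
    return max(1, ceil(log2(len(states_of(dfa)))))


def accepts(dfa: DFA, word) -> bool:
    state = dfa.initial
    for a in word:
        state = dfa.delta[(state, a)]
    return state in dfa.accepting
```

In the symbolic setting described above, the product instead conjoins the two operands' formulas and keeps both sets of state variables, so its variable count is the sum of the operands' counts regardless of how many product states are actually reachable or distinguishable.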

2.3 DFA game

A DFA game is a reachability game between two players, called the environment and the system, played over a DFA with alphabet $2^{I \cup O}$. The environment player assigns values to the input variables $I$, while the system assigns values to the output variables $O$. The DFA game starts at the initial state of the DFA. At each round of the game, first the environment chooses an assignment to the $I$ variables, and then the system chooses an assignment to the $O$ variables. The combined assignment determines the unique state the game moves to according to the transition function of the DFA. The system wins the game if the game reaches an accepting state of the DFA. Solving a DFA game corresponds to determining whether there exists a strategy for the system to always win the game.

DFA games are known to be solvable in polynomial time with respect to the number of states [Mazala2002]. The algorithm determines if the initial state is a winning state, i.e., a state that is either accepting or from which, for every assignment to the $I$ variables, the system can always choose an assignment to the $O$ variables that leads to a winning state. More details will be given in (§ 5). If the initial state is a winning state, then there exists a winning strategy that can be represented by a Mealy machine that determines the output of the system given the current state and input. For more details, refer to [Tabajara and Vardi2019].

3 Related work

LTLf to DFA conversion

There are currently two commonly used approaches for the conversion. In the current state-of-the-art approach, the LTLf formula is translated into first-order logic over finite traces, and then converted into a DFA by MONA, a more general conversion tool from monadic second-order logic to DFA [Henriksen et al.1995]. The first LTLf synthesis tool Syft utilizes this method for DFA generation [Zhu et al.2017b].

An alternative approach, used by the tool Spot [Duret-Lutz et al.2016], is to translate the LTLf formula into an LTL formula with equivalent semantics, convert this formula into a Büchi automaton [Gerth et al.1995], and then transform this Büchi automaton into a DFA. Both approaches generate a DFA in explicit state-space representation.

DFA vs. NFA

NFAs are more general than DFAs. In fact, NFAs can be constructed from an LTLf formula with a single-exponential blow-up as opposed to the double-exponential blow-up incurred for DFA construction. Various approaches for LTLf-to-NFA conversion with single-exponential blow-up have been described, such as [Baier and McIlraith2006, De Giacomo and Vardi2015]. Yet, in practice, single-exponential NFA conversion tools do not perform as well as DFA conversion tools. [Tabakov, Rozier, and Vardi2012] shows that minimal DFAs from LTLf formulas tend to be orders of magnitude smaller than their NFA counterparts constructed from implementations of the single-exponential algorithms.

LTLf synthesis

As mentioned above, the current state-of-the-art tool Syft [Zhu et al.2017b] uses MONA to construct an explicit-state DFA, then converts this DFA into a symbolic representation in order to solve the game using a symbolic fixed-point computation. The explicit-state DFA construction has been identified as the primary bottleneck to Syft as the length of the formula increases. Therefore, recent attempts in LTLf synthesis have been made to avoid the explicit DFA construction. We describe these attempts below.

A recent approach attempted to avoid the full construction by instead decomposing the specification into conjuncts, then converting each conjunct to an individual DFA [Tabajara and Vardi2019]. Since these conjuncts are smaller formulas, their explicit-state DFAs can be constructed efficiently. The smaller DFAs are then converted into a symbolic representation and the game is solved over this decomposed symbolic representation. While the construction was indeed more efficient in terms of time and memory, the resulting DFA had a much larger state space. This severely decreased the performance of the game-solving algorithm, yielding a poorly scaling procedure for LTLf synthesis.

In another attempt to avoid DFA construction, novel synthesis techniques have been developed that operate on the more general NFA [Camacho et al.2018]. Here, LTLf synthesis is reduced to fully-observable nondeterministic (FOND) planning that is encoded from an NFA equivalent to the LTLf formula. Even here, the specification is decomposed into conjuncts, which are separately converted to NFAs and used to encode to FOND. Despite the generalization to NFAs, in practice FOND-based methods rely on DFA conversion tools since they are more competitive than existing NFA construction tools that incur a single-exponential blow-up. Previous experiments suggest the FOND-based approach is complementary with the game-based approach, each being able to solve instances that the other cannot.

Compositional techniques in temporal synthesis

Both [Tabajara and Vardi2019] and [Camacho et al.2018] benefit from compositional techniques as they both decompose the input formula into conjuncts before construction of the respective automata. Application-specific decomposition has also been shown to lead to an orders-of-magnitude improvement in synthesis for robotics [He et al.2019].

A precedent for compositional techniques exists also in synthesis of LTL over infinite traces, including in several state-of-the-art tools such as Strix [Meyer, Sickert, and Luttenberger2018] and Acacia+ [Bohy et al.2012]. Strix decomposes the formula semantically, i.e., it generates a subformula if it belongs to a restricted fragment of LTL such as safety LTL or co-safety LTL. This way it benefits from constructing automata using more efficient fragment-specific algorithms. On the other hand, Acacia+ decomposes the formula into conjuncts, which are each solved as a separate safety game. The final solution is obtained by composing solutions from the separate safety games.

4 Hybrid compositional DFA generation

This section describes the primary contribution of this work. We present a novel compositional approach for LTLf-to-DFA conversion. Our approach is based on using a hybrid-state representation, i.e., at different times it uses both explicit and symbolic-state representations for intermediate DFAs, as opposed to prior works in which only one of the two state-representations is used [Zhu et al.2017b, Camacho et al.2018, Tabajara and Vardi2019]. By diligent application of both representations, our hybrid approach is able to leverage their complementary strengths and render an algorithm that is not only competitive time- and memory-wise, but also generates DFAs with a small number of states.

Our compositional approach is comprised of two phases, called the decomposition phase and the composition phase. In the decomposition phase, the input formula is first decomposed into smaller subformulas which are then converted into their equivalent DFAs using standard algorithms. In the composition phase, the intermediate DFAs are composed to produce the final DFA. We describe each phase for our hybrid approach in detail below. The formal description of our algorithm has been deferred to the Appendix.

4.1 Decomposition phase

The decomposition phase is the first step in our algorithm. This phase receives the LTLf formula $\varphi$ as input. We make an assumption that the formula is given as the conjunction of multiple small LTLf subformulas, i.e., $\varphi = \bigwedge_{i=1}^{n} \varphi_i$ where each $\varphi_i$ is an LTLf formula in itself. This assumption has been adopted as a standard practice in synthesis domains as large specifications arising from applications tend to exhibit this form [Filiot, Jin, and Raskin2010, Filiot, Jin, and Raskin2011].

We interpret formula $\varphi$ as an $n$-ary syntax tree as opposed to a binary tree. Consequently, the input formula is decomposed into $n$ subformulas $\varphi_1, \ldots, \varphi_n$. Then each of these subformulas $\varphi_i$ is converted into its minimal DFA $D_i$ in explicit-state representation. This can be performed by an existing tool [De Giacomo and Vardi2013, Duret-Lutz et al.2016, Henriksen et al.1995, Kupferman and Vardi1999]. More advanced decomposition schemes could be adopted from [Camacho et al.2018].

The rationale behind this step is that existing explicit-state tools are efficient in generating minimal DFA for small formulas. Since the subformulas are typically small in length, we are able to benefit from existing literature in this step.

4.2 Composition phase

The composition phase receives the minimal DFAs $D_1, \ldots, D_n$ for subformulas $\varphi_1, \ldots, \varphi_n$ from the previous phase, which are represented with explicit states. Our goal in this phase is to construct a DFA corresponding to $\varphi$. In theory, this can be obtained by simply taking the intersection of DFAs $D_1, \ldots, D_n$. In practice, the intersection of DFAs may lead to state-space explosion since DFA intersection is done by performing their product construction. Therefore, the main focus of the composition phase is how to efficiently construct the intersection without incurring state explosion. We discuss the salient features of our algorithm before describing it in detail.

Briefly speaking, we perform the composition of DFAs in iterations. In each iteration, two DFAs are selected based on a dynamic smallest-first heuristic, which will be described below, and removed from the set. A new DFA is formed by the product of the two selected DFAs. The new DFA will be minimized based on a selective DFA minimization heuristic, which is also described below. The new DFA is then inserted back into the set. The new set is the input to the next iteration. This continues until only one DFA remains, which is presented as the final DFA. In the following, we denote by $S_j$ the set of DFAs at the $j$-th iteration. Then $S_1 = \{D_1, \ldots, D_n\}$, and $S_n = \{D\}$ where $D$ is the final output DFA.

In contrast to prior works which either use explicit states or symbolic states, the central feature of our algorithm is that it uses hybrid representation for DFAs, i.e., in different iterations all DFAs in $S_j$ are either represented in explicit- or symbolic-state form. Initially, all DFAs in $S_1$ are in explicit-state form. This continues while the DFAs in $S_j$ have a small number of states, since the product and minimization of DFAs are efficient for small DFAs with explicit-state representation. But as some DFAs in $S_j$ grow in size they require more memory and longer time to perform minimization. So, as soon as some DFA in $S_j$ reaches a large number of states, all DFAs in $S_j$ are converted into symbolic-state representation, in which the DFAs are represented more succinctly. By this time, hopefully, we are left with few DFAs in the set $S_j$. From here onwards, all DFAs are represented in symbolic form until the end of the algorithm. Therefore, fewer DFAs in $S_j$ implies fewer symbolic products need to be performed, and hence limits the blow-up in state space of the final DFA. This way, our algorithm balances the strengths of both approaches, mitigates their individual drawbacks, and efficiently generates a small DFA, if not the minimal one.

We now describe the two heuristics, namely dynamic smallest-first composition of DFAs and selective DFA minimization abbreviated to DSF and SDM, respectively.

We first discuss DSF, which is used to decide which two DFAs should be composed in each iteration. We observe that the order in which intersection of DFAs is performed does not affect the correctness of the final DFA since both Boolean conjunction and DFA intersection are associative and commutative operations. In theory, we can design any criteria to select two DFAs to be composed at each iteration. In practice, a careless choice of the two DFAs may produce an unnecessarily large intermediate DFA that causes the algorithm to fail at the composition phase due to the large memory footprint. Therefore, we aim to find an order that can optimize time and space in the composition phase. To help with that we use DSF, which as the name suggests chooses the smallest two DFAs in each iteration. The DFAs with explicit states are chosen based on the number of states, while the DFAs with symbolic-state representation are chosen based on the number of nodes in the BDD representation of the transition function. The intuition behind this heuristic is that if the algorithm would fail on the composition of the smallest two DFAs in that iteration, then it would probably fail on the composition of all other pairs of DFAs as well.

Next we discuss SDM, which decides when it is beneficial to perform DFA minimization after the intersection of DFAs in each iteration. DFA minimization has been shown to be critical to the performance of DFA generation in [Henriksen et al.1995] as it helps in maintaining a smaller number of states, which is also one of our critical parameters. However, it is also an expensive operation. Currently, the best known complexities for minimization are $O(n \log n)$ and $O(n^2)$ for explicit- and symbolic-state representations, respectively [Hopcroft1971, Wimmer et al.2006]. Therefore, there is a tension between reducing the number of states and achieving efficiency. To resolve this, we conducted an empirical study to evaluate the effect of minimization. We observed that in most cases, minimization reduces the number of states by 2-3 times. While this is significant when the states are represented explicitly, in symbolic-state representation this leads to a reduction of only 1-2 state variables. Therefore, we adhere to the SDM heuristic in which we minimize intermediate DFAs in explicit-state representation only. There are two advantages to this. First, since minimization is performed on explicit-state representation only, by virtue of our algorithm design this occurs only when the DFAs are small. For these, the time spent in minimization is so low that it is worth maintaining minimal DFAs. Second, by maintaining minimal DFAs in the explicit form, the algorithm delays the switch-over to symbolic form as the DFA sizes take longer to reach the thresholds. This leads to fewer symbolic products, which curbs the amount of blow-up in state space.
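For readers who want to see what explicit-state minimization involves, here is a minimal sketch. Note that it deliberately swaps in Moore-style partition refinement, which is simpler but asymptotically slower than the Hopcroft algorithm cited above; the function name `minimize` and its argument layout are assumptions for illustration. It first discards unreachable states, then repeatedly splits blocks of states until states in the same block agree on acceptance and on the block of every successor.

```python
from typing import Dict, Hashable, Iterable, Set, Tuple


def minimize(
    alphabet: Iterable[Hashable],
    initial: Hashable,
    accepting: Set[Hashable],
    delta: Dict[Tuple[Hashable, Hashable], Hashable],
):
    """Return (initial, accepting, delta) of an equivalent DFA with the fewest states."""
    symbols = list(alphabet)

    # Keep only states reachable from the initial state.
    reachable, stack = {initial}, [initial]
    while stack:
        s = stack.pop()
        for a in symbols:
            t = delta[(s, a)]
            if t not in reachable:
                reachable.add(t)
                stack.append(t)

    # Start from the accepting / non-accepting split and refine until stable.
    block = {s: int(s in accepting) for s in reachable}
    while True:
        ids, refined = {}, {}
        for s in reachable:
            signature = (block[s],) + tuple(block[delta[(s, a)]] for a in symbols)
            refined[s] = ids.setdefault(signature, len(ids))
        # Refinement only ever splits blocks, so an unchanged count means a fixed point.
        stable = len(ids) == len(set(block.values()))
        block = refined
        if stable:
            break

    new_delta = {
        (block[s], a): block[delta[(s, a)]] for s in reachable for a in symbols
    }
    new_accepting = {block[s] for s in reachable if s in accepting}
    return block[initial], new_accepting, new_delta
```

Each refinement pass touches every state and symbol once, and there can be as many passes as there are states, which is why repeating this on large intermediate DFAs becomes expensive even with a faster algorithm.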

A semi-formal description of the steps of the algorithm is given below. The complete formal description has been deferred to the Appendix.

Step 0. (Initial)

We are given input formula $\varphi = \bigwedge_{i=1}^{n} \varphi_i$, and switch-over threshold values $t_1, t_2 > 0$. The parameters $t_1$ and $t_2$ correspond to the thresholds for the numbers of states in an individual DFA and in the product of two DFAs, respectively, to trigger the symbolic representation.

Step 1. (Decomposition)

Construct the minimal DFA $D_i$ in explicit-state representation for all $\varphi_i$. Create the set $S_1 = \{D_1, \ldots, D_n\}$.

Step 2. (Explicit-state composition)

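For $j \in \{1, \ldots, n\}$, let $S_j = \{M_1, \ldots, M_{n-j+1}\}$ be the set of DFAs in the $j$-th iteration.

If $S_j$ has only one DFA, return that as the solution.

Otherwise, if the DFAs in $S_j$ become too large, proceed to Step 3. Assume w.l.o.g. that $M_1$ and $M_2$ are the two DFAs chosen by the DSF heuristic, with $M_1$ the smaller of the two. Let $|M|$ denote the number of states in a DFA $M$ represented in explicit-state form. If $|M_1| > t_1$ or $|M_1| \cdot |M_2| > t_2$, move to Step 3. Let $k$ be the iteration in which this occurs, i.e., when $j = k$.

Otherwise, as per SDM, construct DFA $P$ by minimization of $M_1 \land M_2$. Then, create $S_{j+1} = (S_j \setminus \{M_1, M_2\}) \cup \{P\}$ for the next iteration, and repeat Step 2.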

Step 3. (Change state representation)

Convert all DFAs in $S_k$ from explicit-state to symbolic-state representation, and proceed to Step 4. Note that the state space of each DFA $M_i$ is encoded symbolically using a different set of state variables $Z_i$, where all $Z_i$ are disjoint. Since no more minimization occurs after this point, the total set of state variables $Z = \bigcup_i Z_i$ defines the state space of the final DFA.

Step 4. (Symbolic-state composition)

For $j \in \{k, \ldots, n\}$, let $S_j = \{M_1, \ldots, M_{n-j+1}\}$ be the set of DFAs in the $j$-th iteration.

If $S_j$ has only one DFA, return that DFA as the solution.

Otherwise, assume w.l.o.g. that $M_1$ and $M_2$ are the two DFAs chosen by the DSF heuristic. Construct $P = M_1 \land M_2$. Recall that, since $M_1$ and $M_2$ are in symbolic form, we do not perform DFA minimization of $P$. Create $S_{j+1} = (S_j \setminus \{M_1, M_2\}) \cup \{P\}$ for the next iteration, and repeat Step 4.
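Steps 2 to 4 can be sketched as a short driver loop. This is an illustrative skeleton under stated assumptions, not the authors' implementation: the function name `hybrid_compose` and all callable parameters (`size`, `product`, `minimize`, `to_symbolic`, `symbolic_size`, `symbolic_product`) are hypothetical hooks that a real tool would back with an automata library and a BDD library. The input list is assumed to be non-empty. The smallest-first choice is realized with a binary heap, and the switch-over test is the one stated in Step 2.

```python
import heapq
from itertools import count


def hybrid_compose(dfas, t1, t2, *, size, product, minimize,
                   to_symbolic, symbolic_size, symbolic_product):
    """Compose DFAs smallest-first; return (final DFA, number of symbolic products)."""
    tie = count()  # tie-breaker so the heap never compares DFA objects directly
    heap = [(size(d), next(tie), d) for d in dfas]
    heapq.heapify(heap)

    # Step 2: explicit-state composition, minimizing every product (SDM).
    while len(heap) > 1:
        n1, _, m1 = heapq.heappop(heap)  # smallest DFA
        n2, _, m2 = heapq.heappop(heap)  # second smallest DFA
        if n1 > t1 or n1 * n2 > t2:
            # Thresholds exceeded: put the pair back and switch representation.
            heapq.heappush(heap, (n1, next(tie), m1))
            heapq.heappush(heap, (n2, next(tie), m2))
            break
        p = minimize(product(m1, m2))
        heapq.heappush(heap, (size(p), next(tie), p))
    if len(heap) == 1:
        return heap[0][2], 0  # finished entirely in explicit-state form

    # Step 3: convert every remaining DFA to symbolic-state form.
    heap = [(symbolic_size(s), next(tie), s)
            for s in (to_symbolic(d) for _, _, d in heap)]
    heapq.heapify(heap)

    # Step 4: symbolic composition, with no minimization.
    symbolic_products = 0
    while len(heap) > 1:
        _, _, m1 = heapq.heappop(heap)
        _, _, m2 = heapq.heappop(heap)
        p = symbolic_product(m1, m2)
        symbolic_products += 1
        heapq.heappush(heap, (symbolic_size(p), next(tie), p))
    return heap[0][2], symbolic_products
```

Returning the number of symbolic products alongside the result makes the effect of the thresholds visible: larger thresholds keep more of the composition in the minimized explicit phase and leave fewer unminimized symbolic products.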

5 LTLf synthesis

LTLf synthesis can be reduced to solving a DFA game played on the DFA corresponding to the LTLf formula [De Giacomo and Vardi2015]. As explained in (§ 2.3), this amounts to computing the set of winning states. If the initial state of the DFA is in this set, then the formula is realizable and a winning strategy can be constructed, otherwise not.

In this section, we describe the winning set computation algorithm on a DFA game when its states are represented symbolically. This is a standard least-fixed point algorithm for reachability games with symbolic state space, and is similar to [Zhu et al.2017b, Tabajara and Vardi2019]. For the sake of completeness, we summarize the algorithm here.

Let $\varphi$ be an LTLf formula over disjoint input and output propositions $I$ and $O$, respectively, and $G = (S(Z), T(Z, I \cup O, Z'), F(Z))$ be a symbolic DFA for $\varphi$. The DFA game is played on $G$. In our case, this DFA is obtained from our hybrid compositional approach (§ 4), which we assume is in symbolic form, since explicit-state outputs can easily be converted to symbolic form.

To compute the winning set of $G$, we compute the least-fixed point of a Boolean formula $W_i(Z)$ that denotes the set of states from which the system can win in at most $i$ steps of the DFA game. Initially, $W_0(Z) = F(Z)$ is the set of accepting states. At each iteration, the algorithm constructs $W_{i+1}(Z)$ from $W_i(Z)$ by adding those states from which the system is guaranteed to reach $W_i$ in one step. Formally,

$$W_{i+1}(Z) = W_i(Z) \lor \forall I.\, \exists O.\, \exists Z'.\, \big( T(Z, I \cup O, Z') \land W_i(Z') \big)$$

where $T$ is the transition relation of the symbolic DFA and $W_i(Z')$ can be obtained from $W_i(Z)$ by substituting variables $Z$ with $Z'$. This continues until no more states can be added to $W_i$, i.e., until it encounters the first index $i$ such that $W_{i+1}(Z) \equiv W_i(Z)$. Since the number of states in the DFA is finite, the algorithm is guaranteed to terminate. The initial state is present in the winning set, say $W(Z)$, if $S(Z) \rightarrow W(Z)$ holds, where $S(Z)$ is the formula satisfied only by the initial state. Details on winning-strategy construction have been deferred to [Tabajara and Vardi2019].

In this work, all Boolean formulas for $G$ and all $W_i$ are represented as BDDs. All Boolean operations, quantification and variable substitution are available in standard BDD libraries. Finally, checking $W_{i+1} \equiv W_i$ is a constant-time operation in BDDs.
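The fixed-point computation can be illustrated on an explicit state set, where Python sets stand in for the BDDs used in the symbolic algorithm. The sketch below is an explicit-state analogue under that assumption, not the symbolic procedure itself; the names `winning_states` and `realizable` and the dictionary layout of `delta` are hypothetical. A state is added when, for every input, some output leads to a state already known to be winning, which matches the environment-moves-first convention of the DFA game.

```python
from typing import Dict, Hashable, Iterable, Set, Tuple


def winning_states(
    states: Iterable[Hashable],
    inputs: Iterable[Hashable],
    outputs: Iterable[Hashable],
    delta: Dict[Tuple[Hashable, Hashable, Hashable], Hashable],
    accepting: Iterable[Hashable],
) -> Set[Hashable]:
    """Least fixed point of the one-step controllable-predecessor operator."""
    states, inputs, outputs = list(states), list(inputs), list(outputs)
    win = set(accepting)  # states winning in zero steps
    while True:
        # Add states from which, for every input, some output reaches `win`.
        grown = win | {
            s for s in states
            if all(any(delta[(s, i, o)] in win for o in outputs) for i in inputs)
        }
        if grown == win:  # no new state was added: fixed point reached
            return win
        win = grown


def realizable(initial, states, inputs, outputs, delta, accepting) -> bool:
    """The specification is realizable iff the initial state is winning."""
    return initial in winning_states(states, inputs, outputs, delta, accepting)
```

In a BDD-based implementation, the set union corresponds to disjunction, the nested `all`/`any` to universal and existential quantification over the input and output variables, and the termination test to an equivalence check between two BDDs.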

The complexity of solving a DFA game is polynomial in the size of the state space. Therefore, the efficiency of LTLf synthesis is heavily affected by the size of the constructed DFA. Since our hybrid compositional approach generates small (if not minimal) DFAs, these are suitable for synthesis, as witnessed also by our experimental evaluation.

6 Experimental evaluation

The goal of the empirical analysis is to examine the performance of our hybrid approach in LTLf-to-DFA generation and LTLf synthesis against existing tools and approaches.

6.1 Implementation details

Our hybrid compositional LTLf-to-DFA conversion procedure (§ 4) has been implemented in a tool called Lisa. Lisa has been extended to LisaSynt to perform LTLf synthesis using the winning strategy computation described in (§ 5).

Lisa takes an LTLf formula and switch-over thresholds $t_1$, $t_2$ as inputs, and outputs a corresponding DFA with symbolic states. The output may not be minimal. For the same inputs, LisaSynt internally invokes Lisa, solves the DFA game given by Lisa's output, and returns whether the formula is realizable. If so, it can also return a winning strategy.

Lisa and LisaSynt have been written in C++. They employ BuDDy [Cohen et al.] as their BDD library for the symbolic representations and operations on DFAs, and take advantage of dynamic variable ordering for the BDDs.

To generate explicit-state minimal DFAs in the decomposition phase, Lisa uses Spot [Duret-Lutz et al.2016] and the MONA-based method [Henriksen et al.1995]. It borrows the rich APIs from Spot to conduct DFA intersection and minimization in the explicit-state composition phase. Per se, Spot APIs are available for $\omega$-automata (automata over infinite words). In order to use the Spot API for operations over DFAs, Lisa stores intermediate explicit DFAs as weak deterministic Büchi automata (wDBA) [Dax, Eisinger, and Klaedtke2007]. Intuitively, if the DFA accepts the language $L$, then its wDBA accepts the language $L \cdot \{e\}^{\omega}$, where $e$ is a fresh variable not present in $Prop$. The wDBA can be constructed from the DFA for $L$ by making the following changes: (a) add a new state $s_e$, (b) for each accepting state in the DFA, add a transition from that state to $s_e$ on $e$, (c) add a transition from $s_e$ to itself on $e$, (d) make $s_e$ the only accepting state in the wDBA. This automaton accepts a word iff its run visits $s_e$ infinitely often. Since a wDBA is an $\omega$-automaton, we use Spot APIs for wDBAs to conduct intersection and minimization, both of which return a wDBA as output, with complexity similar to that of the same operations on a DFA [Dax, Eisinger, and Klaedtke2007, Kupferman2018]. Lastly, a wDBA for language $L \cdot \{e\}^{\omega}$ can be easily converted back to a DFA for language $L$.

6.2 Design and setup for empirical evaluation

Footnote 1: Our code base and benchmarks have been provided with the artifact. We are unable to provide raw data as they amount to 1.7GB. We can provide them upon request via an anonymous link during author response period.
Footnote 2: Figures are best viewed online in color.

The evaluation has been designed to compare the performance of Lisa and its synthesis extension to their respective existing tools and approaches. LTLf-to-DFA conversion tools are compared on runtime, number of benchmarks solved, hardness of benchmarks solved (size of the minimal DFA) and the number of state variables in the output DFA. LTLf synthesis tools are compared on runtime and the number of benchmarks solved. We conduct our experiments on a benchmark suite curated from prior works, spanning classes of realistic and synthetic benchmarks. In total, we have 454 benchmarks split into four classes: random conjunctions (400 cases) [Zhu et al.2017b], single counters (20 cases), double counters (10 cases) and Nim games (24 cases) [Tabajara and Vardi2019]. More details on each class can be found in the Appendix.

# States in the    Number of benchmarks solved
minimal DFA        Mona-based   Lisa-Explicit   Lisa
1K                 111          123             137
5K                 70           82              96
10K                48           60              74
50K                13           23              35
100K               8            16              26
250K               1            5               12
500K               0            2               4
750K               0            2               2
Size unknown       –            –               21**
Total solved       307          338             372

Table 1: DFA construction. Hardness of benchmarks is measured by the size of the minimal DFA. **Note: There are 34 benchmarks that were solved only by Lisa. Of these, we were able to identify the size of the minimal DFA of 13 benchmarks using a symbolic DFA minimization algorithm [Wimmer et al.2006]. The 21 cases with unknown size are those that could not be minimized even after 24 hours with 190 GB of memory.
Figure 1: DFA construction. Cactus plot indicating number of benchmarks each tool can solve for a given timeout.

A good balance between explicit- and symbolic-state representation is crucial to the performance of Lisa, i.e., the values of the two switch-over thresholds must be chosen carefully. Recall that the switch is triggered if either the smallest minimal DFA has more states than the first threshold, or the product of the numbers of states in the two smallest minimal DFAs exceeds the second threshold. Intuitively, we want the first threshold to be large enough that the switch is not triggered too soon, but small enough that converting all DFAs from explicit- to symbolic-state representation is not too expensive. The second threshold is closely related to how effective minimization is, and hence depends on the benchmark class. If the benchmark class is such that minimization reduces the DFA size by only 2-3 times, then we set the second threshold to a low value. But if the class is such that minimization reduces the DFA size by orders of magnitude, as it does for the Nim game class, we set it to a higher value to take advantage of minimization. Currently, both thresholds are determined empirically. We set the first and second thresholds to 800 and 300000, respectively, for the Nim-game class and to 800 and 2500, respectively, for all other classes.
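The switch-over test described in this paragraph can be written down as a small predicate. The following is a minimal sketch, not the tool's implementation: the function name `should_switch` and the parameter names `t1` and `t2` are chosen for this example only, and the input is assumed to be the list of state counts of the explicit minimal DFAs currently awaiting composition.

```python
def should_switch(explicit_sizes, t1, t2):
    """Return True when explicit-state composition should stop.

    explicit_sizes: state counts of the current explicit minimal DFAs.
    t1: bound on the size of the smallest DFA (assumed name).
    t2: bound on the product of the two smallest sizes (assumed name).
    """
    sizes = sorted(explicit_sizes)
    if not sizes:
        return False
    if sizes[0] > t1:                       # smallest DFA is already too large
        return True
    # The next explicit product would be too large.
    return len(sizes) >= 2 and sizes[0] * sizes[1] > t2


# Threshold pairs reported above: (800, 2500) and, for Nim, (800, 300000).
print(should_switch([10, 20, 5000], 800, 2500))    # product 200 stays explicit
print(should_switch([60, 50, 900], 800, 2500))     # product 3000 triggers switch
print(should_switch([700, 400], 800, 300000))      # product 280000 stays explicit
```

With the non-Nim setting, two DFAs of 50 and 60 states already trigger the switch, whereas the much larger second threshold used for the Nim class lets explicit products, and therefore minimization, continue far longer.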

For experiments on -to-DFA conversion, we compare to the current-state-of-the-art -based method  [Zhu et al.2017b, Camacho et al.2018] and two other derivations of . Recall the -based method is a syntax-driven, explicit-state based approach that returns minimal DFAs. The first derivation is - which is adapted from by setting . Therefore, it is a purely explicit-state compositional approach. Like the -based method, it also generates the minimal DFA, but unlike the former it uses the smallest-first heuristic. The second derivation is -, adapted from by setting . This corresponds to the compositional, symbolic-state approach referred to in (§ 1).

For experiments on synthesis, we compared to an enhanced version of (a tool that uses the Mona-based method for DFA conversion) [Zhu et al.2017b] that we call , and the partitioned approach from [Tabajara and Vardi2019], referred to as . was created by enabling dynamic variable ordering in . This was necessary for a fair comparison as , unlike and , uses static variable ordering. We observed that shows up to 75% reduction in runtime compared to . Note that uses the same symbolic-state approach as - for constructing the DFAs, except that it skips the composition step, instead performing synthesis directly over the initial set of symbolic DFAs . Ultimately, it still suffers from the state-space explosion, only in this case it happens during the winning-state computation.

All experiments were conducted on a single node of a high-performance cluster. Each node consists of four quad-core Intel Xeon processors running at 2.6 GHz. LTLf-to-DFA conversion experiments were run for 1 hour with 8 GB each, and LTLf-synthesis experiments for 8 hours with 32 GB each.

Figure 2: DFA construction. Runtime for double-counter benchmarks. Plots touching the black line indicate time/memout.
Figure 3: DFA construction. Number of variables needed to symbolically represent the DFA’s state-space for double-counter benchmarks. No bar indicates time/memout.

6.3 Observations

and - scale better to larger benchmarks

than the Mona-based method, not only solving more benchmarks in total but also handling instances of larger scale (Table 1). Between - and , the former is more consistent in solving benchmarks with large minimal DFAs due to the dynamic smallest-first (DSF) heuristic, which enables low memory consumption in intermediate stages. Finally, Lisa solves benchmarks with even larger minimal DFAs, as it is designed to combine the minimal DFAs of explicit-state representation with the succinctness of symbolic-state representation to solve larger formulas.

is the most efficient tool among all four options.

This is clear from the cactus plot in Fig. 1. The plot may seem to indicate that only has a slight advantage over -. But, on closer inspection we observe that - solves most random benchmarks but fares poorly on the realistic ones (see Fig 2). This is because they have more sub-specifications, resulting in a large number of symbolic products. The -based method is still the fastest in generating small DFAs (fewer than 50K states) but memouts soon due to explicit-state representation of DFAs. Finally, - is a close second but does not scale as well as due to minimization on very large DFAs. has been designed to overcome these deficiencies, and is supported by the current empirical evaluation as well.

mitigates state-space explosion.

Even though Lisa may not generate the minimal DFAs, we observe that in most cases the state space of the final DFA produced by Lisa has only one or two variables more than that of the minimal DFA. This is significantly lower than the number of state variables used by - (Fig. 3). Note that - fails to solve the double counter benchmarks for (Fig. 2). Yet we know the number of state variables immediately after Step 3 (§ 4). Analyzing the benchmarks, we observed that they were split into 3-200 sub-formulas, yet only 1-3 symbolic products were conducted to construct the DFA. This demonstrates that our threshold values are able to delay the switch-over to symbolic representations and reduce the blow-up caused by the product. This is why the DFAs generated by Lisa have sizes comparable to those of the minimal DFAs. An important direction for future work, therefore, is to design mechanisms that determine the switch-over thresholds at runtime, as opposed to relying on user expertise to assign threshold values.

’s small DFAs improve synthesis performance.

We evaluate for synthesis on non-random benchmarks only, i.e., sequential counters and Nim games. We chose to disregard random benchmarks because their winning-set computation time is negligible: in those benchmarks the fixed point is reached in 2-3 iterations irrespective of the DFA size. Figures 4 and 5 show that solves most benchmarks and is the most efficient tool. We observed that + fails because memouts early, while suffers from state-space explosion. is resilient to both as consumes low memory by virtue of symbolic representation and small state space.

The time consumed inside the winning set computation during synthesis depends on the number of iterations before the fixed-point is reached. Yet, so far not much focus has been given to optimizing this step as the DFAs generated so far have not been large enough for the number of iterations to become an issue. With ’s ability to construct large DFAs, we were able to observe that the single and double counter benchmarks can spend more than 90% of the time in the winning set computation, as the number of iterations is exponential in the number of bits (Appendix). This provides concrete evidence of the importance of investigating the development of faster algorithms for winning set computation to improve game-based synthesis.
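To make the role of the iteration count concrete, the sketch below computes a winning set by a least fixed-point iteration over an explicit DFA game and reports how many iterations were needed. It is only an illustration under assumptions that are not taken from this section: states and transitions are explicit rather than BDD-based, the system is assumed to pick its output before the environment picks its input, and all names (`winning_set`, `delta`, `outputs`, `inputs`) are hypothetical.

```python
def winning_set(states, outputs, inputs, delta, accepting):
    """Least fixed point of the controllable-predecessor operator.

    delta maps (state, output, input) -> state. A state joins the winning
    set if some system output leads into the set for every environment
    input (assumed move order: system first, then environment).
    Returns the winning set and the number of productive iterations.
    """
    win = set(accepting)
    iterations = 0
    while True:
        new = {
            s for s in states
            if s not in win
            and any(all(delta[(s, y, x)] in win for x in inputs)
                    for y in outputs)
        }
        if not new:                  # fixed point reached
            return win, iterations
        win |= new
        iterations += 1


# A chain 0 -> 1 -> 2 -> 3 with accepting state 3: one state is added per
# iteration, so the number of iterations grows with the chain length.
chain_states = {0, 1, 2, 3}
chain_delta = {(i, "go", "e"): min(i + 1, 3) for i in chain_states}
win, iters = winning_set(chain_states, {"go"}, {"e"}, chain_delta, {3})
```

In the chain example each iteration adds exactly one state, which mirrors, on a tiny scale, why benchmarks whose shortest winning plays are long need many iterations regardless of how compactly the DFA is represented.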

Figure 4: Synthesis. Number of benchmarks synthesized from each non-random benchmark class.
Figure 5: Synthesis. Cactus plot (non-random benchmarks).

7 Concluding remarks

This work tackles the primary bottleneck in LTLf synthesis: LTLf-to-DFA conversion. The central problem addressed in this work is the efficient and scalable construction of DFAs with small state space from LTLf specifications, as a step to LTLf synthesis. To the best of our knowledge, ours is the first hybrid approach for DFA construction. Our approach combines explicit- and symbolic-state representations in a manner that effectively leverages their strengths and alleviates their individual shortcomings. Our empirical evaluations of DFA conversion and synthesis show that our tools outperform the current state of the art, and demonstrate the merit of our hybrid approach. This indicates promise to further develop and explore hybrid approaches for automaton generation for other specification languages as well, and encourages similar investigations into the other building blocks in synthesis algorithms.

Acknowledgments

We thank A. Camacho, A. M. Wells and S. Zhu for their valuable inputs at different stages of the project. This work is partially supported by NSF grants IIS-1527668, CCF-1704883, IIS-1830549, the National Natural Science Foundation of China (Grant Nos. 61761136011, 61532019), the Guangdong Science and Technology Department (Grant No. 2018B010107004), and the Brazilian agency CNPq through the Ciência Sem Fronteiras program.

References

  • [Baier and McIlraith2006] Baier, J. A., and McIlraith, S. 2006. Planning with temporally extended goals using heuristic search. 342–345.
  • [Bohy et al.2012] Bohy, A.; Bruyère, V.; Filiot, E.; Jin, N.; and Raskin, J. 2012. Acacia+, a tool for LTL synthesis. In CAV, 652–657. Springer.
  • [Bouton1901] Bouton, C. L. 1901. Nim, a game with a complete mathematical theory. Annals of Mathematics 3(1/4):35–39.
  • [Bryant1986] Bryant, R. E. 1986. Graph-based algorithms for boolean function manipulation. Computers, IEEE Transactions on 100(8):677–691.
  • [Camacho et al.2018] Camacho, A.; Baier, J. A.; Muise, C.; and McIlraith, S. A. 2018. Finite LTL synthesis as planning. In ICAPS, 29–38.
  • [Church1957] Church, A. 1957. Applications of recursive arithmetic to the problem of circuit synthesis. Institute for Symbolic Logic, Cornell University.
  • [Cohen et al.] Cohen, H.; Whaley, J.; Wildt, J.; and Gorogiannis, N. BuDDy. http://sourceforge.net/p/buddy/.
  • [Dax, Eisinger, and Klaedtke2007] Dax, C.; Eisinger, J.; and Klaedtke, F. 2007. Mechanizing the powerset construction for restricted classes of ω-automata. In ATVA, 223–236. Springer.
  • [De Giacomo and Vardi2013] De Giacomo, G., and Vardi, M. Y. 2013. Linear temporal logic and linear dynamic logic on finite traces. In IJCAI, 854–860.
  • [De Giacomo and Vardi2015] De Giacomo, G., and Vardi, M. 2015. Synthesis for LTL and LDL on finite traces. In IJCAI, 1558–1564.
  • [Duret-Lutz et al.2016] Duret-Lutz, A.; Lewkowicz, A.; Fauchille, A.; Michaud, T.; Renault, E.; and Xu, L. 2016. Spot 2.0 – a framework for LTL and ω-automata manipulation. In ATVA, 122–129. Springer.
  • [Filiot, Jin, and Raskin2010] Filiot, E.; Jin, N.; and Raskin, J.-F. 2010. Compositional algorithms for LTL synthesis. In ATVA, 112–127. Springer.
  • [Filiot, Jin, and Raskin2011] Filiot, E.; Jin, N.; and Raskin, J.-F. 2011. Antichains and compositional algorithms for LTL synthesis. Formal Methods in System Design 39(3):261–296.
  • [Gerth et al.1995] Gerth, R.; Peled, D.; Vardi, M. Y.; and Wolper, P. 1995. Simple on-the-fly automatic verification of linear temporal logic. In PSTV, 3–18. Springer.
  • [He et al.2017] He, K.; Lahijanian, M.; Kavraki, L. E.; and Vardi, M. Y. 2017. Reactive synthesis for finite tasks under resource constraints. In Intelligent Robots and Systems (IROS), 2017 IEEE/RSJ International Conference on, 5326–5332. IEEE.
  • [He et al.2019] He, K.; Wells, A. M.; Kavraki, L. E.; and Vardi, M. Y. 2019. Efficient symbolic reactive synthesis for finite-horizon tasks. In ICRA, 8993–8999. IEEE.
  • [Henriksen et al.1995] Henriksen, J. G.; Jensen, J.; Jørgensen, M.; Klarlund, N.; Paige, R.; Rauhe, T.; and Sandholm, A. 1995. Mona: Monadic second-order logic in practice. In TACAS, 89–110. Springer.
  • [Hopcroft1971] Hopcroft, J. 1971. An algorithm for minimizing states in a finite automaton. In Theory of machines and computations. Elsevier. 189–196.
  • [Jobstmann and Bloem2006] Jobstmann, B., and Bloem, R. 2006. Optimizations for LTL synthesis. In 2006 Formal Methods in Computer Aided Design, 117–124. IEEE.
  • [Kupferman and Vardi1999] Kupferman, O., and Vardi, M. Y. 1999. Model checking of safety properties. In Proc. of CAV, 172–183. Springer.
  • [Kupferman2018] Kupferman, O. 2018. Automata theory and model checking. In Handbook of Model Checking. 107–151.
  • [Lahijanian et al.2015] Lahijanian, M.; Almagor, S.; Fried, D.; Kavraki, L. E.; and Vardi, M. Y. 2015. This time the robot settles for a cost: A quantitative approach to temporal logic planning with partial satisfaction. In AAAI, 3664–3671.
  • [Mazala2002] Mazala, R. 2002. Infinite games. In Automata logics, and infinite games. Springer. 23–38.
  • [Meyer, Sickert, and Luttenberger2018] Meyer, P. J.; Sickert, S.; and Luttenberger, M. 2018. Strix: Explicit reactive synthesis strikes back! In CAV, 578–586. Springer.
  • [Pesic, Bosnacki, and van der Aalst2010] Pesic, M.; Bosnacki, D.; and van der Aalst, W. M. P. 2010. Enacting declarative languages using LTL: avoiding errors and improving performance. In International SPIN Workshop on Model Checking of Software, 146–161. Springer.
  • [Pnueli1977] Pnueli, A. 1977. The temporal logic of programs. In Proc. of FOCS, 46–57. IEEE.
  • [Tabajara and Vardi2019] Tabajara, L. M., and Vardi, M. Y. 2019. Partitioning techniques in LTLf synthesis. In IJCAI, 5599–5606.
  • [Tabakov, Rozier, and Vardi2012] Tabakov, D.; Rozier, K. Y.; and Vardi, M. Y. 2012. Optimized temporal monitors for SystemC. Formal Methods in System Design 41(3):236–268.
  • [Thomas, Wilke, and others2002] Thomas, W.; Wilke, T.; et al. 2002. Automata, logics, and infinite games: A guide to current research, volume 2500. Springer Science & Business Media.
  • [Wimmer et al.2006] Wimmer, R.; Herbstritt, M.; Hermanns, H.; Strampp, K.; and Becker, B. 2006. Sigref–a symbolic bisimulation tool box. In Proc. of ATVA, 477–492. Springer.
  • [Zhu et al.2017a] Zhu, S.; Tabajara, L. M.; Li, J.; Pu, G.; and Vardi, M. Y. 2017a. A symbolic approach to safety LTL synthesis. In Haifa Verification Conference, 147–162. Springer.
  • [Zhu et al.2017b] Zhu, S.; Tabajara, L. M.; Li, J.; Pu, G.; and Vardi, M. Y. 2017b. Symbolic LTLf synthesis. In IJCAI, 1362–1369. AAAI Press.

Appendix

: Hybrid compositional DFA generation

The DFA-construction algorithm used in Lisa is described in § 4. We give its pseudo-code here in Algorithm 1.

To recall, Lisa is split into two phases. First is the decomposition phase, which splits the formula into smaller subformulas and converts each subformula into its minimal DFA in explicit-state representation (Lines 2-6). Second is the composition phase (Line 7 onwards), which begins by performing explicit-state composition (Lines 8-20). When explicit-state composition becomes prohibitive (condition in Line 11), all DFAs are converted into symbolic-state representation (Lines 22-26). Finally, symbolic composition is conducted (Lines 28-32). The crucial heuristics adopted in the algorithm are dynamic smallest-first and selective DFA minimization.

We implement the smallest-first heuristic using a priority queue, since this data structure lets us efficiently obtain the smallest elements in the collection. Two priority queues store DFAs in explicit and symbolic representations, respectively. They are implemented such that they give higher priority to DFAs with fewer states and with fewer BDD nodes in the transition relation, respectively.
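The smallest-first loop over a priority queue can be sketched as follows. This is a minimal illustration rather than the tool's code: `compose_smallest_first`, `size` and `combine` are hypothetical names, `combine` stands in for whichever product (and optional minimization) is applied, and the input collection is assumed to be non-empty.

```python
import heapq
from itertools import count


def compose_smallest_first(items, size, combine):
    """Repeatedly combine the two smallest items until one remains.

    size(x) gives the priority of x (e.g. number of states or BDD nodes);
    combine(a, b) returns their composition. A running counter breaks ties
    so that the items themselves are never compared.
    """
    tie = count()
    heap = [(size(x), next(tie), x) for x in items]
    heapq.heapify(heap)
    while len(heap) > 1:
        _, _, a = heapq.heappop(heap)      # smallest
        _, _, b = heapq.heappop(heap)      # second smallest
        c = combine(a, b)
        heapq.heappush(heap, (size(c), next(tie), c))
    return heap[0][2]


# Toy run: integers stand in for DFA sizes and multiplication for the
# worst-case size of a product automaton.
log = []

def product(a, b):
    log.append((a, b))
    return a * b

result = compose_smallest_first([5, 1, 3], size=lambda n: n, combine=product)
```

Because the composed result is pushed back into the queue with its own size, the order of later compositions adapts to intermediate results, which is the dynamic aspect of the heuristic.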

Theorem 1.

Let be threshold values. Given an formula , returns a DFA for with symbolic state.

1:  // Decomposition phase
2:  
3:  
4:  for  do
5:     
6:     
7:  // Composition phase
8:  // Explicit-state composition
9:  
10:  
11:  while  and  do
12:     )
13:     
14:     
15:     if  then
16:        
17:        
18:        return  
19:     
20:     
21:  // Change state representation
22:  
23:  while  do
24:     
25:     
26:     
27:  // Begin symbolic composition
28:  while  do
29:     
30:     
31:     
32:     
33:  
34:  return  
Algorithm 1
Input: formula ,
DFA size threshold values
Output: A DFA for with symbolic states

Benchmark descriptions

We evaluate on a set of 454 benchmarks split into four classes:

Randomly generated.

We adopt the random formula generation procedure from literature [Zhu et al.2017b, Camacho et al.2018]. For a length parameter , it selects base cases from a pool of benchmarks interpreted with semantics [Jobstmann and Bloem2006], takes their conjunction and renames propositions so that they are shared across conjuncts. It must be noted that a large value of does not guarantee a larger minimal DFA.

In our experiments, ranges from 3-10. For each , we create 50 benchmarks, adding up to 400 random benchmarks.

Single counter/Double counters.

These benchmarks represent games played over binary counters, and are parameterized by the number of bits  [Tabajara and Vardi2019]. These benchmarks can have either a single counter, which the system must increment when signaled by the environment, or two counters, one controlled by each player, where the goal of the system is to reach the value in the environment counter. In these cases, larger results in a larger minimal DFA.

In our experiments ranges from 1-20 and 1-10 for single and double-counter benchmarks, respectively.

Nim game.

These benchmarks model a generalized version of the game of Nim [Bouton1901] with heaps and tokens per heap, taken from [Tabajara and Vardi2019].

We create a total of 24 such benchmarks.

Experimental evaluation

Optimizing winning-set computation is the next challenge in synthesis.

Fig. 6 shows that most of the time spent in synthesis for the double-counter benchmarks was spent in the winning-set computation. In fact, for counters with bits, we observed and can show that the single- and double-counter benchmark will take and iterations to reach the fixed point in the winning-set computation.

Winning-set computation takes almost 100% of the time when since for those cases DFA construction requires less than one tenth of a millisecond to solve. As a consequence, winning-set computation takes very long in comparison.

The plot in Fig. 6 is based on runtimes from . Although similar behavior was observed with +, it solves fewer benchmarks, therefore we chose our tool to plot this graph. It must be noted that without dynamic variable ordering did not scale as far as .

Figure 6: Synthesis. Percentage of time spent in winning set computation for the sequential counters.
Benchmark    State of the art (Mona DFA)       Our tool (Lisa DFA)
name         DFA        WS          Total      DFA       WS           Total

Single counters
counter-1    0.0845936  0.0063098   0.0909034  0.0       4.7624e-05   4.7624e-05
counter-2    0.0905504  0.0077225   0.0982729  0.01      0.000124862  0.010124862
counter-3    0.0831357  0.0196413   0.102777   0.01      0.000266569  0.010266569
counter-4    0.0818164  0.0125855   0.0944019  0.01      0.000994192  0.010994192
counter-5    0.113252   0.067858    0.18111    0.02      0.00242963   0.02242963
counter-6    0.104034   0.149737    0.253771   0.03      0.0172029    0.0472029
counter-7    0.163252   0.463409    0.626661   0.02      0.0793083    0.0993083
counter-8    0.120607   1.477873    1.59848    0.03      0.386659     0.416659
counter-9    0.260158   7.952932    8.21309    0.07      2.22855      2.29855
counter-10   0.383924   40.069576   40.4535    0.09      9.22808      9.31808
counter-11   0.853298   67.246602   68.0999    0.22      50.9045      51.1245
counter-12   1.83989    276.44511   278.285    0.75      270.575      271.325
counter-13   4.77399    474.14001   478.914    1.48      2378.66      2380.14
counter-14   11.19      –           –          3.25      6439.63      6442.88
counter-15   –          –           –          9.93      28250.9      28260.83
counter-16   –          –           –          43.01     –            –
counter-17   –          –           –          174.04    –            –
counter-18   –          –           –          191.31    –            –
counter-19   –          –           –          2062.73   –            –
counter-20   –          –           –          17499.8   –            –

Double counters
counters-1   0.0372115  0.0088049   0.0460164  0.0       7.9756e-05   7.9756e-05
counters-2   0.0684606  0.0199104   0.088371   0.01      0.00051927   0.01051927
counters-3   0.140466   0.130129    0.270595   0.03      0.00467434   0.03467434
counters-4   0.206383   0.852017    1.0584     0.03      0.0713502    0.1013502
counters-5   0.488736   11.261164   11.7499    0.05      0.61636      0.66636
counters-6   2.71628    160.86572   163.582    0.44      7.42408      7.86408
counters-7   13.768     1871.902    1885.67    0.68      80.2925      80.9725
counters-8   –          –           –          2.16      1032.24      1034.4
counters-9   –          –           –          25.11     12372.5      12397.61
counters-10  –          –           –          12885     –            –

Nim benchmarks
nim-1-1      0.165076   0.010547    0.175623   0.08      4.133e-05    0.08004133
nim-1-2      –          –           –          0.19      0.000140955  0.190140955
nim-1-3      –          –           –          0.06      0.000197191  0.060197191
nim-1-4      –          –           –          0.11      0.000382304  0.110382304
nim-1-5      –          –           –          0.22      0.000372384  0.220372384
nim-1-6      –          –           –          0.43      0.000530033  0.430530033
nim-1-7      –          –           –          0.82      0.00065162   0.82065162
nim-1-8      –          –           –          1.38      0.000906586  1.380906586
nim-2-1      –          –           –          0.08      0.000329557  0.080329557
nim-2-2      –          –           –          0.46      0.000856596  0.460856596
nim-2-3      –          –           –          2.67      0.00185017   2.67185017
nim-2-4      –          –           –          13.43     0.00650035   13.43650035
nim-2-5      –          –           –          52.9      0.0162642    52.9162642
nim-2-6      –          –           –          170.92    0.0330554    170.9530554
nim-2-7      –          –           –          497.52    0.0535121    497.5735121
nim-2-8      –          –           –          1257.86   0.106837     1257.966837
nim-3-1      –          –           –          1.09      0.000209125  1.090209125
nim-3-2      –          –           –          27.15     0.00700274   27.15700274
nim-3-3      –          –           –          393.38    0.0396764    393.4196764
nim-3-4      –          –           –          3820.42   10.8698      3831.2898
nim-4-1      –          –           –          29.32     0.00661889   29.32661889
nim-4-2      –          –           –          –         –            –
nim-5-1      –          –           –          –         –            –
nim-5-2      –          –           –          –         –            –

Table 2: Runtimes of the two synthesis tools, together with the time each spends in DFA construction (by Mona and Lisa, respectively) and in winning-set computation (WS). Runtimes are given in seconds. – indicates time/memout. Timeout = 8 hours.

outperforms the current state-of-the-art +.

This is clear from Table 2. The main reason + fails to solve a large number of benchmarks is that fails to generate the DFAs for larger inputs. For example, failed to construct the DFA for single counters and double counters for and , where is the number of bits. On the contrary, is able to generate the DFAs for almost all benchmarks.

There are cases in Table 2 where has generated the DFA but times out. This is because the winning-set computation did not terminate on those cases. Recall that the number of iterations for the counter benchmarks grows exponentially with the number of bits. These cases will be solved by as long as enough time is given to conduct all iterations.

Even for the benchmarks that are solved by both tools + and , shows lower runtime. Note that for both tools the number of iterations taken to compute the winning set is the same. This may indicate that each iteration of the winning-set computation takes longer in + than in . However, we currently do not have concrete evidence to back up this conjecture. We leave this to future work as it may also lead to a better understanding of how to improve the winning-set computation.