1 Introduction
Verification of infinite-state systems has been an important area of research in the past few decades. In the late 1990s and early 2000s, an important stride in the verification of infinite-state systems was made when an elegant, simple, but powerful framework for modelling and verifying infinite-state systems, dubbed regular model checking (e.g. [1, 25, 12, 2, 3, 27, 46]), was developed. Regular model checking, broadly construed, is the idea of reasoning about infinite-state systems using regular languages as symbolic representations. This means that configurations of the infinite systems are encoded as finite words over some finite alphabet $\Sigma$, while other important infinite sets (e.g. of initial and final configurations) are represented as regular languages over $\Sigma$. The transition relation of the system is, then, represented as a finite-state transducer of some sort.
Example 1
As a simple illustration, consider a unidirectional token passing protocol with $n$ processes arranged in a linear array. Here $n$ is a parameter: the correctness property has to hold regardless of its value (so long as it is a positive integer). This is also one reason why such systems are referred to as parameterized systems. Multiple tokens might exist at any given time, but each process holds at most one. At each point in time, a process holding a token can pass it to the process to its right. If a process holding a token receives a token from its left neighbor, then it discards one of the two tokens. Each configuration of the system can be encoded as a word of length $n$ over a two-letter alphabet, where the $i$-th letter denotes whether process $i$ holds or does not hold a token. The set of all configurations is, therefore, a regular language. Various correctness properties can be stated for this system. An example of a safety property is that if the system starts in a configuration with only one token, then it will never visit a configuration with at least two tokens. An example of a liveness property is that the system always terminates in a configuration from a given regular set. ∎
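To make the protocol concrete, the following minimal sketch simulates it for a fixed number of processes. The letters `T` and `N` (token / no token) and all function names are assumptions made for this illustration. The sketch checks the safety property by exhaustive exploration for one value of $n$ at a time, which is exactly the limitation that a parameterized analysis is meant to overcome.

```python
from collections import deque

# Hypothetical letter names: "T" = process holds a token, "N" = it holds none.
TOKEN, EMPTY = "T", "N"

def successors(config):
    """All configurations reachable in one step: some process passes its token right."""
    result = set()
    for i in range(len(config) - 1):
        if config[i] == TOKEN:
            # Process i gives up its token; process i+1 ends up with exactly one
            # token, whether or not it already had one (a duplicate is discarded).
            result.add(config[:i] + EMPTY + TOKEN + config[i + 2:])
    return result

def reachable(initial):
    """Breadth-first exploration of all configurations reachable from `initial`."""
    seen = {initial}
    queue = deque([initial])
    while queue:
        current = queue.popleft()
        for nxt in successors(current):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def safe_for(n):
    """Safety for a fixed n: from a single-token start, never two or more tokens."""
    for i in range(n):
        initial = EMPTY * i + TOKEN + EMPTY * (n - i - 1)
        if any(c.count(TOKEN) >= 2 for c in reachable(initial)):
            return False
    return True
```

Calling `safe_for(n)` for a few small values of `n` gives evidence for the property, but no finite number of such calls covers all `n`.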
This basic idea of regular model checking was already present in the work of Pnueli et al. [27] and Boigelot and Wolper [46]. The term “regular model checking” was coined by Abdulla et al. [12]. A lot of the initial work in regular model checking focussed on developing scalable algorithms (mostly via acceleration and widening) for verifying safety, while going beyond safety (e.g. to liveness) posed a significant challenge; see [3, 44]. It is now 20 years since the publication of the seminal paper [12] on regular model checking. In that time, the area of computer-aided verification has undergone some paradigm shifts, including the rise of SAT solvers and SMT solvers (e.g. see the textbooks [13, 28]), as well as synthesis algorithms [5]. Regular model checking has also been affected by these developments. In 2013 Neider and Jansen [36] proposed an automata synthesis algorithm for verifying safety in regular model checking, using SAT solvers to guide the search for an inductive invariant. This new way of looking at regular model checking has inspired a new class of regular model checking algorithms, which can solve old regular model checking benchmarks that could not be solved automatically by any previously known technique (e.g. liveness, even for probabilistic distributed protocols [34, 30]), as well as new correctness properties (e.g. safety games [37] and probabilistic bisimulation with applications to proving anonymity [24]). Despite these recent successes, these techniques are rather ad hoc, and often difficult to adapt to new correctness properties.
Contributions.
We provide a new and clean reformulation of regular model checking inspired by deductive verification. More precisely, we show how to express RMC as satisfaction of existential second-order logic (ESO) over automatic structures. Among others, this new framework puts virtually all interesting correctness properties (e.g. safety, liveness, safety games, bisimulation, etc.) in regular model checking under one broad umbrella. We provide new automata synthesis algorithms for solving any regular model checking problem that is expressed in this framework.
In deductive verification, we encode correctness properties of a program as formulas in some (first-order) logic, commonly called verification conditions, and then check the conditions using a theorem prover. This approach provides a clean separation of concerns between generating and checking “correctness proofs,” and underlies several verification methodologies and systems, for instance in deductive verification (with systems like Dafny [29] or KeY [4]) or termination checkers (e.g., AProVE [21] or T2 [14]). For practical reasons, the most attractive case is of course the one where all verification conditions can be kept within decidable theories. We propose to use first-order logic over universal automatic structures [9, 10, 8, 15] for the decidable theories expressing the verification conditions. Furthermore, we show that the correctness properties can be expressed as satisfaction of ESO formulas over automatic structures, where the second-order variables express the existence of proofs such that the verification conditions are satisfied. Finally, we show that restricting to regular proofs (i.e. proofs that can be expressed by finite automata) is sufficient in practice, and allows us to have powerful verification algorithms that unify the recent successful automata synthesis algorithms [36, 34, 30, 24] for safety, liveness, reachability games, and other interesting correctness properties.
Organization.
Section 2 contains preliminaries. We provide our reformulation of regular model checking in terms of existential second-order logic (ESO) over automatic structures in Section 3. We provide a synthesis algorithm for solving formulas in ESO over automatic structures in Section 4. We conclude in Section 5 with research challenges.
2 Preliminaries
2.1 Automata
We assume basic familiarity with finite automata (e.g. see [40]). We use $\Sigma$ to denote a finite alphabet. In this paper, we exclusively deal with automata over finite words, but the framework and techniques extend to other classes of structures (e.g. trees) and finite automata (e.g. finite tree automata). An automaton over $\Sigma$ is a tuple $\mathcal{A} = (\Sigma, Q, \delta, q_0, F)$, where $Q$ is a finite set of states, $\delta \subseteq Q \times \Sigma \times Q$ is the transition relation, $q_0 \in Q$ is the initial state, and $F \subseteq Q$ is the set of final states. In this way, our automata are by default assumed to be nondeterministic. The notion of runs of $\mathcal{A}$ on an input word is standard, i.e., a function that assigns a state to every position of the word so that the run starts in $q_0$ and the transition relation is respected. We use $L(\mathcal{A})$ to denote the language (i.e. subset of $\Sigma^*$) accepted by $\mathcal{A}$.
2.2 Regular Model Checking
Regular Model Checking (RMC) is a generic symbolic framework for modelling and verifying infinite-state systems [25, 12, 3]. The basic principle behind the framework is to use finite automata to represent an infinite-state system, and witnesses for a correctness property. For example, an infinite set of states can be represented as a regular language over $\Sigma$. How do we represent a transition relation? In the basic setting (as described in the seminal papers [25, 12]), we can use length-preserving transducers for this purpose. A length-preserving transducer is simply an automaton over the alphabet $\Sigma \times \Sigma$. Given an input pair $(v, w)$ of words of equal length, acceptance of $(v, w)$ by the transducer is defined to be the acceptance of the “product” word $(v_1, w_1) \cdots (v_n, w_n)$ by this automaton. In this way, a transition relation can now be represented by an automaton.
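As an illustration of this definition, the sketch below encodes the token passing step of Example 1 as a nondeterministic automaton over pairs of letters and tests acceptance of a product word. The letters `T`/`N`, the three states, and the function name are assumptions of this sketch, not notation taken from the paper.

```python
# Hypothetical two-letter alphabet: "T" = holds a token, "N" = holds no token.
IDENTITY_PAIRS = [("T", "T"), ("N", "N")]

# Nondeterministic automaton over pairs of letters:
#   state 0: copying letters before the process that passes its token
#   state 1: just read (T, N), i.e. the passing process gave up its token
#   state 2: copying letters after the receiving process
TRANSITIONS = {}
for pair in IDENTITY_PAIRS:
    TRANSITIONS[(0, pair)] = {0}
    TRANSITIONS[(2, pair)] = {2}
TRANSITIONS[(0, ("T", "N"))] = {1}
TRANSITIONS[(1, ("N", "T"))] = {2}  # right neighbour had no token
TRANSITIONS[(1, ("T", "T"))] = {2}  # right neighbour had one, duplicate discarded
INITIAL, FINALS = 0, {2}

def accepts(v, w):
    """Acceptance of the pair (v, w) = acceptance of the product word of v and w."""
    if len(v) != len(w):
        return False  # length-preserving: only equal-length pairs are related
    current = {INITIAL}
    for letter in zip(v, w):  # the i-th letter of the product word is (v[i], w[i])
        current = {t for s in current for t in TRANSITIONS.get((s, letter), ())}
    return bool(current & FINALS)
```

The relation recognised this way is a set of pairs of words, but the automaton itself only ever reads a single word over the product alphabet.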
In this paper, we will deal mostly with systems whose transition relations can be represented by length-preserving transducers. This is not a problem in practice because this already covers a lot of applications, including reasoning about distributed algorithms (arguably the most important class of applications of RMC), where the number of processes is typically fixed at runtime. That said, we will show how to easily extend the definition to non-length-preserving relations (called automatic relations [9, 10, 8, 15]) since they are needed in our decidable logic. This is done by the standard trick of padding the shorter string with a special padding symbol. More precisely, given two words $v$ and $w$ over $\Sigma$, we define the convolution $v \otimes w$ to be the word of length $\max(|v|, |w|)$ over pairs of letters whose $i$-th letter is $(a_i, b_i)$, where $a_i$ is the $i$-th letter of $v$ for all $i \le |v|$ (for $i > |v|$, $a_i$ is the padding symbol), and $b_i$ is the $i$-th letter of $w$ for all $i \le |w|$ (for $i > |w|$, $b_i$ is the padding symbol). For example, writing $\#$ for the padding symbol, $01 \otimes 0110$ is the word $(0,0)(1,1)(\#,1)(\#,0)$. Whether $(v, w)$ is accepted by a transducer is now synonymous with acceptance of $v \otimes w$ by the underlying automaton. In this way, transition relations that relate words of different lengths can still be represented using finite automata.
2.3 Weakly-Finite Systems
In this paper, we will restrict ourselves to transition systems whose domain is a regular subset of $\Sigma^*$, and whose transition relations can be described by length-preserving transducers. Consequently, since $\Sigma$ is finite, from any given configuration $w$ of the system only a finite number of configurations are reachable (in fact, at most $|\Sigma|^{|w|}$). Such transition systems (which can be infinite, but where the number of configurations reachable from any given configuration is finite) are typically referred to as weakly-finite systems [19]. As we previously mentioned, this restriction is not a big problem in practice since many practical examples (including those from distributed algorithms) can be captured. The restriction is, however, useful when developing a clean framework that is unencumbered by a lot of extra assumptions, and at the same time captures a lot of interesting correctness properties.
2.4 Existential Second-Order Logic
In this paper, we will use Existential Second-Order Logic (ESO) to reformulate RMC. Second-order logic (e.g. see [31]) is an extension of first-order logic by quantification over relations. Let $\sigma$ be a vocabulary consisting of relations (i.e. a relational vocabulary). Relational variables will be denoted by capital letters such as $R$. Each relational variable $R$ has an arity $\mathrm{ar}(R)$. ESO over $\sigma$ is simply the fragment of second-order logic over $\sigma$ consisting of formulas of the form
$$\psi = \exists R_1, \ldots, R_n.\ \varphi$$
where $\varphi$ is a first-order formula over the vocabulary $\sigma \cup \{R_1, \ldots, R_n\}$, where each $R_i$ is a relation symbol of arity $\mathrm{ar}(R_i)$. Given a structure $\mathfrak{S}$ over $\sigma$ and an ESO formula $\psi$ (as above), checking whether $\mathfrak{S} \models \psi$ amounts to finding relations $R_1, \ldots, R_n$ over the domain of $\mathfrak{S}$ such that $\varphi$ is satisfied (with the standard semantics of first-order logic); in other words, extending $\mathfrak{S}$ to a structure $\mathfrak{S}'$ over $\sigma \cup \{R_1, \ldots, R_n\}$ such that $\mathfrak{S}' \models \varphi$.
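Over a finite structure, ESO satisfaction can be decided by simply enumerating candidate relations. The sketch below does this for graph 2-colourability, which is expressible by an ESO formula with one unary relation variable; the example and all names in it are chosen for illustration and do not come from the paper. The infinite structures considered later rule out this kind of enumeration, which is why witnesses are instead searched for among regular relations.

```python
from itertools import combinations

def eso_two_colourable(nodes, edges):
    """Decide  EXISTS C. FORALL x, y. E(x, y) -> (C(x) <-> not C(y))
    by enumerating every unary relation C over the finite domain.
    Returns a witness set C, or None if the formula is not satisfied."""
    nodes = list(nodes)
    for size in range(len(nodes) + 1):
        for subset in combinations(nodes, size):
            colour = set(subset)
            # Evaluate the first-order part with C interpreted as `colour`.
            if all((x in colour) != (y in colour) for x, y in edges):
                return colour
    return None
```

For instance, a 4-cycle has a witness while a triangle has none, so `eso_two_colourable` returns a set in the first case and `None` in the second.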
3 RMC as ESO Satisfaction over Automatic Structures
As we previously described, our new reformulation of RMC is inspired by deductive verification, which provides a separation between generating and checking correctness proofs. The verification conditions should be describable in decidable logical theories. As a concrete example, suppose we want to prove a safety property for a program $P$. Then, a correctness proof would be a finitely-representable inductive invariant that contains all initial states of $P$, and is disjoint from the set of all bad states of $P$. The termination of a program can similarly be proven by finding a well-founded relation that subsumes the transition relation of the program. In both cases, a correctness proof corresponds to a solution for existentially quantified second-order variables that encode the desired correctness property; in the spirit of Section 2.4, the correctness of a proof can be verified by evaluating just the first-order part of a formula. The generation of the candidate proofs will then be taken care of separately, which we will talk about in the next section. Suffice it to say for now that the counterexample-guided inductive synthesis (CEGIS) framework [5] would be appropriate for the proof generation. In this section, we provide a reformulation of RMC in the aforementioned framework for software verification.
3.1 Automatic Structures
What is the right decidable theory to capture regular model checking? We venture that the answer is the first-order theory of an automatic structure [9, 10, 8, 15]. An automatic structure over the vocabulary $\sigma$ consisting of relations $R_1, \ldots, R_n$ with arities $r_1, \ldots, r_n$ is a structure $\mathfrak{S}$ whose universe is the set $\Sigma^*$ of all strings over some finite alphabet $\Sigma$, and where each relation $R_i \subseteq (\Sigma^*)^{r_i}$ is regular, i.e., the set $\{ w_1 \otimes \cdots \otimes w_{r_i} : (w_1, \ldots, w_{r_i}) \in R_i \}$ is regular. The following well-known closure and algorithmic property is what makes the theory of automatic structures appealing.
Theorem 3.1
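There is an algorithm which, given a first-order formula $\varphi(\bar{x})$ and an automatic structure $\mathfrak{S}$ over the vocabulary $\sigma$, computes a finite automaton for the set $[\![\varphi]\!]$ consisting of tuples $\bar{w}$ of words such that $\mathfrak{S} \models \varphi(\bar{w})$.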
The algorithm is a standard automata construction (e.g. see [41] for details), which is in fact similar to the standard automata construction for the weak second-order theory of one successor [22]. [In fact, first-order logic over automatic structures can be encoded in the weak second-order theory of one successor (and vice versa) via the so-called finite set interpretations [18], which would allow us to use tools like MONA to check first-order formulas over automatic structures.]
Automatic structures are extremely powerful. We can encode the linear integer arithmetic theory $\langle \mathbb{N}, + \rangle$ as an automatic structure [15]. In fact, we can even add the predicate $x \mid_2 y$ (where $x \mid_2 y$ iff $x$ divides $y$ and $x = 2^n$ for some natural number $n$) to $\langle \mathbb{N}, + \rangle$, while still preserving decidability. This essentially implies that ESO over automatic structures is undecidable; in fact, this is the case even when formulas are restricted to monadic predicates.
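To see why addition is an automatic relation, numbers can be written in binary with the least significant bit first, and the ternary relation $x + y = z$ is then recognised by an automaton whose only memory is the carry bit. The following is a minimal sketch under these assumptions; the encoding, the padding symbol `#` (read as digit 0), and the function names are illustrative choices rather than the construction of [15].

```python
from itertools import zip_longest

PAD = "#"  # assumed padding symbol; it is read as the digit 0

def to_lsb(n):
    """Binary encoding with the least significant bit first; 0 is the empty word."""
    return format(n, "b")[::-1] if n > 0 else ""

def convolution(*words):
    """Pad the shorter words with PAD and read all words in lockstep."""
    return list(zip_longest(*words, fillvalue=PAD))

def digit(symbol):
    return 0 if symbol == PAD else int(symbol)

def add_automaton_accepts(word):
    """Run the two-state 'carry' automaton on a word of letter triples."""
    carry = 0  # current state of the automaton
    for a, b, c in word:
        total = digit(a) + digit(b) + carry
        if total % 2 != digit(c):
            return False  # no transition for this letter: reject
        carry = total // 2
    return carry == 0  # accept only if no carry is left over

def is_sum(x, y, z):
    """Check x + y = z via the automaton, never adding x and y directly."""
    return add_automaton_accepts(convolution(to_lsb(x), to_lsb(y), to_lsb(z)))
```

The automaton needs only two states (carry 0 and carry 1) plus an implicit rejecting sink, regardless of how large the numbers are.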
We are now ready to describe our framework for RMC in ESO over automatic structures:

1. Specification: Express the verification problem as a formula in ESO over automatic structures.

2. Specification Checking: Search for regular witnesses for the second-order variables of the formula that satisfy its first-order part.
Note that while the specification (Item 1) would provide a complete and faithful encoding of the verification problem, our method for checking the specification (Item 2) restricts to regular proofs. It is expected that this is an incomplete proof rule, i.e., for the formula to be satisfied, it is not sufficient in general to restrict to regular relations. Therefore, two important questions arise. Firstly, how expressive is the framework of regular proofs? Numerous results suggest that the answer is that it is very expressive. On the practical side, many benchmarks (especially from parameterized systems) have indicated this to be the case, e.g., see [36, 17, 34, 30, 37, 3, 44, 38, 24, 33]. On the theoretical side, this framework is in fact complete for important properties like safety and liveness for many classes of infinite-state systems that can be captured by regular model checking, including pushdown systems, reversal-bounded counter systems, two-dimensional vector addition systems, communication-free Petri nets, and tree-rewrite systems (for the extension to trees), among others, e.g., see [41, 42, 7, 32, 23, 35]. In addition, the restriction to regular proofs is also attractive since it gives rise to a simple method to enumerate all regular proofs that check the formula. This naive method would not work in practice, but smart enumeration techniques of regular proofs (e.g., using automata learning and CEGIS) are available, which we will discuss in Section 4.
3.2 Safety
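We start with the most straightforward example: safety. We assume that our transition system is represented by a length-preserving system with a regular domain and a transition relation given by a length-preserving transducer. Furthermore, we assume that the system comes with two regular languages, representing the sets of initial and bad states. As we mentioned earlier in this section, safety amounts to checking the existence of an invariant that contains all initial states but is disjoint from the set of bad states. That is, the safety property holds iff there exists a set of configurations, the invariant, such that:

the invariant contains every initial state and no bad state;

the invariant is inductive, i.e., for every pair of configurations, if the first is in the invariant and the second is a successor of the first, then the second is in the invariant as well.
The above formulation immediately leads to a first-order formula over the vocabulary of the system extended with a unary relation symbol for the invariant. Therefore, the desired ESO formula over the original vocabulary is obtained by existentially quantifying over this relation symbol.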
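For concreteness, the conditions can be spelled out as follows. The names $\mathit{Init}$, $\mathit{Bad}$, $\mathit{Inv}$ and the arrow $\to$ for the transition relation are assumptions made for this sketch and need not match the paper's notation.

```latex
\begin{align*}
\varphi(\mathit{Inv}) \;:=\;\;
  & \forall x.\ \bigl(\mathit{Init}(x) \rightarrow \mathit{Inv}(x)\bigr) \\
  \wedge\;\; & \forall x.\ \neg\bigl(\mathit{Inv}(x) \wedge \mathit{Bad}(x)\bigr) \\
  \wedge\;\; & \forall x, y.\ \bigl(\mathit{Inv}(x) \wedge x \to y\bigr) \rightarrow \mathit{Inv}(y) \\[4pt]
\psi \;:=\;\; & \exists \mathit{Inv}.\ \varphi(\mathit{Inv})
\end{align*}
```

Only $\mathit{Inv}$ is quantified; $\mathit{Init}$, $\mathit{Bad}$ and $\to$ are relations of the underlying automatic structure, so checking a candidate invariant amounts to evaluating the first-order formula $\varphi$.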
Example 2
Fix . Consider the transition relation generated by the regular expression . Intuitively, nondeterministically picks a substring 10 in an input word and rewrites it to 01. Let and . Observe that there is a regular proof for this safety property: . Note that this is despite the fact that in general is not a regular set.
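The following sketch checks the three safety conditions on all words up to a length bound for the rewriting system that replaces a substring 10 by 01. The concrete sets used here, namely initial states $(01)^*$, bad states $1(0+1)^*$, and candidate invariant $\varepsilon + 0(0+1)^*$, are assumptions picked for illustration and are not necessarily the sets intended in Example 2. A bounded check of this kind can refute a candidate but does not by itself prove it correct for all lengths.

```python
import re
from itertools import product

def step(word):
    """All successors of `word`: rewrite one occurrence of the substring 10 to 01."""
    return {word[:i] + "01" + word[i + 2:]
            for i in range(len(word) - 1) if word[i:i + 2] == "10"}

# Illustrative regular sets (assumptions of this sketch).
INIT = re.compile(r"(01)*")
BAD = re.compile(r"1[01]*")
INV = re.compile(r"(0[01]*)?")

def check_invariant(max_len):
    """Check the safety conditions on every word of length <= max_len."""
    for n in range(max_len + 1):
        for letters in product("01", repeat=n):
            w = "".join(letters)
            in_inv = INV.fullmatch(w) is not None
            if INIT.fullmatch(w) and not in_inv:
                return False  # an initial state lies outside the invariant
            if BAD.fullmatch(w) and in_inv:
                return False  # the invariant contains a bad state
            if in_inv and any(INV.fullmatch(s) is None for s in step(w)):
                return False  # the invariant is not inductive
    return True
```

In this instance the invariant is much simpler than the exact set of reachable configurations, which illustrates why searching for some regular invariant can succeed even when the reachable set itself is not regular.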
3.3 Liveness
A second class of properties are liveness properties, for instance checking whether a program is guaranteed to terminate, guaranteed to answer requests eventually, or guaranteed to visit certain states infinitely often. In the context of RMC, liveness has been studied a lot less than safety, and methods successful for proving safety usually do not lend themselves to an easy generalisation to liveness.
We consider the case of program termination. As before, we assume that a transition system is defined by a domain, a transition relation, and a set of initial states. Proving termination amounts to showing that no infinite runs starting from an initial state exist; to this end, we can search for a pair consisting of an inductive invariant and a well-founded ranking relation:

the invariant contains all initial states;

the invariant is inductive (as in Section 3.2);

the ranking relation covers the reachable transitions: every transition that starts in a configuration of the invariant is contained in the ranking relation;

the ranking relation is transitive: if it relates a first configuration to a second one, and the second one to a third one, then it also relates the first to the third;

the ranking relation is irreflexive: it relates no configuration to itself.
The last two conditions ensure that the ranking relation is a strict partial order, and therefore is even well-founded on fixed-length subsets of the domain. All five conditions can easily be expressed by a first-order formula over the relations of the system, the invariant, and the ranking relation. Now, for length-preserving transition relations, expressing in first-order logic that a transitive relation is well-founded is simple: it is not the case that there are two words such that the first is related to the second and the second is related to itself. This “lasso” shape is owing to the fact that in every finite system every infinite path always leads to one state that is visited infinitely often. In summary, termination of a system is therefore captured by an ESO formula that existentially quantifies over the invariant and the ranking relation, and whose first-order part encodes the aforementioned verification conditions.
Example 3
We consider here the same example as Example 2, but we instead want to prove termination. It is quite easy to see that every configuration will always lead to a configuration of the form $0^*1^*$, which is a dead end. Termination of the system can be proven using the trivial inductive invariant $\Sigma^*$, and a lexicographic ranking relation, represented as a transducer with two states and shown in Fig. 1. Using the algorithms proposed in Section 4, this ranking relation can be computed fully automatically in a few milliseconds.
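The conditions on the ranking relation can be tested on short words. The sketch below assumes the system that rewrites 10 to 01, the trivial invariant containing every word, and a ranking relation that relates a word to every lexicographically smaller word of the same length; this concrete choice of relation is an assumption of the sketch, and the bounded check gives evidence rather than a proof.

```python
from itertools import product

def step(word):
    """All successors of `word`: rewrite one occurrence of the substring 10 to 01."""
    return {word[:i] + "01" + word[i + 2:]
            for i in range(len(word) - 1) if word[i:i + 2] == "10"}

def rank(x, y):
    """Assumed ranking relation: same length, and x lexicographically above y."""
    return len(x) == len(y) and x > y

def check_ranking(max_len):
    """Check covering, irreflexivity and transitivity on words of length <= max_len."""
    for n in range(max_len + 1):
        words = ["".join(p) for p in product("01", repeat=n)]
        for x in words:
            if rank(x, x):
                return False  # not irreflexive
            if any(not rank(x, y) for y in step(x)):
                return False  # some transition is not covered
        for x, y, z in product(words, repeat=3):
            if rank(x, y) and rank(y, z) and not rank(x, z):
                return False  # not transitive
    return True
```

Each rewriting step turns a 1 into a 0 at the first position where the two words differ, which is why every transition decreases the word lexicographically.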
3.4 Winning Strategies for Two-Player Games on Infinite Graphs
We only need to slightly modify the ESO formula for program termination, given in the previous section, to reason about the existence of winning strategies in a reachability game. Instead of a single transition relation, for a two-player game we assume that two relations are given, encoding the possible moves of Player 1 and Player 2, respectively. A reachability game starts in any configuration in a given set of initial configurations. The players move in alternation, with Player 2 winning if the game eventually reaches a configuration in a given target set, whereas Player 1 wins if the game never enters the target set. The first move in a game is always done by Player 1.
As in the previous section, we formulate the existence of a winning strategy for Player 2 (for any initial configuration) in terms of a pair of relations: a set of configurations and a ranking relation. The set now represents the possible configurations that Player 1 visits during games, whereas the ranking relation expresses progress made by Player 2 towards the target region.

;

is transitive and irreflexive (as in Section 3.3);

Player 2 can force the game to progress: for every , and every move of Player 1 with , there is a move of Player 2 such that and .
It is again easy to see that all conditions can be expressed by a first-order formula over the relations involved, and the existence of a winning strategy by an ESO formula that existentially quantifies over the set of configurations and the ranking relation.
A similar encoding has been used in previous work of the authors to reason about almost-sure termination of parameterised probabilistic systems [34, 30]. In this setting, the two players characterise nondeterminism (demonic choice, e.g., the scheduler) and probabilistic choice (angelic choice, e.g., randomisation).
Example 4
We consider a classical take-away game [20] with two players. In the beginning of the game, there are $n$ chips on the table. In alternating moves, with Player 1 starting, each player can take 1, 2, or 3 chips from the table. The first player who has no more chips to take loses. It can be observed that Player 2 has a winning strategy whenever the initial number $n$ is a multiple of 4.
Configurations of this game can be modelled as words , in which the first letter ( or ) indicates the next player to make a move, and the number of s represents the number of chips left. To prove that Player 2 can win whenever , we choose as the initial states, and , i.e., we check whether Player 2 can move first to a configuration in which no chips are left. The transitions of the two players are described by the regular expressions
The witnesses proving that Player 2 indeed has a winning strategy are shown in Fig. 3 and Fig. 3, respectively. The ranking relation in Fig. 3 is similar to the one proving termination in Example 3, and expresses that the number of s is monotonically decreasing. The invariant in Fig. 3 expresses that Player 2 should move in such a way that the number of chips on the table remains divisible by ; and in combination encode the strategy that Player 2 should follow to win. The witness relations were found by the tool SLRP, presented in [34], in around 3 seconds on an Intel Core i5 computer with 3.2 GHz.
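The claim about the take-away game can be cross-checked independently of any automata by solving the game directly for small numbers of chips. The sketch below is such a check and is not how the regular witnesses are computed; the function names are hypothetical. It also includes the reply rule for Player 2 suggested by the invariant, namely answering a move of `k` chips by taking `4 - k`.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def mover_wins(chips):
    """True iff the player about to move can force a win with `chips` chips left.
    A player who cannot take a chip (chips == 0) loses."""
    return any(not mover_wins(chips - k) for k in (1, 2, 3) if k <= chips)

def player2_wins(initial_chips):
    """Player 1 moves first, so Player 2 wins iff the first mover cannot."""
    return not mover_wins(initial_chips)

def player2_reply(taken_by_player1):
    """Reply that keeps the number of chips divisible by 4."""
    return 4 - taken_by_player1

def play_with_strategy(initial_chips, player1_moves):
    """Simulate a game in which Player 2 follows `player2_reply`; returns the
    chip counts seen before each Player 1 move."""
    chips, seen = initial_chips, [initial_chips]
    for k in player1_moves:
        if chips == 0:
            break  # Player 1 has no chips to take and loses
        chips -= k
        chips -= player2_reply(k)
        seen.append(chips)
    return seen
```

In a game starting with a multiple of 4 chips, every chip count in the list returned by `play_with_strategy` stays divisible by 4, mirroring the role of the invariant, while the decreasing counts mirror the ranking relation.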
3.5 Isomorphism and Bisimulation
We now describe how we can compare the behaviour of two given systems described by length-preserving transducers. There are many natural notions of “similarity”, but we target isomorphism, bisimulation, and probabilistic bisimulation (or variants thereof). All of these are important properties since they show indistinguishability of two systems, which is applicable to proving anonymity, e.g., in the case of the Dining Cryptographer Protocol [16]. Isomorphism can also be used to detect symmetries in systems, which can be used to speed up regular model checking [33]. Here, we only describe how to express isomorphism of two systems. Encoding bisimulation and probabilistic bisimulation for parameterized systems is a bit trickier since we will need infinitely many action labels (i.e. to distinguish the action of the $i$-th process), but this can also be encoded in our framework; see the first-order proof rules over automatic structures in the recent paper [24].
We are given two systems whose domains and whose transition relations are described by transducers. We would like to show that the two systems are the same up to isomorphism. The desired ESO formula existentially quantifies over a binary relation, and its first-order part says that this relation describes the desired isomorphism between the two systems. To this end, we will first need to say that the relation is a bijective function. This can easily be described in first-order logic over the vocabulary extended with the relation symbol. For example, the fact that the relation is a function can be described by requiring that any two words related to the same word are equal.
Note that equality is describable by a simple transducer, so this is a valid first-order formula over automatic structures. We then need to add some more conjuncts saying that the function is a homomorphism and that its reverse is also a homomorphism. This is also easily described in first-order logic, e.g., requiring that every transition of the first system is mapped to a transition of the second system says that the function is a homomorphism.
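One possible way to write two of these conjuncts is shown below. The relation variable $F$ and the arrows $\to_1$, $\to_2$ for the two transition relations are names assumed for this sketch; totality, injectivity, surjectivity, and the homomorphism condition for the reverse direction would be added in the same style.

```latex
\begin{align*}
\text{($F$ is a partial function)}\quad
  & \forall x, y, z.\ \bigl(F(x, y) \wedge F(x, z)\bigr) \rightarrow y = z \\
\text{($F$ is a homomorphism)}\quad
  & \forall x, x', y, y'.\ \bigl(x \to_1 x' \wedge F(x, y) \wedge F(x', y')\bigr)
    \rightarrow y \to_2 y'
\end{align*}
```

Both formulas only use equality, the two transition relations, and $F$, so once a regular candidate for $F$ is fixed they can be evaluated as first-order formulas over an automatic structure.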
Example 5
We describe the Dining Cryptographer example [16], and how to prove this by reasoning about isomorphism. [There is a cleaner way to do this using probabilistic bisimulation [24].] In this protocol there are $n$ cryptographers sitting at a round table. The cryptographers know that the dinner was paid for by the NSA, or by exactly one of the cryptographers at the table. The protocol aims to determine which one of these is the case without revealing the identity of the cryptographer who pays. The state of the $i$-th cryptographer records whether or not he paid for the dinner. Any two neighbouring cryptographers keep a private fair coin (that is only visible to themselves). There is a transition to toss any of the coins (in this case, probability is replaced by nondeterminism). Let us use $c_i$ to denote the value of the coin that is shared by the $i$-th and $(i+1)$-st cryptographers. Each cryptographer announces a bit computed from the two coins visible to him by means of the XOR operator; a cryptographer who paid announces the negation of the bit that he would announce otherwise. We call the value announced by the $i$-th cryptographer $a_i$. At the end, we take the XOR of $a_1, \ldots, a_n$, which reveals whether or not one of the cryptographers paid.
This example can easily be encoded by a length-preserving transducer. For example, a configuration in the domain is a word that records, for every $i$, the state of the $i$-th cryptographer together with the values $c_i$ and $a_i$, where $c_i$ and $a_i$ range over 0, 1, and ’?’. Here, the symbol ’?’ is used to denote that the value of $c_i$ is not yet determined. In the case of $a_i$, the symbol ’?’ means that it is not yet announced. Although it is a bit cumbersome, it is possible to describe the dynamics of the system by a transducer. The desired property to prove then is whether there is an isomorphism between the system in which the second cryptographer paid and the system in which the third cryptographer paid, i.e., that the first cryptographer, who did not pay, cannot distinguish whether it was the second or the third cryptographer who paid. There is a transducer describing this isomorphism, which works by inverting the value of the coin shared by the second and the third cryptographers.
4 How to Satisfy Existential Second-Order Quantifiers
We have given several examples for the Specification step in Section 3.1, but the question remains how one can solve the Specification Checking step and automatically compute witnesses for the existential quantifiers in an ESO formula. We present two solutions for this problem, two approaches to automata learning whose respective applicability depends on the shape of the matrix of the formula, i.e., its first-order part. Both methods have in previous work proven to be useful for analysing complex parameterised systems. On the one hand, it has been shown that automata learning is competitive with tailor-made algorithms, for instance with Abstract Regular Model Checking (ARMC) [11], for safety proofs [43, 17]; on the other hand, automata learning is general and can help to automate the verification of properties for which no bespoke approaches exist, for instance liveness properties or properties of games.
4.1 Active Automata Learning
The more efficient, though also more restricted approach is to use classical automata learning, for instance Angluin’s $L^*$ algorithm [6], or one of its variants (e.g., [39, 26]), to compute witnesses for the second-order variables. In all those algorithms, a learner attempts to reconstruct a regular language $L$ known to the teacher by repeatedly asking two kinds of queries: membership, i.e., whether a word $w$ should be in $L$; and equivalence, i.e., whether $L$ coincides with some candidate language $H$ constructed by the learner. When equivalence fails, the teacher provides a positive or negative counterexample, which is a word in the symmetric difference between $L$ and $H$.
This leads to the question how membership and equivalence can be implemented in the ESO setting, in order to let a learner search for the relations $R_1, \ldots, R_n$ witnessing an ESO formula $\exists R_1, \ldots, R_n.\ \varphi$. In general, it is clearly not possible to answer membership queries about the $R_i$, since there can be many choices of relations satisfying $\varphi$, some of which might contain a word, while others do not; in other words, the relations are in general not uniquely determined by $\varphi$. We need to make additional assumptions.
As the simplest case, active automata learning can be used if two properties are satisfied: (i) the relations $R_i$ are uniquely defined by $\varphi$ and the underlying structure; and (ii) for any $k \in \mathbb{N}$, the sub-relations of the $R_i$ containing only words of length at most $k$ can be effectively computed from $\varphi$ and the structure. Given those two assumptions, automata learning can be used to approximate the genuine solution up to any length bound $k$, resulting in a candidate solution. It can also be verified whether the candidate coincides with the genuine solution by evaluating $\varphi$, i.e., by checking whether the structure extended with the candidate satisfies $\varphi$. If this check succeeds, learning has been successful; if it fails, the bound $k$ can be increased and a better approximation computed. Whenever the unique solution exists and is regular, this algorithm is guaranteed to terminate and produce a correct answer.
What can be done when the relations are not unique? Depending on the shape of the matrix $\varphi$, a simple trick can be applicable, namely the learning algorithm can be generalised to search for a unique smallest or unique largest solution (in the set-theoretic sense) of $\varphi$, provided those solutions exist. This is the case in particular when $\varphi$ can be rephrased as a fixed-point equation
$$\bar{R} = F(\bar{R})$$
for some monotonic function $F$; for instance, if $\varphi$ can be written as a set of Horn clauses. We still require property (ii), however, and need to be able to compute sub-relations of the smallest or largest solution to answer membership queries.
In order to check whether a solution candidate is correct (for equivalence queries), we can as before evaluate the matrix $\varphi$, and terminate the search if $\varphi$ is satisfied. In general, however, there is no way to verify that the candidate is indeed the smallest solution of $\varphi$, which affects termination and completeness in a somewhat subtle way. If the smallest solution of $\varphi$ exists and is regular, then termination of the overall search is guaranteed, and the produced solution will indeed satisfy $\varphi$; but what is found is not necessarily the smallest solution of $\varphi$.
This method has been implemented in particular for proving safety [43, 17] and probabilistic bisimulations [24] of length-preserving systems, cases in which the matrix of the formula is naturally monotonic, and where active learning methods are able to compute witnesses with hundreds (sometimes thousands) of states within minutes.
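For safety of a length-preserving system, the smallest solution is the set of reachable configurations, and property (ii) holds because only finitely many words of a given length exist. A membership query can therefore be answered by exploring the configurations of that length. The sketch below shows such a teacher-side oracle for the system that rewrites 10 to 01, with the initial set $(01)^*$ assumed for illustration; it covers only membership queries and is not a full implementation of the learning loop.

```python
import re
from itertools import product

ALPHABET = "01"

def step(word):
    """All successors of `word`: rewrite one occurrence of the substring 10 to 01."""
    return {word[:i] + "01" + word[i + 2:]
            for i in range(len(word) - 1) if word[i:i + 2] == "10"}

def is_initial(word):
    """Assumed regular set of initial configurations: (01)*."""
    return re.fullmatch(r"(01)*", word) is not None

def reachable_of_length(n):
    """The smallest inductive invariant restricted to words of length n."""
    frontier = ["".join(p) for p in product(ALPHABET, repeat=n)
                if is_initial("".join(p))]
    seen = set(frontier)
    while frontier:
        word = frontier.pop()
        for nxt in step(word):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen

def membership_query(word):
    """Answer 'is `word` in the smallest invariant?' by bounded exploration."""
    return word in reachable_of_length(len(word))
```

Because the system is length-preserving, the exploration for a word of length `n` never has to look at words of any other length, which is what makes the oracle effective.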
4.2 SAT-Based Automata Learning
$L^*$-style learning is not applicable if the matrix $\varphi$ of an ESO formula does not have a smallest or largest solution, or if those solutions cannot be computed up to a given length bound (which is usually the case when the transducers defining a system are not length-preserving). Examples of such non-monotonic formulas are the formulas characterising winning strategies of reachability games presented in Section 3.4; indeed, multiple minimal but incomparable strategies can exist to win a game, so that in general there is no smallest solution. A more general learning strategy to solve ESO formulas in the non-monotonic case is SAT-based learning, i.e., using a Boolean encoding of finite-state automata to systematically search for solutions of $\varphi$ [36, 34, 45]. SAT-based learning is a more general solution than active automata learning for constructing ESO proofs, although experiments show that it is also a lot slower for simpler analysis tasks like safety proofs [17].
We outline how a SAT solver can be used to construct deterministic finite-state automata (DFAs), following the encoding used in [34]. The encoding assumes that a finite alphabet $\Sigma$ and the number $n$ of states of the automaton are fixed. The states of the automaton are assumed to be $q_1, \ldots, q_n$, and without loss of generality $q_1$ is the unique initial state. The Boolean decision variables of the encoding are (i) variables that determine which of the states are accepting; and (ii) variables that determine, for any letter $a \in \Sigma$ and states $q, q'$, whether the automaton has a transition from $q$ to $q'$ with label $a$.
A number of Boolean constraints are then asserted to ensure that only well-formed DFAs are considered: determinism; reachability of every automaton state from the initial state; reachability of an accepting state from every state; and symmetry-breaking constraints.
Next, the matrix $\varphi$ of the formula can be translated to Boolean constraints over the decision variables. This translation can be done eagerly for all conjuncts of $\varphi$ that can be represented succinctly:

a positive atom stating that a word of bounded length is accepted can be translated to constraints that assert the existence of a run accepting the word;

a negative atom stating that a word is not accepted can similarly be encoded as a run ending in a non-accepting state, thanks to the determinism of the automaton;

for automata representing binary relations, several universally quantified formulas can be encoded as a polynomial-size Boolean constraint as well, including reflexivity, irreflexivity, functional consistency, and transitivity.
Other conjuncts in the matrix $\varphi$ can be encoded lazily with the help of a refinement loop, resembling the classical CEGAR approach. The SAT solver is first queried to produce a candidate automaton that satisfies a partial encoding of $\varphi$. It is then checked whether the candidate indeed satisfies $\varphi$; if this is the case, SAT-based learning has been successful and terminates; otherwise, a blocking constraint is asserted that rules out the candidate in subsequent queries.
It should be noted that this approach can in principle be implemented for any formula, since it is always possible to generate a naïve blocking constraint that blocks exactly the observed assignment of the decision variables, i.e., that exactly matches the candidate automaton. It is well known in Satisfiability Modulo Theories, however, that good blocking constraints are those which eliminate as many similar candidate solutions as possible, and that they need to be designed carefully and specifically for a theory (or, in our case, based on the shape of the formula).
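The refinement loop, and the difference between naïve and generalising blocking constraints, can be illustrated with a small Python sketch. It makes several simplifying assumptions that are not part of the approach described above: the SAT solver is replaced by brute-force enumeration of all complete DFAs of a fixed size, the check of the full formula is replaced by a bounded test against a hypothetical target language (words with an odd number of `a`), and all function names are invented for the example.

```python
from itertools import product


def candidates(n_states, alphabet):
    """Enumerate all complete DFAs with states 0..n-1 and initial state 0."""
    keys = list(product(range(n_states), alphabet))
    for targets in product(range(n_states), repeat=len(keys)):
        delta = dict(zip(keys, targets))
        for acc in product([False, True], repeat=n_states):
            yield delta, acc


def accepts(dfa, word):
    delta, acc = dfa
    q = 0
    for a in word:
        q = delta[(q, a)]
    return acc[q]


def solve(n_states, alphabet, constraints):
    """Stand-in for a SAT query: first candidate meeting all constraints."""
    for dfa in candidates(n_states, alphabet):
        if all(c(dfa) for c in constraints):
            return dfa
    return None


def check_odd_a(dfa, alphabet="ab", max_len=3):
    """Bounded stand-in for checking the full formula: returns a
    misclassified word with its expected label, or None."""
    for k in range(max_len + 1):
        for letters in product(alphabet, repeat=k):
            word = "".join(letters)
            label = word.count("a") % 2 == 1
            if accepts(dfa, word) != label:
                return word, label
    return None


def learn(n_states, alphabet, eager, check, generalise):
    """Refinement loop; returns the final candidate and the number of
    blocking constraints that had to be added."""
    constraints = list(eager)
    rounds = 0
    while True:
        cand = solve(n_states, alphabet, constraints)
        if cand is None:
            return None, rounds
        cex = check(cand)
        if cex is None:
            return cand, rounds
        rounds += 1
        if generalise:
            # block every automaton that misclassifies the counterexample
            word, label = cex
            constraints.append(lambda d, w=word, l=label: accepts(d, w) == l)
        else:
            # naive blocking: rule out exactly this candidate
            constraints.append(lambda d, c=cand: d != c)
```

Calling `learn(2, "ab", [lambda d: accepts(d, "a")], check_odd_a, True)` and the same call with `False` as the last argument both return an automaton that passes the check; the generalising variant never needs more refinement rounds than the naïve one, because each of its blocking constraints also excludes the current candidate.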
Several implementations of SAT-based learning have been described in the literature, for instance for computing inductive invariants [36], synthesising state machines satisfying given properties [45], computing symmetries of parameterised systems [33], and for solving various kinds of games [34]. Experiments show that the automata that can be computed using SAT-based learning tend to be several orders of magnitude smaller than with active automata learning methods (typically, at most 10–20 states), but that SAT-based learning can solve a more general class of synthesis problems as well.
4.3 Stratification of ESO Formulas
The two approaches to compute regular languages can sometimes be combined. For instance, in [34] active automata learning is used to approximate the reachable configurations of a two-player game (in the sense of computing an inductive invariant), whereas SAT-based learning is used to compute winning strategies; the results of the two procedures in combination represent a solution of an ESO formula with two second-order quantifiers.
More generally, since the active automata learning approach in Section 4.1 is able to compute smallest or greatest solutions of formulas, a combined approach is possible when the matrix of an ESO formula can be stratified. Suppose the matrix can be decomposed into two parts in such a way that (i) the first part has a unique smallest solution in one of the relation variables, and (ii) the second part contains that variable only in literals in negative positions, i.e., underneath an odd number of negations. In this situation, one can clearly proceed by first computing a smallest relation satisfying the first part, using the methods in Section 4.1, and then solve the remaining formula given this fixed solution for the variable. The case where the first part has a greatest solution, and the second part contains the variable only positively, can be handled similarly.
5 Conclusions
In this paper, we have proposed existential second-order logic (ESO) over automatic structures as an umbrella covering a large number of regular model checking tasks. We have shown that many important correctness properties can be represented elegantly in ESO, and developed unified algorithms that can be applied to any correctness property captured using ESO. Experiments showing the practicality of this approach have been presented in several recent publications, including computation of inductive invariants [43, 36, 17], of symmetries and simulation relations of parameterised systems [33], of winning strategies of games [34, 30], and of probabilistic bisimulations [24].
Several challenges remain. One bottleneck that has been identified in several of the studies is the size of alphabets necessary to model systems, to which the algorithms presented in Section 4 are very sensitive. This indicates that some analysis tasks require more compact or more expressive automata representations, for instance symbolic automata, and generalised learning methods; or abstraction to reduce the size of alphabets. Another less-than-satisfactory point is the handling of well-foundedness in the ESO framework. When restricting the class of considered systems to weakly finite systems, as done here, well-foundedness of relations can be replaced by acyclicity, which can be expressed easily in ESO (as shown in Section 3.3). It is not obvious, however, in which way ESO should be extended to also handle systems that are not weakly finite, without sacrificing the elegance of the approach.
Acknowledgment.
We thank our numerous collaborators in our work on regular model checking that led to this work, including Parosh Abdulla, Yu-Fang Chen, Lukas Holik, Chih-Duo Hong, Bengt Jonsson, Ondrej Lengal, Leonid Libkin, Rupak Majumdar, and Tomas Vojnar. This research was sponsored in part by the ERC Starting Grant 759969 (AV-SMP), the Swedish Research Council (VR) under grant 2018-04727, and by the Swedish Foundation for Strategic Research (SSF) under the project WebSec (Ref. RIT17-0011).
References
 [1] P. A. Abdulla, A. Bouajjani, B. Jonsson, and M. Nilsson. Handling global conditions in parameterized system verification. In Computer Aided Verification, 11th International Conference, CAV ’99, Trento, Italy, July 6–10, 1999, Proceedings, pages 134–145, 1999.
 [2] P. A. Abdulla, B. Jonsson, P. Mahata, and J. d’Orso. Regular tree model checking. In CAV, pages 555–568, 2002.
 [3] P. A. Abdulla, B. Jonsson, M. Nilsson, and M. Saksena. A survey of regular model checking. In CONCUR, pages 35–48, 2004.
 [4] W. Ahrendt, B. Beckert, R. Bubel, R. Hähnle, P. H. Schmitt, and M. Ulbrich, editors. Deductive Software Verification - The KeY Book - From Theory to Practice, volume 10001 of Lecture Notes in Computer Science. Springer, 2016.
 [5] R. Alur, R. Bodík, G. Juniwal, M. M. K. Martin, M. Raghothaman, S. A. Seshia, R. Singh, A. Solar-Lezama, E. Torlak, and A. Udupa. Syntax-guided synthesis. In Formal Methods in Computer-Aided Design, FMCAD 2013, Portland, OR, USA, October 20–23, 2013, pages 1–8, 2013.
 [6] D. Angluin. Learning regular sets from queries and counterexamples. Inf. Comput., 75(2):87–106, Nov. 1987.
 [7] S. Bardin, A. Finkel, J. Leroux, and P. Schnoebelen. Flat acceleration in symbolic model checking. In ATVA, pages 474–488, 2005.
 [8] M. Benedikt, L. Libkin, T. Schwentick, and L. Segoufin. Definable relations and first-order query languages over strings. J. ACM, 50(5):694–751, 2003.
 [9] A. Blumensath and E. Grädel. Automatic structures. In Logic in Computer Science, 2000. Proceedings. 15th Annual IEEE Symposium on, pages 51–62. IEEE, 2000.
 [10] A. Blumensath and E. Grädel. Finite presentations of infinite structures: Automata and interpretations. Theory of Computing Systems, 37(6):641–674, 2004.
 [11] A. Bouajjani, P. Habermehl, and T. Vojnar. Abstract regular model checking. In CAV’04, pages 372–386.
 [12] A. Bouajjani, B. Jonsson, M. Nilsson, and T. Touili. Regular model checking. In Computer Aided Verification, 12th International Conference, CAV 2000, Chicago, IL, USA, July 15–19, 2000, Proceedings, pages 403–418, 2000.
 [13] A. R. Bradley and Z. Manna. The Calculus of Computation: Decision Procedures with Applications to Verification. Springer, 1998.
 [14] M. Brockschmidt, B. Cook, S. Ishtiaq, H. Khlaaf, and N. Piterman. T2: temporal property verification. In M. Chechik and J. Raskin, editors, Tools and Algorithms for the Construction and Analysis of Systems - 22nd International Conference, TACAS 2016, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2016, Eindhoven, The Netherlands, April 2–8, 2016, Proceedings, volume 9636 of Lecture Notes in Computer Science, pages 387–393. Springer, 2016.
 [15] V. Bruyere, G. Hansel, C. Michaux, and R. Villemaire. Logic and recognizable sets of integers. Bull. Belg. Math. Soc., 1:191–238, 1994.
 [16] D. Chaum. The dining cryptographers problem: Unconditional sender and recipient untraceability. Journal of cryptology, 1(1):65–75, 1988.
 [17] Y. Chen, C. Hong, A. W. Lin, and P. Rümmer. Learning to prove safety over parameterised concurrent systems. In 2017 Formal Methods in Computer Aided Design, FMCAD 2017, Vienna, Austria, October 2–6, 2017, pages 76–83, 2017.
 [18] T. Colcombet and C. Löding. Transforming structures by set interpretations. Logical Methods in Computer Science, 3(2), 2007.
 [19] J. Esparza, A. Gaiser, and S. Kiefer. Proving termination of probabilistic programs using patterns. In Computer Aided Verification - 24th International Conference, CAV 2012, Berkeley, CA, USA, July 7–13, 2012, Proceedings, pages 123–138, 2012.
 [20] T. S. Ferguson. Game Theory. Online Book, second edition, 2014.
 [21] J. Giesl, R. Thiemann, P. Schneider-Kamp, and S. Falke. Automated termination proofs with AProVE. In V. van Oostrom, editor, Rewriting Techniques and Applications, 15th International Conference, RTA 2004, Aachen, Germany, June 3–5, 2004, Proceedings, volume 3091 of Lecture Notes in Computer Science, pages 210–220. Springer, 2004.
 [22] E. Grädel, W. Thomas, and T. Wilke, editors. Automata, Logics, and Infinite Games: A Guide to Current Research [outcome of a Dagstuhl seminar, February 2001], volume 2500 of Lecture Notes in Computer Science. Springer, 2002.
 [23] M. Hague, A. W. Lin, and C. L. Ong. Detecting redundant CSS rules in HTML5 applications: a tree rewriting approach. In Proceedings of the 2015 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications, OOPSLA 2015, part of SPLASH 2015, Pittsburgh, PA, USA, October 25–30, 2015, pages 1–19, 2015.
 [24] C. Hong, A. W. Lin, R. Majumdar, and P. Rümmer. Probabilistic bisimulation for parameterized systems - (with applications to verifying anonymous protocols). In Computer Aided Verification - 31st International Conference, CAV 2019, New York City, NY, USA, July 15–18, 2019, Proceedings, Part I, pages 455–474, 2019.
 [25] B. Jonsson and M. Nilsson. Transitive closures of regular relations for verifying infinite-state systems. In Tools and Algorithms for Construction and Analysis of Systems, 6th International Conference, TACAS 2000, Held as Part of the European Joint Conferences on the Theory and Practice of Software, ETAPS 2000, Berlin, Germany, March 25 – April 2, 2000, Proceedings, pages 220–234, 2000.
 [26] M. J. Kearns and U. V. Vazirani. An Introduction to Computational Learning Theory. MIT Press, 1994.
 [27] Y. Kesten, O. Maler, M. Marcus, A. Pnueli, and E. Shahar. Symbolic model checking with rich assertional languages. In Computer Aided Verification, 9th International Conference, CAV ’97, Haifa, Israel, June 22–25, 1997, Proceedings, pages 424–435, 1997.
 [28] D. Kroening and O. Strichman. Decision Procedures: An Algorithmic Point of View. Springer Publishing Company, Incorporated, 1 edition, 2008.
 [29] K. R. M. Leino. Dafny: An automatic program verifier for functional correctness. In E. M. Clarke and A. Voronkov, editors, Logic for Programming, Artificial Intelligence, and Reasoning - 16th International Conference, LPAR-16, Dakar, Senegal, April 25 – May 1, 2010, Revised Selected Papers, volume 6355 of Lecture Notes in Computer Science, pages 348–370. Springer, 2010.
 [30] O. Lengál, A. W. Lin, R. Majumdar, and P. Rümmer. Fair termination for parameterized probabilistic concurrent systems. In TACAS, pages 499–517, 2017.
 [31] L. Libkin. Elements of Finite Model Theory. Springer, 2004.
 [32] A. W. Lin. Accelerating tree-automatic relations. In IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science, FSTTCS 2012, December 15–17, 2012, Hyderabad, India, pages 313–324, 2012.
 [33] A. W. Lin, T. K. Nguyen, P. Rümmer, and J. Sun. Regular symmetry patterns. In Verification, Model Checking, and Abstract Interpretation - 17th International Conference, VMCAI 2016, St. Petersburg, FL, USA, January 17–19, 2016. Proceedings, pages 455–475, 2016.
 [34] A. W. Lin and P. Rümmer. Liveness of randomised parameterised systems under arbitrary schedulers. In CAV’16 (2), volume 9779 of LNCS, pages 112–133. Springer, 2016.
 [35] C. Löding and A. Spelten. Transition graphs of rewriting systems over unranked trees. In Mathematical Foundations of Computer Science 2007, 32nd International Symposium, MFCS 2007, Český Krumlov, Czech Republic, August 26–31, 2007, Proceedings, pages 67–77, 2007.
 [36] D. Neider and N. Jansen. Regular model checking using solver technologies and automata learning. In NASA Formal Methods, 5th International Symposium, NFM 2013, Moffett Field, CA, USA, May 14–16, 2013. Proceedings, pages 16–31, 2013.
 [37] D. Neider and U. Topcu. An automaton learning approach to solving safety games over infinite graphs. In TACAS, pages 204–221, 2016.
 [38] M. Nilsson. Regular Model Checking. PhD thesis, Uppsala Universitet, 2005.
 [39] R. L. Rivest and R. E. Schapire. Inference of finite automata using homing sequences. Inf. Comput., 103(2):299–347, 1993.
 [40] M. Sipser. Introduction to the Theory of Computation. PWS Publishing Company, 1997.
 [41] A. W. To. Model Checking Infinite-State Systems: Generic and Specific Approaches. PhD thesis, School of Informatics, University of Edinburgh, 2010.
 [42] A. W. To and L. Libkin. Algorithmic metatheorems for decidable LTL model checking over infinite systems. In Foundations of Software Science and Computational Structures, 13th International Conference, FOSSACS 2010, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2010, Paphos, Cyprus, March 20–28, 2010. Proceedings, pages 221–236, 2010.
 [43] A. Vardhan, K. Sen, M. Viswanathan, and G. Agha. Learning to verify safety properties. In ICFEM’04, pages 274–289.
 [44] T. Vojnar. Cut-offs and automata in formal verification of infinite-state systems, 2007. Habilitation Thesis, Faculty of Information Technology, Brno University of Technology.
 [45] N. Walkinshaw, R. Taylor, and J. Derrick. Inferring extended finite state machine models from software executions. Empirical Software Engineering, 21(3):811–853, 2016.
 [46] P. Wolper and B. Boigelot. Verifying systems with infinite but regular state spaces. In Computer Aided Verification, 10th International Conference, CAV ’98, Vancouver, BC, Canada, June 28 – July 2, 1998, Proceedings, pages 88–97, 1998.