A Decidable Fragment of Second Order Logic With Applications to Synthesis

We propose a fragment of many-sorted second order logic ESMT and show that checking satisfiability of sentences in this fragment is decidable. This logic has an ∃^*∀^* quantifier prefix that is conducive to modeling synthesis problems. Moreover, it allows reasoning using a combination of background theories provided that they have a decidable satisfiability problem for the ∃^*∀^* FO-fragment (e.g., linear arithmetic). Our decision procedure reduces the satisfiability of ESMT formulae to satisfiability queries of the background theories, allowing us to use existing efficient SMT solvers for these theories; hence our procedure can be seen as effectively SMT (ESMT) reasoning.

READ FULL TEXT VIEW PDF
POST COMMENT

Comments

There are no comments yet.

Authors

page 1

page 2

page 3

page 4

08/26/2019

CLS-SMT: Bringing Together Combinatory Logic Synthesis and Satisfiability Modulo Theories

We introduce an approach that aims to combine the usage of satisfiabilit...
09/17/2021

On the Convexity of a Fragment of Pure Set Theory with Applications within a Nelson-Oppen Framework

The Satisfiability Modulo Theories (SMT) issue concerns the satisfiabili...
04/28/2022

Linear Temporal Logic Modulo Theories over Finite Traces (Extended Version)

This paper studies Linear Temporal Logic over Finite Traces (LTLf) where...
04/30/2021

Temporal Stream Logic modulo Theories

Temporal Stream Logic (TSL) is a temporal logic that extends LTL with up...
06/08/2021

Interpolation and Model Checking for Nonlinear Arithmetic

We present a new model-based interpolation procedure for satisfiability ...
09/15/2017

Variant-Based Decidable Satisfiability in Initial Algebras with Predicates

Decision procedures can be either theory-specific, e.g., Presburger arit...
04/15/2020

Computing Tropical Prevarieties with Satisfiability Modulo Theories (SMT) Solvers

A novel way to use SMT (Satisfiability Modulo Theories) solvers to compu...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

The goal of program synthesis is to automatically construct a program that satisfies a given specification. This problem has received a lot of attention from the research community in recent years [34, 4, 14]. Several different approaches have been proposed to address this challenge (see [4, 17] for some of these). One approach to program synthesis is to reduce the problem to the satisfiability problem in a decidable logic by constructing a sentence whose existentially quantified variables identify the program to be synthesized, and the inner formula expresses the requirements that the program needs to meet.

This paper furthers this research program by identifying a decidable second-order logic that is suitable for encoding problems in program synthesis. To get useful results, one needs to constrain the semantics of functions and relations used in encoding the synthesis problem. Therefore our logic has a set of background theories, where each of the background theories is assumed to be independently axiomatized and equipped with a solver. Finally, to leverage the advances made by logic solvers, our aim is to develop a decision procedure for our logic that makes black-box calls to the decision procedures (for satisfiability) for the background theories.

With the above goal in mind, let us describe our logic. It is a many-sorted logic that can be roughly described as an uninterpreted combination of theories (UCT) [20]. A UCT has a many-sorted universe where there is a special sort that is declared to be a foreground sort, while the other sorts () are declared to be background sorts. We assume that there is some fixed signature of functions, relations, and constants over each individual background sort that is purely over that sort. Furthermore, we assume that each background sort () comes with an associated background theory ; can be arbitrary, even infinite, but is constrained to formulae with functions, relations and constants that only involve the background sort . Our main contribution is a decidability result for the satisfiability problem modulo these theories for boolean combinations of sentences of the form

(1)
  • is a set of existentially quantified first order variables. These variables can admit values in any of the sorts (background or foreground);

  • is a set of existentially quantified relational variables, whose arguments are restricted to be over the foreground sort ;

  • is a set of existentially quantified function variables, which take as arguments elements from the foreground sort , and return a value in any of the background sorts ;

  • is a set of universally quantified first order variables over any of the sorts;

  • is a set of universally quantified relational variables, whose arguments could be of any of the sorts; and

  • is a set of universally quantified function variables, whose arguments can be from any sort and could return values of any sort.

Thus our logic has sentences with prefix , allowing for quantification over both first order variables and second-order variables (relational and functional). To obtain decidability, we have carefully restricted the sorts (or types) of second-order variables that are existentially and universally quantified, as described above.

Our decidability result proceeds as follows. By crucially exploiting the disjointness of the universes of background theories and through a series of transformations, we reduce the satisfiability problem for our logic to the satisfiability of several pure first-order logic formulas over the individual background theories . Consequently, if the background theories admit (individually) a decidable satisfiability problem for the first-order fragment, then satisfiability for our logic is decidable. Examples of such background theories include Presburger arithmetic, the theory of real-closed fields, and the theory of linear real arithmetic. Our algorithm for satisfiability makes finitely many black-box calls to the engines for the individual background theories.

Salient aspects of our logic and our decidability result

Design for decidability: Our logic is defined to carefully avoid the undecidability that looms in any logic of such power. We do not know of any decidable second-order logic fragment that supports background theories such as arithmetic and uninterpreted functions. While quantifier-free decidable logics can be combined to get decidable logics using Nelson-Oppen combinations [24], or local theory extensions [33], combining quantified logics is notoriously hard, and there are only few restricted classes of first-order logic that are known to be decidable.

Our design choice forces communication between theories using the foreground sort, keeping the universes of the different sorts disjoint, which allows a decidable combination of theories. We emphasize that, unlike existing work on quantified first-order theories that are decidable by reduction to quantifier-free SMT, our logic allows existential and universal quantification over the background theories as well, and the decision procedure reduces satisfiability to fragment of the underlying theories. Our result can hence be seen as a decidable combination of theories that further supports second-order quantification.

Undecidable Extensions: We show that our logic is on the edge of the decidability barrier, by showing that lifting some of the restrictions we have will render the logic undecidable. In particular, we show that if we allow outer existential quantification over functions (which is related to the condition demanding that all function variables are universally quantified in the inner block of quantifiers), then satisfiability of the logic is undecidable. Second, if we lift the restriction that the underlying background sorts are pairwise disjoint, then again the logic becomes undecidable. The design choices that we have made hence seem crucial for decidability.

Expressing Synthesis Problems: Apart from decidability, a primary motivational design principle of our logic is to express synthesis problems. Synthesis problems typically can be expressed in fragments, where we ask whether there exists an object of the kind we wish to synthesize (using the block of existential quantifiers) such that the object satisfies certain properties (expressed by a universally quantified formula). For instance, if we are synthesizing a program snippet that is required to satisfy a Hoare triple (pre/post condition), we can encode this by asking whether there is a program snippet such that for all values of variables (modeling the input to the snippet), the verification condition corresponding to the Hoare triple holds. In this context, the existentially quantified variables (first order and second order) can be used to model program snippets. Furthermore, since our logic allows second-order universal quantification over functions, we can model aspects of the program state that require uninterpreted functions, in particular pointer fields that model the heap.

Evaluation on Synthesis Problems: We illustrate the applicability of our logic for two classes of synthesis problems. The first class involves synthesizing recursive programs that work over inductive data-structures. Given the precise pre/post condition for the program to be synthesized, we show how to model recursive program synthesis by synthesizing only a straight-line program (by having the output of recursive calls provided as inputs to the straightline program). The verification condition of the program requires universal quantification over both scalar variables as well as heap pointers, modeled as uninterpreted functions. Since such verification-conditions are already very expressive (even for the purpose of verification), we adapt a technique in the literature called natural proofs [20, 29, 26], that soundly abstracts the verification condition to a decidable theory. This formulation still has universal quantification over variables and functions, and combines standard background theories such as arithmetic and theory of uninterpreted functions. We then show that synthesis of bounded-sized programs (possibly involving integer constants that can be unbounded) can be modeled in our logic. In this modeling, the universal quantification over functions plays a crucial role in modeling the pointers in heaps, and modeling uninterpreted relations that capture inductive data-structure predicates (such as lseg, bstree, etc.).

The second class of synthesis involves taking a recursive definition of a function, and synthesizing a non-recursive (and iteration free) function equivalent to it. In our modeling, the existential quantification over the foreground sort as well as the background sort of integers is utilized, as the synthesized function involves integers.

The crux of our contribution, therefore, is providing a decidable logic that can express synthesis problems succinctly. Such a logic promises to provide a useful interface between researchers working on practical synthesis applications and researchers working on engineering efficient tools for solving them, similar to the role SMT plays in verification.

2 Motivating Eqsmt for synthesis applications

In program synthesis, the goal is to search for programs, typically of bounded size, that satisfy a given specification. The -Block of an EQSMT formula can be used to express the search for the syntactic program. The inner formula, then, must interpret the semantics of this syntactic program, and express that it satisfies the specification. If the specification is a universally quantified formula, then, we can encode the synthesis problem in EQSMT.

One of the salient features of the fragment EQSMT is the ability to universally quantify over functions and relations. Often, specifications for programs, such as those that manipulate heaps, involve a universal quantification over uninterpreted functions (that model pointers). EQSMT aptly provides this functionality, while still remaining within the boundaries of decidability. Further, EQSMT supports combination of background theories/sorts; existential quantification over these sorts can thus be used to search for programs with arbitrary elements from these background sorts. As a result, the class of target programs that can be expressed by an EQSMT formula is infinite. Consequently, when our decision procedure returns unsatisfiable, we are assured that no program (from an infinite class of programs) exists, (most CEGIS solvers for program synthesis cannot provide such a guarantee.)

We now proceed to give a concrete example of a synthesis problem which will demonstrate the power of EQSMT. Consider the specification of the following function , which is a slight variant of the classical McCarthy’s function [23], whose specification is given below.

(2)

We are interested in synthesizing a straight line program that implements the recursive function , and can be expressed as a term over the grammar specified in Figure 0(a).

(a) Grammar for . stands for if-then-else.

(b) Program skeleton
Figure 1: Synthesizing using EQSMT. 

Here, we only briefly discuss how to encode this synthesis problem in EQSMT, and the complete details can be found in Appendix A. First, let us fix the maximum height of the term we are looking for, say to be 2. Then, the program we want to synthesize can be represented as a tree of height at most such that every node in the tree can have child nodes (because the maximum arity of any function in the above grammar is , corresponding to ite). The skeleton of such an expression tree is shown in Figure 0(b). Every node in the tree is named according to its path from the root node.

The synthesis problem can then be encoded as the following formula

Here, the nodes are elements of the foreground sort . The binary relations over the foreground sort will be used to assert that a node is the left,middle, right child respectively of node : , , . The operators or labels for nodes belong to the background sort , and can be one of , , , , , INPUT (denoting the input to our program), or constants (for which we will synthesize natural constants in the (infinite) background sort ). The function assigns a label to every node in the program, and the formula asserts some sanity conditions:

The formula asserts that the ‘meaning’ of the program can be inferred from the meaning of the components of the program. We will use the function , that assigns value to nodes from , for this purpose :

Finally, the formula expresses the specification of the program as in Equation (2). A complete description is provided in Appendix A.

Observe that the formula has existential and universal quantification over functions and relations, as allowed by our decidable fragment EQSMT. The existentially quantified functions map the foreground sort to one of the background sorts, and the existentially quantified relations span only over the foreground sort.

We encoded the above EQSMT formula in the z3 [12] SMT solver (see Section 6 for details), which synthesized the expression . In Section 6, we show that we can synthesize a large class of such programs amongst others.

3 Many-sorted Second Order Logic and the Eqsmt Fragment

We briefly recall the syntax and semantics of general many-sorted second order logic, and then present the  fragment of second order logic.

Many-sorted second-order logic

A many-sorted signature is a tuple where, is a nonempty finite set of sorts, , , , , are, respectively, sets of function symbols, relation symbols, first order variables, function variables and relation variables. Each variable is associated with a sort , represented as . Each function symbol or function variable also has an associated type , and each relation symbol and relation variable has a type . We assume that the set of symbols in and are either finite or countably infinite, and that , , and are all countably infinite. Constants are modeled using -ary functions. We say that is unsorted if consists of a single sort.

Terms over a many-sorted signature have an associated sort and are inductively defined by the grammar

where , and . Formulae over are inductively defined as

where , are relation variables, is a function variable, of appropriate types. Note that equality is allowed only for terms of same sort. A formula is said to be first-order if it does not use any function or relation variables.

The semantics of many sorted logics are described using many-sorted structures. A -structure is a tuple where is a collection of pairwise disjoint indexed universes, and is an interpretation function that maps each each variable to an element in the universe , each function symbol and each function variable to a function of the appropriate type on the underlying universe. Similarly, relation symbols and relation variables are also assigned relations of the appropriate type on the underlying universe. For an interpretation , as is standard, we use to denote the interpretation that maps to , and is otherwise identical to . For function variable and relation variable , and are defined analogously.

Interpretation of terms in a model is the usual one obtained by interpreting variables, functions, and function variables using their underlying interpretation in the model; we skip the details. The satisfaction relation is also defined in the usual sense, and we will skip the details.

A first-order theory is a tuple , where is a set of (possibly infinite) first-order sentences. Theory is complete if every sentence or its negation is entailed by , i.e., either every model satisfying satisfies , or every model satisfying satisfies . A theory is consistent if it is not the case that there is a sentence such that both and are entailed.

The logic Eqsmt 

We now describe EQSMT, the fragment of many-sorted second order logic that we prove decidable in this paper and that we show can model synthesis problems.

Let be a many sorted signature. is a pure signature if (a) the type of every function symbol and every relation symbol is over a single sort (however, function variables and relation variables are allowed to mix sorts), (b) there is a special sort (which we call the foreground sort, while other sorts are called background sorts) and (c) there are no function or relation symbols involving .

The fragment EQSMT is the set of sentences defined over a pure signature , with foreground sort and background sorts , by the following grammar

where, , (i.e., only foreground sort), , and is a universally quantified formula defined by the grammar

where, , , and is quantifier free over .

The formulas above consist of an existential quantification block followed by a universal quantification block. The existential block can have first-order variables of any sort, relation variables that are over the foreground sort only, and function variables that map tuples of the foreground sort to a background sort. The inner universal block allows all forms of quantification — first-order variables, function variables, and relation variables of all possible types. The inner formula is quantifier-free. We will retrict our attention to sentences in this logic, i.e., we will assume that all variables (first-order/function/relation) are quantified. We will denote by (resp. ), the set of existentially (resp. universally) quantified first order variables of sort , for every .

The problem.

The problem we consider is that of deciding satisfiability of EQSMT sentences with background theories for the background sorts. First we introduce some concepts.

An uninterpreted combination of theories (UCT) over a pure signature, with as the set of sorts, is the union of theories , where each is a theory over signature . A sentence is -satisfiable if there is a multi-sorted structure that satisfies and all the sentences in .

The satisfiability problem for EQSMT with background theories is the following. Given a UCT and a sentence , determine if is -satisfiable. We show that this is a decidable problem, and furthermore, there is a decision procedure that uses a finite number of black-box calls to satisfiability solvers of the underlying theories to check satisfiability of EQSMT sentences.

For the rest of this paper, for technical convenience, we will assume that the boolean theory is one of the background theories. This means and the constants . The set of sentences in is . Note that checking satisfiability of a sentence over is decidable.

4 The Decision Procedure for Eqsmt 

In this section we present our decidability result for sentences over EQSMT in presence of background theories. Let us first state the main result of this paper.

Let be a pure signature with foreground sort and background sorts . Let be a UCT such that, for each , checking -satisfiability of first-order sentences is decidable. Then the problem of checking -satisfiability of EQSMT sentences is decidable.∎

We will prove the above theorem by showing that any given EQSMT sentence over a UCT signature can be transformed, using a sequence of satisfiability preserving transformation steps, to the satisfiability of first-order formulae over the individual theories.

We give a brief overview of the sequence of transformations (Steps through ). In Step 1, we replace the occurrence of every relation variable (quantified universally or existentially) of sort by a function variable of sort . Note that doing this for the outer existentially quantified relation variables keeps us within the syntactic fragment. In Step 2, we eliminate function variables that are existentially quantified. This crucially relies on the small model property for the foreground universe, similar to EPR [5]. This process, however, adds both existential first-order variables and universally quantified function variables. In Step 3, we eliminate the universally quantified function variables using a standard Ackermann reduction [28], which adds more universally quantified first-order variables.

The above steps result in a first-order sentence over the combined background theories, and the empty theory for the foreground sort. In Step 4, we show that the satisfiability of such a formula can be reduced to a finite number of satisfiability queries of sentences over individual theories.

Step 1 : Eliminating relation variables

The idea here is to introduce, for every relational variable (with type ), a function variable (with type

) that corresponds to the characteristic function of

.

Let be EQSMT formula over . We will transform to an EQSMT formula over the same signature . Every occurrence of an atom of the form in , is replaced by in . Further, every quantification is replaced by , where . Thus, the resultant formula has no relation variables. Further, it is a EQSMT formula, since the types of the newly introduced existentially quantified function variables are of the form . The correctness of the above transformation is captured by the following lemma.

is -satisfiable iff is -satisfiable.

Step 2: Eliminating existentially quantified function variables

We first note a small-model property with respect to the foreground sort for sentences. This property crucially relies on the fact that existentially quantified function variables do not have their ranges over the foreground sort.

[Small-model property for ] Let be an EQSMT sentence with foreground sort and background sorts . Let be the number of existentially quantified first-order variables of sort in . Then, is -satisfiable iff there is a structure , such that , and .

Proof (Sketch)..

We present the more interesting direction here. Consider a model such that and . Let be the interpretation function that extends so that , where is the inner universally quantified subformula of . Let be the restriction of the foreground universe to the interpretations of the variables . Clearly, .

Let us first show that . For this, first see that for every extension of with interpretations of all the universal FO variables, we must have have , where is the quantifier free part of (and thus also of ). Now, clearly must also hold for those extensions which map all universal variables in to the set and maps all universally quantified function variables of range sort to function interpretations whose ranges are limited to the set .

Thus, it must also be the case that when we restrict the universe to the set , we have that . This is because every universal extension of is also a projection of one of these interpretations. ∎

The proof of the above statement shows that if there is a model that satisfies (in Lemma 4), then there is a model that satisfies and in which the foreground universe contains only elements that are interpretations of the first-order variables over the foreground sort (and hence bounded). Consequently, instead of existentially quantifying over a function (of arity ) from the foreground sort to some background sort , we can instead quantify over first-order variables of sort that capture the image of these functions for each -ary combination of .

Let be the EQSMT sentence over obtained after eliminating relation variables. Let be the quantifier free part (also known as the matrix) of . Let be the set of all sub-terms of sort that occur in . Note that, also includes the set of variables and and . Now, define888This is a correction from the previous version of the paper [21].

Let the sentence obtained by replacing the matrix in , by . Then, the correctness of this transformation is noted below.

is -satisfiable iff is -satisfiable.

We now eliminate the existentially quantified function variables in , one by one. Let , where is a background sort. For every -tuple over the set , we introduce a variable of sort . Let be the set of all such variables, where is the number of existential first order variables of sort in . Next, we introduce a fresh function variable of sort , and quantify it universally. will be used to emulate the function . Let us define

where, and is obtained by replacing all occurrences of in by . Now define to be the sentence

The following lemma states the correctness guarantee of this transformation. is -satisfiable iff is -satisfiable.

Step 3 : Eliminating universal function variables

The recipe here is to perform Ackermann reduction [2] for every universally quantified function variable.

Let , where is the quantifier free part of , and let . For every term of the form in , we introduce a fresh first order variable of sort , and replace every occurrence of the term in with . Let be the resulting quantifier free formula. Let be the collection of all the newly introduced variables. Let us now define ). Here, where, . Then, the transformed formula is correct: is -satisfiable iff is -satisfiable.

Step-4: Decomposition and black box calls to Theory solvers

The EQSMT sentence obtained after the sequence of steps 1 through 3 is a first order sentence over . This sentence, however, may possibly contain occurrences of variables of the foreground sort . Intuitively, the objective of this step is to decompose into sentences, one for each sort, and then use decision procedures for the respective theories to decide satisfiability of the decomposed (single sorted) sentences. Since such a decomposition can result into sentences over the foreground sort, we must ensure that there is indeed a decision procedure to achieve this. For this purpose, let us define be the empty theory (that is ). Checking satisfiability of sentences over is decidable. Also, satisfiability is preserved in the presence of in the following sense.

is -satisfiable iff is -satisfiable.

We first transform the quantifier free part of into an equivalent CNF formula . Let be obtained by replacing by . Let , where and each is a disjunction of atoms. Since is a first order formula over a pure signature, all atoms are either of the form or (with possibly a leading negation). Now, equality atoms are restricted to terms of the same sort. Also since is pure, the argument terms of all relation applications have the same sort. This means, for every atom , there is a unique associated sort , which we will denote by .

For a clause in , let be the set of atoms in . Let , and let . Then, we have the identity . We now state our decomposition lemma.

is -satisfiable iff there is a mapping such that for each , the formula is -satisfiable.

Proof (Sketch)..

We present the more interesting direction here. Let be an equi-satisfiable Skolem norm form of . That is, , where is obtained from by replacing all existential variables by Skolem constants. We will use the same notation for the clause of . Then, consider a structure such that and . Now, suppose, on the contrary, that there is a clause such that for every sort , we have . This means, for every sort , there is a interpretation (that extends with valuations of ), such that either leads to falsity of or the clause . Let be the values assigned to the universal variables in . Then, construct an interpretation by extending with the variables interpreted with ’s . This interpretation can be shown to either violate one of the theory axioms or the formula . In either case, we have a contradiction. ∎

The contract above identifies, for each clause , one sort such that the restriction of to can be set to true. Thus, in order to decide satisfiability of , a straightforward decision procedure involves enumerating all contracts, . For each contract and for each sort , we construct the sentence , and make a black-box call to the theory solver for . If there is a contract for which each of these calls return ‘SATISFIABLE’, then (and thus, the original formula ) is satisfiable. Otherwise, is unsatisfiable.

5 Undecidability Results

The logic that we have defined was carefully chosen to avoid undecidability of the satisfiability problem. We now show that natural generalizations or removal of restrictions in our logic renders the satisfiability problem undecidable. We believe our results are hence not simple to generalize any further.

One restriction that we have is that the functions that are existentially quantified cannot have as their range sort. A related restriction is that the universal quantification block quantifies all uninterpreted function symbols, as otherwise they must be existentially quantified on the outside block.

Let us now consider the fragment of logic where formulas are of the form where in fact we do not even have any background theory. Since the formula is over a single sort, we have dropped the sort annotations on the variables. It is not hard to see that this logic is undecidable. Consider signature with a single sort (and no background sorts). The satisfiability problem for sentences of the following form is undecidable.

Proof (Sketch).

We can show this as a mild modification of standard proofs of the undecidability of first-order logic. We can existentially quantify over a variable and a function succ, demand that for any element , is not , and for every , if , then . This establishes an infinite model with distinct elements , for every . We can then proceed to encode the problem of non-halting of a 2-counter machine using a relation , which stands for the 2CM is in state at time with counters and , respectively. It is easy to see that all this can be done using only universal quantification (the relation can be modeled as a function easily). ∎

The theorem above has a simple proof, but the theorem is not new; in fact, even more restrictive logics are known to be undecidable (see [8]).

Based on Theorem 5, when the only sort we have is the foreground sort, quantifying functions existentially results in undecidability of the satisfiability problem. On the other hand, if we allow only relation symbols (over the foreground sort) to be existentially quantified, the satisfiability problem becomes decidable [5]. However, for certain theories different from the empty theory (such as Presburger arithmetic), existential quantification over relation symbols (without any quantified function symbols) can lead to undecidability, despite the decidability of checking satisfiability of the entire first order fragment. This is the content of Theorem 5 and sheds light on the importance of the restriction that the foreground sort is disjoint from the various background sorts in EQSMT . Consider a signature with a single sort and let be the theory of Presburger arithmetic. The satisfiability problem is decidable for sentences of the form

Proof (Sketch).

We can use a similar proof as the theorem above, except now that we use the successor function available in Presburger arithmetic. We can again reduce non-halting of Turing machines (or 2-counter machines) to satisfiability of such formulas. ∎

Stepping further back, there are very few subclasses of first-order logic with equality that have a decidable satisfiability problem, and the only standard class that admits prefixes is the Bernays-Schönfinkel-Ramsey class (see [5]). Our results can be seen as an extension of this class with background theories, where the background theories admit locally a decidable satisfiability problem for the fragment.

6 Applications to Synthesis

6.1 Synthesis: Validity or Satisfiability?

Though we argued in Section 2 that synthesis problems can be modeled using satisfiability of EQSMT sentences, there is one subtlety that we would like to highlight. In synthesis problems, we are asked to find an expression such that the expression satisfies a specification expressed as a formula in some logic. Assuming the specification is modeled as a universally quantified formula over background theories, we would like to know if holds for the synthesized expression . However, in a logical setting, we have to qualify what “holds” means; the most natural way of phrasing this is that is valid over the underlying background theories, i.e., holds in all models that satisfy the background theories. However, the existential block that models the existence of an expression is clearly best seen as a satisfiability problem, as it asks whether there is some foreground model that captures an expression. Requiring that it holds in all foreground models (including those that might have only one element) would be unreasonable.

To summarize, the synthesis problem is most naturally modeled as a logical problem where we ask whether there is some foreground model (capturing a program expression) such that all background models, that satisfy their respective background theories, also satisfy the quantifier free formula expressing that the synthesized expression satisfies the specification. This is, strictly speaking, neither a satisfiability problem nor a validity problem!

We resolve this by considering only complete and consistent background theories. Hence validity of a formula under a background theory is equivalent to -satisfiability. Consequently, synthesis problems using such theories can be seen as asking whether there is a foreground universe (modeling the expression to be synthesized) and some background models where the specification holds for the expression. We can hence model synthesis purely as a satisfiability problem of EQSMT, as described in Section 2.

Many of the background theories used in verification/synthesis and SMT solvers are complete theories (like Presburger arithmetic, FOL over reals, etc.). One incomplete theory often used in verification is the theory of uninterpreted functions. However, in this case, notice that since the functions over this sort are uninterpreted, validity of formulas can be modeled using a universal quantification over functions, which is supported in EQSMT ! The only other adjustment is to ensure that this background theory has only infinite models (we can choose this background theory to be the theory of , which has a decidable satisfiability problem). Various scenarios such as modeling pointers in heaps, arrays, etc., can be naturally formulated using uninterpreted functions over this domain.

The second issue in modeling synthesis problems as satisfiability problems for EQSMT is that in synthesis, we need to construct the expression, rather than just know one exists. It is easy to see that if the individual background theory solvers support finding concrete values for the existentially quantified variables, then we can pull back these values across our reductions to give the values of the existentially quantified first-order variables (over all sorts), the existentially quantified function variables as well as the existentially quantified relation variables, from which the expression to be synthesized can be constructed.

6.2 Evaluation

We illustrate the applicability of our result for solving synthesis problems.

Synthesis of recursive programs involving lists.

We model the problem of synthesizing recursive programs with lists, that will meet a pre/post contract assuming that recursive calls on smaller data-structures satisfy the same contract . Though the programs we seek are recursive, we can model certain classes of programs using straight-line programs.

To see this, let us take the example of synthesizing a program that finds a particular key in a linked list (list-find). We can instead ask whether there is a straight-line program which takes an additional input which models the return value of a possible recursive call made on the tail of the list. The straight-line program must then work on the head of the list and this additional input (which is assumed to satisfy the contract ) to produce an output that meets the same contract .

For this problem, we modeled the program to be synthesized using existential quantification (over a grammar that generates bounded length programs) as described in Section 2. The pointer next and recursive data structures list, lseg in the verification condition were modeled using universal quantification over function variables and relation variables, respectively. Moreover, in order to have a tractable verification condition, we used the technique of natural proofs [20, 26, 29] that soundly formulates the condition in a decidable theory. We used z3 [12] to ackermanize the universally quantified functions/relations (lseg, list and next). We encoded the resulting formula as a synthesis problem in the SyGuS format [4] and used an off-the-shelf enumerative counter-example guided synthesis (CEGIS) solver. A program was synthesized within s, which was manually verified to be correct.

We also encoded other problems involving lists : list-length (calculating the length of a list), list-sum (computing sum of the keys in a list), list-sorted (checking if the sequence of keys in the list is sorted) and list-count-occurrence (counting the number of occurrences of a key in the list), using a CEGIS solver, and report the running times and the number of programs explored in Table 1.

We are convinced that EQSMT can handle recursive program synthesis (of bounded size) against separation logics specifications expressed using natural proofs (as in [26]).

Synthesis of straight-line programs equivalent to given recursive programs.

In the second class of examples, we turn to synthesizing straight-line programs given a recursive function as their specification. For example, consider Knuth’s generalization of the recursive McCarthy 91 function:

for every integer , and where . For the usual McCarthy function, we have , , , and .

   Program # Programs Explored Time(s) 
  in SyGuS  
   list-find 5k  
   list-length 40k  
   list-sum 160k  
   list-sorted 206k  
   list-count-occurrence 1.3 million  
   Knuth : , , , -  
   Knuth : ,  , , -  
   Knuth : ,  , , - 27 
   Knuth : ,  , , - 49 
   Knuth : ,  , , - 224 
   Takeuchi - 100 
 
Table 1: Synthesis of list programs and recursive programs.

Consider the problem of synthesizing an equivalent recursion-free expression. The programs we consider may have if-then-else statements of nesting depth 2, with conditionals over linear expressions having unbounded constants. Existential quantification over the background arithmetic sort allowed us to model synthesizing these unbounded constants. Our specification demanded that the value of the expression for satisfy the recursive equations given above.

We modeled the foreground sort inside arithmetic, and converted our synthesis problem to a first-order sentence over Presburger arithmetic and Booleans. We experimented with several values for (with ), and interestingly, solutions were synthesized only when . Given Knuth’s result that a closed form expression involves taking remainder modulo this expression (and since we did not have the modulo operation in our syntax), it turns out that simple expressions do not exist otherwise. Also, whenever the solution was found, it matched the recursion-free expression given by Knuth (see Theorem 1 in [19]). In Table 1, we provide the running times of our implementation on various parameters. We also compared our implementation with the popular synthesis tool Sketch [34] on these examples. For the purpose of comparison, we used the same template for both Sketch and our implementation. Further, since Sketch does not allow encoding integers with unbounded size (unlike our encoding in integer arithmetic), we represented these constants, to be synthesized, using bitvectors of size . Sketch does not return an answer within the set time-limit of 10 minutes for most of these programs.

We also modeled the Tak function (by Takeuchi) given by the specification below.

Our implementation synthesized the program in about 100s.

7 Related Work

There are several logics known in the literature that can express synthesis problems and are decidable. The foremost example is the monadic second-order theory over trees, which can express Church’s synthesis problem [10] and other reactive synthesis problems over finite data domains, and its decidability (Rabin’s theorem [31]) is one of the most celebrated theorems in logic that is applicable to computer science. Reactive synthesis has been well studied and applied in computer science (see, for example, [7]). The work reported in [22] is a tad closer to program synthesis as done today, as it synthesizes syntactically restricted programs with recursion that work on finite domains.

Caulfield et al [11] have considered the decidability of syntax-guided synthesis (SyGuS) problems, where the synthesized expressions are constrained to belong to a grammar (with operators that have the usual semantics axiomatized by a standard theory such as arithmetic) that satisfy a universally quantified constraint. They show that the problem is undecidable in many cases, but identify a class that asks for expressions satisfying a regular grammar with uninterpreted function theory constraints to be decidable.

The fragment of pure predicate logic (without function symbols) was shown to be decidable by Bernays and Schönfinkel (without equality) and by Ramsey (with equality) [5], and is often called Effectively Propositional Reasoning (EPR) class. It is one of the few fragments of first-order logic known to be decidable. The EPR class has been used in program verification [16, 25], and efficient SMT solvers supporting EPR have been developed [27].

The work by [1] extends EPR to stratified typed logics, which has some similarity with our restriction that the universes of the foreground and background be disjoint. However, the logic therein does not allow background SMT theories unlike ours and restricts the communication between universally and existentially quantified variables via equality between existential variables and terms with universally quantified variables as arguments. In [15], EPR with simple linear arithmetic (without addition) is shown to be decidable.

Theory extensions [33] and model theoretic and syntactic restrictions theoreof [32] have been explored to devise decidable fragment for quantified fragments of first order logic. Here, reasoning in local

theory extensions of a base theory can be reduced to the reasoning in the base theory (possibly with an additional quantification). Combination of theories which are extensions of a common base theory can similarly be handled by reducing the reasoning to a decidable base theory. Similar ideas have been employed in the context of combinations of linear arithmetic and the theory of uninterpreted functions with applications to construct interpolants 

[18] and invariants [6] for program verification. EQSMT does not require the background theories to be extensions of a common base theory.

Verification of programs with arrays and heaps can be modeled using second order quantification over the arrays/heaps and quantifier alternation over the elements of the array/heaps which belong to the theory of Presburger arithmetic. While such a logic is, in general, undecidable, careful syntactic restrictions such as limiting quantifier alternation [9] and flatness restrictions [3]. We do not restrict the syntax of our formulae, but ensure decidability via careful sort restrictions. A recent paper [20] develops sound and complete reasoning for a so-called safe FO fragment of an uninterpreted combination of theories. However, the logic is undecidable, in general, and also does not support second-order quantification.

The SyGuS format has recently been proposed as a language to express syntax guided synthesis problems, and there have been several synthesis engines developed for various tracks of SyGuS [4]. However, the syntax typically allows unbounded programs, and hence the synthesis problem is not decidable. In [13], the candidate program components are ‘decorated’ with annotations that represent transformers of the components in a sound abstract domain. This reduces the synthesis problem to the search for a proof in the abstract domain.

When expressing synthesis problems for programs that manipulate heaps, we rely on natural-proofs style sound abstraction of the verification conditions. Natural synthesis [30] extends this idea to an inductive synthesis procedure.

8 Conclusions and Future Work

The logic EQSMT defined herein is meant to be a decidable logic for communication between researchers modeling program synthesis problems and researchers developing efficient logic solvers. Such liaisons have been extremely fruitful in verification, where SMT solvers have served this purpose. We have shown the logic to be decidable and its efficacy in modeling synthesis problems. However, the decision procedure has several costs that should not be paid up front in any practical synthesis tool. Ways to curb such costs are known in the literature of building efficient synthesis tools. In particular, searching for foreground models is similar to EPR where efficient engines have been developed [27], and the search can also be guided by CEGIS-like approaches [4]. And the exponential blow-up caused by guessing contracts between solvers (in Step 4 of our procedure) is similar to arrangements agreed upon by theories combined using the Nelson-Oppen method, again for which efficient solvers have been developed. Our hope is that researchers working on logic engines will engineer an efficient decision procedure for EQSMT that can solve synthesis problems.

References

  • [1]

    Abadi, A., Rabinovich, A., Sagiv, M.: Decidable fragments of many-sorted logic. In: Proceedings of the 14th International Conference on Logic for Programming, Artificial Intelligence and Reasoning. pp. 17–31. LPAR’07, Springer-Verlag, Berlin, Heidelberg (2007),

    http://dl.acm.org/citation.cfm?id=1779419.1779423
  • [2] Ackermann, W., Ackermann, F.W.: Solvable cases of the decision problem (1954)
  • [3]

    Alberti, F., Ghilardi, S., Sharygina, N.: Decision procedures for flat array properties. Journal of Automated Reasoning 54(4), 327–352 (Apr 2015),

    https://doi.org/10.1007/s10817-015-9323-7
  • [4] Alur, R., Bodík, R., Dallal, E., Fisman, D., Garg, P., Juniwal, G., Kress-Gazit, H., Madhusudan, P., Martin, M.M.K., Raghothaman, M., Saha, S., Seshia, S.A., Singh, R., Solar-Lezama, A., Torlak, E., Udupa, A.: Syntax-guided synthesis. In: Dependable Software Systems Engineering, pp. 1–25. IOS Press (2015), https://doi.org/10.3233/978-1-61499-495-4-1
  • [5] Bernays, P., Schönfinkel, M.: Zum entscheidungsproblem der mathematischen logik. Mathematische Annalen (1928)
  • [6] Beyer, D., Henzinger, T.A., Majumdar, R., Rybalchenko, A.: Invariant synthesis for combined theories. In: Cook, B., Podelski, A. (eds.) Verification, Model Checking, and Abstract Interpretation. pp. 378–394. Springer Berlin Heidelberg, Berlin, Heidelberg (2007)
  • [7] Bloem, R., Galler, S., Jobstmann, B., Piterman, N., Pnueli, A., Weiglhofer, M.: Interactive presentation: Automatic hardware synthesis from specifications: A case study. In: Proceedings of the Conference on Design, Automation and Test in Europe. DATE ’07 (2007)
  • [8] Börger, E., Grädel, E., Gurevich, Y.: The classical decision problem. Springer Science & Business Media (2001)
  • [9] Bradley, A.R., Manna, Z., Sipma, H.B.: What’s decidable about arrays? In: Emerson, E.A., Namjoshi, K.S. (eds.) Verification, Model Checking, and Abstract Interpretation. pp. 427–442. Springer Berlin Heidelberg, Berlin, Heidelberg (2006)
  • [10] Buchi, J.R., Landweber, L.H.: Solving sequential conditions by finite-state strategies. Transactions of the American Mathematical Society (1969)
  • [11] Caulfield, B., Rabe, M.N., Seshia, S.A., Tripakis, S.: What’s decidable about syntax-guided synthesis? CoRR abs/1510.08393 (2015)
  • [12] De Moura, L., Bjørner, N.: Z3: An efficient smt solver. In: TACAS (2008)
  • [13] Gascón, A., Tiwari, A., Carmer, B., Mathur, U.: Look for the proof to find the program: Decorated-component-based program synthesis. In: Computer Aided Verification (2017)
  • [14] Gulwani, S.: Dimensions in program synthesis. In: Proceedings of the 12th International ACM SIGPLAN Symposium on Principles and Practice of Declarative Programming. pp. 13–24. PPDP ’10, ACM, New York, NY, USA (2010), http://doi.acm.org/10.1145/1836089.1836091
  • [15] Horbach, M., Voigt, M., Weidenbach, C.: On the combination of the Bernays–Schönfinkel–Ramsey fragment with simple linear integer arithmetic. In: Proceedings of the International Conference on Automated Deduction. pp. 202–219 (2017)
  • [16] Itzhaky, S., Banerjee, A., Immerman, N., Nanevski, A., Sagiv, M.: Effectively-propositional reasoning about reachability in linked data structures. In: International Conference on Computer Aided Verification (2013)
  • [17] Jha, S., Gulwani, S., Seshia, S.A., Tiwari, A.: Oracle-guided component-based program synthesis. In: Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering-Volume 1. pp. 215–224. ACM (2010)
  • [18] Kapur, D., Majumdar, R., Zarba, C.G.: Interpolation for data structures. In: Proceedings of the 14th ACM SIGSOFT International Symposium on Foundations of Software Engineering. pp. 105–116. SIGSOFT ’06/FSE-14, ACM, New York, NY, USA (2006), http://doi.acm.org/10.1145/1181775.1181789
  • [19] Knuth, D.E.: Textbook examples of recursion. Artificial Intelligence and Mathematical Theory of Computation: Papers in Honor of John McCarthy (1991)
  • [20] Löding, C., Madhusudan, P., Peña, L.: Foundations for natural proofs and quantifier instantiation. Proc. ACM Program. Lang. 2(POPL), 10:1–10:30 (Dec 2017), http://doi.acm.org/10.1145/3158098
  • [21] Madhusudan, P., Mathur, U., Saha, S., Viswanathan, M.: A Decidable Fragment of Second Order Logic With Applications to Synthesis. In: Ghica, D., Jung, A. (eds.) 27th EACSL Annual Conference on Computer Science Logic (CSL 2018). Leibniz International Proceedings in Informatics (LIPIcs), vol. 119, pp. 31:1–31:19. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, Dagstuhl, Germany (2018), http://drops.dagstuhl.de/opus/volltexte/2018/9698
  • [22] Madhusudan, P.: Synthesizing Reactive Programs. In: Computer Science Logic (CSL’11) - 25th International Workshop/20th Annual Conference of the EACSL (2011)
  • [23] Manna, Z., McCarthy, J.: Properties of programs and partial function logic. Tech. rep. (1969)
  • [24] Nelson, G., Oppen, D.C.: Simplification by cooperating decision procedures. ACM Transactions on Programming Languages and Systems (TOPLAS) 1(2), 245–257 (1979)
  • [25] Padon, O., McMillan, K.L., Panda, A., Sagiv, M., Shoham, S.: Ivy: safety verification by interactive generalization. ACM SIGPLAN Notices (2016)
  • [26] Pek, E., Qiu, X., Madhusudan, P.: Natural proofs for data structure manipulation in c using separation logic. In: ACM SIGPLAN Notices (2014)
  • [27] Piskac, R., de Moura, L., Bjørner, N.: Deciding effectively propositional logic with equality. Tech. rep., Technical Report MSR-TR-2008-181, Microsoft Research (2008)
  • [28] Pnueli, A., Rodeh, Y., Strichman, O., Siegel, M.: The small model property: How small can it be? Information and computation (2002)
  • [29] Qiu, X., Garg, P., Ştefănescu, A., Madhusudan, P.: Natural proofs for structure, data, and separation. ACM SIGPLAN Notices (2013)
  • [30] Qiu, X., Solar-Lezama, A.: Natural synthesis of provably-correct data-structure manipulations. Proc. ACM Program. Lang. 1(OOPSLA), 65:1–65:28 (Oct 2017), http://doi.acm.org/10.1145/3133889
  • [31] Rabin, M.O.: Decidability of second-order theories and automata on infinite trees. Transactions of the american Mathematical Society (1969)
  • [32] Sofronie-Stokkermans, V.: Hierarchic reasoning in local theory extensions. In: Nieuwenhuis, R. (ed.) Automated Deduction – CADE-20. pp. 219–234. Springer Berlin Heidelberg, Berlin, Heidelberg (2005)
  • [33] Sofronie-Stokkermans, V.: On combinations of local theory extensions. In: Programming Logics: Essays in Memory of Harald Ganzinger. pp. 392–413 (2013)
  • [34] Solar-Lezama, A., Tancau, L., Bodik, R., Seshia, S., Saraswat, V.: Combinatorial sketching for finite programs. ACM SIGOPS Operating Systems Review (2006)

Appendix A Encoding in Eqsmt 

We are interested in synthesizing a straight line program that implements the function , and can be expressed as a term over the grammar in Figure 0(a).

Let us see how to encode this synthesis problem in EQSMT. First, let us fix the maximum height of the term we are looking for, say to be 2. Then, the program we want to synthesize can be represented as a tree of height at most such that every node in the tree can have child nodes (because the maximum arity of any function in the above grammar is , corresponding to ite). A skeleton of such a expression tree is shown in Figure 0(b). Every node in the tree is named according to its path from the root node.

The synthesis problem can then be encoded as the formula

(3)

Here, the nodes are elements of the foreground sort . The binary relations over the foreground sort will be used to assert that a node is the left,middle, right child respectively of node : , , . The operators or labels for nodes belong to the background sort , and can be one of , , , , , INPUT (denoting the input to our program), or constants