Constrained Horn Clauses (CHCs) constitute a fragment of the first order predicate calculus, where the Horn clause format is extended by allowing constraints on specific domains to occur in clause premises. CHCs have gained popularity as a suitable logical formalism for automatic program verification. Indeed, many verification problems can be reduced to the satisfiability problem for CHCs.
Satisfiability of CHCs is a particular case of Satisfiability Modulo Theories (SMT), understood here as the general problem of determining the satisfiability of (possibly quantified) first order formulas where the interpretation of some function and predicate symbols is defined in a given constraint (or background) theory. Recent advances in the field have led to the development of a number of very powerful SMT (and, in particular, CHC) solvers, which aim at solving satisfiability problems with respect to a large variety of constraint theories. Among SMT solvers, we would like to mention CVC4, MathSAT, and Z3, and among solvers with specialized engines for CHCs, we recall Eldarica, HSF, RAHFT, and Spacer.
Even if SMT algorithms for unrestricted first order formulas suffer from incompleteness limitations due to general undecidability results, most of the above mentioned tools work well in practice when acting on constraint theories, such as Booleans, Uninterpreted Function Symbols, Linear Integer or Real Arithmetic, Bit Vectors, and Arrays. However, when formulas contain universally quantified variables ranging over inductively defined algebraic data types (ADTs), such as lists and trees, the SMT/CHC solvers often perform very poorly, as they do not incorporate induction principles relative to the ADT in use.
To mitigate this difficulty, some SMT/CHC solvers have been enhanced by incorporating appropriate induction principles [38, 43, 44], similarly to what has been done in automated theorem provers. The most creative step needed when extending SMT solving with induction is the generation of the auxiliary lemmas that are required for proving the main conjecture.
An alternative approach, proposed in the context of CHCs, consists in transforming a given set of clauses into a new set: (i) where all ADT terms are removed (without introducing new function symbols), and (ii) whose satisfiability implies the satisfiability of the original set of clauses. This approach has the advantage of separating the concern of dealing with ADTs (at transformation time) from the concern of dealing with simpler, non-inductive constraint theories (at solving time), thus avoiding the complex interaction between inductive reasoning and constraint solving. It has been shown that the transformational approach compares well with induction-based tools in the case where lemmas are not needed in the proofs. However, in some satisfiability problems, if suitable lemmas are not provided, the transformation fails to remove the ADT terms.
The main contributions of this paper are as follows.
(1) We extend the transformational approach by proposing a new rule, called differential replacement, based on the introduction of suitable difference predicates, which play a role similar to that of lemmas in inductive proofs. We prove that the combined use of the fold/unfold transformation rules and the differential replacement rule is sound, that is, if the transformed set of clauses is satisfiable, then the original set of clauses is satisfiable.
(2) We develop a transformation algorithm that removes ADTs from CHCs by applying the fold/unfold and the differential replacement rules in a fully automated way.
(3) Due to the undecidability of the satisfiability problem for CHCs, our technique for ADT removal is incomplete. Thus, we evaluate its effectiveness from an experimental point of view, and in particular we discuss the results obtained by the implementation of our technique in a tool, called AdtRem. We consider a set of CHC satisfiability problems on ADTs taken from various benchmarks which are used for evaluating inductive theorem provers. The experiments show that AdtRem is competitive with respect to Reynolds and Kuncak’s tool that augments the CVC4 solver with inductive reasoning.
The paper is structured as follows. In Section 2 we present an introductory, motivating example. In Section 3 we recall some basic notions about CHCs. In Section 4 we introduce the rules used in our transformation technique and, in particular, the novel differential replacement rule, and we show their soundness. In Section 5 we present a transformation algorithm that uses the transformation rules for removing ADTs from sets of CHCs. In Section 6 we illustrate the AdtRem tool and we present the experimental results we have obtained. Finally, in Section 7 we discuss the related work and make some concluding remarks.
2 A Motivating Example
Let us consider the following functional program, which we write using the OCaml syntax:
type list = Nil | Cons of int * list;;
let rec append l ys = match l with
  | Nil -> ys
  | Cons(x,xs) -> Cons(x,(append xs ys));;
let rec rev l = match l with
  | Nil -> Nil
  | Cons(x,xs) -> append (rev xs) (Cons(x,Nil));;
let rec len l = match l with
  | Nil -> 0
  | Cons(x,xs) -> 1 + len xs;;
The functions append, rev, and len compute list concatenation, list reversal, and list length, respectively. Suppose we want to prove the following property:
∀ xs,ys. len (rev (append xs ys)) = (len xs) + (len ys)
Inductive theorem provers construct a proof of this property by induction on the structure of the list, assuming knowledge of the following lemma:
∀ x,l. len (append l (Cons(x,Nil))) = (len l) + 1
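As a quick sanity check of the property and the lemma, the following is a minimal sketch that transcribes the three functions into Python and tests both statements on all short lists over a two-element alphabet. The helper names (`length`, `lists_up_to`, `property_holds`, `lemma_holds`) are ours and are not part of the paper; exhaustive testing on small inputs is of course not a proof.

```python
from itertools import product

# Python transcription of the OCaml program; ADT lists are modelled as Python lists.
def append(l, ys):
    return ys if not l else [l[0]] + append(l[1:], ys)

def rev(l):
    return [] if not l else append(rev(l[1:]), [l[0]])

def length(l):  # plays the role of len in the OCaml program
    return 0 if not l else 1 + length(l[1:])

def lists_up_to(max_len, values=(0, 1)):
    """Enumerate all lists over `values` of length at most max_len (hypothetical helper)."""
    for n in range(max_len + 1):
        for t in product(values, repeat=n):
            yield list(t)

def property_holds(xs, ys):
    return length(rev(append(xs, ys))) == length(xs) + length(ys)

def lemma_holds(x, l):
    return length(append(l, [x])) == length(l) + 1

SAMPLES = list(lists_up_to(3))
all_property = all(property_holds(xs, ys) for xs in SAMPLES for ys in SAMPLES)
all_lemma = all(lemma_holds(x, l) for x in (0, 1) for l in SAMPLES)
```

Both flags evaluate to true on these samples, which is consistent with the property but says nothing about lists of arbitrary length; that is exactly the gap that induction, or the transformation presented in this paper, has to close.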
The approach we follow in this paper avoids both the explicit use of induction principles and the knowledge of ad hoc lemmas. First, we consider the translation of the property above into a set of constrained Horn clauses [10, 43], as follows. (In the examples, we use Prolog syntax for writing clauses, instead of the more verbose SMT-LIB syntax. The predicates \= (different from), = (equal to), < (less-than), >= (greater-than-or-equal-to) denote constraints between integers. The last argument of a Prolog predicate stores the value of the corresponding function.)
1. false :- N2\=N0+N1, append(Xs,Ys,Zs), rev(Zs,Rs), len(Xs,N0), len(Ys,N1), len(Rs,N2).
2. append([],Ys,Ys).
3. append([X|Xs],Ys,[X|Zs]) :- append(Xs,Ys,Zs).
4. rev([],[]).
5. rev([X|Xs],Rs) :- rev(Xs,Ts), append(Ts,[X],Rs).
6. len([],N) :- N=0.
7. len([X|Xs],N1) :- N1=N0+1, len(Xs,N0).
The set of clauses 1–7 is satisfiable if and only if the property above holds. However, state-of-the-art CHC solvers, such as Z3 or Eldarica, fail to prove the satisfiability of clauses 1–7, because those solvers do not incorporate any induction principle on lists. Moreover, some tools that extend SMT solvers with induction [38, 43] fail on this particular example because they are not able to introduce the lemma above.
To overcome this difficulty, we apply the transformational approach based on the fold/unfold rules, whose objective is to transform a given set of clauses into a new set without occurrences of list variables, whose satisfiability can be checked by using CHC solvers based on the theory of Linear Integer Arithmetic only. The soundness of the transformation rules ensures that the satisfiability of the transformed clauses implies the satisfiability of the original ones. We apply the Elimination Algorithm as follows. First, we introduce a new clause:
8. new1(N0,N1,N2) :- append(Xs,Ys,Zs), rev(Zs,Rs), len(Xs,N0), len(Ys,N1), len(Rs,N2).
whose body is made out of the atoms of clause 1 which have at least one list variable, and whose head arguments are the integer variables of the body. By folding, from clause 1 we derive a new clause without occurrences of lists:
9. false :- N2\=N0+N1, new1(N0,N1,N2).
We proceed by eliminating lists from clause 8. By unfolding clause 8, we replace some predicate calls by their definitions and we derive the two new clauses:
10. new1(N0,N1,N2) :- N0=0, rev(Zs,Rs), len(Zs,N1), len(Rs,N2).
11. new1(N01,N1,N21) :- N01=N0+1, append(Xs,Ys,Zs), rev(Zs,Rs), len(Xs,N0), len(Ys,N1), append(Rs,[X],R1s), len(R1s,N21).
We would like to fold clause 11 using clause 8, so as to derive a recursive definition of new1 without lists. Unfortunately, this folding step cannot be performed because the body of clause 11 does not contain a variant of the body of clause 8, and hence the Elimination Algorithm fails to eliminate lists in this example.
Thus, now we depart from the Elimination Algorithm and we continue our derivation by observing that the body of clause 11 contains the subconjunction ‘append(Xs,Ys,Zs), rev(Zs,Rs), len(Xs,N0), len(Ys,N1)’ of the body of clause 8. Then, in order to find a variant of the whole body of clause 8, we may replace in clause 11 the remaining subconjunction ‘append(Rs,[X],R1s), len(R1s,N21)’ by the new subconjunction ‘len(Rs,N2), diff(N2,X,N21)’, where diff is a predicate, called difference predicate, defined as follows:
12. diff(N2,X,N21) :- append(Rs,[X],R1s), len(R1s,N21), len(Rs,N2).
From clause 11, by performing that replacement, we derive the following clause:
13. new1(N01,N1,N21) :- N01=N0+1, append(Xs,Ys,Zs), rev(Zs,Rs), len(Xs,N0), len(Ys,N1), len(Rs,N2), diff(N2,X,N21).
Now, we can fold clause 13 using clause 8 and we derive a new clause without list arguments:
14. new1(N01,N1,N21) :- N01=N0+1, new1(N0,N1,N2), diff(N2,X,N21).
At this point, we are left with the task of removing list arguments from clauses 10 and 12. As the reader may verify, this can be done by applying the Elimination Algorithm without the need of introducing additional difference predicates. By doing so, we get the following final set of clauses without list arguments:
false :- N2\=N0+N1, new1(N0,N1,N2).
new1(N0,N1,N2) :- N0=0, new2(N1,N2).
new1(N0,N1,N2) :- N0=N+1, new1(N,N1,M), diff(M,X,N2).
new2(M,N) :- M=0, N=0.
new2(M1,N1) :- M1=M+1, new2(M,N), diff(N,X,N1).
diff(N0,X,N1) :- N0=0, N1=1.
diff(N0,X,N1) :- N0=N+1, N1=M+1, diff(N,X,M).
The Eldarica CHC solver proves the satisfiability of this set of clauses by computing the following model (here we use a Prolog-like syntax):
new1(N0,N1,N2) :- N2=N0+N1, N0>=0, N1>=0, N2>=0.
new2(M,N) :- M=N, M>=0, N>=0.
diff(N,X,M) :- M=N+1, N>=0.
Finally, we note that if in clause 12 we replace the atom diff(N2,X,N21) by its model computed by Eldarica, namely the constraint ‘N21=N2+1, N2>=0’, we get exactly the CHC translation of the lemma on len and append stated above. Thus, in some cases, the introduction of the difference predicates can be viewed as a way of automatically introducing the lemmas needed when performing inductive proofs.
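To illustrate what it means for the interpretation computed by the solver to be a model of the final clauses, here is a small sketch that checks, by brute force over a bounded range of integers, that every clause is satisfied when new1, new2, and diff are read as the three constraints reported above. The range and the function `violations` are our own choices for illustration; a bounded check is only evidence, whereas the solver establishes the result for all integers.

```python
from itertools import product

# The model reported in the text, read as Python predicates.
def new1(n0, n1, n2):
    return n2 == n0 + n1 and n0 >= 0 and n1 >= 0 and n2 >= 0

def new2(m, n):
    return m == n and m >= 0 and n >= 0

def diff(n, x, m):
    return m == n + 1 and n >= 0

R = range(-2, 5)   # bounded search space (an assumption of this sketch)
X = 0              # the argument X is never constrained, so one value suffices

def violations():
    """Return the clauses whose body holds but whose head fails, within R."""
    bad = []
    for n0, n1, n2 in product(R, repeat=3):
        # false :- N2\=N0+N1, new1(N0,N1,N2).
        if n2 != n0 + n1 and new1(n0, n1, n2):
            bad.append(("goal", n0, n1, n2))
        # new1(N0,N1,N2) :- N0=0, new2(N1,N2).
        if n0 == 0 and new2(n1, n2) and not new1(n0, n1, n2):
            bad.append(("new1-base", n0, n1, n2))
        # new1(N0,N1,N2) :- N0=N+1, new1(N,N1,M), diff(M,X,N2).
        for m in R:
            if new1(n0 - 1, n1, m) and diff(m, X, n2) and not new1(n0, n1, n2):
                bad.append(("new1-rec", n0, n1, n2))
    for a, b in product(R, repeat=2):
        # new2(M,N) :- M=0, N=0.
        if a == 0 and b == 0 and not new2(a, b):
            bad.append(("new2-base", a, b))
        # new2(M1,N1) :- M1=M+1, new2(M,N), diff(N,X,N1).
        for n in R:
            if new2(a - 1, n) and diff(n, X, b) and not new2(a, b):
                bad.append(("new2-rec", a, b))
        # diff(N0,X,N1) :- N0=0, N1=1.
        if a == 0 and b == 1 and not diff(a, X, b):
            bad.append(("diff-base", a, b))
        # diff(N0,X,N1) :- N0=N+1, N1=M+1, diff(N,X,M).
        if diff(a - 1, X, b - 1) and not diff(a, X, b):
            bad.append(("diff-rec", a, b))
    return bad
```

Calling `violations()` returns an empty list on this range, in agreement with the claim that the three constraints form a model of the transformed clauses.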
3 Constrained Horn Clauses
Let LIA be the theory of linear integer arithmetic and Bool be the theory of boolean values. A constraint is a quantifier-free formula of $\mathit{LIA} \cup \mathit{Bool}$. Let $\mathcal{C}$ denote the set of all constraints. Let $\mathcal{L}$ be a typed first order language with equality which includes the language of $\mathit{LIA} \cup \mathit{Bool}$. Let Pred be a set of predicate symbols in $\mathcal{L}$ not occurring in the language of $\mathit{LIA} \cup \mathit{Bool}$.
The integer and boolean types are said to be the basic types. For reasons of simplicity we do not consider any other basic types, such as real numbers, arrays, and bit-vectors, which are usually supported by SMT solvers [1, 13, 21]. We assume that all non-basic types are specified by suitable data-type declarations (such as the declare-datatypes declarations adopted by SMT solvers), and are collectively called algebraic data types (ADTs).
An atom is a formula of the form $p(t_1,\ldots,t_m)$, where $p$ is a typed predicate symbol in Pred, and $t_1,\ldots,t_m$ are typed terms constructed out of individual variables, individual constants, and function symbols. A constrained Horn clause (or simply, a clause, or a CHC) is an implication of the form
$A \leftarrow c, B$
(for clauses we use the logic programming notation, where comma denotes conjunction). The conclusion (or head) $A$ is either an atom or false, the premise (or body) is the conjunction of a constraint $c \in \mathcal{C}$ and a (possibly empty) conjunction $B$ of atoms. If $A$ is an atom of the form $p(t_1,\ldots,t_n)$, the predicate $p$ is said to be a head predicate. A clause whose head is an atom is called a definite clause, and a clause whose head is false is called a goal.
We assume that all variables in a clause are universally quantified in front, and thus we can freely rename those variables. Clause $C$ is said to be a variant of clause $D$ if $D$ can be obtained from $C$ by renaming variables and rearranging the order of the atoms in its body. Given a term $t$, by $\mathit{vars}(t)$ we denote the set of all variables occurring in $t$. Similarly for the set of all free variables occurring in a formula. Given a formula $\varphi$ in $\mathcal{L}$, we denote by $\forall(\varphi)$ its universal closure.
Let $\mathbb{D}$ be the usual interpretation for the symbols in $\mathit{LIA} \cup \mathit{Bool}$, and let a $\mathbb{D}$-interpretation be an interpretation of $\mathcal{L}$ that, for all symbols occurring in $\mathit{LIA} \cup \mathit{Bool}$, agrees with $\mathbb{D}$.
A set $P$ of CHCs is satisfiable if it has a $\mathbb{D}$-model and it is unsatisfiable, otherwise. Given two $\mathbb{D}$-interpretations $I$ and $J$, we say that $I$ is included in $J$ if for all ground atoms $A$, $I \models A$ implies $J \models A$. Every set $P$ of definite clauses is satisfiable and has a least (with respect to inclusion) $\mathbb{D}$-model, denoted $M(P)$. If $P$ is any set of constrained Horn clauses and $Q$ is the set of the goals in $P$, then we define $\mathit{Definite}(P) = P \setminus Q$. We have that $P$ is satisfiable if and only if $M(\mathit{Definite}(P)) \models Q$.
We will often use tuples of variables as arguments of predicates and write $p(X,Y)$, instead of $p(X_1,\ldots,X_m,Y_1,\ldots,Y_n)$, whenever the values of $m$ and $n$ are not relevant. Whenever the order of the variables is not relevant, we will feel free to identify tuples of distinct variables with finite sets, and we will extend to finite tuples the usual operations and relations on finite sets. Given two tuples $X$ and $Y$ of distinct elements, (i) their union $X \cup Y$ is obtained by concatenating them and removing all duplicated occurrences of elements, (ii) their intersection $X \cap Y$ is obtained by removing from $X$ the elements which do not occur in $Y$, (iii) their difference $X \setminus Y$ is obtained by removing from $X$ the elements which occur in $Y$, and (iv) $X \subseteq Y$ holds if every element of $X$ occurs in $Y$. For all $m \geq 0$, equality of $m$-tuples is defined as follows: $(u_1,\ldots,u_m) = (v_1,\ldots,v_m)$ iff $\bigwedge_{i=1}^{m} u_i = v_i$. The empty tuple $()$ is identified with the empty set $\emptyset$.
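The four tuple operations just defined can be sketched directly in Python, representing a tuple of distinct variables as a Python tuple of variable names. The function names are ours, chosen for illustration only.

```python
def union(xs, ys):
    """Concatenate xs and ys, dropping duplicated occurrences (order preserved)."""
    out = []
    for e in xs + ys:
        if e not in out:
            out.append(e)
    return tuple(out)

def intersection(xs, ys):
    """Remove from xs the elements that do not occur in ys."""
    return tuple(e for e in xs if e in ys)

def difference(xs, ys):
    """Remove from xs the elements that occur in ys."""
    return tuple(e for e in xs if e not in ys)

def included(xs, ys):
    """xs is included in ys if every element of xs occurs in ys."""
    return all(e in ys for e in xs)

# Variables of append(Xs,Ys,Zs) and rev(Zs,Rs) from Section 2.
v_append = ("Xs", "Ys", "Zs")
v_rev = ("Zs", "Rs")
all_vars = union(v_append, v_rev)
shared = intersection(v_append, v_rev)
only_append = difference(v_append, v_rev)
```

With these definitions, `all_vars` is `("Xs", "Ys", "Zs", "Rs")`, `shared` is `("Zs",)`, and `only_append` is `("Xs", "Ys")`.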
By $A(X,Y)$, where $X$ and $Y$ are disjoint tuples of distinct variables, we denote an atom $A$ such that $\mathit{vars}(A) = X \cup Y$. Let $P$ be a set of definite clauses. We say that the atom $A(X,Y)$ is functional from $X$ to $Y$ with respect to $P$ if
(F1) $M(P) \models \forall X,Y,Z.\ A(X,Y) \wedge A(X,Z) \rightarrow Y = Z$
The reference to the set $P$ of definite clauses is omitted, when understood from the context. Given a functional atom $A(X,Y)$, we say that $X$ and $Y$ are its input and output (tuples of) variables, respectively. The atom $A(X,Y)$ is said to be total from $X$ to $Y$ with respect to $P$ if
(F2) $M(P) \models \forall X\, \exists Y.\ A(X,Y)$
If $A(X,Y)$ is a total, functional atom from $X$ to $Y$, we may write $A(X;Y)$, instead of $A(X,Y)$. For instance, append(Xs,Ys,Zs) is a total, functional atom from (Xs,Ys) to Zs with respect to the set of clauses 1–7 of Section 2.
Now we extend the above notions from atoms to conjunctions of atoms. Let $F$ be a conjunction $A_1(X_1;Y_1), \ldots, A_n(X_n;Y_n)$ such that: (i) $X = (\bigcup_{i=1}^{n} X_i) \setminus (\bigcup_{i=1}^{n} Y_i)$, (ii) $Y = \bigcup_{i=1}^{n} Y_i$, and (iii) for $i=0,\ldots,n-1$, $Y_{i+1}$ is disjoint from $\bigcup_{j=1}^{i} (X_j \cup Y_j)$. Then, the conjunction $F$ is said to be a total, functional conjunction from $X$ to $Y$ and it is also written as $F(X;Y)$. For $F(X;Y)$, the above properties (F1) and (F2) hold if we replace $A$ by $F$. For instance, append(Xs,Ys,Zs), rev(Zs,Rs) is a total, functional conjunction from (Xs,Ys) to (Zs,Rs) with respect to the set of clauses 1–7 of Section 2.
4 Transformation Rules for Constrained Horn Clauses
In this section we present the rules that we propose for transforming CHCs, and in particular, for introducing difference predicates, and we prove their soundness. We refer to Section 2 for examples of how the rules are applied.
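First, we introduce the following notion of a stratification for a set of clauses. Let $\mathbb{N}$ denote the set of the natural numbers. A level mapping is a function $\ell: \mathit{Pred} \rightarrow \mathbb{N}$. For every predicate $p$, the natural number $\ell(p)$ is said to be the level of $p$. Level mappings are extended to atoms by stating that the level of an atom is the level of its predicate symbol. A clause $H \leftarrow c, A_1, \ldots, A_n$ is stratified with respect to $\ell$ if, for $i=1,\ldots,n$, $\ell(H) \geq \ell(A_i)$. A set $P$ of CHCs is stratified with respect to $\ell$ if all clauses in $P$ are stratified with respect to $\ell$. Clearly, for every set $P$ of CHCs, there exists a level mapping $\ell$ such that $P$ is stratified with respect to $\ell$.
A transformation sequence from $P_0$ to $P_n$ is a sequence $P_0 \Rightarrow P_1 \Rightarrow \ldots \Rightarrow P_n$ of sets of CHCs such that, for $i=0,\ldots,n-1$, $P_{i+1}$ is derived from $P_i$, denoted $P_i \Rightarrow P_{i+1}$, by applying one of the following Rules R1–R7. We assume that $P_0$ is stratified with respect to a given level mapping $\ell$.
(R1) Definition Rule. Let $D$ be the clause $\mathit{newp}(X_1,\ldots,X_k) \leftarrow c, A_1, \ldots, A_n$, where: (i) newp is a predicate symbol in Pred not occurring in the sequence $P_0, P_1, \ldots, P_i$, (ii) $c$ is a constraint, (iii) the predicate symbols of $A_1, \ldots, A_n$ occur in $P_0$, and (iv) $(X_1,\ldots,X_k) \subseteq \mathit{vars}(c, A_1, \ldots, A_n)$. Then, $P_{i+1} = P_i \cup \{D\}$. We set $\ell(\mathit{newp}) = \max\{\ell(A_i) \mid i=1,\ldots,n\}$.
For $i \geq 0$, by $\mathit{Defs}_i$ we denote the set of clauses, called definitions, introduced by Rule R1 during the construction of $P_0 \Rightarrow P_1 \Rightarrow \ldots \Rightarrow P_i$. Thus, $\emptyset = \mathit{Defs}_0 \subseteq \mathit{Defs}_1 \subseteq \ldots$. However, by using Rules R2–R7 we can replace a definition in $P_i$, for $i > 0$, and hence it may happen that $\mathit{Defs}_{i+1} \not\subseteq P_{i+1}$.
(R2) Unfolding Rule. Let $C$: $H \leftarrow c, G_L, A, G_R$ be a clause in $P_i$, where $A$ is an atom. Without loss of generality, we assume that $\mathit{vars}(C) \cap \mathit{vars}(P_0) = \emptyset$. Let Cls: $\{K_1 \leftarrow c_1, B_1,\ \ldots,\ K_m \leftarrow c_m, B_m\}$, $m \geq 0$, be the set of clauses in $P_0$, such that: for $j=1,\ldots,m$, (1) there exists a most general unifier $\vartheta_j$ of $A$ and $K_j$, and (2) the conjunction of constraints $(c, c_j)\vartheta_j$ is satisfiable. Let $\mathit{Unf}(C,A,P_0) = \{(H \leftarrow c, c_j, G_L, B_j, G_R)\vartheta_j \mid j=1,\ldots,m\}$. Then, by unfolding $C$ with respect to $A$, we derive the set $\mathit{Unf}(C,A,P_0)$ of clauses and we get $P_{i+1} = (P_i \setminus \{C\}) \cup \mathit{Unf}(C,A,P_0)$.
When we apply Rule R2, we say that, for $j=1,\ldots,m$, the atoms in the conjunction $B_j\vartheta_j$ are derived from $A$, and the atoms in the conjunction $(G_L, G_R)\vartheta_j$ are inherited from the corresponding atoms in the body of $C$.
(R3) Folding Rule. Let $C$: $H \leftarrow c, G_L, Q, G_R$ be a clause in $P_i$, and let $D$: $K \leftarrow d, B$ be a clause in $\mathit{Defs}_i$. Suppose that: (i) either $H$ is false or $\ell(H) \geq \ell(K)$, and (ii) there exists a substitution $\vartheta$ such that $Q = B\vartheta$ and $\mathbb{D} \models \forall(c \rightarrow d\vartheta)$. By folding $C$ using definition $D$, we derive clause $E$: $H \leftarrow c, G_L, K\vartheta, G_R$, and we get $P_{i+1} = (P_i \setminus \{C\}) \cup \{E\}$.
(R4) Clause Deletion Rule. Let $C$: $H \leftarrow c, G$ be a clause in $P_i$ such that the constraint $c$ is unsatisfiable. Then, we get $P_{i+1} = P_i \setminus \{C\}$.
(R5) Functionality Rule. Let $C$: $H \leftarrow c, G_L, F(X;Y), F(X;Z), G_R$ be a clause in $P_i$, where $F(X;Y)$ is a total, functional conjunction in $\mathit{Definite}(P_0)$. By functionality, from $C$ we derive clause $D$: $H \leftarrow c, Y = Z, G_L, F(X;Y), G_R$, and we get $P_{i+1} = (P_i \setminus \{C\}) \cup \{D\}$.
(R6) Totality Rule. Let $C$: $H \leftarrow c, G_L, F(X;Y), G_R$ be a clause in $P_i$ such that $Y \cap \mathit{vars}(H \leftarrow c, G_L, G_R) = \emptyset$ and $F(X;Y)$ is a total, functional conjunction in $\mathit{Definite}(P_0)$. By totality, from $C$ we derive clause $D$: $H \leftarrow c, G_L, G_R$ and we get $P_{i+1} = (P_i \setminus \{C\}) \cup \{D\}$.
Since the initial set of clauses is obtained by translating a terminating functional program, the functionality and totality properties hold by construction and we do not need to prove them when we apply Rules R5 and R6.
(R7) Differential Replacement Rule. Let $C$: $H \leftarrow c, G_L, F(X;Y), G_R$ be a clause in $P_i$, and let $D$: $\mathit{diff}(Z) \leftarrow d, F(X;Y), R(V;W)$ be a definition clause in $\mathit{Defs}_i$, where: (i) $F(X;Y)$ and $R(V;W)$ are total, functional conjunctions with respect to $\mathit{Definite}(P_0)$, (ii) $W \cap \mathit{vars}(C) = \emptyset$, (iii) $\mathbb{D} \models \forall(c \rightarrow d)$, and (iv) $\ell(H) > \ell(\mathit{diff}(Z))$. By differential replacement, we derive $E$: $H \leftarrow c, G_L, R(V;W), \mathit{diff}(Z), G_R$ and we get $P_{i+1} = (P_i \setminus \{C\}) \cup \{E\}$.
Note that no assumption is made on the set $Z$ of variables, apart from the one deriving from the fact that $D$ is a definition, that is, $Z \subseteq \mathit{vars}(d) \cup X \cup Y \cup V \cup W$.
Rule R7 has a very general formulation that eases the proof of the Soundness Theorem (see Theorem 4.1), which extends to Rules R1–R7 correctness results for transformations of (constraint) logic programs [16, 17, 39, 42]. In the transformation algorithm of Section 5, we will use a specific instance of Rule R7 which is sufficient for ADT removal (see, in particular, the Diff-Introduce step).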
Theorem 4.1 (Soundness)
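Let $P_0 \Rightarrow P_1 \Rightarrow \ldots \Rightarrow P_n$ be a transformation sequence using Rules R1–R7. Suppose that the following condition holds:
(U) for $i=1,\ldots,n-1$, if $P_i \Rightarrow P_{i+1}$ by folding a clause in $P_i$ using a definition $D$: $H \leftarrow c, B$ in $\mathit{Defs}_i$, then, for some $j \in \{1,\ldots,i-1,i+1,\ldots,n-1\}$, $P_j \Rightarrow P_{j+1}$ by unfolding $D$ with respect to an atom $A$ such that $\ell(H) = \ell(A)$.
If $P_n$ is satisfiable, then $P_0$ is satisfiable.
Thus, to prove the satisfiability of a set $P_0$ of clauses, it suffices: (i) to construct a transformation sequence $P_0 \Rightarrow \ldots \Rightarrow P_n$, and then (ii) to prove that $P_n$ is satisfiable. Note, however, that if Rule R7 is used, it may happen that $P_0$ is satisfiable and $P_n$ is unsatisfiable, that is, some false counterexamples to satisfiability, so-called false positives, may be generated, as we now show.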
Example 1. Let us consider the following set $P_1$ of clauses derived by adding the definition clause D to the initial set $P_0 = \{$C,1,2,3$\}$ of clauses:
C. false :- X=0, Y>0, a(X,Y).
1. a(X,Y) :- X=<0, Y=0.
2. a(X,Y) :- X>0, Y=1.
3. r(X,W) :- W=1.
D. diff(Y,W) :- a(X,Y), r(X,W).
where: (i) a(X,Y) is a total, functional atom from X to Y, (ii) r(X,W) is a total, functional atom from X to W, and (iii) D is a definition in $\mathit{Defs}_1$. By applying Rule R7, from $P_1$ we derive the set $P_2 = \{$E,1,2,3,D$\}$ of clauses where:
E. false :- X=0, Y>0, r(X,W), diff(Y,W).
Now we have that $P_0$ is satisfiable, while $P_2$ is unsatisfiable.
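The false positive can be reproduced concretely. The sketch below, which is ours and only explores a small range of integers, computes the facts for a, r, and diff that follow from the definite clauses 1, 2, 3, and D, and then checks whether the body of goal C or of goal E can be made true. The names `DOM`, `DIFF`, `goal_C_fires`, and `goal_E_fires` are assumptions of this illustration.

```python
DOM = range(-2, 3)  # small integer range; large enough to exhibit the witness

# Clauses 1 and 2: a(X,Y) :- X=<0, Y=0.   a(X,Y) :- X>0, Y=1.
def a(x, y):
    return (x <= 0 and y == 0) or (x > 0 and y == 1)

# Clause 3: r(X,W) :- W=1.
def r(x, w):
    return w == 1

# Clause D: diff(Y,W) :- a(X,Y), r(X,W).  X is existentially projected away,
# so diff relates outputs of a and r computed from possibly different calls.
DIFF = {(y, w) for x in DOM for y in DOM for w in DOM if a(x, y) and r(x, w)}

# Goal C: false :- X=0, Y>0, a(X,Y).
goal_C_fires = any(x == 0 and y > 0 and a(x, y) for x in DOM for y in DOM)

# Goal E: false :- X=0, Y>0, r(X,W), diff(Y,W).
goal_E_fires = any(
    x == 0 and y > 0 and r(x, w) and (y, w) in DIFF
    for x in DOM for y in DOM for w in DOM
)
```

Here `goal_C_fires` is false, since a(0,Y) forces Y=0, while `goal_E_fires` is true: the pair (1,1) belongs to diff because of a(1,1) and r(1,1), and goal E no longer ties Y to the input X=0. This loss of the link between X and Y is what makes the derived set unsatisfiable.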
5 An Algorithm for ADT Removal
Now we present Algorithm $\mathcal{R}$ for eliminating ADT terms from CHCs by using the transformation rules presented in Section 4 and automatically introducing suitable difference predicates. If $\mathcal{R}$ terminates, it transforms a set Cls of clauses into a new set TransfCls where the arguments of all predicates have basic type. Theorem 4.1 guarantees that if TransfCls is satisfiable, then also Cls is satisfiable.
Algorithm $\mathcal{R}$ (see Figure 1) removes ADT terms starting from the set Gs of goals in Cls. $\mathcal{R}$ collects these goals in InCls and stores in Defs the definitions of new predicates introduced by Rule R1.
Before describing the procedures used by Algorithm $\mathcal{R}$, let us first introduce the following notions.
Given a conjunction $G$ of atoms, $\mathit{bvars}(G)$ (or $\mathit{adt\text{-}vars}(G)$) denotes the set of variables in $G$ that have a basic type (or an ADT type, respectively). We say that an atom (or clause) has basic types if all its arguments (or atoms, respectively) have a basic type. An atom (or clause) has ADTs if at least one of its arguments (or atoms, respectively) has an ADT type.
Given a set (or a conjunction) $S$ of atoms, $\mathit{SharingBlocks}(S)$ denotes the partition of $S$ with respect to the reflexive, transitive closure $\Downarrow_S$ of the relation $\downarrow_S$ defined as follows. Given two atoms $A_1$ and $A_2$ in $S$, $A_1 \downarrow_S A_2$ holds iff $\mathit{adt\text{-}vars}(A_1) \cap \mathit{adt\text{-}vars}(A_2) \neq \emptyset$. The elements of the partition are called the sharing blocks of $S$.
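The partition into sharing blocks can be computed by incrementally merging the blocks that share an ADT variable with each new atom. The following is a minimal sketch under our own representation: an atom is a pair (predicate name, tuple of argument strings), and the set `LIST_VARS` of ADT variables is given explicitly. None of these names come from the AdtRem implementation.

```python
LIST_VARS = {"Xs", "Ys", "Zs", "Rs", "R1s", "Us"}  # variables of ADT (list) type

def adt_vars(atom):
    """ADT variables occurring as arguments of an atom (pred, args)."""
    _, args = atom
    return {v for v in args if v in LIST_VARS}

def sharing_blocks(atoms):
    """Partition atoms by the reflexive, transitive closure of 'shares an ADT variable'."""
    blocks = []  # each block is a pair (set of ADT variables, list of atoms)
    for atom in atoms:
        vs = adt_vars(atom)
        touching = [b for b in blocks if b[0] & vs]
        others = [b for b in blocks if not (b[0] & vs)]
        merged_vars = vs.union(*(b[0] for b in touching))
        merged_atoms = [a for b in touching for a in b[1]] + [atom]
        blocks = others + [(merged_vars, merged_atoms)]
    return [b[1] for b in blocks]

# Body of clause 11 of Section 2.
body11 = [
    ("append", ("Xs", "Ys", "Zs")),
    ("rev", ("Zs", "Rs")),
    ("len", ("Xs", "N0")),
    ("len", ("Ys", "N1")),
    ("append", ("Rs", "[X]", "R1s")),
    ("len", ("R1s", "N21")),
]
blocks11 = sharing_blocks(body11)

# A hypothetical conjunction whose atoms share no list variable.
blocks_split = sharing_blocks([("append", ("Xs", "Ys", "Zs")), ("len", ("Us", "N"))])
```

For the body of clause 11 the result is a single block containing all six atoms, in agreement with the later remark that this body has a single sharing block; the second, hypothetical conjunction splits into two blocks.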
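A generalization of a pair $(c_1, c_2)$ of constraints is a constraint $\alpha(c_1,c_2)$ such that $\mathbb{D} \models \forall(c_1 \rightarrow \alpha(c_1,c_2))$ and $\mathbb{D} \models \forall(c_2 \rightarrow \alpha(c_1,c_2))$. In particular, we consider the following generalization operator based on widening. Suppose that $c_1$ is the conjunction $(a_1, \ldots, a_m)$ of atomic constraints, then $\alpha(c_1,c_2)$ is defined as the conjunction of all $a_i$’s in $(a_1, \ldots, a_m)$ such that $\mathbb{D} \models \forall(c_2 \rightarrow a_i)$. For any constraint $c$ and tuple $V$ of variables, the projection of $c$ onto $V$ is a constraint $\pi(c,V)$ such that: (i) $\mathit{vars}(\pi(c,V)) \subseteq V$, and (ii) $\mathbb{D} \models \forall(c \rightarrow \pi(c,V))$. In our implementation, $\pi(c,V)$ is computed from $\exists Y.\, c$, where $Y = \mathit{vars}(c) \setminus V$, by a quantifier elimination algorithm in the theory of booleans and rational numbers. This implementation is safe in our context, and avoids relying on modular arithmetic, as is often done when eliminating quantifiers in LIA.
For two conjunctions $G_1$ and $G_2$ of atoms, $G_1 \sqsubseteq G_2$ holds if $G_1 = (A_1, \ldots, A_n)$ and there exists a subconjunction $(B_1, \ldots, B_n)$ of $G_2$ (modulo reordering) such that, for $i=1,\ldots,n$, $B_i$ is an instance of $A_i$. A conjunction of atoms is connected if it consists of a single sharing block.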
Procedure Diff-Define-Fold (see Figure 2). At each iteration of the body of the for loop, the Diff-Define-Fold procedure removes the ADT terms occurring in a sharing block $B$ of the body of a clause $C$: $H \leftarrow c, G$ of InCls. This is done by possibly introducing some new definitions (using Rule R1) and applying the Folding Rule R3. To allow folding, some applications of the Differential Replacement Rule R7 may be needed. We have the following four cases.
(Fold). We remove the ADT arguments occurring in $B$ by folding $C$ using a definition $D$ introduced at a previous step. Indeed, the head of each definition introduced by Algorithm $\mathcal{R}$ is by construction a tuple of variables of basic type.
(Generalize). We introduce a new definition $\mathit{GenD}$: $\mathit{genp}(V) \leftarrow \alpha(d,c), B$ whose constraint is obtained by generalizing $(d,c)$, where $d$ is the constraint occurring in an already available definition whose body is $B$. Then, we remove the ADT arguments occurring in $B$ by folding $C$ using $\mathit{GenD}$.
(Diff-Introduce). Suppose that $B$ partially matches the body of an available definition $D$: $\mathit{newp}(U) \leftarrow d, B'$, that is, for some substitution $\vartheta$, $B = (M, F(X;Y))$, and $B'\vartheta = (M, R(V;W))$. Then, we introduce a difference predicate through the new definition $\widehat{D}$: $\mathit{diff}(Z) \leftarrow \pi(c,X), F(X;Y), R(V;W)$ where $Z = \mathit{bvars}(F(X;Y), R(V;W))$ and, by Rule R7, we replace the conjunction $F(X;Y)$ by $(R(V;W), \mathit{diff}(Z))$ in the body of $C$, thereby deriving $C'$. Finally, we remove the ADT arguments in $B$ by folding $C'$ using either $D$ or a clause $\mathit{GenD}$ whose constraint is a generalization of the pair $(d\vartheta, c)$ of constraints.
The example of Section 2 allows us to illustrate this (Diff-Introduce) case. With reference to that example, clause $C$: $H \leftarrow c, G$ that we want to fold is clause 11, whose body has the single sharing block $B$: ‘append(Xs,Ys,Zs), rev(Zs,Rs), len(Xs,N0), len(Ys,N1), append(Rs,[X],R1s), len(R1s,N21)’. Block $B$ partially matches the body ‘append(Xs,Ys,Zs), rev(Zs,Rs), len(Xs,N0), len(Ys,N1), len(Rs,N2)’ of clause 8 of Section 2 which plays the role of definition $D$: $\mathit{newp}(U) \leftarrow d, B'$ in this example. Indeed, we have that:
$M$ = (append(Xs,Ys,Zs), rev(Zs,Rs), len(Xs,N0), len(Ys,N1)),
$F(X;Y)$ = (append(Rs,[X],R1s), len(R1s,N21)), where $X$ = (Rs,X), $Y$ = (R1s,N21),
$R(V;W)$ = len(Rs,N2), where $V$ = (Rs), $W$ = (N2).
In this example, $\vartheta$ is the identity substitution. Moreover, the condition on the level mapping $\ell$ required in the Diff-Define-Fold Procedure of Figure 2 can be fulfilled by stipulating that $\ell(\texttt{new1}) > \ell(\texttt{append})$ and $\ell(\texttt{new1}) > \ell(\texttt{len})$. Thus, the definition $\widehat{D}$ to be introduced is:
12. diff(N2,X,N21) :- append(Rs,[X],R1s), len(R1s,N21), len(Rs,N2).
Indeed, we have that: (i) the projection $\pi(c,X)$ is $\pi$(N01=N0+1, (Rs,X)), that is, the empty conjunction, (ii) $F(X;Y), R(V;W)$ is the body of clause 12, and (iii) the head variables N2, X, and N21 are the integer variables in that body. Then, by applying Rule R7, we replace in clause 11 the conjunction ‘append(Rs,[X],R1s), len(R1s,N21)’ by the new conjunction ‘len(Rs,N2), diff(N2,X,N21)’, hence deriving clause $C'$, which is clause 13 of Section 2. Finally, by folding clause 13 using clause 8, we derive clause 14 of Section 2, which has no list arguments.
(Project). If none of the previous three cases apply, then we introduce a new definition $\mathit{ProjC}$: $\mathit{newp}(V) \leftarrow \pi(c,V), B$ where $V = \mathit{bvars}(B)$. Then, we remove the ADT arguments occurring in $B$ by folding $C$ using $\mathit{ProjC}$.
The Diff-Define-Fold procedure may introduce new definitions with ADTs in their bodies, which are added to NewDefs and processed by the Unfold procedure. In order to present this procedure, we need the following notions.
The application of Rule R2 is controlled by marking some atoms in the body of a clause as unfoldable. If we unfold with respect to atom $A$ clause $C$: $H \leftarrow c, L, A, R$, the marking of the clauses in $\mathit{Unf}(C,A,\mathit{Ds})$ is handled as follows: the atoms derived from $A$ are not marked as unfoldable and each atom $A''$ inherited from an atom $A'$ in the body of $C$ is marked as unfoldable iff $A'$ is marked as unfoldable.
An atom $A(X;Y)$ in a conjunction $F(V;Z)$ of atoms is said to be a source atom if $X \subseteq V$. Thus, a source atom corresponds to an innermost function call in a given functional expression. For instance, in clause 1 of Section 2, the source atoms are append(Xs,Ys,Zs), len(Xs,N0), and len(Ys,N1). Indeed, the body of clause 1 corresponds to len(rev(append xs ys)) ≠ (len xs)+(len ys).
An atom $A(X;Y)$ in the body of clause $C$: $H \leftarrow c, L, A(X;Y), R$ is a head-instance with respect to a set Ds of clauses if, for every clause $K \leftarrow d, B$ in Ds such that: (1) there exists a most general unifier $\vartheta$ of $A(X;Y)$ and $K$, and (2) the constraint $(c,d)\vartheta$ is satisfiable, we have that $X\vartheta = X$. Thus, the input variables of $A(X;Y)$ are not instantiated by unification. For instance, the atom append([X|Xs],Ys,Zs) is a head-instance, while append(Xs,Ys,Zs) is not.
In a set Cls of clauses, predicate $p$ immediately depends on predicate $q$, if in Cls there is a clause of the form $p(\ldots) \leftarrow c, \ldots, q(\ldots), \ldots$ The depends on relation is the transitive closure of the immediately depends on relation. Let $\prec$ be a well-founded ordering on tuples of terms such that, for all terms $t, u$, if $t \prec u$, then, for all substitutions $\vartheta$, $t\vartheta \prec u\vartheta$. A predicate $p$ is descending with respect to $\prec$ if, for all clauses $p(t;u) \leftarrow c, p_1(t_1;u_1), \ldots, p_n(t_n;u_n)$, for $i=1,\ldots,n$, if $p_i$ depends on $p$ then $t_i \prec t$. An atom is descending if its predicate is descending. The well-founded ordering $\prec$ we use in our implementation is based on the subterm relation and is defined as follows: $(x_1,\ldots,x_k) \prec (y_1,\ldots,y_m)$ if every $x_i$ is a subterm of some $y_j$ and there exists $x_i$ which is a strict subterm of some $y_j$. For instance, the predicates append, rev, and len in the example of Section 2 are all descending.
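The subterm-based ordering on tuples of terms admits a direct sketch. Below, a variable is a string and a compound term is a Python tuple whose first component is the function symbol, so that the list [X|Xs] is written ("cons", "X", "Xs"). This encoding and the function names are our own assumptions, not the data structures of the implementation.

```python
def subterms(t):
    """Yield t and all of its subterms."""
    yield t
    if isinstance(t, tuple):
        for arg in t[1:]:          # t[0] is the function symbol
            yield from subterms(arg)

def is_subterm(s, t):
    return any(s == u for u in subterms(t))

def is_strict_subterm(s, t):
    return s != t and is_subterm(s, t)

def precedes(xs, ys):
    """(x1,...,xk) precedes (y1,...,ym): every xi is a subterm of some yj,
    and some xi is a strict subterm of some yj."""
    return (all(any(is_subterm(x, y) for y in ys) for x in xs)
            and any(is_strict_subterm(x, y) for x in xs for y in ys))

# Clause 3: append([X|Xs],Ys,[X|Zs]) :- append(Xs,Ys,Zs).
head_inputs = (("cons", "X", "Xs"), "Ys")
body_inputs = ("Xs", "Ys")
append_descends = precedes(body_inputs, head_inputs)
same_inputs = precedes(body_inputs, body_inputs)
```

For clause 3, the input tuple (Xs,Ys) of the recursive call precedes the input tuple ([X|Xs],Ys) of the head, so `append_descends` is true, while a tuple never precedes itself, so `same_inputs` is false.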
Procedure Unfold (see Figure 3) repeatedly applies Rule R2 in two phases. In Phase 1, the procedure unfolds the clauses in NewDefs with respect to at least one source atom. Then, in Phase 2, clauses are unfolded with respect to head-instance atoms. Unfolding is repeated only with respect to descending atoms. The termination of the Unfold procedure is ensured by the fact that the unfolding with respect to a non-descending atom is done at most once in each phase.
Procedure Replace simplifies some clauses by applying Rules R5 and R6 as long as possible. Replace terminates because each application of either rule decreases the number of atoms.
Thus, each execution of the Diff-Define-Fold, Unfold, and Replace procedures terminates. However, Algorithm $\mathcal{R}$ might not terminate because new predicates may be introduced by Diff-Define-Fold at each iteration of the while-do of $\mathcal{R}$. Soundness of $\mathcal{R}$ follows from soundness of the transformation rules (see Appendix).
Theorem 5.1 (Soundness of Algorithm $\mathcal{R}$)
Suppose that Algorithm $\mathcal{R}$ terminates for an input set Cls of clauses, and let TransfCls be the output set of clauses. Then, every clause in TransfCls has basic types, and if TransfCls is satisfiable, then Cls is satisfiable.
Algorithm $\mathcal{R}$ is not complete, in the sense that, even if Cls is a satisfiable set of input clauses, $\mathcal{R}$ may not terminate or, due to the use of Rule R7, it may terminate with an output set TransfCls of unsatisfiable clauses, thereby generating a false positive (see Example 1 in Section 4). However, due to well-known undecidability results for the satisfiability problem of CHCs, this limitation cannot be overcome, unless we restrict the class of clauses we consider. The study of such restricted classes of clauses is beyond the scope of the present paper and, instead, in the next section, we evaluate the effectiveness of Algorithm $\mathcal{R}$ from an experimental viewpoint.
6 Experimental Evaluation
In this section we present the results of some experiments we have performed for assessing the effectiveness of our transformation-based CHC solving technique. We compare our technique with the one proposed by Reynolds and Kuncak, which extends the SMT solver CVC4 with inductive reasoning.
Implementation. We have developed the AdtRem tool for ADT removal, which is based on an implementation of Algorithm $\mathcal{R}$ in the VeriMAP system.
Benchmark suite and experiments. Our benchmark suite consists of 169 verification problems over inductively defined data structures, such as lists, queues, heaps, and trees, which have been adapted from the benchmark suite considered by Reynolds and Kuncak. These problems come from benchmarks used by various theorem provers: (i) 53 problems come from CLAM, (ii) 11 from HipSpec, (iii) 63 from IsaPlanner [14, 24], and (iv) 42 from Leon. We have performed the following experiments, whose results are summarized in Table 1. (The tool and the benchmarks are available at https://fmlab.unich.it/adtrem/.)
(1) We have considered Reynolds and Kuncak’s dtt encoding of the verification problems, where natural numbers are represented using the built-in SMT type Int, and we have discarded: (i) problems that do not use ADTs, and (ii) problems that cannot be directly represented in Horn clause format. Since AdtRem does not support higher order functions, nor user-provided lemmas, in order to make a comparison between the two approaches on a level playing field, we have replaced higher order functions by suitable first order instances and we have removed all auxiliary lemmas from the input verification problems. We have also replaced the basic functions recursively defined over natural numbers, such as the plus and less-or-equal functions, by LIA constraints.
(2) Then, we have translated each verification problem into a set, call it $P$, of CHCs in the Prolog-like syntax supported by AdtRem by using a modified version of the SMT-LIB parser of the ProB system. We have run Eldarica and Z3 (more specifically, Eldarica v2.0.1 and Z3 v4.8.0 with the Spacer engine), which use no induction-based mechanism for handling ADTs, to check the satisfiability of $P$. The rows of Table 1 for Eldarica and Z3 on the original clauses show the number of solved problems, that is, problems whose CHC encoding has been proved satisfiable.
(3) We have run Algorithm $\mathcal{R}$ on $P$ to produce a set $T$ of CHCs without ADTs. The row of Table 1 for $\mathcal{R}$ reports the number of problems for which Algorithm $\mathcal{R}$ terminates.
(4) We have converted $T$ into the SMT-LIB format, and then we have run Eldarica and Z3 for checking its satisfiability. The rows for Eldarica and Z3 after ADT removal report, outside round parentheses, the number of solved problems. There was only one false positive (that is, a satisfiable set of clauses transformed into an unsatisfiable set), which we have classified as an unsolved problem.
(5) In order to assess the improvements due to the use of the differential replacement rule, we have applied to $P$ a modified version, call it $\mathcal{R}^{\circ}$, of the ADT removal algorithm that does not introduce difference predicates, that is, the Diff-Introduce case of the Diff-Define-Fold Procedure of Figure 2 is never executed. The number of problems for which $\mathcal{R}^{\circ}$ terminates and the number of solved problems using Eldarica and Z3 are shown within round parentheses in the rows for $\mathcal{R}$ and for Eldarica and Z3 after ADT removal, respectively.
(6) Finally, we have run the cvc4+ig configuration of the CVC4 solver extended with inductive reasoning on the 169 problems in SMT-LIB format obtained at Step (1). Row ‘CVC4+Ind’ reports the number of solved problems.
|number of problems||53||11||63||42||169|
|ADT removal terminates||(18) 36||(2) 4||(56) 59||(18) 30||(94) 129|
|Eldarica after ADT removal||(18) 32||(2) 4||(56) 57||(18) 29||(94) 122|
|Z3 after ADT removal||(18) 29||(2) 3||(55) 56||(18) 26||(93) 114|
Evaluation of Results. The results of our experiments show that ADT removal considerably increases the effectiveness of CHC solvers without inductive reasoning support. For instance, Eldarica is able to solve 15 problems out of 169, while it solves 122 problems after the removal of ADTs. When using Z3, the improvement is slightly lower, but still considerable. Note also that, when the ADT removal terminates (129 problems out of 169), the solvers are very effective (122 out of 129, that is, about 95% successful verifications, for Eldarica). The improvements specifically due to the use of the differential replacement rule are demonstrated by the increase in the number of problems for which the ADT removal algorithm terminates (from 94 to 129), and in the number of problems solved (from 94 to 122, for Eldarica).
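The figures quoted above can be rechecked with a few lines of arithmetic. The following Python sketch uses only the totals reported in the text and in the table; the variable names are ours and are introduced for illustration only.

```python
# Totals as reported in the evaluation (names are illustrative only).
per_suite = [53, 11, 63, 42]          # problems per benchmark suite
total_problems = 169

eldarica_before = 15                  # solved on the CHCs with ADTs
terminates_with_diff = 129            # ADT removal terminates
terminates_without_diff = 94          # same, without difference predicates
eldarica_with_diff = 122              # solved after ADT removal
eldarica_without_diff = 94            # same, without difference predicates


def percent(part, whole):
    """Percentage of `part` over `whole`, rounded to the nearest integer."""
    return round(100 * part / whole)


# The per-suite counts add up to the reported total.
suites_sum = sum(per_suite)

# Success rate of Eldarica on the problems for which ADT removal terminates.
rate_when_terminating = percent(eldarica_with_diff, terminates_with_diff)

# Gains attributable to difference predicates.
extra_terminations = terminates_with_diff - terminates_without_diff
extra_solved = eldarica_with_diff - eldarica_without_diff

print(suites_sum, rate_when_terminating, extra_terminations, extra_solved)
```

Under these figures, difference predicates account for 35 additional terminating runs of the ADT removal algorithm and 28 additional problems solved by Eldarica.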
AdtRem compares favorably to CVC4 extended with induction (compare the rows for Eldarica and Z3 after ADT removal to row ‘CVC4+Ind’). Interestingly, the effectiveness of CVC4 may be increased if one extends the problem formalization with extra lemmas which may be used for proving the main conjecture. Indeed, CVC4 solves 100 problems when auxiliary lemmas are added, and 134 problems when, in addition, it runs on the dti encoding, where natural numbers are represented using both the built-in type Int and the ADT definition with the zero and successor constructors. Our results show that in most cases AdtRem needs neither those extra lemmas nor that sophisticated encoding.
Finally, in Table 2 we report some problems solved by AdtRem with Eldarica that are not solved by CVC4 with induction (using any encoding and auxiliary lemmas), or vice versa. For details, see https://fmlab.unich.it/adtrem/.
|Problem||Property proved by AdtRem and not by CVC4|
|Problem||Property proved by CVC4 and not by AdtRem|
7 Related Work and Conclusions
Inductive reasoning is supported, with different degrees of human intervention, by many theorem provers, such as ACL2, CLAM, Isabelle, HipSpec, Zeno, and PVS. The combination of inductive reasoning and SMT solving techniques has been exploited by many tools for program verification [29, 36, 38, 41, 43, 44].
Leino integrates inductive reasoning into the Dafny program verifier by implementing a simple strategy that rewrites user-defined properties that may benefit from induction into proof obligations to be discharged by Z3. The advantage of this technique is that it fully decouples inductive reasoning from SMT solving. Hence, no extensions to the SMT solver are required.
In order to extend CVC4 with induction, Reynolds and Kuncak also consider the rewriting of formulas that may take advantage of inductive reasoning, but this is done dynamically, during the proof search. This approach allows CVC4 to perform the rewritings lazily, whenever new formulas are generated during the proof search, and to use the partially solved conjecture to generate lemmas that may help in the proof of the initial conjecture.
The issue of generating suitable lemmas during inductive proofs has also been addressed by Yang et al. and implemented in AdtInd. In order to conjecture new lemmas, their algorithm makes use of a syntax-guided synthesis strategy driven by a grammar, which is dynamically generated from user-provided templates and the function and predicate symbols encountered during the proof search. The derived lemma conjectures are then checked by the SMT solver Z3.
In order to take full advantage of the efficiency of SMT solvers in checking satisfiability of quantifier-free formulas over LIA, ADTs, and finite sets, the Leon verification system implements an SMT-based solving algorithm to check the satisfiability of formulas involving recursively defined first-order functions. The algorithm interleaves the unrolling of recursive functions and the SMT solving of the formulas generated by the unrolling. Leon can be used to prove properties of Scala programs with ADTs and integrates with the Scala compiler and the SMT solver Z3. A refined version of that algorithm, restricted to catamorphisms, has been implemented in a solver-agnostic tool, called RADA.
In the context of CHCs, Unno et al.  have proposed a proof system that combines inductive theorem proving with SMT solving. This approach uses Z3-PDR  to discharge proof obligations generated by the proof system, and has been applied to prove relational properties of OCaml programs.
The distinctive feature of the technique presented in this paper is that it does not make use of any explicit inductive reasoning, but it follows a transformational approach. First, the problem of verifying the validity of a universally quantified formula over ADTs is reduced to the problem of checking the satisfiability of a set of CHCs. Then, this set of CHCs is transformed with the aim of deriving a set of CHCs over basic types (i.e., integers) only, whose satisfiability implies the satisfiability of the original set. In this way, the reasoning on ADTs is separated from the reasoning on satisfiability, which can be performed by specialized engines for CHCs on basic types (e.g. Eldarica  and Z3-Spacer ). Some of the ideas presented here have been explored in [11, 12], but there neither formal results nor an automated strategy were presented.
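To make the idea of reasoning on basic types only more concrete, the following Python sketch considers a small, hypothetical set of CHCs over integers. It is not taken from our benchmarks and is not the output of AdtRem. We assume it is the kind of ADT-free set one could obtain, by definition, unfolding, and folding, from the clause false ← N ≠ N1+N2, append(Xs,Ys,Zs), len(Xs,N1), len(Ys,N2), len(Zs,N) over lists. The predicates new1 and new2, and the candidate model given for them, are illustrative assumptions. The script only performs a bounded sanity check that the candidate model satisfies every clause; it is not a proof of satisfiability, which is the task of a CHC solver.

```python
from itertools import product

# Hypothetical ADT-free clauses (integers only), in Prolog-like notation:
#   false           :- N =\= N1+N2, new1(N1,N2,N).
#   new1(0,N2,N)    :- new2(N2,N).
#   new1(N1,N2,N)   :- N1 = M1+1, N = M+1, new1(M1,N2,M).
#   new2(0,0).
#   new2(N2,N)      :- N2 = M2+1, N = M+1, new2(M2,M).

# Candidate model (an assumption of this sketch, not computed by a solver).
def new1(n1, n2, n):
    return n1 >= 0 and n2 >= 0 and n == n1 + n2


def new2(n2, n):
    return n2 >= 0 and n == n2


def implies(premise, conclusion):
    return (not premise) or conclusion


def model_satisfies_clauses(lo=-2, hi=6):
    """Check every clause for all integer values in range(lo, hi)."""
    # Fact: new2(0,0).
    if not new2(0, 0):
        return False
    for a, b, c in product(range(lo, hi), repeat=3):
        # Goal clause: its body must be false in the model.
        if c != a + b and new1(a, b, c):
            return False
        # new1(0,N2,N) :- new2(N2,N).
        if not implies(new2(b, c), new1(0, b, c)):
            return False
        # new1(M1+1,N2,M+1) :- new1(M1,N2,M).
        if not implies(new1(a, b, c), new1(a + 1, b, c + 1)):
            return False
        # new2(M2+1,M+1) :- new2(M2,M).
        if not implies(new2(b, c), new2(b + 1, c + 1)):
            return False
    return True


print(model_satisfies_clauses())
```

In this sketch the candidate model interprets new1 as integer addition and new2 as equality on non-negative integers, which is the kind of LIA-definable model that solvers for CHCs over basic types are designed to find.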
A key success factor of our technique is the introduction of difference predicates, which can be viewed as the transformational counterpart of lemma generation. Indeed, as shown in Section 6, the use of difference predicates greatly increases the power of CHC solving with respect to previous techniques based on the transformational approach, which do not use difference predicates.
As future work, we plan to apply our transformation-based verification technique to more complex program properties, such as relational properties.
-  C. Barrett, C. L. Conway, M. Deters, L. Hadarean, D. Jovanovic, T. King, A. Reynolds, and C. Tinelli. CVC4. In G. Gopalakrishnan and S. Qadeer, editors, Computer Aided Verification, Proceedings of the 23rd International Conference CAV ’11, Snowbird, UT, USA, July 14–20, 2011, Lecture Notes in Computer Science 6806, pages 171–177. Springer, 2011.
-  C. W. Barrett and C. Tinelli. Satisfiability modulo theories. In E. M. Clarke, T. A. Henzinger, H. Veith, and R. Bloem, editors, Handbook of Model Checking, pages 305–343. Springer, 2018.
-  N. Bjørner, A. Gurfinkel, K. L. McMillan, and A. Rybalchenko. Horn clause solvers for program verification. In L. D. Beklemishev, A. Blass, N. Dershowitz, B. Finkbeiner, and W. Schulte, editors, Fields of Logic and Computation II - Essays Dedicated to Yuri Gurevich on the Occasion of His 75th Birthday, Lecture Notes in Computer Science 9300, pages 24–51, Switzerland, 2015. Springer.
-  A. Bundy. The automation of proof by mathematical induction. In A. Robinson and A. Voronkov, editors, Handbook of Automated Reasoning, volume I, pages 845–911. North Holland, 2001.
-  A. Cimatti, A. Griggio, B. Schaafsma, and R. Sebastiani. The MathSAT5 SMT Solver. In N. Piterman and S. Smolka, editors, Proceedings of the 19th International Conference on Tools and Algorithms for the Construction and Analysis of Systems, TACAS ’13, Lecture Notes in Computer Science 7795, pages 93–107. Springer, 2013.
-  K. Claessen, M. Johansson, D. Rosén, and N. Smallbone. Automating inductive proofs using theory exploration. In M. P. Bonacina, editor, Automated Deduction – CADE-24, Proceedings of the 24th International Conference on Automated Deduction, Lake Placid, NY, USA, June 9–14, 2013, Lecture Notes in Artificial Intelligence 7898, pages 392–406. Springer, 2013.
-  P. Cousot and N. Halbwachs. Automatic discovery of linear restraints among variables of a program. In Proceedings of the Fifth ACM Symposium on Principles of Programming Languages, POPL ’78, pages 84–96. ACM, 1978.
-  E. De Angelis, F. Fioravanti, A. Pettorossi, and M. Proietti. VeriMAP: A Tool for Verifying Programs through Transformations. In Proceedings of the 20th International Conference on Tools and Algorithms for the Construction and Analysis of Systems, TACAS ’14, Lecture Notes in Computer Science 8413, pages 568–574. Springer, 2014. Available at http://www.map.uniroma2.it/VeriMAP.
-  E. De Angelis, F. Fioravanti, A. Pettorossi, and M. Proietti. Relational verification through Horn clause transformation. In X. Rival, editor, Proceedings of the 23rd International Symposium on Static Analysis, SAS ’16, Edinburgh, UK, September 8–10, 2016, Lecture Notes in Computer Science 9837, pages 147–169. Springer, 2016.
-  E. De Angelis, F. Fioravanti, A. Pettorossi, and M. Proietti. Solving Horn clauses on inductive data types without induction. Theory and Practice of Logic Programming, 18(3-4):452–469, 2018. Special Issue on ICLP ’18.
-  E. De Angelis, F. Fioravanti, A. Pettorossi, and M. Proietti. Lemma generation for Horn clause satisfiability: A preliminary study. In A. Lisitsa and A. P. Nemytykh, editors, Proceedings Seventh International Workshop on Verification and Program Transformation, VPT@Programming 2019, Genova, Italy, 2nd April 2019, volume 299 of EPTCS, pages 4–18, 2019.
-  E. De Angelis, F. Fioravanti, A. Pettorossi, and M. Proietti. Proving properties of sorting programs: A case study in Horn clause verification. In E. De Angelis, G. Fedyukovich, N. Tzevelekos, and M. Ulbrich, editors, Proceedings of the Sixth Workshop on Horn Clauses for Verification and Synthesis and Third Workshop on Program Equivalence and Relational Reasoning, HCVS/PERR@ETAPS 2019, Prague, Czech Republic, 6–7th April 2019, volume 296 of EPTCS, pages 48–75, 2019.
-  L. M. de Moura and N. Bjørner. Z3: An efficient SMT solver. In Proceedings of the 14th International Conference on Tools and Algorithms for the Construction and Analysis of Systems, TACAS ’08, Lecture Notes in Computer Science 4963, pages 337–340. Springer, 2008.
-  L. Dixon and J. D. Fleuriot. IsaPlanner: A prototype proof planner in Isabelle. In F. Baader, editor, Automated Deduction – CADE-19, Proceedings of the 19th International Conference on Automated Deduction, Miami Beach, FL, USA, July 28 – August 2, 2003, Lecture Notes in Computer Science 2741, pages 279–283. Springer, 2003.
-  H. Enderton. A Mathematical Introduction to Logic. Academic Press, 1972.
-  S. Etalle and M. Gabbrielli. Transformations of CLP modules. Theoretical Computer Science, 166:101–146, 1996.
-  F. Fioravanti, A. Pettorossi, and M. Proietti. Transformation rules for locally stratified constraint logic programs. In K.-K. Lau and M. Bruynooghe, editors, Program Development in Computational Logic, Lecture Notes in Computer Science 3049, pages 292–340. Springer-Verlag, 2004.
-  F. Fioravanti, A. Pettorossi, M. Proietti, and V. Senni. Generalization strategies for the verification of infinite state systems. Theory and Practice of Logic Programming. Special Issue on the 25th Annual GULP Conference, 13(2):175–199, 2013.
-  S. Grebenshchikov, N. P. Lopes, C. Popeea, and A. Rybalchenko. Synthesizing software verifiers from proof rules. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’12, pages 405–416, 2012.
-  K. Hoder and N. Bjørner. Generalized property directed reachability. In A. Cimatti and R. Sebastiani, editors, Proceedings of the 15th International Conference on Theory and Applications of Satisfiability Testing, SAT ’12, Lecture Notes in Computer Science 7317, pages 157–171. Springer, 2012.
-  H. Hojjat and P. Rümmer. The ELDARICA Horn solver. In N. Bjørner and A. Gurfinkel, editors, 2018 Formal Methods in Computer Aided Design, FMCAD 2018, Austin, TX, USA, October 30 – November 2, 2018, pages 1–7. IEEE, 2018.
-  A. Ireland and A. Bundy. Productive use of failure in inductive proof. Journal of Automated Reasoning, 16(1):79–111, Mar. 1996.
-  J. Jaffar and M. Maher. Constraint logic programming: A survey. Journal of Logic Programming, 19/20:503–581, 1994.
-  M. Johansson, L. Dixon, and A. Bundy. Case-analysis for rippling and inductive proof. In M. Kaufmann and L. C. Paulson, editors, Interactive Theorem Proving, Lecture Notes in Computer Science 6172, pages 291–306. Springer, 2010.
-  B. Kafle, J. P. Gallagher, and J. F. Morales. RAHFT: A tool for verifying Horn clauses using abstract interpretation and finite tree automata. In Computer Aided Verification, Proceedings of the 28th International Conference CAV ’16, Toronto, ON, Canada, July 17–23, 2016, Proceedings, Part I, Lecture Notes in Computer Science 9779, pages 261–268. Springer, 2016.
-  M. Kaufmann, P. Manolios, and J. S. Moore. Computer-Aided Reasoning: An Approach. Kluwer Academic Publishers, 2000.
-  A. Komuravelli, A. Gurfinkel, and S. Chaki. SMT-based model checking for recursive programs. In Computer Aided Verification, Proceedings of the 26th International Conference CAV ’14, Vienna, Austria, July 18–22, 2014, Lecture Notes in Computer Science 8559, pages 17–34. Springer, 2014.
-  A. Komuravelli, A. Gurfinkel, S. Chaki, and E. M. Clarke. Automatic abstraction in SMT-based unbounded software model checking. In N. Sharygina and H. Veith, editors, Computer Aided Verification, Proceedings of the 25th International Conference CAV ’13, Saint Petersburg, Russia, July 13–19, 2013, Lecture Notes in Computer Science 8044, pages 846–862. Springer, 2013.
-  K. Leino. Automating induction with an SMT solver. In V. Kuncak and A. Rybalchenko, editors, Verification, Model Checking, and Abstract Interpretation, Proceedings of the 13th International Conference VMCAI 2012, Philadelphia, PA, USA, January 22–24, 2012, pages 315–331. Springer, 2012.
-  X. Leroy, D. Doligez, A. Frisch, J. Garrigue, D. Rémy, and J. Vouillon. The OCaml system, Release 4.06. Documentation and user’s manual, Institut National de Recherche en Informatique et en Automatique, France, 2017.
-  M. Leuschel and M. Butler. ProB: A model checker for B. In FME 2003: Formal Methods, pages 855–874. Springer, 2003.
-  J. W. Lloyd. Foundations of Logic Programming. Springer-Verlag, Berlin, 1987. Second Edition.
-  T. Nipkow, M. Wenzel, and L. C. Paulson. Isabelle/HOL: A Proof Assistant for Higher-Order Logic. Springer, 2002.
-  S. Owre, J. M. Rushby, and N. Shankar. PVS: A prototype verification system. In D. Kapur, editor, Automated Deduction – CADE-11, Proceedings of the 11th International Conference on Automated Deduction, Saratoga Springs, NY, USA, June 15–18, 1992, pages 748–752. Springer, 1992.
-  A. Pettorossi and M. Proietti. Totally correct logic program transformations via well-founded annotations. Higher-Order and Symbolic Computation, 21:193–234, 2008.
-  T.-H. Pham, A. Gacek, and M. W. Whalen. Reasoning about algebraic data types with abstractions. J. Autom. Reason., 57(4):281–318, Dec. 2016.
-  M. O. Rabin. Decidable theories. In J. Barwise, editor, Handbook of Mathematical Logic, pages 595–629. North-Holland, 1977.
-  A. Reynolds and V. Kuncak. Induction for SMT solvers. In D. D’Souza, A. Lal, and K. G. Larsen, editors, Verification, Model Checking, and Abstract Interpretation, Proceedings of the 16th International Conference VMCAI 2015, Mumbai, India, January 12–14, 2015, Lecture Notes in Computer Science 8931, pages 80–98. Springer, 2015.
-  H. Seki. On inductive and coinductive proofs via unfold/fold transformations. In D. De Schreye, editor, Proceedings of the 19th International Symposium on Logic Based Program Synthesis and Transformation LOPSTR ’09, Coimbra, Portugal, September 9–11, 2009, Lecture Notes in Computer Science 6037, pages 82–96. Springer, 2010.
-  W. Sonnex, S. Drossopoulou, and S. Eisenbach. Zeno: An automated prover for properties of recursive data structures. In C. Flanagan and B. König, editors, Proceedings of the 18th International Conference on Tools and Algorithms for the Construction and Analysis of Systems, TACAS ’12, Tallinn, Estonia, March 24 – April 1, 2012, pages 407–421. Springer, 2012.
-  P. Suter, A. S. Köksal, and V. Kuncak. Satisfiability modulo recursive programs. In E. Yahav, editor, Proceedings of the 18th International Symposium on Static Analysis, SAS ’11, Lecture Notes in Computer Science 6887, pages 298–315. Springer, 2011.
-  H. Tamaki and T. Sato. A generalized correctness proof of the unfold/fold logic program transformation. Technical Report 86-4, Ibaraki University, Japan, 1986.
-  H. Unno, S. Torii, and H. Sakamoto. Automating induction for solving Horn clauses. In R. Majumdar and V. Kuncak, editors, Computer Aided Verification, Proceedings of the 29th International Conference CAV ’17, Heidelberg, Germany, Part II, Lecture Notes in Computer Science 10427, pages 571–591. Springer, 2017.
-  W. Yang, G. Fedyukovich, and A. Gupta. Lemma synthesis for automating induction over algebraic data types. In T. Schiex and S. de Givry, editors, Principles and Practice of Constraint Programming, Proceedings of the 25th International Conference CP 2019, Stamford, CT, USA, September 30 – October 4, 2019, Lecture Notes in Computer Science 11802, pages 600–617. Springer, 2019.
In this appendix we show the proofs of the results presented in Sections 4 and 5. First, we recall some definitions and facts from the literature . The least -model of a set of clauses is the set, denoted , of all ground atoms which are true in .
A reverse-implication-based transformation sequence is a sequence of sets of clauses where, for is derived from by applying one of the following rules (see Section 4): Definition (Rule R1), Unfolding (Rule R2), Folding (Rule R3), and the following rule, called Body Weakening (Rule W).
(Rule W) Body Weakening. Let : be a clause in , and suppose that the following condition holds for some constraint and conjunction of atoms:
where . Suppose also that , for every atom occurring in and not in . By body weakening, from clause we derive clause : , and we get .
If is a reverse-implication-based transformation sequence for which Condition (U) of Theorem 4.1 holds. For , let . Then .
The proof of this theorem  is based on the fact that, for , , hence the term reverse-implication-based transformation sequence. Note that, in particular, if we apply the body weakening rule to a clause of a given set of clauses whereby we replace a conjunction in the body of by a new conjunction such that