Interpolation and the Array Property Fragment

04/25/2019 ∙ by Jochen Hoenicke, et al. ∙ 0

Interpolation-based software model checkers have been successfully employed to automatically prove programs correct. Their power comes from interpolating SMT solvers that check the feasibility of potential counterexamples and, if a counterexample is infeasible, compute candidate invariants. This approach works well for quantifier-free theories, like equality theory or linear arithmetic. For quantified formulas, there are SMT solvers that can decide expressive fragments, e.g., EPR, the array property fragment, and the finite almost uninterpreted fragment. However, these solvers do not support interpolation. It is already known that EPR in general does not allow for interpolation. In this paper, we show the same result for the array property fragment.

1 Introduction

Several software model checkers [5, 6, 7, 10, 11, 13, 14, 16, 17] use interpolating SMT solvers for various subtasks of software verification. In counterexample-based approaches, for instance, paths through the program that lead to an error are encoded as formulas. If the formula is satisfiable, the resulting model can be translated into a concrete counterexample to correctness. If the formula is unsatisfiable, a Craig interpolant can be generated that serves to compute a candidate invariant. These candidate invariants can then be checked for inductiveness.

The software model checkers are powered by interpolating SMT solvers. Ideally, the fragment supported by the SMT solver is decidable and supports interpolation, to ensure completeness of the interpolation procedure. Additionally, the fragment should be closed under the usual logical operations like conjunction and negation, to facilitate inductiveness checks in the solver. Many quantifier-free fragments of SMT theories and their combinations are decidable and closed under logical operations and interpolation [15, 18, 4].

The full first-order logic is closed under all operations and interpolation, but it is not decidable. There are decidable fragments, for example, EPR, the array property fragment, and the finite almost uninterpreted fragment. For each of the fragments, the full fragment that supports $\exists^*\forall^*$-formulas is not closed under negation. (For example, a formula with an $\exists\forall$ quantifier prefix can lie in the array property fragment and the finite almost uninterpreted fragment after Skolemisation, while its negation, which has a $\forall\exists$ prefix, does not.) If restricted to the alternation-free fragment, EPR and the array property fragment are closed under all logical operations. In [8], Drews and Albarghouthi show that the alternation-free EPR fragment is not closed under interpolation. We show in this paper a similar result for the (alternation-free) array property fragment: we give an example of an interpolation problem where the input formulas are in the array property fragment and prove that there exists no interpolant within the array property fragment.

2 Notation and Basic Definitions

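We use sorted first-order logic and the model-theoretic approach to define theories from the SMT-LIB standard [1]. The sort symbols $s$ with arity $n$ inductively define the set of sorts: if $\sigma_1, \dots, \sigma_n$ are sorts and $s$ is a sort symbol of arity $n$, then $s(\sigma_1, \dots, \sigma_n)$ is a sort. The function symbols $f$ with rank $\sigma_1 \times \dots \times \sigma_n \to \sigma$ define the set of terms: if $t_1, \dots, t_n$ are terms of sort $\sigma_1, \dots, \sigma_n$ respectively, then $f(t_1, \dots, t_n)$ is a term of sort $\sigma$. A signature $\Sigma$ is given by a set of sort symbols and a set of ranked function symbols. A $\Sigma$-formula is a first-order formula built from the sorts and function symbols in $\Sigma$. A $\Sigma$-structure $\mathcal{A}$ maps sorts $\sigma$ to a non-empty set $\sigma^{\mathcal{A}}$ and function symbols $f$ with rank $\sigma_1 \times \dots \times \sigma_n \to \sigma$ to a corresponding function $f^{\mathcal{A}} : \sigma_1^{\mathcal{A}} \times \dots \times \sigma_n^{\mathcal{A}} \to \sigma^{\mathcal{A}}$. A theory $T$ is given by its signature $\Sigma$ and a class of $\Sigma$-structures, which are also called the models of $T$. A theory fragment of a theory $T$ with signature $\Sigma$ is a subset of $\Sigma$-formulas.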

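The theory of arrays $T_A$ is parameterized by a signature defining other sort symbols that can be used for index and element sorts. The signature of $T_A$ contains in addition a sort symbol $\mathit{Array}$ of arity 2. For each index sort $\sigma_I$ and element sort $\sigma_E$, the sort $\mathit{Array}(\sigma_I, \sigma_E)$ represents the sort of arrays with the given index and element sort. The signature contains a select function $\cdot[\cdot]$ of rank $\mathit{Array}(\sigma_I, \sigma_E) \times \sigma_I \to \sigma_E$ and a store function $\cdot\langle\cdot \lhd \cdot\rangle$ of rank $\mathit{Array}(\sigma_I, \sigma_E) \times \sigma_I \times \sigma_E \to \mathit{Array}(\sigma_I, \sigma_E)$. For array $a$, index $i$, and element $v$, $a[i]$ returns the element stored in $a$ at index $i$, and $a\langle i \lhd v\rangle$ returns a fresh array that is a copy of $a$ where the element at $i$ is replaced by $v$. The models of $T_A$ fix the meaning of $\mathit{Array}$, $\cdot[\cdot]$, and $\cdot\langle\cdot \lhd \cdot\rangle$ as follows: arrays are interpreted as functions from indices to elements, select is function application, and store is a pointwise update.

$$\mathit{Array}(\sigma_I, \sigma_E)^{\mathcal{A}} = \left(\sigma_I^{\mathcal{A}} \to \sigma_E^{\mathcal{A}}\right), \qquad (a[i])^{\mathcal{A}} = a^{\mathcal{A}}(i^{\mathcal{A}}), \qquad (a\langle i \lhd v\rangle)^{\mathcal{A}}(j) = \begin{cases} v^{\mathcal{A}} & \text{if } j = i^{\mathcal{A}} \\ a^{\mathcal{A}}(j) & \text{otherwise} \end{cases}$$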
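The intended reading of select and store can be illustrated with a small executable sketch. It is not taken from the paper: the names `const_array`, `select`, and `store`, and the representation of an array as a default value plus finitely many overrides, are assumptions chosen only to keep the example short.

```python
from typing import Dict, Tuple

# Hypothetical representation: (default value, finite overrides).
# Such a pair denotes a total function from integer indices to integers.
Array = Tuple[int, Dict[int, int]]


def const_array(default: int) -> Array:
    """Array that holds `default` at every index."""
    return (default, {})


def select(a: Array, i: int) -> int:
    """Return the element stored in `a` at index `i`."""
    default, overrides = a
    return overrides.get(i, default)


def store(a: Array, i: int, v: int) -> Array:
    """Return a fresh array equal to `a` except that index `i` holds `v`."""
    default, overrides = a
    updated = dict(overrides)  # copy, so the original array is left untouched
    updated[i] = v
    return (default, updated)


a = const_array(0)
b = store(a, 3, 7)
print(select(b, 3), select(b, 4), select(a, 3))
```

The last line reads the updated index, an untouched index, and the original array, which shows that `store` produces a copy rather than modifying its argument.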

The theory of integers $T_Z$ contains a sort symbol $\mathit{Int}$ of arity 0 as well as the usual arithmetic functions in its signature. Its models define $\mathit{Int}^{\mathcal{A}} = \mathbb{Z}$ and fix the meaning of the arithmetic functions as usual. In the following we use the combined theory $T_{AZ}$, where the signature contains all symbols from $T_A$ and $T_Z$ and the meaning of the theory symbols is defined as above.

An interpolation problem is a pair of formulas $(A, B)$ where the conjunction $A \wedge B$ is unsatisfiable. Given an interpolation problem $(A, B)$, the symbols shared between $A$ and $B$ are called shared, symbols only occurring in $A$ are called $A$-local and symbols only occurring in $B$, $B$-local. We call a term or formula shared if it contains only shared symbols. A Craig interpolant for an interpolation problem $(A, B)$ is a formula $I$ such that (i) $A$ implies $I$ in the theory $T$, (ii) $I \wedge B$ is $T$-unsatisfiable and (iii) $I$ is shared between $A$ and $B$.

A theory fragment $F$ is closed under interpolation if for each interpolation problem $(A, B)$ in $F$ there exists an interpolant in $F$.
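The three conditions on a Craig interpolant can be checked mechanically in a setting where satisfiability is easy to decide. The following sketch uses propositional logic as a stand-in for the theory $T$ and enumerates all truth assignments; the helper names (`assignments`, `is_interpolant`) and the sample formulas are hypothetical and serve only as an illustration of the definition.

```python
from itertools import product


def assignments(symbols):
    """Enumerate all truth assignments over the given symbol names."""
    for values in product([False, True], repeat=len(symbols)):
        yield dict(zip(symbols, values))


def is_interpolant(A, B, I, syms_A, syms_B, syms_I):
    """Check conditions (i)-(iii) by brute force over all assignments."""
    shared = set(syms_A) & set(syms_B)
    all_syms = sorted(set(syms_A) | set(syms_B) | set(syms_I))
    a_implies_i = all(I(m) for m in assignments(all_syms) if A(m))       # (i)
    i_and_b_unsat = not any(I(m) and B(m) for m in assignments(all_syms))  # (ii)
    only_shared = set(syms_I) <= shared                                   # (iii)
    return a_implies_i and i_and_b_unsat and only_shared


# Sample problem: A = p and q, B = (not q) and r; the only shared symbol is q.
A = lambda m: m["p"] and m["q"]
B = lambda m: (not m["q"]) and m["r"]

print(is_interpolant(A, B, lambda m: m["q"], ["p", "q"], ["q", "r"], ["q"]))
print(is_interpolant(A, B, lambda m: m["p"], ["p", "q"], ["q", "r"], ["p"]))
```

The first candidate satisfies all three conditions. The second one is implied by the first formula but mentions an $A$-local symbol, so condition (iii) rejects it.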

3 The Array Property Fragment

The theory of arrays is often used in software verification to model heap memory or arrays in C, for instance. The quantifier-free fragment of the theory of arrays is decidable, and there exist interpolation methods for an extension of the quantifier-free fragment of the theory of arrays [4, 12]. However, in many verification tasks, it is not sufficient to consider quantifier-free formulas. For instance, to prove correctness of a sorting algorithm, it is necessary to reason about all elements of an array within a given range.

Quantifiers are challenging, as general first-order logic with theories is undecidable. The same holds for the theory of arrays with quantifiers. The decidable array property fragment was introduced by Bradley et al. in [3] as a subset of the $\exists^*\forall^*$-fragment of the theory of arrays and integers.

An array property is a quantified formula of the form

$$\forall \bar{i}.\ \varphi_I(\bar{i}) \rightarrow \varphi_V(\bar{i})$$

where $\bar{i}$ are index variables, and the form of the index guard $\varphi_I$ and the value constraint $\varphi_V$ are restricted syntactically as follows: the index guard consists of ground literals and literals containing quantified variables of the form

$$i \le t, \qquad t \le i, \qquad i \le j, \qquad i = t, \qquad i = j$$

where $t$ is a ground term and $i, j$ are quantified variables. The literals can be connected by $\wedge$ and $\vee$, but the literals containing quantified variables must not appear negated. The value constraint consists of ground literals and literals containing quantified index variables only within array reads $a[i]$. Array reads on quantified variables must not be nested, i.e., $a[i]$ must not occur in arguments of the select function $\cdot[\cdot]$ or the store function $\cdot\langle\cdot \lhd \cdot\rangle$.

The array property fragment of $T_{AZ}$ consists of all Boolean combinations of array properties and quantifier-free formulae. It is sufficiently expressive to describe properties such as sortedness or equality of arrays in a given range. In the original presentation, one quantifier alternation is allowed, i.e., an array property can be existentially quantified. However, the resulting fragment is not closed under negation, a property that is crucial for interpolation-based invariant generation.
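As an illustration of the two properties mentioned above, sortedness and equality of arrays in a range can be written as array properties in the following way. The array names $a$, $b$ and the ground bounds $\ell$, $u$ are placeholders chosen for this sketch.

```latex
% Sortedness of a between the ground bounds \ell and u:
% index guard  \ell \le i \wedge i \le j \wedge j \le u,  value constraint  a[i] \le a[j]
\forall i, j.\ (\ell \le i \wedge i \le j \wedge j \le u) \rightarrow a[i] \le a[j]

% Equality of a and b between \ell and u:
% index guard  \ell \le i \wedge i \le u,  value constraint  a[i] = b[i]
\forall i.\ (\ell \le i \wedge i \le u) \rightarrow a[i] = b[i]
```

In both formulas the quantified variables occur in the guard only in comparisons with each other or with ground terms, and in the value constraint only as the index of an array read, so neither formula needs index arithmetic or nested reads.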

We follow here the more restricted definition in [2] that does not allow alternation of quantifiers. Under this restriction, the fragment is closed under negation. However, it is not closed under interpolation, i.e., there exist formulas within the array property fragment for which no interpolant in the array property fragment exists, as we show in the next section. (If one allows $\exists^*\forall^*$-formulae, one might find an interpolant in this form, but the negation does not lie in the fragment and hence cannot be used for inductiveness checks.)

4 Interpolation in the Array Property Fragment

In the following, we show that the array property fragment as defined above is not closed under interpolation by giving a concrete counterexample.

Example 1

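Consider an interpolation problem $(A, B)$ in the array property fragment in which both $A$ and $B$ are universally quantified formulas over the same two shared arrays, $A$ additionally refers to an $A$-local index, and $B$ refers to a $B$-local index.

Clearly, $A \wedge B$ is unsatisfiable: instantiating the quantified variable in $A$ with the $B$-local index yields a literal that contradicts the literal obtained by instantiating the quantified variable in $B$ with the $A$-local index. Possible interpolants are $A$ with its local index existentially quantified, and the negation of $B$ with its local index existentially quantified.

Neither lies in the array property fragment due to quantifier alternation.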

In fact, there does not exist an interpolant within the array property fragment. Intuitively, the only shared terms that can be used in the interpolant are the two arrays. There is no shared term to capture the two local indices. One can only obtain index terms by using a quantifier, which leads to quantifier alternation.
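To make the shape of such a problem concrete, the following is one instance that fits the description in this example. It is a sketch under stated assumptions, not necessarily the exact formulas of the paper: the array names $a$, $b$, the index names $k$, $l$, and the choice of $<$ and $\le$ are all hypothetical.

```latex
% Assumed instance: a, b shared arrays; k is A-local, l is B-local.
A:\quad \forall i.\ b[i] < a[k]
\qquad\qquad
B:\quad \forall i.\ a[i] \le b[l]

% Instantiating i with l in A gives  b[l] < a[k];
% instantiating i with k in B gives  a[k] \le b[l];  together they are contradictory.

% Two interpolants, both with quantifier alternation:
I_1:\quad \exists k.\ \forall i.\ b[i] < a[k]
\qquad\qquad
I_2:\quad \forall l.\ \exists i.\ b[l] < a[i]
```

Here $I_1$ is $A$ with its local index existentially quantified, and $I_2$ is the negation of $\exists l.\ B$. Both mention only the shared arrays, but each needs one quantifier alternation.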

It is well known that the quantifier-free fragment of the theory of arrays is not closed under interpolation, but that a suitable extension allows for interpolation [4]. This can be achieved by adding the function $\mathit{diff}$ that returns some index where two arrays differ. The meaning of the function is not fixed; it only needs to satisfy the property

$$a \neq b \rightarrow a[\mathit{diff}(a, b)] \neq b[\mathit{diff}(a, b)] \qquad (4)$$

However, in the above example the $\mathit{diff}$ function is not sufficient to define an interpolant without quantifier alternation. The informal reason is that it is only required to return some index where two arrays differ (if they differ). Hence, to capture the correct index, more information on the index, expressible in shared terms, would be needed. The following theorem proves this formally.
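The weakness of the requirement on the difference function can be seen in a small sketch. It simplifies the setting by assuming arrays over a finite index range, represented as equal-length Python lists; the function names and the choice to return the smallest differing index are assumptions made for illustration, since the property only asks for some differing index.

```python
from typing import Sequence


def diff(a: Sequence[int], b: Sequence[int]) -> int:
    """Return some index where `a` and `b` differ (here: the smallest one)."""
    assert len(a) == len(b), "sketch assumes a common finite index range"
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    return 0  # arbitrary value: nothing is required when the arrays are equal


def satisfies_diff_property(a: Sequence[int], b: Sequence[int]) -> bool:
    """Check: if a != b, then a and b disagree at index diff(a, b)."""
    if list(a) == list(b):
        return True
    d = diff(a, b)
    return a[d] != b[d]


print(diff([1, 2, 3], [1, 5, 4]))
print(satisfies_diff_property([1, 2, 3], [1, 5, 4]))
```

An implementation that returned the largest differing index would satisfy the property equally well. A formula built from this function therefore cannot rely on which differing index it obtains.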

Theorem 4.1

The array property fragment is not closed under interpolation.

Proof

Consider again Example 1. In the following, we show that there does not exist an interpolant without quantifier alternation for this interpolation problem. The proof follows the idea of Drews and Albarghouthi for showing a similar result for EPR [8].

We construct a sequence of models $M_0, M_1, M_2, \dots$ that are alternatingly models for $A$ and for $B$ and show that no formula in the array property fragment that contains only shared terms can distinguish between models for $A$ and models for $B$ from a certain point on.

For , let

For any even number $n$ (including 0), $M_n$ is a model for $A$, and for any odd number $n$, $M_n$ is a model for $B$: the maximum value of both shared arrays is stored at index $n$. If $n$ is even, the value read at the $A$-local index is greater than all values in the other array. If $n$ is odd, the value read at the $B$-local index is greater than or equal to all values in the other array.

Note that $\max$ and $\min$ in the semantics of $\mathit{diff}$ are well-defined: in the second case, the maximum is taken over a non-empty set of negative integers, which hence has a maximum element. In the last case, the minimum is taken over a non-empty set of non-negative integers, which has a minimal element. By definition, if $a \neq b$, $\mathit{diff}(a, b)$ returns an index where $a$ and $b$ differ. Hence, property (4) is satisfied.
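The alternating behaviour of such a model sequence can be reproduced with a small experiment. Everything in the following sketch is hypothetical: the array contents are chosen for illustration and are not the paper's model definitions, and the two formulas are assumed to have the shape $A: \forall i.\ b[i] < a[k]$ and $B: \forall i.\ a[i] \le b[l]$. In the $n$-th model both arrays are constant below index 0 and above index $n$, so a finite index window is enough to evaluate the quantifiers exactly.

```python
def clamp(i: int, n: int) -> int:
    """Map every index to the range 0..n, so arrays are constant outside it."""
    return min(max(i, 0), n)


def a(n: int, i: int) -> int:
    """Hypothetical array a in model n: 1, 1, 3, 3, 5, 5, ... cut off at n."""
    j = clamp(i, n)
    return j + 1 if j % 2 == 0 else j


def b(n: int, i: int) -> int:
    """Hypothetical array b in model n: 0, 1, 2, 3, ... cut off at n."""
    return clamp(i, n)


def window(n: int) -> range:
    """Indices -1..n+1 cover every value the two arrays take in model n."""
    return range(-1, n + 2)


def a_side_holds(n: int) -> bool:
    """Is there an index k with b[i] < a[k] for all i?"""
    return any(all(b(n, i) < a(n, k) for i in window(n)) for k in window(n))


def b_side_holds(n: int) -> bool:
    """Is there an index l with a[i] <= b[l] for all i?"""
    return any(all(a(n, i) <= b(n, l) for i in window(n)) for l in window(n))


for n in range(6):
    print(n, a_side_holds(n), b_side_holds(n))
```

For even `n` only the first check succeeds and for odd `n` only the second one, while the contents of each array at indices up to `n` never change in later models.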

We will now show that any formula in the array property fragment only containing shared symbols cannot distinguish between $M_n$ and $M_m$ for large $n$ and $m$. Therefore it cannot be an interpolant of $(A, B)$: an interpolant evaluates to true in $M_n$ for all even $n$ and to false for all odd $n$.

We first consider quantifier-free terms and distinguish between array-valued terms $a$ and scalar terms $t$. The latter includes also Boolean terms. The following properties hold:

  1. For all shared scalar terms $t$, there exists a number $n_t$ such that for all models $M_m$ with $m \ge n_t$, the value of $t$ does not change, i.e., $t^{M_m} = t^{M_{n_t}}$.

  2. For all shared array terms $a$, there exists a number $n_a$ such that

    1. the prefix of the array does not change for subsequent models, i.e., for all $m$ with $m \ge n_a$, and for all indices $i$ with $i \le m$, $a^{M_{m+1}}(i) = a^{M_m}(i)$, and

    2. for all $m$ with $m \ge n_a$, the suffix of the array repeats the element at index $m$, i.e., for all $i$ with $i \ge m$, it holds $a^{M_m}(i) = a^{M_m}(m)$.

Note that if a property holds for one number $n$, then it also holds for all larger numbers by definition.

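We show Properties 1 and 2 by induction over the term $t$ and $a$, respectively.

Base case: For integer constants, Property 1 holds for all $n_t \ge 0$. For the two shared array constants, Property 2 holds for all $n_a \ge 0$.

Induction step: For function applications and predicates that do not involve arrays, e.g. $f(t_1, t_2)$, we assume that Property 1 holds for $t_1$ and $t_2$ with $n_{t_1}$ and $n_{t_2}$. Then for $n = \max(n_{t_1}, n_{t_2})$, Property 1 holds for $f(t_1, t_2)$.

For a select term $a[t]$, we assume that Property 1 holds for $t$ with $n_t$ and Property 2 holds for $a$ with $n_a$. Then Property 1 holds for $a[t]$ with $n = \max(n_a, n_t, t^{M_{n_t}})$: for all $m$ with $m \ge n$, we derive $a^{M_m}(t^{M_m}) = a^{M_n}(t^{M_n})$ from Property 2(a) since $t^{M_n} \le n$ and for $m \ge n_t$, we have $t^{M_m} = t^{M_{n_t}}$.

For a term $\mathit{diff}(a, b)$, we assume that Property 2 holds for $a$ and $b$ with $n_a$ and $n_b$, respectively, and thus, it holds for both $a$ and $b$ with $n = \max(n_a, n_b)$. If for some $m$ with $m \ge n$, $a^{M_m} \neq b^{M_m}$, then because of Property 2(b), $a$ and $b$ differ at some index $i$ with $i \le m$. By definition, $\mathit{diff}(a, b)^{M_m}$ is such an index. Because of Property 2(a) and the definition of $\mathit{diff}$, for $m'$ with $m' \ge m$, we have $\mathit{diff}(a, b)^{M_{m'}} = \mathit{diff}(a, b)^{M_m}$. If for all $m$ with $m \ge n$, $a^{M_m} = b^{M_m}$, then the value of $\mathit{diff}(a, b)$ is fixed by definition, and Property 1 holds for $\mathit{diff}(a, b)$ with $n$.

For a store term $a\langle t \lhd v\rangle$, we assume that Property 1 holds for $t$ and $v$ with $n_t$ and $n_v$, and Property 2 holds for $a$ with $n_a$. With $n = \max(n_a, n_t, n_v, t^{M_{n_t}} + 1)$, Property 2 holds for $a\langle t \lhd v\rangle$: (a) holds for $i \neq t^{M_m}$ because it holds for $a$, and for $i = t^{M_m}$, (a) follows from Property 1 for $v$. Property 2(b) holds for $a\langle t \lhd v\rangle$ because it holds for $a$ and $t^{M_m} < m$.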

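Next we show that for an array property $\varphi = \forall \bar{i}.\ \varphi_I(\bar{i}) \rightarrow \varphi_V(\bar{i})$ over shared symbols, there exists a number $n_\varphi$ such that the value of $\varphi$ stays constant, i.e., $\varphi^{M_m} = \varphi^{M_{n_\varphi}}$ for $m \ge n_\varphi$.

First, we collect all subterms of $\varphi$ that do not contain $\bar{i}$ and compute the corresponding numbers $n_t$ or $n_a$ that satisfy Property 1 or 2, respectively. Let $n_0$ be the maximum of all these numbers. For all ground terms $t$ in the index guard $\varphi_I$, compute $t^{M_{n_0}}$ and let $n$ be the maximum of $n_0$ and all numbers $t^{M_{n_0}}$.

If for all $m \ge n$, $\varphi$ is true in $M_m$, the value of $\varphi$ obviously stays constant in all subsequent models.

If there exists $m \ge n$ such that $\varphi$ is false in $M_m$, there is some $\bar{i}$ such that $\varphi_I(\bar{i}) \rightarrow \varphi_V(\bar{i})$ is false under $M_m$. If we replace all components of $\bar{i}$ that are greater than $m$ by $m$, the formula is still false. The index guard is still true: Let $i_k$ be greater than $m$. As $i_k$ is greater than the maximum of all values for the ground terms in the index guard, literals of the form $i_k \le t$ or $i_k = t$ must evaluate to false in $M_m$. For a literal $t \le i_k$, the replacement $t \le m$ will evaluate to true because of the definition of $n$. If a literal $i_k \le i_l$ or $i_k = i_l$ evaluates to true in $M_m$, then $i_l$ is also greater than $m$, so we replace both $i_k$ and $i_l$ by $m$ and the resulting literal holds trivially. A literal $i_l \le i_k$ with $i_l \le m$ stays true when $i_k$ is replaced by $m$. This means that by replacing $i_k$ by $m$ we can only obtain more literals that evaluate to true in the index guard. The evaluation under $M_m$ of the value constraint is unchanged because of Property 2(b) for the arrays in the select terms containing $i_k$, and Property 1 for the other terms. Note that quantified variables cannot appear in store or $\mathit{diff}$ terms, because array reads must not be nested.

Thus, we can assume that all components of $\bar{i}$ are smaller or equal to $m$. Then, for all $m'$ with $m' \ge m$, $\varphi_I(\bar{i}) \rightarrow \varphi_V(\bar{i})$ is still false in $M_{m'}$. This follows from Property 2(a) for the arrays in select terms, and Property 1 for all other terms. Thus, for all $m'$ with $m' \ge m$, $\varphi$ is constantly false.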

Every formula in the array property fragment over shared symbols is a Boolean combination of array properties and quantifier-free formulas. For each of these formulas, there exists a number from which on the formulas do not change their value. If we choose the maximum $n$ of all these numbers, the whole formula does not change its value between $M_n$ and $M_{n+1}$, and as one of $M_n$ and $M_{n+1}$ is a model for $A$ and the other is a model for $B$, the formula cannot be an interpolant for $(A, B)$. ∎
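The stabilisation argument can be observed on a hypothetical model sequence of the kind used in the proof. The sketch below is an illustration only: the array contents are assumptions and not the paper's definitions, and the three formulas are examples picked for this experiment. It evaluates two alternation-free array properties over shared arrays and, for contrast, one formula with quantifier alternation.

```python
def clamp(i: int, n: int) -> int:
    """Map every index to the range 0..n, so arrays are constant outside it."""
    return min(max(i, 0), n)


def a(n: int, i: int) -> int:
    """Hypothetical array a in model n: 1, 1, 3, 3, 5, 5, ... cut off at n."""
    j = clamp(i, n)
    return j + 1 if j % 2 == 0 else j


def b(n: int, i: int) -> int:
    """Hypothetical array b in model n: 0, 1, 2, 3, ... cut off at n."""
    return clamp(i, n)


def window(n: int, bound: int = 0) -> range:
    """Finite index window that covers 0..n and the ground bound used in a guard."""
    return range(-1, max(n, bound) + 2)


def sorted_a(n: int) -> bool:
    """Array property: forall i, j. i <= j -> a[i] <= a[j]."""
    w = window(n)
    return all(a(n, i) <= a(n, j) for i in w for j in w if i <= j)


def tail_bounded(n: int) -> bool:
    """Array property: forall i. 3 <= i -> a[i] <= b[i]."""
    return all(a(n, i) <= b(n, i) for i in window(n, 3) if 3 <= i)


def with_alternation(n: int) -> bool:
    """Not an array property: exists k. forall i. b[i] < a[k]."""
    w = window(n)
    return any(all(b(n, i) < a(n, k) for i in w) for k in w)


for name, f in [("sorted_a", sorted_a), ("tail_bounded", tail_bounded),
                ("with_alternation", with_alternation)]:
    print(name, [f(n) for n in range(8)])
```

In this experiment the first property is true in every model, the second changes its value a few times and is false from the fifth model on, and only the formula with quantifier alternation keeps switching between true and false.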

5 Conclusion

The array property fragment is an expressive but still decidable fragment for the theory of arrays and therefore useful for checking program correctness. In this paper, we have shown that the array property fragment is not closed under interpolation. Our proof also shows that, in contrast to the quantifier-free fragment, the $\mathit{diff}$ function does not establish closure under interpolation for the array property fragment. Thus, it is not sufficient to restrict the solver to the array property fragment if one wants to use interpolants to derive new invariants used in later solver queries.

As our example shows, the problem is that the array property fragment cannot express interpolants of simple quantified formulas. Therefore, for interpolation-based software model checking, a more expressive fragment is needed. One possible candidate is the almost uninterpreted fragment [9]. This fragment allows for quantifier alternation and can express the interpolants in our example. However, this fragment is undecidable. One can achieve decidability by using the finite almost uninterpreted fragment. However, this fragment also does not have nice closure properties: it is not even closed under conjunction.

References

  • [1] Clark Barrett, Pascal Fontaine, and Cesare Tinelli. The SMT-LIB Standard: Version 2.6. Technical report, Department of Computer Science, The University of Iowa, 2017. Available at www.SMT-LIB.org.
  • [2] Aaron R. Bradley and Zohar Manna. The calculus of computation - decision procedures with applications to verification. Springer, 2007.
  • [3] Aaron R. Bradley, Zohar Manna, and Henny B. Sipma. What’s decidable about arrays? In VMCAI, volume 3855 of Lecture Notes in Computer Science, pages 427–442. Springer, 2006.
  • [4] Roberto Bruttomesso, Silvio Ghilardi, and Silvio Ranise. Quantifier-free interpolation of a theory of arrays. Logical Methods in Computer Science, 8(2), 2012.
  • [5] Franck Cassez, Christian Müller, and Karla Burnett. Summary-based inter-procedural analysis via modular trace refinement. In FSTTCS, volume 29 of LIPIcs, pages 545–556. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 2014.
  • [6] Franck Cassez, Anthony M. Sloane, Matthew Roberts, Matthew Pigram, Pongsak Suvanpong, and Pablo González de Aledo Marugán. Skink: Static analysis of programs in LLVM intermediate representation - (competition contribution). In TACAS (2), volume 10206 of Lecture Notes in Computer Science, pages 380–384, 2017.
  • [7] Matthias Dangl, Stefan Löwe, and Philipp Wendler. CPAchecker with support for recursive programs and floating-point arithmetic - (competition contribution). In TACAS, volume 9035 of Lecture Notes in Computer Science, pages 423–425. Springer, 2015.
  • [8] Samuel Drews and Aws Albarghouthi. Effectively propositional interpolants. In CAV (2), volume 9780 of Lecture Notes in Computer Science, pages 210–229. Springer, 2016.
  • [9] Yeting Ge and Leonardo Mendonça de Moura. Complete instantiation for quantified formulas in satisfiability modulo theories. In CAV, volume 5643 of Lecture Notes in Computer Science, pages 306–320. Springer, 2009.
  • [10] Matthias Heizmann, Yu-Fang Chen, Daniel Dietsch, Marius Greitschus, Jochen Hoenicke, Yong Li, Alexander Nutz, Betim Musa, Christian Schilling, Tanja Schindler, and Andreas Podelski. Ultimate automizer and the search for perfect interpolants - (competition contribution). In TACAS (2), volume 10806 of Lecture Notes in Computer Science, pages 447–451. Springer, 2018.
  • [11] Thomas A. Henzinger, Ranjit Jhala, Rupak Majumdar, and Kenneth L. McMillan. Abstractions from proofs. In POPL, pages 232–244. ACM, 2004.
  • [12] Jochen Hoenicke and Tanja Schindler. Efficient interpolation for the theory of arrays. In IJCAR, volume 10900 of Lecture Notes in Computer Science, pages 549–565. Springer, 2018.
  • [13] Dejan Jovanovic and Bruno Dutertre. Property-directed k-induction. In FMCAD, pages 85–92. IEEE, 2016.
  • [14] Andrey Kupriyanov and Bernd Finkbeiner. Causal termination of multi-threaded programs. In CAV, volume 8559 of Lecture Notes in Computer Science, pages 814–830. Springer, 2014.
  • [15] Kenneth L. McMillan. An interpolating theorem prover. Theor. Comput. Sci., 345(1):101–121, 2005.
  • [16] Kenneth L. McMillan. Lazy abstraction with interpolants. In CAV, volume 4144 of Lecture Notes in Computer Science, pages 123–136. Springer, 2006.
  • [17] Alexander Nutz, Daniel Dietsch, Mostafa Mahmoud Mohamed, and Andreas Podelski. ULTIMATE KOJAK with memory safety checks - (competition contribution). In TACAS, volume 9035 of Lecture Notes in Computer Science, pages 458–460. Springer, 2015.
  • [18] Greta Yorsh and Madanlal Musuvathi. A combination method for generating interpolants. In CADE, volume 3632 of Lecture Notes in Computer Science, pages 353–368. Springer, 2005.