Verifying Bit-vector Invertibility Conditions in Coq (Extended Abstract)

08/26/2019 ∙ by Burak Ekici, et al. ∙ Leopold Franzens Universität Innsbruck, The University of Iowa, Stanford University

This work is a part of an ongoing effort to prove the correctness of invertibility conditions for the theory of fixed-width bit-vectors, which are used to solve quantified bit-vector formulas in the Satisfiability Modulo Theories (SMT) solver CVC4. While many of these were proved in a completely automatic fashion for any bit-width, some were only proved for bit-widths up to 65, even though they are being used to solve formulas over arbitrary bit-widths. In this paper we describe our initial efforts in proving a subset of these invertibility conditions in the Coq proof assistant. We describe the Coq library that we use, as well as the extensions that we introduced to it.

1 Introduction

Reasoning logically about bit-vectors is useful for many applications in hardware and software verification. While Satisfiability Modulo Theories (SMT) solvers are able to reason about bit-vectors of fixed width, they currently require all widths to be expressed concretely (by a numeral) in their input formulas. For this reason, they cannot be used to prove properties of bit-vector operators that are parametric in the bit-width such as, for instance, the associativity of bit-vector concatenation. Proof assistants such as Coq [14], which have direct support for dependent types, are better suited for such tasks.

Bit-vector formulas that are parametric in the bit-width arise in the verification of parametric Boolean functions and circuits (see, e.g., [9]). In our case, we are mainly interested in parametric lemmas that are relevant to internal techniques of SMT solvers for the theory of fixed-width bit-vectors. Such techniques are developed a priori for every possible bit-width, even though they are applied on a particular bit-width. Meta-reasoning about the correctness of such solvers then requires bit-width independent reasoning.

An example of the latter kind, which is the focus of the current paper, is the notion of invertibility conditions [10] as a basis for a quantifier-instantiation technique to reason about the satisfiability of quantified bit-vector formulas. For a trivial case of an invertibility condition, consider the equation $x + s \approx t$, where $x$, $s$ and $t$ are variables of the same bit-vector sort, and $+$ is bit-vector addition. In the terminology of Niemetz et al. [10], this equation is "invertible" for $x$, i.e., solvable for $x$, for any value of $s$ and $t$. A general solution is represented by the term $t - s$. Since the solution is unconditional, the invertibility condition for $x + s \approx t$ is simply the universally true formula $\top$. The formula stating this fact, referred to here as an invertibility equivalence, is $\top \Leftrightarrow \exists x.\, x + s \approx t$, a valid formula in the theory of fixed-width bit-vectors for any bit-width for $x$, $s$ and $t$. In contrast, the equation $x \cdot s \approx t$ is not always invertible for $x$ ($\cdot$ stands for bit-vector multiplication). A necessary and sufficient condition for invertibility is $(-s \mid s) \mathbin{\&} t \approx t$, meaning that the invertibility equivalence $(-s \mid s) \mathbin{\&} t \approx t \Leftrightarrow \exists x.\, x \cdot s \approx t$ is valid for any bit-width for $x$, $s$ and $t$ [10]. Notice that this invertibility condition involves the operations $\&$, $\mid$ and $-$, and not $\cdot$, which occurs in the literal itself. Niemetz et al. [10] provide a total of 160 invertibility conditions covering several bit-vector operators for both equations and inequations. However, they were able to verify, using SMT solvers, the corresponding invertibility equivalences only for concrete bit-widths up to 65, given the reasoning limitations of SMT solvers mentioned earlier. A recent paper by Niemetz et al. [11] addresses this challenge by translating these invertibility equivalences into quantified formulas over the combined theory of non-linear integer arithmetic and uninterpreted functions, a theory supported by a number of SMT solvers. While partially successful, this approach failed to verify over a quarter of the invertibility equivalences.
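To make the multiplication example concrete, the following Coq sketch checks the invertibility equivalence for bit-vector multiplication by brute force at a few fixed widths. It is only an illustration under stated assumptions: bit-vectors are encoded as binary natural numbers (type N) rather than with the library described later, and the names all_bvs, bv_neg, ic_mul, solvable_mul and check_width are hypothetical helpers introduced here. Like the SMT-based checks, it covers only concrete widths and says nothing about arbitrary ones.

```coq
Require Import NArith List Bool.
Open Scope N_scope.

(* All values of an n-bit vector, encoded as the numbers 0 .. 2^n - 1. *)
Definition all_bvs (n : nat) : list N :=
  map N.of_nat (seq 0%nat (Nat.pow 2%nat n)).

(* Two's complement negation at width n. *)
Definition bv_neg (n : nat) (s : N) : N :=
  (2 ^ N.of_nat n - s) mod 2 ^ N.of_nat n.

(* Candidate invertibility condition for x * s = t:  (-s | s) & t = t. *)
Definition ic_mul (n : nat) (s t : N) : bool :=
  N.land (N.lor (bv_neg n s) s) t =? t.

(* Brute-force search for a solution x of x * s = t at width n. *)
Definition solvable_mul (n : nat) (s t : N) : bool :=
  existsb (fun x => (x * s) mod 2 ^ N.of_nat n =? t) (all_bvs n).

(* The condition and the existence of a solution agree for all s, t. *)
Definition check_width (n : nat) : bool :=
  forallb (fun s =>
    forallb (fun t => Bool.eqb (ic_mul n s t) (solvable_mul n s t))
            (all_bvs n))
          (all_bvs n).

(* Exhaustive check for the concrete widths 1, 2, 3 and 4 only. *)
Example widths_1_to_4 : forallb check_width (seq 1%nat 4%nat) = true.
Proof. vm_compute. reflexivity. Qed.
```

Each additional width has to be enumerated separately, which is the limitation that motivates a proof parametric in the bit-width.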

In this work, we approach the task of verifying the invertibility equivalences proposed in [10] by proving them interactively with the Coq proof assistant. We extend a rich Coq library for bit-vectors we developed in previous work [7] with additional operators and lemmas to facilitate the task of verifying invertibility equivalences for arbitrary bit-widths, and prove a representative subset of them. Our results offer evidence that proof assistants can support automated theorem provers in meta-verification tasks.

Our Coq library models the theory of fixed-width bit-vectors adopted by the SMT-LIB 2 standard [2] (the SMT-LIB 2 theory is defined at http://www.smt-lib.org/theories.shtml). It represents bit-vectors as lists of Booleans. The bit-vector type is dependent on a positive integer that represents the length of the list. Underneath the dependent representation is a simply-typed or raw bit-vector type with a size function which is used to explicitly state facts on the length of the list. A functor translates an instance of a raw bit-vector along with specific information about its size into a dependently-typed bit-vector. For this work, we extended the library with the arithmetic right shift operation and the unsigned weak less-than and greater-than predicates and proved 18 invertibility equivalences. We initially proved these equivalences over raw bit-vectors and then used these proofs when proving the invertibility equivalences over dependent bit-vectors, as we explain in Section 4.

The remainder of this paper is organized as follows. After some technical preliminaries in Section 2, we provide an overview of invertibility conditions for the theory of fixed-width bit-vectors in Section 3 and discuss previous attempts to verify them. Then, in Section 4, we describe the bit-vector Coq library and our current extensions to it. In Section 5, we outline how we used the extended library to prove the correctness of a representative subset of invertibility equivalences. We conclude in Section 6 with directions for future work.

2 Preliminaries

We assume the usual terminology of many-sorted first-order logic with equality (see, e.g., [8] for more details). We denote equality by $\approx$, and use $x \not\approx y$ as an abbreviation for $\neg(x \approx y)$. The signature $\Sigma_{BV}$ of the SMT-LIB 2 theory of fixed-width bit-vectors includes a unique sort for each positive integer $n$, which we denote here by $\sigma_{[n]}$. For every positive integer $n$ and a bit-vector of width $n$, the signature includes a constant of sort $\sigma_{[n]}$ in $\Sigma_{BV}$ representing that bit-vector, which we denote as a binary string of length $n$. The function and predicate symbols of $\Sigma_{BV}$ are as described in the SMT-LIB 2 standard. Formulas of $\Sigma_{BV}$ are built from variables (sorted by the sorts $\sigma_{[n]}$), bit-vector constants, and the function and predicate symbols of $\Sigma_{BV}$, along with the usual logical connectives and quantifiers. We write $\psi[x_1, \ldots, x_n]$ to represent a formula whose free variables are from the set $\{x_1, \ldots, x_n\}$.

The semantics of $\Sigma_{BV}$-formulas is given by interpretations that extend a single many-sorted first-order structure so that the domain of every sort $\sigma_{[n]}$ is the set of bit-vectors of bit-width $n$, and the function and predicate symbols are interpreted as specified by the SMT-LIB 2 standard. A $\Sigma_{BV}$-formula is valid in the theory of fixed-width bit-vectors if it evaluates to true in every such interpretation.

In what follows, we denote by $\Sigma_0$ the sub-signature of $\Sigma_{BV}$ containing the predicate symbols $<_u$, $>_u$, $\le_u$, $\ge_u$ (corresponding to strong and weak unsigned comparisons between bit-vectors, respectively), as well as the function symbols $+$ (bit-vector addition), $\&$, $\mid$, $\sim$ (bit-wise conjunction, disjunction and negation), $-$ (2's complement unary negation), and $\ll$, $\gg$ and $\gg_a$ (left shift, and logical and arithmetical right shifts). We also denote by $\Sigma_1$ the extension of $\Sigma_0$ with the predicate symbols $<_s$, $>_s$, $\le_s$, and $\ge_s$ (corresponding to strong and weak signed comparisons between bit-vectors, respectively), as well as the function symbols $-$, $\cdot$, $\div$, $\bmod$ (corresponding to subtraction, multiplication, division and remainder), and $\circ$ (concatenation). We use $0$ to represent the bit-vectors composed of all $0$-bits. Its numerical or bit-vector interpretation should be clear from context. Using bit-wise negation $\sim$, we can express the bit-vectors composed of all $1$-bits by $\sim 0$.

3 Invertibility Conditions And Their Verification

Many applications rely on bit-precise reasoning and thus can be modeled using the SMT-LIB 2 theory of fixed-width bit-vectors. For certain applications, such as verification of safety properties for programs, quantifier-free reasoning is not enough, and the combination of bit-precise reasoning with the ability to handle quantifiers is needed. Niemetz et al. present a technique to solve quantified bit-vector formulas, which is based on invertibility conditions [10]. An invertibility condition for a variable $x$ in a $\Sigma_{BV}$-literal $\ell[x, s, t]$ is a formula $IC[s, t]$ such that $\forall s.\, \forall t.\, IC[s, t] \Leftrightarrow \exists x.\, \ell[x, s, t]$ is valid in the theory of fixed-width bit-vectors. For example, consider the bit-vector literal $x \mathbin{\&} s \approx t$ where $x$, $s$ and $t$ are distinct variables of the same sort. The invertibility condition for $x$ given in [10] is $t \mathbin{\&} s \approx t$.

Niemetz et al. [10] define invertibility conditions for a representative set of literals $\ell$ having a single occurrence of $x$, that involve the bit-vector operators of $\Sigma_1$. The soundness of the technique proposed in that work relies on the correctness of the invertibility conditions. Every literal $\ell[x, s, t]$ and its corresponding invertibility condition $IC[s, t]$ induce the invertibility equivalence

$$IC[s, t] \Leftrightarrow \exists x.\, \ell[x, s, t] \qquad (1)$$

The correctness of invertibility equivalences should be verified for all possible sorts for the variables $x$, $s$, $t$ for which the condition is well sorted. More concretely, for the case where $x$, $s$, $t$ are all of sort $\sigma_{[n]}$, say, this means that one needs to prove, for all $n > 0$, the validity of

$$\forall s : \sigma_{[n]}.\ \forall t : \sigma_{[n]}.\ IC[s, t] \Leftrightarrow \exists x : \sigma_{[n]}.\ \ell[x, s, t]$$

This was done in Niemetz et al. [10] using an SMT solver but only for concrete values of $n$ from $1$ to $65$. A proof of Equation 1 that is parametric in the bit-width cannot be done with SMT solvers, since they currently only support the theory of fixed-width bit-vectors, where Equation 1 cannot even be expressed. To overcome this limitation, a later paper by Niemetz et al. [11] suggested a translation from bit-vector formulas with parametric bit-widths to the theory of (non-linear) integer arithmetic with uninterpreted functions. Thanks to this translation, the authors were able to verify, with the aid of SMT solvers for the theory of integer arithmetic with uninterpreted functions, the correctness of 110 out of 160 invertibility equivalences. None of the solvers used in that work were able to prove the remaining equivalences. For those, it then seems appropriate to use a proof assistant, as this allows for more intervention by the user, who can provide crucial intermediate steps. Even for the 110 invertibility equivalences that were proved, the level of confidence achieved by proving them in a proof assistant such as Coq would be greater than that of a verification (without a verified formal proof) by an SMT solver.

In the rest of this paper we describe our initial efforts and future plans for proving the invertibility equivalences, starting with those that were not proved in [11].

4 The Coq Bit-vector Library

In this section, we describe the Coq library we use and the extensions we developed with the goal of formalizing and proving invertibility equivalences. The original library was developed for SMTCoq [7], a Coq plugin that enables Coq to dispatch proofs to external proof-producing solvers. It is used to represent SMT-LIB 2 bit-vectors in Coq. Coq’s own library of bit-vectors [6] was an alternative, but it has only definitions and no lemmas. A more suitable substitute could have been the Bedrock Bit Vectors Library [4] or the SSRBit Library [3]. We chose the SMTCoq library mainly because it was explicitly developed to represent SMT-LIB 2 bit-vectors in Coq and comes with a rich set of lemmas relevant to proving the invertibility equivalences.

The SMTCoq library contains both a simply-typed and a dependently-typed theory of bit-vectors implemented as module types. The former, which we also refer to as a theory of raw bit-vectors, formalizes bit-vectors as Boolean lists while the latter defines a bit-vector as a Coq record, with its size as the parameter, made of two fields: a Boolean list and a coherence condition to ensure that the parameterized size is indeed the length of the given list. The library also implements a functor module from the simply-typed module to the dependently-typed module establishing a correspondence between the two theories. This way, one can first prove a bit-vector property in the context of the simply-typed theory and then map it to its corresponding dependently-typed one via the functor module. Note that while it is possible to define bit-vectors natively as a dependently-typed theory in Coq and prove their properties there, it would be cumbersome and unduly complex to do dependent pattern matching or case analysis over bit-vector instances because of the complications brought by unification in Coq (which is inherently undecidable). One can try to handle such complications as illustrated by Sozeau [13]. However, we found the two-theory approach of Ekici et al. [7] more convenient in practice for our purposes.
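The two-layer idea can be sketched in a few lines of Coq. The code below is a minimal illustration, not the library's actual interface: it uses a plain record and ordinary functions instead of module types and a functor, and the names raw_bv, raw_size, dep_bv, MkDepBv, bits, wf and dep_not are assumptions made for this example. A property is first proved over raw lists and then transferred to the size-indexed record.

```coq
Require Import List Bool NArith.
Import ListNotations.

(* Raw layer: a bit-vector is a plain list of Booleans. *)
Definition raw_bv := list bool.
Definition raw_size (a : raw_bv) : N := N.of_nat (length a).

(* Dependent layer: a record indexed by the width n, holding a raw
   bit-vector and a coherence proof that its size is n. *)
Record dep_bv (n : N) := MkDepBv {
  bits : raw_bv;
  wf : raw_size bits = n
}.

(* Raw facts about bit-wise negation, proved over lists. *)
Lemma raw_not_size : forall a : raw_bv,
  raw_size (map negb a) = raw_size a.
Proof.
  intros a. unfold raw_size. rewrite map_length. reflexivity.
Qed.

Lemma raw_not_involutive : forall a : raw_bv,
  map negb (map negb a) = a.
Proof.
  induction a as [| b a' IH]; simpl.
  - reflexivity.
  - rewrite negb_involutive. rewrite IH. reflexivity.
Qed.

(* Lifted operation: the size lemma supplies the coherence proof. *)
Definition dep_not {n : N} (a : dep_bv n) : dep_bv n :=
  MkDepBv n (map negb (bits n a))
          (eq_trans (raw_not_size (bits n a)) (wf n a)).

(* Lifted property: destruct the record, then reuse the raw lemma. *)
Theorem dep_not_involutive : forall (n : N) (a : dep_bv n),
  bits n (dep_not (dep_not a)) = bits n a.
Proof.
  intros n [l Hl]. exact (raw_not_involutive l).
Qed.
```

The lifted proof performs no induction of its own; all reasoning about the list structure stays in the raw layer, which is the convenience the two-theory approach aims for.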

The library adopts the little-endian notation for bit-vectors, thus following the internal representation of bit-vectors in SMT solvers such as CVC4. This makes arithmetic operations easier to perform since the least significant bit of a bit-vector is the head of the list representing it in the raw theory.

Out of the 11 bit-vector operators and 10 predicates contained in $\Sigma_1$, the library had support for 8 operators and 6 predicates. The supported predicates, however, can be used to express the other 4. The predicate and function symbols that were not directly supported by the library were the weak inequalities $\le_u$, $\ge_u$, $\le_s$, $\ge_s$ and the operators $\gg_a$, $\div$, and $\bmod$. We extended the library with the operator $\gg_a$ and the predicates $\le_u$ and $\ge_u$ and redefined $\ll$ and $\gg$, as explained in Section 5.

We focused on invertibility conditions for literals of the form $x \diamond s \bowtie t$ and $s \diamond x \bowtie t$, where $x$, $s$ and $t$ are variables and $\diamond$ and $\bowtie$ are respectively function and predicate symbols in $\Sigma_0$ (invertibility conditions for such literals were found in [10] for the extended signature $\Sigma_1$). $\Sigma_0$ was chosen as a representative set because it seemed both expressive enough and feasible for proofs in Coq. Such literals, as well as their invertibility conditions, include only operators that are supported by the library (after its extension with $\gg_a$, $\le_u$, and $\ge_u$).

    (* Lowest layer: lexicographic comparison, most significant bit first. *)
    Fixpoint ule_list_big_endian (x y : list bool) :=
      match x, y with
      | nil, nil => true
      | nil, _ => false
      | _, nil => false
      | xi :: x', yi :: y' =>
        ((eqb xi yi) && (ule_list_big_endian x' y')) || ((negb xi) && yi)
      end.

    (* Middle layer: reverse the little-endian lists before comparing. *)
    Definition ule_list (x y: list bool) :=
      (ule_list_big_endian (rev x) (rev y)).

    (* Top layer: only compare bit-vectors of the same size. *)
    Definition bv_ule (a b : bitvector) :=
      if @size a =? @size b then ule_list a b else false.

Figure 1: Definitions of $\le_u$ in Coq.

To demonstrate the intuition and various aspects of the extension of the library, we briefly describe the addition of $\le_u$ (the definition of $\ge_u$ is similar). The relevant Coq definitions are provided in Figure 1. (Both the library and the proofs of invertibility equivalences can be found at https://github.com/ekiciburak/bitvector/tree/pxtp2019. It compiles with coqc-8.9.0.) Like most other operators, $\le_u$ is defined in several layers. The function bv_ule, at the highest layer, ensures that comparisons are between bit-vectors of the same size and then calls ule_list. Since we want to compare bit-vectors starting from their most significant bits and the input lists start instead with the least significant bits (because of the little-endian encoding), ule_list first reverses the two lists. Then it calls ule_list_big_endian, which we consider to be at the lowest layer of the definition. ule_list_big_endian then does a lexicographical comparison of the two lists, starting from the most significant bits.
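The behavior of the two list-level layers can be checked on small inputs. The sketch below restates them so that it is self-contained and leaves out the top layer bv_ule with its size check; the constants one, two and ones are hypothetical 3-bit examples in little-endian order, introduced only for this illustration.

```coq
Require Import List Bool.
Import ListNotations.

(* Lexicographic comparison, most significant bit first. *)
Fixpoint ule_list_big_endian (x y : list bool) : bool :=
  match x, y with
  | nil, nil => true
  | nil, _ => false
  | _, nil => false
  | xi :: x', yi :: y' =>
      (Bool.eqb xi yi && ule_list_big_endian x' y') || (negb xi && yi)
  end.

(* Inputs are little-endian, so reverse them before comparing. *)
Definition ule_list (x y : list bool) : bool :=
  ule_list_big_endian (rev x) (rev y).

(* 3-bit examples, least significant bit at the head of the list. *)
Definition one  : list bool := [true; false; false].
Definition two  : list bool := [false; true; false].
Definition ones : list bool := [true; true; true].

Example one_ule_two : ule_list one two = true.
Proof. reflexivity. Qed.

Example two_not_ule_one : ule_list two one = false.
Proof. reflexivity. Qed.

Example ule_reflexive_on_one : ule_list one one = true.
Proof. reflexivity. Qed.

Example two_ule_ones : ule_list two ones = true.
Proof. reflexivity. Qed.
```

Without the call to rev, the comparison would start from the least significant bit and give the wrong answer for one and two.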

To see why the addition of $\le_u$ to the library is useful, consider, for example, the following parametric lemma, stating that $\sim 0$ is the largest unsigned bit-vector of its type:

$$\forall x : \sigma_{[n]}.\ x \le_u {\sim}0 \qquad (2)$$

When not using this explicit operator, we usually rewrite it as:

$$\forall x : \sigma_{[n]}.\ x <_u {\sim}0 \lor x \approx {\sim}0 \qquad (3)$$

In such cases, since the definitions of $<_u$ and $\approx$ have a similar structure to the one in Figure 1, we strip down the layers of $<_u$ and $\approx$ separately, whereas using $\le_u$, we only do this once. Depending on the specific proof at hand, using $\le_u$ is sometimes more convenient for this reason.

5 Proving Invertibility Equivalences in Coq

In this section we provide specific details about proving invertibility equivalences in Coq. In addition to the bit-vector library described in Section 4, in several proofs of invertibility equivalences we benefited from CoqHammer [5], a plug-in that aims at extending the automation in Coq by combining machine learning and automated reasoning techniques in a similar fashion to what is done in Isabelle/HOL [12]. Note that one does not need to install CoqHammer in order to build the bit-vector library, since all the proof reconstruction tactics of CoqHammer are included in it.

The natural representation of bit-vectors in Coq is the dependently-typed representation, and therefore the invertibility equivalences are formulated using this representation. As discussed in Section 4, however, proofs in this representation are composed of proofs over simply-typed bit-vectors, which are easier to reason about. Some conversions between the different representations are then needed to lift a proof over raw bit-vectors to one over dependently-typed bit-vectors.

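For example, Figure 2 includes a proof of the following direction of the invertibility equivalence for $\gg_a$ and $<_u$:

$$\exists x.\ s \gg_a x <_u t \ \Rightarrow\ (s <_u t \lor \neg(s <_s 0)) \land t \not\approx 0 \qquad (4)$$

In the proof, the first block of tactics (from destruct H through cbn in *) transforms the dependent bit-vectors from the goal and the hypotheses into simply-typed bit-vectors. The remaining tactics (from specialize through now exists x) then invoke the corresponding lemma for simply-typed bit-vectors (called InvCond.bvashr_ult2_rtl) along with some simplifications.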

    Theorem bvashr_ult2_rtl : forall (n : N), forall (s t : bitvector n),
      (exists (x : bitvector n), (bv_ult (bv_ashr_a s x) t = true)) ->
      (((bv_ult s t = true) \/ (bv_slt s (zeros n)) = false) /\
       (bv_eq t (zeros n)) = false).
    Proof.
      intros n s t H.
      destruct H as ((x, Hx), H).
      destruct s as (s, Hs).
      destruct t as (t, Ht).
      unfold bv_ult, bv_slt, bv_ashr_a, bv_eq, bv in *. cbn in *.
      specialize (InvCond.bvashr_ult2_rtl n s t Hs Ht); intro STIC.
      rewrite Hs, Ht in STIC. apply STIC.
      now exists x.
    Qed.

Figure 2: A proof of one direction of the invertibility equivalence for $\gg_a$ and $<_u$ using dependent types.

Most of the effort in this project went into proving equivalences over raw bit-vectors. As an illustration, consider the following equivalence over $\ll$ and $>_u$:

$$t <_u {\sim}0 \ll s \ \Leftrightarrow\ \exists x.\ x \ll s >_u t \qquad (5)$$

The left-to-right implication is easy to prove using $\sim 0$ itself as the witness of the existential proof goal and considering the symmetry between $>_u$ and $<_u$. The proof of the right-to-left implication relies on the following lemma:

$$\forall x.\ \forall s.\ x \ll s \le_u {\sim}0 \ll s \qquad (6)$$

From the right side of the equivalence in Equation 5, we get some $x$ for which $x \ll s >_u t$ holds. Flipping the inequality, we have that $t <_u x \ll s$; using this, and transitivity over $<_u$ and $\le_u$, Lemma 6 gives us the left side of the equivalence in Equation 5.
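Written out as a chain, the right-to-left argument takes the following shape. This is a sketch in the notation of Section 2, where $x$ is the witness obtained from the existential and all bit-vectors have the same width.

```latex
\begin{align*}
  x \ll s >_u t
    &\;\Longrightarrow\; t <_u x \ll s
    && \text{(flipping the inequality)} \\
  &\phantom{\;\Longrightarrow\;} x \ll s \le_u {\sim}0 \ll s
    && \text{(Lemma 6)} \\
  t <_u x \ll s \;\wedge\; x \ll s \le_u {\sim}0 \ll s
    &\;\Longrightarrow\; t <_u {\sim}0 \ll s
    && \text{(transitivity of $<_u$ and $\le_u$)}
\end{align*}
```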

As mentioned in Section 4, we have redefined the shift operators $\ll$ and $\gg$ in the library. This was instrumental, for example, in the proof of Equation 6. Figure 3 includes both the original and new definitions of $\ll$. The definitions of $\gg$ are similar. Originally, $\ll$ was defined using the shl_one_bit and the shl_n_bits functions. shl_one_bit shifts the bit-vector to the left by one bit and is repeatedly called by shl_n_bits to complete the shift. The new definition shl_n_bits_a uses mk_list_false, which constructs the necessary list of $0$s, and places it at the beginning of the list using list append (++ in Coq), because of the little-endian encoding; the bits to be shifted from the original bit-vector are retrieved using the firstn function, which is defined in the Coq library for lists. The nat type used in Figure 3 is the Coq representation of Peano natural numbers that has O and S as its two constructors, as depicted in the two branches of the pattern match in shl_n_bits. The theorem at the bottom of Figure 3 allows us to switch between the two definitions when needed. Function bv_shl defines the left shift operation using shl_n_bits whereas bv_shl_a does it using shl_n_bits_a.

    Definition shl_one_bit (a: list bool) :=
      match a with
      | [] => []
      | _ => false :: removelast a
      end.

    Fixpoint shl_n_bits (a: list bool) (n: nat) :=
      match n with
      | O => a
      | S n' => shl_n_bits (shl_one_bit a) n'
      end.

    Definition shl_n_bits_a (a: list bool) (n: nat) :=
      if (n <? length a)%nat then
        mk_list_false n ++ firstn (length a - n) a
      else mk_list_false (length a).

    Theorem bv_shl_eq: forall (a b : bitvector), bv_shl a b = bv_shl_a a b.

Figure 3: Various definitions of $\ll$.
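The two list-level shift definitions can be compared on a concrete input. The sketch below restates them in self-contained form; the definition given for mk_list_false (a list of n copies of false) is an assumption about the library, and eleven is a hypothetical 4-bit test value in little-endian order.

```coq
Require Import List Bool Arith.
Import ListNotations.

(* Assumed definition: a list of n copies of false. *)
Fixpoint mk_list_false (n : nat) : list bool :=
  match n with
  | O => []
  | S n' => false :: mk_list_false n'
  end.

(* Original style: shift by one bit, repeated n times. *)
Definition shl_one_bit (a : list bool) : list bool :=
  match a with
  | [] => []
  | _ => false :: removelast a
  end.

Fixpoint shl_n_bits (a : list bool) (n : nat) : list bool :=
  match n with
  | O => a
  | S n' => shl_n_bits (shl_one_bit a) n'
  end.

(* New style: n zeros followed by the low (length a - n) bits of a. *)
Definition shl_n_bits_a (a : list bool) (n : nat) : list bool :=
  if Nat.ltb n (length a)
  then mk_list_false n ++ firstn (length a - n) a
  else mk_list_false (length a).

(* 11 as a 4-bit vector, least significant bit first. *)
Definition eleven : list bool := [true; true; false; true].

(* 11 << 2 = 44, which is 12 modulo 16. *)
Example shl_old : shl_n_bits eleven 2 = [false; false; true; true].
Proof. reflexivity. Qed.

Example shl_new : shl_n_bits_a eleven 2 = [false; false; true; true].
Proof. reflexivity. Qed.

(* Shifting by more than the width yields all zeros in both versions. *)
Example shl_past_width : shl_n_bits eleven 5 = shl_n_bits_a eleven 5.
Proof. reflexivity. Qed.
```

Because the new version is phrased with ++ and firstn, goals about it can be attacked with existing list lemmas rather than by induction on the shift amount.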

The new definition uses firstn and ++, over which many necessary properties are already proven in the standard library. This benefits us in manual proofs, and in calls to CoqHammer, since the latter is able to use lemmas from the imported libraries to prove the goals that are given to it. Using this representation, proving Equation 6 reduces to proving Lemmas bv_ule_1_firstn and bv_ule_pre_append, shown in Figure 4. The proof of bv_ule_pre_append benefited from the property app_comm_cons from the standard list library of Coq, while firstn_length_le was useful in reducing the goal of bv_ule_1_firstn to Coq's equivalent of Equation 2. The statements of the properties mentioned from the standard library are also shown in Figure 4. mk_list_true creates a bit-vector that represents $\sim 0$, of the length given to it as input, and bv_ule is the representation of $\le_u$ in the bit-vector library. bv_ule has output type bool (and so we equate terms in which it occurs to true), while the properties from the standard library are stated in Prop. We also have two definitions for $\gg_a$, and a proof of their equivalence (as done for the other shift operators).

    Lemma bv_ule_1_firstn : forall (n : nat) (x : bitvector),
      (n < length x)%nat ->
      bv_ule (firstn n x) (firstn n (mk_list_true (length x))) = true.

    Lemma bv_ule_pre_append : forall (x y z : bitvector),
      bv_ule x y = true -> bv_ule (z ++ x) (z ++ y) = true.

    Theorem app_comm_cons : forall (x y:list A) (a:A),
      a :: (x ++ y) = (a :: x) ++ y.

    Lemma firstn_length_le: forall l:list A, forall n:nat,
      n <= length l -> length (firstn n l) = n.

Figure 4: Examples of lemmas used in proofs of invertibility equivalences.

Table 1 summarizes the results of proving invertibility equivalences for invertibility conditions in the signature $\Sigma_0$. For each invertibility equivalence, the table marks whether it was successfully verified in Coq but not in [11], in [11] but not in Coq, using both approaches, or with neither. We successfully proved all invertibility equivalences over $\approx$ that are expressible in $\Sigma_0$, including 4 that were not proved in [11]. For the rest of the predicates, we focused only on the 8 invertibility equivalences that were not proved in [11], and succeeded in proving 7 of them. Overall, these results strictly improve the results of [11], as we were able to prove 11 additional invertibility equivalences in Coq. Taking into account our work together with [11], only one invertibility equivalence for the restricted signature is not fully proved yet, although one direction of that equivalence was successfully proved both in Coq and in [11].

Table 1: Proved invertibility equivalences in $\Sigma_0$ where $\bowtie$ ranges over the given predicate symbols.

6 Conclusion and Future Work

We have described our work in progress on verifying bit-vector invertibility conditions in the Coq proof assistant, which required extending a bit-vector library in Coq. The most immediate direction for future work is proving more of the invertibility equivalences supported by the bit-vector library. In addition, we plan to extend the library so that it supports the full syntax in which invertibility conditions are expressed, namely $\Sigma_1$. We expect this to be useful also for verifying properties about bit-vectors in other applications.

References

  • [2] Clark Barrett, Aaron Stump & Cesare Tinelli (2010): The SMT-LIB Standard: Version 2.0. In A. Gupta & D. Kroening, editors: Proceedings of the 8th International Workshop on Satisfiability Modulo Theories (Edinburgh, UK).
  • [3] Arthur Blot, Pierre-Evariste Dagand, & Julia Lawall: Bit Sequences and Bit Sets Library. Available at https://github.com/pedagand/ssrbit.
  • [4] Tej Chajed, Haogang Chen, Adam Chlipala, Joonwon Choi, Andres Erbsen, Jason Gross, Samuel Gruetter, Frans Kaashoek, Alex Konradi, Gregory Malecha, Duckki Oe, Murali Vijayaraghavan, Nickolai Zeldovich & Daniel Ziegler: Bedrock Bit Vectors Library. Available at https://github.com/mit-plv/bbv.
  • [5] Lukasz Czajka & Cezary Kaliszyk (2018): Hammer for Coq: Automation for Dependent Type Theory. J. Autom. Reasoning 61(1-4), pp. 423–453, doi:10.1007/s10817-018-9458-4.
  • [6] Jean Duprat: Library Coq.Bool.Bvector. Available at https://coq.inria.fr/library/Coq.Bool.Bvector.html.
  • [7] Burak Ekici, Alain Mebsout, Cesare Tinelli, Chantal Keller, Guy Katz, Andrew Reynolds & Clark Barrett (2017): SMTCoq: A Plug-In for Integrating SMT Solvers into Coq. In: Proceedings of 29th International Conference on Computer Aided Verification (CAV 2017), Lecture Notes in Computer Science 10427, Springer, pp. 126–133, doi:10.1007/s10703-012-0163-3
  • [8] Herbert B. Enderton (2001): Chapter TWO - First-Order Logic. In Herbert B. Enderton, editor: A Mathematical Introduction to Logic (Second Edition), second edition, Academic Press, Boston, pp. 67–181, doi:10.1016/B978-0-08-049646-7.50008-4.
  • [9] Aarti Gupta & Allan L. Fisher (1993): Representation and Symbolic Manipulation of Linearly Inductive Boolean Functions. In: Proceedings of the 1993 IEEE/ACM International Conference on Computer-aided Design, ICCAD ’93, IEEE Computer Society Press, Los Alamitos, CA, USA, pp. 192–199. Available at http://dl.acm.org.stanford.idm.oclc.org/citation.cfm?id=259794.259827.
  • [10] Aina Niemetz, Mathias Preiner, Andrew Reynolds, Clark Barrett & Cesare Tinelli (2018): Solving Quantified Bit-Vectors Using Invertibility Conditions. In: Proceedings of 30th International Conference on Computer Aided Verification (CAV 2018), pp. 236–255, doi:10.1007/978-3-319-96142-2_16.
  • [11] Aina Niemetz, Mathias Preiner, Andrew Reynolds, Yoni Zohar, Clark Barrett & Cesare Tinelli (2019): Towards Bit-Width-Independent Proofs in SMT Solvers. To appear in the proceedings of CADE-27.
  • [12] Tobias Nipkow, Lawrence C Paulson & Markus Wenzel (2002): Isabelle/HOL: a proof assistant for higher-order logic. Lecture Notes in Computer Science 2283, Springer Science & Business Media, doi:10.1007/3-540-45949-9_6
  • [13] Matthieu Sozeau (2010): Equations: A Dependent Pattern-Matching Compiler. In: Proceedings of the 1st International Conference on Interactive Theorem Proving (ITP 2010), pp. 419–434, doi:10.1007/978-3-642-14052-5_29.
  • [14] The Coq development team (2019): The Coq Proof Assistant Reference Manual Version 8.9. Available at https://coq.inria.fr/distrib/current/refman/.