 # Partial Quantifier Elimination With Learning

We consider a modification of the Quantifier Elimination (QE) problem called Partial QE (PQE). In PQE, only a small part of the formula is taken out of the scope of quantifiers. The appeal of PQE is that many verification problems, e.g. equivalence checking and model checking, reduce to PQE, and the latter is much easier to solve than complete QE. Earlier, we introduced a PQE algorithm based on the machinery of D-sequents. A D-sequent is a record stating that a clause is redundant in a quantified CNF formula in a specified subspace. To make this algorithm efficient, it is important to reuse learned D-sequents. However, reusing D-sequents is not as easy as reusing conflict clauses in SAT-solvers because redundancy is a structural rather than a semantic property. (So, a clause is redundant only in some subset of logically equivalent CNF formulas.) We address this problem by introducing a modified definition of D-sequents that facilitates their safe reuse. We also present a new PQE algorithm that proves redundancy of target clauses one by one rather than all at once as in the previous PQE algorithm. We experimentally show the improved performance of this algorithm. We demonstrate that reusing D-sequents makes the new PQE algorithm even more powerful.


## I Introduction

Many verification problems reduce to Quantifier Elimination (QE). So, any progress in QE is of great importance. In this paper, we consider propositional CNF formulas with existential quantifiers. Given formula $\exists X[F(X,Y)]$ where $X$ and $Y$ are sets of variables, the QE problem is to find a quantifier-free formula $F^*(Y)$ such that $F^* \equiv \exists X[F]$. Building a practical QE algorithm is a tall order. In addition to the sheer complexity of QE, a major obstacle here is that the size of formula $F^*$ can be prohibitively large.

There are at least two ways of making QE easier to solve. First, one can consider only instances of QE where $|Y|$ is small, which limits the size of the resulting quantifier-free formula. In particular, if $|Y| = 0$, QE reduces to the satisfiability problem (SAT). This line of research featuring very efficient methods of model checking based on SAT [3, 32, 7] has gained great popularity. Another way to address the complexity of QE suggested in  is to perform partial QE (PQE). Given formula $\exists X[F_1(X,Y) \wedge F_2(X,Y)]$, the PQE problem is to find a quantifier-free formula $F_1^*(Y)$ such that $F_1^* \wedge \exists X[F_2] \equiv \exists X[F_1 \wedge F_2]$. We will say that formula $F_1^*$ is obtained by taking $F_1$ out of the scope of quantifiers.

The appeal of PQE is that in many verification problems, one can replace complete QE with PQE, which can be dramatically more efficient than QE (e.g. when the part $F_1$ taken out of the scope of quantifiers is much smaller than the remaining part $F_2$). Importantly, using quantifiers gives PQE extra semantic power over SAT-based methods. For instance, an equivalence checker based on PQE  enables construction of short resolution proofs of equivalence for a very broad class of structurally similar circuits. These proofs are based on the notion of clause redundancy¹ in a quantified formula and thus cannot be generated by a traditional SAT-solver. In , we show that a PQE-solver can check if the reachability diameter exceeds a specified value. So, as opposed to a pure SAT-solver, it can turn bounded model checking  into unbounded model checking. Importantly, no generation of an inductive invariant is required by the method of . In Appendix A, we recall the two applications of PQE above².

¹ A clause is a disjunction of literals. So a CNF formula $F$ is a conjunction of clauses: $C_1 \wedge \dots \wedge C_k$. We also consider $F$ as the set of clauses $\{C_1, \dots, C_k\}$. Clause $C$ is redundant in $\exists X[F]$ if $\exists X[F] \equiv \exists X[F \setminus \{C\}]$.

² All appendices will be available in a technical report published by arXiv.

If $F_1^*(Y)$ is a solution to the PQE problem above, it is implied by $F_1 \wedge F_2$. So $F_1^*$ can be obtained by resolving clauses of $F_1 \wedge F_2$. However, a PQE-solver based on resolution alone cannot efficiently address the following “termination problem”. Suppose one builds $F_1^*$ incrementally, adding one clause at a time. When can one terminate this procedure claiming that $F_1^*$ is a solution to the PQE problem (and so $F_1^* \wedge \exists X[F_2] \equiv \exists X[F_1 \wedge F_2]$)?

In [20, 22] we approached the termination problem above using the following observation. Assume for the sake of simplicity that every clause of $F_1$ contains at least one variable of $X$. Then, if $F_1^*$ is a solution, $F_1$ can be dropped from $\exists X[F_1 \wedge F_1^* \wedge F_2]$. Thus, $F_1^*$ becomes a solution as soon as it makes the clauses of $F_1$ redundant. In , we introduced a PQE-solver called DS-PQE (DS stands for “D-Sequent”). DS-PQE is a branching algorithm that, in addition to deriving new clauses and adding them to the formula, generates dependency sequents (D-sequents). A D-sequent is a record saying that a clause is redundant in a specified subspace. DS-PQE branches until proving redundancy³ of target clauses becomes trivial, at which point so-called “atomic” D-sequents are generated. The D-sequents of different branches are merged using a resolution-like operation called join. Upon completing the search tree, DS-PQE derives D-sequents stating redundancy of the clauses of $F_1$.

³ By “proving a clause redundant” we mean “making it redundant by adding new clauses (if necessary) and then proving it redundant”.

DS-PQE has two flaws. First, DS-PQE employs “multi-event” backtracking. Namely, it backtracks only when all clauses of $F_1$ are proved redundant in the current subspace. (This is different from a SAT-solver that backtracks as soon as just one clause of the formula is falsified.) The intuition here is that multi-event backtracking leads to building very deep and thus very large search trees. Second, DS-PQE does not reuse D-sequents derived in different branches. The problem here is that redundancy is a structural rather than a semantic property. So, a clause redundant in formula $\exists X[F]$ may not be redundant in $\exists X[F']$ where $F'$ is logically equivalent to $F$ (whereas a semantic property holds for all equivalent formulas). So, reusing a D-sequent is not as easy as reusing a clause learned by a SAT-solver.

In this paper, we address both flaws of DS-PQE. First, we present a new PQE algorithm called DS-PQE+ that employs single-event backtracking. At any given moment, DS-PQE+ proves redundancy of only one clause. Once this goal is achieved, it picks a new clause to prove redundant. Second, we modify the definition of D-sequents given in . A PQE-algorithm that employs the new type of D-sequents (e.g. DS-PQE+) can safely reuse them.

Our contribution is as follows. First, we give a new definition of D-sequents that facilitates their reuse (Section IV). Second, we redefine atomic D-sequents and the join operation to accommodate the new definition of D-sequents (Sections V and VI). Third, we present a new algorithm for PQE that employs single-event backtracking (Sections VII and VIII). Fourth, we provide experimental results showing the benefit of single-event backtracking and D-sequent reuse (Section IX).

## II A Simple Example

In this section, we present a simple example of performing PQE by deriving D-sequents. A D-sequent of  is a record stating redundancy of clause in in subspace (where is an assignment to variables of ). Let be a formula where , , , , . Consider the PQE problem of taking out of the scope of quantifiers. Below we solve this problem by proving redundant.

In subspace , clauses are unit (i.e. one literal is unassigned, the rest are falsified). After assigning , to satisfy , the clause is falsified. Using the standard conflict analysis  one derives a conflict clause . Adding to makes redundant in subspace . So the D-sequent equal to holds where and .

In subspace , the clause is “blocked” at . That is no clause of is resolvable with on in subspace because is satisfied by (see Subsection V-C). So is redundant in formula and the D-sequent equal to holds where . D-sequents and are examples of so-called atomic D-sequents. They are derived when proving clause redundancy is trivial (see Section V). One can produce a new D-sequent where by “joining” and at (see Subsection VI). This D-sequent states the unconditional redundancy of in . So, . Since implies , then . So is a solution to our PQE problem.

## III Basic Definitions

In this paper, we consider only propositional CNF formulas. In the sequel, when we say “formula” without mentioning quantifiers we mean a quantifier-free CNF formula.

###### Definition 1

Let $F$ be a CNF formula and $X$ be a subset of variables of $F$. We will refer to $\exists X[F]$ as an $\exists$CNF formula.

###### Definition 2

Let $F$ be a CNF formula. $\mathit{Vars}(F)$ denotes the set of variables of $F$ and $\mathit{Vars}(\exists X[F])$ denotes $\mathit{Vars}(F) \setminus X$.

###### Definition 3

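Let $V$ be a set of variables. An assignment $\vec{q}$ to $V$ is a mapping $V' \rightarrow \{0,1\}$ where $V' \subseteq V$. We will denote the set of variables assigned in $\vec{q}$ as $\mathit{Vars}(\vec{q})$. We will denote as $\vec{q} \subseteq \vec{r}$ the fact that a) $\mathit{Vars}(\vec{q}) \subseteq \mathit{Vars}(\vec{r})$ and b) every variable of $\mathit{Vars}(\vec{q})$ has the same value in $\vec{q}$ and $\vec{r}$.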

###### Definition 4

Let $C$ be a clause, $H$ be a formula that may have quantifiers, and $\vec{q}$ be an assignment. $C_{\vec{q}} \equiv 1$ if $C$ is satisfied by $\vec{q}$; otherwise it is the clause obtained from $C$ by removing all literals falsified by $\vec{q}$. $H_{\vec{q}}$ denotes the formula obtained from $H$ by replacing every clause $C$ with $C_{\vec{q}}$.
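The restriction of a clause and of a formula to an assignment, as described in Definition 4, can be sketched in Python. The encoding is an assumption made for illustration and is not part of the paper: literals are DIMACS-style integers, a clause is a `frozenset` of literals, an assignment is a `dict` from variable to `bool`, and the names `restrict_clause` and `restrict_formula` are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, List, Optional

Clause = FrozenSet[int]        # assumed encoding: 3 means x3, -3 means NOT x3
Assignment = Dict[int, bool]   # variable -> assigned value


def restrict_clause(clause: Clause, assignment: Assignment) -> Optional[Clause]:
    """Return None if the clause is satisfied (it reduces to the constant 1),
    otherwise the clause without the literals falsified by the assignment."""
    remaining = []
    for lit in clause:
        var, wanted = abs(lit), lit > 0
        if var in assignment:
            if assignment[var] == wanted:
                return None        # satisfied literal: whole clause is true
            continue               # falsified literal: drop it
        remaining.append(lit)      # unassigned literal: keep it
    return frozenset(remaining)


def restrict_formula(clauses: Iterable[Clause], assignment: Assignment) -> List[Clause]:
    """Restrict every clause; satisfied clauses are dropped, which is
    equivalent to replacing them with the constant 1 in a conjunction."""
    restricted = []
    for clause in clauses:
        reduced = restrict_clause(clause, assignment)
        if reduced is not None:
            restricted.append(reduced)
    return restricted
```

An empty `frozenset` returned by `restrict_clause` corresponds to a clause falsified by the assignment.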

###### Definition 5

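Let $G, H$ be formulas that may have quantifiers. We say that $G, H$ are equivalent, written $G \equiv H$, if for all assignments $\vec{q}$ where $\mathit{Vars}(\vec{q}) \supseteq \mathit{Vars}(G) \cup \mathit{Vars}(H)$, we have $G_{\vec{q}} = H_{\vec{q}}$.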

###### Definition 6

The Quantifier Elimination (QE) problem for formula $\exists X[F(X,Y)]$ is to find a formula $F^*(Y)$ such that $F^* \equiv \exists X[F]$.

###### Definition 7

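The Partial QE (PQE) problem of taking $F_1$ out of the scope of quantifiers in $\exists X[F_1(X,Y) \wedge F_2(X,Y)]$ is to find formula $F_1^*(Y)$ such that $F_1^* \wedge \exists X[F_2] \equiv \exists X[F_1 \wedge F_2]$.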

###### Remark 1

From now on, we will use $X$ and $Y$ to denote sets of quantified and non-quantified variables respectively. We will assume that variables denoted by $x_i$ and $y_i$ are in $X$ and $Y$ respectively. Using $X, Y$ in a quantifier-free formula $F(X,Y)$ implies that in the context of QE/PQE, $X$ and $Y$ specify the quantified and non-quantified variables respectively.

###### Definition 8

Let $\exists X[F]$ be an $\exists$CNF formula. A clause $C$ of $F$ is called an $X$-clause if $\mathit{Vars}(C) \cap X \neq \emptyset$.

###### Definition 9

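Let $F$ be a CNF formula and $G \subseteq F$ and $G \neq \emptyset$. The clauses of $G$ are redundant in $F$ if $F \equiv (F \setminus G)$. The clauses of $G$ are redundant in $\exists X[F]$ if $\exists X[F] \equiv \exists X[F \setminus G]$. Note that $F \equiv (F \setminus G)$ implies $\exists X[F] \equiv \exists X[F \setminus G]$ but the opposite is not true.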
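For tiny formulas, redundancy of a clause in a quantified formula can be checked by brute-force enumeration. The sketch below is only an illustration of Definition 9 under an assumed encoding (DIMACS-style integer literals, clauses as `frozenset`s); the function names are hypothetical, and the enumeration is exponential, so it is not the method used by the algorithms of this paper.

```python
from itertools import product


def satisfies(clauses, assignment):
    """True if the complete assignment satisfies every clause."""
    return all(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses)


def exists_x(clauses, x_vars, y_assignment):
    """Evaluate the quantified formula (exists X)[F] under a complete
    assignment to the non-quantified variables."""
    x_list = sorted(x_vars)
    for values in product([False, True], repeat=len(x_list)):
        full = dict(y_assignment)
        full.update(zip(x_list, values))
        if satisfies(clauses, full):
            return True
    return False


def redundant_in_quantified(clauses, target, x_vars, y_vars):
    """Check (exists X)[F] == (exists X)[F minus {target}] by enumerating
    all assignments to the non-quantified variables."""
    rest = [c for c in clauses if c != target]
    y_list = sorted(y_vars)
    for values in product([False, True], repeat=len(y_list)):
        y_assignment = dict(zip(y_list, values))
        if exists_x(clauses, x_vars, y_assignment) != exists_x(rest, x_vars, y_assignment):
            return False
    return True
```

Calling the function with an empty set of quantified variables checks redundancy in the quantifier-free formula. For the single clause `{1, 2}` with variable 1 quantified, the clause is redundant in the quantified formula but not in the quantifier-free one, which matches the closing note of Definition 9.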

## IV Dependency Sequents (D-sequents)

In this section, we modify the definition of D-sequents introduced in . In Subsection IV-A, we explain the reason for such a modification. The new definition is given in Subsection IV-B. Finally, in Subsections IV-C and IV-D we discuss the topic of reusing single and multiple D-sequents.

### IV-A Motivating example

Let formula contain two identical -clauses and . The presence of makes redundant and vice versa. So, D-sequents and hold where . Denote them as and respectively. (Here, we use the old definition of D-sequents given in .) and state that and are redundant in individually. Using and together (to remove both and from ) is incorrect because it involves circular reasoning.

The problem here is that redundancy is a structural property. So, the redundancy of in does not imply that of in even though . The definition of a D-sequent given in  does not help to address the problem above. This definition states redundancy of a clause only with respect to formula . (This makes it hard to reuse D-sequents and is the reason why the PQE-solver introduced in  does not reuse D-sequents). We address this problem by adding a structural constraint to the definition of a D-sequent. It specifies a subset of formulas where a D-sequent holds and so identifies the situations where this D-sequent may not hold. Adding structural constraints to D-sequents and makes them mutually exclusive (see Example 1 below).

### IV-B Definition of D-sequents

###### Definition 10

Let be an formula and   be an assignment to . Let be an -clause of and be a subset of . A dependency sequent (D-sequent) has the form . It states that clause is redundant in every formula logically equivalent to where .

###### Definition 11

The assignment and formula above are called the conditional and the structure constraint of the D-sequent respectively. We will call , where , a member formula of . We will say that a D-sequent specified by holds if it states redundancy of according to Definition 10 (i.e. if is correct). We will say that is applicable to a formula if the latter is a member formula of . Otherwise, is called inapplicable to .

The structure constraint of Definition 10 specifies a subset of formulas logically equivalent to where the clause is redundant. From a practical point of view, the presence of influences the order in which -clauses can be proved redundant. Proving an -clause of redundant and removing it from renders the D-sequent inapplicable to the modified formula (i.e. ). Thus, using implies that will be proved redundant after .

###### Example 1

Consider the example introduced in Subsection IV-A. In terms of Definition 10, the D-sequent looks like where , (because the presence of clause is used to prove redundant). Similarly, the D-sequent looks like where . D-sequents and are mutually exclusive: using to remove from as a redundant clause renders inapplicable and vice versa.

###### Remark 2

We will abbreviate D-sequent to if is known from the context.

### IV-C Reusing a single D-sequent

Let be a D-sequent specified by . We will say that is active in subspace for formula if

• and

• is applicable to (see Definition 11)

The activation of means that it can be safely reused (i.e. can be dropped in the subspace as redundant⁴ in ).

⁴ Redundancy of a clause in subspace does not trivially imply its redundancy in subspace , i.e. in a smaller subspace (see Appendix B).

An applicable D-sequent equal to is called unit under assignment if all value assignments of but one are met in . Suppose, for instance, and contains but is not assigned in . Then is unit. Adding the assignment to , activates , which indicates that is redundant in the subspace . So, a unit D-sequent can be used like a unit clause in Boolean Constraint Propagation (BCP) of a SAT-solver. Namely, one can use to derive the “deactivating” assignment as a direction to a subspace where is not proved redundant yet.
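The analogy with unit clauses can be sketched as follows. This covers only the conditional of a D-sequent; checking applicability against the structure constraint is assumed to be done separately. The representation (assignments as `dict`s from variable to `bool`) and the name `classify_conditional` are assumptions made for illustration.

```python
def classify_conditional(conditional, current):
    """Compare the conditional of an applicable D-sequent with the current
    assignment.

    Returns ("active", None) if every assignment of the conditional is met,
    ("unit", (var, value)) if exactly one assignment is unmet and its variable
    is still unassigned, where (var, value) is the deactivating assignment,
    and ("other", None) in all remaining cases."""
    unmet = [var for var, val in conditional.items() if current.get(var) != val]
    if not unmet:
        return ("active", None)
    if len(unmet) == 1 and unmet[0] not in current:
        var = unmet[0]
        # The deactivating assignment gives the variable the opposite value,
        # pointing to a subspace where redundancy is not yet proved.
        return ("unit", (var, not conditional[var]))
    return ("other", None)
```

The deactivating assignment plays the role of the implied assignment that BCP derives from a unit clause.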

### IV-D Reusing a set of D-sequents

In Example 1, we described D-sequents that cannot be active together. Below, we introduce a condition under which a set of D-sequents can be active together.

###### Definition 12

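Assignments $\vec{q}'$ and $\vec{q}''$ are called compatible if every variable of $\mathit{Vars}(\vec{q}') \cap \mathit{Vars}(\vec{q}'')$ is assigned the same value in $\vec{q}'$ and $\vec{q}''$.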

###### Definition 13

Let be an formula. Let be D-sequents specified by ,, respectively. They are called consistent if a) every pair ,, is compatible and b) there is an order on such that obtained after using D-sequents is a member formula of , .

The item b) above means that can be active together if there is an order following which one guarantees the applicability of every D-sequent. (The D-sequents and of Example 1 are inconsistent because such an order does not exist. Applying one D-sequent makes the other inapplicable.) Definition 13 specifies a sufficient condition for a set of D-sequents to be active together in a subspace where , . If this condition is met, can be safely removed from in the subspace (see ).
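The consistency condition can be sketched for small sets of D-sequents. Several assumptions are made here and are not taken from the paper: clauses are opaque identifiers (so that identical clauses stay distinct), a D-sequent is a hypothetical triple `(conditional, constraint, target)`, applicability is read as "the target and every clause of the structure constraint are still present in the formula", and the order is found by trying all permutations.

```python
from itertools import permutations


def compatible(q1, q2):
    """Definition 12: shared variables have the same value in both assignments."""
    return all(q2[var] == val for var, val in q1.items() if var in q2)


def consistent(formula, dsequents):
    """Check pairwise compatibility of conditionals and search for an order in
    which every D-sequent is still applicable when it is used.

    formula   -- iterable of clause identifiers
    dsequents -- list of (conditional, constraint, target) triples (assumed form)
    """
    conditionals = [q for q, _, _ in dsequents]
    for i in range(len(conditionals)):
        for j in range(i + 1, len(conditionals)):
            if not compatible(conditionals[i], conditionals[j]):
                return False
    for order in permutations(dsequents):
        remaining = set(formula)
        for _, constraint, target in order:
            if target not in remaining or not set(constraint) <= remaining:
                break                     # this D-sequent is inapplicable here
            remaining.discard(target)     # use it: drop the redundant clause
        else:
            return True                   # every D-sequent was applicable
    return False
```

With two D-sequents whose structure constraints each contain the target of the other, as in Example 1, no order works and the function returns `False`.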

## V Atomic D-sequents

In this section, we describe D-sequents called atomic. An atomic D-sequent is generated when proving a clause redundant is trivial . We modify the definitions of  to accommodate the appearance of a structure constraint.

### V-A Atomic D-sequents of the first kind

###### Proposition 1

Let be an formula and and . Let where satisfy . Then the D-sequent holds where and . We will refer to it as an atomic D-sequent of the first kind.

Proofs of all propositions can be found in . Satisfying by an assignment does not require the presence of any other clause of . Hence, the structure constraint of a D-sequent of the first kind is an empty set of clauses.

###### Example 2

Let be an formula and be a clause of . Since is satisfied by assignments and , D-sequents and hold.

### V-B Atomic D-sequents of the second kind

###### Proposition 2

Let be an formula and be an assignment to . Let and be clauses of and be an -clause. Let still be an -clause and imply (i.e. every literal of is in ). Then the D-sequent holds where . We will refer to it as an atomic D-sequent of the second kind.

###### Example 3

Let be an formula. Let and be clauses of . Let . Since implies the D-sequent holds.

### V-C Atomic D-sequents of the third kind

###### Definition 14

Let clauses $C'$, $C''$ have opposite literals of exactly one variable $w \in \mathit{Vars}(C') \cap \mathit{Vars}(C'')$. The clause $C$ having all literals of $C', C''$ but those of $w$ is called the resolvent of $C'$, $C''$ on $w$. The clause $C$ is said to be obtained by resolution on $w$. Clauses $C'$, $C''$ are called resolvable on $w$.

###### Definition 15

A clause $C$ of a CNF formula $F$ is called blocked at variable $w$ if no clause of $F$ is resolvable with $C$ on $w$. The notion of blocked clauses was introduced in .
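Definitions 14 and 15 translate directly into a short Python sketch. The encoding (DIMACS-style integer literals, clauses as `frozenset`s) and the function names are assumptions made for illustration.

```python
def resolvable_on(c1, c2, var):
    """True if c1 and c2 have opposite literals of exactly one variable
    and that variable is var."""
    clashing = {abs(lit) for lit in c1 if -lit in c2}
    return clashing == {var}


def resolvent(c1, c2, var):
    """All literals of c1 and c2 except those of var."""
    if not resolvable_on(c1, c2, var):
        raise ValueError("clauses are not resolvable on this variable")
    return frozenset(lit for lit in c1 | c2 if abs(lit) != var)


def is_blocked(clause, clauses, var):
    """True if no other clause of the formula is resolvable with clause on var."""
    if var not in {abs(lit) for lit in clause}:
        raise ValueError("variable does not occur in the clause")
    return all(not resolvable_on(clause, other, var)
               for other in clauses if other != clause)
```

Note that two clauses clashing on more than one variable are not resolvable, so such a pair does not prevent a clause from being blocked.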

If a clause of an formula is blocked with respect to a quantified variable in a subspace, it is redundant in this subspace. This fact is used by the proposition below.

###### Proposition 3

Let be an formula. Let be an -clause of and . Let be the clauses of resolvable with on variable . Let ,, be consistent D-sequents (see Definition 13). Then the D-sequent holds where = and . We will refer to it as an atomic D-sequent of the third kind.

###### Example 4

Let be an formula. Let be the only clauses of with variable where , , . Since satisfies , the D-sequent holds. Suppose that the D-sequent holds where . Note that the two D-sequents above are consistent. So, from Proposition 3 it follows that the D-sequent holds where . The redundancy of in the subspace is caused by the fact that it is blocked at in this subspace.

## VI Join Operation

In this section, we present a resolution-like operation called join . It can be viewed as a way to generate a new D-sequent from existing ones. In Appendix C, we describe two more ways to generate new D-sequents. We modify the definition of the join operation given in   to accommodate the appearance of a structure constraint.

###### Definition 16

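Let $\vec{q}'$ and $\vec{q}''$ be assignments in which exactly one variable $v \in \mathit{Vars}(\vec{q}') \cap \mathit{Vars}(\vec{q}'')$ is assigned different values. The assignment $\vec{q}$ consisting of all the assignments of $\vec{q}'$ and $\vec{q}''$ but those to $v$ is called the resolvent of $\vec{q}'$, $\vec{q}''$ on $v$. Assignments $\vec{q}'$, $\vec{q}''$ are called resolvable on $v$.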

###### Proposition 4

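Let $\exists X[F]$ be an $\exists$CNF formula. Let D-sequents $(\exists X[F], \vec{q}', H') \rightarrow C$ and $(\exists X[F], \vec{q}'', H'') \rightarrow C$ hold. Let $\vec{q}'$, $\vec{q}''$ be resolvable on $v \in \mathit{Vars}(F)$ and $\vec{q}$ be the resolvent. Then the D-sequent $(\exists X[F], \vec{q}, H) \rightarrow C$ holds where $H = H' \cup H''$.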

###### Definition 17

We will say that the D-sequent of Proposition 4 is produced by joining D-sequents and at variable .

###### Example 5

Let be an formula. Let be clauses of and be an $X$-clause. Let , be D-sequents where , , , . By joining them at , one produces the D-sequent where and .
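The join operation can be sketched in a few lines of Python. The sketch assumes that both D-sequents state redundancy of the same clause, that a D-sequent is represented by a hypothetical pair `(conditional, constraint)`, and that the structure constraint of the result is the union of the two input constraints; the last point is an assumption of this sketch about the form of Proposition 4.

```python
def join(ds1, ds2, var):
    """Join two D-sequents for the same clause at variable var.

    Each D-sequent is a pair (conditional, constraint): a dict from variable
    to bool and a frozenset of clause identifiers."""
    q1, h1 = ds1
    q2, h2 = ds2
    clashing = [v for v in q1 if v in q2 and q1[v] != q2[v]]
    if clashing != [var]:
        raise ValueError("conditionals are not resolvable on this variable")
    # Resolvent of the conditionals: all assignments except those to var.
    merged = {**q1, **q2}
    resolvent = {v: val for v, val in merged.items() if v != var}
    return resolvent, h1 | h2
```

This mirrors clause resolution: the clashing variable disappears from the conditional, so the joined D-sequent holds in a larger subspace than either input.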

## VII Introducing DS-PQE+

In this section, we describe a PQE-algorithm called DS-PQE+. As we mentioned earlier, in contrast to DS-PQE of , DS-PQE+ uses single-event backtracking. Namely, DS-PQE+ proves redundancy of $X$-clauses one by one and backtracks as soon as the current target $X$-clause is proved redundant in the current subspace. Besides, due to the introduction of structure constraints, it is safe for DS-PQE+ to reuse D-sequents. A proof of correctness of DS-PQE+ is given in Appendix H.

### VII-A Main loop of DS-PQE+

The main loop of DS-PQE+ is shown in Fig. 1. DS-PQE+ accepts formulas $F_1$, $F_2$ and set $X$ and outputs formula $F_1^*(Y)$ such that $F_1^* \wedge \exists X[F_2] \equiv \exists X[F_1 \wedge F_2]$. We use symbol ’’ to separate in/out-parameters and in-parameters. For instance, the line means that formulas $F_1$, $F_2$ are changed by DS-PQE+ (via adding/removing clauses) whereas $X$ is not.

DS-PQE+ first initializes the set Ds of learned D-sequents. It starts an iteration of the loop by picking an $X$-clause $C \in F_1$ (line 3). If every clause of $F_1$ contains only variables of $Y$, then $F_1$ is a solution to the PQE problem above (line 4). Otherwise, DS-PQE+ invokes a procedure called PrvRed to prove $C$ redundant. This may require adding new clauses to $F_1$ and $F_2$. In particular, PrvRed may add to $F_1$ new $X$-clauses to be proved redundant in later iterations of the loop. Finally, DS-PQE+ removes $C$ from $F_1$ (line 6).
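The control flow of the main loop can be sketched in Python. This is a skeleton only: the redundancy-proving procedure is passed in as a callable, the names `ds_pqe_plus` and `prv_red` and the clause encoding (DIMACS-style integer literals, clauses as `frozenset`s, formulas as mutable sets) are assumptions, and the correctness of the output depends entirely on the supplied procedure.

```python
def ds_pqe_plus(f1, f2, x_vars, prv_red):
    """Skeleton of the main loop described above.

    f1, f2  -- mutable sets of clauses, changed in place
    x_vars  -- set of quantified variables
    prv_red -- callable(f1, f2, x_vars, ds, target) that proves the target
               redundant, possibly adding clauses to f1 and f2 (hypothetical
               signature)
    Returns f1 once it has no clause with a quantified variable."""
    ds = []  # learned D-sequents, shared across iterations
    while True:
        # Pick a clause of f1 that contains a quantified variable, if any.
        target = next(
            (c for c in f1 if any(abs(lit) in x_vars for lit in c)), None)
        if target is None:
            return f1              # f1 depends only on free variables
        prv_red(f1, f2, x_vars, ds, target)
        f1.discard(target)         # the target is now proved redundant
```

Passing the list of learned D-sequents to every call is what lets a later iteration reuse D-sequents derived while proving an earlier target redundant.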

### VII-B Description of PrvRed procedure

The pseudo-code of PrvRed is shown in Fig. 2. The objective of PrvRed is to prove a clause of $F_1$ redundant. We will refer to this clause as the primary target and denote it as . To prove the primary target redundant, PrvRed, in general, needs to prove redundancy of other $X$-clauses called secondary targets. At any given moment, PrvRed tries to prove redundancy of only one $X$-clause. If a new secondary target is selected, the current target is pushed on a stack to be finished later. How DS-PQE+ manages secondary targets is described in Section VIII. (The lines of code relevant to this part of DS-PQE+ are marked in Fig. 2 and 3 with an asterisk.)

First, PrvRed initializes its variables (lines 1-3). The stack of the target -clauses is initialized to . The current assignment to is initially empty. So is the assignment queue . The current target clause is set to . The main work is done in a while loop that is similar to the main loop of a SAT-solver . In particular, PrvRed uses the notion of a decision level. The latter consists of a decision assignment and implied assignments derived by BCP. Decision level number 0 is an exception: it consists only of implied assignments. BCP derives implied assignments from unit clauses and from unit D-sequents (see Subsection IV-C).

The operation of PrvRed in the while loop can be partitioned into three parts identified by dotted lines. The first part (lines 5-10) starts with checking if the assignment queue is empty. If so, a new assignment is picked (line 6) where and and added to . PrvRed first assigns the variables of for the reason explained in Appendix D describing the decision making of -. So , only if all variables of are assigned. Then PrvRed calls the BCP procedure. If BCP identifies a backtracking condition, PrvRed goes to the second part. (This means that is proved redundant in the subspace . In particular, a backtracking condition is met if BCP falsifies a clause or activates a D-sequent learned earlier.) Otherwise, PrvRed begins a new iteration.

PrvRed starts the second part (lines 11-17) with generating a conflict clause or a new D-sequent for (line 11). Then PrvRed stores in Ds (if it is worth reusing) or adds to . If a clause of is used in generation of , the latter is added to . Otherwise, is added to . If the current target is , one uses “regular” backtracking (lines 13-14, see Subsection VII-E). If the conditional of is empty, PrvRed terminates (line 15). Otherwise, an assignment derived from or is added to (line 16). This derivation is possible because after backtracking, the generated conflict clause (or the D-sequent ) becomes unit. If the assignment above is derived from , PrvRed keeps until the decision level of this assignment is eliminated (even if is not stored in Ds). The third part (lines 18-25) is described in Section VIII.

### VII-C BCP

The main loop of BCP consists of the three parts shown in Fig. 3 by dotted lines. (Parameters and are defined in Fig. 2.) In the first part (lines 2-9), BCP extracts⁵ an assignment from the assignment queue (line 2). It can be a decision assignment or one derived from a clause or D-sequent . Then, BCP updates the current assignment (line 9). Lines 3-8 are explained in Subsection VIII-B.

⁵ As we mentioned above, assignments to variables of $X$ are made after those to variables of $Y$. So the former are kept in the assignment queue until reaching the decision level where all variables of $Y$ are assigned.

In the second part (lines 10-17), BCP first checks if the current target clause is satisfied by . If so, BCP terminates returning the backtracking condition SatTrg (line 11). Then BCP identifies the clauses of satisfied or constrained by (line 12). If a clause becomes unit, BCP stores the assignment derived from this clause in . If a falsified clause is found, BCP terminates (lines 13-14). Otherwise, BCP checks the applicable D-sequents of Ds stating the redundancy of (line 15). If such a D-sequent became unit, the deactivating assignment is added to (see Subsection IV-C). If an active D-sequent is found, BCP terminates (lines 16-17).

Finally, BCP checks if is blocked (lines 18-19). If not, BCP reports that no backtracking condition is met (line 20).
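The clause-related checks of BCP can be sketched as a single scan in Python. This covers only the test for a satisfied target and the detection of unit and falsified clauses; the checks of D-sequents and of blocked clauses are omitted. Apart from `SatTrg`, which is named in the text, the condition labels, the function name and the encoding (DIMACS-style integer literals, clauses as `frozenset`s) are assumptions.

```python
def bcp_scan(clauses, target, assignment):
    """One scan over the clauses under the current assignment.

    Returns ("SatTrg", None) if the target clause is satisfied,
    ("Falsified", clause) if some clause has all literals falsified,
    otherwise ("NoCondition", implied) where implied is a list of
    (variable, value) pairs derived from unit clauses."""
    def lit_value(lit):
        val = assignment.get(abs(lit))
        return None if val is None else val == (lit > 0)

    if any(lit_value(lit) is True for lit in target):
        return ("SatTrg", None)
    implied = []
    for clause in clauses:
        values = {lit: lit_value(lit) for lit in clause}
        if any(v is True for v in values.values()):
            continue                          # clause already satisfied
        free = [lit for lit, v in values.items() if v is None]
        if not free:
            return ("Falsified", clause)      # every literal is falsified
        if len(free) == 1:                    # unit clause
            implied.append((abs(free[0]), free[0] > 0))
    return ("NoCondition", implied)
```

In a full procedure the implied assignments would be placed in the assignment queue and the scan repeated until no new assignment is derived or a backtracking condition is met.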

### VII-D D-sequent generation

When BCP reports a backtracking condition, the Lrn procedure (line 11 of Fig. 2) generates a conflict clause or a D-sequent . Lrn generates a conflict clause when BCP returns a falsified clause and every implied assignment used by Lrn to construct is derived from a clause . Adding to makes the current target clause redundant in subspace . Otherwise⁶, Lrn generates a D-sequent for . The D-sequent is built similarly to a conflict clause . First, Lrn forms an initial D-sequent equal to (unless an existing D-sequent is activated by BCP). The conditional and structure constraint of depend on the backtracking condition returned by BCP. If contains assignments derived at the current decision level, Lrn tries to get rid of them as it is done by a SAT-solver generating a conflict clause. Only instead of resolution, Lrn uses the join operation. Let be the assignment of derived at the current decision level where . If it is derived from a D-sequent equal to , Lrn joins and at to produce a new D-sequent . If is derived from a clause , Lrn joins with the atomic D-sequent of the second kind stating the redundancy of when is falsified. is equal to where is the shortest assignment falsifying and . Lrn keeps joining D-sequents until it builds a D-sequent whose conditional does not contain assignments derived at the current decision level (but may contain the decision assignment of this level). Appendix E gives examples of D-sequents built by Lrn.

⁶ There is one case where Lrn generates a D-sequent and a clause (see Appendix E-E).

### VII-E Regular backtracking

If is the primary target , PrvRed calls the backtracking procedure RegBcktr (line 14 of Fig. 2). If Lrn returns a conflict clause , RegBcktr backtracks to the smallest decision level where is still unit. So an assignment can be derived from . (This is how a SAT-solver with conflict clause learning backtracks .) Similarly, if Lrn returns a D-sequent , RegBcktr backtracks to the smallest decision level where is still unit. So an assignment can be derived from .
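The choice of the backtracking level can be sketched as follows, assuming that the decision level of every assignment falsifying the learned clause (or appearing in the conditional of the learned D-sequent) is known and that exactly one of these assignments belongs to the highest level. The function name is hypothetical.

```python
def backtrack_level(levels):
    """Smallest decision level at which the learned clause (or D-sequent)
    is still unit.

    levels -- non-empty list with the decision level of each assignment
              falsifying a literal of the clause (or listed in the
              conditional of the D-sequent)."""
    if not levels:
        raise ValueError("no assignments: nothing to backtrack from")
    top = max(levels)
    if levels.count(top) != 1:
        raise ValueError("expected exactly one assignment at the top level")
    lower = [lv for lv in levels if lv < top]
    # Keep every lower-level assignment; undo only the top-level one.
    return max(lower) if lower else 0
```

Backtracking any further would undo a second assignment, so the clause or D-sequent would no longer be unit and no assignment could be derived from it.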

## VIII Using Secondary-Target Clauses

The objective of PrvRed (see Fig. 2) is to prove the primary target clause redundant. To achieve this goal, PrvRed may need to prove redundancy of so-called secondary target clauses. In this section, we describe how this is done.

### VIII-A The reason for using secondary targets

Let be an formula. Assume that PrvRed tries to prove redundancy of the clause where . Suppose that is the current assignment to and