 # Quantifier Elimination With Structural Learning

We consider the Quantifier Elimination (QE) problem for propositional CNF formulas with existential quantifiers. QE plays a key role in formal verification. Earlier, we presented an approach based on the following observation. To perform QE, one just needs to add a set of clauses depending on free variables that makes the quantified clauses (i.e. clauses with quantified variables) redundant. To implement this approach, we introduced a branching algorithm making quantified clauses redundant in subspaces and merging the results of branches. To implement this algorithm we developed the machinery of D-sequents. A D-sequent is a record stating that a quantified clause is redundant in a specified subspace. Redundancy of a clause is a structural property (i.e. it holds only for a subset of logically equivalent formulas as opposed to a semantic property). So, re-using D-sequents is not as easy as re-using conflict clauses in SAT-solving. In this paper, we address this problem. We introduce a new definition of D-sequents that enables their re-usability. We develop a theory showing under what conditions a D-sequent can be safely re-used.


## 1 Introduction

Many verification problems can be cast as an instance of the Quantifier Elimination (QE) problem or its variations. (One such variation is Partial QE (PQE), which we introduced earlier, where only a part of the formula is taken out of the scope of quantifiers. The appeal of PQE is twofold. First, many verification problems like equivalence and model checking require partial rather than complete QE [1, 2]. Second, PQE is much simpler to solve than QE. Both QE and PQE benefit from the results of this paper. However, since QE is conceptually simpler than PQE, we picked the former to introduce our new approach to D-sequent re-using.) So any progress in solving the QE problem is of great importance. In this paper, we consider the QE problem for propositional CNF formulas with existential quantifiers. Given formula $\exists X[F(X,Y)]$ where $X$ and $Y$ are sets of variables, the QE problem is to find a quantifier-free formula $G(Y)$ such that $G \equiv \exists X[F]$. In [3, 4], we introduced a new approach to QE based on the following observation. Let us call a clause of $F$ an $X$-clause if it contains at least one variable of $X$. (A clause is a disjunction of literals. So, a CNF formula is a conjunction of clauses.) Solving the QE problem reduces to finding formula $G(Y)$ implied by $F$ that makes the $X$-clauses of $F$ redundant in $G \wedge \exists X[F]$ (and so $G \equiv \exists X[F]$ holds).

To implement the approach above, we introduced an algorithm called DCDS (Derivation of Clause D-Sequents). DCDS is based on the following three ideas. First, DCDS branches on variables of $F$ to reach a subspace where proving redundancy of $X$-clauses (or making them redundant by adding a new clause) is easy. Here an $X$-clause $C$ is said to be redundant in $\exists X[F]$ if $\exists X[F] \equiv \exists X[F \setminus \{C\}]$. In this paper, we use the standard convention of viewing a set of clauses $\{C_1,\dots,C_k\}$ as an alternative way to specify the CNF formula $C_1 \wedge \dots \wedge C_k$. So, the expression $F \setminus \{C\}$ denotes the CNF formula obtained from $F$ by removing clause $C$. Second, once an $X$-clause is proved redundant, DCDS stores this fact in the form of a Dependency Sequent (D-sequent). A D-sequent is a record $(\exists X[F], \vec{q}) \rightarrow C$ where $C$ is an $X$-clause of $F$ and $\vec{q}$ is an assignment to variables of $F$. This record states that $C$ is redundant in $\exists X[F]$ in subspace $\vec{q}$. The third idea of DCDS is to use a resolution-like operation called join to merge the results of branches. This join operation is applied to D-sequents $(\exists X[F], \vec{q}\,') \rightarrow C$ and $(\exists X[F], \vec{q}\,'') \rightarrow C$ derived in branches $v=0$ and $v=1$ where $v$ is a variable of $F$. The result of this operation is a D-sequent $(\exists X[F], \vec{q}) \rightarrow C$ where $\vec{q}$ does not contain variable $v$.

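To make DCDS more efficient, it is natural to try to re-use a D-sequent $(\exists X[F], \vec{q}) \rightarrow C$ in every subspace $\vec{r}$ where $\vec{q} \subseteq \vec{r}$ (i.e. $\vec{r}$ contains all the assignments of $\vec{q}$). However, here one faces the following problem. The definition of D-sequent implies that $C$ is also redundant in subspace $\vec{q}$ for formulas $\exists X[F^*]$ logically equivalent to $\exists X[F]$ where $F^*$ is a subset of $F$. However, this may not be true for some formulas $\exists X[F^*]$. Here is a simple example of that. Let formula $\exists X[F]$ contain two identical $X$-clauses $C'$ and $C''$. Then D-sequents $(\exists X[F], \emptyset) \rightarrow C'$ and $(\exists X[F], \emptyset) \rightarrow C''$ hold. They state that $C'$ and $C''$ are redundant in $\exists X[F]$ individually. However, in general, one cannot drop both $C'$ and $C''$ from $\exists X[F]$ (that is, $\exists X[F] \not\equiv \exists X[F \setminus \{C', C''\}]$). This means that, say, $C'$ may not be redundant in $\exists X[F \setminus \{C''\}]$ despite the fact that $\exists X[F \setminus \{C''\}] \equiv \exists X[F]$.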
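The example of two identical clauses can be checked by brute force. The following is a minimal sketch under assumptions that are ours and not part of the paper: literals are nonzero integers (a negative number is a negated variable), clauses are lists, and the helper names `exists_table` and `redundant` are hypothetical. The concrete formula is also our own choice of an instance with a duplicated clause.

```python
from itertools import product


def exists_table(clauses, quantified, free):
    """Truth table of the formula 'exists quantified [clauses]' over `free`."""
    table = {}
    for free_vals in product([False, True], repeat=len(free)):
        table[free_vals] = False
        for q_vals in product([False, True], repeat=len(quantified)):
            point = dict(zip(free, free_vals))
            point.update(zip(quantified, q_vals))
            if all(any(point[abs(l)] == (l > 0) for l in c) for c in clauses):
                table[free_vals] = True
                break
    return table


def redundant(clauses, dropped, quantified, free):
    """True if dropping the clauses at positions `dropped` keeps the
    quantified formula logically equivalent to the original one."""
    rest = [c for i, c in enumerate(clauses) if i not in dropped]
    return (exists_table(clauses, quantified, free)
            == exists_table(rest, quantified, free))


# Assumed instance: x = 1 is quantified, y = 2 is free.
# Clauses 0 and 1 are two copies of (x or y); clause 2 is (not x or y).
F = [[1, 2], [1, 2], [-1, 2]]
print(redundant(F, {0}, [1], [2]))     # True: one copy can be dropped
print(redundant(F, {1}, [1], [2]))     # True: the other copy can be dropped
print(redundant(F, {0, 1}, [1], [2]))  # False: both copies cannot be dropped
```

Each copy is redundant on its own, yet removing both changes the quantified formula, which is exactly the situation that makes naive re-use unsafe.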

The problem above prevents DCDS from reusing D-sequents. The reason why D-sequents cannot be re-used as easily as, say, conflict clauses in SAT-solvers [8, 9] is as follows. Redundancy of a clause in a formula is a structural property (this is true regardless of whether the formula has quantifiers). That is, the fact that clause $C$ is redundant in formula $F$ may not hold in a formula $F'$ logically equivalent to $F$. On the other hand, re-using a conflict clause $C$ is based on the fact that $C$ is implied by the initial formula $F$, and implication is a semantic property. That is, $C$ is implied by every formula $F'$ logically equivalent to $F$.

In this paper, we address the problem of re-usability of D-sequents. Our approach is based on the following observation. Consider the example above with two identical -clauses . The D-sequent requires the presence of clause . This means that is supposed to be proved redundant after . On the contrary, the D-sequent requires the presence of and hence is proved redundant before . So these D-sequents have a conflict in the order of proving redundancy of and .

To be able to identify order conflicts, we modify the definition of D-sequents given in . A new D-sequent is a record where is a subset of . This D-sequent states that the clause is redundant in subspace in every formula where in subspace and . Note that if an -clause of is proved redundant and removed from , the D-sequent is not applicable. So one can view as an order constraint stating that applies only if -clauses of are proved redundant after . In other words, one can safely reuse in subspace where , if none of the -clauses of is proved redundant yet.

The contribution of this paper is as follows. First, we give the necessary definitions and propositions explaining the semantics of D-sequents stating redundancy of $X$-clauses (Sections 2, 3, 4 and 5). (Our earlier work on clause D-sequents gives only basic definitions. A detailed theoretical consideration was provided earlier, but it is meant for D-sequents expressing redundancy of variables rather than clauses.) Second, we give a new definition of D-sequents facilitating their re-using (Section 6). We also introduce the notion of consistent D-sequents and show that they can be re-used. Third, we re-visit definitions of atomic D-sequents (i.e. D-sequents stating trivial cases of redundancy) and the join operation to accommodate the D-sequents of the new kind (Sections 7 and 8). Fourth, we present a version of DCDS that can safely re-use D-sequents (Section 12).

## 2 Basic Definitions

In this paper, we consider only propositional CNF formulas. In the sequel, when we say “formula” without mentioning quantifiers we mean a quantifier-free CNF formula. We consider true and false as a special kind of clauses. A non-empty clause becomes true when it is satisfied by an assignment  i.e. when a literal of is set to true by . A clause becomes false when it is falsified by i.e. when all the literals of are set to false by .

###### Definition 1

Let be a CNF formula and be a subset of variables of . We will refer to formula as .

###### Definition 2

Let be an assignment and be a CNF formula. denotes the variables assigned in ; denotes the set of variables of ; denotes .

###### Definition 3

Let be a clause, be a formula that may have quantifiers, and be an assignment. is true if is satisfied by ; otherwise it is the clause obtained from by removing all literals falsified by . denotes the formula obtained from by replacing with .
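Definition 3 can be illustrated with a small sketch. The encoding is an assumption made here and not part of the paper: a literal is a nonzero integer, a clause is a set of literals, and an assignment is a dict from variable to bool. The names `restrict_clause` and `restrict_formula` are hypothetical.

```python
def restrict_clause(clause, assignment):
    """Return True if `assignment` satisfies `clause`; otherwise return
    the clause with every falsified literal removed."""
    remaining = set()
    for lit in clause:
        var = abs(lit)
        if var not in assignment:
            remaining.add(lit)      # unassigned literals are kept
        elif assignment[var] == (lit > 0):
            return True             # one literal is set to true
        # otherwise the literal is falsified and is dropped
    return frozenset(remaining)


def restrict_formula(clauses, assignment):
    """Restrict every clause. Clauses that became true are omitted,
    which does not change the conjunction."""
    restricted = []
    for clause in clauses:
        r = restrict_clause(clause, assignment)
        if r is not True:
            restricted.append(r)
    return restricted


# (v1 or not v2) and (v2 or v3) in the subspace v2 = 1
print(restrict_formula([{1, -2}, {2, 3}], {2: True}))  # [frozenset({1})]
```

An empty `frozenset` returned by `restrict_clause` corresponds to a clause falsified by the assignment.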

###### Definition 4

Let be formulas that may have quantifiers. We say that are equivalent, written , if for all assignments such that , we have .

###### Definition 5

The Quantifier Elimination (QE) problem for formula $\exists X[F(X,Y)]$ is to find a formula $G(Y)$ such that $G \equiv \exists X[F]$.
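For very small formulas, a solution to the QE problem can be obtained by plain enumeration. The sketch below only illustrates the problem statement and is not the approach of this paper. It assumes that literals are nonzero integers and that every variable of the formula is listed either as quantified or as free; the function names are hypothetical.

```python
from itertools import product


def satisfied(clauses, point):
    """True if the full assignment `point` (dict var -> bool) satisfies
    every clause."""
    return all(any(point[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses)


def eliminate(clauses, quantified, free):
    """Return a CNF over `free` equivalent to 'exists quantified [clauses]'."""
    result = []
    for free_vals in product([False, True], repeat=len(free)):
        point_y = dict(zip(free, free_vals))
        extendable = any(
            satisfied(clauses, {**point_y, **dict(zip(quantified, q_vals))})
            for q_vals in product([False, True], repeat=len(quantified))
        )
        if not extendable:
            # Add one clause falsified exactly by this free assignment.
            result.append([-v if val else v for v, val in point_y.items()])
    return result


# Assumed instance: x = 1 is quantified, y1 = 2 and y2 = 3 are free.
# F = (x or y1) and (not x or y2); the result is the clause (y1 or y2).
F = [[1, 2], [-1, 3]]
print(eliminate(F, quantified=[1], free=[2, 3]))  # [[2, 3]]
```

The enumeration is exponential in the number of variables, which is why algorithms such as the one studied here work with redundancy of clauses instead.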

###### Remark 1

From now on, we will use $Y$ and $X$ to denote sets of free and quantified variables respectively. We will assume that variables denoted by $y_i$ and $x_i$ are in $Y$ and $X$ respectively. When we use $Y$ and $X$ in a quantifier-free formula we mean that, in the context of QE, the set $X$ specifies the quantified variables.

###### Definition 6

A clause of is called a -clause if . Denote by the set of all -clauses of .

###### Definition 7

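Let $F$ be a CNF formula and $G \subseteq F$, $G \neq \emptyset$ (i.e. $G$ is a non-empty subset of clauses of $F$). The clauses of $G$ are redundant in $F$ if $F \equiv (F \setminus G)$. The clauses of $G$ are redundant in formula $\exists X[F]$ if $\exists X[F] \equiv \exists X[F \setminus G]$.

Note that $F \equiv (F \setminus G)$ implies $\exists X[F] \equiv \exists X[F \setminus G]$ but the opposite is not true.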

## 3 Clause Redundancy And Boundary Points

In this section, we explain the semantics of QE in terms of so-called boundary points.

###### Definition 8

Given assignment and a formula , we say that is a point of if .

In the sequel, by “assignment” we mean a possibly partial one. To refer to a full assignment we will use the term “point”.

###### Definition 9

Let be a formula and . A point of is called a -boundary point of if a) and b) and c) every clause of falsified by is a -clause and d) the previous condition breaks for every proper subset of .

###### Remark 2

Let be a CNF formula where sets and are interpreted as described in Remark 1. In the context of QE, we will deal exclusively with -boundary points that falsify only -clauses of and so holds.

###### Example 1

Let and . Let be a CNF formula of four clauses: , , , . The clauses of falsified by = are and . One can verify that and the set satisfy the four conditions of Definition 9, which makes a -boundary point. The set above is not unique. One can easily check that is also a -boundary point.

The term “boundary” is justified as follows. Let be a satisfiable CNF formula with at least one clause. Then there always exists a -boundary point of , that is different from a satisfying assignment only in value of .

###### Definition 10

Given a CNF formula and a -boundary point of :

• is -removable in if 1) ; and 2) there is a clause such that a) ; b) ; and c) .

• is removable in if is -removable in .

In the above definition, notice that is not a -boundary point of because falsifies and . So adding clause to eliminates as a -boundary point.

###### Example 2

Let us consider the -boundary point = of Example 1. Let denote clause obtained (see Definition 13 for the resolution operation) by resolving , and on variables and . Note that set and satisfy the conditions a), b) and c) of Definition 10 for . So is an -removable -boundary point. After adding to , is not an -boundary point any more. Let us consider the point = obtained from by flipping values of and . Both and have the same set of falsified clauses consisting of and . So, like , point is an -boundary point. However, no clause implied by and consisting only of variables of is falsified by . So, the latter is an -boundary point that is not -removable.

###### Proposition 1

A -boundary point of is removable in , iff one cannot turn into an assignment satisfying by changing only the values of variables of .

The proofs are given in the appendix.

###### Proposition 2

Let be a CNF formula where (see Definition 6). Let be a non-empty subset of . The set is not redundant in iff there is a -boundary point of such that a) every clause falsified by is in and b) is -removable in .

Proposition 2 justifies the following strategy of solving the QE problem. Add to a set of clauses that a) are implied by ; b) eliminate every -removable boundary point falsifying a subset of -clauses of . By dropping all -clauses of , one produces a solution to the QE problem.

## 4 Quantifier Elimination By Branching

In this section, we explain the semantics of the QE algorithm called DCDS (Derivation of Clause D-Sequents). A high-level description of DCDS is given in Section 11. DCDS is a branching algorithm. Given a formula , DCDS branches on variables of until it proves that every -clause is redundant in the current subspace. (In case of a conflict, proving -clauses of redundant, in general, requires adding to a conflict clause.) Then DCDS merges the results obtained in different branches to prove that the -clauses are redundant in the entire search space. Below we give propositions justifying the strategy of DCDS. Proposition 3 shows how to perform elimination of removable boundary points of in the subspace specified by assignment . This is done by using formula , a “local version” of . Proposition 4 justifies proving redundancy of -clauses of incrementally.

Let and be assignments to a set of variables . Since and are sets of value assignments to individual variables of one can apply set operations to them. We will denote by the fact that contains all value assignments of . The assignment consisting of value assignments of and is represented as .

###### Proposition 3

Let be an and be an assignment to . Let be a -boundary point of where and . Then if is removable in it is also removable in .

###### Remark 3

One cannot reverse Proposition 3: a boundary point may be -removable in and not -removable in . For instance, if , a -boundary point of where is removed from only by adding an empty clause to . So if is satisfiable, is not removable. Yet may be removable in if is unsatisfiable. A ramification of the fact that Proposition 3 is not reversible is discussed in Section 5.

###### Proposition 4

Let be an and be redundant in . Let an -clause of be redundant in . Then is redundant in .

###### Remark 4

To simplify the notation, we will sometimes use the expression “clause is redundant in in subspace  ” instead of saying “clause is redundant in ”.

Proposition 4 shows that one can prove redundancy of, say, a set of -clauses in in subspace incrementally. This can be done by a) proving redundancy of in in subspace and b) proving redundancy of in formula in subspace .

## 5 Virtual redundancy

If a boundary point is -removable in , this does not mean that it is -removable in (see Remark 3). This fact leads to the following problem. Let and be two assignments to and . Suppose that clause is redundant in in subspace . It is natural to expect that this also holds in the smaller subspace . However, does not imply . In particular, due to this problem, one cannot define the join operation in terms of redundancy specified by Definition 7. To address this issue we introduce the notion of virtual redundancy.

###### Definition 11

Let be an formula, be an assignment to , and be an -clause of . Let be the set of points of such that every falsifies only clause and is -removable. Clause is called virtually redundant in if one of the following two conditions is true.

1. or

2. For every , there is an assignment where such that is not -removable in . Here is obtained from by removing all value assignments to variables of .

The first condition just means that . We will refer to this type of redundancy (earlier specified by Definition 7) as regular redundancy. Regular redundancy is a special case of virtual redundancy.

###### Proposition 5

Let be an assignment to and clause be redundant in . Then, for every such that , clause is virtually redundant in .

From now on, when we say that a clause is redundant in we mean that it is at least virtually redundant. Note that, in general, proving virtual redundancy of in subspace can be extremely hard. We avoid this problem by using the notion of virtual redundancy only if we have already proved that is redundant in a subspace containing subspace . (For instance, we have already proved that is redundant in in subspace where .)

## 6 Dependency Sequents (D-sequents)

In this section, we give a new definition of D-sequents that is different from that of .

###### Definition 12

Let be an formula. Let be an assignment to and and . A dependency sequent (D-sequent) has the form . It states that clause is redundant in every formula logically equivalent to where . The assignment and formula are called the conditional and order constraint of respectively. We will refer to as a member formula for .

Definition 12 implies that the D-sequent becomes inapplicable if a clause of is removed from . So, is meant to be used in situations where the -clauses of are proved redundant after (hence the name “order constraint”). As we mentioned in the introduction, in , a D-sequent implies redundancy of clause in and in (some) logically equivalent formulas where . In Definition 12, the set of formulas where is redundant in subspace is specified precisely. We will say that a D-sequent is fragile if contains at least one -clause. Such a D-sequent becomes inapplicable if an -clause of is proved redundant before . If does not contain -clauses, the D-sequent above is called robust. A robust D-sequent is not affected by the order in which -clauses are proved redundant.

###### Remark 5

We will abbreviate D-sequent to if formula is known from the context. We will further reduce to   if i.e. if no order constraint is imposed.

There are two ways to produce D-sequents. First, one can generate an “atomic” D-sequent that states a trivial case of redundancy. The three atomic types of D-sequents are presented in Section 7. Second, one can use a pair of existing D-sequents to generate a new one by applying a resolution-like operation called join (Section 8).

## 7 Atomic D-sequents

In this section we describe D-sequents called atomic. These D-sequents are generated when redundancy of a clause can be trivially proved. Similarly to , we introduce atomic D-sequents of three kinds. However, in contrast to , we consider D-sequents specified by Definition 12. In particular, we show that D-sequents of the first kind are robust whereas D-sequents of the second and third kind are fragile.

### 7.1 Atomic D-sequents of the first kind

###### Proposition 6

Let be an and and . Let assignment where satisfy . Then D-sequent holds. We will refer to it as an atomic D-sequent of the first kind.

###### Example 3

Let be an and be a clause of . Since is satisfied by assignments and , D-sequents and hold.
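Atomic D-sequents of the first kind can be generated mechanically from a clause. The sketch below rests on assumptions that are ours: a D-sequent is modelled as a tuple of conditional, order constraint and target clause, literals are nonzero integers, and the order constraint of a first-kind D-sequent is taken to be empty (our reading of the statement that such D-sequents are robust). The names `DSequent` and `atomic_first_kind` are hypothetical.

```python
from typing import NamedTuple


class DSequent(NamedTuple):
    conditional: dict            # assignment: variable -> bool
    order_constraint: frozenset  # clauses that must still be present
    target: frozenset            # clause stated redundant


def atomic_first_kind(clause):
    """One D-sequent per literal of `clause`: setting that literal to
    true satisfies the clause, so the clause is redundant there."""
    return [DSequent({abs(lit): lit > 0}, frozenset(), frozenset(clause))
            for lit in sorted(clause, key=abs)]


# Clause (not v1 or v2) yields conditionals v1 = 0 and v2 = 1.
for ds in atomic_first_kind([2, -1]):
    print(ds.conditional)
```

Only single-literal conditionals are produced here; any larger assignment that satisfies the clause would give a valid but less general D-sequent.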

### 7.2 Atomic D-sequents of the second kind

###### Proposition 7

Let be an formula and be an assignment to . Let be two clauses of . Let be an -clause and imply (i.e. every literal of is in ). Then the D-sequent holds where . We will refer to it as an atomic D-sequent of the second kind.

###### Example 4

Let be an formula. Let and be -clauses of . Let . Since implies , the D-sequent holds. Since  is an -clause, this D-sequent is fragile.
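The check behind an atomic D-sequent of the second kind can be sketched as follows. The assumptions are ours: literals are nonzero integers, the D-sequent is returned as a plain tuple of conditional, order constraint and target clause, the whole current assignment is used as the conditional, and the order constraint consists of the single implying clause (our reading of why Example 4 calls the D-sequent fragile). The function names are hypothetical.

```python
def restrict(clause, assignment):
    """None if `assignment` satisfies the clause, otherwise the set of
    literals that are not falsified."""
    rest = set()
    for lit in clause:
        var = abs(lit)
        if var not in assignment:
            rest.add(lit)
        elif assignment[var] == (lit > 0):
            return None
    return frozenset(rest)


def atomic_second_kind(formula, target_idx, assignment):
    """Look for a clause of `formula` that implies the target clause in
    the subspace `assignment`. Return a D-sequent tuple or None."""
    target = frozenset(formula[target_idx])
    target_q = restrict(target, assignment)
    if target_q is None:
        return None  # target is satisfied: that is the first-kind case
    for i, clause in enumerate(formula):
        if i == target_idx:
            continue
        other = frozenset(clause)
        other_q = restrict(other, assignment)
        # Every remaining literal of the other clause occurs in the target.
        if other_q is not None and other_q <= target_q:
            return (dict(assignment), frozenset({other}), target)
    return None


# Assumed instance: clause 0 is (v1 or not v3), clause 1 is (v1 or v2 or not v3).
formula = [[1, -3], [1, 2, -3]]
print(atomic_second_kind(formula, target_idx=1, assignment={3: True}))
```

Clauses are addressed by position so that two syntactically identical clauses are still treated as different clauses.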

### 7.3 Atomic D-sequents of the third kind

To introduce atomic D-sequents of the third kind, we need to make a few definitions.

###### Definition 13

Let and be clauses having opposite literals of exactly one variable . The clause consisting of all literals of and but those of is called the resolvent of , on . Clause is said to be obtained by resolution on . Clauses , are called resolvable on .

###### Definition 14

A clause of a CNF formula is called blocked at variable , if no clause of is resolvable with on . The notion of blocked clauses was introduced in .
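Definitions 13 and 14 translate directly into code. This is a minimal sketch assuming literals are nonzero integers and clauses are collections of literals; the function names are hypothetical.

```python
def resolvable(c1, c2, var):
    """True if c1 and c2 have opposite literals of `var` and of no
    other variable."""
    c1, c2 = frozenset(c1), frozenset(c2)
    clashing = {abs(lit) for lit in c1 if -lit in c2}
    return clashing == {var}


def resolvent(c1, c2, var):
    """All literals of c1 and c2 except those of `var`."""
    if not resolvable(c1, c2, var):
        raise ValueError("clauses are not resolvable on this variable")
    return frozenset(lit for lit in set(c1) | set(c2) if abs(lit) != var)


def blocked(formula, clause, var):
    """True if no clause of `formula` is resolvable with `clause` on `var`."""
    return not any(resolvable(clause, other, var) for other in formula)


F = [[1, 2], [-1, 3], [-1, -2]]
print(resolvent([1, 2], [-1, 3], 1))  # frozenset({2, 3})
print(blocked(F, [1, 2], 1))          # False: [-1, 3] resolves on variable 1
print(blocked(F, [1, 2], 2))          # True: [-1, -2] clashes on two variables
```

Note that `[1, 2]` and `[-1, -2]` are not resolvable because they have opposite literals of two variables, so their resolvent would be a tautology.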

###### Proposition 8

Let be an formula. Let be an -clause of and . Let be the clauses of that can be resolved with on variable . Let ,, be a consistent set of D-sequents. (The notion of a consistent set of D-sequents is introduced later, see Definition 19. Consistency of D-sequents in Proposition 8 means that are redundant together in subspace =. So clause is blocked at variable in subspace .) Then D-sequent holds where = and . We will refer to it as an atomic D-sequent of the third kind.

Note that, in general, a D-sequent of the third kind is fragile.

###### Example 5

Let be an formula. Let be the only clauses of with variable where , , . Note that assignment satisfies clause . So the D-sequent holds. Suppose that D-sequent holds where is a clause of and . From Proposition 8 it follows that D-sequent holds where .

## 8 Join Operation

In this section, we describe the operation of joining D-sequents that produces a new D-sequent from two parent D-sequents. In contrast to , the join operation introduced here is applied to D-sequents with order constraints.

###### Definition 15

Let and be assignments in which exactly one variable is assigned different values. The assignment consisting of all the value assignments of and but those to is called the resolvent of , on . Assignments , are called resolvable on .

###### Proposition 9

Let be an formula for which D-sequents and hold. Let , be resolvable on and be the resolvent of and . Let . Then the D-sequent holds.

###### Definition 16

We will say that the D-sequent of Proposition 9 is produced by joining D-sequents and at .

###### Remark 6

Note that the D-sequent produced by the join operation has a stronger order constraint than its parent D-sequents. The latter have order constraints and in subspaces and , whereas has the same order constraint in either subspace. Due to this “imprecision” of the join operation, a set of D-sequents with conflicting order constraints can still be correct (see Section 9 and Subsection 11.2).
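The join operation can be sketched in a few lines. The assumptions are ours: a D-sequent is a tuple of conditional (dict from variable to bool), order constraint (frozenset of clause labels) and target clause, and the order constraint of the result is taken to be the union of the parents' order constraints, which is our reading of Remark 6. The function names are hypothetical.

```python
def resolve_assignments(q1, q2, var):
    """Resolvent of two assignments in which exactly `var` is assigned
    different values."""
    differing = {v for v in q1.keys() & q2.keys() if q1[v] != q2[v]}
    if differing != {var}:
        raise ValueError("assignments are not resolvable on this variable")
    merged = {**q1, **q2}
    del merged[var]
    return merged


def join(ds1, ds2, var):
    """Join two D-sequents for the same target clause at `var`."""
    q1, h1, c1 = ds1
    q2, h2, c2 = ds2
    if c1 != c2:
        raise ValueError("both D-sequents must have the same target clause")
    # Assumption: the new order constraint is the union of the parents'.
    return (resolve_assignments(q1, q2, var), h1 | h2, c1)


left = ({1: False, 2: True}, frozenset({"C2"}), "C1")
right = ({1: True, 3: False}, frozenset({"C3"}), "C1")
print(join(left, right, 1))
```

The result keeps every value assignment of the parents except those to the branching variable, and its order constraint applies in both branches, which is the loss of precision discussed in Remark 6.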

## 9 Re-usability of D-sequents

To address the problem of D-sequent re-using, we introduce the notion of composability. Informally, a set of D-sequents is composable if the clauses stated redundant individually are also redundant collectively. Robust D-sequents are always composable. So they can be re-used in any context like conflict clauses in SAT-solvers. However, this is not true for fragile D-sequents. Below, we show that such D-sequents are composable if they are consistent. So it is safe to re-use a fragile D-sequent in a subspace , if it is consistent with the D-sequents already used in subspace .

###### Definition 17

Assignments and are called compatible if every variable from is assigned the same value.

###### Definition 18

Let   be an  . A set of D-sequents ,, is called composable if the clauses are redundant collectively as well. That is holds in subspace where =.

###### Definition 19

Let be an . A set of D-sequents ,, is called consistent if

• every pair of assignments ,, is compatible;

• there is a total order over clauses of that satisfies the order constraints of these D-sequents i.e. , holds where .
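The two conditions of Definition 19 can be checked with a topological sort. The sketch below assumes Python 3.9 or later (for `graphlib`) and the same tuple model of a D-sequent used for illustration only: conditional, order constraint and target clause, with clauses given as hashable labels. It also assumes that only clauses that are targets of the given D-sequents need to be ordered. The function names are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter
from itertools import combinations


def compatible(q1, q2):
    """Assignments agree on every variable assigned in both."""
    return all(q1[v] == q2[v] for v in q1.keys() & q2.keys())


def consistent_order(dsequents):
    """Return an order of the target clauses that satisfies all order
    constraints, or None if the set is not consistent."""
    if not all(compatible(a[0], b[0]) for a, b in combinations(dsequents, 2)):
        return None
    targets = {target for _, _, target in dsequents}
    sorter = TopologicalSorter()
    for _, constraint, target in dsequents:
        sorter.add(target)
        for later in constraint & targets:
            # `target` must be proved redundant before `later`.
            sorter.add(later, target)
    try:
        return list(sorter.static_order())
    except CycleError:
        return None


# Two identical clauses, each D-sequent requiring the other clause.
cyclic = [({}, frozenset({"C2"}), "C1"), ({}, frozenset({"C1"}), "C2")]
print(consistent_order(cyclic))   # None

ordered = [({}, frozenset({"C2"}), "C1"), ({}, frozenset(), "C2")]
print(consistent_order(ordered))  # ['C1', 'C2']
```

The first set reproduces the order conflict from the introduction; the second one is consistent because only one of the two D-sequents is fragile.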

###### Proposition 10

Let be an . Let ,, be a consistent set of D-sequents. Then these D-sequents are composable and hence clauses are collectively redundant in in subspace where =.

###### Remark 7

The fact that D-sequents are inconsistent does not necessarily mean that these D-sequents are not composable. As we mentioned in Remark 6, as far as order constraints are concerned, the join operation is not “precise”. This means that if the D-sequents above are obtained by applying the join operation, their order-inconsistency may be artificial. An example of that is the QE procedure called DCDS. As we explain in Subsection 11.2, if one uses the new definition of D-sequents (i.e. Definition 12), the D-sequents produced by DCDS are, in general, inconsistent. However, DCDS is provably correct.

###### Remark 8

Let be an and be the set of clauses added to by a QE-solver. Let (i.e. the latter is the set of all -clauses of ). This QE-solver terminates when the set (which consists of the clauses of that depend only on variables of ) is sufficient to derive consistent D-sequents , , , , . From Proposition 10 it follows that all -clauses can be dropped from . The resulting formula consisting of clauses of is logically equivalent to .

## 10 Two Useful Transformations Of D-sequents

In this section, we describe two transformations that are useful for a QE-solver based on the machinery of D-sequents. Since a QE-solver has to add new clauses once in a while, D-sequents of different branches are, in general, computed with respect to different formulas. In Subsection 10.1, we describe a transformation meant for “aligning” such D-sequents. In Subsection 10.2, we describe a transformation meant for relaxing the order constraint of a D-sequent. In Section 12, this transformation is used to generate a consistent set of D-sequents.

### 10.1 D-sequent alignment

According to Definition 12, a D-sequent holds with respect to a particular formula . Proposition 11 shows that this D-sequent also holds after adding to implied clauses.

###### Proposition 11

Let D-sequent hold and be a CNF formula implied by . Then D-sequent holds too.

Proposition 11 is useful in aligning D-sequents derived in different branches. Suppose that is derived in the current branch of the search tree where the last assignment is . Suppose that is derived after flipping the value of from 0 to 1. Here is the set of clauses implied by that has been added to before the second D-sequent was derived. One cannot apply the join operation to these D-sequents because they are computed with respect to different formulas. Proposition 11 allows one to replace with . The latter can be joined with at variable .

### 10.2 Making a D-sequent more robust

In this subsection, we give two propositions showing how one can make a D-sequent more robust. Proposition 12 introduces a transformation that removes a clause from the order constraint of possibly adding to the latter some other clauses. Proposition 13 describes a scenario where by repeatedly applying this transformation one can remove a clause from the order constraint of without adding any other clauses.

###### Proposition 12

Let be an . Let and be two D-sequents forming a consistent set (see Definition 19). Let be in . Then D-sequent holds where and .

###### Proposition 13

Let be an and ,, be consistent D-sequents where , . Assume, for the sake of simplicity, that the numbering order is consistent with the order constraints. Let be in . Then, by repeatedly applying the transformation of Proposition 12, one can produce D-sequent where .

## 11 Recalling DCDS

Earlier, we described a QE algorithm called DCDS (Derivation of Clause D-Sequents) that did not re-use D-sequents. We recall DCDS in Subsections 11.1 and 11.2.

### 11.1 A brief description of DCDS

The pseudocode of DCDS is given in Fig. 1. (For the sake of simplicity, Figure 1 gives a very abstract view of DCDS. For instance, we omit the lines of code where new clauses are generated. Our objective here is just to show the part of DCDS where D-sequents are involved. A more detailed description of DCDS is given in our earlier work.) DCDS uses the old definition of a D-sequent lacking an order constraint. DCDS accepts three parameters: formula (denoted as ), the current assignment and the set of active D-sequents . (If an -clause of is proved redundant in subspace , this fact is stated by a D-sequent. This D-sequent is called active.) DCDS returns the final formula (where consists of the initial clauses and derived clauses implied by ) and the set of current active D-sequents. has an active D-sequent for every -clause of . The conditional of this D-sequent is a subset of . In the first call of DCDS, the initial formula is used and and are empty sets. A solution to the QE problem at hand is obtained by dropping the -clauses of the final formula and removing the quantifiers.

DCDS starts with examining the -clauses whose redundancy is not proved yet. Namely, DCDS checks if the redundancy of such clauses can be established by atomic D-sequents (line 1) introduced in Section 7. If all -clauses are proved redundant, DCDS terminates returning the current formula and the current set of active D-sequents (lines 2-3). Otherwise, DCDS moves to the branching part of the algorithm (lines 4-9).