# Generation of complete test sets

We use testing to check if a combinational circuit N always evaluates to 0. The usual point of view is that to prove that N always evaluates to 0 one has to check the value of N for all 2^|X| input assignments where X is the set of input variables of N. We use the notion of a Stable Set of Assignments (SSA) to show that one can build a complete test set (i.e. a test set proving that N always evaluates to 0) that consists of fewer than 2^|X| tests. Given an unsatisfiable CNF formula H(W), an SSA of H is a set of assignments to W proving unsatisfiability of H. A trivial SSA is the set of all 2^|W| assignments to W. Importantly, real-life formulas can have SSAs that are much smaller than 2^|W|. Generating a complete test set for N using only the machinery of SSAs is inefficient. We describe a much faster algorithm that combines computation of SSAs with resolution derivation and produces a complete test set for a "projection" of N on a subset of variables of N. We give experimental results and describe potential applications of this algorithm.

## 1 Introduction

Testing is an important part of verification flows. For that reason, any progress in understanding testing and improving its quality is of great importance. In this paper, we consider the following problem. Given a single-output combinational circuit $N$, find a set of input assignments (tests) proving that $N$ evaluates to 0 for every test (written as $N \equiv 0$) or find a counterexample[^1]. We will call a set of input assignments proving $N \equiv 0$ a complete test set (CTS)[^2]. We will call a CTS trivial if it consists of all $2^{|X|}$ possible tests, where $X$ is the set of input variables of $N$. Typically, one assumes that proving $N \equiv 0$ involves derivation of a trivial CTS, which is infeasible in practice. Thus, testing is used only for finding an input assignment refuting $N \equiv 0$. In this paper, we present an approach for building a non-trivial CTS that consists only of a subset of all possible tests.

[^1]: Circuit $N$ usually describes some property of a multi-output circuit, the latter being the real object of verification. For instance, $N$ may specify a requirement that this circuit never outputs some combinations of values.

[^2]: The term CTS is sometimes used to say that a test set is complete in terms of a coverage metric, i.e. that every event considered by this metric is tested. Our use of the term CTS is quite different.

Let $N(X,Y,z)$ be a single-output combinational circuit where $X$ and $Y$ are sets of variables specifying input and internal variables of $N$ respectively. Variable $z$ specifies the output of $N$. Let $F_N(X,Y,z)$ be a formula defining the functionality of $N$ (see Section 3). We will denote the set of variables of circuit $N$ (respectively formula $H$) as $\mathit{Vars}(N)$ (respectively $\mathit{Vars}(H)$). Every assignment[^3] to $\mathit{Vars}(F_N)$ satisfying $F_N$ corresponds to a consistent assignment[^4] to $\mathit{Vars}(N)$ and vice versa. Then the problem of proving $N \equiv 0$ reduces to showing that formula $F_N \wedge z$ is unsatisfiable. From now on, we assume that all formulas mentioned in this paper are propositional. Besides, we will assume that every formula is represented in CNF, i.e. as a conjunction of disjunctions of literals. We will also refer to a disjunction of literals as a clause.

[^3]: By an assignment to a set of variables, we mean a full assignment where every variable of the set is assigned a value.

[^4]: An assignment to a gate of $N$ is called consistent if the value assigned to the output variable of the gate is implied by the values assigned to its input variables. An assignment to the variables of $N$ is called consistent if it is consistent for every gate of $N$.

Our approach is based on the notion of a Stable Set of Assignments (SSA) introduced in prior work. Given a formula $H(W)$, an SSA of $H$ is a set $P$ of assignments to the variables $W$ that has two properties. First, every assignment of $P$ falsifies $H$. Second, $P$ is a transitive closure of some neighborhood relation between assignments (see Section 2). The fact that $H$ has an SSA means that $H$ is unsatisfiable. Otherwise, an assignment satisfying $H$ is generated when building its SSA. If $H$ is unsatisfiable, the set of all $2^{|W|}$ assignments is always an SSA of $H$. We will refer to it as trivial. Importantly, a real-life formula can have many SSAs whose size is much less than $2^{|W|}$. We will refer to them as non-trivial. As we show in Section 2, the fact that $P$ is an SSA of $H$ is a structural property of the latter. That is, this property cannot be expressed in terms of the truth table of $H$ (as opposed to a semantic property of $H$). For that reason, if $P$ is an SSA for $H$, it may not be an SSA for some other formula that is logically equivalent to $H$.

We show that a CTS for $N$ can be easily extracted from an SSA of formula $F_N \wedge z$. This makes a non-trivial CTS a structural property of circuit $N$ that cannot be expressed in terms of its truth table. Unfortunately, building an SSA even for a formula of small size is inefficient. To address this problem, we present a procedure that constructs a simpler formula $H(V)$, where $V \subseteq \mathit{Vars}(F_N \wedge z)$, for which an SSA is generated. Formula $H$ is implied by $F_N \wedge z$. Thus, the unsatisfiability of $H$ proved by construction of its SSA implies that $F_N \wedge z$ is unsatisfiable too and $N \equiv 0$. A test set extracted from an SSA of $H$ can be viewed as a CTS for a “projection” of $N$ on the variables of $V$.

We will refer to the procedure for building formula $H$ above as SemStr (“Semantics and Structure”). The name is due to the fact that SemStr combines semantic and structural derivations. SemStr can be applied to an arbitrary CNF formula $G$. If $G$ is unsatisfiable, SemStr returns a formula $H(V)$ implied by $G$ and its SSA. Otherwise, it produces an assignment satisfying $G$. The semantic part of SemStr is to derive $H$. Its structural part consists of proving that $H$ is unsatisfiable by constructing an SSA. Formula $H$ produced when $G$ is unsatisfiable is logically equivalent to $\exists W [G]$ where $W = \mathit{Vars}(G) \setminus V$. Thus, SemStr can be viewed as a quantifier elimination algorithm for unsatisfiable formulas. On the other hand, SemStr can be applied to check satisfiability of a CNF formula, which makes it a SAT-algorithm.

The notion of non-trivial CTSs helps better understand testing. The latter is usually considered as an incomplete version of a semantic derivation. This point of view explains why testing is efficient (because it is incomplete) but does not explain why it is effective (only a minuscule part of the truth table is sampled). Since a non-trivial CTS for $N$ is its structural property, it is more appropriate to consider testing as a version of a structural derivation (possibly incomplete). This point of view explains not only the efficiency of testing but also provides a better explanation for its effectiveness: by using circuit-specific tests one can cover a significant part of a non-trivial CTS.

The contribution of this paper is threefold. First, we use the machinery of SSAs to introduce the notion of non-trivial CTSs (Section 3). Second, we present SemStr, a SAT-algorithm that combines structural and semantic derivations (Section 4). We show that this algorithm can be used for computing a CTS for a projection of a circuit. We also discuss some applications of SemStr (Sections 6 and 7). Third, we give experimental results showing the effectiveness of tests produced by SemStr (Section 8). In particular, we describe a procedure for “piecewise” construction of test sets that can be potentially applied to very large circuits.

## 2 Stable Set Of Assignments

### 2.1 Some definitions

Let $p$ be an assignment to a set of variables $V$. Let $p$ falsify a clause $C$. Denote by $\mathit{Nbhd}(p,C)$ the set of assignments to $V$ satisfying $C$ that are at Hamming distance 1 from $p$. (Here Nbhd stands for “Neighborhood”.) Thus, the number of assignments in $\mathit{Nbhd}(p,C)$ is equal to that of literals in $C$. Let $q$ be another assignment to $V$ (that may be equal to $p$). Denote by $\mathit{Nbhd}(q,p,C)$ the subset of $\mathit{Nbhd}(p,C)$ consisting only of assignments that are farther away from $q$ than $p$ (in terms of the Hamming distance).

###### Example 1

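Let $V = \{v_1,v_2,v_3,v_4\}$ and $p$ = 0110. We assume that the values are listed in $p$ in the order the corresponding variables are numbered, i.e. $v_1 = 0$, $v_2 = 1$, $v_3 = 1$, $v_4 = 0$. Let $C = v_1 \vee \overline{v}_3$. (Note that $p$ falsifies $C$.) Then $\mathit{Nbhd}(p,C) = \{p_1,p_2\}$ where $p_1$ = 1110 and $p_2$ = 0100. Let $q$ = 0000. Note that $p_2$ is actually closer to $q$ than $p$. So $\mathit{Nbhd}(q,p,C) = \{p_1\}$.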
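The two neighborhood operators can be sketched in a few lines of Python. This is only an illustration: the tuple encoding of assignments, the DIMACS-style integer literals and the function names are assumptions, and the clause used in the usage lines is chosen to be consistent with the neighbors 1110 and 0100 listed in Example 1.

```python
# Assignments are tuples of 0/1 (variable i is at index i-1).
# Clauses are lists of DIMACS-style literals: +i is v_i, -i is NOT v_i.
# Both encodings are illustrative assumptions, not the paper's notation.

def falsifies(p, clause):
    """True if every literal of the clause evaluates to 0 under p."""
    return all(p[abs(lit) - 1] != (1 if lit > 0 else 0) for lit in clause)

def hamming(a, b):
    """Number of positions in which two assignments differ."""
    return sum(x != y for x, y in zip(a, b))

def nbhd(p, clause):
    """Assignments at Hamming distance 1 from p that satisfy the clause."""
    assert falsifies(p, clause), "p must falsify the clause"
    result = []
    for lit in clause:
        i = abs(lit) - 1
        # Flipping the variable of a falsified literal satisfies the clause.
        result.append(p[:i] + (1 - p[i],) + p[i + 1:])
    return result

def nbhd_from(q, p, clause):
    """Members of nbhd(p, clause) that are farther away from q than p is."""
    d = hamming(q, p)
    return [r for r in nbhd(p, clause) if hamming(q, r) > d]

p = (0, 1, 1, 0)
clause = [1, -3]          # v1 OR NOT v3, falsified by p
print(nbhd(p, clause))
print(nbhd_from((0, 0, 0, 0), p, clause))
```

The second call drops 0100 because it is at distance 1 from 0000, whereas 0110 is at distance 2.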

###### Definition 1

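Let $H$ be a formula[^5] specified by a set of clauses $\{C_1,\dots,C_k\}$. Let $P = \{p_1,\dots,p_m\}$ be a set of assignments to $\mathit{Vars}(H)$ such that every $p_i \in P$ falsifies $H$. Let $\Phi$ denote a mapping $P \rightarrow H$ where $\Phi(p_i)$ is a clause of $H$ falsified by $p_i$. We will call $\Phi$ an AC-mapping where “AC” stands for “Assignment-to-Clause”. We will denote the range of $\Phi$ as $\Phi(P)$. (So, a clause $C$ of $H$ is in $\Phi(P)$ iff there is an assignment $p_i \in P$ such that $C = \Phi(p_i)$.)

[^5]: In this paper, we use the set of clauses $\{C_1,\dots,C_k\}$ as an alternative representation of a CNF formula $C_1 \wedge \dots \wedge C_k$.

###### Definition 2

Let $H$ be a formula specified by a set of clauses $\{C_1,\dots,C_k\}$. Let $P = \{p_1,\dots,p_m\}$ be a set of assignments to $\mathit{Vars}(H)$. $P$ is called a Stable Set of Assignments[^6] (SSA) of $H$ with center $p_{init} \in P$ if there is an AC-mapping $\Phi$ such that for every $p_i \in P$, $\mathit{Nbhd}(p_{init},p_i,C) \subseteq P$ holds where $C = \Phi(p_i)$.

[^6]: In prior work, the notion of “uncentered” SSAs was introduced. The definition of an uncentered SSA is similar to Definition 2. The only difference is that one requires that for every $p_i \in P$, $\mathit{Nbhd}(p_i,C) \subseteq P$ holds instead of $\mathit{Nbhd}(p_{init},p_i,C) \subseteq P$.

Note that if $P$ is an SSA of $H$ with respect to AC-mapping $\Phi$, then $P$ is also an SSA of $\Phi(P)$.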

###### Example 2

Let consist of four clauses: , , , . Let where , , , . Let be an AC-mapping specified as . Since falsifies , ,   is a correct AC-mapping. Set is an SSA of with respect to and center =. Indeed, = where and = , where , . Thus, , .
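The condition of Definition 2 is easy to check mechanically. The sketch below is an assumption-laden illustration rather than the paper's code: assignments are 0/1 tuples, clauses are lists of DIMACS-style integers, and the AC-mapping is a Python dict whose keys form the candidate set. The small formula in the usage lines is a hypothetical one chosen for brevity; it is not the formula of Example 2.

```python
def falsifies(p, clause):
    """True if every literal of the clause evaluates to 0 under p."""
    return all(p[abs(lit) - 1] != (1 if lit > 0 else 0) for lit in clause)

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def is_ssa(ac_map, center):
    """Check the centered-SSA condition for the key set of ac_map.

    ac_map maps each assignment to the clause assigned to it
    (the dict representation is an illustrative assumption).
    """
    members = set(ac_map)
    if center not in members:
        return False
    for p, clause in ac_map.items():
        if not falsifies(p, clause):
            return False          # not a correct AC-mapping
        d = hamming(center, p)
        for lit in clause:
            i = abs(lit) - 1
            q = p[:i] + (1 - p[i],) + p[i + 1:]
            # Only neighbors farther from the center than p must be members.
            if hamming(center, q) > d and q not in members:
                return False
    return True

# Hypothetical formula: (v1 OR v2), (NOT v1), (NOT v2).
ac_map = {(0, 0): [1, 2], (1, 0): [-1], (0, 1): [-2]}
print(is_ssa(ac_map, center=(0, 0)))
```

Here three of the four assignments to two variables already form an SSA: assignment 11 is never required, because the only neighbors of 10 and 01 under their clauses lead back toward the center.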

### 2.2 SSAs and satisfiability of a formula

###### Proposition 1

Formula $H$ is unsatisfiable iff it has an SSA.

The proof is given in Section A of the appendix. A similar proposition was proved in prior work for “uncentered” SSAs (see Footnote 6).

###### Corollary 1

Let $P$ be an SSA of $H$ with respect to AC-mapping $\Phi$. Then the set of clauses $\Phi(P)$ is unsatisfiable. Thus, every clause of $H \setminus \Phi(P)$ is redundant.

The set of all assignments to $\mathit{Vars}(H)$ forms the trivial uncentered SSA of $H$. Example 2 shows a non-trivial SSA. The fact that formula $H$ has a non-trivial SSA $P$ is its structural property. That is, one cannot express the fact that $P$ is an SSA of $H$ using only the truth table of $H$. For that reason, $P$ may not be an SSA of a formula logically equivalent to $H$.

The relation between SSAs and satisfiability can be explained as follows. Suppose that formula $H$ is satisfiable. Let $p_{init}$ be an arbitrary assignment to $\mathit{Vars}(H)$ and $s$ be a satisfying assignment that is the closest to $p_{init}$ in terms of the Hamming distance. Let $P$ be the set of all assignments to $\mathit{Vars}(H)$ that falsify $H$ and $\Phi$ be an AC-mapping from $P$ to $H$. Then $s$ can be reached from $p_{init}$ by procedure BuildPath shown in Figure 1. (This procedure is non-deterministic: an oracle is used in line 7 to pick a variable to flip.) It generates a sequence of assignments $p_1,\dots,p_m$ where $p_1 = p_{init}$ and $p_m = s$. First, BuildPath checks if the current assignment $p_i$ equals $s$. If so, then $s$ has been reached. Otherwise, BuildPath uses clause $C = \Phi(p_i)$ to generate the next assignment. Since $s$ satisfies $C$, there is a variable $v \in \mathit{Vars}(C)$ that is assigned differently in $p_i$ and $s$. BuildPath generates a new assignment $p_{i+1}$ obtained from $p_i$ by flipping the value of $v$.

BuildPath converges to $s$ in $d$ steps where $d$ is the Hamming distance between $p_{init}$ and $s$. Importantly, BuildPath reaches $s$ for any AC-mapping. Let $P$ be an SSA of $H$ with respect to center $p_{init}$ and AC-mapping $\Phi$. Then if BuildPath starts with $p_{init}$ and uses $\Phi$ as AC-mapping, it can reach only assignments of $P$. Since every assignment of $P$ falsifies $H$, no satisfying assignment can be reached.
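The path argument can be replayed on a toy formula. In the sketch below the oracle is simulated by giving the procedure the target satisfying assignment directly, and the AC-mapping simply returns the first falsified clause; the encoding (0/1 tuples, DIMACS-style integer literals) and all names are assumptions made for illustration.

```python
def falsifies(p, clause):
    """True if every literal of the clause evaluates to 0 under p."""
    return all(p[abs(lit) - 1] != (1 if lit > 0 else 0) for lit in clause)

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def first_falsified(clauses, p):
    """A simple AC-mapping: the first clause of the formula falsified by p."""
    return next(c for c in clauses if falsifies(p, c))

def build_path(clauses, start, target):
    """Walk from start to target, the closest satisfying assignment.

    Knowing target plays the role of the oracle: in every step a variable
    of the current clause on which p and target disagree is flipped.
    """
    path, p = [start], start
    while p != target:
        clause = first_falsified(clauses, p)
        # target satisfies the clause and p falsifies it, so they differ
        # on at least one variable of the clause.
        i = next(abs(l) - 1 for l in clause if p[abs(l) - 1] != target[abs(l) - 1])
        p = p[:i] + (1 - p[i],) + p[i + 1:]
        path.append(p)
    return path

# Hypothetical satisfiable formula: (v1 OR v2), (NOT v1 OR v3), (NOT v2 OR v3).
clauses = [[1, 2], [-1, 3], [-2, 3]]
path = build_path(clauses, start=(0, 0, 0), target=(1, 0, 1))
print(path, len(path) - 1 == hamming((0, 0, 0), (1, 0, 1)))
```

Every step flips a variable on which the current assignment and the target disagree, so the number of steps equals the Hamming distance between the two.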

A procedure for generation of SSAs called BuildSSA is shown in Figure 2. It accepts formula $H$ and outputs either a satisfying assignment or an SSA of $H$, a center $p_{init}$ and AC-mapping $\Phi$. BuildSSA maintains two sets of assignments denoted as $E$ and $Q$. Set $E$ contains the examined assignments, i.e. ones whose neighborhood is already explored. Set $Q$ specifies assignments that are queued to be examined. $Q$ is initialized with an assignment $p_{init}$ and $E$ is originally empty. BuildSSA updates $E$ and $Q$ in a while loop. First, BuildSSA picks an assignment $p$ of $Q$ and checks if it satisfies $H$. If so, $p$ is returned as a satisfying assignment. Otherwise, BuildSSA removes $p$ from $Q$ and picks a clause $C$ of $H$ falsified by $p$. The assignments of $\mathit{Nbhd}(p_{init},p,C)$ that are not in $E$ are added to $Q$. After that, $p$ is added to $E$ as an examined assignment, pair $(p,C)$ is added to $\Phi$ and a new iteration begins. If $Q$ is empty, $E$ is an SSA with center $p_{init}$ and AC-mapping $\Phi$.
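A minimal Python sketch of this loop is given below. It follows the description above but makes several assumptions that are not in the text: assignments are 0/1 tuples, clauses are lists of DIMACS-style integers, the all-zero assignment is the default center, the queue is processed first-in first-out, and the clause picked for an assignment is simply the first falsified one.

```python
from collections import deque

def falsifies(p, clause):
    """True if every literal of the clause evaluates to 0 under p."""
    return all(p[abs(lit) - 1] != (1 if lit > 0 else 0) for lit in clause)

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def build_ssa(clauses, n_vars, center=None):
    """Return ("SAT", model, None) or ("UNSAT", center, ac_map)."""
    if center is None:
        center = (0,) * n_vars          # assumed default center
    ac_map = {}                         # examined assignments -> chosen clause
    queue = deque([center])             # assignments waiting to be examined
    seen = {center}                     # examined or queued assignments
    while queue:
        p = queue.popleft()
        clause = next((c for c in clauses if falsifies(p, c)), None)
        if clause is None:
            return "SAT", p, None       # p satisfies every clause
        d = hamming(center, p)
        for lit in clause:
            i = abs(lit) - 1
            q = p[:i] + (1 - p[i],) + p[i + 1:]
            # Queue only neighbors that move away from the center.
            if hamming(center, q) > d and q not in seen:
                seen.add(q)
                queue.append(q)
        ac_map[p] = clause
    return "UNSAT", center, ac_map      # keys of ac_map form the SSA

status, center, ac_map = build_ssa([[1, 2], [-1], [-2]], n_vars=2)
print(status, sorted(ac_map))
```

The choice of clause and the queue order are the heuristic degrees of freedom: different choices can produce SSAs of very different sizes for the same formula.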

## 3 Complete Test Sets

Let $N(X,Y,z)$ be a single-output combinational circuit where $X$ and $Y$ are sets of variables specifying input and internal variables of $N$. Variable $z$ specifies the output of $N$. Let $N$ consist of gates $g_1,\dots,g_k$. Then $N$ can be represented as CNF formula $F_N = G_1 \wedge \dots \wedge G_k$ where $G_i$ is a CNF formula specifying the consistent assignments of gate $g_i$. Proving $N \equiv 0$ reduces to showing that formula $F_N \wedge z$ is unsatisfiable.
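As an illustration of the gate-by-gate encoding, the sketch below builds the clauses for two-input AND and OR gates and conjoins them with a unit clause on the output. The particular clause sets are the standard Tseitin-style encoding, which is assumed here; variables are DIMACS-style positive integers and all function names are hypothetical.

```python
def gate_cnf(kind, out, a, b):
    """Clauses satisfied exactly by the consistent assignments of a gate.

    kind is "AND" or "OR"; out, a, b are positive integer variable ids.
    """
    if kind == "AND":
        return [[-a, -b, out], [a, -out], [b, -out]]
    if kind == "OR":
        return [[a, b, -out], [-a, out], [-b, out]]
    raise ValueError("unsupported gate type: " + kind)

def circuit_cnf(gates, output):
    """Conjoin the gate formulas and add a unit clause asserting output = 1.

    gates is a list of (kind, out, a, b) tuples. The resulting clause set
    is unsatisfiable iff the circuit evaluates to 0 on every input.
    """
    clauses = []
    for kind, out, a, b in gates:
        clauses.extend(gate_cnf(kind, out, a, b))
    clauses.append([output])
    return clauses

# Hypothetical circuit: variable 3 = v1 OR v2, output variable 4 = v3 AND v1.
print(circuit_cnf([("OR", 3, 1, 2), ("AND", 4, 3, 1)], output=4))
```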

###### Example 3

Circuit shown in Figure 3 represents equivalence checking of expressions and . The former is specified by gates and and the latter by , and . Formula is equal to where, for instance, , , , . Every satisfying assignment to corresponds to a consistent assignment to gate and vice versa. For instance, satisfies and is a consistent assignment to since the latter is an OR gate. Formula is unsatisfiable due to functional equivalence of expressions and . Thus, .

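Let $x$ be a test, i.e. an assignment to $X$. The set of assignments to $\mathit{Vars}(N)$ sharing the same assignment $x$ to $X$ forms a cube of $2^{|Y|+1}$ assignments. (Recall that $\mathit{Vars}(N) = X \cup Y \cup \{z\}$.) Denote this set as $\mathit{Cube}(x)$. Only one assignment of $\mathit{Cube}(x)$ specifies the correct execution trace produced by $N$ under $x$. All other assignments can be viewed as “erroneous” traces under test $x$.

###### Definition 3

Let $T$ be a set of tests $\{x_1,\dots,x_k\}$ where $k \leq 2^{|X|}$. We will say that $T$ is a Complete Test Set (CTS) for $N$ if $\mathit{Cube}(x_1) \cup \dots \cup \mathit{Cube}(x_k)$ contains an SSA for formula $F_N \wedge z$.

If $T$ satisfies Definition 3, set $\mathit{Cube}(x_1) \cup \dots \cup \mathit{Cube}(x_k)$ “contains” a proof that $N \equiv 0$ and so $T$ can be viewed as complete. If $k = 2^{|X|}$, $T$ is the trivial CTS. In this case, $\mathit{Cube}(x_1) \cup \dots \cup \mathit{Cube}(x_k)$ contains the trivial SSA consisting of all assignments to $\mathit{Vars}(N)$. Given an SSA $P$ of $F_N \wedge z$, one can easily generate a CTS by extracting all different assignments to $X$ that are present in the assignments of $P$.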
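The extraction step amounts to projecting each assignment of the SSA onto the input variables and dropping duplicates. A small sketch, assuming assignments are 0/1 tuples and the positions of the input variables are known (both are illustrative assumptions):

```python
def extract_tests(ssa, input_positions):
    """Project SSA assignments onto the input variables, without duplicates.

    ssa is an iterable of 0/1 tuples over all circuit variables;
    input_positions lists the tuple indices of the input variables.
    """
    tests = []
    for p in ssa:
        test = tuple(p[i] for i in input_positions)
        if test not in tests:       # keep first occurrence, preserve order
            tests.append(test)
    return tests

# Hypothetical SSA over (x1, x2, y1): three assignments, two distinct tests.
ssa = [(0, 0, 1), (0, 0, 0), (1, 0, 1)]
print(extract_tests(ssa, input_positions=[0, 1]))
```

Several assignments of the SSA typically collapse into one test, which is why the test set can be much smaller than the SSA itself.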

###### Example 4

Formula $F_N \wedge z$ of Example 3 has an SSA of 21 assignments to $\mathit{Vars}(N)$. They have only 5 different assignments to $X$. So the set of those 5 assignments is a CTS for $N$.

Definition 3 is meant for circuits that are not “too redundant”. Its extension to the case of high redundancy is given in Section B of the appendix.

## 4 Description Of SemStr Procedure

### 4.1 Motivation

Building an SSA can be inefficient even for a small formula. This makes construction of a CTS for $N$ from an SSA of $F_N \wedge z$ impractical. We address this problem by introducing a procedure called SemStr (short for “Semantics and Structure”). Given a formula $G$ and a subset $V$ of its variables, SemStr generates a simpler formula $H(V)$ implied by $G$ while trying to build an SSA for $H$. We will refer to $\mathit{Vars}(G) \setminus V$ as the set of variables to exclude. If SemStr succeeds in constructing an SSA of $H$, the latter is unsatisfiable and so is $G$. SemStr can be applied to $F_N \wedge z$ to generate tests as follows. Let $V$ be a subset of $\mathit{Vars}(N)$. First, SemStr is applied to construct formula $H(V)$ implied by $F_N \wedge z$ and an SSA of $H$. Then a set of tests is extracted from this SSA.

The test set above can be considered as a CTS for a projection of circuit $N$ on $V$. On the other hand, it can be viewed as an approximation of a CTS for circuit $N$, since $H$ is essentially an abstraction of formula $F_N \wedge z$. In this paper, we give two examples of building a test set for $N$ from an SSA of $H$ generated by SemStr. In the first example, $V$ is the set of input variables $X$. Then an SSA found by SemStr for $H(X)$ is itself a test set. The second example is given in Subsection 8.3 where a “piecewise” construction of tests is described.

###### Example 5

Consider the circuit of Figure 3. Assume that where is the set of input variables. Application of to produces formula . Besides, generates an SSA of with center =000 that consists of four assignments to : . (The AC-mapping is omitted here.) These assignments form a CTS for projection of on and an approximation of CTS for .

### 4.2 High-level description

In Figure 4, we describe SemStr as a recursive procedure. Like DPLL-like SAT-algorithms [6, 13, 15], SemStr makes decision assignments, runs the Boolean Constraint Propagation (BCP) procedure and performs branching. In particular, it uses decision levels. A decision level consists of a decision assignment to a variable and assignments to single variables implied by the former. SemStr accepts a formula $G$, a partial assignment to variables of $G$ and the index of the current decision level. In the first call of SemStr, the partial assignment is empty and the decision level is 0. In contrast to DPLL, SemStr keeps a subset of variables (namely those of a set $V$) unassigned. If $G$ is satisfiable, SemStr outputs an assignment satisfying $G$. Otherwise, it returns an SSA of a formula $H$, its center and an AC-mapping. The latter maps the SSA to clauses of $G$ that consist only of variables of $V$. (SemStr derives such clauses by resolution[^7].) Hence formula $H$ depends only on variables of $V$. The existence of an SSA means that $H$ and hence $G$ are unsatisfiable.

[^7]: Recall that resolution is applied to two clauses that have opposite literals of some variable. The result of resolving the two clauses on this variable is the clause consisting of all literals of both clauses but those of this variable.
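The resolution operation used to derive such clauses can be written down directly. The sketch below assumes clauses are lists of DIMACS-style integers (an illustrative encoding) and returns the resolvent with duplicate literals removed.

```python
def resolve(c1, c2, var):
    """Resolve two clauses on variable var (a positive integer).

    One clause must contain the literal var and the other -var.
    The resolvent holds all literals of both clauses except those of var.
    """
    if not ((var in c1 and -var in c2) or (-var in c1 and var in c2)):
        raise ValueError("clauses do not have opposite literals of the variable")
    # A set removes literals shared by both clauses; sort for a stable result.
    resolvent = {lit for lit in c1 + c2 if abs(lit) != var}
    return sorted(resolvent, key=lambda lit: (abs(lit), lit))

print(resolve([1, 2], [-1, 3], var=1))
```

If the two clauses also clash on a second variable, the resolvent contains both literals of that variable and is a tautology; a practical implementation would discard it.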

We will refer to a clause $C$ of $G$ as a $V$-clause if $C$ contains a variable of $V$ and all literals of $C$ over variables outside $V$ (if any) are falsified in the current node of the search tree by the current partial assignment. If a conflict occurs when assigning variables outside $V$, SemStr behaves as a regular SAT-solver with conflict clause learning. Otherwise, the behavior of SemStr is different in two aspects. First, after BCP completes the current decision level, SemStr tries to build an SSA of the set of $V$-clauses. If it succeeds in finding an SSA, $G$ is unsatisfiable in the current branch and SemStr backtracks. Thus, SemStr has a “non-conflict” backtracking mode. Second, in the non-conflict backtracking mode, SemStr uses non-conflict learning. The objective of this learning is as follows. In every leaf of the search tree, SemStr maintains the invariant that the set of current $V$-clauses is unsatisfiable. Suppose that a $V$-clause $C$ contains a literal of a variable $w$ that is falsified by the current partial assignment. If SemStr unassigns $w$ during backtracking, $C$ stops being a $V$-clause. To maintain the invariant above, SemStr uses resolution to produce a new $V$-clause that is a descendant of $C$ and does not contain $w$.

### 4.3 SemStr in more detail

As shown in Figure 4, SemStr consists of three parts separated by dotted lines. In the first part (lines 1-6), SemStr runs BCP to fill in the current decision level. Since SemStr does not assign variables of $V$, BCP ignores clauses that contain a variable of $V$. If, during BCP, a clause consisting only of variables outside $V$ gets falsified, a conflict occurs. Then SemStr generates a conflict clause (line 3) and adds it to $G$. In this case, formula $H$ consists simply of the conflict clause, which is empty (has no literals) in the subspace specified by the current partial assignment. Any set $\{p\}$ where $p$ is an arbitrary assignment to $V$ is an SSA of $H$ in this subspace.

If no conflict occurs in the first part, SemStr starts the second part (lines 7-13). Here, SemStr runs the BldSSA procedure to check if the current set of $V$-clauses is unsatisfiable by building an SSA. If BldSSA fails to build an SSA (line 8), SemStr checks if all variables outside $V$ are assigned (line 9). If so, formula $G$ is satisfiable. SemStr returns a satisfying assignment (line 10) that is the union of the current assignment to the variables outside $V$ and the assignment to $V$ returned by BldSSA. (The latter assignment satisfies all the current $V$-clauses.)

If BldSSA succeeds in building an SSA with respect to an AC-mapping and a center (line 11), SemStr performs an operation called Normalize over formula $H$, where $H$ is the range of the AC-mapping (line 12). After that, SemStr returns. Let $v$ be the decision variable of the current decision level. The objective of Normalize is to guarantee that every clause of $H$ contains no more than one variable assigned at the current level and that this variable is $v$. Let $C$ be a clause of $H$ that violates this rule. Suppose, for instance, that $C$ has one or more literals falsified by implied assignments of the current level. In this case, Normalize performs a sequence of resolution operations that starts with clause $C$ and terminates with a clause $C'$ in which the only variable assigned at the current level is $v$. (This is similar to the conflict clause generation procedure of a SAT-solver. It starts with a clause rendered unsatisfiable that has at least two literals assigned at the conflict level. After a sequence of resolutions, this procedure generates a clause where only one literal is falsified at the conflict level.) Importantly, $C$ and $C'$ are identical as $V$-clauses, i.e. they differ only in literals of variables outside $V$. Clause $C'$ is added to $G$ and replaces $C$ in the AC-mapping and hence in $H$.

If neither a satisfying assignment nor an SSA is found in the second part, SemStr starts the third part (lines 14-26) where it branches. First, a decision variable $v$ is picked to start the next decision level. SemStr adds an assignment to $v$ to the current partial assignment and calls itself to explore the left branch (line 17). If this call returns a satisfying assignment, SemStr ends the current invocation and returns this assignment (line 18). Otherwise (i.e. no satisfying assignment is found), SemStr checks if the set of clauses found to be unsatisfiable in the left branch contains variable $v$. If not, then the right branch is skipped and SemStr returns the SSA and AC-mapping found in the left branch. Otherwise, SemStr examines the right branch (lines 21-23).

Finally, SemStr merges the results of both branches by calling procedure Excl. The formulas returned by the two branches specify unsatisfiable sets of $V$-clauses of the left and right branch respectively. This means that their conjunction is unsatisfiable in the subspace specified by the current partial assignment. However, SemStr maintains a stronger invariant that the set of $V$-clauses is unsatisfiable in this subspace. This invariant is broken after unassigning the decision variable $v$, since the clauses of the two formulas containing variable $v$ are not $V$-clauses any more. Procedure Excl “excludes” $v$ to restore this invariant by producing new $V$-clauses obtained by resolving clauses of the two branch formulas on $v$.

The pseudo-code of Excl is shown in Figure 5. First, Excl builds a formula $H$ that consists of the clauses of the two branch formulas minus those that contain the decision variable $v$ (lines 1-3). Then Excl tries to build an SSA of $H$ by calling procedure BldSSA in a while loop (lines 4-9). If BldSSA succeeds, Excl returns the SSA found by BldSSA. Otherwise, BldSSA returns an assignment that satisfies $H$. This satisfying assignment is eliminated by generating a $V$-clause falsified by it and adding this clause to $H$. The clause is generated by resolving two clauses of the branch formulas on variable $v$. After that, a new iteration begins.
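The loop of Excl can be imitated on a toy instance. The sketch below is a simplification under stated assumptions: clauses are lists of DIMACS-style integers over the unassigned variables plus the decision variable, a brute-force model search stands in for BldSSA (so no SSA is returned, only the final clause set), and the names `excl`, `find_model`, `h0` and `h1` are hypothetical.

```python
from itertools import product

def falsified(clause, assign):
    """True if every literal of the clause is 0 under the dict assign."""
    return all(assign[abs(l)] != (1 if l > 0 else 0) for l in clause)

def find_model(clauses, variables):
    """Brute-force stand-in for BldSSA: a satisfying assignment or None."""
    for values in product((0, 1), repeat=len(variables)):
        s = dict(zip(variables, values))
        if not any(falsified(c, s) for c in clauses):
            return s
    return None

def excl(h0, h1, v, v_vars):
    """Exclude decision variable v from two branch formulas.

    h0 must be unsatisfiable under v = 0 and h1 under v = 1.
    Returns an unsatisfiable clause set that does not mention v.
    """
    h = [c for c in h0 + h1 if v not in map(abs, c)]
    while True:
        s = find_model(h, v_vars)
        if s is None:
            return h
        # s satisfies every v-free clause, so the clauses it falsifies in
        # each branch must contain v; resolving them on v eliminates s.
        c0 = next(c for c in h0 if falsified(c, {**s, v: 0}))
        c1 = next(c for c in h1 if falsified(c, {**s, v: 1}))
        h.append(sorted({l for l in c0 + c1 if abs(l) != v}, key=abs))

# Hypothetical branch formulas over V = {1, 2} with decision variable 3.
print(excl(h0=[[1, 3], [-1, 3]], h1=[[2, -3], [-2]], v=3, v_vars=[1, 2]))
```

Each iteration removes at least one satisfying assignment of the current clause set, so the loop terminates after finitely many resolvents.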

## 5 Example Of How SemStr Operates

Let , and be a formula of 6 clauses: , , , , .

Let us consider how operates on the formula above. We will identify invocations of by partial assignment to . For instance, since is empty in the initial call of , the latter is denoted as . We will also use as a subscript to identify under assignment . The first part of (see Figure 4) does not trigger any action because does not contain unit clauses (i.e. unsatisfied clauses that have only one unassigned literal). In the second part of , procedure BldSSA fails to build an SSA because the only -clause of is . So the current set of -clauses is satisfiable. Having found out that not all variables of are assigned (line 9 of Figure 4), leaves the second part.

Let be the variable of picked in the third part for branching (line 14). uses assignment to start decision level number 1. (In the original call, the decision level value is 0). Then is invoked that operates as follows. contains unit clauses and (we crossed out literal as falsified). Unit clause is ignored by BCP, since does not assign variables of . On the other hand, BCP assigns value 1 to to satisfy . So current equals and decision level number 1 contains one decision and one implied assignment. At this point, BCP stops. The only clause consisting solely of variables of (clause ) is satisfied. So no conflict occurred and finishes the first part of the code.

Current formula has the following -clauses: , , . This set of -clauses is unsatisfiable. BldSSA proves this by generating a set of three assignments: =11, =01, =10 that is an SSA. The center is and the AC-function is defined as = , = , = . So formula for subspace consists of clauses . Note that needs normalization, since contains literal falsified by the implied assignment of level 1. Procedure Normalize (line 12) fixes this problem. It produces new clause obtained by resolving with clause on . (Note that is the clause from which assignment was derived during BCP.) Clause is added to . It replaces clause in and hence in . So now = and consists of clauses . At this point, terminates returning SSA , center  , AC-mapping   and modified to .

Having completed branch , invokes . Since does not have any unit clauses, no action is taken in the first part. Formula contains three -clauses: , and . Procedure BldSSA proves them unsatisfiable by generating a set of three assignments =11, =01, =10 that is an SSA with respect to center and AC-function: = , = , = . So formula consists of clauses . It does not need normalization. terminates returning SSA ,  , and   to .

Finally, calls Excl to merge the results of branches and by excluding variable . Formulas