Learning for Dynamic Subsumption

03/31/2009 ∙ by Youssef Hamadi, et al.

In this paper, a new dynamic subsumption technique for Boolean CNF formulae is proposed. It exploits simple and sufficient conditions to detect, during conflict analysis, clauses from the original formula that can be reduced by subsumption. During the learnt clause derivation, at each step of the resolution process, we simply check for backward subsumption between the current resolvent and the clauses of the original formula that are encoded in the implication graph. Our approach gives rise to a strong and dynamic simplification technique that exploits learning to eliminate literals from the original clauses. Experimental results show that integrating our dynamic subsumption approach within the state-of-the-art SAT solvers Minisat and Rsat achieves improvements, particularly on crafted instances.




1 Introduction

The SAT problem, i.e., the problem of checking whether a set of Boolean clauses is satisfiable or not, is central to many domains in computer science and artificial intelligence, including constraint satisfaction problems (CSP), planning, non-monotonic reasoning, VLSI correctness checking, etc. Today, SAT has gained a considerable audience with the advent of a new generation of SAT solvers able to solve large instances encoding real-world applications, and with the demonstration that these solvers are important low-level building blocks for many fields, e.g., SMT solving, theorem proving, model finding, QBF solving, etc. These solvers, called modern SAT solvers [Moskewicz01, MiniSat03], are based on classical unit propagation [Davis62] efficiently combined, through incremental data structures, with: (i) restart policies [Gomes1998, kautz02dynamic], (ii) activity-based variable selection heuristics (VSIDS-like) [Moskewicz01], and (iii) clause learning [Bayardo97, Marques-Silva96, Moskewicz01]. Modern SAT solvers can be seen as an extended version of the well-known DPLL-like procedure obtained thanks to these different enhancements. It is important to note that the well-known resolution rule à la Robinson still plays a strong role in the efficiency of modern SAT solvers, which can be understood as a particular form of general resolution [BeameKS03].

Indeed, conflict-based learning, one of the most important components of SAT solvers, is based on resolution. We can also mention that the well-known and highly successful SatElite preprocessor is based on variable elimination through the resolution rule [SubbarayanP04a, Biere05]. As mentioned in [SubbarayanP04a], on industrial instances, resolution leads to the generation of many tautological resolvents. This can be explained by the fact that many clauses represent Boolean functions encoded through a common set of variables. This property of the encodings might also be at the origin of many redundant or subsumed clauses at different steps of the search process.

The utility of SatElite on industrial problems has been demonstrated, and one can therefore wonder whether the resolution rule could be applied not only as a pre-processing stage but systematically during the search process. Unfortunately, dynamically maintaining a formula closed under subsumption might be time consuming. An attempt has been made recently in this direction by L. Zhang [Zhang05]. In that work, a novel algorithm maintains a subsumption-free clause database by dynamically detecting and removing subsumed clauses as they are added. Interestingly, the author mentions the following research perspective: "How to balance the runtime cost and the quality of the result for on-the-fly CNF simplification is a very interesting problem worth much further investigation".

In this paper, our objective is to design an effective dynamic simplification algorithm based on resolution. Our proposed approach aims at eliminating literals from the CNF formula by dynamically replacing clauses with smaller clauses that subsume them. More precisely, our approach exploits the intermediate steps of classical conflict analysis to subsume the clauses of the formula which are used in the underlying resolution derivation of the asserting clause. Since both original clauses and learnt clauses can be used during conflict analysis, both categories can be simplified. The effectiveness of our technique lies in the efficiency of the subsumption test, which is based on a simple and sufficient condition computable in constant time. Moreover, since our technique relies on the derivation of a conflict clause, it is guided by the conflicts, and simplifies parts of the formula identified as important by the search strategy (VSIDS guidance). This dynamic process preserves the satisfiability of the formula and, with some additional bookkeeping, can preserve the equivalence of the models.
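The idea of checking subsumption at each resolution step of conflict analysis can be illustrated with a minimal Python sketch. It is not the authors' implementation: literals are encoded as non-zero integers (negative for negated), the function name `analyze_with_subsumption` and the `steps` argument are hypothetical, and a plain subset test stands in for the constant-time sufficient condition mentioned above, which is not reproduced here.

```python
from typing import Dict, FrozenSet, Iterable, List, Tuple

Clause = FrozenSet[int]  # assumption: DIMACS-style integer literals


def analyze_with_subsumption(
    conflict: Iterable[int],
    steps: List[Tuple[int, Clause]],
) -> Tuple[Clause, Dict[Clause, Clause]]:
    """Resolve the conflict clause against a sequence of reason clauses.

    `steps` lists (propagated literal, reason clause) pairs in the order in
    which they are resolved. Whenever the current resolvent is a proper
    subset of the reason clause just used, that clause is recorded as
    replaceable by the smaller resolvent.
    """
    resolvent: Clause = frozenset(conflict)
    strengthened: Dict[Clause, Clause] = {}
    for lit, reason in steps:
        # One resolution step on the variable of `lit`.
        resolvent = (resolvent | reason) - {lit, -lit}
        # Plain (non constant-time) backward subsumption test.
        if resolvent < reason:
            strengthened[reason] = resolvent
    return resolvent, strengthened


# Hypothetical example: literal 4 was implied by (-2 v -3 v 4),
# literal 3 by (-1 v 3), and the clause (-3 v -4) became false.
cla4 = frozenset({-2, -3, 4})
cla3 = frozenset({-1, 3})
learnt, simplified = analyze_with_subsumption({-3, -4}, [(4, cla4), (3, cla3)])
```

In this example the first resolvent, (-2 v -3), subsumes the reason clause of literal 4, so that clause could be shortened by one literal; the replacement is sound because every resolvent is implied by the formula.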

The paper is organized as follows. After some preliminary definitions and notations, the classical implication graph and learning schemes are presented in Section 2. Our dynamic subsumption approach is then described. Finally, before the conclusion, experimental results demonstrating the performance of our approach are presented.

2 Technical background

2.1 Preliminary definitions and notations

A CNF formula $\mathcal{F}$ is a conjunction of clauses, where a clause is a disjunction of literals. A literal is a positive ($x$) or negated ($\neg x$) propositional variable. The two literals $x$ and $\neg x$ are called complementary. We denote by $\bar{l}$ the complementary literal of $l$. For a set of literals $L$, $\bar{L}$ is defined as $\{\bar{l} \mid l \in L\}$. A unit clause is a clause containing only one literal (called unit literal), while a binary clause contains exactly two literals. An empty clause, noted $\bot$, is interpreted as false (unsatisfiable), whereas an empty CNF formula, noted $\top$, is interpreted as true (satisfiable).

The set of variables occurring in $\mathcal{F}$ is noted $V_{\mathcal{F}}$. A set of literals is complete if it contains one literal for each variable in $V_{\mathcal{F}}$, and fundamental if it does not contain complementary literals. An assignment $\rho$ of a Boolean formula $\mathcal{F}$ is a function which associates a value $\rho(x) \in \{false, true\}$ to some of the variables $x \in V_{\mathcal{F}}$. $\rho$ is complete if it assigns a value to every $x \in V_{\mathcal{F}}$, and partial otherwise. An assignment is alternatively represented by a fundamental set of literals (complete when the assignment is complete), in the obvious way. A model of a formula $\mathcal{F}$ is an assignment $\rho$ that makes the formula $true$; noted $\rho \models \mathcal{F}$.
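These definitions map directly onto a set-based representation. The sketch below is only an illustration under assumed conventions: literals are non-zero integers with negation as sign change (as in the DIMACS format), clauses are frozensets, and all function names are hypothetical.

```python
from typing import FrozenSet, Iterable, Set

Clause = FrozenSet[int]   # a disjunction of literals
Formula = Set[Clause]     # a conjunction of clauses


def variables(formula: Iterable[Clause]) -> Set[int]:
    """Set of variables occurring in the formula."""
    return {abs(lit) for clause in formula for lit in clause}


def is_fundamental(lits: Iterable[int]) -> bool:
    """True if the set contains no pair of complementary literals."""
    s = set(lits)
    return all(-lit not in s for lit in s)


def is_complete(lits: Iterable[int], formula: Iterable[Clause]) -> bool:
    """True if every variable of the formula has a literal in the set."""
    return variables(formula) <= {abs(lit) for lit in lits}


def is_model(lits: Iterable[int], formula: Iterable[Clause]) -> bool:
    """True if the assignment, given as a set of literals, satisfies
    every clause (each clause shares at least one literal with it)."""
    s = set(lits)
    return is_fundamental(s) and all(clause & s for clause in formula)


# (x1 v -x2) ^ (x2 v x3)
F = {frozenset({1, -2}), frozenset({2, 3})}
```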

The following notations will be heavily used throughout the paper:

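  • $\eta[x, c_i, c_j]$ denotes the resolvent between a clause $c_i$ containing the literal $x$ and a clause $c_j$ containing the opposite literal $\neg x$. In other words, $\eta[x, c_i, c_j] = (c_i \cup c_j) \setminus \{x, \neg x\}$. A resolvent is called tautological when it contains opposite literals.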

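  • $\mathcal{F}|_x$ will denote the formula obtained from $\mathcal{F}$ by assigning $x$ the truth-value true. Formally, $\mathcal{F}|_x = \{c \mid c \in \mathcal{F}, \{x, \neg x\} \cap c = \emptyset\} \cup \{c \setminus \{\neg x\} \mid c \in \mathcal{F}, \neg x \in c\}$ (that is: the clauses containing $x$, which are therefore satisfied, are removed; and those containing $\neg x$ are simplified). This notation is extended to assignments: given an assignment $\rho = \{x_1, \dots, x_n\}$, we define $\mathcal{F}|_\rho = (\dots((\mathcal{F}|_{x_1})|_{x_2})\dots|_{x_n})$.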

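  • $\mathcal{F}^*$ denotes the formula $\mathcal{F}$ closed under unit propagation, defined recursively as follows: (1) $\mathcal{F}^* = \mathcal{F}$ if $\mathcal{F}$ does not contain any unit clause, (2) $\mathcal{F}^* = \bot$ if $\mathcal{F}$ contains two unit clauses $\{x\}$ and $\{\neg x\}$, (3) otherwise, $\mathcal{F}^* = (\mathcal{F}|_x)^*$, where $x$ is the literal appearing in a unit clause of $\mathcal{F}$. A clause $c$ is deduced by unit propagation from $\mathcal{F}$, noted $\mathcal{F} \models^* c$, if $(\mathcal{F}|_{\bar{c}})^* = \bot$.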
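The three notations above (resolvent, conditioning on a literal, and closure under unit propagation) can be sketched in a few lines of Python. This is an illustrative sketch, not code from the paper: literals are assumed to be non-zero integers, clauses frozensets, and `BOTTOM` is a hypothetical marker for the contradictory closure. As a further assumption, a formula that contains the empty clause is also treated as contradictory.

```python
from typing import FrozenSet, Iterable, Optional, Set

Clause = FrozenSet[int]
BOTTOM = None  # hypothetical marker for a contradictory closure


def resolve(x: int, ci: Clause, cj: Clause) -> Clause:
    """Resolvent of ci (containing x) and cj (containing -x)."""
    assert x in ci and -x in cj
    return (ci | cj) - {x, -x}


def is_tautological(clause: Clause) -> bool:
    return any(-lit in clause for lit in clause)


def condition(formula: Iterable[Clause], x: int) -> Set[Clause]:
    """Formula obtained by assigning literal x to true: clauses
    containing x are dropped, -x is removed from the others."""
    return {c - {-x} for c in formula if x not in c}


def unit_propagate(formula: Iterable[Clause]) -> Optional[Set[Clause]]:
    """Closure under unit propagation, or BOTTOM on contradiction."""
    current = set(formula)
    while True:
        if frozenset() in current:          # assumption: empty clause = bottom
            return BOTTOM
        units = {next(iter(c)) for c in current if len(c) == 1}
        if not units:
            return current                  # case (1): no unit clause left
        if any(-lit in units for lit in units):
            return BOTTOM                   # case (2): complementary units
        current = condition(current, next(iter(units)))  # case (3)


def deduced_by_up(formula: Iterable[Clause], clause: Clause) -> bool:
    """True if falsifying every literal of `clause` and propagating
    yields a contradiction."""
    current = set(formula)
    for lit in clause:
        current = condition(current, -lit)
    return unit_propagate(current) is BOTTOM


F = {frozenset({-1, 2}), frozenset({-2, 3})}
```

For instance, the clause (-1 v 3) is deduced by unit propagation from `F`: falsifying it assigns 1 and -3, which leaves the complementary unit clauses (2) and (-2).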

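Let $c_1$ and $c_2$ be two clauses of a formula $\mathcal{F}$. We say that $c_1$ subsumes $c_2$ (equivalently, $c_2$ is subsumed by $c_1$) iff $c_1 \subseteq c_2$. If $c_1$ subsumes $c_2$, then $c_1 \models c_2$ (the converse is not true). Also, $\mathcal{F}$ and $\mathcal{F} \setminus \{c_2\}$ are equivalent with respect to satisfiability.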
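With clauses represented as sets of literals, the subsumption test is a subset test. The short sketch below assumes integer literals and frozenset clauses; `remove_subsumed` is a hypothetical helper that performs a naive quadratic cleanup, shown only to make the definition concrete.

```python
from typing import FrozenSet, Iterable, Set

Clause = FrozenSet[int]


def subsumes(c1: Clause, c2: Clause) -> bool:
    """c1 subsumes c2 iff every literal of c1 occurs in c2."""
    return c1 <= c2


def remove_subsumed(formula: Iterable[Clause]) -> Set[Clause]:
    """Drop every clause that is strictly subsumed by another clause.
    Naive quadratic loop, for illustration only."""
    clauses = set(formula)
    return {c for c in clauses if not any(d < c for d in clauses)}


F = {frozenset({1}), frozenset({1, 2}), frozenset({2, 3})}
reduced = remove_subsumed(F)
```

Here the clause (1 v 2) is removed because the unit clause (1) subsumes it, and the remaining formula has the same models.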

2.2 DPLL search

DPLL [Davis62] is a tree-based backtrack search procedure; at each node of the search tree, the assigned literals (the decision literal and the propagated ones) are labeled with the same decision level, starting from 1 and increased at each decision (or branching). After backtracking, some variables are unassigned, and the current decision level is decreased accordingly. At level $i$, the current partial assignment $\rho$ can be represented as a sequence of decision-propagations of the form $\langle (x_k^i), x_{k_1}^i, x_{k_2}^i, \dots, x_{k_{n_k}}^i \rangle$, where the first literal $x_k^i$ corresponds to the decision literal $x_k$ assigned at level $i$ and each $x_{k_j}^i$ for $1 \leq j \leq n_k$ represents a propagated (unit) literal at level $i$. Let $x \in \rho$; we note $l(x)$ the assignment level of $x$. For a clause $\alpha$, $l(\alpha)$ is defined as the maximum level of its assigned literals.
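The bookkeeping of decision levels described above can be sketched as a small trail structure. The class and method names are hypothetical, literals are assumed to be non-zero integers, and literals propagated before any decision are given level 0 in this sketch, which is an assumption rather than something stated in the text.

```python
from typing import Dict, Iterable, List


class Trail:
    """Chronological list of assigned literals with their levels."""

    def __init__(self) -> None:
        self.lits: List[int] = []        # assigned literals, oldest first
        self.level: Dict[int, int] = {}  # variable -> assignment level
        self.current_level = 0

    def decide(self, lit: int) -> None:
        """Open a new decision level and assign the decision literal."""
        self.current_level += 1
        self._push(lit)

    def propagate(self, lit: int) -> None:
        """Assign a unit-propagated literal at the current level."""
        self._push(lit)

    def _push(self, lit: int) -> None:
        self.lits.append(lit)
        self.level[abs(lit)] = self.current_level

    def backtrack(self, level: int) -> None:
        """Unassign every literal set at a level greater than `level`."""
        while self.lits and self.level[abs(self.lits[-1])] > level:
            del self.level[abs(self.lits.pop())]
        self.current_level = level

    def clause_level(self, clause: Iterable[int]) -> int:
        """Maximum level among the assigned literals of the clause."""
        return max(
            (self.level[abs(l)] for l in clause if abs(l) in self.level),
            default=0,
        )


trail = Trail()
trail.decide(1)       # level 1: decision x1
trail.propagate(-2)   # level 1: propagated -x2
trail.decide(3)       # level 2: decision x3
trail.propagate(4)    # level 2: propagated x4
```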

2.3 Conflict analysis using implication graphs

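The implication graph is a standard representation conveniently used to analyze conflicts in modern SAT solvers. Whenever a literal $y$ is propagated, we keep a reference to the clause which triggers the propagation of $y$, which we note $\overrightarrow{cla}(y)$. The clause $\overrightarrow{cla}(y)$, called the implication of $y$, is in this case of the form $(x_1 \vee \dots \vee x_n \vee y)$, where every literal $x_i$ is false under the current partial assignment ($\rho(x_i) = false$, $\forall i \in 1..n$), while $\rho(y) = true$. When a literal $y$ is not obtained by propagation but comes from a decision, $\overrightarrow{cla}(y)$ is undefined, which we note for convenience $\overrightarrow{cla}(y) = \perp$. When $\overrightarrow{cla}(y) \neq \perp$, we denote by $exp(y)$ the set $\{\bar{x} \mid x \in \overrightarrow{cla}(y) \setminus \{y\}\}$, called the set of explanations of $y$. When $\overrightarrow{cla}(y)$ is undefined, we define $exp(y)$ as the empty set.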

Definition 1 (Implication Graph)

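Let $\mathcal{F}$ be a CNF formula, $\rho$ a partial assignment, and let $exp$ denote the set of explanations for the deduced (unit propagated) literals in $\rho$. The implication graph associated to $\mathcal{F}$, $\rho$ and $exp$ is $\mathcal{G}_{\mathcal{F}}^{\rho, exp} = (\mathcal{N}, \mathcal{E})$ where:

  • $\mathcal{N} = \rho$, i.e., there is exactly one node for every literal, decided or implied;

  • $\mathcal{E} = \{(x, y) \mid x \in \rho, y \in \rho, x \in exp(y)\}$.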

In the rest of this paper, for simplicity, $exp$ is omitted, and an implication graph is simply noted as $\mathcal{G}_{\mathcal{F}}^{\rho}$. We also note $m$ as the conflict level.
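The explanations and the implication graph can be computed directly from the clauses recorded at propagation time. The following Python sketch assumes integer literals and a hypothetical dictionary `reason` that maps each propagated literal to the clause that implied it; decision literals are simply absent from the dictionary.

```python
from typing import Dict, FrozenSet, Iterable, Set, Tuple

Clause = FrozenSet[int]


def explanation(y: int, reason: Dict[int, Clause]) -> Set[int]:
    """Complements of the other literals of the clause that implied y;
    empty for a decision literal (no recorded clause)."""
    clause = reason.get(y)
    if clause is None:
        return set()
    return {-x for x in clause if x != y}


def implication_graph(
    rho: Iterable[int], reason: Dict[int, Clause]
) -> Tuple[Set[int], Set[Tuple[int, int]]]:
    """One node per assigned literal, one edge (x, y) whenever x
    belongs to the explanation of y."""
    nodes = set(rho)
    edges = {
        (x, y)
        for y in nodes
        for x in explanation(y, reason)
        if x in nodes
    }
    return nodes, edges


# Hypothetical example: x1 is a decision, x2 is implied by (-x1 v x2),
# and x3 is implied by (-x1 v -x2 v x3).
reason = {2: frozenset({-1, 2}), 3: frozenset({-1, -2, 3})}
nodes, edges = implication_graph([1, 2, 3], reason)
```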

Example 1

$\mathcal{G}_{\mathcal{F}}^{\rho}$, shown in Figure 1, is an implication graph for the formula $\mathcal{F}$ and the partial assignment $\rho$ given below:

Figure 1: Implication Graph

. The conflict level is and