Integrating Conflict Driven Clause Learning to Local Search

10/07/2009 ∙ by Gilles Audenard, et al.

This article introduces SatHyS (SAT HYbrid Solver), a novel hybrid approach for propositional satisfiability. It combines local search with a conflict driven clause learning (CDCL) scheme. Each time the local search part reaches a local minimum, the CDCL part is launched. For SAT problems it behaves like a tabu list, whereas for UNSAT ones, the CDCL part tries to focus on a minimum unsatisfiable sub-formula (MUS). Experimental results show good performance on many classes of SAT instances from the last SAT competitions.


1 Introduction

The SAT problem, namely the problem of checking whether a set of Boolean clauses is satisfiable or not, is central to many computer science and artificial intelligence domains, such as theorem proving, planning, non-monotonic reasoning and VLSI correctness checking. Over the last two decades, many approaches have been proposed to solve large SAT instances, based on logically complete or incomplete algorithms. Both local-search techniques [SelmanKC94, SelmanK93, HirschK05] and elaborate variants of the Davis-Putnam-Logemann-Loveland (DPLL) procedure [DPLL62] [MoskewiczMZZM01, EenS03], called modern SAT solvers, can now solve many families of hard SAT instances. These two kinds of approaches have complementary features and performance. Modern SAT solvers are particularly efficient on the industrial SAT category, while local search performs better on random SAT instances.

Consequently, combining stochastic local search (SLS) and conflict driven clause learning (CDCL) solvers seems promising. Note that this was pointed out as a challenge by Selman et al. [SelmanKM97] in 1997. Such methods should exploit the strengths and differences of both approaches. Furthermore, the ideal hybrid method has to outperform both local search and CDCL solvers. Many attempts have been made over the last decade [HandbookOfSAT2009]. These attempts are discussed in section 3.

In this paper, we propose another hybridization of local search and a modern SAT solver, named SatHyS (SAT HYbrid Solver). The local search solver is the main one. Each time it reaches a local minimum, the CDCL part is called and assigns some variables. This part of our solver is expected to behave differently depending on the kind of formula to solve. For a satisfiable formula, the CDCL part can be seen as a tabu list [GloverI89] that protects good variables and avoids reaching the same minimum again quickly. On the other hand, for unsatisfiable formulas, it tries to focus the search on minimum unsatisfiable sub-formulas (MUS) [GMP-06-2, GMP-07-1, GMP-07-2], which allows the solver to concentrate on a small part of the whole formula. In this way, the CDCL component of SatHyS is used as a strategy to escape from local minima.

The rest of the paper is organized as follows. Section 2 introduces the notions necessary for understanding the rest of the paper. Section 3 discusses different hybrid methods. Section 4 gives the intuition behind our method. In section 5, we give the details and algorithms of SatHyS. Section 6 reports experiments before the conclusion.

2 Preliminary definitions and technical background

2.1 Definitions

Let us give some necessary definitions and notations. Let $V$ be a set of Boolean variables. A literal $\ell$ is a variable $x$ or its negation $\neg x$. A clause is a disjunction of literals $c = (\ell_1 \vee \ell_2 \vee \dots \vee \ell_k)$. A unit clause is a clause with only one literal. A formula $\Sigma$ is in conjunctive normal form (CNF) if it is a conjunction of clauses $\Sigma = (c_1 \wedge c_2 \wedge \dots \wedge c_m)$. The set of literals appearing in $\Sigma$ is denoted $\mathcal{L}_\Sigma$. An interpretation $I$ of a formula $\Sigma$ associates a value $I(x)$ to variables in the formula. An interpretation is complete if it gives a value to each variable $x \in V$, otherwise it is said to be partial. A clause, a CNF formula and an interpretation can be conveniently represented as sets. A model of a formula $\Sigma$, denoted $I \models \Sigma$, is an interpretation $I$ which satisfies the formula, i.e. satisfies each clause of $\Sigma$. Then, we can define the SAT decision problem as follows: is there an assignment of values to the variables so that the CNF formula $\Sigma$ is satisfied?

Let us introduce some additional notations.

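  • The negation of a formula $\Sigma$ is denoted $\neg\Sigma$;

  • $\Sigma|_\ell$ denotes the formula $\Sigma$ simplified by the assignment of the literal $\ell$ to true. This notation is extended to interpretations: let $I = \{\ell_1, \dots, \ell_n\}$ be an interpretation, $\Sigma|_I = (\dots((\Sigma|_{\ell_1})|_{\ell_2})\dots)|_{\ell_n}$;

  • $\Sigma^*$ denotes the formula $\Sigma$ simplified by unit propagation;

  • $\models_*$ denotes logic deduction by unit propagation: $\Sigma \models_* \ell$ means that the literal $\ell$ is deduced by unit propagation from $\Sigma$, i.e. $(\Sigma \wedge \neg\ell)^* = \bot$. One notes $\Sigma \models_* \bot$ if the formula is unsatisfiable by unit propagation;

  • $\eta[x, c_i, c_j]$ denotes the resolvent between a clause $c_i$ containing the literal $x$ and a clause $c_j$ containing the opposite literal $\neg x$. In other words $\eta[x, c_i, c_j] = (c_i \setminus \{x\}) \cup (c_j \setminus \{\neg x\})$. A resolvent is called tautological when it contains opposite literals.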
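The notations above can be illustrated with a minimal Python sketch. It is not taken from the paper: the DIMACS-style integer encoding of literals and the function names `assign`, `unit_propagate`, `resolvent` and `is_tautological` are assumptions made for this example only.

```python
from typing import FrozenSet, Iterable, Optional, Set

# Assumed encoding: 3 stands for x3, -3 for its negation.
Clause = FrozenSet[int]


def assign(formula: Iterable[Clause], lit: int) -> Set[Clause]:
    """Simplify the formula by setting `lit` to true."""
    result = set()
    for clause in formula:
        if lit in clause:
            continue  # the clause is satisfied, so it disappears
        result.add(clause - {-lit})  # the opposite literal is now false
    return result


def unit_propagate(formula: Iterable[Clause]) -> Optional[Set[Clause]]:
    """Return the formula simplified by unit propagation, or None on conflict."""
    current = set(formula)
    while True:
        if frozenset() in current:
            return None  # empty clause: unsatisfiable by unit propagation
        unit = next((c for c in current if len(c) == 1), None)
        if unit is None:
            return current  # no unit clause left
        (lit,) = unit
        current = assign(current, lit)


def resolvent(lit: int, c1: Clause, c2: Clause) -> Clause:
    """Resolve c1 (containing lit) with c2 (containing the opposite literal)."""
    assert lit in c1 and -lit in c2
    return (c1 - {lit}) | (c2 - {-lit})


def is_tautological(clause: Clause) -> bool:
    """A clause is tautological when it contains two opposite literals."""
    return any(-lit in clause for lit in clause)
```

Representing clauses as frozensets follows the remark that clauses and formulas can be conveniently seen as sets; it also makes the resolvent a plain set union.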

2.2 Local Search Algorithms

Local search algorithms for SAT problems use a stochastic walk over complete interpretations of $\Sigma$. At each step (or flip), they try to reduce the number of falsified clauses (usually called a descent). The next complete interpretation is chosen among the neighbours of the current one (they differ only on the value of one variable). A local minimum is reached when no descent is possible. One of the key points of stochastic local search algorithms is the method used to escape from local minima. For lack of space, we cannot provide a general local search algorithm here. For more details, we refer the reader to [hoos04].
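To make the notions of flip, descent and local minimum concrete, here is a minimal Python sketch of a greedy local search. It is only an illustration under stated assumptions: the random-walk escape move is one classical choice among many, and the names `falsified`, `best_descent` and `local_search` are hypothetical, not taken from any existing solver.

```python
import random
from typing import Dict, List, Optional

Clause = List[int]  # assumed encoding: 3 is x3, -3 is its negation


def falsified(formula: List[Clause], interp: Dict[int, bool]) -> List[Clause]:
    """Clauses with no true literal under the complete interpretation."""
    return [c for c in formula if not any(interp[abs(l)] == (l > 0) for l in c)]


def best_descent(formula: List[Clause], interp: Dict[int, bool]) -> Optional[int]:
    """Variable whose flip strictly lowers the falsified count, or None at a local minimum."""
    best_var, best_score = None, len(falsified(formula, interp))
    for var in sorted(interp):
        interp[var] = not interp[var]  # tentative flip
        score = len(falsified(formula, interp))
        interp[var] = not interp[var]  # undo the flip
        if score < best_score:
            best_var, best_score = var, score
    return best_var


def local_search(formula: List[Clause], num_vars: int, max_flips: int,
                 rng: random.Random) -> Optional[Dict[int, bool]]:
    """Greedy descent with a random-walk escape; assumes non-empty clauses."""
    interp = {v: rng.random() < 0.5 for v in range(1, num_vars + 1)}
    for _ in range(max_flips):
        broken = falsified(formula, interp)
        if not broken:
            return interp  # model found
        var = best_descent(formula, interp)
        if var is None:
            # Local minimum: flip a variable of a random falsified clause.
            var = abs(rng.choice(rng.choice(broken)))
        interp[var] = not interp[var]
    return interp if not falsified(formula, interp) else None
```

The `None` result of `best_descent` is exactly the situation in which an escape strategy has to take over.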

2.3 CDCL solvers

Input: $\Sigma$ a CNF formula
Output: SAT or UNSAT
 1  $I = \emptyset$;  /* interpretation */
 2  $dl = 0$;  /* decision level */
 3  $\mathit{nbConflicts} = 0$;  /* number of conflicts */
 4  while (true) do
 5    $c$ = BCP($\Sigma$, $I$);
 6    if ($c$ != null) then
 7      $\mathit{nbConflicts} = \mathit{nbConflicts} + 1$;
 8      $\beta$ = conflictAnalysis($\Sigma$, $I$, $c$);
 9      $bl$ = computeBackjumpingLevel($\beta$, $I$);
10      if ($bl < 0$) then return UNSAT;
11      $\Sigma = \Sigma \cup \{\beta\}$;
12      if (restart()) then $bl = 0$;
13      backjump($\Sigma$, $I$, $bl$);
14      $dl = bl$;
15    else
16      if (all variables are instantiated) then
17        return SAT;
18      $\ell$ = chooseDecisionLiteral($\Sigma$);
19      $dl = dl + 1$;
20      $I = I \cup \{\ell\}$;
21  end
Algorithm 1 CDCL solver

Algorithm 1 shows the general scheme of a CDCL solver (due to lack of space, we cannot provide details for all subroutines). A typical branch of a CDCL solver is a sequence of decisions, each followed by propagations, repeated until a conflict is reached. Each decision literal (lines 18–20) is assigned at a given decision level ($dl$); literals deduced by unit propagation have the same decision level. If all variables are assigned, then $I$ is a model of $\Sigma$ (lines 16–17). Each time a conflict is reached by unit propagation ($c$ is then the conflict clause), a nogood $\beta$ is computed (line 8) using a given scheme, usually the first-UIP (Unique Implication Point) one [ZhangMMM01], and a backjump level is computed. At this point, the unsatisfiability of the formula $\Sigma$ may have been proved. If this is not the case, the nogood is added to the clause database and a backjump is done (lines 11–14). Finally, CDCL solvers sometimes enforce restarts (different strategies are possible [Huang07]). In this case, the solver backjumps to the top of the search tree.

2.4 Muses

Minimum unsatisfiable sub-formulas (MUS) of a CNF formula represent the smallest explanations for its inconsistency in terms of the number of clauses. MUS are very important for circumscribing and highlighting the source of contradiction of a given formula. Formally, one has:

Definition 1

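Let $\Sigma$ be a CNF formula. A MUS $\Gamma$ of $\Sigma$ is a set of clauses such that:

  1. $\Gamma \subseteq \Sigma$;

  2. $\Gamma$ is unsatisfiable;

  3. $\forall \Delta \subset \Gamma$, $\Delta$ is satisfiable.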
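The three conditions of the definition can be checked directly on very small formulas. The following Python sketch is an illustration only: it relies on exhaustive enumeration, which is exponential and not how MUS are computed in practice, and the names `is_satisfiable` and `is_mus` are assumptions made for this example.

```python
from itertools import product
from typing import FrozenSet, List

Clause = FrozenSet[int]  # assumed encoding: 3 is x3, -3 is its negation


def is_satisfiable(clauses: List[Clause]) -> bool:
    """Brute-force satisfiability test, only suitable for tiny formulas."""
    variables = sorted({abs(l) for c in clauses for l in c})
    for values in product([False, True], repeat=len(variables)):
        interp = dict(zip(variables, values))
        if all(any(interp[abs(l)] == (l > 0) for l in c) for c in clauses):
            return True
    return False


def is_mus(candidate: List[Clause], formula: List[Clause]) -> bool:
    """Check the three conditions: subset, unsatisfiable, minimal."""
    if not all(c in formula for c in candidate):
        return False  # condition 1: must be a subset of the formula
    if is_satisfiable(candidate):
        return False  # condition 2: must be unsatisfiable
    # Condition 3: removing any single clause must restore satisfiability.
    # This covers every proper subset, since a subset of a satisfiable
    # set of clauses is itself satisfiable.
    return all(
        is_satisfiable(candidate[:i] + candidate[i + 1:])
        for i in range(len(candidate))
    )
```

For instance, in the formula made of the clauses `{1}`, `{-1}`, `{1, 2}` and `{-2}`, both `[{1}, {-1}]` and `[{-1}, {1, 2}, {-2}]` pass the check, which shows that a formula can contain several MUS of different sizes.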

Example 1

Let $\Sigma$ be a CNF formula. Figure 1 represents all MUS of $\Sigma$.

Figure 1: All MUS of the formula $\Sigma$ (Example 1)

Due to unsatisfiability of MUS, one has the following property:

Proposition 1

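Let $\Sigma$ be an unsatisfiable CNF formula and $\Gamma$ a MUS of $\Sigma$.

$\forall I$ an interpretation over the variables of $\Sigma$, $\exists c \in \Gamma$ such that $I \not\models c$.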

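Let us consider a CNF formula $\Sigma$ and a complete interpretation $I$. We say that the literal $\ell$ satisfies (resp. falsifies) a clause $c \in \Sigma$ if $\ell \in c \cap I$ (resp. $\ell \in c$ and $\neg\ell \in I$). We note $I^{+}(c)$ (resp. $I^{-}(c)$) the set of literals satisfying (resp. falsifying) a clause $c$. The following definitions were introduced in [Gregoire-etal:2006a].

Definition 2 (once-satisfied clause)

A clause $c$ is said to be once-satisfied by an interpretation $I$ on literal $\ell$ if $I^{+}(c) = \{\ell\}$.

Definition 3 (critical and linked clauses)

Let $I$ be a complete interpretation. A clause $c$ is critical wrt $I$ if $|I^{+}(c)| = 0$ and $\forall \ell \in c$, $\exists c' \in \Sigma$ with $\neg\ell \in c'$ and $I^{+}(c') = \{\neg\ell\}$. Clauses $c'$ are linked to $c$ for the interpretation $I$.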
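The definitions of once-satisfied and critical clauses can be sketched in Python as follows. The sketch assumes the usual reading of these definitions: a clause is critical when it is falsified and, for each of its literals, some other clause contains the opposite literal and is satisfied by that opposite literal only. The function names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Clause = Tuple[int, ...]  # assumed encoding: 3 is x3, -3 is its negation


def satisfying_literals(clause: Clause, interp: Dict[int, bool]) -> Set[int]:
    """Literals of the clause that are true under the complete interpretation."""
    return {l for l in clause if interp[abs(l)] == (l > 0)}


def is_once_satisfied(clause: Clause, interp: Dict[int, bool], lit: int) -> bool:
    """True when `lit` is the only literal satisfying the clause."""
    return satisfying_literals(clause, interp) == {lit}


def is_critical(clause: Clause, formula: List[Clause], interp: Dict[int, bool]) -> bool:
    """A falsified clause whose every flip would falsify a linked clause."""
    if satisfying_literals(clause, interp):
        return False  # a critical clause has no satisfying literal
    return all(
        any(-l in other and is_once_satisfied(other, interp, -l) for other in formula)
        for l in clause
    )
```

With the formula `[(1, 2), (-1,), (-2,)]` and all variables false, the clause `(1, 2)` is critical: flipping either variable satisfies it but falsifies one of the two unit clauses linked to it.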

Example 2

Let be a formula and an interpretation. The clause is critical. The other clauses of are linked to for .

The following properties were proposed and exploited in order to compute MUS by [Gregoire-etal:2006a].

Proposition 2

In a minimum (local or global), all falsified clauses are critical.

Proposition 3

In a minimum (local or global), at least one clause of each MUS is critical.

3 Related Works

As suggested in the introduction, many different approaches have been proposed to combine local search and DPLL based solvers. Such hybridizations can be divided into three categories depending on the kind of the main solver. First, the main solver can be the DPLL one. In that case, SLS is used in order to help DPLL [MazureSG98, Crawford93, Ferris2004, Havens2004]. All of these approaches use the local search component to assist the heuristic choice of the variable to assign. Some of them try to focus the search on the unsatisfiable part of the formula [MazureSG98], others on the satisfiable one [hybrideGM2009, Ferris2004]. Furthermore, this step can be performed before the search [Crawford93] or dynamically at each decision node [MazureSG98].

The second category of hybridizations is the opposite, that is, the SLS solver is the core of the method and the DPLL one helps it [HabetLDV02, Jussien2002]. In [HabetLDV02], the DPLL solver is used in order to find dependencies between variables. Then, the local search framework is called on a subset of variables (the independent ones). In contrast, in [Jussien2002] (note that this method is for constraint satisfaction problems), the local search engine is used to find a promising partial interpretation.

Finally, the last category contains hybrid solvers where both engines work together [FangH07, Letombe08]. The second method is an improvement of the first one. The local search tries to find a solution. After some time, it stops and sends all clauses falsified by the current interpretation to the CDCL part. The latter is responsible for finding a model of this sub-formula. If it proves unsatisfiable, then the whole instance is unsatisfiable too.

We propose in Figure 2 a classification of all of these approaches. The X-axis corresponds to the kind of search. For example, DPLL is at the left, whereas walksat is at the right. The Y-axis corresponds to the ability to solve SAT and/or UNSAT formulas. Thus, walksat is at the top of the classification. The methods introduced above are located in this graph. Of course, this classification is subjective and open to discussion. It is here to help the reader understand all of these approaches.

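Figure 2: Classification. X-axis: kind of search, from DPLL (DP, left) to local search (LS, right). Y-axis: ability to solve SAT and/or UNSAT formulas. Methods placed in the figure: [DPLL62], [SelmanKC94], [Crawford93], [Ferris2004], [hybrideGM2009], [HabetLDV02], [Jussien2002], [MazureSG98], [FangH07, Letombe08].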

4 Intuition

In this section, we provide the intuition behind our hybrid approach SatHyS, which depends on the satisfiability or unsatisfiability of the formula. First note that the SLS engine is the core of our method: the local search part tries to find a solution. When a local minimum is reached, the CDCL part of the solver is launched. It works like a tabu list for satisfiable formulas and tries to focus on MUS for unsatisfiable ones. Let us now explain the main differences.

4.1 SAT instances

Much research has been done on meta-heuristics. Among them, tabu search was introduced in 1986 by Glover [GloverI89] and extended to the SAT case in 1995 [MSG-95-1]. The main idea of tabu search is, from a given position (interpretation), to explore the neighbours and to choose as the next position the one which minimises the objective function.

It is crucial to note that such an operation can increase the value of the objective function: this is the case when all neighbours have a greater value. This mechanism thus allows the search to escape from local minima. However, the main drawback is that at the next step, the search goes back to the same local minimum. To avoid this, the heuristic needs a memory of the last explored positions, which are then forbidden. These positions are tabu.

Already explored positions are stored in a queue (usually called the tabu list) of a given length, which is a parameter of the method. This list must contain complete positions, which can be prohibitive. To get around this, one can store only the previous actions, associated with values of the objective function. The length parameter is very important. Much work has been done to determine the optimal length, statically [MazureSG98] or dynamically [Battiti1997].

We propose to keep the set of tabu variables by using a partial interpretation computed with a unit propagation engine. When a variable becomes tabu, it is assigned in the CDCL solver part and propagated. The resulting interpretation is then used as a tabu list. There are two advantages. Firstly, the length of the tabu list is dynamic: it depends on unit propagation and backjumping. Secondly, unit propagation captures some functional dependencies in the tabu list.
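The idea of a tabu list maintained by unit propagation can be sketched as follows. This is a simplified illustration, not the SatHyS implementation: it ignores decision levels, backjumping and clause learning, and the names `propagate` and `tabu_variables` are assumptions made for this example.

```python
from typing import Iterable, List, Set, Tuple

Clause = Tuple[int, ...]  # assumed encoding: 3 is x3, -3 is its negation


def propagate(formula: List[Clause], partial: Iterable[int]) -> Tuple[Set[int], bool]:
    """Extend a partial interpretation (set of true literals) by unit propagation.

    Returns the extended set of literals and a flag telling whether a
    clause became falsified (conflict).
    """
    fixed = set(partial)
    changed = True
    while changed:
        changed = False
        for clause in formula:
            if any(l in fixed for l in clause):
                continue  # clause already satisfied
            free = [l for l in clause if -l not in fixed]
            if not free:
                return fixed, True  # every literal is false: conflict
            if len(free) == 1:
                fixed.add(free[0])  # unit clause: the literal is forced
                changed = True
    return fixed, False


def tabu_variables(formula: List[Clause], partial: Iterable[int],
                   new_lit: int) -> Tuple[Set[int], bool]:
    """Variables that local search may no longer flip once `new_lit` is fixed."""
    fixed, conflict = propagate(formula, set(partial) | {new_lit})
    return {abs(l) for l in fixed}, conflict
```

Fixing a single literal can freeze several variables at once: with the clauses `(-1, 2)` and `(-2, 3)`, fixing `1` also fixes `2` and `3`, so the effective length of the tabu list is not a parameter but a consequence of propagation.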

4.2 UNSAT instances

First of all, note that if an instance is unsatisfiable then, whatever the complete interpretation, a falsified clause exists. Furthermore, if an instance is unsatisfiable, then it contains at least one MUS. This MUS, i.e. a subset of the clauses of the formula, is often smaller than the whole formula and can therefore contain fewer variables. Hence, for an unsatisfiable formula, it is advantageous to focus the search on such variables.

In the context of MUS detection, Grégoire et al. [Gregoire-etal:2006a, GMP-07-2] have shown that local search provides good heuristics for inconsistent kernel detection. These methods use Propositions 2 and 3 in order to weight clauses which could be part of a MUS.

The method proposed in this paper is based on this principle. When a local minimum is reached, Proposition 2 ensures that all clauses falsified by the current interpretation are critical. Given that such clauses could be part of a MUS, we choose one of them and make all of its literals true. Therefore, only the variables of a kernel are expected to be taken into account.

5 Implementation

As explained in the previous section, the core of our solver SatHyS is the local search component. It is based on an iterative search process that at each step moves from one point to a neighbouring one until a solution is found. At each step it tries to reduce the number of falsified clauses. When this is not possible, a local minimum is reached. In that case, the CDCL part is called. It chooses a falsified clause and assigns all of its literals so that the clause becomes totally satisfied. All literals of the chosen clause are decision nodes. Of course, unit propagation is performed. In this manner, the solver escapes from the local minimum and the SLS part of the hybrid solver can be used again. Note that all variables assigned by the CDCL part are fixed and cannot be flipped by the SLS solver. Of course, during the CDCL process, a conflict can occur. In that case, conflict analysis is performed, a clause is learnt and a backjump is done. Then, some of the fixed variables become free and can be flipped again. This conflict analysis makes the solver able to prove unsatisfiability.

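Input: $\Sigma$ a CNF formula
Result: SAT if $\Sigma$ is satisfiable, else UNSAT
 1  while (true) do
 2    $I_c$ = Init($\Sigma$);
 3    $I_p = \emptyset$;
 4    for $j = 1$ to $\mathit{MaxFlips}$ do
 5      if $I_c \models \Sigma$ then
 6        return SAT;
 7      $\Gamma = \{\alpha \in \Sigma \mid I_c \not\models \alpha\}$;  /* set of falsified clauses */
 8      while $\Gamma \neq \emptyset$ do
 9        choose $\alpha \in \Gamma$;
10        if $\exists x \in \alpha$ allowing a descent then
11          flip $x$ in $I_c$;
12          break;
13        else
14          $\Gamma = \Gamma \setminus \{\alpha\}$;
15      if $\Gamma = \emptyset$ then  /* local minimum */
16        choose $\alpha \in \Sigma$ such that $I_c \not\models \alpha$;
17        if (fix($\Sigma$, $I_c$, $I_p$, $\alpha$) = false) then
18          return UNSAT;
Algorithm 2 SatHyS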

Algorithm 2 takes a CNF formula $\Sigma$ as parameter and returns SAT or UNSAT. It is based on WSAT-like algorithms. Two variables are used: a complete interpretation $I_c$ for the local search engine (initialised randomly) and a partial interpretation $I_p$ for the CDCL part (initialised to the empty set). In order to forbid flipping the literals fixed by the CDCL part (the literals of $I_p$), the SLS solver deals with $\Sigma|_{I_p}$. If the current complete interpretation $I_c$ is a model of $\Sigma$, then SatHyS finishes and returns SAT (lines 5–6). Otherwise, if there exists a neighbour of $I_c$ which decreases the number of falsified clauses, it becomes the current complete interpretation (lines 8–14). If this is not the case, then a local minimum is reached (line 15). In that case, a falsified clause is randomly chosen and the function fix is called in order to fix new literals (lines 15–17). This function is explained below. It modifies the interpretations $I_c$ and $I_p$ by fixing new variables and (if a conflict occurs during Boolean propagation) freeing other ones. At this step, the CDCL solver can prove unsatisfiability. Of course, if this is the case, the search is over (lines 17–18).

This whole process is repeated a given number of times ($\mathit{MaxFlips}$, line 4). After that, the solver tries to move to another area of the search space. The process then continues until an answer is found.

Function fix is described in Algorithm 3. It works like a very simple CDCL solver. It takes a clause $\alpha$ as input. It also takes as input the complete interpretation $I_c$ and the partial one $I_p$, and modifies them. It returns false if unsatisfiability is proven, and true otherwise. The main goal of this function is to fix new variables. To achieve this, it tries to totally satisfy the clause $\alpha$. First of all, the set of decisions, denoted $D$, is initialized. As long as it is not empty and no conflict occurs, a new decision variable is chosen and added to the partial interpretation, and Boolean unit propagation (BCP) is performed (lines 3–6). If a conflict occurs, then the process is stopped. A conflict analysis is done and the partial interpretation is repaired (backjumping). At this step, unsatisfiability can be proved. Otherwise, the obtained nogood is added to the clause database (lines 7–11). Then, the complete interpretation is updated with the help of the partial one (note that $I_c$ and $I_p$ cannot differ).

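Input: $\alpha$ a clause
Output: $\Sigma$ a CNF, $I_c$ a complete interpretation, $I_p$ a partial interpretation
Result: false if unsatisfiability is proven, true otherwise
 1  $D = \{\ell \mid \ell \in \alpha\}$;
 2  $c = \mathrm{null}$;
 3  while $D \neq \emptyset$ and $c = \mathrm{null}$ do
 4    choose $\ell \in D$;
 5    $D = D \setminus \{\ell\}$; $I_p = I_p \cup \{\ell\}$;
 6    $c$ = BCP($\Sigma$, $I_p$);
 7  if $c \neq \mathrm{null}$ then
 8    $\beta$ = conflictAnalysis($\Sigma$, $I_p$, $c$);
 9    $bl$ = computeBackjumpingLevel($\beta$, $I_p$);
10    if ($bl < 0$) then return false;
11    $\Sigma = \Sigma \cup \{\beta\}$; backjump $I_p$ to level $bl$;
12  $I_c = I_c \setminus \{\neg\ell \mid \ell \in I_p\}$;
13  $I_c = I_c \cup I_p$;
14  return true;
Algorithm 3 fix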

6 Experiments

Experimental results reported in this section were obtained on a Xeon 3.2 GHz with 2 GByte of RAM. The CPU time is limited to 1200 seconds.

Our approach is compared with:

  • three SLS methods:

    1. classical wsat [SelmanKC94], i.e. using random walk strategy

    2. rsaps [HutterTH02]

    3. adaptg2 [LiWZ07]

  • two recent hybrid methods submitted at the last SAT competition in 2009:

    1. hybridGM [hybrideGM2009]

    2. hybrid1 [LiWZ07]

  • and two complete methods:

    1. cls, a local search method made complete by adding a resolution process [Fang04]

    2. minisat [EenS03], a well-known CDCL solver.

Instances used are taken from the last SAT competitions (www.satcompetition.org). They are divided into three different categories: crafted (1439 instances), industrial (1305) and random (2172). All instances are preprocessed with SatElite [biere05].

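Solver   | Crafted sat | Crafted unsat | Industrial sat | Industrial unsat | Random sat | Random unsat
---------|-------------|---------------|----------------|------------------|------------|-------------
adaptg2  | 326         | 0             | 232            | 0                | 1111       | 0
rsaps    | 339         | 0             | 226            | 0                | 1071       | 0
wsat     | 259         | 0             | 206            | 0                | 1012       | 0
cls      | 235         | 75            | 227            | 102              | 690        | 0
SatHyS   | 322         | 191           | 466            | 309              | 341        | 14
hybridGM | 290         | 0             | 209            | 0                | 1114       | 0
hybrid1  | 329         | 0             | 277            | 0                | 1126       | 0
minisat  | 402         | 369           | 588            | 414              | 609        | 315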
Table 1: SatHyS versus some other SAT solvers

Table 1 summarizes the results obtained on this large number of instances. For more details on this experimental part, the reader can refer to http://www.cril.fr/lagniez/sathys. For each category and for each solver, we report the number of solved instances. Of course, minisat, a state-of-the-art CDCL based complete solver, is only included to show the gap between local search based techniques and complete modern SAT solvers on industrial and crafted instances. On random satisfiable instances, local search techniques generally outperform complete techniques.

Before analysing the table of results (Table 1) more precisely, note that only three solvers are able to solve unsatisfiable instances (minisat, cls and SatHyS). The recent hybrid methods submitted at the last SAT competition cannot prove inconsistency in the allowed time.

On the crafted instances, SatHyS is very competitive and solves approximately the same number of satisfiable instances as rsaps, adaptg2 and the recent hybrid methods (322 versus 290 to 339). Furthermore, SatHyS solves many more instances than wsat, its built-in solver (259). Concerning unsatisfiable crafted instances, as expected our approach is less efficient than minisat (191 versus 369 instances), but it proves far more efficient than cls (75).

Concerning industrial instances, SatHyS solves about twice as many satisfiable instances as the SLS and hybrid methods (466 versus 206 to 277). Once again, on unsatisfiable industrial instances, our solver is better than cls but less efficient than minisat.

These results show that conflict analysis makes it possible to solve structured SAT and UNSAT instances efficiently.

Finally, for the random category, SatHyS solves almost no unsatisfiable problems (14 instances) and fewer satisfiable ones than the SLS and hybrid methods. As the minisat results suggest, learning is not the right approach for random instances. In summary, our approach unfortunately cannot reach the performance of minisat. However, on crafted and industrial instances, SatHyS is much more efficient than local search based algorithms and hybrid methods. It significantly improves on wsat, its built-in solver. Even if minisat is the best solver on crafted and industrial instances, these first results are very encouraging and reduce the gap between local search based techniques and DPLL-like complete solvers.
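The comparison on structured instances can be checked with a few lines of arithmetic over Table 1. The following Python snippet copies the counts of the table; the variable names and the grouping of crafted and industrial instances under "structured" are choices made for this illustration.

```python
# Counts from Table 1, in the order:
# crafted sat, crafted unsat, industrial sat, industrial unsat, random sat, random unsat
TABLE_1 = {
    "adaptg2": (326, 0, 232, 0, 1111, 0),
    "rsaps": (339, 0, 226, 0, 1071, 0),
    "wsat": (259, 0, 206, 0, 1012, 0),
    "cls": (235, 75, 227, 102, 690, 0),
    "SatHyS": (322, 191, 466, 309, 341, 14),
    "hybridGM": (290, 0, 209, 0, 1114, 0),
    "hybrid1": (329, 0, 277, 0, 1126, 0),
    "minisat": (402, 369, 588, 414, 609, 315),
}

# Total number of solved instances per solver, all categories together.
totals = {name: sum(row) for name, row in TABLE_1.items()}

# Solved instances on structured problems only (crafted + industrial).
structured = {name: sum(row[:4]) for name, row in TABLE_1.items()}

# Solvers ordered by decreasing number of solved structured instances.
ranking = sorted(structured, key=structured.get, reverse=True)

for name in ranking:
    print(f"{name:9s} structured={structured[name]:5d} total={totals[name]:5d}")
```

On crafted and industrial instances taken together, SatHyS solves 1288 instances, behind minisat (1773) and ahead of cls (639) and of all SLS and hybrid methods (at most 606).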

Figure 3: Crafted instances

Figure 4: Industrial instances

Figure 5: Random instances

Figures 3, 4 and 5 give the classical cactus plots. For each tested method, the X-axis corresponds to the number of formulas and the Y-axis corresponds to the time needed to solve them if they were run in parallel. When a method does not appear in a plot, this means that it is not able to solve any instance of the corresponding category. In these figures, we distinguish satisfiable and unsatisfiable instances for each category.
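For readers unfamiliar with cactus plots, the points of one curve can be computed as in the following Python sketch. The function name `cactus_points` and the use of `None` for unsolved instances are assumptions made for this example; the 1200 second limit is the one used in these experiments.

```python
from typing import List, Optional, Tuple


def cactus_points(times: List[Optional[float]], timeout: float) -> List[Tuple[int, float]]:
    """Points (k, t): k instances are solved when each run is given at most t seconds.

    `times` holds one CPU time per instance, or None when the solver gave no answer.
    """
    solved = sorted(t for t in times if t is not None and t <= timeout)
    # The k-th point is the time of the k-th fastest run, i.e. the time
    # needed to solve k instances if all runs were started in parallel.
    return [(k, t) for k, t in enumerate(solved, start=1)]


points = cactus_points([5.0, None, 0.5, 1300.0, 2.0], timeout=1200.0)
print(points)
```

A curve that extends further to the right therefore corresponds to a solver that solves more instances within the time limit.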

7 Conclusion

In this paper, a new integration of local search and CDCL based SAT solvers is introduced. This hybrid solver represents an original combination of both engines. The CDCL component can be seen as a new strategy for escaping from local minima. This is achieved by assigning opposite values to the literals of the falsified clause. In the case of satisfiable SAT instances, such assignments are supposed to behave like a tabu search approach, whereas for unsatisfiable ones, they try to focus on a small sub-part of the formula, which is minimally unsatisfiable (MUS). SatHyS, the resulting method, obtains very good results on a large category of instances. This new method can be improved in different ways. As pointed out in the experimental section, our solver allows for more diversification and less intensification. First attempts have been made to correct this. Finally, we aim at designing a solver which would focus only on an approximation of the MUS.

References