On SAT representations of XOR constraints

by Matthew Gwynne, et al.

We study the representation of systems S of linear equations over the two-element field (aka xor- or parity-constraints) via conjunctive normal forms F (boolean clause-sets). First we consider the problem of finding an "arc-consistent" representation ("AC"), meaning that unit-clause propagation will fix all forced assignments for all possible instantiations of the xor-variables. Our main negative result is that there is no polysize AC-representation in general. On the positive side we show that finding such an AC-representation is fixed-parameter tractable (fpt) in the number of equations. Then we turn to a stronger criterion of representation, namely propagation completeness ("PC") --- while AC only covers the variables of S, now all the variables in F (the variables in S plus auxiliary variables) are considered for PC. We show that the standard translation actually yields a PC representation for one equation, but fails to do so for two equations (in fact arbitrarily badly). We show that with a more intelligent translation we can also easily compute a PC representation for two equations. We conjecture that computing a representation in PC is fpt in the number of equations.




1 Introduction

Recall that the two-element field has elements 0 and 1, where addition is XOR, which we write as ⊕, while multiplication is AND, written ·. A linear system of equations over this field, in matrix form A · x = b, where A is an m × n matrix over {0, 1}, with m the number of equations and n the number of variables, while b ∈ {0, 1}^m, yields a boolean function f, which assigns 1 to a total assignment of the n variables iff that assignment is a solution of the system. The task of "good" representations of f by conjunctive normal forms (clause-sets, to be precise), for the purpose of SAT solving, shows up in many applications, for example cryptanalysing the Data Encryption Standard and the MD5 hashing algorithm in [16], translating pseudo-boolean constraints to SAT in [22], and appearing in a substantial fraction of the benchmarks from SAT 2005 to 2011 according to [57] (see Subsection 1.4 for an overview on the literature).
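The semantics of such a system can be sketched in a few lines (brute-force enumeration, so only for small n; the helper names are ours):

```python
from itertools import product

def satisfies(A, b, x):
    """Check whether total assignment x solves A*x = b over the two-element
    field.  A is a list of m rows (each a list of n bits), b a list of m
    bits, x a list of n bits; addition is XOR (^), multiplication AND (&)."""
    for row, rhs in zip(A, b):
        acc = 0
        for a_ij, x_j in zip(row, x):
            acc ^= a_ij & x_j
        if acc != rhs:
            return False
    return True

def solutions(A, b, n):
    """Enumerate all solutions by brute force over the 2^n assignments."""
    return [x for x in product([0, 1], repeat=n) if satisfies(A, b, list(x))]
```

For instance, the system x1 ⊕ x2 = 1, x2 ⊕ x3 = 0 has exactly the two solutions (1,0,0) and (0,1,1).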

The basic criterion for a good F is "arc-consistency", which we write as "AC-representation" (similar to "PC-representation", explained later); in Subsection 1.4.1 we review and discuss this terminology. For an arbitrary boolean function f, a CNF-representation F of f is AC if for every partial assignment φ to the variables of f, when applying the partial assignment to F, i.e., performing φ * F, and then performing unit-clause propagation, which we write as r_1, the result has no forced assignments anymore, that is, for every remaining variable v and every value ε ∈ {0, 1} the result of assigning ε to v in r_1(φ * F) is satisfiable.
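The AC condition can be checked directly for small examples. The following sketch (our helper names; brute-force satisfiability, so tiny instances only) applies a partial assignment, runs unit-clause propagation, and tests whether an inconsistency or forced assignment survives:

```python
from itertools import product

def ucp(clauses):
    """Unit-clause propagation r_1: clauses are sets of integer literals
    (v or -v).  Returns the reduced clause list, or None if the empty
    clause was derived."""
    clauses = [set(c) for c in clauses]
    while True:
        if any(not c for c in clauses):
            return None
        unit = next((c for c in clauses if len(c) == 1), None)
        if unit is None:
            return clauses
        lit = next(iter(unit))
        clauses = [c - {-lit} for c in clauses if lit not in c]

def apply_pa(clauses, phi):
    """Apply a partial assignment phi (dict var -> bool) to a clause-set."""
    out = []
    for c in clauses:
        if any(abs(l) in phi and (l > 0) == phi[abs(l)] for l in c):
            continue                      # clause satisfied, drop it
        out.append({l for l in c if abs(l) not in phi})
    return out

def satisfiable(clauses):
    """Brute-force SAT test (exponential; for illustration only)."""
    vs = sorted({abs(l) for c in clauses for l in c})
    return any(all(any((l > 0) == bool(dict(zip(vs, bits))[abs(l)]) for l in c)
                   for c in clauses)
               for bits in product([0, 1], repeat=len(vs)))

def ac_violated(clauses, phi):
    """True iff the AC condition fails for phi: after phi and r_1 there is
    still an undetected inconsistency or a forced assignment."""
    after = ucp(apply_pa(clauses, phi))
    if after is None:
        return False                      # r_1 detected the contradiction
    if not satisfiable(after):
        return True                       # inconsistency r_1 missed
    return any(not satisfiable(apply_pa(after, {v: val}))
               for v in {abs(l) for c in after for l in c}
               for val in (False, True))
```

For example, {{1,2},{1,-2},{-1,2}} has the unique solution x1 = x2 = 1, so both variables are forced, yet r_1 finds no unit clause: the AC condition is violated for the empty partial assignment.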

1.1 The lower bound

We show that there is no polynomial-size AC-representation of arbitrary linear systems (Theorem 6.5). To show this, we apply the lower bound on monotone circuit sizes for monotone span programs (msp's) from [2], by translating msp's into linear systems. An msp computes a boolean function f(x_1, ..., x_n) by using auxiliary boolean variables, and for each input x_i a linear system over the two-element field. For the computation of f, the value x_i = 1 means the i-th system is active, while otherwise it is inactive; the value of f is 1 if all the active systems together are unsatisfiable, and 0 otherwise. Obviously f is monotonically increasing. The task is now to put that machinery into a single system of (XOR) equations. The main idea is to "dope" each equation of every subsystem with a dedicated new boolean variable added to the equation, making that equation trivially satisfiable, independently of everything else; all these doping variables together are the auxiliary variables of the combined system, one per equation.

Example 1.1

Consider a small boolean function which can be represented by an msp where each input variable x_i activates one of the linear subsystems. If all systems are activated, then the combined system is unsatisfiable, otherwise it is satisfiable. The doping process then adds a dedicated new variable to each of the resulting linear equations.

If all the doping variables used for a subsystem are set to 0, then they disappear and the system is active, while if they are not set, then this system is trivially satisfiable, and thus is deactivated. Now consider an AC-representation F of the doped system. Note that the x_i are not part of F, but the variables of F are the original variables together with the doping variables, where the latter represent in a sense the x_i. From F we can compute f by setting the doping variables accordingly (if x_i = 1, then all doping variables belonging to the i-th system are set to 0, if x_i = 0, then these variables stay unassigned), running r_1 on the system, and outputting 1 iff the empty clause was produced by r_1. So we can compute msp's from AC-representations of the corresponding linear system, since we can apply partial instantiation. The second pillar of the lower-bound proof is a general polynomial-time translation of AC-representations of (arbitrary) boolean functions into monotone circuits computing a monotonisation of the boolean function (Theorem 6.1; motivated by [9]), where this monotonisation precisely enables partial instantiation. So from F we obtain a monotone circuit computing the monotonisation of f, whose size is polynomial in the size of F, where by [2] the size of such circuits is superpolynomial for certain msp's.

Based on [34], this superpolynomial lower bound also holds if we consider any fixed k, and instead of requiring unit-clause propagation to detect all forced assignments, we only ask that "asymmetric width-bounded resolution", i.e., k-resolution, is sufficient to derive all contradictions obtained by (partial) instantiation (to the variables of the linear system); see Corollary 6.6. Here k-resolution is the appropriate generalisation of width-bounded resolution for handling long clauses (see [49, 51, 52, 54]), where for each resolution step at least one parent clause has length at most k (while the standard "symmetric width" requires both parent clauses to have length at most k).

1.2 Upper bounds

Besides this fundamental negative result, we provide various forms of good representations of systems with a bounded number m of equations. Theorem 9.2 shows that there is an AC-representation whose number of clauses is polynomial in n for each fixed m. The remaining results use a stronger criterion for a "good" representation, namely they demand that F ∈ PC, where PC is the class of "unit-propagation complete clause-sets" as introduced in [12] — while for AC only partial assignments to the variables of the system are considered, now partial assignments over all variables in F (which contains the variables of the system, and possibly further auxiliary variables) are to be considered. For one equation the standard translation, obtained by subdividing the big constraint into small constraints, is in PC (Lemma 8.4). For two equations we have an intelligent representation in PC (Theorem 10.1), while the use of the standard translation (piecewise) is still feasible for full (dag-)resolution, but not for tree-resolution (Theorem 10.6).
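The subdivision of a big XOR-constraint into small ones can be sketched as follows (a hypothetical helper; variables are positive integers, and fresh auxiliary variables are numbered from next_aux upward):

```python
def cut_xor(xs, rhs, next_aux):
    """Break the XOR constraint xs[0] ^ ... ^ xs[-1] = rhs into a chain of
    ternary XOR constraints linked by fresh auxiliary variables.
    Returns (list of (variables, rhs) constraints, next free variable)."""
    if len(xs) <= 3:
        return [(list(xs), rhs)], next_aux
    out = []
    carry = xs[0]
    for x in xs[1:-2]:
        # carry ^ x ^ aux = 0, i.e. aux carries the partial sum so far
        out.append(([carry, x, next_aux], 0))
        carry = next_aux
        next_aux += 1
    out.append(([carry, xs[-2], xs[-1]], rhs))
    return out, next_aux
```

Summing the small constraints recovers the original one, since each auxiliary variable occurs in exactly two of them and cancels.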

We conjecture (Conjecture 10.2) that Theorem 9.2 and Theorem 10.1 can be combined, which would yield an fpt-algorithm for computing a representation in PC for arbitrary systems, with the number of equations as the parameter. We now turn to a discussion of the advantages of having a representation in PC compared to mere AC, also placing this in a wider framework.

1.3 Measuring “good” representations

We have now seen two criteria for good representations of boolean functions, namely AC and the stronger condition of unit-propagation completeness. In Subsection 1.3.1 we discuss some fundamental aspects of these two criteria, while in Subsection 1.3.2 we consider another criterion, namely unit-refutation completeness, concluding this excursion in Subsection 1.3.3 with a general framework.

1.3.1 AC versus PC

It has been shown that the practical performance of SAT solvers can depend heavily on the SAT representation used. See for example [5, 72, 22] for work on cardinality constraints, [75, 76] for work on general constraint translations, and [46, 31] for investigations into different translations in cryptography. In order to obtain "good" representations, the basic concept is that of an AC-representation. The task is to ensure that for all (partial) assignments to the variables of the constraint, if there is a forced assignment (i.e., some variable which must be set to a particular value to avoid inconsistency), then unit-clause propagation (r_1) is sufficient to find and set this assignment. In a similar vein, there is the class PC of propagation-complete clause-sets, containing all clause-sets for which unit-clause propagation is sufficient to detect all forced assignments; the class was introduced in [12], while in [3] it is shown that membership decision is coNP-complete.

AC and PC may at a glance seem the same concept. However there is an essential difference. When translating a constraint into SAT, typically one does not just use the variables of the constraint, but one adds auxiliary variables to allow for a compact representation. Now when speaking of AC, one only cares about assignments to the constraint variables. But propagation-completeness deals only with the representing clause-set, thus cannot know about the distinction between original and auxiliary variables, and thus it is a property of the (partial) assignments over all variables! So a SAT representation which is AC will in general not fulfil the stronger property of propagation-completeness, due to assignments over both constraint and auxiliary variables yielding a forced assignment or even an inconsistency which r_1 does not detect.

In [44] it is shown that conflict-driven solvers with branching restricted to input variables require superpolynomial run-time on an (extreme) Extended Resolution extension of the pigeon-hole formulas, while unrestricted branching determines unsatisfiability quickly (see [54] for a proof-theoretical analysis of the general context). Also experimentally it is demonstrated in [45] that input-restricted branching can have a detrimental effect on solver times and proof sizes for modern CDCL solvers. This adds motivation to considering all variables (rather than just input variables) when deciding what properties we want for SAT translations. We call this the "absolute (representation) condition", taking also the auxiliary variables into account, while the "relative condition" only considers the original variables.

Besides avoiding the creation of hard unsatisfiable sub-problems, the absolute condition also enables one to study the "target classes", like PC, on their own, without relation to what is represented. Target classes different from PC have been proposed, and are reviewed in the following. The underlying idea of AC- and propagation-complete translations is to compress all of the constraint knowledge into the SAT translation, and then to use r_1 to extract this knowledge when appropriate. In Subsection 1.3.2 we present a weaker notion of what "constraint knowledge" could mean, while in Subsection 1.3.3 we present different extraction mechanisms.

1.3.2 Unit-refutation completeness

In [32, 33, 37] we considered the somewhat more fundamental class UC of "unit-refutation complete" clause-sets, introduced in [21] as a method for propositional knowledge compilation. Rather than requiring that r_1 detects all forced assignments (as for PC), a clause-set F is in UC iff for all partial assignments resulting in an unsatisfiable clause-set, r_1 detects this. As shown in [32, 33, 37], the equation UC = SLUR holds, where SLUR, introduced in [71], is a fundamental class of clause-sets for which SAT is decidable in polynomial time; in [14] it was shown that membership decision for SLUR is coNP-complete.

These considerations can be extended to a general "measurement" approach, where we do not just have an in/out decision for some target class, but where a "hardness" measure tells us how far F is from UC resp. PC (in some sense); this general approach is discussed next.

1.3.3 How to gauge representations?

We now outline a more general approach to gauging how good a representation F of a boolean function f is. Obviously the size of F must be considered: the number of variables n(F), the number of clauses c(F), and the number of literal occurrences ℓ(F). Currently we do not see a possibility to be more precise than to say that a compromise is to be sought between stronger inference properties of F and the size of F. One criterion to judge the inference power of F is AC, as already explained. This yields no guidance in case no AC-representation is feasible, nor does it allow for stronger representations. Our approach addresses these concerns as follows.

[32, 37] introduced the measures hd, phd, whd ("hardness", "p-hardness", and "w-hardness") on the set CLS of all clause-sets (interpreted as CNF's), together with relativised versions for a set V of variables. These measures determine the maximal "effort" (in some sense) needed to show unsatisfiability of instantiations φ * F for partial assignments φ over V in case of hd and whd, resp. the maximal "effort" to determine all forced assignments in case of phd. The "effort" in case of hd or phd is the maximal level k of generalised unit-clause propagation needed, that is, the maximal k for the reductions r_k introduced in [51, 52], where r_1 is unit-clause propagation and r_2 is (complete) elimination of failed literals. For whd the effort is the maximal k needed for asymmetric width-bounded resolution, i.e., for each resolution step one of the parent clauses must have length at most k. (Symmetric width-bounded resolution requires both parent clauses to have length at most k, which for arbitrary clause-lengths is not appropriate as a complexity measure, since already unsatisfiable Horn clause-sets need unbounded symmetric width; see [54] for the use of asymmetric width in the context of resolution and/or space lower bounds.)

Now F is an AC-representation of f iff the p-hardness of F relative to the variables of f is at most 1, while allowing higher relative p-hardness k corresponds to higher levels of generalised unit-clause propagation (allowing potentially shorter F). Weaker is the requirement of relative hardness at most 1, which has various names in the literature: we call it "relative hardness", while [11] calls it "existential unit-refutation completeness", and [4] calls it "unit contradiction". Now not every forced assignment is necessarily detected by unit-clause propagation, but only unsatisfiability. Similarly, higher relative hardness k would allow higher levels of generalised unit-clause propagation.

If we only consider "relative (w/p-)hardness", that is, the versions relativised to the variables of f, then, as shown in [34], regarding polysize representations the bounded-hardness, bounded-p-hardness, and bounded-w-hardness conditions are all equivalent to AC, that is, the representations can be transformed in polynomial time into AC-representations. These transformations produce large representations (and very likely they are not fixed-parameter tractable in k), and so higher k can yield smaller representations; however these savings cannot be captured by the notion of polynomial size.

This situation changes, as we show in [36], when we do not allow auxiliary variables, that is, we require var(F) = var(f): now higher values of each of these measures allow short representations which otherwise require exponential size. We conjecture that this strictness of the hierarchies also holds in the presence of auxiliary variables, but using the absolute condition, i.e., V = var(F) (all variables are included in the worst-case determinations for (w/p-)hardness). The measurements in case of V = var(F) are just written as hd(F), phd(F), whd(F). In this way we capture the classes UC and PC, namely UC = {F : hd(F) ≤ 1} and PC = {F : phd(F) ≤ 1}. More generally we have the hierarchies UC_k = {F : hd(F) ≤ k}, PC_k = {F : phd(F) ≤ k}, and WC_k = {F : whd(F) ≤ k}. The basic relations between these classes are PC_k ⊆ UC_k, UC_k ⊆ PC_{k+1}, and UC_k ⊆ WC_k.

1.4 Literature review

Section 1.5 of [33, 37] discusses the translation of the so-called "Schaefer classes" into the UC_k hierarchy; see Section 12.2 in [19] for an introduction, and see [18] for an in-depth overview on recent developments. All Schaefer classes except affine equations have natural translations into low levels of the hierarchy. The open question was whether systems of XOR-clauses (i.e., affine equations) can be translated into UC_k for some fixed k; the current paper answers this question negatively.

Our investigations into these classes started with [32, 37]. From there on, three new developments started. First we have this paper. Then we have [36], showing that without auxiliary variables the hierarchies UC_k, PC_k and WC_k are strict regarding polysize representations of boolean functions. Finally, [34] discusses general tools for obtaining "good" representations, and how SAT solvers perform with them; it contains the proof that regarding the relative condition and allowing auxiliary variables, all three hierarchies collapse to their first level (regarding polysize representations). The predecessor of [36, 34] and the current report is [35], while the (shortened) conference version of the current report is [38].

1.4.1 Discussion of terminology “arc-consistent representation”

We think that the current terminology in the literature can potentially cause confusion between the fields of CSP (constraint-satisfaction problems) and SAT, and thus some clarifying discussion is needed. The notion "AC-representation", fully expanded, says: "a representation maintaining hyperarc-consistency (or generalised arc-consistency) via unit-clause propagation for the (single, global) boolean constraint after (arbitrary) partial assignments". Recall that a constraint is "hyperarc-consistent" if for each variable each value in its (current) domain is still available (does not yield an inconsistency); see Chapter 3 of [68]. However for a (boolean) clause-set F it is not clear what the "constraint" is. In our context it is most natural to consider the whole of F as a single "constraint", and then, since every variable has (precisely) two values, "arc-consistency" of F means that F has no forced assignments (or no "forced literals"). It must be emphasised here that considering F as a single constraint is not a natural point of view for the CSP area. The standard notion of "arc-consistency" just applies to binary constraints, and for ordinary constraints the number of variables is considered as constant; only "global constraints" are allowed to contain a non-constant number of variables. Especially for XOR-clause-sets it is tempting to take each XOR-clause as a constraint, but this is not interesting here.

The notion of "an encoding maintaining arc-consistency via unit propagation" has been introduced in [28], showing that the support encoding of a single constraint yields in our terminology an AC-representation; it is essential here that this assumes, as usual, that the number of variables is constant. "Maintaining", as in "MAC" for "maintaining arc-consistency", applies to constraints after a domain restriction (which for SAT is achieved by partial assignments), where (hyper)arc-consistency has to be re-established (this can be done in polynomial time, since the number of variables in a constraint is constant). Apparently the first explicit definition of "arc-consistency under unit propagation" for SAT representations is [22], the Definition on Page 5 (it is left open whether the partial assignments may also involve the introduced variables, but this is a kind of automatic assumption, since only the variables of the (original) constraint are considered in this context; an assumption which we challenge by considering all variables). For further examples for pseudo-boolean constraints see Section 22.6.7 in [69] and [5, 72, 6], while related considerations can be found in [48].

We prefer to speak of "AC-representations", hiding the "arc-consistency". It also seems superfluous to mention in this context "unit(-clause) propagation". One could also say "AC-translation" or "AC-encoding", but we reserve "translation" for (poly-time) functions computing a representation, and "encoding" for the translation of non-boolean variables into boolean variables. Our own "proper terminology" for an "AC-representation" F of f is that F is a CNF-representation of f with relative p-hardness at most 1.

1.4.2 Applications of XOR-constraints

If we do not specify the representation in the following, then essentially the standard translation is used, that is, breaking up long XOR-constraints into short ones and using the unique equivalent CNF's for the short constraints.
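For a single short XOR-clause, the unique equivalent CNF consists of one clause per falsifying total assignment, hence 2^(k-1) clauses for k variables; a minimal sketch (helper name is ours):

```python
from itertools import product

def xor_to_cnf(xs, rhs):
    """The unique equivalent CNF of xs[0] ^ ... ^ xs[-1] = rhs:
    one clause forbidding each falsifying total assignment, giving
    2^(k-1) clauses over k variables (positive integers)."""
    clauses = []
    for bits in product([0, 1], repeat=len(xs)):
        parity = 0
        for b in bits:
            parity ^= b
        if parity != rhs:  # this assignment falsifies the constraint
            clauses.append(frozenset(-v if b else v for v, b in zip(xs, bits)))
    return clauses
```

E.g. x1 ⊕ x2 ⊕ x3 = 0 yields the four clauses {-1,2,3}, {1,-2,3}, {1,2,-3}, {-1,-2,-3}, which is why the cutting step above is needed before translating long constraints.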

XOR-constraints are a typical part of cryptographic schemes, and accordingly it is important to have "good" representations for them. The earliest application of SAT to cryptanalysis is [65], translating DES to SAT and then attempting to find a key. In [16], DES is encoded to ANF ("algebraic normal form", that is, XOR's of conjunctions), and then translated. [46] attacks DES, AES and the Courtois Toy Cipher via translation to SAT. Each cipher is first translated to equations over GF(2) and then to CNF. A key contribution is a specialised translation of certain forms of polynomials, designed to reduce the number of variables and clauses. The length at which long XOR-constraints are broken up is called the "cutting length", and apparently has some effect on solver times. [66] translates MD5 to SAT and finds collisions. MD5 is translated by modelling it as a circuit (including XORs) and applying the Tseitin translation.

[64] provides an overview of SAT-based methods in Electronic Design Automation, and suggests keeping track of circuit information (fan in/fan out of gates etc.) in the SAT solver when solving such instances. XOR is relevant here due to the use of XOR gates in the underlying circuit being checked (and translated).

A potential application area is the translation of pseudo-boolean constraints, as investigated by [22]. Translations via "full-adders" introduce XORs via the translation of the full-adder circuit. It is shown that this translation does not produce an AC-representation (does not "maintain arc-consistency via unit propagation"), and the presence of XOR and the log encoding is blamed for this (in Section 5.5). Experiments conclude that sorting-network and BDD methods perform better, as long as their translations are not too large.

1.4.3 Hard examples via XORs

It is well-known that translating each XOR to its prime implicates results in hard instances for resolution. This goes back to the "Tseitin formulas" introduced in [77], which were proven hard for full resolution in [78], and generalised to (empirically) hard satisfiable instances in [39]. A well-known benchmark was introduced in [17], called the "Minimal Disagreement Parity" problem (which became the parity32 benchmarks in the SAT 2002 competition). Given vectors x_1, ..., x_m ∈ {0,1}^n, further bits y_1, ..., y_m, and a bound k, find a vector a ∈ {0,1}^n such that the scalar product of x_i and a over the two-element field differs from y_i for at most k indices i. The SAT encoding expresses each scalar product via XOR-constraints, together with a cardinality constraint based on full-adders. So the XORs occur both in the summations and the cardinality constraint. These benchmarks were first solved by the solver EqSatz ([63]).
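A brute-force reading of the Minimal Disagreement Parity problem (only for tiny instances; the helper name is ours):

```python
from itertools import product

def min_disagreement(xs, ys):
    """Find a vector a minimising the number of samples i where the
    scalar product <x_i, a> over GF(2) differs from y_i.  Brute force
    over all 2^n candidate vectors."""
    n = len(xs[0])
    best, best_a = len(ys) + 1, None
    for a in product([0, 1], repeat=n):
        dis = sum(1 for x, y in zip(xs, ys)
                  if sum(xi & ai for xi, ai in zip(x, a)) % 2 != y)
        if dis < best:
            best, best_a = dis, a
    return best, best_a
```

An instance with a consistent hidden vector has minimum disagreement 0; the benchmarks perturb some y_i so that a small but nonzero number of disagreements must be tolerated.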

1.4.4 Special reasoning

It is natural to consider extensions of resolution and/or SAT techniques to handle XOR-constraints more directly. The earliest theoretical approach seems [7], integrating a proof calculus for Gaussian elimination with an abstract proof calculus modelling DPLL. It is argued that such a system should offer improvements over just DPLL/resolution in handling XORs. [42] points out the simple algorithm for extracting "equivalence constraints". The earliest SAT solver with special reasoning is EqSatz ([63]), extracting XOR-clauses from its input and applying DP-resolution plus incomplete XOR reasoning rules. More recently, CryptoMiniSAT ([74, 73]) integrates Gaussian elimination during search, allowing both explicitly specified XOR-clauses and also XOR-clauses extracted from CNF input. However in the newest version 3.3 the XOR handling during search is removed, since it is deemed too expensive (see http://www.msoos.org/2013/08/why-cryptominisat-3-3-doesnt-have-xors/). Further approaches for hybrid solvers can be found in [15, 40].

A systematic study of the integration of XOR-reasoning and SAT-techniques has been started with [55], by introducing the "DPLL(XOR)" framework, similar to SMT. These techniques have also been integrated into MiniSat. [56] expands on this by reasoning about equivalence classes of literals created by binary XORs, while [58] learns conflicts in terms of "parity (XOR) explanations". The latest paper [59] (with underlying report [60]) extends the reasoning from "Gauß elimination" to "Gauß-Jordan elimination", which corresponds to moving from relative hardness to relative p-hardness, i.e., also detecting forced literals, not just inconsistency. (We say "relative" here, since the reasoning mechanism is placed outside of SAT solving, different from the "absolute" condition, where also the reasoning itself is made accessible to SAT solving, that is, one can (feasibly!) split in some sense on the higher-level reasoning.) Theorem 4 in [59] is similar in spirit to Corollary 4.8, Part 2, considering conditions under which strong reasoning only needs to be applied to "components".

Altogether we see a mixed picture regarding special reasoning in SAT solvers. The first phase of expanding SAT solvers could be seen as having ended in some disappointment regarding XOR reasoning, but with [55] a systematic approach towards integrating special reasoning has been re-opened. A second approach for handling XOR-constraints, the approach of this paper, is by using intelligent translations (possibly combined with special reasoning).

1.4.5 Translations to CNF

Switching now to translations of XORs to CNF, [57] identifies the subset of "tree-like" systems of XOR constraints, where the standard translation delivers an AC-representation (our Theorem 8.5 strengthens this, showing that indeed a representation in PC is obtained):

  • [57] also considered equivalence reasoning, where for “cycle-partitionable” systems of XOR constraints this reasoning suffices to derive all conclusions.

  • Furthermore [57] showed how to eliminate the need for such special equivalence reasoning by another AC-representation.

  • In general, the idea is to only use Gaussian elimination for such parts of XOR systems which the SAT solver is otherwise incapable of propagating on. Existing propagation mechanisms, especially unit-clause propagation, and to a lesser degree equivalence reasoning, are very fast, while Gaussian elimination is much slower (although still poly-time).
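The Gaussian elimination mentioned above, over the two-element field, can be sketched as follows (each row carries its right-hand side as last entry; this is a full Gauss-Jordan style reduction, and the helper name is ours):

```python
def gauss_gf2(rows):
    """Gauss-Jordan elimination over GF(2).  Each row is a list of n+1
    bits: coefficients followed by the right-hand side.  Returns the
    reduced rows, or None if an inconsistent row 0 = 1 appears."""
    rows = [list(r) for r in rows]
    n = len(rows[0]) - 1
    pivot_row = 0
    for col in range(n):
        pr = next((r for r in range(pivot_row, len(rows)) if rows[r][col]), None)
        if pr is None:
            continue
        rows[pivot_row], rows[pr] = rows[pr], rows[pivot_row]
        for r in range(len(rows)):
            # eliminate col everywhere else by XOR-ing in the pivot row
            if r != pivot_row and rows[r][col]:
                rows[r] = [a ^ b for a, b in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
    for r in rows:
        if not any(r[:-1]) and r[-1]:
            return None  # 0 = 1: the system is unsatisfiable
    return rows
```

This is polynomial time but clearly heavier than unit-clause propagation, which motivates reserving it for the parts the CNF translation cannot propagate on.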

Experimental evaluation on SAT 2005 benchmark instances showed that, when "not too large", such CNF translations outperform dedicated XOR reasoning modules. The successor [61] provides several comparisons of special-reasoning machinery with resolution-based methods, and in Theorem 4 there we find a general AC-translation; our Theorem 9.2 yields a better upper bound, but the heuristic reasoning of [59, 61] seems valuable, and should be explored further.

1.5 Better understanding of “SAT”

One motivation of our investigations is the question of the relation between CSP and SAT, and of the reasons for the success of SAT. Viewing a linear system as a constraint on its variables, one can encode evaluation via Tseitin's translation, obtaining a CNF-representation F with the property that for every total assignment φ (i.e., assigning all variables of F) we have that φ * F either contains the empty clause or is empty. (In Subsection 9.4.1 of [36] this class of representations is studied; up to linear-time transformation it is the same as representation by boolean circuits.) However this says nothing about partial assignments, and as our result shows (Theorem 6.5), there is indeed no polysize representation which handles all partial assignments. One can write an algorithm (a kind of "constraint checker", now using Gaussian elimination) which handles (in a sense) all partial assignments in polynomial time (detects unsatisfiability of the instantiated system for all partial assignments), but this cannot be integrated into the CNF formalism (by using auxiliary variables and clauses), since algorithms always need total assignments, and so partial assignments would need to be encoded: the information "variable not assigned" needs to be represented by setting some auxiliary variable, and this must happen by a mechanism outside of the CNF formalism.

It is an essential strength of the CNF formalism to allow partial instantiation; if we want these partial instantiations also to be easily understandable by a SAT solver, then the results of [9] and our results show that there are restrictions. Yet there is little understanding of these restrictions. There are many examples where AC and stronger representations are possible, while the current non-representability results (one in [9], one in this article, and a variation on [9] in [54]) rely on non-trivial lower bounds on monotone circuit complexity; in fact Corollary 6.4 shows that a boolean function f has a polysize AC-representation if and only if the monotonisation of f, which encodes partial assignments to the inputs, has polysize monotone circuits.

1.6 Overview on results

In Section 2 we give the basic definitions related to clause-sets, and in Section 3 we review the basic concepts and notions related to hardness, w-hardness, and the classes UC_k and WC_k. Section 4 contains a review of p-hardness and the classes PC_k, and also introduces basic criteria for membership in PC for clause-sets: the main result, Theorem 4.6, shows that acyclicity of the "incidence graph" is sufficient. In Section 5 we introduce the central concepts of this article, "XOR-clause-sets" and their CNF-representations, and show in Lemma 5.2 that the sum of XOR-clauses is the (easier) counterpart to the resolution operation for (ordinary) clauses.

In Section 6 we present our general lower bound. Motivated by [9], in Theorem 6.1 we show that from a CNF-representation of bounded relative hardness of a boolean function f we obtain in polynomial time a monotone circuit computing the monotonisation of f, which extends f by allowing partial assignments to the inputs. (The precise relation to the results of [9] is not clear; the notion of "CNF decomposition of a consistency checker" in [9] is similar to an AC-representation, but it contains an additional special condition.) Actually, as we show in Corollary 6.4, in this way AC-representations are equivalently characterised. Theorem 6.5 shows that there are no short AC-representations of arbitrary XOR-clause-sets (at all), with Corollary 6.6 generalising this to arbitrary relative w-hardness.

The fundamental translation of XOR-clause-sets (using the trivial translation for every XOR-clause) is studied in Section 7, with Lemma 7.1 stating the basic criterion for membership in PC. Furthermore the Tseitin formulas are discussed. The standard translation uses the fundamental translation, but breaks up long clauses first (to avoid the exponential size-explosion), and is studied in Section 8. The main result here is Theorem 8.5, showing that if the XOR-clause-set fulfils the basic graph-theoretic properties considered before, then its standard translation is in PC. In Section 9 we show that applying the fundamental translation after adding all implied XOR-clauses achieves AC in linear time for a constant number of XOR-clauses (Theorem 9.2). In Section 10 we turn to the question of representing two XOR-clauses. In Theorem 10.1 we show how to obtain a translation in PC. Then we discuss two further translations and show that all three cases can be distinguished here regarding their complexity measures; the worst representation still yields an acceptable translation regarding w-hardness, but not regarding hardness (Theorem 10.6). Finally in Section 11 we present the conclusions and open problems.

2 Preliminaries

We follow the general notations and definitions as outlined in [50]. We use ℕ = {1, 2, ...} and ℕ_0 = ℕ ∪ {0}. Let VA be the infinite set of variables, and let LIT be the set of literals, the disjoint union of variables as positive literals and complemented variables as negative literals. For a set L of literals, L̄ denotes its elementwise complementation. A clause is a finite subset C ⊂ LIT which is complement-free, i.e., C ∩ C̄ = ∅; the set of all clauses is denoted by CL. A clause-set is a finite set of clauses; the set of all clause-sets is CLS. By var(x) we denote the underlying variable of a literal x, and we extend this via var(C) := {var(x) : x ∈ C} for clauses C, and via var(F) := ⋃_{C ∈ F} var(C) for clause-sets F. The possible literals in a clause-set F are denoted by lit(F) := var(F) ∪ var(F)̄. Measuring clause-sets happens by n(F) := |var(F)| for the number of variables, c(F) := |F| for the number of clauses, and ℓ(F) := Σ_{C ∈ F} |C| for the number of literal occurrences. A special clause-set is ⊤ := ∅, the empty clause-set, and a special clause is ⊥ := ∅, the empty clause.

A partial assignment is a map φ : V → {0, 1} for some finite V ⊂ VA, where we set var(φ) := V, and where the set of all partial assignments is PASS. For v ∈ var(φ) we extend φ to the literal v̄ via φ(v̄) := 1 − φ(v). We construct partial assignments by terms ⟨x_1 → ε_1, …, x_n → ε_n⟩ for literals x_1, …, x_n with different underlying variables and ε_1, …, ε_n ∈ {0, 1}. For φ ∈ PASS and F ∈ CLS we denote the result of applying φ to F by φ*F, removing clauses C with a literal x ∈ C such that φ(x) = 1, and removing literals x with φ(x) = 0 from the remaining clauses. By SAT the set of satisfiable clause-sets is denoted (those F with φ*F = ⊤ for some φ ∈ PASS), and by USAT := CLS \ SAT the set of unsatisfiable clause-sets.
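To make the operator φ*F concrete, here is a minimal sketch in Python (the encoding is ours, not the paper's): a clause is a frozenset of nonzero integers, a negative integer standing for the complemented variable, and a partial assignment is a dict mapping variables to booleans.

```python
def apply_assignment(phi, F):
    """Compute phi * F: remove clauses containing a literal set to true,
    and delete literals set to false from the remaining clauses."""
    result = set()
    for clause in F:
        satisfied = False
        rest = set()
        for lit in clause:
            v = abs(lit)
            if v in phi:
                if phi[v] == (lit > 0):
                    satisfied = True      # clause satisfied by phi
                    break
                # literal falsified by phi: simply dropped
            else:
                rest.add(lit)             # variable untouched by phi
        if not satisfied:
            result.add(frozenset(rest))
    return result
```

For example, applying ⟨x_1 → 1⟩ to {{x_1, x_2}, {x̄_1, x_3}} removes the first clause and shortens the second to {x_3}.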

By r_1 : CLS → CLS we denote unit-clause propagation, that is,

  • r_1(F) := {⊥} if ⊥ ∈ F,

  • r_1(F) := F if F contains only clauses of length at least 2,

  • while otherwise a unit-clause {x} ∈ F is chosen, and recursively we define r_1(F) := r_1(⟨x → 1⟩*F).

It is easy to see that the final result r_1(F) does not depend on the choices of the unit-clauses. In [51, 52] the theory of generalised unit-clause propagation r_k : CLS → CLS for k ∈ ℕ_0 was developed; the basic idea should become clear from the definition of r_2, which is complete “failed literal elimination”: if r_1(F) = {⊥}, then r_2(F) := {⊥}, if otherwise there is a literal x such that r_1(⟨x → 0⟩*F) = {⊥}, then r_2(F) := r_2(⟨x → 1⟩*F), and otherwise r_2(F) := F.
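The reductions r_1 and r_2 can be transcribed directly into code; the sketch below (our own encoding: clauses as frozensets of nonzero integers, a negative integer denoting the complemented variable) follows the recursive definitions, with {frozenset()} playing the role of {⊥}.

```python
def assign(F, lits):
    """Set each literal in lits to true and apply the assignment to F."""
    phi = {abs(l): l > 0 for l in lits}
    G = set()
    for C in F:
        if any(phi.get(abs(l)) == (l > 0) for l in C):
            continue                               # clause satisfied
        G.add(frozenset(l for l in C if abs(l) not in phi))
    return G

def r1(F):
    """Unit-clause propagation."""
    F = set(F)
    while True:
        if frozenset() in F:
            return {frozenset()}                   # the empty clause: {bot}
        unit = next((C for C in F if len(C) == 1), None)
        if unit is None:
            return F                               # only clauses of length >= 2
        F = assign(F, unit)                        # propagate the unit literal

def r2(F):
    """Complete failed-literal elimination: probe each literal with r1."""
    F = r1(F)
    if F == {frozenset()}:
        return F
    for x in {l for C in F for l in C}:
        if r1(assign(F, [-x])) == {frozenset()}:   # <x -> 0> refuted by r1
            return r2(assign(F, [x]))              # hence x is forced
    return F
```

As the text remarks, the result of r1 does not depend on the order in which unit-clauses are chosen.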

Reduction by r_k applies certain forced assignments to the (current) F, which are assignments ⟨x → 1⟩ such that the opposite assignment yields an unsatisfiable clause-set, that is, where ⟨x → 0⟩*F ∈ USAT; the literal x here is also called a forced literal. The reduction applying all forced assignments is denoted by r_∞ (so r_∞(F) = r_{n(F)}(F)). Forced assignments are also known under other names, for example “necessary assignments” or “backbones”; see [43] for an overview on algorithms computing all forced assignments.
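Forced literals can be computed naively by probing: x is forced for F iff adding the unit clause {x̄} makes F unsatisfiable. A brute-force sketch (our encoding: clauses as frozensets of nonzero integers; feasible only for small clause-sets):

```python
from itertools import product

def satisfiable(F):
    """Brute-force SAT check over all total assignments to var(F)."""
    vs = sorted({abs(l) for C in F for l in C})
    for bits in product([False, True], repeat=len(vs)):
        phi = dict(zip(vs, bits))
        if all(any(phi[abs(l)] == (l > 0) for l in C) for C in F):
            return True
    return False

def forced_literals(F):
    """x is forced for F iff F together with the unit clause {-x} is
    unsatisfiable (equivalently, <x -> 0> * F is unsatisfiable)."""
    lits = {l for C in F for l in C} | {-l for C in F for l in C}
    return {x for x in lits if not satisfiable(F | {frozenset({-x})})}
```

Note that for unsatisfiable F this returns all literals over var(F), matching the convention that then every literal is forced.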

Two clauses C, D are resolvable iff they clash in exactly one literal x, that is, C ∩ D̄ = {x}, in which case their resolvent is (C \ {x}) ∪ (D \ {x̄}) (with resolution literal x). A resolution tree is a full binary tree formed by the resolution operation. We write T : F ⊢ C if T is a resolution tree with axioms (the clauses at the leaves) all in F and with derived clause (at the root) C.

A prime implicate of F ∈ CLS is a clause C such that a resolution tree T with T : F ⊢ C exists, but no T′ : F ⊢ C′ exists for some C′ with C′ ⊂ C; the set of all prime implicates of F is denoted by prc_0(F). The term “implicate” refers to the implicit interpretation of F as a conjunctive normal form (CNF). Considering clauses as combinatorial objects one can speak of “prime clauses”, and the “0” in our notation prc_0(F) reminds of “unsatisfiability”, which is characteristic for CNF. Two clause-sets F, F′ are equivalent iff prc_0(F) = prc_0(F′). A clause-set F is unsatisfiable iff prc_0(F) = {⊥}. If F is unsatisfiable, then every literal x is a forced literal for F, while otherwise x is forced for F iff {x} ∈ prc_0(F). It can be considered as known, that for a clause-set F the computation of all prime implicates is fixed-parameter tractable (fpt) in the number c(F) of clauses, but perhaps that is not stated explicitly in the literature, and so we provide a simple proof:

Lemma 2.1

Consider F ∈ CLS and let c := c(F). Then |prc_0(F)| ≤ 2^c, and prc_0(F) can be computed in time 2^{O(c)} · ℓ(F)^{O(1)}.

Proof:  By Subsection 4.1 (especially Lemma 4.12) in [36] we run through all subsets F′ ⊆ F, determine the set C of pure literals of F′, and include C if C is an implicate of F (note that SAT-decision for a CNF-clause-set F can be done in time fpt in c(F)). To the final result subsumption-elimination is applied, which can be done in cubic time, and we obtain prc_0(F).
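For small inputs, prc_0(F) can also be obtained by a naive reference method, quite different from the fpt subset-enumeration of the proof: enumerate all complement-free clauses over var(F), keep the implicates, and remove subsumed ones. A sketch in our own integer-literal encoding (clauses as frozensets of nonzero integers):

```python
from itertools import combinations, product

def is_implicate(F, C):
    """C is an implicate of CNF F iff F and the negation of C cannot be
    satisfied together, checked by brute force over total assignments."""
    vs = sorted({abs(l) for cl in F for l in cl} | {abs(l) for l in C})
    for bits in product([False, True], repeat=len(vs)):
        phi = dict(zip(vs, bits))
        sat_F = all(any(phi[abs(l)] == (l > 0) for l in cl) for cl in F)
        falsifies_C = all(phi[abs(l)] != (l > 0) for l in C)
        if sat_F and falsifies_C:
            return False
    return True

def prime_implicates(F):
    """All implicates with no proper sub-clause that is also an implicate."""
    vs = sorted({abs(l) for cl in F for l in cl})
    implicates = []
    for r in range(len(vs) + 1):
        for vars_r in combinations(vs, r):
            for signs in product([1, -1], repeat=r):
                C = frozenset(s * v for s, v in zip(signs, vars_r))
                if is_implicate(F, C):
                    implicates.append(C)
    return {C for C in implicates
            if not any(D < C for D in implicates)}   # subsumption-elimination
```

On {{x_1, x_2}, {x̄_2, x_3}} this yields the two axioms together with the resolvent {x_1, x_3}.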

3 Measuring “SAT representation complexity”

In this section we define and discuss the measures hd, whd : CLS → ℕ_0 and the corresponding classes UC_k, WC_k ⊆ CLS. It is mostly of an expository nature, explaining the background from [51, 52, 32, 33, 37]. For the measure phd and the corresponding classes PC_k see Section 4.

3.1 Hardness and UC_k

Hardness for unsatisfiable clause-sets was introduced in [51, 52], while the specific generalisation to arbitrary clause-sets used here was first mentioned in [1], and systematically studied in [32, 33, 37]. Using the Horton-Strahler number hs(T) of binary trees, applied to resolution trees T : F ⊢ C (deriving clause C from F), the hardness hd(F) for F ∈ CLS can be defined as the minimal k ∈ ℕ_0 such that for all prime implicates C of F there exists T : F ⊢ C with hs(T) ≤ k. An equivalent characterisation uses necessary levels of generalised unit-clause propagation (see [32, 33, 37] for the details):

Definition 3.1

Consider the reductions r_k : CLS → CLS for k ∈ ℕ_0 as introduced in [51]; r_1 is unit-clause propagation, while r_2 is (full, iterated) failed-literal elimination. Then hd(F) for F ∈ CLS is the minimal k ∈ ℕ_0 such that for all φ ∈ PASS with φ*F ∈ USAT holds r_k(φ*F) = {⊥}, i.e., the minimal k such that r_k detects unsatisfiability of any partial instantiation of variables in F. Furthermore hd(F) ≤ n(F) holds.

For every F ∈ CLS and k ∈ ℕ_0 there is a partial assignment φ with r_k(F) = φ*F, where φ consists of certain forced assignments ⟨x → 1⟩ for F, i.e., forced literals x for F. A weaker localisation of forced assignments has been considered in [20], namely “k-backbones”, which are forced assignments ⟨x → 1⟩ for F such that there is F′ ⊆ F with c(F′) ≤ k and such that x is forced also for F′. It is not hard to see that r_k for k ∈ ℕ_0 will set all k-backbones of F (using that for F′ ∈ USAT we have hd(F′) ≤ c(F′) − 1 by Lemma 3.18 in [51]). The fundamental level of “hardness” for forced assignments or unsatisfiability is given by the level k ∈ ℕ_0 needed for r_k. As a hierarchy of CNF-classes and only considering detection of unsatisfiability (for all instantiations), this is captured by the UC_k-hierarchy (with “UC” for “unit-refutation complete”):

Definition 3.2

For k ∈ ℕ_0 let UC_k := {F ∈ CLS : hd(F) ≤ k}.

UC_1 is the class of unit-refutation complete clause-sets, as introduced in [21]. In [32, 33, 37] we show that UC_1 = SLUR, where SLUR is the class of clause-sets solvable via Single Lookahead Unit Resolution (see [25]). Using [14] we then obtain ([32, 33, 37]) that membership decision for UC_k is coNP-complete for k ≥ 1. The class UC_2 is the class of all clause-sets where unsatisfiability for any partial assignment is detected by failed-literal reduction (see Section 5.2.1 in [41] for the usage of failed literals in SAT solvers).

3.2 W-Hardness and WC_k

A basic weakness of the standard notion of width-restricted resolution, which demands that both parent clauses must have length at most k for some fixed k ∈ ℕ_0 (“width”, denoted by wid(F) below; see [8]), is that even Horn clause-sets require unbounded width in this sense. A better solution, as investigated and discussed in [51, 52, 54], seems to use the notion of “k-resolution” as introduced in [49], where only one parent clause needs to have length at most k (thus properly generalising unit-resolution). (Footnote: Symmetric width is only applied to clause-sets with bounded clause-length, and here everything can be done as well via asymmetric width, as discussed in [54]. It might be that symmetric width could have a relevant combinatorial meaning, so that symmetric and asymmetric width both have their roles.) Nested input-resolution ([51, 52]) is the proof-theoretic basis of hardness, and approximates tree-resolution. In the same vein, k-resolution is the proof-theoretic basis of “w-hardness”, and approximates dag-resolution (see Theorem 6.12 in [52]):

Definition 3.3

The w-hardness whd(F) ∈ ℕ_0 (“width-hardness”, or “asymmetric width”) is defined for F ∈ CLS as follows:

  1. If F ∈ USAT, then whd(F) is the minimum k ∈ ℕ_0 such that k-resolution refutes F, that is, such that T : F ⊢ ⊥ exists where for each resolution step (C, D) ↝ R in T we have |C| ≤ k or |D| ≤ k (this concept corresponds to Definition 8.2 in [51], and is a special case of the notion introduced in Subsection 6.1 of [52]).

  2. If F = ⊤, then whd(F) := 0.

  3. If F ∈ SAT \ {⊤}, then whd(F) := max {whd(φ*F) : φ ∈ PASS and φ*F ∈ USAT}.

For k ∈ ℕ_0 let WC_k := {F ∈ CLS : whd(F) ≤ k}.

The symmetric width wid(F) ∈ ℕ_0 is defined in the same way, only that for F ∈ USAT we define wid(F) as the minimal k such that there is T : F ⊢ ⊥, where all clauses of T (axioms and resolvents) have length at most k.

More generally, for V ⊆ VA, we define the relative versions whd^V(F) and wid^V(F): for unsatisfiable F they coincide with whd(F) and wid(F), while for satisfiable F only partial assignments φ with var(φ) ⊆ V and φ*F ∈ USAT are considered.

We have WC_0 = UC_0 and WC_1 = UC_1, and for all k ∈ ℕ_0 holds UC_k ⊆ WC_k (this follows by Lemma 6.8 in [52] for unsatisfiable clause-sets, which extends to satisfiable clause-sets by definition). What is the relation between asymmetric width whd(F) and the well-known (for unsatisfiable F) symmetric width wid(F)? By definition we have whd(F) ≤ wid(F) for all F. Now consider F ∈ HO ∩ USAT, where HO is the set of all Horn clause-sets, defined by the condition that each clause contains at most one positive literal. The symmetric width here is unbounded, and wid(F) is equal to the maximal clause-length of F in case F is minimally unsatisfiable. But we have whd(F) ≤ 1 here. So for unbounded clause-length there is an essential difference between symmetric and asymmetric width. On the other hand we have

wid(F) ≤ whd(F) + p(F) for F ∈ USAT, where p(F) is the maximal clause-length of F, by Lemma 8.5 in [51], or, more generally, Lemma 6.22 in [52] (also shown in [54]). So for bounded clause-length and considered asymptotically, symmetric and asymmetric width can be considered equivalent.

4 P-Hardness and PC_k

Complementary to “unit-refutation completeness”, there is the notion of “propagation-completeness” as investigated in [67, 12], yielding the class PC. This was captured and generalised by a measure phd : CLS → ℕ_0 of “propagation-hardness” along with the associated hierarchy PC_k, defined in [33, 37] as follows:

Definition 4.1

For k ∈ ℕ_0, V ⊆ VA and F ∈ CLS we define the (relative) propagation-hardness phd^V(F) ∈ ℕ_0 (for short “p-hardness”) as the minimal k such that for all partial assignments φ ∈ PASS with var(φ) ⊆ V we have r_k(φ*F) = r_∞(φ*F), where r_∞ applies all forced assignments, and can be defined by r_∞(F) := r_{n(F)}(F). Furthermore phd(F) := phd^{var(F)}(F). For k ∈ ℕ_0 let PC_k := {F ∈ CLS : phd(F) ≤ k} (the class of propagation-complete clause-sets of level k).


  1. We have hd(F) ≤ phd(F) ≤ hd(F) + 1 for all F ∈ CLS.

  2. For k ∈ ℕ_0 we have PC_k ⊆ UC_k ⊆ PC_{k+1}.

  3. By definition (and composition of partial assignments) we have that all classes PC_k are stable under application of partial assignments.

  4. We have F ∈ PC_0 iff for all φ ∈ PASS the clause-set φ*F in case of ⊥ ∉ φ*F has no forced literals.

Recall that a clause-set F has no forced assignments (at all) if and only if all prime implicates of F have length at least 2. (Footnote: The “at all” is for the case F = {⊥}, where every literal is forced for F, but F has no literals.) Before proving the main lemma (Lemma 4.5), we need a simple characterisation of such clause-sets without forced assignments. Recall that a partial assignment φ is an autarky for F iff for all C ∈ F with var(φ) ∩ var(C) ≠ ∅ holds φ*{C} = ⊤; for an autarky φ for F the (sub-)clause-set φ*F is satisfiable iff F is satisfiable. See [50] for the general theory of autarkies.

Lemma 4.2

A clause-set F has no forced assignments if and only if F is satisfiable, and for every literal x ∈ lit(F) there is an autarky φ for F with φ(x) = 1.

Proof:  If F has no forced assignment, then F can not be unsatisfiable (since then every literal would be forced), and for a literal x ∈ lit(F) the clause-set ⟨x → 1⟩*F is satisfiable (since x̄ is not forced), thus has a satisfying assignment ψ, and the composition φ := ψ ∘ ⟨x → 1⟩ (first assigning x → 1, then applying ψ) is an autarky for F with φ(x) = 1. Now let F be satisfiable, and for each x ∈ lit(F) let there be an autarky for F setting x to 1. If F had a forced literal x, then consider an autarky φ for F with φ(x̄) = 1: since x is forced, ⟨x̄ → 1⟩*F is unsatisfiable, while by the autarky condition it would be satisfiable.
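The autarky condition is directly checkable: φ must satisfy every clause whose variable-set it intersects. A minimal sketch (our encoding: clauses as frozensets of nonzero integers, φ a dict from variables to booleans):

```python
def is_autarky(phi, F):
    """phi is an autarky for F iff phi satisfies every clause it touches."""
    for C in F:
        touched = any(abs(l) in phi for l in C)
        satisfied = any(abs(l) in phi and phi[abs(l)] == (l > 0) for l in C)
        if touched and not satisfied:
            return False
    return True
```

In particular the empty partial assignment is always an autarky, and any satisfying assignment of F is one as well.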

In the rest of this section we show that having an “acyclic incidence graph” yields a sufficient criterion for ⋃_{i ∈ I} F_i ∈ PC_k for clause-sets F_i ∈ PC_k.

Definition 4.3

For a finite family (F_i)_{i ∈ I} of clause-sets the incidence graph B((F_i)_{i ∈ I}) is the bipartite graph, where the two parts are given by I and ⋃_{i ∈ I} var(F_i), while there is an edge between i ∈ I and v ∈ VA if v ∈ var(F_i). We say that (F_i)_{i ∈ I} is acyclic if B((F_i)_{i ∈ I}) is acyclic (i.e., has no cycle as an (undirected) graph, or, equivalently, is a forest). A single clause-set F ∈ CLS is acyclic if the family ({C})_{C ∈ F} is acyclic.

From the family (F_i)_{i ∈ I} of clause-sets we can derive the hypergraph H((F_i)_{i ∈ I}), whose hyperedges are the variable-sets var(F_i) of the F_i. Now (F_i)_{i ∈ I} is acyclic iff H((F_i)_{i ∈ I}) is “Berge-acyclic” (which just means that the bipartite incidence graph of the hypergraph is acyclic). The standard notion of a constraint satisfaction instance being acyclic, as defined in Subsection 2.4 in [30], is “α-acyclicity” of the corresponding “formula hypergraph” (as with H, given by the variable-sets of the constraints), which is a more general notion, however since there is no automatic conversion from (sets of) clause-sets to CSPs, there is no danger of confusion here.
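Berge-acyclicity of a family, i.e., acyclicity of the bipartite incidence graph between the indices i and the variables, can be tested by one union-find pass: an edge whose endpoints are already connected closes a cycle. A sketch in our integer-literal encoding (the representation is ours, not the paper's):

```python
def is_acyclic_family(family):
    """Test the bipartite incidence graph of a family of clause-sets
    (each a set of frozensets of nonzero integers) for acyclicity.
    Vertices are tagged ('F', i) and ('v', v) to keep the parts disjoint;
    each variable occurring in F_i contributes one edge."""
    parent = {}
    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x
    for i, F in enumerate(family):
        variables = {abs(l) for C in F for l in C}
        for v in variables:
            a, b = find(('F', i)), find(('v', v))
            if a == b:
                return False                # edge inside one component: cycle
            parent[a] = b
    return True
```

In particular two clause-sets sharing two variables already give a cycle of length 4, matching Part 1 of Lemma 4.7 below.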

Since the property of the incidence graph being acyclic only depends on the occurrences of variables, if (F_i)_{i ∈ I} is acyclic, then this is maintained by applying partial assignments and by adding new variables to each F_i:

Lemma 4.4

Consider an acyclic family (F_i)_{i ∈ I} of clause-sets.

  1. For every family (φ_i)_{i ∈ I} of partial assignments the family (φ_i*F_i)_{i ∈ I} is acyclic.

  2. Every family (F′_i)_{i ∈ I} with var(F_i) ⊆ var(F′_i) and (var(F′_i) \ var(F_i)) ∩ var(F′_j) = ∅ for all i, j ∈ I, i ≠ j, is acyclic.

We are ready to prove that an acyclic union of clause-sets without forced assignments has itself no forced assignments. This is kind of folklore in the CSP literature (see e.g. [26, 27]), using that a clause-set has no forced assignments iff considered as a single constraint it is generalised arc-consistent, but to be self-contained we provide the proof. The idea is simple: any assignment to a (single) variable in some F_i can be extended to a satisfying assignment of F_i, which sets single variables in other F_j, which can again be extended to satisfying assignments, and so on, and due to acyclicity never two variables are set in some F_j. Perhaps best to see this basic idea at work for a chain F_1, …, F_m of clause-sets, where each F_i has no forced assignments, and neighbouring clause-sets share at most one variable (|var(F_i) ∩ var(F_{i+1})| ≤ 1), while otherwise these clause-sets are variable-disjoint: then no literal x is forced for F_1 ∪ … ∪ F_m, since the assignment ⟨x → 0⟩ can only affect at most two neighbouring clause-sets, for which that assignment can be extended to a satisfying assignment of them, due to them not having forced assignments (and var(x) must be the only common variable); now this extension can be continued “left and right”, using that assigning at most one variable in some F_i can not destroy satisfiability of F_i.

Lemma 4.5

Consider an acyclic family (F_i)_{i ∈ I} of clause-sets. If no F_i has forced assignments, then also F := ⋃_{i ∈ I} F_i has no forced assignments.

Proof:  We use the following simple property of acyclic graphs G: if V′ is a connected set of vertices of G and w ∉ V′, then there is at most one vertex in V′ adjacent to w (since otherwise there would be a cycle in G).

Let F := ⋃_{i ∈ I} F_i and B := B((F_i)_{i ∈ I}). Consider a literal x ∈ lit(F), and we show that ⟨x → 1⟩ can be extended to an autarky φ for F (the assertion then follows by Lemma 4.2); let i_0 ∈ I with var(x) ∈ var(F_{i_0}). Consider a maximal J ⊆ I with the properties:

  1. i_0 ∈ J;

  2. for J together with the variables ⋃_{i ∈ J} var(F_i) it is connected in B;

  3. there is a partial assignment φ with

    1. φ(x) = 1, var(φ) = ⋃_{i ∈ J} var(F_i), and φ*F_i = ⊤ for all i ∈ J.

J = {i_0} fulfils these three properties (since F_{i_0} has no forced assignments), and so there is such a maximal J. If there is no i ∈ I \ J adjacent to some variable in var(φ), then φ is an autarky for F and we are done; so assume there is such an i. According to the above remark, there is exactly one variable v ∈ var(φ) adjacent to i, that is, var(F_i) ∩ var(φ) = {v}. Since F_i has no forced assignments, there is a partial assignment ψ with ψ(v) = φ(v), var(ψ) = var(F_i), and ψ*F_i = ⊤. Now φ ∪ ψ satisfies all F_j for j ∈ J ∪ {i}, and thus J ∪ {i} satisfies the three conditions, contradicting the maximality of J.

Lemma 4.5 only depends on the boolean functions underlying the clause-sets F_i, and thus could be formulated more generally for boolean functions f_i. We obtain a sufficient criterion for the union of unit-propagation complete clause-sets to be itself unit-propagation complete:

Theorem 4.6

Consider k ∈ ℕ_0 and an acyclic family (F_i)_{i ∈ I} of clause-sets. If for all i ∈ I we have F_i ∈ PC_k, then also ⋃_{i ∈ I} F_i ∈ PC_k.

Proof:  Let F := ⋃_{i ∈ I} F_i, and consider a partial assignment φ with r_k(φ*F) = ψ*F for a partial assignment ψ extending φ by the applied forced assignments; we may assume ⊥ ∉ r_k(φ*F) (otherwise nothing is to show). We have to show that r_k(φ*F) has no forced assignments. For all i ∈ I we have r_k(ψ*F_i) = ψ*F_i, and thus ψ*F_i has no forced assignments (since F_i ∈ PC_k). So ψ*F has no forced assignments by Lemma 4.5 (using Lemma 4.4). Thus r_k(φ*F) = ψ*F = r_∞(φ*F), whence r_k(φ*F) has no forced assignments.

The conditions for a family of clause-sets being acyclic, which are relevant to us, are collected in the following lemma; they are in fact pure graph-theoretical statements on the acyclicity of bipartite graphs, but for concreteness we formulate them in terms of families of clause-sets:

Lemma 4.7

Consider a family (F_i)_{i ∈ I} of clause-sets, and let B := B((F_i)_{i ∈ I}).

  1. If there are i, j ∈ I, i ≠ j, with |var(F_i) ∩ var(F_j)| ≥ 2, then (F_i)_{i ∈ I} is not acyclic.

  2. Assume that for all i, j ∈ I, i ≠ j, holds |var(F_i) ∩ var(F_j)| ≤ 1. If the “variable-interaction graph”, which has vertex-set I, while there is an edge between i and j with i ≠ j if var(F_i) ∩ var(F_j) ≠ ∅, is acyclic, then (F_i)_{i ∈ I} is acyclic.

  3. If there is a variable v, such that for all i, j ∈ I, i ≠ j, holds var(F_i) ∩ var(F_j) ⊆ {v}, then (F_i)_{i ∈ I} is acyclic.

Proof:  For Part 1 note that i and j together with two common variables v, w ∈ var(F_i) ∩ var(F_j) yield a cycle (of length 4) in B. For Part 2 assume that B has a cycle (which must be of even length ≥ 4). The case of length 4 is not possible, since different clause-sets have at most one common variable, and thus the cycle has length at least 6. Leaving out the interconnecting variables in the cycle, we obtain a cycle of length at least 3 in the variable-interaction graph. Finally for Part 3 it is obvious that B can not have a cycle C, since the length of C needed to be at least 4, which is not possible, since the only possible variable-vertex in it would be v.

Corollary 4.8

Consider k ∈ ℕ_0 and a family (F_i)_{i ∈ I} of clause-sets with F_i ∈ PC_k for all i ∈ I. Then each of the following conditions implies ⋃_{i ∈ I} F_i ∈ PC_k:

  1. Any two different clause-sets have at most one variable in common, and the variable-interaction graph is acyclic.

  2. There is a variable v with var(F_i) ∩ var(F_j) ⊆ {v} for all i, j ∈ I, i ≠ j.

The following examples show that the conditions of Corollary 4.8 can not be improved in general:

Example 4.9

An example for three boolean functions without forced assignments, where each pair has exactly one variable in common, while the variable-interaction graph has a cycle, and the union is unsatisfiable, is given by the XOR-constraints a ⊕ b = 0, b ⊕ c = 0, a ⊕ c = 1. And if there are two variables in common, then also without a cycle we can obtain unsatisfiability, as the pair a ⊕ b = 0, a ⊕ b = 1 shows. The latter family of two boolean functions yields also an example for a family of two clause-sets where none of them has forced assignments, while the union has (is in fact unsatisfiable). Since a hypergraph with two hyperedges is “γ-acyclic”, in the fundamental Lemma 4.5 we thus can not use any of the more general notions “α/β/γ-acyclicity” (see [23] for these four basic notions of “acyclic hypergraphs”).
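The two families in this example can be checked by brute force. In the sketch below (the encoding of a constraint as a pair of a variable tuple and a parity is ours), the cyclic triple is pairwise satisfiable, and each single constraint has no forced assignments, but the union has no solution:

```python
from itertools import product

def xor_solutions(constraints, variables):
    """All 0/1 assignments over `variables` satisfying every XOR-constraint;
    a constraint (vs, p) demands that the XOR of the variables vs equals p."""
    sols = []
    for bits in product([0, 1], repeat=len(variables)):
        phi = dict(zip(variables, bits))
        if all(sum(phi[v] for v in vs) % 2 == p for vs, p in constraints):
            sols.append(phi)
    return sols

# the cyclic triple: a+b = 0, b+c = 0, a+c = 1 (pairwise one common variable)
triple = [(("a", "b"), 0), (("b", "c"), 0), (("a", "c"), 1)]
```

Each pair of the three constraints is satisfiable, while xor_solutions(triple, ("a", "b", "c")) is empty.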

5 Systems of XOR-constraints

We now review the concepts of “XOR-constraints” and their representations via CNF-clause-sets. In Subsection 5.1 we model XOR-constraints via “XOR-clauses” (and “XOR-clause-sets”), and we define their semantics. And in Subsection 5.2 we define “CNF-representations” of XOR-clause-sets, and show in Lemma 5.2 that all XOR-clauses following from an XOR-clause-set are obtained by summing up the XOR-clauses of the clause-set.
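The “summing up” of XOR-clauses in Lemma 5.2 is just addition over the two-element field: variables cancel in pairs (symmetric difference of the variable sets), and the right-hand sides add mod 2. A minimal sketch in our own encoding (an XOR-clause as a pair of a variable set and a parity bit):

```python
def xor_sum(clauses):
    """Sum of XOR-constraints over GF(2): an XOR-clause is (vars, parity),
    standing for 'XOR of vars = parity'.  Summing takes the symmetric
    difference of the variable sets and adds the parities mod 2."""
    vs, p = frozenset(), 0
    for vs2, p2 in clauses:
        vs, p = vs ^ vs2, (p + p2) % 2
    return vs, p
```

Summing the two constraints x ⊕ y = 0 and x ⊕ y = 1 yields the contradiction 0 = 1, i.e., the pair (∅, 1).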

5.1 XOR-clause-sets

As usual, an XOR-constraint (also known as “parity constraint”) is a (boolean) constraint of the form x_1 ⊕ ⋯ ⊕ x_n = ε for literals x_1, …, x_n and ε ∈ {0, 1}, where ⊕ is the addition in the 2-element field {0, 1}. Note that x ⊕ y = 0 is equivalent to x = y, while x ⊕ x = 0 and x ⊕ x̄ = 1, and x ⊕ 0 = x and x ⊕ 1 = x̄. Two XOR-constraints are equivalent, if they have exactly the same set of solutions. In this report we prefer a lightweight approach, and so we do not present a full framework for working with XOR-constraints, but we use a representation by XOR-clauses. These are just ordinary clauses C ∈ CL, but under a different interpretation, namely implicitly interpreting C = {x_1, …, x_n} as the XOR-constraint x_1 ⊕ ⋯ ⊕ x_n = 0. And instead of systems of XOR-constraints we just handle XOR-clause-sets F, which are sets of XOR-clauses, that is, ordinary clause-sets F ∈ CLS with a different interpretation. So two XOR-clauses are equivalent iff