# A Separator Theorem for Hypergraphs and a CSP-SAT Algorithm

We show that for every r ≥ 2 there exists ϵ_r > 0 such that any r-uniform hypergraph on m edges with bounded vertex degree has a set of at most (1/2 - ϵ_r)m edges whose removal breaks the hypergraph into connected components with at most m/2 edges. We use this to give an algorithm running in time d^((1 - ϵ_r)m) that decides satisfiability of m-variable (d, k)-CSPs in which every variable appears in at most r constraints, where ϵ_r depends only on r and k ∈ o(√m). Furthermore, our algorithm solves the corresponding #CSP-SAT and Max-CSP-SAT problems for these CSPs. We also show that CNF representations of unsatisfiable (2, k)-CSPs with variable frequency r can be refuted in tree-like resolution in size 2^((1 - ϵ_r)m). Furthermore, for Tseitin formulas on graphs with degree at most k (which are (2, k)-CSPs) we give a deterministic algorithm finding such a refutation.

## 1 Introduction

The $(d,k)$-SAT problem, which naturally generalizes $k$-SAT, is the problem of deciding whether a system of constraints on $m$ variables from an alphabet of size $d$, where each constraint is on at most $k$ variables, can be satisfied. We will call such a system of constraints a $(d,k)$-CSP, and we will assume that when given as input it is represented by the set of truth tables of its constraints. The satisfiability of a $(d,k)$-CSP can therefore be checked by exhaustive search over all $d^m$ assignments. Looking for exponential-time algorithms beating this trivial running time is thus a natural direction.

For the usual -SAT problem, where , there is a plethora of such algorithms (see e.g. [1, 2, 3]). When the CSP encodes a certain structured problem we can also find improved algorithms. The notable example here is the graph -coloring problem which is a special case of -SAT and which can be solved in time  . More generally -SAT also admits non-trivial algorithms .

For the general -SAT we are interested in finding algorithms running in time for some . We call the parameter the savings of the algorithm, and we would like these savings to be as large as possible. Note that any -SAT algorithm can be easily converted to a -SAT algorithm. For each of the original variables, introduce boolean variables representing the original value in binary, and then express each constraint as a -CNF. The conjunction of these CNFs is satisfiable if and only if the original CSP is satisfiable. Assuming that we can solve -SAT in time , this yields an algorithm running in time . That is any non-trivial savings for -SAT yields non-trivial savings for -SAT. However these savings deteriorate as grows. This turns out to be the case also for algorithms which are directly designed to solve -SAT. Schöning’s seminal algorithm runs in time  . Similarly a generalization of PPSZ analyzed by Hertli et al.  has the same shortcoming. The central question is then whether it is possible to obtain savings independent of the domain size .
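
The binary-encoding reduction described above can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: the function names (`bits_per_variable`, `constraint_to_cnf`, `satisfies`) and the literal representation `(variable, bit, polarity)` are hypothetical and do not come from the paper. Each CSP variable is encoded by a block of Boolean variables, and every bit pattern on the scope that is not an allowed tuple (including patterns that encode a value outside the alphabet) is ruled out by one clause.

```python
from itertools import product


def bits_per_variable(d):
    """Number of Boolean variables used to encode one value from range(d)."""
    return max(1, (d - 1).bit_length())


def constraint_to_cnf(scope, allowed, d):
    """Encode one CSP constraint as a list of clauses.

    scope   : tuple of CSP variable ids the constraint depends on
    allowed : set of tuples over range(d) that satisfy the constraint
    A literal (var, bit, polarity) is true when bit `bit` of the binary
    encoding of CSP variable `var` equals `polarity`.
    """
    b = bits_per_variable(d)
    clauses = []
    # Every bit pattern on the scope that is not allowed gets one clause
    # that is falsified exactly by that pattern.
    for pattern in product(range(2 ** b), repeat=len(scope)):
        if pattern in allowed:
            continue
        clause = []
        for var, value in zip(scope, pattern):
            for bit in range(b):
                clause.append((var, bit, 1 - ((value >> bit) & 1)))
        clauses.append(clause)
    return clauses


def satisfies(clauses, assignment):
    """Check a CSP assignment (dict: variable -> value) against the clauses."""
    return all(
        any(((assignment[var] >> bit) & 1) == pol for var, bit, pol in clause)
        for clause in clauses
    )
```

For a constraint on $k$ variables over an alphabet of size $d$, every clause produced here has width $k \cdot b$, where $b$ is the number of bits per variable. This is why the clause width, and with it the cost of the Boolean SAT algorithm, grows with the domain size.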

Let us define

$$\sigma_{d,k} := \sup\{\delta : (d,k)\text{-SAT can be solved in time } O(d^{(1-\delta)m})\}.$$

The argument above gives . Furthermore Traxler  shows that for all , . Therefore it follows that under Strong Exponential Time Hypothesis, for all , . Our central question can be rephrased as follows.

###### Question 1.

Is it the case that for every , ?

Currently, we are unable to answer this question. However we show that if each variable appears in a small number of constraints then it is possible to decide satisfiability with savings independent of . Note that this restriction does not influence the NP-hardness of -SAT; in fact even if each variable appears in at most 2 constraints, it is easy to see that the problem remains NP-hard. Our result can be considered as an extension of a result of Wahlström  who gave such an algorithm for CNF-SAT when variables have bounded occurrences. However our argument is entirely different. Our algorithm also solves the counting and MAX versions of this problem. For Boolean CSPs with bounded occurrence such a result was shown by Chen and Santhanam .

###### Theorem 2 (Main result, informally stated).

There exists an algorithm which decides satisfiability of an $m$-variable $(d,k)$-CSP in which every variable appears in at most $r$ constraints in time $d^{(1-\epsilon_r)m}$, where $\epsilon_r$ depends only on $r$, provided that $k$ does not grow too fast (that is, $k \in o(\sqrt{m})$).

The algorithm follows a simple branching strategy. At every step we find a small set of variables such that, once they are given values, the CSP breaks into disjoint parts each with at most half of the original variables. For every assignment to these variables we exhaustively solve the problem on the resulting smaller instances. If this set contains strictly fewer than half of the variables then we obtain savings. To prove that such a small set of variables exists, we associate a natural hypergraph to the CSP. Then we prove a structural result for these hypergraphs: we show that in every $r$-uniform hypergraph on $m$ edges of small vertex degree, there exists a set of significantly fewer than half of the hyperedges whose removal breaks the hypergraph into connected components with at most $m/2$ hyperedges.

###### Theorem 3 (Hypergraph separator theorem, informally stated).

Let $H$ be an $r$-uniform hypergraph on $m$ hyperedges with maximum vertex degree $k$. Provided that $k$ does not grow too fast as a function of $m$, there exists a set of at most $(1/2 - \epsilon_r)m$ hyperedges whose removal breaks the hypergraph into connected components with at most $m/2$ hyperedges. Furthermore, $\epsilon_r$ depends only on $r$.

We find this result somewhat unexpected. To see this consider one particular consequence. Our result implies that there exists a universal constant $\epsilon > 0$ such that for any positive integer $k$, any graph with maximum degree at most $k$ and sufficiently many edges can be broken into components with at most half of the edges by removing at most a $(1/2 - \epsilon)$ fraction of the edges. What is unexpected is this independence between $\epsilon$ and $k$. Furthermore, such a result is impossible if we slightly change the definition of balanced separators and require that the number of vertices (instead of edges) in each component after removing edges be at most half of that of the original graph. It is easy to see that this is essentially the same as edge-expansion which is known to be at least for some -regular graphs . The independence between savings and degree in our result is precisely what we take advantage of in our applications.

Non-trivial exponential size proofs of Tseitin formulas. We provide yet another application of Theorem 3, which concerns the proof complexity of unsatisfiable $k$-CNF formulas. Recall that $k$-TAUT is the language of $k$-DNF tautologies (or equivalently the language of unsatisfiable $k$-CNFs). Define

$$\nu_k := \sup\{\delta : k\text{-TAUT can be solved in non-deterministic time } O(2^{(1-\delta)m})\}.$$

The non-deterministic strong exponential time hypothesis (NSETH)  states that . Quantity can of course be defined with respect to a specific proof system instead of a general non-deterministic algorithm. In this direction some works verify NSETH for restrictions of the resolution proof system (see [13, 14, 15]).

But even if NSETH holds we may ask how fast approaches zero. Observe that as any -SAT algorithm in particular refutes unsatisfiable -CNFs formulas. The best known lower bound for is   and thus . We now raise a very natural question.

###### Question 4.

Is it the case that , that is can we use non-determinism to beat the best known savings of -SAT algorithms?

We can think of two ways to make progress towards this question. In the first direction we could try to obtain lower bounds on directly by proving upper bounds on the size of refutations of -CNFs. Interestingly (but not surprisingly) the bound can already be achieved by tree-like resolution , that is every unsatisfiable -CNF formula in variables can be refuted by a tree-like resolution proof of size .

In the second direction we can consider general enough families of -CNFs and try to obtain non-trivial refutations for them in as weak as possible a proof system. Here by general enough families we mean families of formulas which naturally contain a wide spectrum of easy to hard instances. An example of such families is the set of -CNFs encoding systems of linear equations over each on at most variables. Indeed such systems can easily be refuted using Gaussian elimination. But for , if the underlying system is minimally unsatisfiable, then there are even non-trivial regular resolution refutations of size  . Such a bound cannot be obtained in tree-like resolution .

However here using our separator theorem we show that if we restrict ourselves to Tseitin formulas over bounded degree graphs (which are also minimally unsatisfiable linear systems and massively studied in proof complexity, see e.g. [18, 19]), then we do have tree-like resolution refutations of size for a universal constant . Another unexpected upper bound for Tseitin formulas has recently been observed in  where a cutting plane proof of quasi-polynomial size has been given. Interestingly their proof is also based on a simple branching argument.

Deterministic construction of proofs. We know from a recent breakthrough result of Atserias and Müller that unless P equals NP, given an unsatisfiable formula it is not possible to construct resolution proofs in time polynomial in the size of the smallest proof of the input formula, nor in quasi-polynomial or subexponential time under plausible stronger assumptions. However, it seems that the question of constructing proofs in non-trivial exponential time, the natural counterpart to $k$-SAT algorithms, has not received much attention.

More specifically given a proof system and positive integer we define as follows. It is the supremum of such that there is a deterministic algorithm that given an unsatisfiable -variable -CNF, it constructs a -refutation of it in time .

###### Question 5.

Is it the case that ?

As mentioned earlier we know that there are even tree-like resolution proofs with such savings. Here we are asking whether we can construct such proofs deterministically. We cannot answer this question yet. But we will show that over Tseitin formulas we can deterministically construct our proofs obtained by the separator theorem with constant savings.

## 2 Preliminaries

A hypergraph is -uniform if all of its hyperedges have size exactly . A connected component in a hypergraph is a maximal subset of vertices such that for every pair there exists a sequence of edges only on vertices in with and and for every . For any we denote by the set of hyperedges induced on , i.e., all hyperedges from which are entirely on vertices in . For we define . We will use to denote the maximum vertex degree of a (hyper)graph , that is the largest number of edges in which a vertex appears.

Given two real numbers , we write to mean the interval . To ease reading we write to mean .

We assume the reader is familiar with basic proof complexity, see [22, 23] for more background. However we will use the following interpretation of tree-like resolution proofs.

A decision tree for an unsatisfiable CNF is a rooted binary tree where the inner nodes are labeled by variables of and leaves are labeled by clauses of . Edges are labeled by 0 or 1 and should be interpreted as an assignment to the variable they come out of. Therefore a path from the root to a leaf gives an assignment to some of the variables. Furthermore we require that this assignment falsifies the clause at the corresponding leaf. The following theorem is folklore.

###### Theorem 6 (see, e.g., ).

The size of the smallest tree-like resolution refutation of is exactly the size of the smallest decision tree for it.

The following is immediate.

###### Corollary 7.

If there is a decision tree of depth at most for , then there exists a tree-like resolution refutation of of size at most .

## 3 Hypergraph Separator Theorem

In this section we present our main technical tool concerning the structure of hypergraphs.

###### Definition 8.

Let be a hypergraph. A balanced separator for is a set such that any connected component in is incident to at most edges from .

Note that any set of size is trivially a balanced separator. Therefore the question is whether it is possible to get balanced separators of size strictly less than . We show that when the hypergraph has bounded vertex degree this is indeed possible.
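
To make the definition tangible, here is a minimal Python sketch of a checker for balanced separators. It assumes the reading used in the abstract: after the separator edges are removed, every connected component may contain at most half of the original edges. The function name `is_balanced_separator` and the input format (edges as tuples of vertex ids, the separator as a set of edge indices) are hypothetical choices made for illustration.

```python
def is_balanced_separator(num_vertices, edges, separator):
    """Return True if removing the edges indexed by `separator` leaves
    connected components that each contain at most len(edges) / 2 edges.

    edges     : list of non-empty tuples of vertex ids in range(num_vertices)
    separator : set of indices into `edges`
    """
    parent = list(range(num_vertices))

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    remaining = [e for i, e in enumerate(edges) if i not in separator]

    # Vertices sharing a remaining hyperedge lie in the same component.
    for e in remaining:
        root = find(e[0])
        for v in e[1:]:
            parent[find(v)] = root

    # Count the remaining edges inside each component.
    counts = {}
    for e in remaining:
        root = find(e[0])
        counts[root] = counts.get(root, 0) + 1

    return all(2 * c <= len(edges) for c in counts.values())
```

On a path with four edges, for instance, removing the second edge leaves components with one and two edges, so that single edge is already a balanced separator, whereas removing only the first edge is not enough.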

We will first show that when the maximum vertex degree of the hypergraph is small enough, small balanced separators do exist. We then show that our bound is quantitatively tight.

We need the following concentration bound. This is a consequence of an inequality due to Alon, Kim and Spencer  which was used by Dellamonica and Rödl .

###### Lemma 9 ([25, 26]).

Let be independent Bernoulli variables with for all . Let and assume that there is a function such that for all and every ,

$$|f(x_1,\dots,x_n) - f(x_1,\dots,x_{i-1},1-x_i,x_{i+1},\dots,x_n)| \le c.$$

Then for and any , we have

$$\Pr\big[|X - \mathbb{E}[X]| \ge \alpha\sigma\big] \le 2e^{-\alpha^2/4},$$

where .

###### Theorem 3.

Let be fixed and let be a hypergraph on hyperedges with maximum vertex degree where each hyperedge has size at most . Then has a balanced separator of size at most , where . Furthermore such a balanced separator can be found by a randomized algorithm in expected polynomial time and by a deterministic algorithm in time .

For any the proof of the theorem actually gives a balanced separator of size at most , where is a universal constant depending only on . We will use this bound later to bound the size of tree-like resolution of Tseitin formulas.

Notice, , as for , so and thus .

###### Proof.

We first observe that with a small modification we may assume without loss of generality that the hypergraph is -uniform and contains no isolated vertices. To make the hypergraph -uniform we partition the hyperedges of size less than into blocks of size . For each of these blocks we introduce new vertices and we add sufficiently many of them to the hyperedges in the block to make them -uniform. This guarantees that the maximum degree remains at most and we have added at most new vertices. We then remove isolated vertices if any exists. We will denote by the new number of vertices. As there are no isolated vertices .

The main idea for building the separator is to find a set of vertices such that and . Observe that for such , we also have . Furthermore note that and are separated by , i.e., every connected component in is entirely contained in or in . Then we arbitrarily select two sets and with such that and (which is possible by the assumption on ). It follows that is a balanced separator of size at most .

We pick the set $S$ by including each vertex in $V$ independently at random with probability $p := 2^{-1/r}$. For each vertex $i \in [n]$, let $X_i$ be the random variable which takes value 1 if vertex $i$ is included in $S$, and value 0 otherwise. Thus we have $\Pr[X_i = 1] = p$. We define two functions $f_1$ and $f_2$ as follows:

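$$f_1(X_1,\dots,X_n) := |E(\{i\in[n] : X_i = 1\})| = |E(S)|$$

and

$$f_2(X_1,\dots,X_n) := |E(\{i\in[n] : X_i = 1\}, \{i\in[n] : X_i = 0\})| = |E(S,\bar{S})|.$$

Hence observe that

$$\mathbb{E}[f_1(X_1,\dots,X_n)] = p^r m = \frac{m}{2}$$

and

$$\mathbb{E}[f_2(X_1,\dots,X_n)] = \big(1 - p^r - (1-p)^r\big)m = \Big(\frac{1}{2} - \epsilon_r\Big)m.$$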

We would like to apply Lemma 9 on and . Note that since the maximum degree of is at most , for any and ,

$$|f_b(X_1,\dots,X_n) - f_b(X_1,\dots,X_{i-1},1-X_i,X_{i+1},\dots,X_n)| \le k.$$

Setting , and we apply Lemma 9 on and and obtain

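$$\Pr\Big[\big|f_1(X_1,\dots,X_n) - \tfrac{m}{2}\big| \ge 4k\sqrt{np(1-p)}\Big] \le 2e^{-4}$$

and

$$\Pr\Big[\big|f_2(X_1,\dots,X_n) - \big(\tfrac{1}{2} - \epsilon_r\big)m\big| \ge 4k\sqrt{np(1-p)}\Big] \le 2e^{-4}.$$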

Since is fixed, and we have . Therefore with probability at least , and and hence there exists a choice of satisfying these properties. As explained earlier by adding at most edges to we obtain a balanced separator of size .

It is clear that the above argument also yields a randomized algorithm for finding such a balanced separator. The probability that satisfies our desired properties is at least a constant and in polynomial time we can verify whether it indeed satisfies both those properties. Thus by repeating the random choice of , in expected polynomial time we find our balanced separator.
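
The sampling step of the randomized procedure can be sketched in Python as follows. This is an illustration only, not the authors' implementation. It assumes the inclusion probability $p = 2^{-1/r}$, which is the value for which the expectation $p^r m$ of $f_1$ equals $m/2$, and it reads $\epsilon_r = (1-p)^r$ off the expectation of $f_2$. The names `sample_split` and `epsilon` are hypothetical.

```python
import random


def epsilon(r):
    """Savings constant read off E[f_2] = (1/2 - eps_r) m with p = 2**(-1/r)."""
    return (1 - 2 ** (-1.0 / r)) ** r


def sample_split(num_vertices, edges, r, rng):
    """Put each vertex into S independently with probability p = 2**(-1/r)
    and classify every hyperedge as inside S, inside the complement, or crossing.

    Returns (in_s, inside, outside, crossing), where `inside` plays the role
    of f_1 and `crossing` the role of f_2 in the proof.
    """
    p = 2 ** (-1.0 / r)
    in_s = [rng.random() < p for _ in range(num_vertices)]
    inside = outside = crossing = 0
    for e in edges:
        hits = sum(in_s[v] for v in e)
        if hits == len(e):
            inside += 1
        elif hits == 0:
            outside += 1
        else:
            crossing += 1
    return in_s, inside, outside, crossing
```

A full procedure would repeat the sampling until `inside` and `crossing` are both close to their expectations, and then add a few edges from inside $S$ or its complement to the crossing edges to restore balance, as described in the proof. For graphs ($r = 2$) the constant evaluates to $(1 - 1/\sqrt{2})^2 \approx 0.0858$.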

The deterministic algorithm exhaustively checks all sets of at most edges to find a balanced separator. This has running time , where is the binary entropy function. Using , we can bound the running time by . ∎

We now show that Theorem 3 is tight. We probabilistically construct a sparse hypergraph in which every sufficiently large set of vertices induces a large number of edges. We then argue that the best balanced separator essentially has to partition the vertices into two parts of a particular size.

###### Lemma 10.

For every fixed , , and for any and such that , there exists an -uniform hypergraph on vertices with the following properties:

1. For every with , .

###### Proof.

We first sample from where , that is we construct by choosing each possible hyperedge of size independently with probability . The expected number of edges is . Thus by Chernoff’s bound with probability at least , .

Next we bound the number of edges which are incident to vertices of degree at least . For every vertex the probability that it has degree at least is at most . Note that for this probability is at most . Therefore the expected number of edges incident with some vertex of degree at least is at most

 n∞∑t=2ekt2−t=O(nk/2k).

Therefore by Markov’s inequality with constant probability the number of edges incident with some vertex of degree at least is at most .

Let be any set of size . The expected number of edges in is . Let . By Chernoff’s bound

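$$\Pr\Big[|E(S)| \ne (1\pm\delta)\tbinom{\alpha n}{r} q\Big] \;\le\; 2\exp\Big(-\frac{\delta^2 \binom{\alpha n}{r} q}{3}\Big) \;=\; 2\exp\Big(-\frac{\delta^2 \binom{\alpha n}{r}\, nk}{3r\binom{n}{r}}\Big) \;\le\; 2\exp\Big(-(1\pm o(1))\frac{\alpha^r n k^{1/3}}{3r}\Big),$$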

where in the last inequality we use since is fixed. Since there are choices for , the probability that there exists of size with is at most

$$\binom{n}{\alpha n} \times 2\exp\Big(-(1\pm o(1))\frac{\alpha^r n k^{1/3}}{3r}\Big) = o(1).$$

It follows that there exists an -uniform hypergraph with edges, at most of which are incident with some vertex of degree at least , where . Furthermore for every with , . We remove at most edges to make the maximum vertex degree at most . Call the resulting hypergraph . Note that . Let . We have in for every with , . A simple averaging argument further gives that for every with , . ∎

###### Theorem 11.

For every fixed and with , there exists an -uniform hypergraph with vertex degree and edges such that any balanced separator of has size at least , where .

We will show that the hypergraph given by Lemma 10 with satisfies this property. Note that and . Let . The following fact proves the result for the case when the balanced separator is a bipartition. As we will show later, this is also the core of the argument for the general case.

###### Fact 1.

Let be a bipartition of with . Then .

###### Proof.

Assume and . Since , Lemma 10 guarantees that and and hence

$$|E(A,B)| = \big(1 - \gamma^r - (1-\gamma)^r\big)m(1\pm o(1)) \;\ge\; \big(1 - \alpha^r - (1-\alpha)^r\big)m(1\pm o(1)) \;=\; \Big(\frac{1}{2} - \epsilon_r\Big)m(1\pm o(1)),$$

where the inequality follows since the function is increasing for . ∎

###### Proof of Theorem 11.

Let be a balanced separator in of minimum size. The removal of edges in breaks into two or more connected components each with at most edges. By minimality of these connected components are induced subgraphs. We group these connected components in two parts and such that is minimized. Assume that . We have two cases. Either is connected or it contains more than one connected component. Note that . Assume first that is connected. We have since otherwise for some and hence by the choice of , contradicting that . Fact 1 then implies that .

Now assume that contains more than one connected component. Thus we can write , where and is an arbitrary bipartition of these components. Assume . We show that . Assume for a contradiction that this is not the case. Then and further we have . This means that and give a more balanced bipartition, contradicting the minimality of . Since and , we have (recall that and when ). Once again we can apply Fact 1 to conclude that .

## 4 A CSP-SAT Algorithm

A $(d,k)$-CSP is defined by a set of variables taking values in an alphabet of size $d$ and a set of constraints, each on at most $k$ of these variables. We write to specify the variables and the constraint set. We will assume that the CSP is represented by the set of truth tables of its constraints. Observe that a $(2,k)$-CSP can be represented as a $k$-CNF. An assignment to the variables satisfies the CSP if it satisfies every constraint. The variable frequency of the CSP is the largest number of constraints that any variable appears in. Given a partial assignment which gives values to a set , the restriction of by is denoted by which is a CSP on and each constraint is restricted by fixing the values of variables in by .

Let be a CSP. We construct a hypergraph as follows. We set , that is every constraint is represented by a vertex in . For every variable we create a hyperedge , that is the set of constraints containing form a hyperedge.

###### Proposition 12.

Assume that consists of connected components . Then can be expressed as where for each .

###### Proof.

This is immediate once we observe that for any constraints and which are represented in and , respectively, for some , and do not have any variable in common. ∎

###### Proposition 13.

Let be a CSP and be the corresponding hypergraph. Let be a partial assignment which gives values to a set . Then is obtained by removing all with from . Furthermore, if is unsatisfiable so is and consequently if breaks into as in Proposition 12, then at least one of s is unsatisfiable.

###### Proof.

After restricting some of the variables, those variables disappear and some constraints get simplified. But no new constraint is introduced and hence the hypergraph is obtained by removing the corresponding hyperedges. If a CSP is unsatisfiable, obviously it is unsatisfiable under any partial assignment. If an unsatisfiable CSP is decomposed into disjoint CSPs, at least one of these CSPs is unsatisfiable, since otherwise we can take a satisfying assignment from each part and since they are on disjoint sets of variables together they form a satisfying assignment of the whole CSP. ∎

We are now ready to describe our CSP-SAT algorithm.

###### Theorem 2.

Let be a fixed integer, be integers such that . Let . Let be an -variable -CSP with variable frequency at most . correctly decides the satisfiability of . Moreover, if then it runs deterministically in time , if then it runs in expected time if we find randomly, and in deterministic time if we find deterministically.

###### Proof.

The correctness of the algorithm follows immediately from Proposition 13. In polynomial time we can construct . Observe that has edges each of size at most and it has vertex degree at most . By Theorem 3, we can find a balanced separator of size deterministically in time or probabilistically in expected polynomial time in . After having found the balanced separator , there are at most runs of the for loop over . For each restriction we spend time to compute the decomposition of . Then for each of these parts we exhaustively check its satisfiability in time at most . Since breaks into at most parts the total running time after finding is at most . For , since , the total running time including finding the separator is bounded by . The claim follows by noting that . For , if we use the randomized procedure to find , the total expected running time will be at most , and if we run the deterministic procedure to find , the total running time is at most . ∎
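
The branching strategy analysed in this proof can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the paper's algorithm listing. The separator is taken as an input (a list of variables) instead of being computed via Theorem 3, constraints are given as `(scope, allowed)` pairs where `allowed` is the set of satisfying tuples, and the names `components`, `brute_force_sat` and `decide` are hypothetical.

```python
from itertools import product


def components(constraints, fixed):
    """Group constraints that share an unassigned variable (cf. Proposition 12).
    Returns a list of (free_variables, constraints) pairs."""
    parent = list(range(len(constraints)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    owner = {}  # free variable -> index of the first constraint mentioning it
    for i, (scope, _) in enumerate(constraints):
        for v in scope:
            if v in fixed:
                continue
            if v in owner:
                parent[find(i)] = find(owner[v])
            else:
                owner[v] = i

    groups = {}
    for i, (scope, allowed) in enumerate(constraints):
        free_vars, part = groups.setdefault(find(i), (set(), []))
        free_vars.update(v for v in scope if v not in fixed)
        part.append((scope, allowed))
    return list(groups.values())


def brute_force_sat(free_vars, constraints, d, fixed):
    """Exhaustive search over the free variables of one part."""
    free_vars = list(free_vars)
    for values in product(range(d), repeat=len(free_vars)):
        assignment = dict(fixed)
        assignment.update(zip(free_vars, values))
        if all(tuple(assignment[v] for v in scope) in allowed
               for scope, allowed in constraints):
            return True
    return False


def decide(constraints, d, separator):
    """Branch on every assignment to the separator variables, then solve
    each resulting independent part by exhaustive search."""
    for values in product(range(d), repeat=len(separator)):
        fixed = dict(zip(separator, values))
        parts = components(constraints, fixed)
        if all(brute_force_sat(vs, cs, d, fixed) for vs, cs in parts):
            return True
    return False
```

The sketch is correct for any choice of `separator`; the running time bound in the theorem comes from choosing it as a balanced separator of the associated hypergraph, so that every part has at most half of the variables.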

Remark. We can slightly modify the algorithm and instead of performing exhaustive search on the disjoint parts of the CSP we can make a recursive call to the algorithm. It is easy to verify that this improves the savings by a factor of two.

###### Corollary 14.

Let be a fixed real, be integers such that . Let . Let be an -variable -CSP with average variable frequency at most . correctly decides the satisfiability of . Moreover, if then it runs deterministically in time , if then it runs in expected time if we find randomly, and in deterministic time if we find deterministically.

###### Proof.

Consider the case . There are at most variables with frequency . For all possible settings of those variables run the algorithm on restricted to that setting. In such a restricted formula all variables have frequency at most . By the previous theorem the running time will be . The case for is analogous. ∎

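Remark. Note that our algorithm can also be adapted in a straightforward way to count the number of satisfying assignments and to maximize the number of satisfied constraints. Hence it also solves the #CSP-SAT and Max-CSP-SAT versions of the problem.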

## 5 Upper Bounds for Tree-like Resolution

In this section we use our separator theorem to give non-trivial refutations of CNF representations of unsatisfiable $(2,k)$-CSPs with bounded variable frequency. Recall that a $(2,k)$-CSP can be represented by a $k$-CNF with at most $2^k$ clauses per constraint. This class of CSPs includes the extensively studied Tseitin formulas, which essentially encode that in a simple graph the number of odd-degree vertices is even. Here we consider a more general definition for hypergraphs due to Pudlák and Impagliazzo.

###### Definition 15.

Let be a hypergraph and let . The Tseitin formula on , , has a variable for every edge and states that for every , . When we call an odd charge labeling.

When is an odd charge labeling and each edge has even cardinality, is unsatisfiable. When has maximum degree , is a -CSP. From here on we will use to denote both the CSP and its CNF representation when it is clear from the context which one we refer to.
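
As a small illustration of Definition 15, the following Python sketch builds the CNF representation of a Tseitin formula and checks satisfiability by brute force. The representation is an assumption made for this example: edges are tuples of vertices, the charge labeling is a dict from vertices to 0 or 1, variable `i` stands for edge `i`, and a literal is a pair `(edge_index, polarity)`. The function names `tseitin_cnf` and `is_satisfiable` are hypothetical.

```python
from itertools import product


def tseitin_cnf(edges, charge):
    """CNF of the Tseitin formula: for every vertex v, the sum of the
    variables of the edges containing v must equal charge[v] modulo 2.

    edges  : list of tuples of vertices (pairs for graphs)
    charge : dict mapping every vertex to 0 or 1
    """
    incident = {v: [] for v in charge}
    for i, e in enumerate(edges):
        for v in e:
            incident[v].append(i)

    clauses = []
    for v, inc in incident.items():
        # One clause for each local assignment with the wrong parity.
        for bits in product((0, 1), repeat=len(inc)):
            if sum(bits) % 2 != charge[v]:
                clauses.append([(e, 1 - b) for e, b in zip(inc, bits)])
    return clauses


def is_satisfiable(num_edges, clauses):
    """Brute-force satisfiability check, only suitable for tiny instances."""
    for x in product((0, 1), repeat=num_edges):
        if all(any(x[e] == pol for e, pol in clause) for clause in clauses):
            return True
    return False
```

On a triangle, a labeling with a single vertex of charge 1 gives an unsatisfiable formula, since summing all parity constraints counts every edge twice. A vertex of degree $k$ contributes $2^{k-1}$ clauses of width $k$.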

###### Theorem 16.

Let be an unsatisfiable -CSP with variable frequency at most on variables. If is fixed and , then there exists a tree-like resolution refutation of the CNF representation of of size , where .

###### Proof.

Using Corollary 7 it is sufficient to give a decision tree for of depth at most .

We will make use of Theorem 3 applied to and the strategy is quite immediate. The decision tree starts with a complete binary tree on all variables corresponding to the hyperedges in the balanced separator given by Theorem 3 of size at most . For every leaf with partial assignment , by the separator property, Proposition 13 and Proposition 12 we can write for some , where s are on disjoint sets of at most variables. Furthermore at least one of s is unsatisfiable. We then append this leaf by the complete binary tree on all the variables in . It is clear that at every leaf of this tree a contradiction is forced. The depth of this tree is at most and we are done. ∎

###### Corollary 17.

Let be an unsatisfiable -CSP with variable frequency at most on variables. If is fixed and , then there exists a resolution refutation of the CNF representation of of width , where .

###### Proof.

This is immediate from Theorem 16 and the seminal result of Ben-Sasson and Wigderson  which states that if a -CNF formula has a tree-like resolution refutation of size , then it has a width resolution refutation. ∎

###### Corollary 18.

Let be an -uniform hypergraph of maximum degree where is even and let be an odd charge labeling. If is fixed and then can be refuted in tree-like resolution in size , where .

###### Proof.

Observe that each variable appears in $r$ constraints, one for each vertex of the corresponding hyperedge.

The case $r = 2$ corresponds to the usual Tseitin tautologies on simple graphs. We give a finer analysis for this case which also involves a sharper derandomization.

###### Theorem 19.

There exists a deterministic algorithm which on input where graph is of maximum degree and is any odd charge labeling, produces a tree-like resolution refutation of in time

$$|E|^{O(1)}\, 2^{\left(1 - 2\left(1 - \frac{1}{\sqrt{2}}\right)^2\right)|E| + o(|E|)} \;\le\; 2^{0.83|E|}.$$

###### Proof.

We will construct a decision tree for and apply Theorem 6.

Let , , . We will build the decision tree inductively. Each node of the decision tree will be labeled by a pair , where , , and is an odd charge labeling of . At each node of the decision tree, we query the value of some edge , and depending on the value of the variable corresponding to the edge we descend to a child labeled by . Here, equals except and . Hence, if is an odd charge labeling so is . (In some nodes we will reduce even more as explained below.) The root of the decision tree is labeled by .

A node of the decision tree labeled by will be a leaf if there is a degree zero vertex in with . In this case the constraint corresponding to is falsified, and so some CNF clause associated to that constraint is also falsified. If contains a vertex of degree , we query the value of the incident edge and we descend to children labeled by . Notice, violates the parity constraint at either for or . So one of the children is a leaf.

If contains a vertex of degree , we query the value of one of the incident edges , and the two children are labeled by . If does not contain any vertex of degree less than 3, we find a balanced separator in of size at most , where and is a constant. (To find the separator we use an algorithm described later. The separator exists by a remark after Theorem 3.) We query all the edges one by one and after each query we descend from to via an edge labeled by . After asking the last edge of , instead of descending to we descend to where forms a connected component on which is an odd charge labeling. We let be the vertices belonging to that component and we set to restricted to . We repeat the whole process until at which point we query all the remaining edges one by one to identify a vertex violating its parity constraint.

To find the balanced separator we proceed as follows: If has minimum degree at least 3 we have . As every balanced separator corresponds to a cut in , we look for a small balanced separator by exhaustively checking all cuts