Proof Complexity Meets Algebra

11/20/2017 ∙ by Albert Atserias, et al. ∙ 0

We analyse how the standard reductions between constraint satisfaction problems affect their proof complexity. We show that, for the most studied propositional, algebraic, and semi-algebraic proof systems, the classical constructions of pp-interpretability, homomorphic equivalence and addition of constants to a core preserve the proof complexity of the CSP. As a result, for those proof systems, the classes of constraint languages for which small unsatisfiability certificates exist can be characterised algebraically. We illustrate our results by a gap theorem saying that a constraint language either has resolution refutations of constant width, or does not have bounded-depth Frege refutations of subexponential size. The former holds exactly for the widely studied class of constraint languages of bounded width. This class is also known to coincide with the class of languages with refutations of sublinear degree in Sums-of-Squares and Polynomial Calculus over the real-field, for which we provide alternative proofs. We then ask for the existence of a natural proof system with good behaviour with respect to reductions and simultaneously small size refutations beyond bounded width. We give an example of such a proof system by showing that bounded-degree Lovász-Schrijver satisfies both requirements. Finally, building on the known lower bounds, we demonstrate the applicability of the method of reducibilities and construct new explicit hard instances of the graph 3-coloring problem for all studied proof systems.

1 Introduction

The notion of an efficient reduction lies at the heart of computational complexity. However, in some of its subareas such as proof complexity, even though the concept exists, it is much less developed. The study of the lengths of proofs has developed mostly by studying combinatorial statements, each somewhat in isolation. There is little theory, for instance, explaining why the best studied families of propositional tautologies are encodings of the pigeonhole principle or those derived from systems of linear equations over the 2-element field. Whether there is any connection between the two is an even less explored mystery.

Luckily this fact is subject to revision, especially if proof complexity exports its methods to the study of problems beyond universal combinatorial statements. Consider the NP-hard optimization problem called MAX-CUT. The objective is to find a partition of the vertices of a given graph which maximizes the number of edges that cross the partition. The best efficient approximation algorithm known for this problem relies on certifying a bound on the optimum of its semidefinite programming relaxation. Once the certificate for the relaxation is in place, a rounding procedure gives an approximate integral solution: at worst 87% of the optimum in this case [27].

In the example of the previous paragraph, the problem that is subject to proof complexity analysis is that of certifying a bound on the optimum of an arbitrary MAX-CUT instance. The celebrated Unique Games Conjecture (UGC) can be understood as a successful approach to explaining why current algorithms and proof complexity analyses stop being successful where they do, and reductions play an important role there [49]. One of the interesting open problems in this area is whether the analysis of the Sums-of-Squares semidefinite programming hierarchy of proof systems (SOS) could be used to improve over the 87% approximation ratio for MAX-CUT. Any such improvement would improve the approximation status of all problems that reduce to it, and refute the UGC [34]. For the constraint satisfaction problem, in which all constraints must be satisfied, as well as for its optimisation version, the analogous question was resolved recently, also by exploiting the theory of reducibility: in that arena, low-degree SOS unsatisfiability proofs exist only for problems of bounded width [47, 25].

The goal of this paper is to develop the standard theory of reductions between constraint satisfaction problems in a way that it applies to many of the proof systems from the literature, including but not limited to Sums-of-Squares. Doing this requires a good amount of tedious work, but at the same time has some surprises to offer that we discuss next.

Consider a constraint language $\mathbf{B}$ given by a finite domain of values, and relations over that domain. The instances of the constraint satisfaction problem (CSP) over $\mathbf{B}$ are given by a set of variables and a set of constraints, each of which binds some tuple of the variables to take values in one of the relations of $\mathbf{B}$. The literature on CSPs has focussed on three different types of conditions that, if met by two constraint languages, give a reduction from the CSP of one language to the CSP of the other. These conditions are a) pp-interpretability, b) homomorphic equivalence, and c) addition of constants to the core (see [21, 14]). What makes these three types of reductions important is that they correspond to classical algebraic constructions at the level of the algebras of polymorphisms of the constraint languages. Indeed, pp-interpretations correspond to taking homomorphic images, subalgebras and powers. The other two types of reductions put together ensure that the algebra of the constraint language is idempotent. Thus, for any fixed algorithm, heuristic, or method $M$ for deciding the satisfiability of CSPs, if the class of constraint languages that are solvable by $M$ is closed under these notions of reducibility, then this class admits a purely algebraic characterization in terms of identities.
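To make the definition of a CSP instance concrete, here is a minimal sketch in Python. The encoding (a `DOMAIN` tuple, a `RELATIONS` dictionary, and constraints as pairs of a relation name and a scope) is an illustrative assumption and not notation from the paper. It uses the 3-coloring language, whose only relation is disequality on a 3-element domain, and decides satisfiability by exhaustive search.

```python
from itertools import product

# Illustrative encoding of a constraint language: a finite domain plus
# named relations over it. Here: the 3-coloring language.
DOMAIN = (0, 1, 2)
RELATIONS = {
    "neq": {(a, b) for a in DOMAIN for b in DOMAIN if a != b},
}


def is_satisfiable(variables, constraints, domain=DOMAIN, relations=RELATIONS):
    """Brute-force search over all assignments of domain values to variables."""
    for values in product(domain, repeat=len(variables)):
        assignment = dict(zip(variables, values))
        # A constraint (name, scope) holds if the tuple of values taken by
        # the variables in the scope belongs to the named relation.
        if all(
            tuple(assignment[v] for v in scope) in relations[name]
            for name, scope in constraints
        ):
            return True
    return False


def coloring_instance(edges):
    """Turn a graph into a CSP instance with one 'neq' constraint per edge."""
    variables = sorted({v for edge in edges for v in edge})
    constraints = [("neq", (u, v)) for u, v in edges]
    return variables, constraints


triangle = [(0, 1), (1, 2), (0, 2)]
k4 = [(i, j) for i in range(4) for j in range(i + 1, 4)]
```

The triangle is 3-colorable while the complete graph on four vertices is not, so the second instance is the kind of unsatisfiable instance for which a proof system is asked to produce a refutation.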

Our first result is that, for most proof systems $P$ in the literature, each of these methods of reduction preserves the proof complexity of the problem with respect to proofs in $P$. Technically, what this means is that if $\mathbf{B}'$ is obtained from $\mathbf{B}$ by a finite number of constructions a), b) and c), then, for any appropriate encoding scheme of the statement that an instance is unsatisfiable, efficient proofs of unsatisfiability in $P$ for instances of $\mathbf{B}$ translate into efficient proofs of unsatisfiability in $P$ for instances of $\mathbf{B}'$. Our results hold for a very general definition of an appropriate encoding scheme that we call local. The propositional proof systems for which we prove these results include DNF-resolution with terms of bounded size, Bounded-Depth Frege, and (unrestricted) Frege. The algebraic and semi-algebraic proof systems for which we prove it include Polynomial Calculus (PC) over any field, Sherali-Adams (SA), Lasserre/SOS, and Lovász-Schrijver (LS) of bounded and unbounded degree. This is the object of Section 4.

Our second main result is an application: we obtain unconditional gap theorems for the proof complexity of CSPs. Building on the bounded-width theorem for CSPs [12, 19], the known correspondence between local consistency algorithms, existential pebble games and bounded width resolution [35, 7], the lower bounds for propositional, algebraic and semi-algebraic proof systems [1, 37, 16, 17, 28, 22, 23], and a modest amount of additional work to fill in the gaps, we prove the following strong gap theorem:

Theorem 1.

Let $\mathbf{B}$ be a finite constraint language. Then, exactly one of the following holds:

  1. $\mathbf{B}$ has resolution refutations of constant width,

  2. $\mathbf{B}$ has neither bounded-depth Frege refutations of subexponential size, nor PC over the reals, nor SOS refutations of sublinear degree.

In Theorem 1 and below, the statement that the constraint language $\mathbf{B}$ has efficient proofs in proof system $P$ means that, for some and hence every local encoding scheme, all unsatisfiable instances of $\mathbf{B}$ have efficient refutations in $P$. Also, here and below, sublinear means $o(n)$, sublinear-exponential means $2^{o(n)}$, and subexponential means $2^{n^{o(1)}}$, where $n$ is the number of variables of the instance.

The proof of Theorem 1 actually shows that case 1 happens precisely if the constraint language $\mathbf{B}$ has bounded width. As noted earlier, the collapse of Lasserre/SOS to bounded width was already known; here we give a different proof. By a very recent result on the simulation of Polynomial Calculus over the real field by Lasserre/SOS [18], the collapse of Lasserre/SOS implies the collapse of Polynomial Calculus. The proof we present does not depend on that. Instead we exploit directly the theory of reducibility.

As an immediate corollary we get that resolution is also captured by algebra, despite the fact that our methods fall short of proving that it is closed under reductions.

Corollary 1.

Let $\mathbf{B}$ be a finite constraint language. The following are equivalent:

  1. $\mathbf{B}$ has bounded width,

  2. $\mathbf{B}$ has resolution refutations of constant width,

  3. $\mathbf{B}$ has resolution refutations of sublinear width,

  4. $\mathbf{B}$ has resolution refutations of polynomial size,

  5. $\mathbf{B}$ has resolution refutations of sublinear-exponential size,

  6. $\mathbf{B}$ has Frege refutations of bounded depth and polynomial size,

  7. $\mathbf{B}$ has Frege refutations of bounded depth and subexponential size,

  8. $\mathbf{B}$ has SA, SOS, and PC refutations over the reals of constant degree,

  9. $\mathbf{B}$ has SA, SOS, and PC refutations over the reals of sublinear degree.

The proof of this is the object of Sections 5 and 6.

Section 7 is about proof systems that operate with polynomial inequalities and that are stronger than Lasserre/SOS. Theorem 1 raises the question of identifying a proof system that is closed under reducibilities and that can surpass bounded width. In other words: is there a natural proof system for which the class of languages that have efficient unsatisfiability proofs is closed under the standard reducibility methods for CSPs, and that at the same time has efficient unsatisfiability proofs beyond bounded width? By the bounded-width theorem for CSPs, one way, and indeed the only way, of surpassing bounded width is by having efficient proofs of unsatisfiability for systems of linear equations over some finite Abelian group. A straightforward answer to our question is thus the following: Polynomial Calculus over a field of non-zero characteristic $p$ has efficient unsatisfiability proofs for systems of linear equations over $\mathbb{Z}_p$. On the other hand, in view of the limitations of Polynomial Calculus over the real field, and of certain semi-algebraic proof systems, that are imposed by Theorem 1, it is perhaps a surprise that, as we show, bounded-degree Lovász-Schrijver also satisfies both requirements.

Theorem 2.

Unsatisfiable systems of linear equations over the 2-element group $\mathbb{Z}_2$ have LS refutations of bounded degree and polynomial size.

Proving this amounts to showing that Gaussian elimination over $\mathbb{Z}_2$ can be simulated by reasoning with low-degree polynomial inequalities over $\mathbb{R}$. The proof of this counter-intuitive fact relies on earlier work in proof complexity for reasoning about gaps of the type $(c, c+1)$, for $c \in \mathbb{Z}$, through quadratic polynomial inequalities [30].
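For orientation, the procedure being simulated is ordinary Gaussian elimination over the 2-element field. The following Python sketch shows only that procedure, not the Lovász-Schrijver simulation itself. The function name and the bitmask encoding of equations are illustrative assumptions.

```python
def unsatisfiable_over_z2(equations, num_vars):
    """Return True iff the linear system has no solution over Z_2.

    Each equation is a pair (support, rhs): the sum of the variables whose
    indices are in the set `support` must equal the bit `rhs` modulo 2.
    """
    # Pack each equation into an integer: bit i holds the coefficient of
    # x_i, and bit `num_vars` holds the right-hand side.
    rows = []
    for support, rhs in equations:
        row = rhs << num_vars
        for i in support:
            row ^= 1 << i
        rows.append(row)

    for col in range(num_vars):
        # Pick any remaining row that mentions x_col as the pivot.
        pivot = next((r for r in rows if (r >> col) & 1), None)
        if pivot is None:
            continue
        rows.remove(pivot)
        # Adding two equations modulo 2 is a bitwise XOR of their rows.
        rows = [r ^ pivot if (r >> col) & 1 else r for r in rows]

    # Every surviving row has no variables left: it reads 0 = 0 or 0 = 1.
    return any(r == 1 << num_vars for r in rows)


odd_triangle = [({0, 1}, 1), ({1, 2}, 1), ({0, 2}, 1)]
even_triangle = [({0, 1}, 1), ({1, 2}, 1), ({0, 2}, 0)]
```

Summing the three equations of `odd_triangle` gives 0 = 1, which is exactly the contradiction the elimination finds; `even_triangle` is satisfied by setting only the middle variable to 1.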

It should be pointed out that another proof system that can efficiently solve CSPs of bounded width, and that at the same time goes beyond bounded width, is the proof system that operates with ordered binary decision diagrams from [8]. Although it looks unlikely that our methods could be used for this proof system, whether it is closed under the standard CSP reductions is something that was not checked, neither in [8], nor here.

In Section 8 we demonstrate the applicability of our results. Consider the graph 3-coloring problem seen as the CSP of a finite constraint language on a 3-element domain in the standard way. Since it is known that 3-coloring has unbounded width, Corollary 1 applies to it, and we get 3-coloring instances that are hard for all indicated proof systems. We open the box of the method, and elaborate on that, in order to get explicit 3-coloring instances that are hard for Polynomial Calculus over all fields simultaneously. This gives a new proof of the main result in [40]. Indeed, the same analysis applies to all CSPs that are NP-complete and all proof systems that are closed under reducibilities. This way we solve Open Problem 5.3 in [40] that asks for explicit 3-coloring instances that are hard for Lasserre/SOS.

This article is an extended version of [10]. Besides providing full proof details, we generalise the main gap theorem to cover Polynomial Calculus over the reals and apply our results to the 3-coloring problem, as explained in the paragraph above.

2 Preliminaries

2.1 Propositional logic and propositional proofs

Formulas.

Fix a set of propositional variables taking values true or false. A literal is a variable $x$ or the negation of a variable $\overline{x}$. We write propositional formulas out of literals using conjunctions $\wedge$, disjunctions $\vee$, and parentheses, with the usual conventions on parentheses. Also we implicitly think of $\wedge$ and $\vee$ as commutative, associative and idempotent. Thus the formula $A \wedge B$ is viewed literally the same as $B \wedge A$, the formula $A \wedge (B \wedge C)$ is viewed literally the same as $(A \wedge B) \wedge C$, and the formula $A \wedge A$ is viewed literally the same as $A$. The same applies to disjunctions. Negation is allowed only at the level of literals, so our formulas are written in negation normal form. If $A$ is a formula, we define its complement $\overline{A}$ inductively: if $A$ is a variable $x$, then $\overline{A} = \overline{x}$; if $A$ is a negated variable $\overline{x}$, then $\overline{A} = x$; if $A$ is a conjunction $B \wedge C$, then $\overline{A} = \overline{B} \vee \overline{C}$; if $A$ is a disjunction $B \vee C$, then $\overline{A} = \overline{B} \wedge \overline{C}$. The empty formula is denoted $0$ and is always false by convention. Its complement is denoted $1$, and is always true by convention. We think of $0$ and $1$ as the neutral elements of $\vee$ and $\wedge$, respectively, and the absorbing elements of $\wedge$ and $\vee$, respectively. Thus we view the formulas $0 \vee A$ and $1 \wedge A$ as literally the same as $A$, and $0 \wedge A$ and $1 \vee A$ as literally the same as $0$ and $1$, respectively. The size of a formula is defined inductively on its structure, with base cases for $0$, $1$ and literals, and a recursive case for a conjunction $B \wedge C$ or a disjunction $B \vee C$ with non-absorbing and non-neutral $B$ and $C$.
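The inductive definition of the complement can be sketched in a few lines of Python. The nested-tuple encoding of formulas below is an illustrative assumption. The constants and the identification of formulas up to commutativity, associativity and idempotency are left out to keep the sketch short.

```python
from itertools import product

# Formulas in negation normal form as nested tuples (illustrative encoding):
#   ("lit", name, positive)     a variable (positive=True) or its negation
#   ("and", f, g), ("or", f, g) binary conjunction and disjunction


def complement(f):
    """Complement by De Morgan: flip literals and swap 'and' with 'or'."""
    if f[0] == "lit":
        return ("lit", f[1], not f[2])
    dual = "or" if f[0] == "and" else "and"
    return (dual, complement(f[1]), complement(f[2]))


def evaluate(f, assignment):
    """Truth value of f under a dict mapping variable names to booleans."""
    if f[0] == "lit":
        value = assignment[f[1]]
        return value if f[2] else not value
    left = evaluate(f[1], assignment)
    right = evaluate(f[2], assignment)
    return (left and right) if f[0] == "and" else (left or right)


# Example formula: x AND (NOT y OR z)
example = ("and", ("lit", "x", True),
           ("or", ("lit", "y", False), ("lit", "z", True)))
```

Because negation only ever appears on literals, the complement stays in negation normal form, and it takes the opposite truth value on every assignment.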

Propositional proof systems.

We work with a Tait-style proof system for propositional logic that we call Frege. The system manipulates formulas in negation normal form and has the following four rules of inference called axiom, cut, introduction of conjunction, and weakening:

$$\frac{}{A \vee \overline{A}} \qquad \frac{C \vee A \quad D \vee \overline{A}}{C \vee D} \qquad \frac{C \vee A \quad D \vee B}{C \vee D \vee (A \wedge B)} \qquad \frac{C}{C \vee A} \tag{1}$$

In these rules, the formulas involved could be the empty formula $0$ or its complement $1$. In particular $1$ is an instance of an axiom rule. A Frege proof is called cut-free if it does not use the cut rule. A Frege proof from a set $\mathcal{A}$ of formulas is a proof in which the formulas in $\mathcal{A}$ are allowed as additional axioms. In case such a proof ends with the empty formula $0$ we call it a Frege refutation of $\mathcal{A}$. As a proof system, Frege is sound and implicationally complete, which means that if $A$ is a logical consequence of $\mathcal{A}$, then there is a Frege proof of $A$ from $\mathcal{A}$. We will give a proof of this in Section 2.2 that will apply also to certain subsystems of Frege. If $\mathcal{C}$ is a class of formulas, a $\mathcal{C}$-Frege proof is one that has all its formulas in the class $\mathcal{C}$. The size of a proof is the sum of the sizes of the formulas in it. The length of a proof is the number of formulas in it.

Resolution, $k$-DNF Frege and Bounded Depth Frege.

A term is a conjunction of literals and a clause is a disjunction of literals. A $k$-term or a $k$-clause is one with at most $k$ literals. A $k$-DNF is a disjunction of $k$-terms and a $k$-CNF is a conjunction of $k$-clauses.

We define the classes of $\Sigma_{d,k}$- and $\Pi_{d,k}$-formulas inductively. For $d = 1$, these are just the classes of $k$-DNF and $k$-CNF formulas, respectively. For $d \geq 2$, a formula is $\Sigma_{d,k}$ if it is a disjunction of $\Pi_{d-1,k}$-formulas, and it is $\Pi_{d,k}$ if it is a conjunction of $\Sigma_{d-1,k}$-formulas.

In this paper, we use the expression Frege proof of depth $d$ and bottom fan-in $k$ to mean a $\Sigma_{d,k}$-Frege proof. Bounded-depth Frege means $\Sigma_{d,k}$-Frege for some fixed $d$ and $k$. This coincides with other definitions in the literature. Frege of depth $d$ and bottom fan-in $k$, as a proof system, is sound and implicationally complete for proving $\Sigma_{d,k}$-formulas from $\Sigma_{d,k}$-formulas. A proof of this will follow from the general completeness theorem below.

$\Sigma_{1,1}$ is the class of clauses. It is well-known that $\Sigma_{1,1}$-Frege and resolution proofs are basically the same thing (the difference is that in $\Sigma_{1,1}$-Frege proofs we allow clause axioms and weakening, but these can always be removed at no cost). A resolution proof which uses only $k$-clauses is called a proof of width $k$.
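The notion of width is easy to illustrate with a toy refutation. In the Python sketch below, clauses are frozensets of non-zero integers, with a negative integer standing for a negated variable. This encoding and the function names are illustrative assumptions, and the resolution step is the cut rule applied to clauses.

```python
def resolve(c1, c2, var):
    """Resolve c1 (containing var) with c2 (containing -var) on var."""
    assert var in c1 and -var in c2
    return frozenset((c1 - {var}) | (c2 - {-var}))


def width(proof):
    """Width of a proof: the largest number of literals in any clause."""
    return max(len(clause) for clause in proof)


# Hypotheses: (x1 or x2), (not x1 or x2), (not x2).
h1, h2, h3 = frozenset({1, 2}), frozenset({-1, 2}), frozenset({-2})

step1 = resolve(h1, h2, 1)     # derives the unit clause (x2)
step2 = resolve(step1, h3, 2)  # derives the empty clause

proof = [h1, h2, h3, step1, step2]
```

The proof ends in the empty clause, so it is a refutation, and every clause in it has at most two literals, so it has width 2.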

$\Sigma_{1,k}$-Frege, for $k \geq 1$, is the system introduced by Krajicek [36], also known as $\mathrm{Res}(k)$, $k$-DNF resolution, and $k$-DNF Frege. This family of proof systems is important for us because, by letting $k$ range over all constants, it is the weakest for which we can prove closure under reductions.

2.2 Completeness of Frege and its subsystems

The proof that Frege is implicationally complete is rather standard. We give a detailed proof nonetheless because we want to have concrete bounds.

Theorem 3 (Quantitative Completeness).

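Let $\mathcal{C}$ be a class of formulas that is closed under subformulas and complementation, and let $\mathcal{D}$ be the closure of $\mathcal{C}$ under disjunctions. Let $A_1, \ldots, A_m$ and $A$ be formulas in $\mathcal{D}$. If $A_1, \ldots, A_m$ logically imply $A$, then there is a $\mathcal{D}$-Frege proof of $A$ from $A_1, \ldots, A_m$. Moreover, if the formulas $A_1, \ldots, A_m$ and $A$ have $n$ variables and size at most $s$, then the size of the proof is at most polynomial in $m$, $n$, $s$, and $2^n$.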

Proof.

Let be the variables in and . For , let denote the negative literal if , and the positive literal if . For a truth assignment , let be the formula . First we show that, for each formula on the variables and each truth assignment , if satisfies , then there is a cut-free proof of from no assumptions. This is proved by induction on the size of . If is a literal, say or , then is obtained as the weakening of the axiom . If is a conjunction, say , then satisfies both and , and by induction hypothesis there are cut-free proofs of and . A cut-free proof of then follows from applying introduction of conjunction. If is a disjunction, say , then satisfies either or , and by induction hypothesis there is a cut-free proof of either or . A cut-free proof of then follows from applying weakening. Note that the length of the proof constructed this way is bounded by , and since all the formulas in the proof have sizes bounded by , the size of the proof is bounded by . Note also for later use that, as a consequence of the assumption that is closed under subformulas, the following holds: if is a disjunction of formulas in , say , then each formula in this proof is a disjunction of formulas in , and if is a conjunction of formulas in , say , then the construction gives a cut-free proof of for each , and each formula in the proof of is again a disjunction of formulas in .

Now we assume that is a logical consequence of and we build a proof of from . This proof will not yet be guaranteed to have all its formulas in . We will deal with this issue later. For each truth assignment , the following hold: 1) if satisfies , then the previous paragraph gives a proof of , and 2) if falsifies , then it also falsifies some for some , and the previous paragraph gives a proof of . From these proofs, a sequence of cuts followed by one weakening gives a proof of . From there a sequence of cuts with the hypotheses gives a proof of . Finally we argue how to turn this proof into one that uses only formulas in . For the proofs of the type there is no issue because is a disjunction of formulas in and the previous paragraph argues that such proofs have all its formulas in . The problem comes from the proofs of the type . However, since each is a disjunction of formulas in , say , its negation is a conjunction of formulas in , because is closed under complementation. This means that instead of using the proof of , we could have used the proof of for each . We do this for each choice of , and what we get are proofs of . These proofs now have all their formulas in . Combining these at most many proofs with the hypotheses in a sequence of at most cuts, we get a proof of from , and all the formulas in this proof are in . The size is polynomial in , , , and , and the proof is complete. ∎

The quantitative completeness theorem applies to $\Sigma_{1,k}$-Frege ($k$-DNF Frege and resolution) because if $\mathcal{C}$ is the class of $k$-terms and $k$-clauses, then $\mathcal{C}$ is closed under subformulas and complementation, and the closure of $\mathcal{C}$ under disjunctions is the class of $k$-DNFs. It also applies to $\Sigma_{d,k}$-Frege, for $d \geq 2$, because the class $\Sigma_{d-1,k} \cup \Pi_{d-1,k}$ is closed under subformulas and complementation, and its closure under disjunctions is precisely $\Sigma_{d,k}$.

2.3 Polynomials and algebraic proofs

Polynomials.

We define everything for the real field $\mathbb{R}$ for simplicity. For algebraic proofs the field would not matter, but for semi-algebraic proofs we need an ordered field such as $\mathbb{R}$. Let $x_1, \ldots, x_n$ be algebraic commuting variables ranging over $\mathbb{R}$. We want to define proof systems that manipulate equations of the form $p = 0$ and inequalities of the form $p \geq 0$, where $p$ is a polynomial in $\mathbb{R}[x_1, \ldots, x_n]$, the ring of polynomials with commuting variables $x_1, \ldots, x_n$ and coefficients in $\mathbb{R}$. For our purposes it will suffice to assume that the variables range over $\{0,1\}$. Accordingly, it will also be convenient to introduce twin variables $\bar{x}_1, \ldots, \bar{x}_n$ with the intended meaning that $\bar{x}_i = 1 - x_i$ for $i = 1, \ldots, n$. In all proof systems of this section, the following axioms will be imposed on the variables:

$$x_i^2 - x_i = 0, \qquad \bar{x}_i^2 - \bar{x}_i = 0, \qquad x_i + \bar{x}_i - 1 = 0. \tag{2}$$

Observe that $x_i \bar{x}_i = 0$ follows from these axioms: multiply $x_i + \bar{x}_i - 1 = 0$ by $x_i$ and subtract $x_i^2 - x_i = 0$. This sort of reasoning is captured by the proof systems we are about to define.
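As a worked check of this observation, assume the axioms in (2) include the Boolean axiom $x_i^2 - x_i = 0$ and the twin axiom $x_i + \bar{x}_i - 1 = 0$. Multiplying the twin axiom by $x_i$ and subtracting the Boolean axiom then amounts to the polynomial identity

$$x_i\,(x_i + \bar{x}_i - 1) - (x_i^2 - x_i) = x_i^2 + x_i \bar{x}_i - x_i - x_i^2 + x_i = x_i \bar{x}_i,$$

so the left-hand side, which is a combination of axioms, equals the product $x_i \bar{x}_i$, and the derived equation is $x_i \bar{x}_i = 0$. The derivation has degree 2.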

Algebraic and semi-algebraic proof systems.

Let $p$ and $q$ denote polynomials. In addition to the axioms in (2), consider the following inference rules called addition and multiplication:

$$\frac{p = 0 \quad q = 0}{p + q = 0} \qquad \frac{p = 0}{qp = 0} \tag{3}$$

Clearly, these rules are sound: any assignment that satisfies the equations in the premises also satisfies the equation in the conclusion. For semi-algebraic proofs we add the following axioms:

$$x_i \geq 0, \qquad \bar{x}_i \geq 0, \qquad 1 - x_i \geq 0, \qquad 1 - \bar{x}_i \geq 0, \tag{4}$$

and the following inference rules for polynomial inequalities:

$$\frac{p \geq 0 \quad q \geq 0}{p + q \geq 0} \qquad \frac{p \geq 0 \quad q \geq 0}{pq \geq 0} \qquad \frac{}{p^2 \geq 0} \tag{5}$$

These rules are called addition, multiplication and positivity of squares and are also sound for assignments over $\mathbb{R}$. One could also consider additional rules that link equalities with inequalities, such as deriving $p \geq 0$ from $p = 0$, or deriving $p = 0$ from $p \geq 0$ and $-p \geq 0$, but if we think of an equality as two inequalities, then they are not strictly necessary. On the other hand, some of the axioms are redundant, such as $1 - x_i \geq 0$, which can be obtained from adding $\bar{x}_i \geq 0$ and $1 - x_i - \bar{x}_i \geq 0$, but for the sake of clarity in writing proofs we prefer to keep them.

If $Q$ denotes a system of polynomial equations and $p = 0$ is a further equation, an algebraic proof of $p = 0$ from $Q$ is a sequence of polynomial equations ending with $p = 0$ where each equation in the proof is either a hypothesis equation from $Q$, or an axiom equation as in (2), or follows from previous equations in the sequence by one of the inference rules in (3). If in addition $Q$ includes a system of polynomial inequalities, then a semi-algebraic proof of $p \geq 0$ from $Q$ is defined analogously except that we think of each equation as two inequalities, we use additionally the axioms in (4), and we use additionally the rules in (5). Note that by writing $q = q_1 - q_2$, where $q_1$ and $q_2$ have only positive coefficients, the rules in (3) are actually easily simulated by the rules in (5) (for the multiplication rule, this uses also the axioms in (4)). If an algebraic proof ends with the equation $1 = 0$, or similarly if a semi-algebraic proof ends with the inequality $-1 \geq 0$, we call it a refutation of $Q$.

As proof systems for deriving new polynomial equations or inequalities that follow from old ones on all evaluations of their variables in $\{0,1\}$, both systems are sound and implicationally complete (we note, however, that without some restrictions on the domain of evaluation, such as $\{0,1\}$ in our case, the completeness claim is not true). In Section 2.4 below we will prove implicational completeness for two subsystems of algebraic and semi-algebraic proofs, and hence for algebraic and semi-algebraic proofs themselves.

The main complexity measures for algebraic and semi-algebraic proofs are size and degree. Size is measured by the number of symbols it takes to write the representations of the polynomials in the proofs, and degree is the maximum of the total degrees of the polynomials in the proofs. Polynomials are typically represented as explicit sums of monomials, or as algebraic formulas or circuits. Using formulas or circuits as representations requires some additional technicalities in the definitions of the rules, that we want to avoid (see [42, 29]). For all our examples below, we use the representation of an explicit sum of monomials.

Some proof systems from the literature.

The proofs in the Polynomial Calculus (PC) are algebraic proofs restricted in such a way that the polynomial $q$ in the multiplication rule in (3) is either a scalar or a variable [24]. In the literature, this has been called PCR for PC with resolution (see [2]), due to the presence of twin variables, but in recent works the shorter original name PC is used. As pointed out earlier, algebraic proofs can be defined over arbitrary fields of scalars beyond the real field $\mathbb{R}$. A claim about algebraic proofs in which the field is omitted is meant to hold for all fields simultaneously. Whenever we need to specify the field $\mathbb{F}$, we speak of algebraic and PC proofs over $\mathbb{F}$.

The proofs in the Lovász-Schrijver (LS) proof system are semi-algebraic proofs for which the following restrictions apply: 1) the polynomial $q$ in the multiplication rule in (5) is either a positive scalar or a variable, and 2) the positivity-of-squares rule in (5) is not allowed. When the positivity-of-squares rule is also allowed, the system is called Positive Semidefinite Lovász-Schrijver and is denoted LS$_+$. Originally the Lovász-Schrijver proof system was defined to manipulate quadratic polynomials only (see [41, 43]). We follow [30] and consider the extension to arbitrary degree. For the original Lovász-Schrijver proof systems we use LS$^2$ and LS$^2_+$. Degree-$d$ Lovász-Schrijver and degree-$d$ Positive Semidefinite Lovász-Schrijver are denoted LS$^d$ and LS$^d_+$, respectively. For LS and LS$_+$ proofs, an important complexity measure originally studied by Lovász and Schrijver is their rank, which is the maximum nesting depth of multiplication by a variable in the proof. Note that, due to possible cancellations, the degree of an LS proof could in principle be much smaller than its rank.

We define four additional proof systems called Nullstellensatz (NS), Sherali-Adams (SA), Positive Semidefinite Sherali-Adams (SA$_+$), and Lasserre/Sums-of-Squares (SOS). For NS, SA and SA$_+$, we define them as the subsystems of PC, LS and LS$_+$, respectively, in which all applications of the multiplication rule must precede all applications of the addition rule. Due to the structural restriction in which multiplications precede additions, we can think of a proof of $p$ from a set of hypotheses as a static polynomial identity of the form

$$\sum_{j=1}^{t} c_j \prod_{i \in I_j} x_i \prod_{i \in J_j} \bar{x}_i \; p_j = p \tag{6}$$

where $I_j, J_j \subseteq [n]$, the $p_j$ are polynomials that either come from the set of hypotheses, or they are axiom polynomials from the lists in (2) and (4) as appropriate (i.e., from (2) for NS, and from both (2) and (4) for SA and SA$_+$), or are squares of polynomials when they are allowed (i.e., for SA$_+$), and the $c_j$ are scalars of the appropriate type (i.e., arbitrary when the $p_j$ they multiply comes from an equation, or positive when the $p_j$ they multiply comes from an inequality). Finally we define the Lasserre/Sums-of-Squares proof system as the subsystem of semi-algebraic proofs to which the following restrictions apply: 1) the polynomial $q$ is arbitrary in the multiplication rule in (3) and it is a square polynomial in the multiplication rule in (5), and 2) all multiplications precede all additions. Thus, in terms of static identities, these are proofs of the form

$$\sum_{j=1}^{t} q_j \, p_j = p \tag{7}$$

where the $p_j$ are polynomials that either come from the set of hypotheses, or they are axiom polynomials from the lists (2) and (4), or they are squares, and the $q_j$ are arbitrary polynomials or square polynomials as appropriate (i.e., arbitrary if the $p_j$ they multiply comes from an equation, and squares if the $p_j$ they multiply comes from an inequality). Note that the size of an NS, SA, SA$_+$ or SOS proof is polynomially related to the sum of the sizes of the non-zero terms in the corresponding static identities (6) and (7). Non-static proofs are sometimes called dynamic [30]. We will avoid using this term here.

We close this section by noting the relationships between these proof systems. Clearly, every NS proof of degree $d$ is also a PC proof of degree $d$. The converse is certainly not true, but what is true is that every PC proof of degree $d$ and rank $r$ can be converted into an NS proof of degree $d + r$, where the rank of a PC proof is the analogue of the rank measure for LS proofs that we defined earlier. The same relationships hold between SA and LS, and SA$_+$ and LS$_+$. In all three cases, the conversions go by swapping the order in which the addition and the multiplication rules are applied, when they appear in the wrong order. Also, every NS proof over the reals is an SA proof, which is an SA$_+$ proof. Finally, thanks to the axioms (2), each SA$_+$ proof can be easily converted to an SOS proof of twice the degree: replace each multiplication by a variable $x_i$ by a multiplication by $x_i^2$, and subtract the appropriate multiple of the axiom $x_i^2 - x_i = 0$ to effectively simulate the multiplication by $x_i$. See [39] for a related discussion.
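The last conversion can be made explicit with a one-line identity. As a sketch, assume $p \geq 0$ is the inequality being multiplied by a variable $x_i$ and that $x_i^2 - x_i = 0$ is among the axioms in (2). Then

$$x_i \, p = x_i^2 \, p - p \, (x_i^2 - x_i).$$

The first term on the right multiplies an inequality by a square, and the second is a polynomial multiple of an axiom equation, so both are of the kind that SOS permits. Each simulated step adds 2 to the degree where the original step added 1, which is the source of the factor of two in the degree.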

Discussion on variants of NS, SA, SA and SOS.

The polynomial identity interpretations of NS, SA, SA$_+$ and SOS, cf. (6) and (7), are closely related to the original definitions by Beame et al. [15] for NS, and the settings of Sherali and Adams [46] and Lasserre [38] for SA and SOS, respectively. In most incarnations of these proof systems the twin variables are not present; in some others they are (e.g., [9]). If we care only about degree, the presence of twin variables makes no difference at all for Nullstellensatz since we can always simulate a multiplication by $\bar{x}_i$ by subtracting a multiplication by $x_i$. Note, however, that this blows up the size exponentially in the degree. In order to make sense of Sherali-Adams without twin variables, we need to extend the definition to allow $q$ in the multiplication rule to be, besides a positive scalar or a variable $x_i$, a linear polynomial of the form $1 - x_i$. The static form of such a proof is an identity such as

(8)

where and are polynomials as in (6), but without twin variables. If and denote the polynomials over that result from the polynomials and over when each twin variable is replaced by , then any valid proof with twin variables as in (6) transforms into a valid proof without twin variables as in (8). Thus, if we care only about degree, the versions of Sherali-Adams and Positive Semidefinite Sherali-Adams without twin variables simulate the versions with twin variables, for polynomials without twin variables. As for Nullstellensatz the size could blow up exponentially in the degree. The same facts are true for Sums-of-Squares.

Two further comments are in order. For Nullstellensatz, one could consider an alternative definition in which proofs are polynomial identities of the form $\sum_j q_j p_j = p$, where the $p_j$ are hypotheses or axiom polynomials, and the $q_j$ are arbitrary polynomials. However this difference is minor since we can always write each $q_j$ as a combination of monomials and split $q_j p_j$ into the corresponding sum of scalar multiples of monomials times $p_j$. Second, one could consider the version of Sums-of-Squares in which, in addition to squares as in (7), one is also allowed multiplication by variables. As noted earlier, such multiplications by a variable $x_i$ can be simulated by multiplications by their squares $x_i^2$, thanks to the axioms from (2), at the cost of at most doubling the degree, and blowing up the size at most polynomially.

2.4 Completeness of Nullstellensatz and Sherali-Adams

In this section we prove the implicational completeness of Nullstellensatz and Sherali-Adams with quantitative bounds. We start with two technical lemmas that will be used to justify the elimination of twin variables.

Lemma 1.

For every polynomial of degree and every variable , there are NS and SA proofs of and of degree and , respectively, and size polynomial in the size of .

Proof.

Split into a sum of monomials , lift the axiom by , and add up together to get . ∎

The second technical lemma that we need formalizes the elimination of twin variables.

Lemma 2.

For every polynomial of degree , every scalar and every two subsets and of , with , there are NS and SA proofs of the equation

(9)

of degree and size polynomial in and the size of and .

Proof.

Assume without loss of generality that where . Let . Define for all . Observe that the goal equation is . For each , let . Now:

(10)

for each . Lemma 1 gives proofs of for every . Adding them all together gives by (10) and we are done. ∎

We will need the following definitions. For every assignment , define

Define its indicator polynomial:

(11)

For every polynomial , let denote the evaluation of when is assigned .
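The role of indicator polynomials can be illustrated numerically. The sketch below assumes the standard form of the indicator of a 0/1 assignment $a$, namely the product of $x_i$ over the coordinates with $a_i = 1$ and of $1 - x_i$ (equivalently, the twin variable) over the coordinates with $a_i = 0$; whether this matches (11) symbol by symbol is an assumption, and the names `indicator` and `interpolate` are hypothetical. The indicator evaluates to 1 at $a$ and to 0 at every other 0/1 point, so any function on 0/1 points is a weighted sum of indicators.

```python
from itertools import product


def indicator(a):
    """Return an evaluator for the indicator polynomial of the 0/1 tuple a."""
    def evaluate(x):
        value = 1
        for a_i, x_i in zip(a, x):
            # factor x_i where a_i = 1, factor (1 - x_i) where a_i = 0
            value *= x_i if a_i == 1 else (1 - x_i)
        return value
    return evaluate


def interpolate(f, n):
    """Return an evaluator for sum over a in {0,1}^n of f(a) * indicator_a."""
    def evaluate(x):
        return sum(f(a) * indicator(a)(x) for a in product((0, 1), repeat=n))
    return evaluate


# Example: a function on three 0/1 variables and its indicator expansion.
f = lambda a: 3 * a[0] - a[1] * a[2] + 5
g = interpolate(f, 3)
print(all(f(b) == g(b) for b in product((0, 1), repeat=3)))
```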

For a polynomial on the variables $x_1, \ldots, x_n$, its multilinearization is the unique multilinear polynomial that agrees with it on all assignments of values in $\{0,1\}$ to its variables. The uniqueness of the multilinearization follows from the fact that the collection of multilinear polynomials in $x_1, \ldots, x_n$ forms a vector space of dimension $2^n$ for which the monomials $\prod_{i \in I} x_i$, with $I \subseteq \{1, \ldots, n\}$, make a basis. Note that this holds for any field, not just $\mathbb{R}$.
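Computing a multilinearization is mechanical: every positive exponent is lowered to one and like terms are collected. The following sketch assumes a polynomial is stored as a dictionary from exponent tuples to coefficients; this representation and the names `multilinearize` and `evaluate` are choices made for illustration only.

```python
from itertools import product


def multilinearize(poly):
    """Lower every positive exponent to 1 and collect like terms."""
    result = {}
    for exponents, coeff in poly.items():
        key = tuple(min(e, 1) for e in exponents)
        result[key] = result.get(key, 0) + coeff
    # drop monomials whose coefficients cancelled out
    return {k: c for k, c in result.items() if c != 0}


def evaluate(poly, point):
    """Evaluate the polynomial at a tuple of values."""
    total = 0
    for exponents, coeff in poly.items():
        term = coeff
        for e, x in zip(exponents, point):
            term *= x ** e
        total += term
    return total


# x0^3 * x1 + 2 * x0^2 - x0 has multilinearization x0 * x1 + x0.
p = {(3, 1): 1, (2, 0): 2, (1, 0): -1}
q = multilinearize(p)
print(q)
# The two polynomials agree on every 0/1 assignment.
print(all(evaluate(p, b) == evaluate(q, b) for b in product((0, 1), repeat=2)))
```

Agreement on 0/1 points holds because $x^e = x$ for every $e \geq 1$ when $x$ is 0 or 1.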

Lemma 3.

For every polynomial on the variables , there are polynomials such that the following identity holds:

(12)

where denotes the multilinearization of . Moreover, each has size polynomial in the size of .

Proof.

Observe that it is enough to prove the lemma for the special case of monomials. Indeed, if is an arbitrary polynomial, we get the identity (12) by splitting into a sum of monomials, applying the lemma to each monomial, and adding up the obtained identities.

Let be a monomial. We proceed by induction on the sum of the individual degrees of the variables. If all variables have individual degree one, there is nothing to prove. Otherwise, some variable must have individual degree at least two. Say this variable is and let and be such that and . Note that the multilinearizations of and are the same, and in both and the sum of the individual degrees is strictly smaller. The induction hypothesis applied to gives polynomials such that

(13)

Now the identity we want is obtained by defining for , and . Indeed:

(14)
(15)
(16)

and we already proved in (13) that this last thing is . ∎

Theorem 4.

Let be a system of polynomial equations, let be a system of polynomial inequalities, and let be a polynomial, all over the same variables. If follows from on all evaluations of its variables in , then there is an NS proof of from . Similarly, if follows from on all evaluations of its variables in , then there is an SA proof of from . Moreover, in both cases the degree of the proof is at most , and the size is polynomial in and in the size of and , respectively.

Proof.

Both proofs are essentially the same; first we give the proof for Sherali-Adams and then indicate how to adapt it to Nullstellensatz. We prove the theorem when is multilinear and then we adapt it to the general case. Assume is multilinear and let , where we have written each equation in as two inequalities. For every assignment , let be the real numbers defined by cases as follows. If , let for and for . If , let be the smallest element in such that , which must exist by the hypothesis, and define for and for each . Observe that in all cases is non-negative. In the first case because was non-negative, and in the second case because both and were negative, so their ratio is positive. The choice of these reals guarantees that

(17)

We need the following claim.

Claim 1.

For every assignment and every , the polynomial is the multilinearization of . In addition, .

Proof.

Since the multilinearization is unique and the polynomial is multilinear, it suffices to show that and agree on all assignments of values in to their variables. But this is easy: they both evaluate to , or both evaluate to , depending on whether the assignment is , or different from , respectively. For the second claim we use the same argument, and add the additional fact that is itself multilinear: the big sum over is a multilinear polynomial and, by (17), it agrees with on all assignments of values in to its variables. Hence, by the uniqueness of the multilinearization, and since is multilinear, it is itself. ∎

Back to the proof, by the first part of Claim 1, for every assignment and every , there exist polynomials according to Lemma 3 that make the following identities hold:

(18)

We are ready to build up the proof of from . We claim that the following identity holds:

(19)

First we claim that the left-hand side can be converted into a valid SA proof (with multiplications by ’s and ’s, which can be simulated in our definition of Sherali-Adams as discussed in Lemma 2). To see this, just reorder the terms and apply Lemma 1 to replace by proper SA proofs. It remains to see that the identity (19) holds; this will show that it is an SA proof of from .

In order to see that (19) holds, first use equation (18) to rewrite its left-hand side:

(20)

And now use the second part of Claim 1 to complete the proof when is multilinear.

When is not multilinear, it suffices to apply the above argument to get its multilinearization , and then apply the reverse identity in Lemma 3. Indeed,

(21)

To turn this into a proper SA proof we need to use Lemma 1 again.

For Nullstellensatz, the argument is the same except that, in order to handle arbitrary fields besides the real field , the coefficients need to be redefined. Let . If , define for all . If , let be the smallest element in such that , which must exist by hypothesis, and define for and for . This choice is well-defined over any field and guarantees (17). The rest of the proof is the same. ∎

2.5 Constraint satisfaction problem

There are many equivalent definitions of the constraint satisfaction problem. Here we use the definition in terms of homomorphisms. Below we introduce the necessary terminology. A concrete example will be developed in Section 8 where we apply the method of reducibilities to the graph -coloring problem for .

CSPs and homomorphisms.

A relational vocabulary $\sigma$ is a set of symbols; each symbol has an associated natural number called its arity. A relational structure $\mathbf{A}$ over $\sigma$ (or a $\sigma$-structure) is a set $A$, called the domain, together with a set of relations over $A$. For each natural number $r$ and each relation symbol $R \in \sigma$ of arity $r$, there is a relation in $\mathbf{A}$ of arity $r$ denoted $R^{\mathbf{A}}$, i.e., $R^{\mathbf{A}} \subseteq A^r$. Sometimes we call it the interpretation of $R$ in $\mathbf{A}$. We say that a relational structure is finite if its domain is finite and it has finitely many non-empty relations.

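Let $\mathbf{A}$ and $\mathbf{B}$ be $\sigma$-structures, for some relational vocabulary $\sigma$. A homomorphism from $\mathbf{A}$ to $\mathbf{B}$ is a function $h : A \to B$ which preserves all the relations, that is, for every natural number $r$ and each relation symbol $R \in \sigma$ of arity $r$, if $(a_1, \ldots, a_r) \in R^{\mathbf{A}}$, then $(h(a_1), \ldots, h(a_r)) \in R^{\mathbf{B}}$.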

For a fixed $\sigma$-structure $\mathbf{B}$, the constraint satisfaction problem of $\mathbf{B}$, denoted CSP($\mathbf{B}$), is the following computational problem: given a finite $\sigma$-structure $\mathbf{A}$, decide whether there exists a homomorphism from $\mathbf{A}$ to $\mathbf{B}$. If the answer is positive we call the instance satisfiable; otherwise we call it unsatisfiable. The size of an instance $\mathbf{A}$ is the number of elements in its domain plus the number of tuples in all its relations. Note that if the vocabulary $\sigma$ is fixed and finite, then the size of $\mathbf{A}$ is polynomial in the number of elements of its domain, which we denote by $|A|$. In the context of CSP the structure $\mathbf{B}$ is often called a constraint language or a template. We usually assume that the constraint language is finite.
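The homomorphism formulation can be made concrete on small instances. The sketch below is a brute-force illustration only, not an efficient algorithm: a structure is assumed to be given as a list of domain elements together with a dictionary from relation names to sets of tuples, and the names `is_homomorphism`, `find_homomorphism` and `clique` are hypothetical. Graph 3-coloring is the CSP whose template is the triangle.

```python
from itertools import product


def is_homomorphism(h, rels_a, rels_b):
    """Check that the map h sends every tuple of A into the matching relation of B."""
    return all(
        tuple(h[x] for x in t) in rels_b[name]
        for name, tuples in rels_a.items()
        for t in tuples
    )


def find_homomorphism(dom_a, rels_a, dom_b, rels_b):
    """Try all functions from dom_a to dom_b; return a homomorphism or None."""
    for image in product(dom_b, repeat=len(dom_a)):
        h = dict(zip(dom_a, image))
        if is_homomorphism(h, rels_a, rels_b):
            return h
    return None


def clique(n):
    """The complete graph on n vertices as a structure with one binary relation E."""
    dom = list(range(n))
    return dom, {"E": {(u, v) for u in dom for v in dom if u != v}}


# The triangle is 3-colorable, the complete graph on four vertices is not.
print(find_homomorphism(*clique(3), *clique(3)) is not None)
print(find_homomorphism(*clique(4), *clique(3)) is None)
```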

Bounded-width.

The existential $k$-pebble game is played on two relational structures $\mathbf{A}$ and $\mathbf{B}$ over the same vocabulary by two players called Spoiler and Duplicator. The players are given two corresponding sets of pebbles $\{p_1, \ldots, p_k\}$ and $\{q_1, \ldots, q_k\}$. In each round Spoiler picks one of the pebbles $p_1, \ldots, p_k$, say $p_i$, and puts it on an element of the structure $\mathbf{A}$. Duplicator responds by picking the corresponding pebble $q_i$ and placing it on some element of the structure $\mathbf{B}$. For simplicity, in any given configuration of the game let us identify a pebble with the element of the structure that it is placed on. Spoiler wins if at any point during the game the partial function defined by $p_i \mapsto q_i$, for each pebbled element $p_i$ of $\mathbf{A}$, is either not well defined (because there exist indices $i \neq j$ of two pebbled elements such that $p_i = p_j$ but $q_i \neq q_j$), or is not a partial homomorphism. Otherwise, Duplicator wins.

We say that a finite relational structure has width if, for every finite structure of the same vocabulary as , if there is no homomorphism from to , then Spoiler wins the existential -pebble game on and . The structure has bounded width if it has width for some . Structures of bounded width are exactly those structures for which CSP() can be solved by a local consistency algorithm [35].
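For small structures the winner of the existential pebble game can be computed directly. The following is a minimal sketch, not a construction taken from this paper: it uses the standard characterization that Duplicator wins the game with $k$ pebbles exactly when there is a non-empty family of partial homomorphisms with at most $k$ pebbled elements that is closed under restrictions and can always be extended to one more element while fewer than $k$ pebbles are in use. The representation of structures as a domain list plus a dictionary of relations, and the names `duplicator_wins` and `is_partial_hom`, are assumptions made for illustration.

```python
from itertools import combinations, product


def is_partial_hom(h, rels_a, rels_b):
    """h is a dict; only tuples of A lying entirely inside dom(h) are checked."""
    for name, tuples in rels_a.items():
        for t in tuples:
            if all(x in h for x in t) and tuple(h[x] for x in t) not in rels_b[name]:
                return False
    return True


def duplicator_wins(k, dom_a, rels_a, dom_b, rels_b):
    """Decide the existential k-pebble game on (A, B) by a fixed-point computation."""
    # Start from all partial homomorphisms with at most k pebbled elements.
    family = set()
    for size in range(k + 1):
        for sub in combinations(dom_a, size):
            for img in product(dom_b, repeat=size):
                h = dict(zip(sub, img))
                if is_partial_hom(h, rels_a, rels_b):
                    family.add(frozenset(h.items()))
    # Discard positions from which Spoiler can force a win.
    changed = True
    while changed:
        changed = False
        for h in list(family):
            dom_h = {a for a, _ in h}
            # every restriction (lifting one pebble) must stay in the family
            restrictions_ok = all(h - {pair} in family for pair in h)
            # with a free pebble, every new element needs a surviving answer
            forth_ok = len(h) == k or all(
                a in dom_h or any(h | {(a, b)} in family for b in dom_b)
                for a in dom_a
            )
            if not (restrictions_ok and forth_ok):
                family.discard(h)
                changed = True
    # Duplicator wins exactly when the empty position survives.
    return frozenset() in family


def clique(n):
    dom = list(range(n))
    return dom, {"E": {(u, v) for u in dom for v in dom if u != v}}


# Triangle versus a single edge: two pebbles do not suffice for Spoiler, three do.
print(duplicator_wins(2, *clique(3), *clique(2)))
print(duplicator_wins(3, *clique(3), *clique(2)))
```

The number of candidate partial maps grows quickly with the number of pebbles, so the sketch is only meant for very small examples.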

2.6 Propositional and polynomial encodings

To reason about proof systems for CSPs we encode the fact that a finite structure $\mathbf{A}$ maps homomorphically to a finite structure $\mathbf{B}$, over the same vocabulary, as a CNF or as a system of polynomial equations and/or inequalities. In the proofs we will use concrete fixed encodings, but our results hold for a whole class of encodings which we call local.

Local encodings.

First let us fix some notation. In the context of propositional proof systems, for any sets and by we denote a set of propositional variables: for every and every there is a variable in the set . Truth valuations of the variables in and relations on have a natural one-to-one correspondence: a variable is assigned the truth value if and only if the pair belongs to the relation. Recall that a function from to is a relation on . Hence, a homomorphism from an -structure to an -structure is a relation on .

Fix a finite relational vocabulary and a finite structure over .

A propositional encoding scheme for is a mapping which assigns to every -structure a set of clauses over the variables in in such a way that there is a one-to-one correspondence between the truth valuations of the variables in satisfying and the homomorphisms from to .
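One way to obtain such a one-to-one correspondence is sketched below. It is one common CNF encoding of the homomorphism problem and is not claimed to be the encoding fixed in this paper: there is one propositional variable per pair consisting of an instance element and a template element, clauses stating that every instance element gets at least one and at most one image, and, for every tuple in a relation of the instance, one clause ruling out each image tuple that is not in the corresponding relation of the template. The names `encode` and `count_models` and the data representation are hypothetical.

```python
from itertools import product


def encode(dom_a, rels_a, dom_b, rels_b):
    """Return clauses as lists of (variable, polarity), with variable = (a, b)."""
    clauses = []
    for a in dom_a:
        # a is mapped to at least one element of B
        clauses.append([((a, b), True) for b in dom_b])
        # a is mapped to at most one element of B
        for i, b in enumerate(dom_b):
            for b2 in dom_b[i + 1:]:
                clauses.append([((a, b), False), ((a, b2), False)])
    for name, tuples in rels_a.items():
        for t in tuples:
            for s in product(dom_b, repeat=len(t)):
                if s not in rels_b[name]:
                    # the tuple t must not be sent to the non-tuple s
                    clauses.append([((x, y), False) for x, y in zip(t, s)])
    return clauses


def count_models(dom_a, dom_b, clauses):
    """Count satisfying truth valuations by exhaustive search."""
    variables = [(a, b) for a in dom_a for b in dom_b]
    count = 0
    for bits in product([False, True], repeat=len(variables)):
        val = dict(zip(variables, bits))
        if all(any(val[v] == pol for v, pol in clause) for clause in clauses):
            count += 1
    return count


triangle_dom = [0, 1, 2]
triangle = {"E": {(u, v) for u in triangle_dom for v in triangle_dom if u != v}}
edge_dom = [0, 1]
edge = {"E": {(0, 1), (1, 0)}}

# The triangle has 6 homomorphisms to itself and none to a single edge.
print(count_models(triangle_dom, triangle_dom,
                   encode(triangle_dom, triangle, triangle_dom, triangle)))
print(count_models(triangle_dom, edge_dom,
                   encode(triangle_dom, triangle, edge_dom, edge)))
```

Each clause mentions a single instance element or a single tuple of the instance, so the clause set is assembled from pieces of bounded size.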

In the context of algebraic and semi-algebraic proof systems we additionally assume the presence of twin variables. For every and every there is both the algebraic variable and the algebraic variable in the set , and an analogous bijective correspondence holds between relations of and those evaluations of the variables from in which satisfy the axioms from (2): a pair belongs to the relation if and only if the variable is assigned the value if and only if the variable is assigned the value .

An algebraic encoding scheme over a field for is a mapping which assigns to every -structure a set of polynomial equations over the variables in in such a way that there is a one-to-one correspondence between the evaluations of the variables from in satisfying and the axioms from (2) over , and the homomorphisms from  to . Finally, a semi-algebraic encoding scheme for is a mapping which assigns to every -structure a set of polynomial inequalities over the variables in in such a way that there is a one-to-one correspondence between the evaluations of the variables from in satisfying and the axioms from (2) and (4), and the homomorphisms from  to . Observe that every algebraic encoding scheme over the real field is also a semi-algebraic encoding scheme.

An encoding scheme is invariant under isomorphisms if, whenever is an isomorphism from an -structure to an -structure , it holds that , where is obtained from by substituting each variable by (and each variable by if necessary).

Next we define the key notion of local encoding scheme. We need two pieces of notation. If the structure has a single element and each of its relations is empty, we denote the encoding by . If the structure has a single non-empty relation with a single tuple in it, and its domain is , then we denote by . Since the vocabulary is finite, up to isomorphism there are only finitely many structures of one of the above-mentioned two kinds. Therefore, for any relational structure over a finite vocabulary and any encoding scheme that is invariant under isomorphisms, the size of encodings of the form or is bounded by a constant. We call it the local bound of the encoding scheme.

An encoding scheme is local if it is invariant under isomorphisms and, for every -structure , the encoding is a sum of over all and over all and . For our purposes all local encodings of the same kind (i.e., propositional, algebraic or semi-algebraic) are essentially equivalent, as formalized by the following result.

Lemma 4.

Let be a finite structure over a finite vocabulary , and let , and be pairs of local encoding schemes for that are propositional, algebraic and semi-algebraic, respectively. There exists a positive integer such that for every finite -structure it holds that:

  1. every clause in has a resolution proof from of size bounded by ,

  2. every equation in has an NS proof from of size and degree bounded by ,

  3. every inequality in has an SA proof from of size and degree bounded by .

Proof.

For 1, let  and  be the local bounds of and , respectively. Take a clause from . The clause belongs to a subset of of the form or , so the size of is bounded by . Without loss of generality suppose that belongs to a set . The corresponding subset of has size at most . The satisfying truth valuations for and are the same. Therefore, since is an element of