On the Power and Limitations of Branch and Cut

02/09/2021 ∙ by Noah Fleming, et al. ∙ 0

The Stabbing Planes proof system was introduced to model the reasoning carried out in practical mixed integer programming solvers. As a proof system, it is powerful enough to simulate Cutting Planes and to refute the Tseitin formulas – certain unsatisfiable systems of linear equations mod 2 – which are canonical hard examples for many algebraic proof systems. In a recent (and surprising) result, Dadush and Tiwari showed that these short refutations of the Tseitin formulas could be translated into quasi-polynomial size and depth Cutting Planes proofs, refuting a long-standing conjecture. This translation raises several interesting questions. First, whether all Stabbing Planes proofs can be efficiently simulated by Cutting Planes. This would allow for the substantial analysis done on the Cutting Planes system to be lifted to practical mixed integer programming solvers. Second, whether the quasi-polynomial depth of these proofs is inherent to Cutting Planes. In this paper we make progress towards answering both of these questions. First, we show that any Stabbing Planes proof with bounded coefficients SP* can be translated into Cutting Planes. As a consequence of the known lower bounds for Cutting Planes, this establishes the first exponential lower bounds on SP*. Using this translation, we extend the result of Dadush and Tiwari to show that Cutting Planes has short refutations of any unsatisfiable system of linear equations over a finite field. Like the Cutting Planes proofs of Dadush and Tiwari, our refutations also incur a quasi-polynomial blow-up in depth, and we conjecture that this is inherent. As a step towards this conjecture, we develop a new geometric technique for proving lower bounds on the depth of Cutting Planes proofs. This allows us to establish the first lower bounds on the depth of Semantic Cutting Planes proofs of the Tseitin formulas.


1 Introduction

An effective method for analyzing classes of algorithms is to formalize the techniques used by the class into a formal proof system, and then analyze the formal proof system instead. By doing this, theorists are able to hide many of the practical details of implementing these algorithms, while preserving the class of methods that the algorithms can feasibly employ. Indeed, this approach has been applied to study many different families of algorithms, such as

  • Conflict-driven clause-learning algorithms for SAT [BayardoS97, MarquesS99, MoskewiczMZZM01], which can be formalized using resolution proofs [DavisP60].

  • Optimization algorithms using semidefinite programming [Goemans1994, parrilo2000], which can often be formalized using Sums-of-Squares proofs [dima-sos, BarakBHKSZ12].

  • The classic cutting planes algorithms for integer programming [gomory1963algorithm, Chvatal73a], which are formalized by cutting planes proofs [Chvatal73a, CookCT87].

In the present work, we continue the study of formal proof systems corresponding to modern integer programming algorithms. Recall that in the integer programming problem, we are given a polytope $P \subseteq \mathbb{R}^n$ and a vector $c \in \mathbb{R}^n$, and our goal is to find a point $x \in P \cap \mathbb{Z}^n$ maximizing $c \cdot x$. The classic approach to solving this problem, pioneered by Gomory [gomory1963algorithm], is to add cutting planes to $P$. (Throughout, we will say that a cutting plane, or an inequality, is added to a polytope $P$ to mean that it is added to the set of inequalities defining $P$.) A cutting plane for $P$ is any inequality of the form $ax \le \lfloor b \rfloor$, where $a$ is an integral vector, $b$ is rational, and every point of $P$ satisfies $ax \le b$. By the integrality of $a$, it follows that cutting planes preserve the integral points of $P$, while potentially removing non-integral points from $P$. The cutting planes algorithms then proceed by heuristically choosing “good” cutting planes to add to $P$ to try to locate the integral hull of $P$ as quickly as possible.

As mentioned above, these algorithms can be naturally formalized into a proof system, the Cutting Planes proof system, denoted CP, as follows [CookCT87]. Initially, we are given a polytope $P$, presented as a list of integer-linear inequalities $\{a_i x \ge b_i\}$. From these inequalities we can then deduce new inequalities using two deduction rules:

  • Linear Combination. From inequalities $ax \ge b$ and $cx \ge d$, deduce any non-negative linear combination of these two inequalities with integer coefficients.

  • Division Rule. From an inequality $ax \ge b$, if an integer $d > 0$ divides all entries of $a$, then deduce $(a/d)x \ge \lceil b/d \rceil$.
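The two rules can be sketched in a few lines of Python. This is only an illustrative sketch: the representation of an inequality $ax \ge b$ as a pair `(a, b)` and the function names `linear_combination` and `division` are assumptions made here, not notation from the paper. The example refutes the system $2x_1 + 2x_2 \ge 1$, $-2x_1 - 2x_2 \ge -1$, which has real solutions but no integral ones.

```python
# An inequality a.x >= b is stored as (a, b): a is a tuple of ints, b is an int.
# (Hypothetical representation chosen for this sketch.)

def linear_combination(ineq1, ineq2, lam, mu):
    """Return lam*ineq1 + mu*ineq2 for non-negative integer multipliers."""
    if lam < 0 or mu < 0:
        raise ValueError("multipliers must be non-negative")
    (a, b), (c, d) = ineq1, ineq2
    coeffs = tuple(lam * ai + mu * ci for ai, ci in zip(a, c))
    return coeffs, lam * b + mu * d

def division(ineq, d):
    """From a.x >= b, with d > 0 dividing every entry of a, deduce (a/d).x >= ceil(b/d)."""
    a, b = ineq
    if d <= 0 or any(ai % d for ai in a):
        raise ValueError("d must be a positive common divisor of the coefficients")
    # -((-b) // d) computes ceil(b / d) exactly on integers.
    return tuple(ai // d for ai in a), -((-b) // d)

# 2*x1 + 2*x2 >= 1 and -2*x1 - 2*x2 >= -1, i.e. x1 + x2 = 1/2.
p1 = ((2, 2), 1)
p2 = ((-2, -2), -1)
q1 = division(p1, 2)                          # x1 + x2 >= 1
q2 = division(p2, 2)                          # -x1 - x2 >= 0
refutation = linear_combination(q1, q2, 1, 1) # 0 >= 1
```

The final line is the trivially false inequality $0 \ge 1$, so the three derived lines form a refutation in the sense described next.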

A Cutting Planes refutation of $P$ is a proof of the trivially false inequality $0 \ge 1$ from the inequalities in $P$; clearly, such a refutation is possible only if $P$ does not contain any integral points. While Cutting Planes has grown to be an influential proof system in propositional proof complexity, the original cutting planes algorithms suffered from numerical instabilities, as well as difficulties in finding good heuristics for the next cutting planes to add [gomory1963algorithm].

The modern algorithms in integer programming improve on the classical cutting planes method by combining them with a second technique, known as branch-and-bound, resulting in a family of optimization algorithms broadly referred to as branch-and-cut algorithms. These algorithms search for integer solutions in a polytope $P$ by recursively repeating the following two procedures: First, $P$ is split into smaller polytopes $P_1, \ldots, P_k$ such that $P \cap \mathbb{Z}^n \subseteq \bigcup_i P_i$ (i.e. branching). Next, cutting planes deductions are made in order to further refine the branched polytopes (i.e. cutting). In practice, branching is usually performed by selecting a variable $x_i$ and branching on all possible values of $x_i$; that is, recursing on $P \cap \{x_i = t\}$ for each feasible integer value $t$. More complicated branching schemes have also been considered, such as branching on the Hamming weight of subsets of variables [FischettiL03], branching using basis-reduction techniques [AardalL04, KrishnamoorthyP09, AardalBHLS00], and more general linear inequalities [OwenM01, mahajanRT09, KaramanovC11].

However, while these branch-and-cut algorithms are much more efficient in practice than the classical cutting planes methods, they are no longer naturally modelled by Cutting Planes proofs. So, in order to model these solvers as proof systems, Beame et al. [BeameFIKPPR18] introduced the Stabbing Planes proof system. Given a polytope $P$ containing no integral points, a Stabbing Planes refutation of $P$ proceeds as follows. We begin by choosing an integral vector $a$ and an integer $b$, and replacing $P$ with the two polytopes $P \cap \{ax \le b-1\}$ and $P \cap \{ax \ge b\}$. Then, we recurse on these two polytopes, continuing until all descendant polytopes are empty (that is, they do not even contain any real solutions). The majority of branching schemes used in practical branch-and-cut algorithms (including all of the concrete schemes mentioned above) are examples of this general branching rule.

It is now an interesting question how the two proof systems — Cutting Planes and Stabbing Planes — are related. By contrasting the two systems we see at least three major differences:

  • Top-down vs. Bottom-up. Stabbing Planes is a top-down proof system, formed by performing queries on the polytope and recursing; while Cutting Planes is a bottom-up proof system, formed by deducing new inequalities from old ones.

  • Polytopes vs. Halfspaces. Individual “lines” in a Stabbing Planes proof are polytopes, while individual “lines” in a Cutting Planes proof are halfspaces.

  • Tree-like vs. DAG-like. The graphs underlying Stabbing Planes proofs are trees, while the graphs underlying Cutting Planes proofs are general DAGs: intuitively, this means that Cutting Planes proofs can “re-use” their intermediate steps, while Stabbing Planes proofs cannot.

When taken together, these facts suggest that Stabbing Planes and Cutting Planes could be incomparable in power, as polytopes are more expressive than halfspaces, while DAG-like proofs offer the power of line-reuse. Going against this natural intuition, Beame et al. proved that Stabbing Planes can actually efficiently simulate Cutting Planes [BeameFIKPPR18]. Furthermore, they proved that Stabbing Planes is equivalent to the proof system tree-like R(CP), denoted TreeR(CP), which was introduced by Krajíček [Krajicek98], and whose relationship to Cutting Planes was previously unknown.

This leaves the converse problem, of whether Stabbing Planes can also be simulated by Cutting Planes, as an intriguing open question. Beame et al. conjectured that such a simulation was impossible, and furthermore that the Tseitin formulas provided a separation between these systems [BeameFIKPPR18]. For any graph $G = (V, E)$ and any $\{0,1\}$-labelling $\ell$ of the vertices of $G$, the Tseitin formula of $(G, \ell)$ is the following system of $\mathbb{F}_2$-linear equations: for each edge $e$ we introduce a variable $x_e$, and for each vertex $v$ we have an equation

$\sum_{e \ni v} x_e = \ell(v) \pmod 2$

asserting that the sum of the edge variables incident with $v$ must agree with its label (note such a system is unsatisfiable as long as $\sum_{v \in V} \ell(v)$ is odd). On the one hand, Beame et al. proved that there are quasi-polynomial size Stabbing Planes refutations of the Tseitin formulas [BeameFIKPPR18]. On the other hand, Tseitin formulas had long been conjectured to be exponentially hard for Cutting Planes [CookCT87], as they form one of the canonical families of hard examples for algebraic and semi-algebraic proof systems, including Nullstellensatz [dima-nsatz], Polynomial Calculus [bgip], and Sum-of-Squares [dima-sos, Schoenebeck08].
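As a small sanity check on the definition of the Tseitin formulas, the following Python sketch builds the parity equations for a graph and decides satisfiability by brute force. The helper names `tseitin_equations` and `is_satisfiable`, and the choice of the triangle as the example graph, are assumptions made for illustration only; brute force is of course feasible only for tiny graphs.

```python
from itertools import product

def tseitin_equations(edges, labels):
    """Map each vertex v to (indices of edges incident with v, label of v)."""
    return {v: ([i for i, e in enumerate(edges) if v in e], lab)
            for v, lab in labels.items()}

def is_satisfiable(edges, labels):
    """Try every 0/1 assignment to the edge variables (exponential time)."""
    eqs = tseitin_equations(edges, labels)
    for x in product((0, 1), repeat=len(edges)):
        if all(sum(x[i] for i in idx) % 2 == lab for idx, lab in eqs.values()):
            return True
    return False

triangle = [(0, 1), (1, 2), (0, 2)]
odd_labels = {0: 1, 1: 0, 2: 0}    # labels sum to 1: odd total
even_labels = {0: 1, 1: 1, 2: 0}   # labels sum to 2: even total

odd_sat = is_satisfiable(triangle, odd_labels)
even_sat = is_satisfiable(triangle, even_labels)
```

Summing all vertex equations counts every edge variable twice, so the left-hand side is even; this is why the odd labelling admits no solution while the even labelling on this connected graph does.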

In a recent breakthrough, the long-standing conjecture that Tseitin was exponentially hard for Cutting Planes was refuted by Dadush and Tiwari [TseitinUpperBound], who gave quasi-polynomial size Cutting Planes refutations of Tseitin instances. Moreover, to prove their result, Dadush and Tiwari showed how to translate the quasipolynomial-size Stabbing Planes refutations of Tseitin into Cutting Planes refutations. This translation result is interesting for several reasons. First, it brings up the possibility that Cutting Planes can actually simulate Stabbing Planes. If possible, such a simulation would allow the significant analysis done on the Cutting Planes system to be lifted directly to branch-and-cut solvers. In particular, this would mean that the known exponential-size lower bounds for Cutting Planes refutations would immediately imply the first exponential lower bounds for these algorithms for arbitrary branching heuristics. Second, the translation converts shallow Stabbing Planes proofs into very deep Cutting Planes proofs: the Stabbing Planes refutation of Tseitin has depth and quasi-polynomial size, while the Cutting Planes refutation has quasipolynomial size and depth. This is quite unusual since simulations between proof systems typically preserve the structure of the proofs, and thus brings up the possibility that the Tseitin formulas yield a supercritical size/depth tradeoff – formulas with short proofs, requiring superlinear depth. For contrast: another simulation from the literature which emphatically does not preserve the structure of proofs is the simulation of bounded-size resolution by bounded-width resolution by Ben-Sasson and Wigderson [BenSassonW01]. In this setting, it is known that this simulation is tight [BonetG01], and even that there exist formulas refutable in resolution width requiring maximal size [AtseriasLN16]. 
Furthermore, under the additional assumption that the proofs are tree-like, Razborov [Razborov16] proved a supercritical trade-off between width and size.

1.1 Our Results

A New Characterization of Cutting Planes

Our first main result gives a characterization of Cutting Planes proofs as a natural subsystem of Stabbing Planes that we call facelike Stabbing Planes. A Stabbing Planes query $(ax \le b-1,\ ax \ge b)$ is facelike if one of the sets $P \cap \{ax \le b-1\}$ or $P \cap \{ax \ge b\}$ is either empty or is a face of the polytope $P$, and a Stabbing Planes proof is said to be facelike if it only uses facelike queries. Our main result is the following theorem.

Theorem. The proof systems CP and Facelike SP are polynomially equivalent.

Using this equivalence we prove the following surprising simulation, stating that Stabbing Planes proofs with relatively small coefficients (quasi-polynomially bounded in magnitude) can be quasi-polynomially simulated by Cutting Planes.

[] Let be any unsatisfiable CNF formula on variables, and suppose that there is a refutation of in size and maximum coefficient size . Then there is a refutation of in size .

As a second application of the equivalence between CP and Facelike SP, we generalize Dadush and Tiwari’s result to show that Cutting Planes can refute any unsatisfiable system of linear equations over a finite field. This follows by showing that, like Tseitin, we can refute such systems of linear equations in quasipolynomial-size Facelike SP.

Theorem. Let $F$ be the CNF encoding of an unsatisfiable system of linear equations over a finite field. There is a CP refutation of $F$ of quasipolynomial size.

This should be contrasted with the work of Filmus, Hrubeš, and Lauria [FilmusHL16], which gives several unsatisfiable systems of linear equations over that require exponential size refutations in Cutting Planes.

Lower Bounds

An important open problem is to prove superpolynomial size lower bounds for Stabbing Planes proofs. We make significant progress toward this goal by proving the first superpolynomial lower bounds on the size of low-weight Stabbing Planes proofs. Let SP* denote the family of Stabbing Planes proofs in which each coefficient has at most quasipolynomial magnitude.

There exists a family of unsatisfiable CNF formulas such that any refutation of requires size at least for constant .

Our proof follows straightforwardly from our simulation of SP* by CP together with known Cutting Planes lower bounds. We view this as a step toward proving SP lower bounds (with no restrictions on the weight). Indeed, lower bounds for CP* (low-weight Cutting Planes) [BonetPR97] were established first, and led to (unrestricted) CP lower bounds [Pudlak97].

Our second lower bound is a new linear depth lower bound for semantic Cutting Planes proofs. (In a semantic Cutting Planes proof the deduction rules for CP are replaced by a simple and much stronger semantic deduction rule.)

For all sufficiently large $n$ there is a graph $G$ on $n$ vertices and a labelling $\ell$ such that the Tseitin formula for $(G, \ell)$ requires depth $\Omega(n)$ to refute in Semantic Cutting Planes.

We note that lower bounds for Semantic Cutting Planes were already established via communication complexity arguments. However, since Tseitin formulas have short communication protocols, our depth bound for semantic Cutting Planes proofs of Tseitin is new.

Our main motivation behind this depth bound is as a step towards proving a supercritical tradeoff. A supercritical tradeoff for CP, roughly speaking, states that efficient proofs must sometimes be very deep, that is, beyond the trivial depth upper bound of $O(n)$ [Razborov16, BerkholzN20]. Establishing supercritical tradeoffs is a major challenge, both because hard examples witnessing such a tradeoff are rare, and because current methods seem to fail beyond the critical regime. In fact, the only supercritical tradeoff between size and depth established to date is for bounded-width, tree-like Resolution [Razborov16].

As we mentioned above, Dadush and Tiwari’s quasipolynomial-size refutations of Tseitin are quasipolynomially deep, and our simulation of Facelike Stabbing Planes by Cutting Planes is similarly far from depth-preserving. We therefore conjecture that the Tseitin formulas exhibit a supercritical tradeoff for CP. Our proof of the depth lower bound is a novel geometric argument which generalizes the top-down “protection lemma” approach [BGHMP06] that crucially relied on the exact deduction rules of CP. As our argument has no inherent barrier to going beyond the critical regime, we hope that it is a step towards proving a supercritical tradeoff, which we leave as an open problem.

Conjecture. There exists a family of unsatisfiable formulas $\{F_n\}$ such that $F_n$ has quasipolynomial-size CP proofs, but any quasipolynomial-size CP proof requires superlinear depth.

Figure 1: Known relationships between proof systems considered in this paper (diagram omitted). A solid black (red) arrow between two proof systems indicates a polynomial (quasi-polynomial) simulation. A dashed arrow indicates that this simulation cannot be done.

1.2 Related Work

Lower Bounds on SP and R(CP).

Several lower bounds on subsystems of and have already been established. Krajíček [Krajicek98] proved exponential lower bounds on the size of proofs in which both the width of the clauses and the magnitude of the coefficients of each line in the proof are bounded. Concretely, let these bounds be and respectively. The lower bound that he obtains is . Kojevnikov [Kojevnikov07] removed the dependence on the coefficient size for proofs, obtaining a bound of . Beame et al. [BeameFIKPPR18] provide a size-preserving simulation of Stabbing Planes by which translates a depth Stabbing Planes proof into a width proof, and therefore this implies lower bounds on the size of proofs of depth . Beame et al. [BeameFIKPPR18] exhibit a function for which there are no refutations of depth via a reduction to the communication complexity of the CNF search problem.

Supercritical Tradeoffs.

Besides the work of Razborov [Razborov16], a number of supercritical tradeoffs have been observed in proof complexity, primarily for proof space. Beame et al. [BeameBI12] and Beck et al. [BeckNT13] exhibited formulas which admit polynomial size refutations in Resolution and the Polynomial Calculus respectively, and such that any refutation of sub-linear space necessitates a superpolynomial blow-up in size. Recently, Berkholz and Nordström [BerkholzN20] gave a supercritical trade-off between width and space for Resolution.

Depth in Cutting Planes and Stabbing Planes.

It is widely known (and easy to prove) that any unsatisfiable family of CNF formulas can be refuted by exponential size and linear depth Cutting Planes. It is also known that neither Cutting Planes nor Stabbing Planes can be balanced, in the sense that a depth- proof can always be transformed into a size proof [BeameFIKPPR18, BGHMP06]. This differentiates both of these proof systems from more powerful proof systems like Frege, for which it is well-known how to balance arbitrary proofs [CookR79]. Furthermore, even though both the Tseitin principles and systems of linear equations in finite fields can be proved in both quasipolynomial-size and depth in Facelike , the simulation of Facelike by cannot preserve both size and depth, as the Tseitin principles are known to require depth to refute in [BGHMP06].

We first recall the known depth lower bound techniques for both Cutting Planes and Stabbing Planes proofs. In both proof systems, arguably the primary method for proving depth lower bounds is by reducing to real communication complexity [ipu:cp, BeameFIKPPR18] ; however, communication complexity is always trivially upper bounded by , and it is far from clear how to use the assumption on the size of the proof to boost this to superlinear.

A second method has been developed for proving lower bounds (applicable to Cutting Planes but not to Stabbing Planes) using so-called protection lemmas [BGHMP06], which seems much more amenable to applying a small-size assumption on the proof. We also remark that for many formulas (such as the Tseitin formulas!) it is known how to achieve -depth lower bounds in Cutting Planes via protection lemmas, while proving even lower bounds via communication complexity is impossible, due to a known folklore upper bound.

2 Preliminaries

We first recall the definitions of some key proof systems.

Resolution.

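Fix an unsatisfiable CNF formula $F$ over variables $x_1, \ldots, x_n$. A Resolution refutation of $F$ is a sequence of clauses $C_1, \ldots, C_s$ ending in the empty clause $C_s = \bot$ such that each $C_i$ is in $F$ or is derived from earlier clauses $C_j, C_k$ with $j, k < i$ using one of the following rules:

  • Resolution. $C_i = (C_j \setminus \{\ell\}) \cup (C_k \setminus \{\neg \ell\})$ where $\ell \in C_j$, $\neg \ell \in C_k$, and $\ell$ is a literal.

  • Weakening. $C_i \supseteq C_j$.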

The size of the resolution proof is , the number of clauses. It is useful to visualize the refutation as a directed acyclic graph; with this in mind the depth of the proof (denoted ) is the length of the longest path in the proof DAG. The resolution depth of is the minimal depth of any resolution refutation of .
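The Resolution rule is easy to make concrete. The sketch below is an illustration under assumed conventions (clauses as frozensets of non-zero integers, with `+v` for a variable and `-v` for its negation, and a hypothetical helper `resolve`); it refutes the small formula $(x_1 \vee x_2) \wedge (\neg x_1 \vee x_2) \wedge (\neg x_2)$.

```python
def resolve(c1, c2, var):
    """Resolve c1 (containing the literal x_var) with c2 (containing its negation)."""
    if var not in c1 or -var not in c2:
        raise ValueError("c1 must contain x_var and c2 must contain its negation")
    return (c1 - {var}) | (c2 - {-var})

# Axioms of F = (x1 or x2) and (not x1 or x2) and (not x2).
c1 = frozenset({1, 2})
c2 = frozenset({-1, 2})
c3 = frozenset({-2})

c4 = resolve(c1, c2, 1)   # the unit clause (x2)
c5 = resolve(c4, c3, 2)   # the empty clause

refutation = [c1, c2, c3, c4, c5]
```

Viewed as a DAG, this refutation has five clauses, and its longest path from an axiom to the empty clause uses two resolution steps.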

Cutting Planes and Semantic Cutting Planes.

A Cutting Planes (CP) proof of an inequality $cx \ge d$ from a system of linear inequalities $P$ is given by a sequence of inequalities

$a_1 x \ge b_1,\ a_2 x \ge b_2,\ \ldots,\ a_s x \ge b_s$

such that $a_s = c$, $b_s = d$, and each inequality $a_i x \ge b_i$ is either in $P$ or is deduced from earlier inequalities in the sequence by applying one of the two rules Linear Combination or Division Rule described at the beginning of Section 1. We will usually be interested in the case that the list of inequalities $P$ defines a polytope.

An alternative characterization of Cutting Planes uses Chvátal-Gomory cuts (or just CG cuts) [CookCT87, Chvatal73a]. Let

be a polytope. A hyperplane

is supporting for if , and if is a supporting hyperplane then the set is called a face of . An inequality is valid for if every point of satisfies the inequality and is a supporting hyperplane of . Let be a polytope, and let be any valid inequality for such that all coefficients of are relatively prime integers. The halfspace is called a CG cut for . (We will sometimes abuse notation and refer to the inequality also as a CG cut.) If is a CG cut for the polytope , then we can derive from in steps of Cutting Planes by Farkas Lemma (note that the inequality is valid for by definition, so we can deduce as a linear combination of the inequalities of and then apply the division rule). If is a polytope and is a CG cut, then we will write , and say that is derived from .

Given a CNF formula $F$, we can translate $F$ into a system of linear inequalities in the following natural way. First, for each variable $x_i$ in $F$ add the inequality $0 \le x_i \le 1$. If $C = \bigvee_{i \in I^+} x_i \vee \bigvee_{i \in I^-} \neg x_i$ is a clause in $F$, then we add the inequality

$\sum_{i \in I^+} x_i + \sum_{i \in I^-} (1 - x_i) \ge 1.$

It is straightforward to see that the resulting system of inequalities will have no integral solutions if and only if the original formula is unsatisfiable. With this translation we consider Cutting Planes refutations (defined in the introduction) of $F$ to be refutations of the translation of $F$ to linear inequalities.
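The translation from clauses to inequalities can be sketched as follows. The clause encoding (lists of non-zero integers, `+i` for $x_i$ and `-i` for $\neg x_i$) and the helper names `clause_to_inequality` and `has_integral_solution` are assumptions for illustration. Because the box constraints $0 \le x_i \le 1$ restrict integral points to $\{0,1\}^n$, the brute-force check only enumerates 0/1 vectors.

```python
from itertools import product

def clause_to_inequality(clause, n):
    """Translate a clause over variables 1..n into (a, b), meaning a.x >= b."""
    a = [0] * n
    b = 1
    for lit in clause:
        i = abs(lit) - 1
        if lit > 0:
            a[i] += 1      # positive literal contributes x_i
        else:
            a[i] -= 1      # (1 - x_i) contributes -x_i ...
            b -= 1         # ... and its constant 1 moves to the right-hand side
    return tuple(a), b

def has_integral_solution(cnf, n):
    """Check all 0/1 points against the translated clause inequalities."""
    ineqs = [clause_to_inequality(c, n) for c in cnf]
    return any(
        all(sum(ai * xi for ai, xi in zip(a, x)) >= b for a, b in ineqs)
        for x in product((0, 1), repeat=n)
    )

unsat_cnf = [[1, 2], [-1, 2], [-2]]
sat_cnf = [[1, 2], [-1]]
```

For instance, the clause $\neg x_1 \vee x_2$ becomes $-x_1 + x_2 \ge 0$, and the unsatisfiable formula yields a system with no 0/1 solution.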

The semantic Cutting Planes proof system (denoted sCP or Semantic CP) is a strengthening of Cutting Planes proofs to allow any deduction that is sound over integral points [BonetPR97]. Like Cutting Planes, an sCP proof is given by a sequence of halfspaces $\{a_i x \ge b_i\}$, but now we can use the following very powerful semantic deduction rule:

  • Semantic Deduction. From $ax \ge b$ and $cx \ge d$ deduce $ex \ge f$ if every integral point satisfying both $ax \ge b$ and $cx \ge d$ also satisfies $ex \ge f$.

Filmus et al. [FilmusHL16] showed that sCP is extremely strong: there are instances for which any refutation in CP requires exponential size, and yet these instances admit polynomial-size refutations in Semantic CP.
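A brute-force check of the semantic deduction rule makes the definition concrete. This sketch assumes, for illustration only, that the relevant integral points lie in the cube $\{0,1\}^n$ (as they do for translations of CNF formulas); the function names `satisfies` and `semantically_follows` are hypothetical.

```python
from itertools import product

def satisfies(x, ineq):
    """True if the point x satisfies a.x >= b, where ineq = (a, b)."""
    a, b = ineq
    return sum(ai * xi for ai, xi in zip(a, x)) >= b

def semantically_follows(premise1, premise2, conclusion, n):
    """Check soundness of the deduction over all points of {0,1}^n."""
    return all(
        satisfies(x, conclusion)
        for x in product((0, 1), repeat=n)
        if satisfies(x, premise1) and satisfies(x, premise2)
    )

p1 = ((1, 1), 1)     # x1 + x2 >= 1
p2 = ((1, -1), 0)    # x1 - x2 >= 0
good = ((1, 0), 1)   # x1 >= 1
bad = ((0, 1), 1)    # x2 >= 1

ok = semantically_follows(p1, p2, good, 2)
not_ok = semantically_follows(p1, p2, bad, 2)
```

Here $x_1 \ge 1$ follows because the only 0/1 points satisfying both premises are $(1,0)$ and $(1,1)$, whereas $x_2 \ge 1$ fails at $(1,0)$. Note that the check is exponential in $n$; the rule constrains soundness only, not how cheaply a deduction can be verified.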

The size of a Cutting Planes proof is the number of lines (it is known that for unsatisfiable CNF formulas that this measure is polynomially related to the length of the bit-encoding of the proof [CookCT87]). As with Resolution, it is natural to arrange Cutting Planes proofs into a proof DAG. With this in mind we analogously define and to be the smallest depth of any (semantic) Cutting Planes proof of .

It is known that any system of linear inequalities in the unit cube has CP-depth at most $O(n^2 \log n)$, and moreover there are examples requiring CP-depth more than $n$ [EisenbrandS99]. However for unsatisfiable CNF formulas, the CP-depth is at most $n$ [BockmayrEHS99].

Stabbing Planes.

Let $P$ be an unsatisfiable system of linear inequalities. A Stabbing Planes (SP) refutation of $P$ is a directed binary tree, $T$, where each edge is labelled with a linear integral inequality satisfying the following consistency conditions:

  • Internal Nodes. For any internal node $u$ of $T$, if the right outgoing edge of $u$ is labelled with $ax \ge b$, then the left outgoing edge is labelled with its integer negation $ax \le b-1$.

  • Leaves. Each leaf node $u$ of $T$ is labelled with a non-negative linear combination of inequalities in $P$ with inequalities along the path leading to $u$ that yields $0 \ge 1$.

For an internal node $u$ of $T$, the pair of inequalities $(ax \le b-1,\ ax \ge b)$ is called the query corresponding to the node. Every node $u$ of $T$ has a polytope $P_u$ associated with it, where $P_u$ is the polytope defined by the intersection of the inequalities in $P$ together with the inequalities labelling the path from the root to this node. We will say that the polytope $P_u$ corresponds to this node. The slab corresponding to the query is $\{x \in \mathbb{R}^n : b-1 < ax < b\}$, which is the set of points ruled out by this query. The width of the slab is the minimum distance between $ax = b-1$ and $ax = b$, which is $1/\|a\|_2$. The size of a refutation is the bit-length needed to encode a description of the entire proof tree, which, for CNF formulas as well as sufficiently bounded systems of inequalities, is polynomially equivalent to the number of queries in the refutation [TseitinUpperBound]. As well, the depth of the refutation is the depth of the binary tree. The proof system SP* is the subsystem of Stabbing Planes obtained by restricting all coefficients of the proofs to have magnitude at most quasipolynomial in the number of input variables.
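The query, its slab, and the slab width can be illustrated with a short sketch. The triple representation `(a, op, rhs)` and the names `query_children`, `in_slab`, and `slab_width` are assumptions made here for illustration, not part of the paper's formalism.

```python
from math import sqrt

def query_children(a, b):
    """Return the two branches of a query: (a.x <= b-1) on the left, (a.x >= b) on the right."""
    return (a, "<=", b - 1), (a, ">=", b)

def in_slab(a, b, x):
    """True if x lies strictly between the hyperplanes a.x = b-1 and a.x = b."""
    v = sum(ai * xi for ai, xi in zip(a, x))
    return b - 1 < v < b

def slab_width(a):
    """Euclidean distance between a.x = b-1 and a.x = b, namely 1 / ||a||_2."""
    return 1 / sqrt(sum(ai * ai for ai in a))

left, right = query_children((1, 1), 1)
fractional_point = (0.25, 0.25)   # a.x = 0.5, ruled out by the query
integral_point = (1, 0)           # a.x = 1, kept by the right branch
```

No integral point can fall in a slab, since $ax$ is an integer whenever $a$ and $x$ are integral; larger coefficients give thinner slabs, which is why coefficient size matters in the simulation of Section 3.2.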

The Stabbing Planes proof system was introduced by Beame et al. [BeameFIKPPR18] as a generalization of Cutting Planes that more closely modelled query algorithms and branch-and-bound solvers. Beame et al. proved that SP is equivalent to the proof system TreeR(CP) introduced by Krajíček [Krajicek98], which can be thought of as a generalization of Resolution where the literals are replaced with integer-linear inequalities.

3 Translating Stabbing Planes into Cutting Planes

3.1 Equivalence of CP with Subsystems of SP

In this section we prove our first main result, restated below, which characterizes Cutting Planes as a non-trivial subsystem of Stabbing Planes.

Theorem. The proof systems CP and Facelike SP are polynomially equivalent.

We begin by formally defining Facelike SP. A Stabbing Planes query $(ax \le b-1,\ ax \ge b)$ at a node corresponding to a polytope $P$ is facelike if one of the sets $P \cap \{ax \le b-1\}$, $P \cap \{ax \ge b\}$ is empty or a face of $P$ (see Figure 2(b)). An SP refutation is facelike if every query in the refutation is facelike.

En route to proving the equivalence of CP and Facelike SP, it will be convenient to introduce the following further restriction of Facelike Stabbing Planes.

A Stabbing Planes query $(ax \le b-1,\ ax \ge b)$ at a node corresponding to a polytope $P$ is pathlike if at least one of $P \cap \{ax \le b-1\}$ and $P \cap \{ax \ge b\}$ is empty (see Figure 2(a)). A Pathlike SP refutation is one in which every query is pathlike.

The name “pathlike” stems from the fact that the underlying graph of a pathlike Stabbing Planes proof is a path, since at most one child of every node has any children (see Figure 2). In fact, we have already seen (nontrivial) pathlike queries under another name: Chvátal-Gomory cuts.

(a) A Pathlike query. The polytope , and is a CG cut for .

(b) A Facelike query. The polytope is a face of .
Figure 2: Pathlike and Facelike queries on a polytope . On the left are the proofs and on the right are the corresponding effects on the polytope.

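Lemma. Let $P$ be a polytope and let $(ax \le b-1,\ ax \ge b)$ be a pathlike Stabbing Planes query for $P$. Assume w.l.o.g. that $P \cap \{ax \ge b\} = \emptyset$ and that $P \cap \{ax \le b-1\} \ne P$. Then $ax \le b-1$ is a CG cut for $P$.

Proof.

Since $ax \le b-1$ is falsified by some point in $P$, it follows that there exists some $0 < \varepsilon < 1$ such that $ax \le b - \varepsilon$ is valid for $P$; note that $\varepsilon > 0$ since otherwise $\{ax \ge b\}$ would not have empty intersection with $P$. Since $\lfloor b - \varepsilon \rfloor = b-1$, this immediately implies that $ax \le b-1$ is a CG cut for $P$. ∎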

With this observation we can easily prove that Pathlike SP is equivalent to CP. Throughout the remainder of the section, for readability, we will use the abbreviation $P \cap \{ax \ge b\}$ for $P \cap \{x \in \mathbb{R}^n : ax \ge b\}$, for any polytope $P$ and linear inequality $ax \ge b$.

Pathlike SP is polynomially equivalent to CP.

Proof.

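First, let $a_1 x \ge b_1, \ldots, a_s x \ge b_s$ be a CP refutation of an unsatisfiable system of linear inequalities $P$. Consider the sequence of polytopes $P_0 = P$ and $P_i = P_{i-1} \cap \{a_i x \ge b_i\}$. By inspecting the rules of CP, it can be observed that $P_i$ is obtained from $P_{i-1}$ by a CG cut, and thus can be deduced using one pathlike query from $P_{i-1}$ for all $i$.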

Conversely, let be any polytope and let be any pathlike query to (so, suppose w.l.o.g. that the halfspace defined by has empty intersection with ). By subsection 3.1, is a CG cut for , and so can be deduced in Cutting Planes from the inequalities defining in length (cf. Section 2). Applying this to each query in the Pathlike proof yields the theorem.

Next, we show how to simulate Facelike proofs by Pathlike proofs of comparable size. The proof of subsection 3.1 is inspired by Dadush and Tiwari [TseitinUpperBound], and will use the following lemma due to Schrijver [SCHRIJVER1980291] (although, we use the form appearing in [CookCT87]). Recall that we write for polytopes to mean that can be obtained from by adding a single CG cut to . [Lemma 2 in [CookCT87]] Let be a polytope defined by a system of integer linear inequalities and let be a face of . If then there is a polytope such that and .

Facelike is polynomially equivalent to Pathlike .

Proof.

That Facelike simulates Pathlike follows by the fact that any Pathlike query is a valid query in Facelike . For the other direction, consider an refutation of size . We describe a recursive algorithm for generating a Pathlike proof from . The next claim will enable our recursive case.

Claim. Let be a polytope and suppose is valid for . Assume that has a Pathlike refutation using queries. Then can be derived from in Pathlike using queries.

Proof of Claim..

Since is valid for it follows that is a face of by definition. Consider the Pathlike refutation , where the th polytope for is obtained from by applying a pathlike query and proceeding to the non-empty child. Without loss of generality we may assume that for all , and so applying subsection 3.1 we have that for all . Thus, by applying Figure 2 repeatedly, we get a sequence of polytopes such that . This means that , and so is Pathlike query for . This means that . Since any CG cut can be implemented as a Pathlike query the claim follows by applying the CG cuts as pathlike queries, followed by the query . ∎

We generate a Pathlike refutation by the following recursive algorithm, which performs an in-order traversal of . At each step of the recursion (corresponding to a node in ) we maintain the current polytope we are visiting and a Pathlike proof — initially, is the initial polytope and . We maintain the invariant that when we finish the recursive step at node , the Pathlike refutation is a refutation of . The algorithm is described next:

  1. Let be the current query and suppose that is valid for .

  2. Recursively refute , obtaining a Pathlike refutation with queries.

  3. Apply the above Claim to deduce from in queries.

  4. Refute by using the refutation for the right child.

Correctness follows immediately from the Claim, and also since the size of the resulting proof is the same as the size of the refutation. ∎

The equivalence of CP and Facelike SP then follows by combining the equivalence of Pathlike SP with CP and the equivalence of Facelike SP with Pathlike SP.

3.2 Simulating SP* by CP

In this section we prove that Stabbing Planes proofs with small coefficients can be quasi-polynomially simulated by Cutting Planes, as stated in Section 1.1.

To prove this theorem, we will show that any low coefficient proof can be converted into a Facelike proof with only a quasi-polynomial loss. If is a polytope let denote the diameter of , which is the maximum Euclidean distance between any two points in . subsection 1.1 follows immediately from the following theorem. Let be a polytope and suppose there is an refutation of with size and maximum coefficient size . Then there is a Facelike refutation of in size

Proof.

The theorem is by induction on . Clearly, if then the tree is a single leaf and the theorem is vacuously true.

We proceed to the induction step. Let be the initial polytope and be the proof. Consider the first query made by the proof, and let be the proof rooted at the left child (corresponding to ) and let be the proof rooted at the right child. Let denote the polytope at the left child and denote the polytope at the right child. By induction, let and be the Facelike refutations for and guaranteed by the statement of the theorem.

Suppose w.l.o.g. that . Let be the largest integer such that is satisfied for any point in . The plan is to replace the first query with a sequence of queries such that

  • For each , .

  • The query is the root of the tree and is attached to the right child of for .

  • .

After doing this replacement, instead of having two child polytopes below the top query, we have polytopes where and . To finish the construction, for each use the proof to refute and the proof to refute .

We need to prove three statements: that this new proof is a valid refutation of , that it is facelike, and that the size bound is satisfied.

First, it is easy to see that this is a valid proof, since for each the polytope and — thus, the refutations and can be used to refute the respective polytopes.

Second, to see that the proof is facelike, first observe that all the queries in the subtrees are facelike queries by the inductive hypothesis. So, we only need to verify that the new queries at the top of the proof are facelike queries, which can easily be shown by a quick induction. First, observe that the query is a facelike query, since was chosen so that is valid for the polytope . By induction, the query is a facelike query since the polytope associated with that query is by definition. Thus is valid for the polytope at the query.

Finally, we need to prove the size upper bound. Let be the size of the original proof, be the size of and be the size of . Observe that the size of the new proof is given by the recurrence relation

where . Since the queries cover the polytope with slabs of width , it follows that

where we have used that the maximum coefficient size in the proof is . Thus, by induction, the previous inequality, and the assumption that , we can conclude that the size of the proof is

subsection 1.1 follows immediately, since for any CNF formula the encoding of as a system of linear inequalities is contained in the -dimensional cube , which has diameter . We may also immediately conclude subsection 1.1 by applying the known lower bounds on the size of Cutting Planes proofs [Pudlak97, FlemingPPR17, HrubesP17, GargGKS18].

As a consequence of subsection 1.1 and the non-automatability of Cutting Planes [GoosKMP20], we can conclude that proofs cannot be found efficiently assuming . Indeed, non-automatability of follows by observing that the argument [GoosKMP20] does not require large coefficients.

4 Refutations of Linear Equations over a Finite Field

In this section we prove subsection 1.1. To do so, we will extend the approach used by Beame et al. [BeameFIKPPR18] to prove quasi-polynomial upper bounds on the Tseitin formulas to work on any unsatisfiable set of linear equations over any finite field.

If is a linear equation we say the width of the equation is the number of non-zero variables occurring in it. Any width- linear equation over characteristic can be represented by a CNF formula with width- clauses — one ruling out each falsifying assignment. For a width- system of linear equations over a finite field, we will denote by the size of the CNF formula encoding .
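The clause-per-falsifying-assignment encoding can be illustrated in the simplest case, characteristic 2 with Boolean variables. The sketch below is only an illustration under that assumption; the function names and the signed-integer literal convention (`v` for a positive literal, `-v` for a negated one) are hypothetical choices, not notation from the paper.

```python
from itertools import product

def parity_to_cnf(variables, b):
    """Encode x_1 + ... + x_w = b (mod 2) as a list of width-w clauses.

    Each falsifying assignment yields one clause that is false exactly
    on that assignment.
    """
    clauses = []
    for assignment in product((0, 1), repeat=len(variables)):
        if sum(assignment) % 2 != b:
            # Literal is positive where the assignment sets 0, negated where it sets 1.
            clauses.append([v if val == 0 else -v
                            for v, val in zip(variables, assignment)])
    return clauses

def satisfies(clauses, values):
    """Check a CNF against a dict mapping variable -> 0/1."""
    return all(any((values[abs(l)] == 1) == (l > 0) for l in clause)
               for clause in clauses)
```

For a width-3 parity constraint this produces four clauses, one for each of the four assignments with the wrong parity.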

Let be a width-, unsatisfiable set of linear equations over characteristic . There is an refutation of (the CNF encoding of) in size .

First we sketch the idea over characteristic . In this case the proof corresponds to a branch decomposition procedure which is commonly used to solve SAT (see e.g. [LodhaOS19, AlekhnovichR02, Dechter96, Darwiche01]). View the system as a hypergraph over vertices (corresponding to the variables) and with a -edge for each equation. Partition the set of hyperedges into two sets of roughly the same size, and consider the cut of vertices that belong to both an edge in and in . Using the rule we branch on all possible values of the sum of the cut variables in order to isolate and . Once we know this sum, we are guaranteed that either is unsatisfiable or is unsatisfiable depending on the parity of the sum of the cut variables. This allows us to recursively continue on the side of the cut ( or ) that is unsatisfiable. Since there are Boolean variables, each cut corresponds to at most possibilities for the sum, and if we maintain that the partition of the hyperedges defining the cut is balanced, then we will recurse at most times. This gives rise to a tree decomposition of fanout and height .
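As a small illustration of the cut used in this sketch, the following computes the variables shared between the two halves of a partition of the hyperedges. It is a minimal sketch: representing each equation as a Python set of variable indices, and the name `cut_variables`, are assumptions made for this example.

```python
def cut_variables(edges_left, edges_right):
    """Return the variables that occur in some edge on each side of the partition."""
    left_vars = set().union(*edges_left)
    right_vars = set().union(*edges_right)
    return left_vars & right_vars

# A 4-cycle of width-2 equations, split into two halves of equal size.
edges = [{1, 2}, {2, 3}, {3, 4}, {4, 1}]
half = len(edges) // 2
cut = cut_variables(edges[:half], edges[half:])
```

Here the cut is the pair of variables 1 and 3, so branching on the sum of those two variables separates the two halves.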

Over characteristic the proof will proceed in much the same way. Instead of a subgraph, at each step we will maintain a subset of the equations such that must contain a constraint that is violated by the queries made so far. We partition into two sets and of roughly equal size and query the values and of and . Because is unsatisfiable, at least one of or , meaning that it is unsatisfiable, and we recurse on it.

In the following, we will let stand for a vector of -valued variables . When we discuss any form where and is a vector of variables , we will implicitly associate it with the linear form where are the many Boolean variables encoding in the CNF encoding of .

Proof of section 4.

Let be a system of unsatisfiable linear equations over , where each for , and . Because is unsatisfiable, there exists a linear combination of the equations in witnessing this; formally, there exists such that , but .

Stabbing Planes will implement the following binary search procedure for a violated equation; we describe the procedure first, and then describe how to implement it in Stabbing Planes. In each round we maintain a subset and an integer representing the value of . Over the algorithm, we maintain the invariant that , which implies that there must be a contradiction to inside of the constraints .

Initially, and we obtain by querying the value of the sum . If then this contradicts the fact that ; thus, the invariant holds. Next, perform the following algorithm.

  1. Choose a balanced partition (so that ).

  2. Query the value of and ; denote these values by and respectively.

  3. If then recurse on with . Otherwise, if then recurse on with .

  4. Otherwise (if ), then this contradicts the invariant:

This recursion stops when , at which point we have an immediate contradiction between and the single equation indexed by .
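The search procedure above can be simulated for a fixed integer assignment, which may help in following the invariant. This is a hedged sketch rather than the Stabbing Planes implementation itself: the names `find_violated_equation`, `coeffs`, `rhs`, and `alpha` are hypothetical, equations are assumed to be given as integer coefficient lists over a prime modulus `q`, and `alpha` is assumed to be a certificate whose combination of left-hand sides vanishes mod `q` while the combination of right-hand sides does not.

```python
def find_violated_equation(coeffs, rhs, alpha, x, q):
    """Binary search for an index i with f_i(x) != b_i (mod q).

    coeffs[i] holds the coefficients of f_i, rhs[i] is b_i, and alpha is
    an (assumed) unsatisfiability certificate.
    """
    def query(I):
        # Integer value of sum_{i in I} alpha_i * f_i(x).
        return sum(alpha[i] * sum(c * v for c, v in zip(coeffs[i], x)) for i in I)

    def target(I):
        # Integer value of sum_{i in I} alpha_i * b_i.
        return sum(alpha[i] * rhs[i] for i in I)

    I = list(range(len(coeffs)))
    k = query(I)
    # The certificate forces k = 0 (mod q) while the target is nonzero (mod q),
    # so the invariant k != target(I) (mod q) holds at the start.
    assert k % q == 0 and target(I) % q != 0

    while len(I) > 1:
        half = len(I) // 2
        I1, I2 = I[:half], I[half:]
        a, b = query(I1), query(I2)
        if (a - target(I1)) % q != 0:
            I, k = I1, a
        elif (b - target(I2)) % q != 0:
            I, k = I2, b
        else:
            # Impossible: a + b = k would then agree with target(I) mod q.
            raise AssertionError("invariant violated")
    return I[0]

# Odd cycle over F_2: x1+x2=1, x2+x3=1, x1+x3=1 (summing gives 0 = 1).
coeffs = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]
rhs = [1, 1, 1]
alpha = [1, 1, 1]
violated = find_violated_equation(coeffs, rhs, alpha, [0, 1, 0], 2)
```

On the assignment (0, 1, 0) the search returns index 2, the equation x1 + x3 = 1, which that assignment falsifies. Since the index set halves each round, the loop runs a logarithmic number of times in the number of equations.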

It remains to implement this algorithm in . First, we need to show how to perform the queries in step 2. Querying the value of any sum can be done in a binary tree with at most leaves, one corresponding to every possible query outcome. Internally, this tree queries all possible integer values for this sum (e.g. ). For the leaf where we have deduced we use the fact that each variable is non-negative to deduce that as well. Note that is an upper bound on this sum because there are equations, each containing at most variables, each taking value at most (instead of querying the value of we could have queried to decrease the number of leaves to ). Thus, step 2 can be completed in queries.

Finally, we show how to derive refutations in the following cases: (i) when we deduced that at the beginning, (ii) in step 4, (iii) when .

  1. Suppose that we received the value from querying . Note that every variable in is a multiple of . Query

    At the leaf that deduces , we can derive as a non-negative linear combination of this inequality together with . Similarly, at the other leaf can be combined with to derive .

  2. Suppose that . Then is derived by summing , and , all of which have already been deduced.

  3. When then we deduced that for and we would like to derive a contradiction using the axioms encoding . These axioms are presented to as the linear-inequality encoding of a CNF formula, and while there are no integer solutions satisfying both these axioms and , there could in fact be rational solutions. To handle this, we simply force that each of the at most variables in takes an integer value by querying the value of each variable one by one. As there are at most variables, each taking an integer value between and , this can be done in a tree with at most many leaves. At each leaf of this tree we deduce by a non-negative linear combination with the axioms, the integer-valued variables, and .

The recursion terminates in at most many rounds because the number of equations under consideration halves every time. Therefore, the size of this refutation is . Note that by making each query in a balanced tree, this refutation can be carried out in depth . ∎

Finally, we conclude subsection 1.1.

Proof of subsection 1.1.

Observe that the refutation from section 4 is facelike. Indeed, to perform step 2 we query from . For , the halfspace is valid for the current polytope because the polytope belongs to the cube. For each subsequent query, is valid because the previous query deduced . Similar arguments show that the remaining queries are also facelike. Thus, subsection 3.1 completes the proof. ∎

We note that the refutations that result from subsection 1.1 have a very particular structure: they are extremely long and narrow. Indeed, they have depth . We give a rough sketch of the argument: it is enough to show that most lines in the refutation are derived using some previous line with . This is because the final line would have depth proportional to the size of the proof. To see that the refutation satisfies this property, observe that for each node visited in the in-order traversal, the nodes in the right subproof depend on the halfspace labelling the root, which in turn depends on the left subproof .

5 Lower Bound on the Depth of Semantic Refutations

Our results from section 3 suggest an interesting interplay between depth and size of Cutting Planes proofs. In particular, we note that there is a trivial depth and exponential size refutation of any unsatisfiable CNF formula in Cutting Planes; however, it is easy to see that the Dadush–Tiwari proofs and our own quasi-polynomial size proofs of Tseitin are also extremely deep (in particular, their depth is superlinear). Even in the stronger Semantic it is not clear that the depth of these proofs can be decreased. However, this does not hold for , which has quasi-polynomial size and poly-logarithmic depth refutations. This motivates subsection 1.1, regarding the existence of a “supercritical” trade-off between size and depth for Cutting Planes [Razborov16, BerkholzN20]. The Tseitin formulas are a natural candidate for resolving this conjecture.

In this section we develop a new method for proving depth lower bounds which we believe should be more useful for resolving this conjecture. Our method works not only for but also for semantic . Using our technique, we establish the first linear lower bounds on the depth of Semantic refutations of the Tseitin formulas.

Lower bounds on the depth of syntactic refutations of Tseitin formulas were established by Buresh-Oppenheim et al. [BGHMP06] using a rank-based argument. Our proof is inspired by their work, and so we describe it next. Briefly, their proof proceeds by considering a sequence of polytopes where is the polytope defined by all inequalities that can be derived in depth from the axioms in . The goal is to show that is not empty. To do so, they show that a point is also in if for every coordinate such that , there exist points such that if and otherwise. The proof of this fact is syntactic: it relies on the careful analysis of the precise rules of .

When dealing with Semantic , we can no longer analyze a finite set of syntactic rules. Furthermore, it is not difficult to see that the aforementioned criterion for membership in is no longer sufficient for Semantic . We develop an analogous criterion for Semantic given later in this section. As well, we note that the definition of is not well-suited to studying the depth of bounded-size proofs like those in subsection 1.1 — there does not appear to be a useful way to limit to be a polytope derived by a bounded number of halfspaces. Therefore we develop our criterion in the language of lifting, which is more amenable to supercritical tradeoffs [Razborov16, BerkholzN20].

Throughout this section we will work with the following top-down definition of Semantic . Let be an -variate unsatisfiable CNF formula. An refutation of is a directed acyclic graph of fan-out where each node is labelled with a halfspace (understood as a set of points satisfying a linear inequality) satisfying the following:

  1. Root. There is a unique source node labelled with the halfspace (corresponding to the trivially true inequality ).

  2. Internal-Nodes. For each non-leaf node with children , we have

  3. Leaves. Each sink node is labeled with a unique clause such that .

The above definition is obtained by taking a (standard) proof and reversing all inequalities: now, a line is associated with the set of assignments falsified at that line, instead of the assignments satisfying the line.

To prove the lower bound we will need to find a long path in the proof. To find this path we will be taking a root-to-leaf walk down the proof while constructing a partial restriction on the variables. For a partial restriction , denote by and . Let the restriction of by be the halfspace

It is important to note that is itself a halfspace on the free coordinates of .
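To make the observation concrete, here is a minimal sketch of restricting a halfspace by a partial assignment. It assumes the halfspace is written as a·x ≤ b and that the restriction is a mapping from fixed coordinate indices to values; both the inequality direction and the names `restrict_halfspace` and `contains` are assumptions made for this illustration, not the paper's notation.

```python
def restrict_halfspace(a, b, rho):
    """Restrict {x : a.x <= b} by the partial assignment rho (index -> value).

    Returns (free, a_free, b_new): the free coordinate indices, their
    coefficients, and the new right-hand side after moving the fixed
    coordinates' contribution across the inequality.
    """
    free = [i for i in range(len(a)) if i not in rho]
    b_new = b - sum(a[i] * v for i, v in rho.items())
    return free, [a[i] for i in free], b_new

def contains(a, b, point):
    """Membership test for the halfspace {x : a.x <= b}."""
    return sum(c * v for c, v in zip(a, point)) <= b

# Fix x0 = 1 and x2 = 0 in the halfspace 2*x0 - x1 + 3*x2 <= 4.
free, a_free, b_new = restrict_halfspace([2, -1, 3], 4, {0: 1, 2: 0})
```

The result is again a single linear inequality, here -x1 ≤ 2, over the free coordinate only.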

One of our key invariants needed in the proof is the following. A halfspace is good if it contains the all- vector, that is, .

We will need two technical lemmas to prove the lower bounds. The first lemma shows that if a good halfspace has its boolean points covered by halfspaces , then one of the two covering halfspaces is also good modulo restricting a small set of coordinates. Let be any good halfspace, and suppose for halfspaces . Then there is a restriction and an such that