Let G be any n-vertex graph whose random walk matrix has its nontrivial eigenvalues bounded in magnitude by 1/√Δ (for example, a random graph G of average degree Θ(Δ) typically has this property). We show that the exp(c · log n / log Δ)-round Sherali–Adams linear programming hierarchy certifies that the maximum cut in such a G is at most 50.1% (in fact, at most 1/2 + 2^{−Ω(c)}). For example, in random graphs with n^{1.01} edges, O(1) rounds suffice; in random graphs with n · polylog(n) edges, n^{O(1/log log n)} = n^{o(1)} rounds suffice. Our results stand in contrast to the conventional beliefs that linear programming hierarchies perform poorly for max-cut and other CSPs, and that eigenvalue/SDP methods are needed for effective refutation. Indeed, our results imply that constant-round Sherali–Adams can strongly refute random Boolean k-CSP instances with n^{⌈k/2⌉ + δ} constraints; previously this had only been done with spectral algorithms or the SOS SDP hierarchy.

## 1 Introduction

Linear programming (LP) is a fundamental algorithmic primitive, and is the method of choice for a huge number of optimization and approximation problems. Still, there are some very basic tasks where it performs poorly. A classic example is the simplest of all constraint satisfaction problems (CSPs), the max-cut problem: Given a graph G = (V, E), partition V into two parts so as to maximize the fraction of ‘cut’ (crossing) edges. The standard LP relaxation for this problem [BM86, Pol92] involves optimizing over the metric polytope. Using “±1 notation”, we have a variable Y_uv for each pair of vertices {u, v} (with Y_uv supposed to be −1 if the edge is cut, +1 otherwise); the LP is:

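$$
\begin{aligned}
\textsc{max-cut}(G) \;\le\; \max\quad & \frac12 - \frac12 \cdot \frac{1}{|E|} \sum_{uv \in E} Y_{uv} \\
\text{s.t.}\quad & -Y_{uv} - Y_{vw} - Y_{wu} \le 1 && \text{(for all } u, v, w \in V\text{)} \\
& -Y_{uv} + Y_{vw} + Y_{wu} \le 1 && \text{(for all } u, v, w \in V\text{)} \\
& -1 \le Y_{uv} \le 1 && \text{(for all } u, v \in V\text{)}
\end{aligned}
$$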

While this LP gives the optimal bound for some graphs (precisely, all graphs not contractible to K_5 [BM86]), it can give a very poor bound in general. Indeed, although there are graphs with maximum cut arbitrarily close to 1/2 (e.g., K_n), the above LP bound is at least 2/3 for every graph, since Y_uv ≡ −1/3 is always a valid solution. Worse, there are graphs G with max-cut(G) arbitrarily close to 1/2 but with LP value arbitrarily close to 1; that is, graphs where the integrality ratio is 2 − o(1). For example, this is true [PT94] of an Erdős–Rényi random graph with high probability (whp) when satisfies .
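As a quick sanity check on the LP above, the following sketch verifies that the constant assignment Y_uv = −1/3 satisfies the triangle and box constraints and yields objective value 2/3 regardless of the edge set. The function names and the choice of K_5 as a test instance are illustrative assumptions, not part of the original text.

```python
from fractions import Fraction
from itertools import combinations

def constant_solution_is_feasible(y):
    """Check the metric-polytope constraints for a constant assignment Y_uv = y."""
    # With all Y values equal, every triangle constraint reduces to one of these two.
    all_minus = -y - y - y <= 1     # -Y_uv - Y_vw - Y_wu <= 1
    one_minus = -y + y + y <= 1     # -Y_uv + Y_vw + Y_wu <= 1
    box = -1 <= y <= 1
    return all_minus and one_minus and box

def lp_objective(edges, Y):
    """Objective 1/2 - 1/2 * (average of Y over the edges)."""
    avg = sum(Y[frozenset(e)] for e in edges) / len(edges)
    return Fraction(1, 2) - Fraction(1, 2) * avg

# Hypothetical test instance: the complete graph K_5.
n = 5
edges = list(combinations(range(n), 2))
y = Fraction(-1, 3)
Y = {frozenset(e): y for e in edges}

feasible = constant_solution_is_feasible(y)
value = lp_objective(edges, Y)
print(feasible, value)
```

Because the assignment is constant, the objective does not depend on which edges are present, which is why the 2/3 floor applies to every graph.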

There have been two main strategies employed for overcoming this deficiency: strengthened LPs, and eigenvalue methods.

##### Strengthened LPs.

One way to try to improve the performance of LPs on max-cut is to add more valid inequalities to the LP relaxation, beyond just the “triangle inequalities”. Innumerable valid inequalities have been considered: -gonal, hypermetric, negative type, gap, clique-web, suspended tree, as well as inequalities from the Lovász–Schrijver hierarchy; see Deza and Laurent [DL97, Ch. 28–30] for a review.

It is now known that the most principled and general form of this strategy is the Sherali–Adams LP hierarchy [SA90], reviewed in Section 2.4. At a high level, the Sherali–Adams LP hierarchy gives a standardized way to tighten LP relaxations of Boolean integer programs, by adding variables and constraints. The number of new variables/constraints is parameterized by a positive integer , called the number of “rounds”. Given a Boolean optimization problem with variables, the -round Sherali–Adams LP has variables and constraints corresponding to monomials of degree up to , and thus has size . A remarkable recent line of work [CLRS16, KMR17] has shown that for any CSP (such as max-cut), the -round Sherali–Adams LP relaxation achieves essentially the tightest integrality ratio among all LPs of its size. Nevertheless, even this most powerful of LPs arguably struggles to certify good bounds for max-cut. In a line of work [dlVM07, STT07] concluding in a result of Charikar–Makarychev–Makarychev [CMM09], it was demonstrated that for any constant , there are graphs (random -regular ones, ) for which the -round Sherali–Adams LP has a max-cut integrality gap of . As a consequence, every max-cut LP relaxation of size up to has such an integrality gap.

##### Eigenvalue and SDP methods.

But for max-cut, there is a simple, non-LP algorithm that works very well to certify that random graphs have maximum cut close to 1/2: the eigenvalue bound. There are two slight variants here (that coincide in the case of regular graphs): given a graph G = (V, E) with adjacency matrix A and diagonal degree matrix D, the eigenvalue bounds are

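$$
\begin{aligned}
\textsc{max-cut}(G) &\le \frac{|V|}{4|E|}\,\lambda_{\max}(D - A) && (1) \\
\textsc{max-cut}(G) &\le \frac12 + \frac12\,\lambda_{\max}(-D^{-1}A). && (2)
\end{aligned}
$$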

Here D − A and D^{-1}A are the Laplacian matrix and the random walk matrix, respectively. The use of eigenvalues to bound various cut values in graphs (problems like max-cut, min-bisection, 2-xor, expansion, etc.) has a long history dating back to Fiedler and Donath–Hoffman [Fie73, DH03] among others (Equation 1 is specifically from Mohar–Poljak [MP93]). It was recognized early on that eigenvalue methods work particularly well for solving planted-random instances (e.g., of 2-xor [Hås84] and min-bisection [Bop87]) and for certifying max-cut values near 1/2 for truly random instances. Indeed, as soon as one knows (as we now do [TY16, FO05]) that D^{-1}A has all nontrivial eigenvalues bounded in magnitude by O(1/√Δ) (whp) for a random Δ-regular graph (or an Erdős–Rényi graph with ), the eigenvalue bound Equation 2 certifies that max-cut(G) ≤ 1/2 + O(1/√Δ). This implies an integrality ratio tending to 1; indeed, max-cut(G) = 1/2 + Θ(1/√Δ) in such random graphs (whp).
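To make the eigenvalue bound of Equation 2 concrete, here is a minimal sketch that compares it with the exact maximum cut on a tiny graph. It assumes a regular graph (so that the random walk matrix is symmetric) and estimates the top eigenvalue by plain power iteration; the function names and the 5-cycle test instance are illustrative choices, not taken from the paper.

```python
import math
from itertools import product

def max_cut_fraction(n, edges):
    """Brute-force the maximum fraction of cut edges over all 2^n partitions."""
    best = 0
    for side in product([0, 1], repeat=n):
        cut = sum(1 for u, v in edges if side[u] != side[v])
        best = max(best, cut)
    return best / len(edges)

def top_eigenvalue_of_minus_K(n, edges, iters=500):
    """Estimate lambda_max(-D^{-1}A) by power iteration on the shifted matrix I - D^{-1}A.

    Assumes a regular graph, so D^{-1}A is symmetric and I - D^{-1}A is PSD.
    """
    nbrs = [[] for _ in range(n)]
    for u, v in edges:
        nbrs[u].append(v)
        nbrs[v].append(u)
    deg = [len(nbrs[u]) for u in range(n)]
    x = [float(i + 1) for i in range(n)]   # arbitrary non-symmetric start vector
    lam = 0.0
    for _ in range(iters):
        # y = (I - D^{-1}A) x
        y = [x[u] - sum(x[v] for v in nbrs[u]) / deg[u] for u in range(n)]
        lam = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)  # Rayleigh quotient
        norm = math.sqrt(sum(t * t for t in y))
        x = [t / norm for t in y]
    return lam - 1.0  # undo the shift by the identity

# Illustrative instance: the 5-cycle, whose maximum cut uses 4 of its 5 edges.
n = 5
edges = [(i, (i + 1) % n) for i in range(n)]
exact = max_cut_fraction(n, edges)
bound = 0.5 + 0.5 * top_eigenvalue_of_minus_K(n, edges)
print(exact, bound)
```

On the 5-cycle the bound is valid but not tight; the point of the surrounding discussion is that for random graphs of large degree the same bound approaches 1/2.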

Furthermore, if one extends the eigenvalue bound Equation 1 above to

$$
\textsc{max-cut}(G) \le \min_{\substack{U \text{ diagonal} \\ \operatorname{tr}(U) = 0}} \frac{|V|}{4|E|}\,\lambda_{\max}(D - A + U)
$$

(as suggested by Delorme and Poljak [DP93], following [DH03, Bop87]), one obtains the polynomial-time computable semidefinite programming (SDP) bound. Goemans and Williamson [GW95] showed this bound has integrality ratio less than for worst-case , and it was subsequently shown [Zwi99, FL01, CW04] that the SDP bound is whenever .

##### LPs cannot compete with eigenvalues/SDPs?

This seemingly striking separation between the performance of LPs and SDPs in the context of random max-cut instances is now taken as a matter of course. To quote, e.g., [Tre09],

[E]xcept for semidefinite programming, we know of no technique that can provide, for every graph of max cut optimum , a certificate that its optimum is . Indeed, the results of [dlVM07, STT07][[CMM09]] show that large classes of Linear Programming relaxations of max cut are unable to distinguish such instances.

Specifically, the last statement here is true for -regular random graphs when is a certain large constant. The conventional wisdom is that for such graphs, linear programs cannot compete with semidefinite programs, and cannot certify even the eigenvalue bound.

Our main result challenges this conception.

### 1.1 Our results

We show that whenever the eigenvalue bound Equation 2 certifies the bound , then -round Sherali–Adams can certify this as well.111Actually, there is a slight mismatch between our result and Equation 2: in Footnote 1 we need the maximum eigenvalue in magnitude to be small; i.e., we need to be not too negative. This may well just be an artifact of our proof. [Simplified version of Section 3 and Section 3] Let be a simple -vertex graph and assume that for all eigenvalues of ’s random walk matrix (excluding the trivial eigenvalue of ). Then for any , Sherali–Adams with rounds certifies that .

For example, if ’s random walk matrix has its nontrivial eigenvalues bounded in magnitude by , as is the case (whp) for random graphs with about edges, then Sherali–Adams can certify with constantly many rounds. We find this result surprising, and in defiance of the common belief that polynomial-sized LPs cannot take advantage of spectral properties of the underlying graph.222We should emphasize that it is not concomitant running time that is surprising, since the eigenvalue bound itself already achieves that.

(As an aside, in Section 4, we show that the -round Sherali–Adams relaxation for max-cut has value at least for every graph . This at least demonstrates that some dependence of the refutation strength on the number of Sherali–Adams rounds is always necessary.)

One might ask whether Footnote 1 even requires the assumption of small eigenvalues. That is, perhaps -round Sherali–Adams can certify whenever this is true. As far as we know, this may be possible; after all, the basic SDP relaxation has this property [Zwi99, FL01, CW04]. On the other hand, the eigenvalue bound itself does not have this property; there exist graphs with large (nontrivial) eigenvalues even though the maximum cut is close to .333Consider, for example, a graph given by the union of a -regular random graph on vertices and a -regular bipartite graph on vertices. This will have max-cut value close to , but will also have large negative eigenvalues coming from the bipartite component.

#### 1.1.1 Subexponential-sized LPs for max-cut in sparse random graphs

One setting in which the spectral radius is understood concretely is in random regular graphs. Building upon [FKS89, BFSU99, CGJ18], the following was recently shown: [[TY16]] There is a fixed constant such that for all with even, it holds that a uniformly random -vertex -regular simple graph satisfies the following with high probability: all eigenvalues of ’s normalized adjacency matrix, other than , are at most in magnitude. Combining the above with Footnote 1, we have the following consequence for max-cut on random regular graphs: Let , , and be positive integers. Then if is a random -regular -vertex graph, with high probability -round Sherali–Adams can certify that .

For example, if (for the constant in the bound on ), then -rounds of Sherali–Adams can certify . This result serves as a partial converse to [CMM09]: ([CMM09, Theorem 5.3]) For every fixed integer , with high probability over the choice of an -vertex -regular random graph ,444In [CMM09], the graph is actually a pruned random graph, in which edges are removed; this does not affect compatibility with our results, as the LP value is Lipschitz and so the pruning changes the LP value by . the -round Sherali–Adams relaxation for max-cut has value at least , where is a function that grows with .555Though is not specified in [CMM09] (and in their proof is dictated by a combination of lemmas in prior works) it appears we can take .

While [CMM09] show that -regular random graphs require Sherali–Adams (and by [KMR17] any LP) relaxations of at least subexponential size, our result implies that subexponential LPs are sufficient. Further, though the function is not specified in [CMM09], by tracing back through citations (e.g. [ABLT06, ALN12, CMM10]) to extract a dependence, it appears we may take . So our upper bound is tight as a function of , up to constant factors.

Prior to our result, it was unclear whether even -round Sherali–Adams could certify that the max-cut value was for sparse random regular graphs. Indeed, it was equally if not more conceivable that Charikar et al.’s result was not tight, and could be extended to -rounds. In light of our result, we are left to wonder whether there are instances of max-cut which have truly exponential extension complexity.

#### 1.1.2 Refuting Random CSPs with linear programs

With minor modifications, our argument extends as well to 2-xor. Then following the framework in [AOW15], we have the following consequence for certifying bounds on the value of random -CSPs: [Simplified version of Section 5] Suppose that is a -ary Boolean predicate, and that . Let be the probability that a random satisfies . Then for a random instance of on variables with expected clauses, with high probability Sherali–Adams can certify that using rounds.

This almost matches the comparable performance of sum-of-squares and spectral algorithms [AOW15], which are known to require clauses to certify comparable bounds in polynomial time [Gri01, Sch08, KMOW17]. (The expert may notice that we require the number of clauses , whereas the best sum-of-squares and spectral algorithms require only . This is because we do not know how to relate the Sherali–Adams value of the objective function to its square (local versions of the Cauchy–Schwarz argument result in a loss). Such a relation would allow us to apply our techniques immediately to prove that Sherali–Adams matches the SOS and spectral performance for odd as well as even k.) Prior to our work it was known that Sherali–Adams admits weak refutations (i.e. a certificate that ) when , but it was conceivable (and even conjectured) that -rounds could not certify for constant at .

The result above also extends to -wise independent predicates as in [AOW15] (see Section 5). Also, one may extract the dependence on the parameters to give nontrivial results when these parameters depend on n. (Though for 2-xor and max-cut we have done this explicitly, for higher-arity random CSPs we have left this for the interested reader.)

### 1.2 Prior work

It is a folklore result that in random graphs with average degree , -round Sherali–Adams (SA) certifies a max-cut value of at most (observed for the special case of in [BHHS11, PT94]); this is simply because of concentration phenomena, since most edges participate in roughly the same number of odd cycles of length , after which one can apply the triangle inequality. However this observation does not allow one to take the refutation strength independent of the average degree.

There are some prior works examining the performance of Sherali–Adams hierarchies on random CSPs. The work of de la Vega and Mathieu [dlVM07] shows that in dense graphs, with average degree , -rounds of Sherali–Adams certifies tight bounds on max-cut. This result heavily leverages the density of the graph, and is consistent with the fact that dense max-cut is known to admit a PTAS [FK96]. However, their work does not give -round approximations better than for graphs with edges.

Another relevant line of work is a series of LP hierarchy lower bounds (both for Sherali–Adams and for the weaker Lovász-Schrijver hierarchy) for problems such as max-cut, vertex cover, and sparsest cut, including [AAT11, ABLT06, dlVM07, STT07], and culminating in the already mentioned result of Charikar, Makarychev and Makarychev; in [CMM09], they give subexponential lower bounds on the number of rounds of Sherali–Adams required to strongly refute max-cut in random regular graphs. Initially, one might expect that this result could be strengthened to prove that sparse random graphs require almost-exponential-sized LPs to refute max-cut; our result demonstrates instead that [CMM09] is almost tight.

We also mention the technique of global correlation rounding in the sum-of-squares hierarchy, which was used to give subexponential time algorithms for unique games [BRS11] and polynomial-time approximations to max-bisection [RT12]. One philosophical similarity between these algorithms and ours is that both relate local properties (correlation among edges) to global properties (correlation of uniformly random pairs). But [BRS11, RT12] use the fact that the relaxation is an SDP (whereas our result is interesting because it is in the LP-only setting), and the “conditioning” steps that drive their algorithm are a fundamentally different approach.

There are many prior works concerned with certifying bounds on random CSPs, and we survey only some of them here, referring the interested reader to the discussion in [AOW15]. The sequence of works [Gri01, Sch08, KMOW17] establishes sum-of-squares lower bounds for refuting any random constraint satisfaction problem, and these results are tight via the SOS algorithms of [AOW15, RRS17]. The upshot is that for -sat and -xor,888This is more generally true for any CSP that supports a -wise independent distribution over satisfying assignments. rounds of SOS are necessary to strongly refute an instance with clauses, and rounds of SOS suffice when . Because SOS is a tighter relaxation than Sherali–Adams, the lower bounds [Gri01, Sch08, KMOW17] apply; our work can be seen to demonstrate that SA does not lag far behind SOS, strongly refuting with rounds as soon as for any .

In a way, our result is part of a trend in anti-separation results for SDPs and simpler methods for pseudorandom and structured instances. For example, we have for planted clique that the SOS hierarchy performs no better than the Lovász–Schrijver+ hierarchy [FK03, BHK16], and also no better than a more primitive class of estimation methods based on local statistics (see e.g. for a discussion). Similar results hold for problems relating to estimating the norms of random tensors [HKP17]. Further, in [HKP17] an equivalence is shown between SOS and spectral algorithms for a large class of average-case problems. Our result shows that for random CSPs, the guarantees of linear programs are surprisingly not far from the guarantees of SOS.

Finally, we mention related works in extended formulations. The sequence of works [CLRS16, KMR17] show that SA lower bounds for CSPs imply lower bounds for any LP relaxation; the stronger (and later) statement is due to [KMR17], who show that subexponential-round integrality gaps for CSPs in the Sherali–Adams hierarchy imply subexponential-size lower bounds for any LP. These works are then applied in conjunction with [Gri01, Sch08, CMM09] to give subexponential lower bounds against CSPs for any LP; our results give an upper limit to the mileage one can get from these lower bounds in the case of max-cut, as we show that the specific construction of [CMM09] cannot be strengthened much further.

### 1.3 Techniques

Our primary insight is that while Sherali–Adams is unable to reason about spectral properties globally, it does enforce that every set of variables behave locally according to the marginals of a valid distribution, which induces local spectral constraints on every subset of up to variables.

At first, it is unclear how one harnesses such local spectral constraints. But now, suppose that we are in a graph with a small spectral radius. This implies that random walks mix rapidly, in say steps, to a close-to-uniform distribution. Because a typical pair of vertices at distance is distributed roughly as a uniformly random pair of vertices, any subset of vertices which contains a path of length already allows us to relate global and local graph properties.

To see why this helps, we take for a moment the “pseudoexpectation” view, in which we think of the -round Sherali–Adams as providing a proxy for the degree- moments of a distribution over max-cut solutions , with max-cut value

$$
\textsc{max-cut}(G) = \frac12 - \frac12 \mathop{\mathbb{E}}_{(u,v) \in E(G)} \widetilde{\mathbb{E}}[x_u x_v], \tag{3}
$$

where Ẽ[x_u x_v] is the “pseudo-correlation” of variables x_u, x_v. Because there is no globally consistent assignment, the pseudo-correlation for vertices sampled uniformly at random will be close to 0. (This is implicit in our proof, but intuitively it should be true because e.g. should be connected by equally many even- and odd-length paths.) But in any fixed subgraph of size , enforcing for pairs at distance has consequences, and limits the magnitude of correlation between pairs of adjacent vertices as well. In particular, because the pseudo-second moment matrix for the restriction of to a set of up to vertices must be PSD, forcing some entries to 0 gives a constraint on the magnitude of edge correlations.

For example, suppose for a moment that we are in a graph with , and that S is a star graph in G, given by one “root” vertex r with k children U = {u_1, …, u_k}, and call X the pseudo-second moment matrix restricted to S. Notice that pairs of distinct children are at distance 2 in S. If we then require Ẽ[x_u x_v] = 0 for every pair of distinct children u, v ∈ U, the only nonzero entries of X are the diagonals (which are all 1), and the entries corresponding to edges from the root to its children, which are Ẽ[x_r x_u] for u ∈ U. Now defining the vector c with a 1 at the root r and α on each child u ∈ U, we have from the PSDness of X that

$$
0 \le c^\top X c = \|c\|_2^2 + \sum_{u \in U} 2 c_r c_u \cdot \widetilde{\mathbb{E}}[x_r x_u] = (1 + \alpha^2 k) + 2 \alpha k \mathop{\mathbb{E}}_{(u,v) \in E(S)} \widetilde{\mathbb{E}}[x_u x_v].
$$

Choosing α = 1/√k, this implies that within S, the average edge correlation is lower bounded by −1/√k. Of course, for a given star we cannot know that Ẽ[x_u x_v] = 0 for distinct children, but if we take a well-chosen weighted average over all stars, this will (approximately) hold on average.
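The optimization over the weight α in the star example can be checked numerically. The sketch below rearranges the displayed inequality 0 ≤ (1 + α²k) + 2αk · avg into a lower bound on the average root-to-child pseudo-correlation and scans a grid of α values; the function name, the value k = 16, and the grid are illustrative assumptions.

```python
import math

def star_edge_correlation_lower_bound(k, alpha):
    """Lower bound on the average root-child pseudo-correlation implied by 0 <= c^T X c.

    Rearranging 0 <= (1 + alpha^2 k) + 2 alpha k * avg gives
    avg >= -(1 + alpha^2 k) / (2 alpha k).
    """
    return -(1 + alpha * alpha * k) / (2 * alpha * k)

k = 16
alphas = [0.05 * i for i in range(1, 41)]   # grid of candidate weights in (0, 2]
# The strongest (largest) lower bound over the grid.
best_alpha = max(alphas, key=lambda a: star_edge_correlation_lower_bound(k, a))
best_bound = star_edge_correlation_lower_bound(k, 1 / math.sqrt(k))
print(best_alpha, best_bound)
```

For k = 16 the grid maximizer sits at α = 1/√k = 0.25, and the resulting bound is −1/√k, so larger stars force the average edge correlation closer to zero.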

Our strategy is to take a carefully-chosen average over specific subgraphs of with . By our choice of distribution and subgraph, the fact that the subgraphs locally have PSD pseudocorrelation matrices has consequences for the global average pseudocorrelation across edges, which in turn gives a bound on the objective value eq. 3. This allows us to show that Sherali–Adams certifies much better bounds than we previously thought possible, by aggregating local spectral information across many small subgraphs.

### Organization

We begin with technical preliminaries in Section 2. In Section 3 we prove our main result. Section 4 establishes a mild lower bound for arbitrary graphs. Finally, Section 5 applies our main theorem to the refutation of arbitrary Boolean CSPs.

## 2 Setup and preliminaries

We begin by recalling preliminaries and introducing definitions that we will rely upon later.

### 2.1 Random walks on undirected graphs

Here, we recall some properties of random walks in undirected graphs that will be of use to us. Let G = (V, E) be an undirected finite graph, with parallel edges and self-loops allowed (self-loops count as “half an edge”, and contribute 1 to a vertex’s degree), and with no isolated vertices. The standard random walk on G is the Markov chain on V in which at each step one follows a uniformly random edge out of the current vertex. For u ∈ V, we use the notation v ∼ u to denote that v is the result of taking one random step from u. We write K for the transition operator of the standard random walk on G. That is, K is obtained from the adjacency matrix of G by normalizing the u-th row by a factor of 1/deg(u). We write π for the probability distribution on V defined by π(u) = deg(u)/(2|E|). As is well known, this is an invariant distribution for the standard random walk on G, and this Markov chain is reversible with respect to π. For u ∼ π and v ∼ u, the distribution of (u, v) is that of a uniformly random (directed) edge from E. We will also use the notation π_* = min_{u ∈ V} π(u). For f, g : V → ℝ we use the notation ⟨f, g⟩_π for E_{u∼π}[f(u) g(u)]. This is an inner product on the vector space ℝ^V; in case G is regular and hence π is the uniform distribution, it is the usual inner product scaled by a factor of 1/|V|. It holds that

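$$
\langle f, Kg \rangle_\pi = \langle Kf, g \rangle_\pi = \mathop{\mathbb{E}}_{(\boldsymbol{u}, \boldsymbol{v}) \sim E}[f(\boldsymbol{u})\, g(\boldsymbol{v})]. \tag{4}
$$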

A stationary d-step walk is defined to be a sequence (u_0, u_1, …, u_d) formed by choosing an initial vertex u_0 ∼ π, and then taking a standard random walk, with u_t ∼ u_{t−1}. Generalizing Equation 4, it holds in this case that

$$
\mathbb{E}[f(\boldsymbol{u}_0)\, g(\boldsymbol{u}_d)] = \langle f, K^d g \rangle_\pi.
$$
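The identities in this subsection can be verified exactly on a small example. The following sketch builds the transition operator and the stationary distribution for a small irregular graph, then checks both reversibility and the d-step identity by enumerating every walk with exact rational arithmetic. The graph, the test functions f and g, and all function names are hypothetical choices made for illustration.

```python
from fractions import Fraction
from itertools import product

edges = [(0, 1), (1, 2), (2, 0), (2, 3)]     # small irregular example graph
n = 4
nbrs = [[] for _ in range(n)]
for u, v in edges:
    nbrs[u].append(v)
    nbrs[v].append(u)
deg = [len(nbrs[u]) for u in range(n)]
pi = [Fraction(deg[u], 2 * len(edges)) for u in range(n)]   # pi(u) = deg(u) / (2|E|)

def apply_K(g):
    """(Kg)(u) = average of g over the neighbours of u."""
    return [sum(Fraction(g[v]) for v in nbrs[u]) / deg[u] for u in range(n)]

def inner(f, g):
    """<f, g>_pi = E_{u ~ pi}[f(u) g(u)]."""
    return sum(pi[u] * f[u] * g[u] for u in range(n))

def walk_expectation(f, g, d):
    """E[f(u_0) g(u_d)], computed by enumerating every d-step walk with its probability."""
    total = Fraction(0)
    for walk in product(range(n), repeat=d + 1):
        prob = pi[walk[0]]
        for a, b in zip(walk, walk[1:]):
            prob *= Fraction(nbrs[a].count(b), deg[a])   # zero if (a, b) is not an edge
        total += prob * f[walk[0]] * g[walk[-1]]
    return total

f = [1, -2, 3, 5]
g = [2, 0, -1, 4]
d = 3
Kdg = g
for _ in range(d):
    Kdg = apply_K(Kdg)

lhs = walk_expectation(f, g, d)
rhs = inner(f, Kdg)
reversible = inner(f, apply_K(g)) == inner(apply_K(f), g)
print(lhs == rhs, reversible)
```

Using `Fraction` keeps the comparison exact, so the equalities hold literally and not just up to floating-point error.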

### 2.2 Tree-indexed random walks

To prove our main theorem we define a class of homomorphisms we call tree-indexed random walks.

Suppose we have a finite undirected tree with vertex set . A stationary -indexed random walk in  is a random homomorphism defined as follows: First, root the tree at an arbitrary vertex . Next, define . Then, independently for each “child” of  in the tree, define ; that is, define to be the result of taking a random walk step from . Recursively repeat this process for all children of ’s children, etc., until each vertex has been assigned a vertex . We note that the homomorphism defining the -indexed random walk need not be injective. Consequently, if is a tree with maximum degree , we can still have a -indexed random walk in a -regular graph with .

The following fact is simple to prove; see, e.g., [LP17]. The definition of does not depend on the initially selected root . Further, for any two vertices at tree-distance , if is the unique path in the tree between them, then the sequence is distributed as a stationary -step walk in .

### 2.3 2XOR and signed random walks

The 2-xor constraint satisfaction problem is defined by instances of linear equations in . For us it will be convenient to associate with these instances a graph with signed edges, and on such graphs we perform a slightly modified random walk.

We assume that for each vertex pair where has edge, there is an associated sign .111111If has multiple edges, we think of them as all having the same sign. We arrange these signs into a symmetric matrix . If has no edge then the entry will not matter; we can take it to be . We write for the signed transition operator. The operator is self-adjoint with respect to , and hence has real eigenvalues. It also holds that

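$$
\langle f, \overline{K} g \rangle_\pi = \langle \overline{K} f, g \rangle_\pi = \mathop{\mathbb{E}}_{(\boldsymbol{u}, \boldsymbol{v}) \sim E}[\xi_{\boldsymbol{u}\boldsymbol{v}}\, f(\boldsymbol{u})\, g(\boldsymbol{v})]. \tag{5}
$$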

We may think of and as defining a 2-xor constraint satisfaction problem (CSP), in which the task is to find a labeling so as to maximize the fraction of edges for which the constraint is satisfied. The fraction of satisfied constraints is

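$$
\mathop{\mathbb{E}}_{(\boldsymbol{u}, \boldsymbol{v}) \sim E}\left[\tfrac12 + \tfrac12\, \xi_{\boldsymbol{u}\boldsymbol{v}}\, f(\boldsymbol{u})\, f(\boldsymbol{v})\right] = \tfrac12 + \tfrac12 \langle f, \overline{K} f \rangle_\pi. \tag{6}
$$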
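Equation 6 can be confirmed by direct enumeration on a small signed graph. In the sketch below, a labeling f takes values in {−1, +1}, an edge constraint counts as satisfied when f(u) f(v) equals the edge sign, and the signed operator averages ξ_uv f(v) over the neighbours of u. The instance and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Hypothetical signed graph: each entry is (u, v, sign).
signed_edges = [(0, 1, +1), (1, 2, -1), (2, 0, -1), (2, 3, +1)]
n = 4
m = len(signed_edges)
deg = [0] * n
for u, v, _ in signed_edges:
    deg[u] += 1
    deg[v] += 1
pi = [Fraction(deg[u], 2 * m) for u in range(n)]

def satisfied_fraction(f):
    """Fraction of edges whose constraint f(u) f(v) = xi_uv holds."""
    return Fraction(sum(1 for u, v, s in signed_edges if f[u] * f[v] == s), m)

def signed_form(f):
    """<f, Kbar f>_pi, where (Kbar f)(u) averages xi_uv f(v) over the neighbours v of u."""
    Kf = [Fraction(0)] * n
    for u, v, s in signed_edges:
        Kf[u] += Fraction(s * f[v], deg[u])
        Kf[v] += Fraction(s * f[u], deg[v])
    return sum(pi[u] * f[u] * Kf[u] for u in range(n))

labelings = list(product([-1, 1], repeat=n))
all_match = all(
    satisfied_fraction(f) == Fraction(1, 2) + Fraction(1, 2) * signed_form(f)
    for f in labelings
)
best = max(satisfied_fraction(f) for f in labelings)
print(all_match, best)
```

Setting every sign to −1 in `signed_edges` turns the same computation into the max-cut objective.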

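We will typically ignore the 1/2’s and think of the 2-xor CSP as maximizing the quadratic form ⟨f, K̄f⟩_π. When all signs are −1, we refer to this as the max-cut CSP. We say that a signed stationary d-step walk is a sequence of pairs (u_t, σ_t) ∈ V × {±1} for 0 ≤ t ≤ d, chosen as follows: first, we choose a stationary d-step walk (u_0, …, u_d) in G; second, we choose σ_0 ∈ {±1} uniformly at random; finally, we define σ_t = σ_{t−1} ξ_{u_{t−1} u_t}. Generalizing Equation 5, it holds in this case that

$$
\mathbb{E}[\boldsymbol{\sigma}_0 f(\boldsymbol{u}_0)\, \boldsymbol{\sigma}_d\, g(\boldsymbol{u}_d)] = \langle f, \overline{K}^d g \rangle_\pi.
$$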

We extend the notion from Section 2.2 to that of a signed stationary -indexed random walk in . Together with the random homomorphism , we also choose a random signing as follows: for the root , the sign is chosen uniformly at random; then, all other signs are deterministically chosen — for each of we set , and in general where is the parent of . Again, it is not hard to show that the definition of does not depend on the choice of root , and that for any path of vertices in the tree, the distribution of is that of a signed stationary -step walk in .

### 2.4 Proof systems

Our central object of study is the Sherali–Adams proof system, but our results also apply to a weaker proof system which we introduce below. We define the -local, degree- (static) sum of squares (SOS) proof system over indeterminates as follows. The “lines” of the proof are real polynomial inequalities in . The “default axioms” are any real inequalities of the form , where is a polynomial in at most variables and of degree at most . The “deduction rules” allow one to derive any nonnegative linear combination of previous lines. This is a sound proof system for inequalities about real numbers .

In addition to the default axioms, one may also sometimes include problem-specific “equalities” of the form . In this case, one is allowed additional axioms of the form the polynomial depends on at most indeterminates and has degree at most . The case of (equivalently, ) corresponds to the well-known degree- SOS proof system. Suppose one includes the Boolean equalities, meaning for all .121212Or alternatively, for all . In this case is equivalent to , and the corresponding proof system is the well-known -round Sherali–Adams proof system. It is well known that every inequality that is true for -valued is derivable in this system. There is a -time algorithm based on Linear Programming for determining whether a given polynomial inequality of degree at most  (and rational coefficients of total bit-complexity ) is derivable in the -round Sherali–Adams proof system. We will often be concerned with the -local, degree- SOS proof system, where all lines are quadratic inequalities. In this case, we could equivalently state that the default axioms are all those inequalities of the form

 x⊤Px≥0, (7)

where is a length- subvector of , and is an positive semidefinite (PSD) matrix. In fact, we will often be concerned with the -round, degree- Sherali–Adams proof system. Despite the restriction to , we only know the -time algorithm for deciding derivability of a given quadratic polynomial (of bit-complexity ).

## 3 2XOR certifications from spider walks

In this section, we prove our main theorem: given a 2-xor or max-cut instance on a graph with small spectral radius, we will show that the -local degree- SOS proof system gives nontrivial refutations with not too large.

Our strategy is as follows: we select a specific tree of size , and we consider the distribution over copies of in our graph given by the -indexed stationary random walk. We will use this distribution to define the coefficients for a degree-, -local proof that bounds the objective value of the CSP. We will do this by exploiting the uniformity of the graph guaranteed by the small spectral radius, and the fact that degree- -local SOS proofs can certify positivity of quadratic forms , where is the restriction of to a set of variables with and .

Intuitively, in the “pseudoexpectation” view, the idea of our proof is as follows. When there is no globally consistent assignment, a uniformly random pair of vertices will have pseudocorrelation close to zero. On the other hand, if -step random walks mix to a roughly uniform distribution over vertices in the graph, then pairs of vertices at distance will also have pseudocorrelation close to zero. But also, in our proof system the degree- pseudomoments of up to variables obey a positive-semidefiniteness constraint. By choosing the tree with diameter at least , while also choosing to propagate the effect of the low-pseudocorrelation at the diameter to give low-pseudocorrelation on signed edges, we show that the proof system can certify that the objective value is small. Specifically, we will choose to be a spider graph:

For integers , we define a -spider graph to be the tree formed by gluing together paths of length  at a common endpoint called the root. This spider has vertices and diameter . While we were not able to formally prove that the spider is the optimal choice of tree, intuitively, we want to choose a tree that maximizes the ratio of the number of pairs at maximum distance (since such pairs relate the local properties to the global structure) to the number of vertices in the tree (because we need to take our number of rounds to be at least the size of the tree). Among trees, the spider is the graph that maximizes this ratio.
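The spider's basic parameters are easy to confirm computationally. The sketch below builds a spider with k legs of length ℓ, then uses breadth-first search to compute its number of vertices, its diameter, and the number of vertex pairs at maximum distance (the leaf-to-leaf pairs on different legs). Vertex numbering and function names are illustrative assumptions.

```python
from collections import deque

def build_spider(k, ell):
    """Adjacency lists of the spider: k paths of length ell glued at vertex 0 (the root)."""
    adj = {0: []}
    next_id = 1
    for _ in range(k):
        prev = 0
        for _ in range(ell):
            adj[next_id] = [prev]
            adj[prev].append(next_id)
            prev = next_id
            next_id += 1
    return adj

def distances_from(adj, src):
    """Breadth-first-search distances from src to every vertex."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def pairs_at_distance(adj, d):
    """Number of unordered vertex pairs at tree-distance exactly d."""
    count = 0
    for u in adj:
        dist = distances_from(adj, u)
        count += sum(1 for v in adj if v > u and dist[v] == d)
    return count

k, ell = 4, 3
spider = build_spider(k, ell)
num_vertices = len(spider)
diameter = max(max(distances_from(spider, u).values()) for u in spider)
far_pairs = pairs_at_distance(spider, 2 * ell)
print(num_vertices, diameter, far_pairs)
```

With k legs of length ℓ this gives kℓ + 1 vertices, diameter 2ℓ, and k(k − 1)/2 pairs at maximum distance, which is the ratio the paragraph above is concerned with.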

Let us henceforth fix a (k, ℓ)-spider graph, where the parameters k and ℓ will be chosen later. We write S for the vertex set of this tree (and sometimes identify S with the tree itself). For 0 ≤ d ≤ 2ℓ, we define the matrix A^{(d)} ∈ ℝ^{S×S} to be the “distance-d” adjacency matrix of the spider; i.e., A^{(d)}_{ij} is 1 if dist_S(i, j) = d and is 0 otherwise. (We remark that A^{(0)} is the identity matrix.) The following key technical theorem establishes the existence of a matrix Ψ which will allow us to define the coefficients in our -local degree- SOS proof. It will be proven in Section 3.2: For any parameter , there is a PSD matrix Ψ ∈ ℝ^{S×S} with the following properties:

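$$
\begin{aligned}
\langle \Psi, A^{(0)} \rangle &= 1 + \frac{1}{2k}\,\alpha^{2\ell} + \frac{1}{k-1} \cdot \frac{\alpha^{2\ell} - \alpha^2}{\alpha^2 - 1}, \\
\langle \Psi, A^{(1)} \rangle &= \alpha, \\
\langle \Psi, A^{(d)} \rangle &= 0 \quad \text{for } 1 < d < 2\ell.
\end{aligned}
$$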

Here we are using the notation for the “matrix (Frobenius) inner product” . Assuming that and taking , the PSD matrix satisfies

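$$
\tfrac32 \le \langle \Psi, A^{(0)} \rangle \le 2, \qquad \langle \Psi, A^{(1)} \rangle = k^{1/2\ell}, \qquad \langle \Psi, A^{(d)} \rangle = 0 \ \text{ for } 1 < d < 2\ell, \qquad \langle \Psi, A^{(2\ell)} \rangle = \tfrac12 (k - 1).
$$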

We will also use the following small technical lemma: Let and recall . Then the -local, degree- SOS proof system can derive

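$$
\mathop{\mathbb{E}}_{\boldsymbol{u} \sim \pi} \sum_{v \in V} M_{\boldsymbol{u}v} X_{\boldsymbol{u}} X_v \le \pi_*^{-1/2}\, \|M\|_2 \mathop{\mathbb{E}}_{\boldsymbol{u} \sim \pi} X_{\boldsymbol{u}}^2.
$$

###### Proof.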

The proof system can derive the following inequality for any γ > 0, since the difference of the two sides is a perfect square:

$$
M_{uv} X_u X_v \le \frac{M_{uv}^2}{2 \gamma \pi(v)} X_u^2 + \frac{\gamma \pi(v)}{2} X_v^2.
$$

Thus it can derive

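$$
\mathop{\mathbb{E}}_{\boldsymbol{u} \sim \pi} \sum_{v \in V} M_{\boldsymbol{u}v} X_{\boldsymbol{u}} X_v \le \mathop{\mathbb{E}}_{\boldsymbol{u} \sim \pi} X_{\boldsymbol{u}}^2 \sum_{v \in V} \frac{M_{\boldsymbol{u}v}^2}{2 \gamma \pi(v)} + \frac{\gamma}{2} \mathop{\mathbb{E}}_{\boldsymbol{v} \sim \pi} X_{\boldsymbol{v}}^2. \tag{8}
$$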

We’ll take . Since we can certainly derive whenever , we see that it suffices to establish

$$
\sum_{v \in V} \frac{M_{\boldsymbol{u}v}^2}{2 \gamma \pi(v)} \le \frac{\gamma}{2}
$$

for every outcome of . But this is implied by for all , which is indeed true. ∎
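For readers who want the perfect square from the first step of the proof spelled out, the following routine expansion (valid for any γ > 0 and any v with π(v) > 0) is offered as a reading aid and is not taken verbatim from the source:

```latex
\[
\frac{M_{uv}^2}{2\gamma\pi(v)}\,X_u^2 + \frac{\gamma\pi(v)}{2}\,X_v^2 - M_{uv}X_uX_v
  \;=\; \frac12\left(\frac{M_{uv}}{\sqrt{\gamma\pi(v)}}\,X_u - \sqrt{\gamma\pi(v)}\,X_v\right)^{2}
  \;\ge\; 0 .
\]
```

Since the right-hand side is a square of a linear form in only the two indeterminates X_u and X_v, it is an instance of the quadratic default axioms of the form given in Equation 7.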

We can now prove the following main theorem: Given parameters , let and define

$$
\beta = \frac{k\, \pi_*^{-1/2}}{2 k^{1/2\ell}}\, \rho(\overline{K})^{2\ell} + \frac{2}{k^{1/2\ell}}.
$$

Then -local, degree- SOS can deduce the bound “”; more precisely, it can deduce the two inequalities

$$
-\beta \langle X, X \rangle_\pi \le \langle X, \overline{K} X \rangle_\pi \le \beta \langle X, X \rangle_\pi.
$$
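To get a feel for how the two terms of β trade off, here is a small numeric sketch. It assumes that β has the form k · π_*^{−1/2} · ρ^{2ℓ} / (2 k^{1/(2ℓ)}) + 2 / k^{1/(2ℓ)}, that π_* denotes the smallest stationary probability (so π_* = 1/n for a regular n-vertex graph), and it uses purely hypothetical parameter values; the function name `beta` is also an illustrative choice.

```python
def beta(k, ell, rho, pi_star):
    """Assumed form: k * pi_star^{-1/2} * rho^{2 ell} / (2 k^{1/(2 ell)}) + 2 / k^{1/(2 ell)}."""
    root = k ** (1.0 / (2 * ell))            # k^{1/(2 ell)}
    mixing_term = k * pi_star ** -0.5 * rho ** (2 * ell) / (2 * root)
    spider_term = 2 / root
    return mixing_term + spider_term

# Hypothetical regime: a regular graph on n vertices with spectral radius n^{-1/4}.
n = 10 ** 12
pi_star = 1.0 / n
rho = n ** -0.25
k, ell = 10 ** 4, 2

b = beta(k, ell, rho, pi_star)
b_worse_expansion = beta(k, ell, 10 * rho, pi_star)
print(b, b_worse_expansion)
```

In this regime the second term, 2 / k^{1/(2ℓ)}, dominates, so the bound improves mainly by enlarging the spider, while a larger spectral radius makes the first term blow up quickly.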

Before proving this theorem, let us simplify the parameters. For any , we can choose to be the smallest integer so that , and . This gives the corollary:

Suppose we have a graph with signed transition operator and . Given , take , and take . Then for , it holds that -local degree- SOS can deduce the bound . In particular, if we think of as a 2-xor CSP, it holds that -round Sherali–Adams can deduce the bound .

###### Proof.

Taking the parameters as above, and using that the constraints imply that -round Sherali–Adams can deduce that whenever , and that as noted in eq. 6, , so Section 3 gives the result. ∎

Section 3 implies the 2-xor version of Footnote 1 since in simple graphs, .

###### Proof of Section 3.

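For our $(k,\ell)$-spider graph on $S$, let $(\boldsymbol{\phi}, \boldsymbol{\sigma})$ be a signed stationary $S$-indexed random walk in $G$. Define $\overline{x}$ to be the $S$-indexed vector with $\overline{x}_i = \boldsymbol{\sigma}(i) X_{\boldsymbol{\phi}(i)}$. Then letting $\Psi$ be the PSD matrix from Section 3, the $|S|$-local, degree-$2$ SOS proof system can derive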

$$\langle \Psi, \overline{x}\,\overline{x}^\top \rangle = \overline{x}^\top \Psi \overline{x} \geq 0.$$

(This is in the form of Equation 7 if we take .) Furthermore, the proof system can deduce this inequality in expectation; namely,

$$\langle \Psi, Y \rangle \geq 0, \quad \text{where } Y = \mathbf{E}[\overline{x}\,\overline{x}^\top]. \tag{9}$$

Now by the discussion in Sections 2.3 and 6,

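$$Y_{ij} = \mathbf{E}\bigl[\boldsymbol{\sigma}(i) X_{\boldsymbol{\phi}(i)} \boldsymbol{\sigma}(j) X_{\boldsymbol{\phi}(j)}\bigr] = \langle X, \overline{K}^{\mathrm{dist}_S(i,j)} X \rangle_\pi. \tag{10}$$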

Thus recalling the notation $A^{(d)}$ from Section 3,

$$Y = \sum_{d=0}^{2\ell} \langle X, \overline{K}^{d} X \rangle_\pi \, A^{(d)}, \tag{11}$$

and hence from Equation 9 we get that $|S|$-local, degree-$2$ SOS can deduce

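$$0 \leq \sum_{d=0}^{2\ell} \langle \Psi, A^{(d)} \rangle \langle X, \overline{K}^{d} X \rangle_\pi = c_0 \langle X, X \rangle_\pi + k^{1/(2\ell)} \langle X, \overline{K} X \rangle_\pi + \tfrac{1}{2}(k-1) \langle X, \overline{K}^{2\ell} X \rangle_\pi, \tag{12}$$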

for some constant $c_0$ with $\tfrac{3}{2} \leq c_0 \leq 2$ (here we used Section 3). Regarding the last term, we have:

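$$\langle X, \overline{K}^{2\ell} X \rangle_\pi = \mathbf{E}_{\boldsymbol{u} \sim \pi} \sum_{v \in V} (\overline{K}^{2\ell})_{\boldsymbol{u}v} X_{\boldsymbol{u}} X_v. \tag{13}$$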

If we cared only about the Sherali–Adams proof system with Boolean equalities, we would simply now deduce

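$$\mathbf{E}_{\boldsymbol{u} \sim \pi} \sum_{v \in V} (\overline{K}^{2\ell})_{\boldsymbol{u}v} X_{\boldsymbol{u}} X_v \leq \mathbf{E}_{\boldsymbol{u} \sim \pi} \sum_{v \in V} \bigl|(\overline{K}^{2\ell})_{\boldsymbol{u}v}\bigr| \leq \sqrt{|V|} \, \mathbf{E}_{\boldsymbol{u} \sim \pi} \bigl\|\overline{K}^{2\ell}_{\boldsymbol{u},\cdot}\bigr\|_2 \leq \sqrt{|V|}\,\rho(\overline{K}^{2\ell}) = \sqrt{|V|}\,\rho(\overline{K})^{2\ell},$$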

and later combine this with . But proceeding more generally, we instead use Section 3 to show that our proof system can derive

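$$\mathbf{E}_{\boldsymbol{u} \sim \pi} \sum_{v \in V} (\overline{K}^{2\ell})_{\boldsymbol{u}v} X_{\boldsymbol{u}} X_v \leq \pi_*^{-1/2} \rho(\overline{K})^{2\ell} \langle X, X \rangle_\pi.$$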

Putting this into Equation 13 and Equation 12 we get

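$$\langle X, \overline{K} X \rangle_\pi \geq -\frac{c_0 + \tfrac{1}{2}(k-1)\,\pi_*^{-1/2} \rho(\overline{K})^{2\ell}}{k^{1/(2\ell)}} \langle X, X \rangle_\pi \geq -\beta \langle X, X \rangle_\pi.$$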

Repeating the derivation with $-\overline{K}$ in place of $\overline{K}$ completes the proof. ∎

### 3.1 Max-Cut

The following theorem is quite similar to Section 3. In it, we allow $K$ to have the large eigenvalue $1$, and only certify that it has no large-magnitude negative eigenvalue. The subsequent corollary is deduced identically to Section 3. Given transition operator $K$ for the standard random walk on $G$, let $K'$ denote the centered transition operator formed from $K$ and $J$, where $J$ is the all-$1$’s matrix. For parameters $k \geq 3^\ell$, let $s = 2(k\ell+1)$ and define

$$\beta = \frac{k\,\pi_*^{-1/2}}{2k^{1/(2\ell)}}\,\rho(K')^{2\ell} + \frac{2}{k^{1/(2\ell)}}.$$

(Note that $\rho(K')$ is equal to the maximum-magnitude eigenvalue of $K$ when the trivial $1$ eigenvalue is excluded.) Then $2(k\ell+1)$-local, degree-$2$ SOS can deduce the bound “$\lambda_{\min}(K) \geq -\beta$”; more precisely, it can deduce the inequality

$$\langle X, K X \rangle_\pi \geq -\beta \langle X, X \rangle_\pi.$$

Suppose we have a graph with transition operator and centered transition operator , and . Given , take , and take . Then for , it holds that -local degree- SOS can deduce the bound . In particular, if we think of as a max-cut CSP, it holds that -round Sherali–Adams can deduce the bound . Again, Section 3.1 implies Footnote 1 since in simple graphs, .

###### Proof of Section 3.1.

The proof is a modification of the proof of Section 3. Letting $S$ be the $(k,\ell)$-spider vertices, instead of taking a signed stationary $S$-indexed random walk in $G$, we take two independent unsigned stationary $S$-indexed random walks, $\boldsymbol{\phi}_1$ and $\boldsymbol{\phi}_2$. For $j \in \{1, 2\}$, define $x_j$ to be the $S$-indexed vector with $i$th coordinate equal to $X_{\boldsymbol{\phi}_j(i)}$, and write $\dot{x}$ for the concatenated vector $(x_1, x_2)$. Also, for a parameter $\theta$ slightly less than $1$ (this parameter is introduced to fix a small annoyance; the reader might like to imagine $\theta = 1$ at first), let $\Psi$ be the PSD matrix from Section 3, and define the PSD block-matrix

$$\dot{\Psi} = \frac{1}{2} \begin{pmatrix} \frac{1}{\theta}\Psi & -\Psi \\ -\Psi & \theta\Psi \end{pmatrix}.$$
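The block matrix can be sanity-checked numerically. Reading it as $\dot{\Psi} = \frac12\begin{pmatrix}\frac1\theta\Psi & -\Psi\\ -\Psi & \theta\Psi\end{pmatrix}$, the quadratic form on a concatenated vector $(x_1, x_2)$ should expand to $\frac12\bigl(\frac1\theta x_1^\top\Psi x_1 + \theta\, x_2^\top\Psi x_2\bigr) - x_1^\top\Psi x_2$ and be nonnegative whenever $\Psi$ is PSD. The sketch below uses a random Gram matrix as a stand-in for $\Psi$ (not the matrix from the theorem), and all helper names are hypothetical.

```python
import random

def quad(P, x, y):
    """Bilinear form x^T P y for a square matrix P given as nested lists."""
    return sum(x[i] * P[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))

def block_psi(Psi, theta):
    """Assemble (1/2) * [[Psi/theta, -Psi], [-Psi, theta*Psi]] as a 2m x 2m nested list."""
    m = len(Psi)
    top = [[Psi[i][j] / (2 * theta) for j in range(m)] +
           [-Psi[i][j] / 2 for j in range(m)] for i in range(m)]
    bottom = [[-Psi[i][j] / 2 for j in range(m)] +
              [theta * Psi[i][j] / 2 for j in range(m)] for i in range(m)]
    return top + bottom

random.seed(1)
m, theta = 4, 0.9
B = [[random.gauss(0, 1) for _ in range(m)] for _ in range(m)]
# Gram matrix B B^T: a PSD stand-in for Psi.
Psi = [[sum(B[i][t] * B[j][t] for t in range(m)) for j in range(m)] for i in range(m)]
Psi_dot = block_psi(Psi, theta)

for _ in range(200):
    x1 = [random.gauss(0, 1) for _ in range(m)]
    x2 = [random.gauss(0, 1) for _ in range(m)]
    x_dot = x1 + x2  # concatenated vector (x1, x2)
    full = quad(Psi_dot, x_dot, x_dot)
    expanded = 0.5 * (quad(Psi, x1, x1) / theta + theta * quad(Psi, x2, x2)) - quad(Psi, x1, x2)
    assert abs(full - expanded) < 1e-9
    assert full >= -1e-9
```

Nonnegativity follows because the form equals $\frac12 z^\top \Psi z$ with $z = \theta^{-1/2}x_1 - \theta^{1/2}x_2$; taking expectations of the expansion gives the coefficients $\frac{1}{2}(1/\theta + \theta)$ on the diagonal blocks and $-1$ on the cross term.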

Then as before, the $2|S|$-local, degree-$2$ SOS proof system can derive

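$$0 \leq \bigl\langle \dot{\Psi}, \mathbf{E}\,\dot{x}\dot{x}^\top \bigr\rangle = \iota \langle \Psi, Y \rangle - \langle \Psi, Z \rangle, \quad \text{where } \iota = \frac{1/\theta + \theta}{2}, \quad Y = \mathbf{E}[x x^\top], \quad Z = \mathbf{E}[x_1 x_2^\top], \tag{14}$$

and $x$ (which will play the role of $\overline{x}$) denotes the common distribution of $x_1$ and $x_2$. Similar to Equations 11 and 10, we now have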

$$Y = \sum_{d=0}^{2\ell} \langle X, K^{d} X \rangle_\pi \, A^{(d)},$$

and by independence of $x_1$ and $x_2$ we have

$$Z = \langle 1, X \rangle_\pi^2 \cdot J = \langle 1, X \rangle_\pi^2 \cdot \sum_{d=0}^{2\ell} A^{(d)}.$$
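The identity $J = \sum_{d=0}^{2\ell} A^{(d)}$ used here simply says that every pair of spider vertices is at exactly one distance between $0$ and $2\ell$. A minimal sketch that builds the distance-$d$ matrices and checks this, together with $A^{(0)} = I$, is given below; the vertex numbering and function names are illustrative assumptions.

```python
def spider_distance(ell, i, j):
    """Tree distance in a spider with legs of length ell.

    Vertex 0 is the root; vertex a*ell + t (1 <= t <= ell) sits at depth t on leg a.
    """
    def leg_and_depth(v):
        return (None, 0) if v == 0 else ((v - 1) // ell, (v - 1) % ell + 1)
    (leg_i, depth_i), (leg_j, depth_j) = leg_and_depth(i), leg_and_depth(j)
    if leg_i is None or leg_j is None or leg_i == leg_j:
        return abs(depth_i - depth_j)  # same leg, or one endpoint is the root
    return depth_i + depth_j           # different legs: the path passes through the root

def distance_matrices(k, ell):
    """Return the list [A^(0), ..., A^(2*ell)] of 0/1 distance matrices."""
    n = k * ell + 1
    return [[[1 if spider_distance(ell, i, j) == d else 0 for j in range(n)]
             for i in range(n)] for d in range(2 * ell + 1)]

A = distance_matrices(3, 2)
n = len(A[0])
identity = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
total = [[sum(A[d][i][j] for d in range(len(A))) for j in range(n)] for i in range(n)]
assert A[0] == identity                      # A^(0) is the identity
assert total == [[1] * n for _ in range(n)]  # the A^(d) sum to the all-ones matrix J
```

Consequently $\langle \Psi, J \rangle$ is the sum of the individual values $\langle \Psi, A^{(d)} \rangle$.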

Thus applying Section 3 to Equation 14, our proof system can derive

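$$0 \leq \iota \cdot \Bigl( c_0 \langle X, X \rangle_\pi + k^{1/(2\ell)} \langle X, K X \rangle_\pi + \tfrac{1}{2}(k-1) \langle X, K^{2\ell} X \rangle_\pi \Bigr) - \mathbf{E}_{\boldsymbol{u} \sim \pi}[X_{\boldsymbol{u}}]^2 \cdot \Bigl( c_0 + k^{1/(2\ell)} + \tfrac{1}{2}(k-1) \Bigr). \tag{15}$$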

By selecting $\theta$ appropriately, we can arrange for the factor on the right to equal