# Limitations of the Hyperplane Separation Technique for Bounding the Extension Complexity of Polytopes

This note illustrates the limitations of the hyperplane separation bound, a non-combinatorial lower bound on the extension complexity of a polytope. Most notably, this bounding technique is used by Rothvoß (J ACM 64.6:41, 2017) to establish an exponential lower bound for the perfect matching polytope. We point out that the technique is sensitive to the particular choice of slack matrix. For the canonical slack matrices of the spanning tree polytope and the completion time polytope, we show that the lower bounds produced by the hyperplane separation method are trivial.


## 1 Introduction

The extension complexity of a polytope $P$, denoted by $\operatorname{xc}(P)$, is the minimum number of facets of any polytope that affinely projects onto $P$. A linear description of such a polytope (together with the corresponding projection) is an extended formulation of $P$. If we define the size of an extended formulation as the number of its inequalities, the minimum size of any extended formulation of $P$ equals $\operatorname{xc}(P)$.

Building on Yannakakis' seminal work [31], there has recently been a renewed interest in the study of extended formulations (see, e.g., [19, 12, 25, 1, 13, 24, 18, 11, 8]). For many polytopes associated with NP-hard combinatorial optimization problems, we now know that their extension complexity cannot be bounded by a polynomial in their dimension; among them are TSP polytopes, cut and correlation polytopes, and stable set polytopes [19, 12]. An exponential lower bound also holds for the extension complexity of the (perfect) matching polytope [25] (even though one can optimize over it in polynomial time). Well-known polytopes that do admit nontrivial polynomial-size extended formulations include, among many others, parity polytopes [31, 2], independence polytopes of regular matroids [1], and two families of polytopes considered here, spanning tree polytopes and completion time polytopes. We refer to the surveys by Conforti et al. [4] and Kaibel [16] for an overview and more examples.

The spanning tree polytope of a connected graph $G = (V, E)$ is the convex hull of the incidence vectors of the spanning trees in $G$,

$$P_{\mathrm{st}}(G) := \operatorname{conv}\{\chi(T) : T \subseteq E \text{ is a spanning tree in } G\}, \tag{1}$$

where $\chi(T) \in \{0,1\}^E$ denotes the incidence vector of $T$. Although $P_{\mathrm{st}}(G)$ has exponentially many facets in general, there are extended formulations of size $O(|V| \cdot |E|)$ due to Wong [30] and Martin [21] (see also [31, 5]). Special classes of graphs admit even smaller extended formulations: For instance, Williams [28] gives a formulation of size $O(|V|)$ for planar graphs. Some progress has also been made for graphs of bounded genus, more generally, by Fiorini et al. [10].

On the other hand, it is known that the extension complexity of a polytope is at least its dimension. Thus, if $K_n$ is the complete graph on $n$ vertices, $\Omega(n^2)$ is a trivial lower bound on the extension complexity of $P_{\mathrm{st}}(K_n)$. The question whether this bound can be improved is open. Khoshkhah and Theis [20] show that every combinatorial lower bound (that is, one that depends only on the vertex-facet incidence structure of the polytope and, thus, is unable to distinguish between combinatorially equivalent polytopes) achieves at most $O(n^2 \log n)$. In [20], the authors ask whether using non-combinatorial techniques instead may lead to stronger lower bounds.

One candidate is the hyperplane separation bound proposed by Fiorini [9] and applied by Rothvoß [25] in his proof of the exponential lower bound for the matching polytope. It is a lower bound on the extension complexity of a polytope $P$ that essentially depends on the coefficients in a given linear description of $P$. We show that, for Edmonds' canonical description of $P_{\mathrm{st}}(G)$, the hyperplane separation technique fails to produce a lower bound stronger than $O(|E|)$. In this sense, the trivial dimension bound is already at least as strong. Our proof in Section 3 relies on a dual interpretation of the method, which will be explained in Section 2.

At the same time, we stress that our result does not rule out the possibility of obtaining meaningful bounds for a description of by a different system of linear inequalities. To the best of the author’s knowledge, this issue has not been addressed explicitly in the study of the hyperplane separation technique. In particular, we consider a description of obtained from the canonical one by suitably scaling the inequalities. While scaling in this particular way does improve on the hyperplane separation bound, we are able to prove that it can only do so by a factor of at most .

The limitations of the hyperplane separation method can be observed in another family of well-understood polytopes as well. Consider jobs with processing times to be scheduled on a single machine. Every permutation (the symmetric group on ) defines a feasible schedule without idle time where job is completed at time for . The completion time polytope is defined as

Wolsey observed (see remark in ) that is a zonotope, the affine linear image of a hypercube with facets. In fact, no smaller extended formulation of is known to date. In case that for all , is known as the th permutahedron and equals For this polytope, Goemans  gives an asymptotically minimal extended formulation of size . The lower bound in  is established via a purely combinatorial argument. Since any two completion time polytopes on jobs are combinatorially equivalent for strictly positive processing times, is therefore best possible for any combinatorial lower bound on , for any .

In Section 4, we show that, regardless of , the hyperplane separation bound for the canonical linear description of due to Wolsey  and Queyranne  is at most a constant. In fact, we obtain our result in the more general setting of graphic zonotopes, a natural generalization of completion time polytopes inspired by Wolsey’s observation.

## 2 Slack matrices and the hyperplane separation bound

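Given a nonnegative matrix $S \in \mathbb{R}^{m \times n}_{\ge 0}$, the nonnegative rank of $S$, denoted by $\operatorname{rk}_+(S)$, is defined as the minimum $r$ such that $S = FH$ for two nonnegative matrices $F \in \mathbb{R}^{m \times r}_{\ge 0}$ and $H \in \mathbb{R}^{r \times n}_{\ge 0}$. Equivalently, it is the minimum $r$ such that $S$ can be written as the sum of $r$ nonnegative matrices of rank one [3].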
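The equivalence of the two definitions can be illustrated on a toy instance. The sketch below is not taken from the source: the factors `F` and `H` are hypothetical, and the code only checks that a nonnegative factorization with inner dimension 2 is the same thing as a sum of two nonnegative rank-one matrices.

```python
def matmul(F, H):
    """Plain-Python matrix product of F (m x r) and H (r x n)."""
    return [[sum(F[i][k] * H[k][j] for k in range(len(H)))
             for j in range(len(H[0]))] for i in range(len(F))]

def outer(col, row):
    """Rank-one matrix col * row^T."""
    return [[a * b for b in row] for a in col]

def add(A, B):
    """Entrywise sum of two matrices of equal shape."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

# Hypothetical nonnegative factors with inner dimension r = 2.
F = [[1, 0], [0, 2], [1, 1]]
H = [[1, 2, 0], [0, 1, 3]]
S = matmul(F, H)

# k-th rank-one summand: (k-th column of F) times (k-th row of H).
summands = [outer([row[k] for row in F], H[k]) for k in range(2)]
total = add(summands[0], summands[1])
```

Here `S` and `total` coincide, so this particular `S` has nonnegative rank at most 2. The code certifies an upper bound only; lower bounds are what the techniques in this note are about.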

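Consider a polytope $P = \operatorname{conv}(X) = \{x \in \mathbb{R}^d : Ax \le b\}$ for some finite set $X = \{x^1, \dots, x^n\} \subseteq \mathbb{R}^d$ and $A \in \mathbb{R}^{m \times d}$, $b \in \mathbb{R}^m$ such that every inequality in $Ax \le b$ defines a nonempty face of $P$. The matrix $S \in \mathbb{R}^{m \times n}_{\ge 0}$ whose $j$th column equals $b - Ax^j$ is a slack matrix of $P$. If $X$ is the set of vertices of $P$, we refer to the corresponding slack matrix as the slack matrix of $P$ with respect to the linear description above. In particular, any slack matrix of a polytope is a nonnegative matrix whose nonnegative rank satisfies the following property due to Yannakakis [31].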
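As a concrete instance of this definition, the following sketch builds the slack matrix of the unit square. It assumes the polytope is given in the form $\{x : Ax \le b\}$ with its vertices listed explicitly; the helper name `slack_matrix` and the example data are illustrative and not part of the source.

```python
from itertools import product

def slack_matrix(A, b, points):
    """Entry (i, j) is b_i - <A_i, x^j> for the system A x <= b."""
    return [[bi - sum(a * x for a, x in zip(row, p)) for p in points]
            for row, bi in zip(A, b)]

# Unit square [0,1]^2 as A x <= b:  -x1 <= 0, -x2 <= 0, x1 <= 1, x2 <= 1.
A = [[-1, 0], [0, -1], [1, 0], [0, 1]]
b = [0, 0, 1, 1]
vertices = list(product([0, 1], repeat=2))   # (0,0), (0,1), (1,0), (1,1)

S = slack_matrix(A, b, vertices)
```

Rows are indexed by inequalities and columns by vertices. Every entry is nonnegative because each vertex satisfies each inequality, and an entry is zero exactly when the vertex lies on the corresponding face.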

###### Proposition 1.

Let $S$ be a slack matrix of a polytope $P$. Then $\operatorname{xc}(P) = \operatorname{rk}_+(S)$.

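This result is the key to many techniques for bounding the extension complexity of $P$. This paper is concerned with one such technique. For two matrices $A, B \in \mathbb{R}^{m \times n}$, we denote by $\langle A, B \rangle = \sum_{i,j} A_{ij} B_{ij}$ their Frobenius inner product and let $\|A\|_\infty := \max_{i,j} |A_{ij}|$.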

###### Proposition 2 (Hyperplane separation bound ).

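Let $S \in \mathbb{R}^{m \times n}_{\ge 0}$ with at least one positive entry, and let $\mathcal{R}$ denote the set of rank-one matrices in $\{0,1\}^{m \times n}$. We further let

$$\operatorname{hsb}(S) := \sup\left\{ \frac{\langle S, X \rangle}{\rho(X)} : X \in \mathbb{R}^{m \times n},\ \rho(X) > 0 \right\}, \tag{2}$$

where $\rho(X) := \max\{\langle X, R \rangle : R \in \mathcal{R}\}$ for every $X \in \mathbb{R}^{m \times n}$. Then

$$\operatorname{rk}_+(S) \ge \frac{\operatorname{hsb}(S)}{\|S\|_\infty}.$$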

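Normalizing $X$ such that $\rho(X) = 1$ in the definition of $\operatorname{hsb}(S)$, we may rewrite (2) as follows:

$$\begin{aligned}
\operatorname{hsb}(S) &= \sup\{\langle S, X \rangle : X \in \mathbb{R}^{m \times n},\ \rho(X) = 1\} \\
&= \sup\{\langle S, X \rangle : X \in \mathbb{R}^{m \times n},\ \rho(X) \le 1\} \\
&= \max\{\langle S, X \rangle : X \in \mathbb{R}^{m \times n},\ \langle X, R \rangle \le 1 \ \forall R \in \mathcal{R}\}.
\end{aligned} \tag{3}$$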

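In the last step, we used the fact that the supremum of $\langle S, X \rangle$ is finite: Any $X$ with $\rho(X) \le 1$ satisfies $\langle X, R \rangle \le 1$ for all $R \in \mathcal{R}$ with singleton support, that is, every entry of $X$ is at most one. As $S$ is nonnegative, the sum of its entries is an upper bound on $\langle S, X \rangle$.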

Note that the last expression in (3) is a linear program (LP). From strong LP duality, we obtain the following dual characterization of the hyperplane separation bound, which already appears in the literature, although derived differently.

###### Proposition 3.

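In the situation of Proposition 2, we have that

$$\operatorname{hsb}(S) = \min\left\{ \sum_{R \in \mathcal{R}} y_R : \sum_{R \in \mathcal{R}} y_R R = S,\ y \in \mathbb{R}^{\mathcal{R}}_{\ge 0} \right\}. \tag{4}$$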

The feasible region of the LP in (4) corresponds to a particular type of nonnegative factorization of $S$, namely the decomposition of $S$ into the weighted sum of 0/1 matrices of rank one. Not only will this observation be the key ingredient of our proofs in Sections 3 and 4, it also motivates an alternate proof of Proposition 2, which is slightly simpler than the original one.
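The interplay between the primal LP (3) and the dual LP (4) can be checked by brute force on a tiny matrix. The sketch below is an illustration that does not appear in the source: it takes the $3 \times 3$ identity matrix, a hand-picked primal candidate `X` (an assumption of this example), and the obvious dual decomposition into singleton rectangles. Matching objective values certify optimality of both by weak duality.

```python
from itertools import combinations, product

def nonempty_subsets(n):
    """All nonempty subsets of {0, ..., n-1} as tuples."""
    return [c for k in range(1, n + 1) for c in combinations(range(n), k)]

def inner_with_rectangle(X, rows, cols):
    """<X, R> for the 0/1 rank-one matrix R supported on rows x cols."""
    return sum(X[i][j] for i in rows for j in cols)

n = 3
S = [[1 if i == j else 0 for j in range(n)] for i in range(n)]    # identity
X = [[1 if i == j else -1 for j in range(n)] for i in range(n)]   # primal candidate

# Primal feasibility in (3): <X, R> <= 1 for every rank-one 0/1 matrix R.
rho = max(inner_with_rectangle(X, r, c)
          for r, c in product(nonempty_subsets(n), repeat=2))
primal_value = sum(S[i][j] * X[i][j] for i in range(n) for j in range(n))

# Dual solution in (4): weight 1 on each singleton rectangle {i} x {i};
# these n matrices sum to the identity, so the dual objective equals n.
dual_value = n
```

Since `rho` evaluates to 1, `X` is feasible, and the primal and dual values agree. For this matrix the optimal LP value is therefore 3, which here coincides with the nonnegative rank of the identity matrix.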

###### Proof of Proposition 2.

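Without loss of generality, we may assume that $\|S\|_\infty = 1$. Let $S = \sum_{i=1}^r A_i$ for $r = \operatorname{rk}_+(S)$ and nonnegative rank-one matrices $A_1, \dots, A_r$. We claim that for every $i \in [r]$, we have that $A_i \in \operatorname{conv}(\mathcal{R} \cup \{0\})$, i.e., $A_i = \sum_{R \in \mathcal{R}} \lambda^i_R R$ for some coefficients $\lambda^i_R \ge 0$ with $\sum_{R \in \mathcal{R}} \lambda^i_R \le 1$. Then $y$ defined by $y_R := \sum_{i=1}^r \lambda^i_R$, $R \in \mathcal{R}$, is a feasible solution of the LP in (4), and $\operatorname{hsb}(S) \le \sum_{R \in \mathcal{R}} y_R \le r$ by Proposition 3.

It remains to prove the claim. Let $A \in [0,1]^{m \times n}$ be of rank one. Then $A = vw^T$ for some $v \in \mathbb{R}^m_{\ge 0}$ and $w \in \mathbb{R}^n_{\ge 0}$, which, by scaling, can be assumed to be $[0,1]$-valued vectors. Then $v$ can be written as a convex combination $v = \sum_{i=1}^p \lambda_i v_i$ for some $v_1, \dots, v_p \in \{0,1\}^m$ and $\lambda_1, \dots, \lambda_p \ge 0$ with $\sum_{i=1}^p \lambda_i = 1$. Similarly, $w = \sum_{j=1}^q \mu_j w_j$ for some $w_1, \dots, w_q \in \{0,1\}^n$, and $\mu_1, \dots, \mu_q \ge 0$ with $\sum_{j=1}^q \mu_j = 1$. Then

$$A = vw^T = \sum_{i=1}^p \sum_{j=1}^q \lambda_i \mu_j \cdot v_i (w_j)^T \quad \text{with} \quad v_i (w_j)^T \in \mathcal{R} \cup \{0\} \quad \text{and} \quad \sum_{i=1}^p \sum_{j=1}^q \lambda_i \mu_j = 1. \qquad ∎$$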

Note that is invariant under multiplying by positive scalars, under transposition, and under permutations of rows and columns of , respectively. It further satisfies the following two useful properties on submatrices, both of which are immediate consequences of Proposition 3.

###### Lemma 1.

Let for nonnegative matrices and . Then

,

.

Recall that any two slack matrices of a given polytope have identical nonnegative rank. (This is a consequence of Proposition 1.) In this sense, the nonnegative rank is well-defined for polytopes. The situation for the hyperplane separation bound, however, is fundamentally different. Let us highlight this difference with two examples.

Consider the standard hypercube and let denote its slack matrix w.r.t. the (minimal) description . The inequality is valid for (defining the vertex ). Adding this inequality to the minimal description of adds one row to , which equals the sum of the rows corresponding to the facets defined by for . Let denote the slack matrix with this additional row. Then and it is not difficult to check that . Thus, .

Not even slack matrices w.r.t. minimal linear descriptions behave identically under the hyperplane separation bound: The -simplex spanned by the canonical unit vectors in and the origin is the set of all satisfying for , and for any . Every inequality defines a facet of the simplex. Modulo permutations of rows and columns, the associated slack matrix is obtained from the identity by multiplying the first row by . One can show that while .

Motivated by the latter example, let us consider the effect of normalizing the rows of a slack matrix independently. Note that this leaves the nonnegative rank unchanged.

###### Lemma 2.

Let with rows , , and suppose that every row contains at least one positive entry. Let denote the matrix obtained from by dividing the th row by . Then

where .

###### Proof.

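We clearly have that $\|S'\|_\infty = 1$. Let $X$ be a feasible solution of the LP in (3) for $S$ (and, thus, for $S'$ too). Denoting the $i$th row of $X$ by $x^i$, we obtain

$$\langle S', X \rangle = \sum_{i=1}^m \|s^i\|_\infty^{-1} (s^i)^T x^i \ge \sum_{i=1}^m \|S\|_\infty^{-1} (s^i)^T x^i = \|S\|_\infty^{-1} \langle S, X \rangle.$$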

Now observe that defines a partition of into row submatrices consisting of all rows of maximum norm . The corresponding row submatrix of is . It follows that

using the two parts of Lemma 1 in the first and second inequality, respectively. ∎

By transposition, an analogous statement holds true for normalizing columns instead of rows.

## 3 The spanning tree polytope

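Let $G = (V, E)$ be a connected graph. The spanning tree polytope of $G$ given in (1) is completely described by the following system due to Edmonds:

$$P_{\mathrm{st}}(G) = \left\{x \in \mathbb{R}^E_{\ge 0} : x(E) = |V| - 1,\ x(E(U)) \le |U| - 1 \ \ \forall\, \emptyset \ne U \subseteq V\right\}, \tag{5}$$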

where is the set of all edges with both endpoints in . We will denote an edge by . Further, let denote the set of vertices of a subgraph of and let for and denote the number of connected components of the subgraph .

###### Theorem 1.

Let $G = (V, E)$ be a connected graph and let $S_G$ denote the slack matrix of $P_{\mathrm{st}}(G)$ w.r.t. the description (5). Then

$$\frac{\operatorname{hsb}(S_G)}{\|S_G\|_\infty} = O(|E|).$$

###### Proof.

Since there are $|E|$ many nonnegativity constraints in (5), it suffices to consider the row submatrix of $S_G$ restricted to the set inequalities in (5) only, which will be denoted by $S_G$ again. The bound for the entire slack matrix then follows from Lemma 1.

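Indexing the rows of $S_G$ by the nonempty subsets of $V$ and the columns by the spanning trees in $G$, the entry in row $U$ and column $T$ equals $|U| - 1 - |T \cap E(U)|$. First, observe that

$$\|S_G\|_\infty \ge \tfrac{1}{2}|V| - 1. \tag{6}$$

For, if $T$ is a spanning tree in $G$ and $U$ a stable set in $T$, then the entry in row $U$ and column $T$ equals $|U| - 1$. Because $T$ is a bipartite graph, both vertex classes in a bipartition are stable sets in $T$. At least one of them is of size at least $\tfrac{1}{2}|V|$.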

We shall now construct a nonnegative factorization of $S_G$ which is feasible in the sense of the dual LP in (4). Our construction is inspired by Martin's extended formulation [21]. For every spanning tree $T$ in $G$, let $\tau(T)$ be the set of all triples $(i,j,k) \in V^3$ of vertices such that $ij$ is an edge of the unique $i$-$k$ path in $T$. From each nonempty subset $U \subseteq V$, we choose an arbitrary representative $k(U) \in U$. For every triple $(i,j,k) \in V^3$ where $ij \in E$, define the set

$$R(i,j,k) := \{U \subseteq V : i \in U,\ j \notin U,\ k = k(U)\} \times \{T \text{ spanning tree} : (i,j,k) \in \tau(T)\}.$$

For every such triple $(i,j,k)$, there is a unique 0/1 matrix indexed in the same way as $S_G$ whose support equals $R(i,j,k)$. We claim that these matrices, which clearly are of rank at most one, add up to $S_G$. Indeed, let $\emptyset \ne U \subseteq V$ and $T$ be a spanning tree in $G$. Letting $c$ denote the number of connected components of the subgraph of $T$ induced by $U$, it suffices to show that

$$\left|\{(i,j,k) \in V^3 : ij \in E,\ (U,T) \in R(i,j,k)\}\right| = c - 1.$$

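If $c = 1$, the statement is clear. Let $c \ge 2$, and let $U_1, \dots, U_c$ be the vertex sets of the connected components of the subgraph of $T$ induced by $U$. Without loss of generality, we may assume that $k(U) \in U_1$. For $\ell \in \{2, \dots, c\}$, we say that a path in $T$ connects $U_\ell$ and $k(U)$ if its two endpoints are $k(U)$ and some vertex in $U_\ell$, and no other vertex on the path belongs to $U_\ell$. For every $\ell \in \{2, \dots, c\}$, there exists a unique path in $T$ connecting $U_\ell$ and $k(U)$. Let $i_\ell$ be its endpoint in $U_\ell$, and let $j_\ell$ be the neighbour of $i_\ell$ on the path. Then $j_\ell \notin U$, and $(U,T) \in R(i_\ell, j_\ell, k(U))$ for every $\ell \in \{2, \dots, c\}$.

On the other hand, if $(U,T) \in R(i,j,k)$ for some $(i,j,k) \in V^3$ with $ij \in E$, then $k = k(U)$ and $i \in U$, say, $i \in U_\ell$. Since $j \notin U$, the path connecting $i$ and $k$ in $T$ cannot visit any other vertex in $U_\ell$. Hence, it connects $U_\ell$ and $k(U)$ and we conclude that $(i,j) = (i_\ell, j_\ell)$.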

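This shows that the sets $R(i,j,k)$ induce a decomposition of $S_G$ into at most $2|E| \cdot |V|$ summands which are 0/1 matrices. From (6) and Proposition 3, we conclude that, for $|V| \ge 3$,

$$\frac{\operatorname{hsb}(S_G)}{\|S_G\|_\infty} \le \frac{2|E| \cdot |V|}{\tfrac{1}{2}|V| - 1} = O(|E|).$$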

In the light of the previous section, let us briefly discuss how normalizing the rows of the slack matrix defined above may strengthen the hyperplane separation bound. Note that the entries of are in . Normalizing row by row, we obtain another slack matrix with . Lemma 2 then implies that

It is easy to see that, if is the complete graph, is the slack matrix of w.r.t. the description obtained from creftype 5 by dividing every set inequality for by .
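The counting argument in the proof of Theorem 1 can be verified exhaustively on a small instance. The sketch below assumes $G = K_4$ and picks the representative $k(U) := \min U$; both choices are made for this illustration only (the proof allows any representative). For every set $U$ and spanning tree $T$, it compares the number of triples $(i,j,k)$ with $(U,T) \in R(i,j,k)$ against the slack $|U| - 1 - |T \cap E(U)|$.

```python
from itertools import combinations

def spanning_trees(n):
    """All spanning trees of K_n, each a tuple of edges (a, b) with a < b."""
    trees = []
    for cand in combinations(list(combinations(range(n), 2)), n - 1):
        parent = list(range(n))
        def find(v):
            while parent[v] != v:
                v = parent[v]
            return v
        acyclic = True
        for a, b in cand:
            ra, rb = find(a), find(b)
            if ra == rb:
                acyclic = False
                break
            parent[ra] = rb
        if acyclic:  # n - 1 acyclic edges on n vertices form a spanning tree
            trees.append(cand)
    return trees

def tree_path(tree, src, dst):
    """Vertex sequence of the unique src-dst path in the given tree."""
    adj = {}
    for a, b in tree:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    stack = [[src]]
    while stack:
        path = stack.pop()
        if path[-1] == dst:
            return path
        for w in adj[path[-1]]:
            if len(path) == 1 or w != path[-2]:
                stack.append(path + [w])

def slack(tree, U):
    """Slack |U| - 1 - |T ∩ E(U)| of the set inequality for U at the tree T."""
    return len(U) - 1 - sum(1 for a, b in tree if a in U and b in U)

def covering_triples(tree, U):
    """Number of triples (i, j, k) with (U, T) in R(i, j, k), where k = min(U)."""
    k = min(U)
    count = 0
    for i in U:
        if i != k:
            j = tree_path(tree, i, k)[1]  # neighbour of i on the i-k path
            if j not in U:
                count += 1
    return count

n = 4
trees = spanning_trees(n)
subsets = [set(c) for r in range(1, n + 1) for c in combinations(range(n), r)]
mismatches = [(U, T) for U in subsets for T in trees
              if covering_triples(T, U) != slack(T, U)]
max_slack = max(slack(T, U) for U in subsets for T in trees)
```

On this instance `mismatches` is empty, so the rectangles cover every entry of the slack matrix exactly as often as its value. The largest entry `max_slack` is attained by a star together with its set of leaves, in line with the bound (6).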

## 4 Graphic zonotopes

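Given two sets $X, Y \subseteq \mathbb{R}^n$, their Minkowski sum is $X + Y = \{x + y : x \in X,\ y \in Y\}$. A zonotope is the affine linear image of a hypercube. Equivalently, every zonotope is the Minkowski sum of a finite number of line segments, where a line segment in $\mathbb{R}^n$ is a set $[x, y] := \operatorname{conv}\{x, y\}$ for some $x, y \in \mathbb{R}^n$. Given a graph $G$ on $[n]$, a graphic zonotope of $G$ is the Minkowski sum of line segments in the directions $u_i - u_j$ for $ij \in E$ (see [22]), where $u_j$ denotes the $j$th canonical unit vector in $\mathbb{R}^n$. Let $A = (a_{ij}) \in \mathbb{R}^{n \times n}$ be a symmetric nonnegative matrix. We associate with $A$ a zonotope $Z(A)$ as follows:

$$Z(A) := \sum_{1 \le j \le n} a_{jj} u_j + \sum_{1 \le i < j \le n} a_{ij}\, [u_i, u_j]. \tag{7}$$

Up to translations, the graphic zonotopes of graphs on $n$ vertices are exactly those of the above form for some symmetric and nonnegative matrix $A$ (where $a_{ij} > 0$ for $i \ne j$ if and only if $ij$ is an edge of the graph).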

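We will now derive a description of the facets of $Z(A)$, generalizing remarks in [32, Example 7.15]. To this end, define the set function $g_A \colon 2^{[n]} \to \mathbb{R}$ by

$$[n] \supseteq S \mapsto g_A(S) := \sum_{i,j \in S:\, i \le j} a_{ij}.$$

Note that $g_A$ is supermodular, and it is strictly supermodular if and only if $A$ is positive. The supermodular base polytope (see, e.g., [14]) of a supermodular function $g \colon 2^{[n]} \to \mathbb{R}$ with $g(\emptyset) = 0$ is defined as

$$B(g) := \{x \in \mathbb{R}^n : x([n]) = g([n]),\ x(S) \ge g(S) \ \ \forall S \subseteq [n]\}. \tag{8}$$

###### Lemma 3.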

Let $A \in \mathbb{R}^{n \times n}$ be symmetric and nonnegative. Then $Z(A) = B(g_A)$.

###### Proof.

It suffices to show that, for every linear functional $c \in \mathbb{R}^n$, the minima of $c$ over $Z(A)$ and $B(g_A)$ coincide. After a permutation of the coefficients of $c$, we may assume that $c_1 \ge c_2 \ge \dots \ge c_n$. The greedy rule then implies that a minimizer over $B(g_A)$ is given by

$$\bar{x}_j := g_A([j]) - g_A([j-1]) = \sum_{i=1}^{j} a_{ij}, \quad j = 1, \dots, n.$$

Minimizing $c$ over the zonotope $Z(A)$ can be done over each summand in the Minkowski sum in (7) individually. For $i < j$, it is easy to see that the minimum of $c$ on the line segment $a_{ij}\,[u_i, u_j]$ is attained at the endpoint $a_{ij} u_j$ since $c_i \ge c_j$. Hence the minimum over $Z(A)$ is attained in the point $\bar{x}$. ∎

###### Lemma 4.

For every symmetric positive matrix $A \in \mathbb{R}^{n \times n}$, the $n$th permutahedron and $Z(A)$ are combinatorially equivalent.

###### Proof.

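From the proof of Lemma 3, we conclude that the vertices of $Z(A)$ and the permutations in $S_n$ correspond via the map

$$S_n \ni \pi \longmapsto x^\pi \in \mathbb{R}^n; \qquad x^\pi_j = \sum_{i=1}^{\pi(j)} a_{\pi^{-1}(i),\, j}, \quad j = 1, \dots, n. \tag{9}$$

Since $A$ is positive, this is a bijection. Moreover, $g_A$ is strictly supermodular and therefore, all inequalities in (8) for $\emptyset \ne S \subsetneq [n]$ define facets of $Z(A)$.

Now let $\emptyset \ne S \subsetneq [n]$ and $\pi \in S_n$. Let $x^\pi$ denote the vertex of $Z(A)$ induced by $\pi$ via the bijection in (9). Then

$$x^\pi(S) - g_A(S) = \sum_{i \in [n],\, j \in S:\ \pi(i) \le \pi(j)} a_{ij} - \sum_{i, j \in S:\ \pi(i) \le \pi(j)} a_{ij} = \sum_{i \notin S,\, j \in S:\ \pi(i) \le \pi(j)} a_{ij}, \tag{10}$$

using symmetry of $A$ in the first equation. Since $A$ is positive, it follows that $x^\pi$ belongs to the facet defined by $S$ if and only if $\pi(S) = \{1, \dots, |S|\}$. In other words, (9) is a bijection between the vertices of $Z(A)$ and those of the $n$th permutahedron which preserves all vertex-facet incidences. ∎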

Given the structural insights above, it is not difficult to recognize that graphic zonotopes do indeed generalize completion time polytopes: If for some column vector , then is the image of

under the linear transformation which sends a vector

to its componentwise product with . Note that has rank one in this case. Conversely, every symmetric positive of rank one can be written as an outer product of some positive vector with itself. Indeed, if with , then by symmetry. Letting , we obtain . Thus, up to a coordinate transformation, the completion time polytopes for strictly positive processing times are exactly the zonotopes for symmetric positive rank-one matrices .

Moreover, this generalization is compatible with both the upper and lower bounds on the extension complexity known for completion time polytopes. Recall that , as defined in creftype 7, can be written as the affine linear image of the hypercube and, hence, . If is positive, is a lower bound on by combining Lemma 4 and the lower bound for the th permutahedron in .

Let us now study the hyperplane separation bound in the case of $Z(A)$ for symmetric and nonnegative $A$. To this end, let $M_A$ denote the slack matrix of $Z(A)$ w.r.t. its linear description (8). In what follows, we shall identify the rows of $M_A$ with the nontrivial subsets of $[n]$ and the columns with the permutations in $S_n$.

###### Lemma 5.

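Let $A \in \mathbb{R}^{n \times n}$ be symmetric and positive, and let $M_A$ be the slack matrix of $Z(A)$ w.r.t. (8). Then

$$\|M_A\|_\infty = \max_{S \subseteq [n]} \sum_{i \notin S,\, j \in S} a_{ij}.$$

###### Proof.

From (10), we conclude that the entry in row $S$ and column $\pi$ of $M_A$ equals

$$x^\pi(S) - g_A(S) = \sum_{i \notin S,\, j \in S:\ \pi(i) \le \pi(j)} a_{ij} \le \sum_{i \notin S,\, j \in S} a_{ij}$$

with equality if and only if $\pi(i) \le \pi(j)$ for all $i \notin S$ and $j \in S$. ∎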
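Equation (10) and Lemma 5 are easy to confirm numerically for a small matrix. The following sketch uses a hypothetical symmetric positive $4 \times 4$ integer matrix (not from the source) and 0-based indices, with `pi[i]` playing the role of the position $\pi(i)$. It computes the vertices via (9), the slack entries via the definition $x^\pi(S) - g_A(S)$, and compares them with the right-hand side of (10) and with the cut expression of Lemma 5.

```python
from itertools import combinations, permutations

def g(A, S):
    """g_A(S): sum of a_ij over all pairs i <= j inside S."""
    return sum(A[i][j] for i in S for j in S if i <= j)

def vertex(A, pi):
    """x^pi with x^pi_j = sum of a_ij over all i with pi(i) <= pi(j), as in (9)."""
    n = len(A)
    return [sum(A[i][j] for i in range(n) if pi[i] <= pi[j]) for j in range(n)]

# Hypothetical symmetric positive matrix chosen for illustration.
A = [[1, 2, 3, 1],
     [2, 1, 1, 4],
     [3, 1, 2, 2],
     [1, 4, 2, 1]]
n = len(A)
subsets = [set(c) for k in range(1, n) for c in combinations(range(n), k)]
perms = list(permutations(range(n)))  # pi[i] is the position of index i

def slack(S, pi):
    """Entry of M_A in row S and column pi, computed from the definition."""
    x = vertex(A, pi)
    return sum(x[j] for j in S) - g(A, S)

def rhs_of_10(S, pi):
    """Right-hand side of (10)."""
    return sum(A[i][j] for i in range(n) if i not in S
               for j in S if pi[i] <= pi[j])

on_base = all(sum(vertex(A, pi)) == g(A, set(range(n))) for pi in perms)
eq10_holds = all(slack(S, pi) == rhs_of_10(S, pi)
                 for S in subsets for pi in perms)
norm = max(slack(S, pi) for S in subsets for pi in perms)
cut = max(sum(A[i][j] for i in range(n) if i not in S for j in S)
          for S in subsets)
```

For this matrix, `eq10_holds` is true and `norm` equals `cut`, so the largest slack entry is the weight of a maximum cut in the complete graph with edge weights $a_{ij}$.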

###### Theorem 2.

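Let $A \in \mathbb{R}^{n \times n}$ be symmetric and positive, and let $M_A$ be the slack matrix of $Z(A)$ w.r.t. (8). Then

$$\operatorname{hsb}(M_A) \le \sum_{i \ne j} a_{ij} \le 4\, \|M_A\|_\infty.$$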

###### Proof.

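For every pair $(i,j) \in [n]^2$ with $i \ne j$, let

$$R(i,j) := \{S \subseteq [n] : i \notin S,\ j \in S\} \times \{\pi \in S_n : \pi(i) \le \pi(j)\}$$

and let $\hat{R}(i,j)$ denote the unique 0/1 matrix indexed like $M_A$ whose support equals $R(i,j)$. Note that $\hat{R}(i,j)$ has rank one and, by (10),

$$\sum_{i \ne j} a_{ij} \hat{R}(i,j) = M_A.$$

The first inequality in the statement then follows from Proposition 3.

In order to show the second inequality, let $S^* \subseteq [n]$ be a subset attaining the maximum in the equation in Lemma 5. We claim that

$$\sum_{i \in S^* \setminus \{j\}} a_{ij} \le \sum_{i \notin S^*} a_{ij} \quad \text{for every } j \in S^*. \tag{11}$$

Indeed, suppose that there were some $k \in S^*$ such that $\sum_{i \in S^* \setminus \{k\}} a_{ik} > \sum_{i \notin S^*} a_{ik}$. Letting $S' := S^* \setminus \{k\}$ and using the symmetry of $A$, we obtain

$$\sum_{i \notin S',\, j \in S'} a_{ij} - \sum_{i \notin S^*,\, j \in S^*} a_{ij} = \sum_{j \in S^* \setminus \{k\}} a_{kj} - \sum_{i \notin S^*} a_{ik} > 0,$$

contradicting the choice of $S^*$. From (11), it follows that

$$\sum_{i,j \in S^*:\ i \ne j} a_{ij} = \sum_{j \in S^*} \sum_{i \in S^* \setminus \{j\}} a_{ij} \le \sum_{j \in S^*} \sum_{i \notin S^*} a_{ij} = \|M_A\|_\infty.$$

The symmetric argument for $[n] \setminus S^*$ yields

$$\sum_{i \ne j} a_{ij} = \sum_{i,j \in S^*:\ i \ne j} a_{ij} + \sum_{i,j \notin S^*:\ i \ne j} a_{ij} + 2 \sum_{i \notin S^*,\, j \in S^*} a_{ij} \le 4\, \|M_A\|_\infty.$$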

This completes the proof. ∎
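The second half of the proof is a local-optimality argument for a maximum cut, which can be sanity-checked on random instances. The sketch below is illustrative only: the matrix generator, the seed, and the size $n = 5$ are assumptions of this example. For each instance it finds a maximizing set $S^*$ by enumeration, tests claim (11), and tests that the sum of all off-diagonal entries is at most four times the maximum cut value.

```python
import random
from itertools import combinations

def max_cut(A):
    """Return (value, S*) maximizing the sum of a_ij over i not in S, j in S."""
    n = len(A)
    best = (-1, None)
    for k in range(n + 1):
        for c in combinations(range(n), k):
            S = set(c)
            val = sum(A[i][j] for i in range(n) if i not in S for j in S)
            if val > best[0]:
                best = (val, S)
    return best

def random_symmetric_positive(n, rng):
    """Symmetric matrix with integer entries in 1..9 (hypothetical test data)."""
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            A[i][j] = A[j][i] = rng.randint(1, 9)
    return A

rng = random.Random(0)
checks = []
for _ in range(50):
    A = random_symmetric_positive(5, rng)
    n = len(A)
    norm, S_star = max_cut(A)
    off_diagonal = sum(A[i][j] for i in range(n) for j in range(n) if i != j)
    # Claim (11): each j in S* is tied at least as strongly to the
    # complement of S* as to the rest of S*.
    local_opt = all(
        sum(A[i][j] for i in S_star if i != j)
        <= sum(A[i][j] for i in range(n) if i not in S_star)
        for j in S_star)
    checks.append(local_opt and off_diagonal <= 4 * norm)
```

All entries of `checks` come out true. This is the familiar fact that a maximum cut carries at least half of the total edge weight, stated here for ordered pairs.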

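We conclude this section with a remark on the normalized slack matrix proposed in Lemma 2. More precisely, let us revisit the special case of the $n$th permutahedron. Recall that this is the zonotope $Z(A)$ where $A$ is the all-one matrix. From Lemma 5 and the proof thereof, we have that the slack matrix $M$ as defined above satisfies $\|M\|_\infty = \lfloor n^2/4 \rfloor$ and the set of row maxima of $M$ equals $\Delta = \{k(n-k) : k = 1, \dots, \lfloor n/2 \rfloor\}$. One can further show that

$$\sum_{\delta \in \Delta} \frac{1}{\delta} = \sum_{k=1}^{\lfloor n/2 \rfloor} \frac{1}{k(n-k)} = \Theta\!\left(\frac{\log n}{n}\right).$$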

From Lemma 2, we conclude that where is the slack matrix obtained from by normalizing rows independently. Recall that .
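The asymptotics of the sum over the row maxima follow from the partial fraction identity $\frac{1}{k(n-k)} = \frac{1}{n}\left(\frac{1}{k} + \frac{1}{n-k}\right)$. The sketch below checks, in exact rational arithmetic, a closed form derived from this identity; the closed form and the function names are this example's own and do not appear in the source.

```python
from fractions import Fraction

def row_maxima_sum(n):
    """Sum of 1/(k(n-k)) for k = 1, ..., floor(n/2), as an exact fraction."""
    return sum(Fraction(1, k * (n - k)) for k in range(1, n // 2 + 1))

def harmonic(m):
    """m-th harmonic number H_m as an exact fraction."""
    return sum(Fraction(1, k) for k in range(1, m + 1))

def closed_form(n):
    """H_{n-1}/n, plus 2/n^2 when n is even (the middle term k = n/2)."""
    extra = Fraction(2, n * n) if n % 2 == 0 else 0
    return harmonic(n - 1) / n + extra

agree = all(row_maxima_sum(n) == closed_form(n) for n in range(2, 60))
```

Since the harmonic number $H_{n-1}$ grows like $\log n$, the closed form makes the order $\Theta(\log n / n)$ visible directly.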

## 5 Concluding remarks

For both families of polytopes studied in this note and their canonical linear descriptions, we have shown that the hyperplane separation technique is unable to improve on the currently best known lower bounds on their extension complexity. In contrast to the nonnegative rank, the hyperplane separation bound depends on the choice of slack matrix. By making a more careful choice, it is conceivable that the technique does indeed yield more meaningful bounds than the ones in Sections 3 and 4.

In particular, the rows and columns of a given nonnegative matrix can be scaled in such a way that the maximum entry in every row and column equals one. While preserving the nonnegative rank, this strengthens the hyperplane separation bound as argued in Section 2. In other words, the hyperplane separation method produces the strongest lower bounds for slack matrices which have been scaled in this way.

How much can one gain by this? Although Lemma 2 attempts to provide an answer to this question, it is not clear to the author whether the ratio in Lemma 2 attains the given upper bound when applied to the polytopes considered in this note and their canonical slack matrices. For instance, assuming this to hold true (up to a multiplicative constant) for the permutahedron, the hyperplane separation method would be capable of confirming Goemans’ lower bound  in a non-combinatorial way. In the case of the spanning tree polytope of , the improvement factor gained by scaling rows in the sense of Lemma 2 is at most . Of course, the hyperplane separation bound of any slack matrix of is at most .

Acknowledgements. The author is grateful to Andreas S. Schulz and Stefan Weltge for helpful discussions and comments.

## References

[1] M. Aprile, S. Fiorini. Regular matroids have polynomial extension complexity. arXiv:1909.08539 (2019)
[2] R. D. Carr, G. Konjevod. Polyhedral combinatorics. In: H. J. Greenberg (ed.), Tutorials on Emerging Methodologies and Applications in Operations Research, Springer, pp. 2-1–2-46 (2005)
[3] J. E. Cohen, U. G. Rothblum. Nonnegative ranks, decompositions, and factorizations of nonnegative matrices. Linear Algebra Appl. 190, 149–168 (1993)
[4] M. Conforti, G. Cornuéjols, G. Zambelli. Extended formulations in combinatorial optimization. Ann. Oper. Res. 204.1, 97–143 (2013)
[5] M. Conforti, V. Kaibel, M. Walter, S. Weltge. Subgraph polytopes and independence polytopes of count matroids. Oper. Res. Lett. 43.5, 457–460 (2015)
[6] J. Edmonds. Submodular functions, matroids, and certain polyhedra. In: R. Guy, H. Hanani, N. Sauer, J. Schönheim (eds.), Combinatorial Structures and Their Applications, Gordon and Breach, New York, pp. 69–87 (1970)
[7] J. Edmonds. Matroids and the greedy algorithm. Math. Prog. 1.1, 127–136 (1971)
[8] Y. Faenza, S. Fiorini, R. Grappe, H. R. Tiwary. Extended formulations, nonnegative factorizations, and randomized communication protocols. In: Combinatorial Optimization, Springer, pp. 129–140 (2012)
[9] S. Fiorini. Personal communication (2013)
[10] S. Fiorini, T. Huynh, G. Joret, K. Pashkovich. Smaller extended formulations for the spanning tree polytope of bounded-genus graphs. Discrete Comput. Geom. 57.3, 757–761 (2017)
[11] S. Fiorini, V. Kaibel, K. Pashkovich, D. O. Theis. Combinatorial bounds on nonnegative rank and extended formulations. Discrete Math. 313.1, 67–83 (2013)
[12] S. Fiorini, S. Massar, S. Pokutta, H. R. Tiwary, R. D. Wolf. Exponential lower bounds for polytopes in combinatorial optimization. J. ACM 62.2, 17 (2015)
[13] S. Fiorini, T. Rothvoß, H. R. Tiwary. Extended formulations for polygons. Discrete Comput. Geom. 48.3, 658–668 (2012)
[14] S. Fujishige. Submodular Functions and Optimization, 2nd edition, vol. 58 of Annals of Discrete Mathematics, Elsevier (2005)
[15] M. X. Goemans. Smallest compact formulation for the permutahedron. Math. Prog. 153.1, 5–11 (2015)
[16] V. Kaibel. Extended formulations in combinatorial optimization. In: Optima 85 (2011)
[17] V. Kaibel, K. Pashkovich. Constructing extended formulations from reflection relations. In: M. Jünger, G. Reinelt (eds.), Facets of Combinatorial Optimization, Springer, pp. 77–100 (2013)
[18] V. Kaibel, K. Pashkovich, D. O. Theis. Symmetry matters for sizes of extended formulations. SIAM J. Discrete Math. 26.3, 1361–1382 (2012)
[19] V. Kaibel, S. Weltge. A short proof that the extension complexity of the correlation polytope grows exponentially. Discrete Comput. Geom. 53.2, 397–401 (2015)
[20] K. Khoshkhah, D. O. Theis. On the combinatorial lower bound for the extension complexity of the spanning tree polytope. Oper. Res. Lett. 46.3, 352–355 (2018)
[21] R. K. Martin. Using separation algorithms to generate mixed integer model reformulations. Oper. Res. Lett. 10.3, 119–128 (1991)
[22] A. Postnikov, V. Reiner, L. Williams. Faces of generalized permutohedra. Doc. Math. 13, 207–273 (2008)
[23] M. Queyranne. Structure of a simple scheduling polyhedron. Math. Prog. 58.1-3, 263–285 (1993)
[24] T. Rothvoß. Some 0/1 polytopes need exponential size extended formulations. Math. Prog. 142.1-2, 255–268 (2013)
[25] T. Rothvoß. The matching polytope has exponential extension complexity. J. ACM 64.6, 41 (2017)
[26] J. Seif. Bounding techniques for extension complexity. Master's thesis, Université Libre de Bruxelles (2017)
[27] S. Weltge. Sizes of linear descriptions in combinatorial optimization. PhD dissertation, Otto-von-Guericke-Universität Magdeburg (2016)
[28] J. C. Williams. A linear-size zero-one programming model for the minimum spanning tree problem in planar graphs. Networks 39.1, 53–60 (2002)
[29] L. A. Wolsey. Mixed integer programming formulations for production planning and scheduling problems. Invited talk at the 12th International Symposium on Mathematical Programming, MIT, Cambridge (1985)
[30] R. T. Wong. Integer programming formulations of the traveling salesman problem. In: Proceedings of the IEEE International Conference on Circuits and Computers, pp. 149–152 (1980)
[31] M. Yannakakis. Expressing combinatorial optimization problems by linear programs. J. Comput. System Sci. 43.3, 441–466 (1991)
[32] G. M. Ziegler. Lectures on Polytopes, vol. 152 of Graduate Texts in Mathematics, Springer (2012)