A homogeneous polynomial $p \in \mathbb{R}[x_1, \ldots, x_n]$ is said to be hyperbolic with respect to a direction $e \in \mathbb{R}^n$ if $p(e) \neq 0$ and all its univariate restrictions along the direction $e$ are real-rooted, i.e., for every $x \in \mathbb{R}^n$, the polynomial $t \mapsto p(x - te)$ has all real roots. Gårding showed that every hyperbolic polynomial is associated with a closed\footnote{The usual definition considers open cones, but we will work with their closures instead.} convex cone defined as follows:
$$\Lambda_+(p, e) := \{x \in \mathbb{R}^n : \text{all roots of } t \mapsto p(x - te) \text{ are nonnegative}\}.$$
The cone $\Lambda_+(p, e)$ is referred to as the hyperbolicity cone associated with the polynomial $p$.
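The real-rootedness condition is easy to probe numerically. The sketch below is an illustration of ours, not part of the paper's machinery (the function names are invented): it interpolates the restriction $t \mapsto p(x - te)$ at sample points and checks that its roots are numerically real, using $p(x) = x_1 x_2 x_3$ as a hyperbolic example and $x_1^2 + x_2^2 + x_3^2$ as a non-example.

```python
import numpy as np

def restriction_coeffs(p, x, e, d):
    """Coefficients of the degree-d polynomial t -> p(x - t*e), by interpolation."""
    ts = np.arange(d + 1)
    vals = [p(x - t * e) for t in ts]
    return np.polyfit(ts, vals, d)  # highest-degree coefficient first

def looks_hyperbolic(p, e, d, n, trials=200, tol=1e-6, rng=None):
    """Heuristic check: are all sampled restrictions real-rooted?"""
    gen = np.random.default_rng(rng)
    for _ in range(trials):
        x = gen.standard_normal(n)
        roots = np.roots(restriction_coeffs(p, x, e, d))
        if np.max(np.abs(roots.imag)) > tol * (1 + np.max(np.abs(roots))):
            return False
    return True

# p(x) = x_1 x_2 x_3 is hyperbolic w.r.t. e = (1,1,1): the roots of p(x - te) are the x_i.
prod3 = lambda x: x[0] * x[1] * x[2]
print(looks_hyperbolic(prod3, np.ones(3), d=3, n=3, rng=0))   # True

# x_1^2 + x_2^2 + x_3^2 is not hyperbolic w.r.t. (1,1,1): its restrictions have complex roots.
sumsq = lambda x: x[0] ** 2 + x[1] ** 2 + x[2] ** 2
print(looks_hyperbolic(sumsq, np.ones(3), d=2, n=3, rng=0))   # False
```

Such a sampled check is of course only heuristic; the paper's arguments certify real-rootedness for every restriction simultaneously.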
From the standpoint of convex optimization, hyperbolicity cones yield a rich family of convex sets that one can efficiently optimize over — in particular, interior point methods can be used to efficiently optimize over the hyperbolicity cone of a polynomial $p$, given an oracle to evaluate $p$ along with its gradient and Hessian [Gül97, Ren06]. This optimization primitive, referred to as hyperbolic programming, captures linear and semidefinite programming as special cases. Specifically, the hyperbolicity cone of the polynomial $p(x) = x_1 x_2 \cdots x_n$ (with respect to the all-ones direction) is the positive orthant $\mathbb{R}^n_{\geq 0}$, which corresponds to linear programming, while semidefinite programming arises from the determinant polynomial $X \mapsto \det(X)$ on symmetric matrices, whose hyperbolicity cone with respect to the identity is the positive semidefinite cone. Is hyperbolic programming as an algorithmic primitive strictly more powerful than semidefinite programming? Can hyperbolicity cones other than these two be harnessed towards obtaining better algorithms for combinatorial optimization problems? These compelling questions remain open.
Lately, the relationship between hyperbolicity cones and the semidefinite cone has been a subject of intense study. It is easy to see that every linear section of the semidefinite cone is a hyperbolicity cone, so it is natural to ask whether the converse holds, i.e., if every hyperbolicity cone can be realized as a section of the semidefinite cone. Formally, a spectrahedral cone is specified by
$$\Lambda = \{x \in \mathbb{R}^n : x_1 A_1 + x_2 A_2 + \cdots + x_n A_n \succeq 0\}$$
for some symmetric matrices $A_1, \ldots, A_n$. It is conjectured that every hyperbolicity cone is a spectrahedral cone.
(Generalized Lax Conjecture) Every hyperbolicity cone is a spectrahedral cone, i.e., a linear section of the cone of positive semidefinite matrices in some dimension.
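Checking membership in a spectrahedral cone is a single smallest-eigenvalue computation, which is part of what makes these cones algorithmically convenient. A minimal sketch of ours, realizing the positive orthant in $\mathbb{R}^2$ with diagonal matrices $A_i$:

```python
import numpy as np

def in_spectrahedral_cone(x, As, tol=1e-9):
    """Membership test for {x : sum_i x_i A_i >= 0} via the smallest eigenvalue."""
    M = sum(xi * A for xi, A in zip(x, As))
    return bool(np.linalg.eigvalsh(M).min() >= -tol)

# The positive orthant in R^2 as a spectrahedral cone: A_i = e_i e_i^T.
As = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]
print(in_spectrahedral_cone([2.0, 3.0], As))   # True
print(in_spectrahedral_cone([2.0, -1.0], As))  # False
```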
The Lax conjecture in its original stronger algebraic form asked whether every polynomial $p$ in three variables hyperbolic with respect to the direction $e = (1, 0, 0)$ could be written as $p(x) = \det(x_1 M_1 + x_2 M_2 + x_3 M_3)$ for some symmetric matrices $M_1, M_2, M_3$ with $M_1 \succ 0$ (sometimes called a definite determinantal representation). This immediately implies that all hyperbolicity cones in three dimensions are spectrahedral, and was proved by Helton and Vinnikov and Lewis, Parrilo, and Ramana [HV07, LPR05]. The algebraic conjecture is easily seen to be false for $n > 3$ by a count of parameters, since the set of hyperbolic polynomials is known to have nonempty interior [Nui69] and is of dimension $\binom{n+d-1}{d}$, whereas the set of $n$-tuples of symmetric $d \times d$ matrices has dimension $n\binom{d+1}{2}$, which is much smaller for large $d$ when $n > 3$. This led to the weaker conjecture that for every hyperbolic $p$ there is an integer $N$ such that $p^N$ admits a definite determinantal representation, which was disproved by Brändén in [Brä11]. The Generalized Lax conjecture, which is a geometric statement, is equivalent to the yet weaker algebraic statement that for every hyperbolic $p$ there is a hyperbolic $q$ whose hyperbolicity cone contains that of $p$, such that the product $pq$ admits a definite determinantal representation.
For several special classes of hyperbolic polynomials, the corresponding hyperbolicity cones are known to be spectrahedral. The elementary symmetric polynomial of degree $d$ in $n$ variables is given by $e_d(x) = \sum_{S \subseteq [n],\, |S| = d} \prod_{i \in S} x_i$. Brändén [Brä14] showed that the hyperbolicity cones of elementary symmetric polynomials are spectrahedral, with matrices of exponential dimension. If a polynomial is hyperbolic with respect to a direction $e$, then its directional derivatives along $e$ are hyperbolic too [Går59]. Directional derivatives of the polynomial $x_1 x_2 \cdots x_n$ [Zin08, San13, Brä14, COSW04] and the first derivatives of the determinant polynomial [Sau17] are also known to satisfy the generalized Lax conjecture. Amini [Ami16] has shown that certain multivariate matching polynomials are hyperbolic and that their cones are spectrahedral, again with matrices of exponential dimension.
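For concreteness, here is a small numerical illustration of ours (not from the paper) of $e_d$ and of the hyperbolicity of $e_2$ in three variables with respect to the all-ones direction: the coefficients of $\prod_i (1 + x_i t)$ recover $e_d(x)$, and the restriction $t \mapsto e_2(x - t\mathbf{1})$ is real-rooted.

```python
import numpy as np

def elementary_symmetric(xs, d):
    """e_d(xs), read off as the coefficient of t^d in prod_i (1 + xs[i] * t)."""
    poly = np.array([1.0])  # coefficients listed in increasing degree
    for x in xs:
        poly = np.convolve(poly, [1.0, x])
    return poly[d]

x = np.array([1.0, 2.0, 3.0])
print(elementary_symmetric(x, 2))  # 1*2 + 1*3 + 2*3 = 11.0

# Restriction along the all-ones direction: fit t -> e_2(x - t) and inspect its roots.
ts = np.array([0.0, 1.0, 2.0])
vals = [elementary_symmetric(x - t, 2) for t in ts]
roots = np.roots(np.polyfit(ts, vals, 2))
print(np.allclose(roots.imag, 0.0))  # True: the restriction is real-rooted
```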
Given the above examples, it is natural to wonder whether an exponential blowup in dimension is an essential feature of passing from hyperbolicity cones to spectrahedral representations; even assuming the generalized Lax conjecture to be true, the size of the spectrahedral representation of a hyperbolicity cone is interesting from a complexity standpoint. In this paper, we obtain exponential lower bounds on the size of the spectrahedral representation in general, even if the representation is only required to be approximate (up to an exponentially small error).
Recall that the Hausdorff distance between two cones $\Lambda_1$ and $\Lambda_2$ is defined as
$$d_H(\Lambda_1, \Lambda_2) := \max\left\{ \sup_{x \in \Lambda_1 \cap B} \operatorname{dist}(x, \Lambda_2),\ \sup_{y \in \Lambda_2 \cap B} \operatorname{dist}(y, \Lambda_1) \right\},$$
where $B$ is the unit ball in $\mathbb{R}^n$. We say that a spectrahedral cone $\Lambda'$ is an $\epsilon$-approximate spectrahedral representation of another cone $\Lambda$ if $d_H(\Lambda, \Lambda') \leq \epsilon$. Our main theorem is the following.
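The arguments below bound this distance analytically; numerically, one can estimate it by sampling the two cones inside the unit ball. The sketch below is entirely ours and compares two planar cones of different apertures, where the exact distance is $\sin(\pi/6) = 1/2$, attained at the unit vector along angle $\pi/2$.

```python
import numpy as np

def hausdorff_estimate(pts1, pts2):
    """Symmetric Hausdorff distance between two finite point clouds."""
    d = np.linalg.norm(pts1[:, None, :] - pts2[None, :, :], axis=2)
    return max(d.min(axis=1).max(), d.min(axis=0).max())

rng = np.random.default_rng(1)

def cone_ball_points(theta_max, m=1500):
    """Samples from the sector {angle in [0, theta_max]} intersected with the unit disk."""
    r = np.sqrt(rng.uniform(0, 1, m))
    th = rng.uniform(0, theta_max, m)
    return np.column_stack([r * np.cos(th), r * np.sin(th)])

# Quarter-plane vs. 60-degree cone: sampling slightly undershoots the true value 0.5.
est = hausdorff_estimate(cone_ball_points(np.pi / 2), cone_ball_points(np.pi / 3))
print(0.3 < est < 0.6)  # True
```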
Theorem 2 (Main Theorem).
There exists an absolute constant such that for all sufficiently large , there exists an n-variate degree hyperbolic polynomial whose hyperbolicity cone does not admit an approximate spectrahedral representation of dimension , for .
Our proof is analytic and does not rely on algebraic obstacles to representability; in fact the polynomials we construct have very simple coefficients (they are essentially binary perturbations of the elementary symmetric polynomial). However, since there are exponentially many coefficients, the examples require exponentially many bits to write down. It is still unknown whether the elementary symmetric polynomial itself admits a low-dimensional spectrahedral representation, though Saunderson and Parrilo [SP15] have shown that if one allows projections of sections of semidefinite cones, then such a representation of much smaller size exists.
Algebraically, the notion of an (exact) spectrahedral representation of the hyperbolicity cone of a polynomial $p$ corresponds to an algebraic identity of the form
$$q(x) \cdot p(x) = \det(x_1 A_1 + \cdots + x_n A_n),$$
where $q$ is hyperbolic and its hyperbolicity cone contains that of $p$. Thus, our main theorem implies the existence of a polynomial such that the degree of any identity of the above form is exponentially large.
1.1 Proof Overview
The starting point for our proof is the theorem of Nuij [Nui69], which says that the space of hyperbolic polynomials of degree in variables has a nonempty interior, immediately implying that it has dimension . The Generalized Lax Conjecture concerns the cones of these polynomials, which are geometric rather than algebraic objects. If we could show that this space of cones also has large “dimension” in some appropriate quantitative sense, and that the maps between the hyperbolicity cones and their spectrahedral representations are suitably well-behaved, then it would rule out the existence of small spectrahedral representations for all of them uniformly, since such representations are parameterized by tuples of matrices, which have dimension .
The difficulties in turning this idea into a proof are: (1) there are no quantitative bounds on Nuij's theorem; (2) the space of hyperbolicity cones is hard to parameterize, and it is not clear how to define its dimension; (3) the mapping from hyperbolicity cones to their representations can be arbitrary, and needn't preserve the usual notions of dimension anyway. We surmount these difficulties by a packing argument, which consists of the following steps.
Exhibit a large family of hyperbolicity cones, every pair of which is at least some fixed distance apart in the Hausdorff metric between the cones.
Show that the spectrahedral representations of two distant cones and are distant from each other, once the representations are appropriately normalized. Formally, the matrices and representing the two cones are at least away in operator norm if (see Lemma 14).
We work only with cones which contain the positive orthant in order to ensure that they have normalized representations.
By considering the volume, there is an upper bound on the number of pairwise distant spectrahedral representations in matrices of a given dimension, thus giving the lower bound on the dimension.
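The volume comparison in the last step is the standard packing bound: the number of pairwise $\epsilon$-separated points in the unit ball of a $D$-dimensional space is at most $(1 + 2/\epsilon)^D$, so a family of $N$ such representations forces $D \geq \ln N / \ln(1 + 2/\epsilon)$. A sketch of this arithmetic for symmetric $k \times k$ matrices ($D = k(k+1)/2$), with purely illustrative numbers of our own choosing:

```python
from math import log

def min_matrix_dim(log2_family_size, eps):
    """Smallest k with (1 + 2/eps)^(k(k+1)/2) >= 2^log2_family_size (packing bound)."""
    needed = log2_family_size * log(2) / log(1 + 2 / eps)  # lower bound on k(k+1)/2
    k = 1
    while k * (k + 1) / 2 < needed:
        k += 1
    return k

# A family of 2^100 pairwise (1/8)-separated matrices forces k(k+1)/2 >= 100 ln 2 / ln 17.
print(min_matrix_dim(100, 1 / 8))  # 7
```

The paper applies the same inequality with a far larger family of cones, which is what converts the packing bound into an exponential lower bound on the matrix dimension.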
By far, the most technical part of the proof is the first step, exhibiting a large family of pairwise distant hyperbolicity cones. The set of hyperbolic polynomials is known to have a nonempty interior in the set of all homogeneous polynomials of a given degree. Although this implies the existence of a full-dimensional family of hyperbolic polynomials, it is not clear how far apart they are quantitatively. Moreover, without any further understanding of the structure of the polynomials, it is difficult to argue that their cones are far in the Hausdorff metric. To this end, we work with an explicit family of hyperbolic polynomials which are all perturbations of the elementary symmetric polynomial, whose cones we are able to understand. Specifically, we will show the following:
There exists an explicit family of perturbations of the degree-$d$ elementary symmetric polynomial that are all hyperbolic, and pairwise distant from each other. These perturbations are indexed by a hypercube of large dimension, as depicted in Figure 1. The subspace of perturbations is carefully chosen to preserve real-rootedness of all the restrictions, thereby preserving the hyperbolicity of the polynomial (see Section 2), as well as to yield perturbations of an especially simple structure.
The hyperbolicity cones for every pair of polynomials in the family are far from each other. In order to lower bound the Hausdorff distance between these cones, we identify an explicit set of points on the boundary of the hyperbolicity cone as markers. We will lower bound the perturbation of these markers as the polynomial is perturbed, in order to argue that the corresponding hyperbolicity cone is also perturbed (see Section 4). Again, the structure of the chosen subspace ensures that there are no “interactions” between the markers, and the analysis is reduced to understanding the perturbation of a single univariate Jacobi polynomial.
2 Many Hyperbolic Perturbations of
In this section we prove that even though the elementary symmetric polynomial is not in the interior of the set of hyperbolic polynomials, there is a large subspace of homogeneous polynomials of degree $d$ in $n$ variables such that all sufficiently small perturbations within this subspace remain hyperbolic.
The subspace will be spanned by certain homogeneous polynomials corresponding to matchings. For any matching containing edges on , define the polynomial
We say that a matching $M$ fully crosses a subset $S$ of $[n]$ if every edge of $M$ has exactly one endpoint in $S$.
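The fully-crossing condition is a simple cut condition on the edges of the matching; a minimal illustration of ours:

```python
def fully_crosses(matching, S):
    """True iff every edge of the matching has exactly one endpoint in S."""
    S = set(S)
    return all((u in S) != (v in S) for u, v in matching)

M = [(1, 2), (3, 4)]             # a matching on {1, 2, 3, 4}
print(fully_crosses(M, {1, 3}))  # True: each edge has exactly one endpoint in S
print(fully_crosses(M, {1, 2}))  # False: the edge (1, 2) lies entirely inside S
```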
Lemma 3 (Many Uniquely Crossing Matchings).
There is a set of matchings on of size at least
and a set of subsets such that for every there is a unique matching which fully crosses it, and for every there is at least one which it fully crosses.
Moreover, for every indicator vector, there is a unique such that , so the dimension of the span of
is exactly .
Let denote the set of all matchings on and let be the set of all subsets of . Let
be the number of matchings fully crossing a fixed set . Let be a random subset of
in which each matching is included independently with some probability to be determined later, and let
be the corresponding indicator random variable. For any set, define the random variable
and observe that
Call a set good if and let be the number of good . Observe that
Setting we therefore have
so with nonzero probability there are at least good subsets.
Let be the set of good subsets. Finally, remove from all the matchings that do not fully cross a good subset. Since there are at least good subsets, every good subset fully crosses at least one matching in , and at most subsets cross any given matching, the number of matchings which remain in is at least
The moreover part is seen by observing that if and only if fully crosses . ∎
In fact, the dimension of the span of all of the polynomials is exactly , by properties of the Johnson Scheme, but in order to obtain the additional property above we have restricted to .
Henceforth we fix to be a basis of and define the norm of a polynomial
as . We will use the notation
to denote the product of the largest entries of a vector in absolute value, and occasionally we will write and for a real-rooted polynomial , which means applying or to its roots. The operator refers to differentiation with respect to .
The main theorem of this section is:
Theorem 5 (Effective Relative Nuij Theorem for ).
then is hyperbolic with respect to .
Recalling the definition of hyperbolicity, our task is to show that all of the restrictions
remain real-rooted after perturbation by . Many of these restrictions lie on the boundary of the set of (univariate) real-rooted polynomials, or arbitrarily close to it, so it is not possible to simply discretize the set of by a net and choose to be uniformly small on this net; one must instead carry out a more delicate restriction-specific analysis which shows that for small , the perturbation is less than the distance of to the boundary of the set of real-rooted polynomials, for each . Since we are comparing vanishingly small quantities, it is not a priori clear that such an approach will yield an effective bound on depending only on and ; Lemma 6 shows that this is indeed possible.
Proof of Theorem 5.
Fix any nonzero vector and perturbation with and consider the perturbed restriction:
Let and observe that since is translation invariant, we have , so in fact
Let be the largest constant such that is real-rooted for all (note that could be zero if has a repeated root). It is sufficient to show that . Observe that
since the boundary of the set of real-rooted polynomials consists of polynomials with repeated roots, and at any double root of the we have . Let be the minimizer in (1) and replace by , noting that this translates to and does not change or , so that we now have:
On the other hand, observe that:
Thus we have as long as or
which is implied by
by Lemma 6, as advertised. ∎
The following lemma may be seen as a quantitative version of the fact that if a real-rooted polynomial has two consecutive zero coefficients then it must have a root of multiplicity at zero.
If satisfies then
Let and let , noting that is real-rooted of degree exactly
. Assume for the moment that all of the roots are distinct and that for all (note that this is equivalent to assuming that the last coefficients of are nonzero). Note that these conditions imply that all of the polynomials have distinct roots, since differentiation cannot increase the multiplicity of a root.
If then the claim is trivially true since
Observe that behaves predictably under differentiation:
We will show by induction that:
Case Let and be the smallest (in magnitude) negative and positive roots of , respectively. Let be the unique root of between and ; assume without loss of generality that (otherwise consider the polynomial ). Let and be the smallest in magnitude negative and positive roots of other than , so that . There are two subcases, depending on whether or not is close to zero — if it is, then it prevents any root from shrinking too much under differentiation, and if it is not, the hypothesis shows that and are comparable, which also yields the conclusion.
Subcase . In this case we may assume that is witnessed by the roots of excluding , call this set , losing a factor of at most . Observe that every root in is separated from zero by another root of , so such roots shrink by at most under differentiation by Lemma 8. Noting that , we have by interlacing that:
and our task is reduced to showing is not small compared to
The hypothesis implies that ; applying Lemma 8, we find that the magnitudes of the innermost roots of must be comparable:
We now have
so we conclude that
Case . We proceed by induction. Assume is witnessed by a set of roots , where contains negative roots and contains positive ones. If there is a negative root not in and a positive root not in then as before every root in is separated from zero by another root of , and by Lemma 7 we have
so we are done. So assume all of the negative roots are contained in ; since this implies that there are at least two positive roots not in ; let be the largest positive root not contained in . Let and be the negative and positive roots of of least magnitude. There are two cases:
. This means that we can delete from and add to , and reduce to the previous situation, incurring a loss of at most , which means by (4):
. By Lemma 7, the smallest in magnitude negative root of has magnitude at least
and all the positive roots decrease by at most upon differentiating by Lemma 7, whence we have
To finish the proof, the requirements that for all and that all coordinates of are distinct may be removed by a density argument, since the set of for which this is true is dense in the set of satisfying .
We suspect that the dependence on and in the above lemma can be improved, and it is even plausible that it holds with a polynomial rather than exponential dependence of on . Since we do not know how to do this at the moment, we have chosen to present the simplest proof we know, without trying to optimize the parameters.
The following lemma is a quantitative version of the fact that the roots of the derivative of a polynomial interlace its roots.
Lemma 8 (Quantitative Interlacing).
If is real rooted of degree then every root of between two distinct consecutive roots of divides the line segment between them in at most the ratio .
Begin by recalling that if has distinct roots then the roots of satisfy for :
Note that the solution is monotone increasing in the on the LHS and monotone decreasing in the on the RHS. Thus, is at least the solution to:
which means that it is at least . A similar argument shows that it is at most . Adding the common roots of and back in, we conclude that these inequalities are satisfied by all of the .∎
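Lemma 8 sharpens the classical interlacing theorem (each root of $p'$ lies strictly between consecutive distinct roots of $p$), which is easy to confirm numerically; a minimal NumPy check on an example of our choosing:

```python
import numpy as np

# Interlacing: between consecutive roots of a real-rooted p there is a root of p'.
roots = np.array([-3.0, -1.0, 2.0, 5.0])
p = np.poly(roots)                               # monic polynomial with these roots
dp_roots = np.sort(np.roots(np.polyder(p)).real)
print(np.all((roots[:-1] < dp_roots) & (dp_roots < roots[1:])))  # True
```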
3 Separation in Restriction Distance
For a parameter to be chosen later, let
For any polynomial hyperbolic with respect to , define the restriction embedding
If and are distinct then both and are hyperbolic with respect to and
Since , we have so by Theorem 5 both of them must be hyperbolic with respect to .
Since , suppose and for some matching , and let be a set which fully crosses . By Lemma 3, is the only matching in which crosses , so we have and for all other . Thus, along the restriction , one has
where we have again used that the are translation invariant. Note that has positive leading coefficient, so subtracting a constant from it increases its largest root. Thus, our task is reduced to showing that
To analyze the behavior of this perturbation, first we note that
Since has roots in and the roots of the derivative of a polynomial interlace its roots, we immediately conclude that the zeros of satisfy . Again by interlacing, we see that and for , whence is monotone and convex above . Thus, is the least such that . Let be a parameter to be set later. Then either or we have by convexity that
The first term is zero and we can upper bound the second term as:
since for every . Thus, we have
whenever , which is less than for . This means that in either case we must have
as desired. ∎
4 Separation in Hausdorff Distance
Suppose . There must be some restriction along which the intersections with the boundaries of the cones differ by . Moreover, this restriction must be fully crossed by some matching for which or is nonzero, since otherwise we would have . By Lemma 3, there is a unique such matching, and we may assume that and , and moreover for all other , so that we have
where is the largest root of the polynomial from (5).
denote the corresponding points in on the boundaries of and , respectively. Let
be the hyperplane tangent to at . Let be a unit vector normal to . Since the hyperbolicity cones are convex, the distance from to is at least the distance to , which is given by:
Normalizing so that is a unit vector, we obtain:
so if we can prove a uniform lower bound on this quantity over all and corresponding to , then we are done.
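The quantity being bounded is the usual point-to-hyperplane distance: for a unit normal $\nu$ at a boundary point $x$, the distance from $y$ to the tangent hyperplane is $|\langle \nu, y - x \rangle|$. A small numeric illustration of ours, using the level set $x_1 x_2 = 1$ as a stand-in for the boundary:

```python
import numpy as np

def dist_to_tangent(grad, x, y):
    """Distance from y to the hyperplane through x with normal grad
    (the tangent hyperplane of a level set at x)."""
    nu = grad / np.linalg.norm(grad)
    return abs(nu @ (y - x))

# Level set x1*x2 = 1 at x = (1,1): gradient (x2, x1) = (1,1), tangent x1 + x2 = 2.
x = np.array([1.0, 1.0])
grad = np.array([1.0, 1.0])
print(round(dist_to_tangent(grad, x, np.array([2.0, 2.0])), 6))  # sqrt(2) ~ 1.414214
```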
Computing the normal we find that
The first inner product is just the directional derivative of along at , which is:
by Lemma 11, proven below.
We now prove crude upper bounds on , which will be negligible when is small. First we have
since because has roots in and
To control , we compute the th coordinate of :
where zero if and if . Since has coordinates in , for all and
Applying the triangle inequality and noting that gives
Finally, we have
where is the vector obtained by deleting the coordinate of . Since these coordinates are bounded in magnitude by , the above norm is bounded by
and applying the triangle inequality once more we get
Lemma 11 (Sensitivity of Jacobi Root).
Let be the largest root of . Then
The proof is deferred to the appendix.
5 Separation of Matrix Parameterizations
Given , define the cone
Here is said to be a spectrahedral representation of the cone .
A spectrahedral representation of a cone is said to be normalized if
If a spectrahedral cone contains the positive orthant then admits a normalized representation.
Let be a spectrahedral representation of . Let be the subspace in the kernel of all the , and let denote the projection onto . It is easy to check that for all ,
By a change of basis chosen to contain a basis for , we obtain matrices in a smaller dimension such that
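The excerpt does not spell out the normalization, so the sketch below shows one natural choice, which is an assumption on our part: congruence by $M^{-1/2}$ with $M = \sum_i A_i$, which is positive definite when the cone contains the all-ones point in its interior. Congruence preserves positive semidefiniteness, hence the cone, while making the matrices sum to the identity.

```python
import numpy as np

def normalize_representation(As):
    """Congruence-normalize so the matrices sum to the identity.
    Assumes M = sum(As) is positive definite, which holds when the spectrahedral
    cone contains the point (1, ..., 1) in its interior."""
    M = sum(As)
    w, V = np.linalg.eigh(M)
    M_inv_half = V @ np.diag(w ** -0.5) @ V.T
    # Congruence A -> M^{-1/2} A M^{-1/2} preserves {x : sum x_i A_i >= 0}.
    return [M_inv_half @ A @ M_inv_half for A in As]

As = [np.array([[2.0, 0.0], [0.0, 1.0]]), np.array([[1.0, 0.5], [0.5, 1.0]])]
Bs = normalize_representation(As)
print(np.allclose(sum(Bs), np.eye(2)))  # True
```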