A polynomial optimization problem (POP) is an optimization problem of the form
$$\inf_{x \in \mathbb{R}^n} \; p(x) \quad \text{s.t.} \quad g_i(x) \ge 0, \; i = 1, \ldots, m, \tag{1}$$
where $p, g_1, \ldots, g_m$ are polynomial functions in $n$ variables with real coefficients. It is well known that polynomial optimization is hard to solve in general. For example, simply testing whether the optimal value of problem (1) is smaller than or equal to some rational number $k$ is NP-hard already when the objective is quadratic and the constraints are linear. Nevertheless, these problems remain topical due to their numerous applications throughout engineering, operations research, and applied mathematics (see, e.g., [15, 6, 2]). In this paper, we are interested in obtaining lower bounds on the optimal value of problem (1). We focus on a class of methods which construct hierarchies of tractable convex optimization problems whose optimal values are lower bounds on the optimal value of (1), with convergence to it as the sequence progresses. This implies that even though the original POP is nonconvex, one can obtain increasingly accurate lower bounds on its optimal value by solving convex optimization problems.
One method for constructing these hierarchies that has gained attention in recent years relies on the use of Positivstellensätze (see, e.g.,  for a survey). Positivstellensätze are algebraic identities that certify infeasibility of a set of polynomial inequalities, or equivalently, positivity of a polynomial on a basic semialgebraic set. (Recall that a basic semialgebraic set is a set defined by finitely many polynomial inequalities. The two viewpoints are equivalent because the set $\{x : g_1(x) \ge 0, \ldots, g_m(x) \ge 0\}$ is empty if and only if $-g_1(x) > 0$ on the set $\{x : g_2(x) \ge 0, \ldots, g_m(x) \ge 0\}$.) These Positivstellensätze can be used to prove lower bounds on POPs. Indeed, if we denote the feasible set of (1) by $S$, the optimal value of problem (1) can be written as
$$\sup_{\gamma} \; \gamma \quad \text{s.t.} \quad p(x) - \gamma \ge 0, \; \forall x \in S. \tag{2}$$
Hence if $\gamma$ is a strict lower bound on (1), we have that $p(x) - \gamma > 0$ on the feasible set $S$ of (1), a fact that can be certified using Positivstellensätze. At a conceptual level, hierarchies that provide lower bounds on (1) are constructed as follows: we fix the “size of the certificate” at each level of the hierarchy and search for the largest $\gamma$ such that the Positivstellensatz at hand can certify positivity of $p(x) - \gamma$ over $S$ with a certificate of this size. As the sequence progresses, we increase the size of the certificates allowed, hence obtaining increasingly accurate lower bounds on (1).
Below, we present three of the better-known Positivstellensätze, given respectively by Stengle, Schmüdgen, and Putinar. These all rely on sum of squares certificates. We recall that a polynomial is a sum of squares (sos) if it can be written as a sum of squares of other polynomials. We start with Stengle’s Positivstellensatz, which certifies infeasibility of a set of polynomial inequalities. It is sometimes referred to as “the Positivstellensatz” in related literature as it requires no assumptions, in contrast to Schmüdgen’s and Putinar’s theorems, which can be viewed as refinements of Stengle’s result under additional assumptions. This Positivstellensatz was in fact discovered by Krivine in 1964 and rediscovered by Stengle later; see [24, Section 4.7] for a more complete history of this result. (We thank an anonymous referee for pointing this out to us and for providing us with the appropriate references.)
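Theorem 1.1 (Stengle’s Positivstellensatz)
The closed basic semialgebraic set
$$S = \{x \in \mathbb{R}^n : g_1(x) \ge 0, \ldots, g_m(x) \ge 0\}$$
is empty if and only if there exist sum of squares polynomials $s_0(x), s_1(x), \ldots, s_m(x), s_{12}(x), s_{13}(x), \ldots, s_{123\ldots m}(x)$ such that
$$-1 = s_0(x) + \sum_{i} s_i(x) g_i(x) + \sum_{\{i,j\}} s_{ij}(x) g_i(x) g_j(x) + \cdots + s_{123\ldots m}(x) g_1(x) \cdots g_m(x).$$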
The next two theorems, due to Schmüdgen and Putinar, certify positivity of a polynomial over a closed basic semialgebraic set $S$. Compared to Stengle’s Positivstellensatz, they impose additional compactness assumptions.
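Theorem 1.2 (Schmüdgen’s Positivstellensatz)
Assume that the set
$$S = \{x \in \mathbb{R}^n : g_1(x) \ge 0, \ldots, g_m(x) \ge 0\}$$
is compact. If a polynomial $p$ is positive on $S$, then
$$p(x) = s_0(x) + \sum_{i} s_i(x) g_i(x) + \sum_{\{i,j\}} s_{ij}(x) g_i(x) g_j(x) + \cdots + s_{123\ldots m}(x) g_1(x) \cdots g_m(x),$$
where $s_0(x), s_1(x), \ldots, s_m(x), s_{12}(x), s_{13}(x), \ldots, s_{123\ldots m}(x)$ are sums of squares.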
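Theorem 1.3 (Putinar’s Positivstellensatz)
Let
$$S = \{x \in \mathbb{R}^n : g_1(x) \ge 0, \ldots, g_m(x) \ge 0\}$$
and assume that $\{g_1, \ldots, g_m\}$ satisfy the Archimedean property, i.e., there exists $N \in \mathbb{N}$ such that
$$N - \sum_{i=1}^n x_i^2 = \sigma_0(x) + \sigma_1(x) g_1(x) + \cdots + \sigma_m(x) g_m(x),$$
where $\sigma_0(x), \sigma_1(x), \ldots, \sigma_m(x)$ are sums of squares. If a polynomial $p$ is positive on $S$, then
$$p(x) = s_0(x) + s_1(x) g_1(x) + \cdots + s_m(x) g_m(x),$$
where $s_0(x), s_1(x), \ldots, s_m(x)$ are sums of squares.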
Note that these three Positivstellensätze involve in their expressions sum of squares polynomials of unspecified degree. To construct hierarchies of tractable optimization problems for (2), we fix this degree: at level $r$, we search for the largest $\gamma$ such that positivity of $p(x) - \gamma$ over $S$ can be certified using the Positivstellensätze where the degrees of all sos polynomials are taken to be less than or equal to $2r$. Solving each level of these hierarchies is then a semidefinite program (SDP). This is a consequence of the fact that one can optimize over (or test membership in) the set of sum of squares polynomials of fixed degree using semidefinite programming [21, 20, 14]. Indeed, a polynomial $p$ of degree $2d$ in $n$ variables is a sum of squares if and only if there exists a symmetric matrix $Q \succeq 0$ such that $p(x) = z(x)^T Q z(x)$, where
$$z(x) = (1, x_1, \ldots, x_n, \ldots, x_n^d)^T$$
is the standard vector of monomials in $n$ variables and of degree less than or equal to $d$.
We remark that the hierarchy obtained from Stengle’s Positivstellensatz was proposed and analyzed by Parrilo in ; the hierarchy obtained from Putinar’s Positivstellensatz was proposed and analyzed by Lasserre in . There have been more recent works that provide constructive proofs of Schmüdgen’s and Putinar’s Positivstellensätze; see [5, 28, 30]. These proofs rely on other Positivstellensätze, e.g., a result by Pólya (see Theorem 1.6 below) in [28, 30], and the same result by Pólya, Farkas’ lemma, and Stengle’s Positivstellensatz in . We would like to thank an anonymous referee for pointing out that the construction in  can be used to develop converging hierarchies of lower bounds for POPs with compact feasible sets. These hierarchies rely on Gröbner basis computations and linear programs involving only two variables. Some experiments with this technique were carried out by Datta, and Averkov has more recently shown that the (potentially expensive) Gröbner basis computations can be avoided in this approach. Other recent research efforts relating to Positivstellensätze have focused on deriving complexity bounds for Schmüdgen’s and Putinar’s Positivstellensätze; see [18, 29].
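The Gram matrix characterization above can be illustrated with a small numerical sketch. The helper names (`monomial_exponents`, `monomial_vector`, `gram_form`) and the example polynomial are assumptions made for illustration and do not come from the paper. The matrix `Q` is built as an outer product, so it is positive semidefinite by construction rather than found by an SDP solver.

```python
from itertools import product
from math import comb


def monomial_exponents(n, d):
    """Exponent tuples of all monomials in n variables of degree <= d."""
    exps = [e for e in product(range(d + 1), repeat=n) if sum(e) <= d]
    # Sort by total degree, then so that x_1 comes before x_2, etc.,
    # which gives the ordering 1, x_1, ..., x_n, x_1^2, ...
    return sorted(exps, key=lambda e: (sum(e), [-k for k in e]))


def monomial_vector(x, exps):
    """Evaluate the monomial vector z(x) at a numeric point x."""
    z = []
    for e in exps:
        value = 1.0
        for xi, k in zip(x, e):
            value *= xi ** k
        z.append(value)
    return z


def gram_form(Q, z):
    """Evaluate the quadratic form z^T Q z."""
    size = len(z)
    return sum(Q[i][j] * z[i] * z[j] for i in range(size) for j in range(size))


def p(x):
    """Illustrative polynomial p(x1, x2) = (x1^2 + x2^2)^2, of degree 2d = 4."""
    return (x[0] ** 2 + x[1] ** 2) ** 2


exps = monomial_exponents(2, 2)
assert len(exps) == comb(2 + 2, 2)  # there are C(n + d, d) monomials

# Coefficients of x1^2 + x2^2 in the basis z(x); Q = c c^T is PSD by construction.
c = [1.0 if e in ((2, 0), (0, 2)) else 0.0 for e in exps]
Q = [[ci * cj for cj in c] for ci in c]
```

Evaluating `gram_form(Q, monomial_vector(x, exps))` at any point `x` reproduces `p(x)`, which is exactly the identity $p(x) = z(x)^T Q z(x)$. In an actual hierarchy, `Q` would be a decision variable of a semidefinite program instead of being fixed in advance.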
On a historical note, Stengle, Schmüdgen, and Putinar’s Positivstellensätze were derived in the latter half of the 20th century. As mentioned previously, they all certify positivity of a polynomial over an arbitrary basic semialgebraic set (modulo compactness assumptions). By contrast, there are Positivstellensätze from the early 20th century that certify positivity of a polynomial globally. Perhaps the most well-known Positivstellensatz of this type is due to Artin in 1927, in response to Hilbert’s 17th problem. Artin shows that any nonnegative polynomial is a sum of squares of rational functions. Here is an equivalent formulation of this statement:
Theorem 1.4 (Artin)
For any nonnegative polynomial $p$, there exists a nonzero sos polynomial $q$ such that $p \cdot q$ is a sum of squares.
To the best of our knowledge, in this area, all converging hierarchies of lower bounds for POPs are based on Positivstellensätze that certify nonnegativity of a polynomial over an arbitrary basic semialgebraic set. In this paper, we show that in fact, under compactness assumptions, it suffices to have only global certificates of nonnegativity (such as the one given by Artin) to produce a converging hierarchy for general POPs. As a matter of fact, even weaker statements that apply only to globally positive (as opposed to globally nonnegative) forms are enough to derive converging hierarchies for POPs. Examples of such statements are due to Habicht and Reznick. With such an additional positivity assumption, more can usually be said about the structure of the multiplier polynomial $q$ in Artin’s result. Below, we present the result by Reznick.
Theorem 1.5 (Reznick)
For any positive definite form $p$, there exists $r \in \mathbb{N}$ such that $p(x) \cdot \left(\sum_i x_i^2\right)^r$ is a sum of squares.
We show in this paper that this Positivstellensatz also gives rise to a converging hierarchy for POPs with a compact feasible set similarly to the one generated by Artin’s Positivstellensatz.
Through their connections to sums of squares, the two hierarchies obtained using the theorems of Reznick and Artin are semidefinite programming-based. In this paper, we also derive an “optimization-free” converging hierarchy for POPs with compact feasible sets where each level of the hierarchy only requires that we be able to test nonnegativity of the coefficients of a given fixed polynomial. To the best of our knowledge, this is the first converging hierarchy of lower bounds for POPs which does not require that convex optimization problems be solved at each of its levels. To construct this hierarchy, we use a result of Pólya, which, just like Artin’s and Reznick’s Positivstellensätze, certifies global positivity of forms. However, this result is restricted to even forms. Recall that a form $p$ is even if each of the variables featuring in its individual monomials has an even power. This is equivalent (see [8, Lemma 2]) to $p$ being invariant under change of sign of each of its coordinates, i.e.,
$$p(x_1, \ldots, x_n) = p(\pm x_1, \ldots, \pm x_n).$$
Theorem 1.6 (Pólya)
For any positive definite even form $p$, there exists $r \in \mathbb{N}$ such that $p(x) \cdot \left(\sum_i x_i^2\right)^r$ has nonnegative coefficients.
A perhaps better-known but equivalent formulation of this theorem is the following: for any form $h$ that is positive on the standard simplex, there exists $r \in \mathbb{N}$ such that $h(x) \cdot \left(\sum_i x_i\right)^r$ has nonnegative coefficients. The two formulations are equivalent by simply letting $p(x) = h(x_1^2, \ldots, x_n^2)$. The latter formulation has been used to derive similar optimization-free converging hierarchies of lower bounds for polynomial minimization problems over the simplex; see, e.g., [9, 10].
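Both formulations of Pólya’s theorem reduce to multiplying polynomials and inspecting coefficient signs, which the following sketch illustrates. Polynomials are stored as dictionaries from exponent tuples to coefficients. The function names and the example form are assumptions chosen for illustration, not taken from the paper.

```python
def poly_mul(a, b):
    """Multiply polynomials stored as {exponent tuple: coefficient}."""
    out = {}
    for ea, ca in a.items():
        for eb, cb in b.items():
            e = tuple(i + j for i, j in zip(ea, eb))
            out[e] = out.get(e, 0) + ca * cb
    # Drop monomials whose coefficients cancelled.
    return {e: c for e, c in out.items() if c != 0}


def has_nonnegative_coefficients(poly):
    return all(c >= 0 for c in poly.values())


def polya_exponent(form, multiplier, max_r):
    """Smallest r <= max_r such that form * multiplier**r has nonnegative
    coefficients, or None if no such r is found in that range."""
    current = dict(form)
    for r in range(max_r + 1):
        if has_nonnegative_coefficients(current):
            return r
        current = poly_mul(current, multiplier)
    return None


def substitute_squares(form):
    """Return p(x) = h(x_1^2, ..., x_n^2): every exponent is doubled."""
    return {tuple(2 * k for k in e): c for e, c in form.items()}


# h(x1, x2) = x1^2 - x1*x2 + x2^2 is positive on the standard simplex.
h = {(2, 0): 1, (1, 1): -1, (0, 2): 1}
simplex_multiplier = {(1, 0): 1, (0, 1): 1}   # x1 + x2
p = substitute_squares(h)                     # x1^4 - x1^2*x2^2 + x2^4 (even form)
sphere_multiplier = {(2, 0): 1, (0, 2): 1}    # x1^2 + x2^2

r_simplex = polya_exponent(h, simplex_multiplier, max_r=10)
r_even = polya_exponent(p, sphere_multiplier, max_r=10)
```

For this example a single multiplication suffices in both formulations, since $(x_1 + x_2)(x_1^2 - x_1 x_2 + x_2^2) = x_1^3 + x_2^3$. The cap `max_r` is only a safeguard for the sketch; the theorem guarantees that some finite exponent exists but does not bound it here.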
Our aforementioned optimization-free hierarchy also enables us to obtain linear programming (LP) and second-order cone programming (SOCP)-based hierarchies for general POPs with compact feasible sets that rely on the concepts of dsos and sdsos polynomials. These are recently introduced inner approximations to the set of sos polynomials that have shown much better scalability properties in practice .
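To give a rough sense of why dsos conditions lead to linear programs, the sketch below assumes the usual characterization from the dsos/sdsos literature, which is not restated in this section: a polynomial is dsos when it admits a Gram matrix $Q$ (in the sense of $p(x) = z(x)^T Q z(x)$) that is diagonally dominant. Diagonal dominance is a set of linear inequalities in the entries of $Q$. The function name and example matrices are hypothetical.

```python
def is_diagonally_dominant(Q):
    """True if Q is symmetric and Q[i][i] >= sum_{j != i} |Q[i][j]| for all i."""
    n = len(Q)
    symmetric = all(Q[i][j] == Q[j][i] for i in range(n) for j in range(n))
    dominant = all(
        Q[i][i] >= sum(abs(Q[i][j]) for j in range(n) if j != i)
        for i in range(n)
    )
    return symmetric and dominant


# With z(x) = (1, x), Q_good represents 2 - 2x + 2x^2.
Q_good = [[2, -1], [-1, 2]]
# Q_bad represents 1 + 4x + x^2, which takes negative values (e.g. at x = -1).
Q_bad = [[1, 2], [2, 1]]
```

By Gershgorin’s circle theorem, a symmetric diagonally dominant matrix with nonnegative diagonal is positive semidefinite, so passing this check is sufficient (though not necessary) for the represented polynomial to be sos.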
As a final remark, we wish to stress that the goal of this paper is first and foremost theoretical, i.e., to provide methods for constructing converging hierarchies of lower bounds for POPs using certificates of global positivity as the sole building blocks. We do not claim that these hierarchies can outperform the popular existing hierarchies due, e.g., to Lasserre and Parrilo. In particular, all hierarchies that we generate increase both the number of variables and the degree of the polynomials involved. They also require bisection, which, while not a problem in theory, increases the computational overhead. We remark however that each level of our hierarchies only involves either one sum of squares constraint (the hierarchy based on the certificate of Reznick; Theorem 3.1), two sum of squares constraints (the hierarchy based on the certificate of Artin; Theorem 3.2), or nothing but elementary computations (the hierarchy based on the certificate of Pólya; Theorem 4.1). By contrast, each level of the hierarchy based on Putinar’s (resp. Schmüdgen’s) certificate involves $m+1$ (resp. $2^m$) sum of squares constraints, but requires neither bisection nor an increase in the number of variables or the degree of the problem. Similarly, a hierarchy based on Stengle’s certificate, which would work by showing infeasibility of the constraints $\{g_i(x) \ge 0, \ i = 1, \ldots, m, \ \gamma - p(x) \ge 0\}$, requires bisection on $\gamma$ and $2^{m+1}$ sum of squares constraints in each level, but no increase in the number of variables or the degree of the problem. Of course, such comparisons would become more meaningful if one could also relate the quality of the bounds obtained from the different approaches. Some remarks on why it is nontrivial to connect our hierarchies to previous ones in this sense are made in Section 5.
1.1 Outline of the paper
The paper is structured as follows. In Section 2, we show that if one can inner approximate the cone of positive definite forms arbitrarily well (with certain basic properties), then one can produce a converging hierarchy of lower bounds for POPs with compact feasible sets (Theorem 2.2). This relies on a reduction (Theorem 2.1) from the problem of certifying a strict lower bound on a POP to that of proving positivity of a certain form. In Section 3, we see how this result can be used to derive semidefinite programming-based converging hierarchies (Theorems 3.1 and 3.2) from the Positivstellensätze by Artin (Theorem 1.4) and Reznick (Theorem 1.5). In Section 4, we derive an optimization-free hierarchy (Theorem 4.1) from the Positivstellensatz of Pólya (Theorem 1.6) as well as LP and SOCP-based hierarchies which rely on dsos/sdsos polynomials (Corollary 3). We conclude with a few open problems in Section 5.
1.2 Notation and basic definitions
We use the standard notation $A \succeq 0$ to denote that a symmetric matrix $A$ is positive semidefinite. Recall that a form is a homogeneous polynomial, i.e., a polynomial whose monomials all have the same degree. We denote the degree of a form $f$ by $\deg(f)$. We say that a form $f$ is nonnegative (or positive semidefinite) if $f(x) \ge 0$ for all $x \in \mathbb{R}^n$ (we write $f \ge 0$). A form $f$ is positive definite (pd) if $f(x) > 0$ for all nonzero $x$ in $\mathbb{R}^n$ (we write $f > 0$). Throughout the paper, we denote the set of forms (resp. the set of nonnegative forms) in $n$ variables and of degree $d$ by $H_{n,d}$ (resp. $P_{n,d}$). We denote the ball of radius $R$ centered at the origin by $B(0, R)$ and the unit sphere in $x$-space, i.e., $\{x : \|x\|_2 = 1\}$, by $S_x$. We use the shorthand $f(y^2 - z^2)$ for $y, z \in \mathbb{R}^n$ to denote $f(y_1^2 - z_1^2, \ldots, y_n^2 - z_n^2)$. We say that a scalar $\gamma$ is a strict lower bound on (1) if $p(x) > \gamma$ for all feasible $x$. Finally, we ask the reader to carefully read Remark 2, which contains the details of a notational overwriting occurring before Theorem 2.2 and valid from then on throughout the paper. This overwriting makes the paper much simpler to parse.
2 Constructing converging hierarchies for POP using global certificates of positivity
Consider the polynomial optimization problem in (1) and denote its optimal value by $p^*$. Let $d$ be such that $2d$ is the smallest even integer larger than or equal to the maximum degree of $p, g_i$, $i = 1, \ldots, m$. We denote the feasible set of our optimization problem by
$$S = \{x \in \mathbb{R}^n : g_i(x) \ge 0, \ i = 1, \ldots, m\}$$
and assume that $S$ is contained within a ball of radius $R$. From this, it is easy to provide (possibly very loose) upper bounds on $g_i(x)$ over the set $S$: as $S$ is contained in a ball of radius $R$, we have $|x_i| \le R$ for all $i = 1, \ldots, n$. We then use this to upper bound each monomial in $g_i$ and consequently $g_i$ itself. We use the notation $\eta_i$ to denote these upper bounds, i.e., $g_i(x) \le \eta_i$ for all $i = 1, \ldots, m$ and for all $x \in S$. Similarly, we can provide an upper bound on $-p(x)$. We denote such a bound by $\beta$, i.e.,
$$-p(x) \le \beta, \quad \forall x \in S.$$
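The monomial-by-monomial bounding argument described above can be sketched as follows. The function name, the dictionary representation of polynomials, and the example constraint are assumptions made for illustration. The bound is deliberately crude, in line with the "possibly very loose" qualifier in the text.

```python
def crude_upper_bound(poly, R):
    """Upper bound on |poly(x)| over the ball of radius R centered at the origin.

    poly maps exponent tuples to coefficients. On the ball every coordinate
    satisfies |x_i| <= R, so each monomial satisfies |x^alpha| <= R**sum(alpha),
    and the triangle inequality bounds the whole polynomial.
    """
    return sum(abs(c) * R ** sum(e) for e, c in poly.items())


# Illustrative constraint g(x1, x2) = 1 - x1^2 - x2^2 + 3*x1*x2,
# with the feasible set assumed to lie in a ball of radius 2.
g = {(0, 0): 1, (2, 0): -1, (0, 2): -1, (1, 1): 3}
eta = crude_upper_bound(g, R=2)   # 1 + 4 + 4 + 12
```

Because the routine bounds the absolute value, the same call applied to the objective polynomial also yields a valid bound on its negation.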
The goal of this section is to produce a method for constructing converging hierarchies of lower bounds for POPs if we have access to arbitrarily accurate inner approximations of the set of positive definite forms. The first theorem (Theorem 2.1) connects lower bounds on (1) to positive definiteness of a related form. The second theorem (Theorem 2.2) shows how this can be used to derive a hierarchy for POPs.
Consider the general polynomial optimization problem in (1) and recall that is such that is the smallest even integer larger than or equal to the maximum degree of . Suppose for some positive scalar . Let (resp. ) be any finite upper bounds on (resp. ).
Then, a scalar is a strict lower bound on (1) if and only if the homogeneous sum of squares polynomial
of degree $4d$ and in $n + m + 3$ variables $(x_1, \ldots, x_n, s_0, \ldots, s_{m+1}, y)$ is positive definite. (The reader will observe in the proof that the variables $s_0, \ldots, s_{m+1}$ serve as slack variables and the variable $y$ is used for homogenization.)
It is easy to see that is a strict lower bound on (1) if and only if the set
is empty. Indeed, if is nonempty, then there exists a point such that . This implies that cannot be a strict lower bound on (1). Conversely, if is empty, the intersection of with is empty, which implies that , .
We now define the set:
Note that is empty if and only if is empty. Indeed, if is nonempty, then there exists and such that the three sets of equations are satisfied. This obviously implies that and that , for all It further implies that as by assumption, if , then is in a ball of radius . Conversely, suppose now that is nonempty. There exists such that , for , and Hence, there exist such that
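Combining the fact that $x$ lies in a ball of radius $R$ and the fact that $\eta_i$, $i = 1, \ldots, m$ (resp. $\beta$) are upper bounds on $g_i$ (resp. $-p$) over the feasible set, we obtain: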
By raising both sides of the inequality to the power , we show the existence of .
We now show that is empty if and only if is positive definite. Suppose that is nonempty, i.e., there exists such that the equalities given in (4) hold. Note then that . As is nonzero, this implies that is not positive definite.
For the converse, assume that is not positive definite. As is a sum of squares and hence nonnegative, this means that there exists nonzero such that . We proceed in two cases. If , it is easy to see that and is nonempty. Consider now the case where . The third square in being equal to zero gives us:
This implies that and that which contradicts the fact that is nonzero.
Note that Theorem 2.1 implies that testing feasibility of a set of polynomial inequalities is no harder than checking whether a homogeneous polynomial that is sos has a zero. Indeed, as mentioned before, the basic semialgebraic set
is empty if and only if is a strict lower bound on the POP
In principle, this reduction can open up new possibilities for algorithms for testing feasibility of a basic semialgebraic set. For example, the work in  shows that positive definiteness of a form is equivalent to global asymptotic stability of the polynomial vector field One could as a consequence search for Lyapunov functions, as is done in [1, Example 2.1.], to certify positivity of forms. Conversely, simulating trajectories of the above vector field can be used to minimize and potentially find its nontrivial zeros, which, by our reduction, can be turned into a point that belongs to the basic semialgebraic set at hand.
We further remark that one can always take the degree of the sos form in (3) whose positivity is under consideration to be equal to four. This can be done by changing the general POP in (1) to only have quadratic constraints and a quadratic objective via an iterative introduction of new variables and new constraints in the following fashion: .
Remark 2 (Notational remark)
where is the dimension of the decision variable of problem (1), is such that is the smallest even integer larger than or equal to the maximum degree of and in (1), and is the number of constraints of problem (1). Note now that the form is a polynomial in variables and of degree .
Our next theorem shows that, modulo some technical assumptions, if one can inner approximate the set of positive definite forms arbitrarily well (conditions (a) and (b)), then one can construct a converging hierarchy for POPs.
Let be a sequence of sets (indexed by ) of homogeneous polynomials in variables and of degree with the following properties:
and there exists a pd form
If , then such that
If , then ,
Recall the definition of given in (3). Consider the hierarchy of optimization problems indexed by :
Then, for all , is nondecreasing, and
We first show that the sequence of optimal values of the hierarchy is upper bounded by the optimal value of problem (1). Suppose that a scalar $\gamma$ satisfies
We now show monotonicity of the sequence . Let be such that
We have the following identity:
Now, using the assumption and properties (c) and (d), we conclude that
This implies that
Note that as the sequence is upper bounded and nondecreasing, it converges. Let us show that the limit of this sequence is . To do this, we show that for any strict lower bound on (1), there exists a positive integer such that . By Theorem 2.1, as is a strict lower bound, is positive definite. As a form is positive definite if and only if it is positive on the unit sphere, by continuity, there exists a positive integer such that is positive definite. Using (b), this implies that there exists a positive integer such that
We now proceed in two cases. If , we take and use property (c) to conclude. If , we have
We take and use (6) and properties (c) and (d) to conclude.
Note that condition (d) is subsumed by the more natural condition that be a convex cone for any and . However, there are interesting and relevant cones which we cannot prove to be convex though they trivially satisfy condition (d) (see Theorem 3.1 for an example).
3 Semidefinite programming-based hierarchies obtained from Artin’s and Reznick’s Positivstellensätze
In this section, we construct two different semidefinite programming-based hierarchies for POPs using Positivstellensätze derived by Artin (Theorem 1.4) and Reznick (Theorem 1.5). To do this, we introduce two sets of cones that we call the Artin and Reznick cones.
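We define the Reznick cone of level $r$ to be the set of forms $p$ in $n$ variables and of degree $2d$ such that
$$p(x) \cdot \left(\sum_{i=1}^n x_i^2\right)^r \text{ is sos.}$$
Similarly, we define the Artin cone of level $r$ to be the set of forms $p$ in $n$ variables and of degree $2d$ such that
$$p(x) \cdot q(x) \text{ is sos for some nonzero sos form } q \text{ of degree } 2r.$$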
We show that both of these cones produce hierarchies of the type discussed in Theorem 2.2. Recall that is the optimal value of problem (1) and that is a polynomial in variables and of degree as defined in (3) and Remark 2.
Consider the hierarchy of optimization problems indexed by :
Then, for all , is nondecreasing, and
It suffices to show that the Reznick cones satisfy properties (a)-(d) in Theorem 2.2. The result will then follow from that theorem. For property (a), it is clear that, as and is a sum of squares and hence nonnegative, must be nonnegative, so Furthermore, the form belongs to and is positive definite. Property (b) is verified as a consequence of Theorem 1.5. For (c), note that if is sos, then is sos since the product of two sos polynomials is sos. Finally, for property (d), note that is a convex cone. Indeed, for any ,
is sos if and are in . Combining the fact that is a convex cone and the fact that , we obtain (d).
To solve a fixed level of the hierarchy given in Theorem 3.1, one must proceed by bisection on (since the parameter appears with a power in the definition of in (3)). Bisection here would produce a sequence of upper bounds and lower bounds on as follows. At iteration , we test whether is feasible for (7). If it is, then we take and . If it is not, we take and . We stop when , where is a prescribed accuracy, and the algorithm returns Note that and that to obtain , one needs to take a logarithmic (in ) number of steps using this method.
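The bisection scheme described above can be sketched as follows. The oracle `is_feasible` is a hypothetical stand-in for the membership test at a fixed level (in practice, a semidefinite feasibility problem). The sketch assumes that the oracle holds at the initial lower bound, fails at the initial upper bound, and is monotone in the bisection parameter.

```python
def bisect_level(is_feasible, lower, upper, eps):
    """Bisection on gamma for one fixed level of a hierarchy.

    is_feasible(gamma) is a hypothetical oracle. It is assumed to hold at
    `lower`, to fail at `upper`, and to be monotone: if it holds for some
    gamma, it holds for every smaller value.
    Returns the final lower bound and the number of oracle calls.
    """
    steps = 0
    while upper - lower >= eps:
        gamma = (lower + upper) / 2
        if is_feasible(gamma):
            lower = gamma   # gamma is certified, so raise the lower bound
        else:
            upper = gamma   # gamma is not certified, so lower the upper bound
        steps += 1
    return lower, steps


# Toy oracle standing in for the membership test: feasible iff gamma < 1.
bound, steps = bisect_level(lambda gamma: gamma < 1.0, lower=0.0, upper=2.0, eps=1e-6)
```

Each iteration halves the gap between the two bounds, so the number of oracle calls grows like the logarithm of the initial gap divided by the accuracy, consistent with the logarithmic step count mentioned above.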
Hence, solving the level of this hierarchy using bisection can be done by semidefinite programming. Indeed, for a fixed and given by the bisection algorithm, one simply needs to test membership of
to the set of sum of squares polynomials. This amounts to solving a semidefinite program. We remark that all semidefinite programming-based hierarchies available only produce an approximate solution to the optimal value of the SDP solved at level in polynomial time. This is independent of whether they use bisection (e.g., such as the hierarchy given in Theorem 3.1 or the one based on Stengle’s Positivstellensatz) or not (e.g., the Lasserre hierarchy).
Our next theorem improves on our previous hierarchy by freeing the multiplier and taking advantage of our ability to search for an optimal multiplier using semidefinite programming.
Recall the definition of Artin cones from Definition 1. Consider the hierarchy of optimization problems indexed by :
Then, for all , is nondecreasing, and
Just as the previous theorem, it suffices to show that the Artin cones satisfy properties (a)-(d) of Theorem 2.2. The proof of property (a) follows the proof given for Theorem 3.1. Property (b) is satisfied as a (weaker) consequence of Artin’s result (see Theorem 1.4). For (c), we have that if is sos for some nonzero sos polynomial of degree , then is sos, and has degree . Finally, for (d), suppose that . Then there exists an sos form such that is sos. We have
which is sos as the product (resp. sum) of two sos polynomials is sos.
Note that again, for any fixed , the level of the hierarchy can be solved using bisection which leads to a sequence of semidefinite programs.
Our developments in the past two sections can be phrased in terms of a Positivstellensatz.
Corollary 1 (A new Positivstellensatz)
Consider the basic semialgebraic set
and a polynomial $p$. Suppose that $S$ is contained within a ball of radius $R$. Let $\eta_i$, $i = 1, \ldots, m$, and $\beta$ be any finite upper bounds on $g_i(x)$ and, respectively, $-p(x)$ over the set $S$. (As discussed at the beginning of Section 2, such bounds are very easily computable.) Let $d$ be such that $2d$ is the smallest even integer larger than or equal to the maximum degree of $p, g_i$, $i = 1, \ldots, m$. Then, $p(x) > 0$ for all $x \in S$ if and only if there exists a positive integer $r$ such that
is a sum of squares, where the form in variables is as follows:
4 Pólya’s theorem and hierarchies for POPs that are optimization-free, LP-based, and SOCP-based
In this section, we use a result by Pólya on global positivity of even forms to obtain new hierarchies for polynomial optimization problems. In Section 4.1, we present a hierarchy that is optimization-free, in the sense that each level of the hierarchy only requires multiplication of two polynomials and checking whether the coefficients of the resulting polynomial are nonnegative. In Section 4.2, we use the previous hierarchy to derive linear programming and second-order cone programming-based (converging) hierarchies that in each level produce a lower bound on the POP whose quality is at least as good as that of the optimization-free hierarchy. These rely on the recently developed concepts of dsos and sdsos polynomials (see Definition 3 and ), which are alternatives to sos polynomials that have been used in diverse applications to improve scalability; see [3, Section 4].
4.1 An optimization-free hierarchy of lower bounds for POPs
The main theorem in this section presents an optimization-free hierarchy of lower bounds for general POPs with compact feasible sets:
Recall the definition of as given in (3), with and Let and define
Consider the hierarchy of optimization problems indexed by :
Let . Then for all , is nondecreasing, and .
As before, we use bisection to obtain the optimal value of the level of the hierarchy up to a fixed precision (see Remark 4). At each step of the bisection algorithm, one simply needs to multiply two polynomials together and check nonnegativity of the coefficients of the resulting polynomial to proceed to the next step. As a consequence, this hierarchy is optimization-free as we do not need to solve (convex) optimization problems at each step of the bisection algorithm. To the best of our knowledge, no other converging hierarchy of lower bounds for general POPs (whose feasible sets are contained within a ball of known radius) dispenses altogether with the need to solve convex subprograms. We also provide a Positivstellensatz counterpart to the hierarchy given above (see Corollary 2). This corollary implies in particular that one can always certify infeasibility of a basic semialgebraic set by recursively multiplying polynomials together and simply checking nonnegativity of the coefficients of the resulting polynomial.
We now make a few remarks regarding the techniques used in the proof of Theorem 4.1. Unlike in Theorems 3.1 and 3.2, we do not show that the sets defined in Theorem 4.1 satisfy properties (a)-(d) as given in Theorem 2.2, due to some technical difficulties. It turns out however that we can avoid showing properties (c) and (d) by using a result by Powers and Reznick that we present below. Regarding properties (a) and (b), we show that a slightly modified version of (a) holds and that (b), which is the key property in Theorem 2.2, goes through as is. We note though that obtaining (b) from Pólya’s result (Theorem 1.6) is not as immediate as obtaining (b) from Artin’s and Reznick’s results. Indeed, unlike the theorems by Artin and Reznick (see Theorems 1.4 and 1.5), which certify global positivity of any form, Pólya’s result only certifies global positivity of even forms. To make this latter result a statement about general forms, we work in an appropriate lifted space. We make the simple observation that any scalar $a$ can be written as $a = b - c$ with $b, c \ge 0$ (take $b = \max\{a, 0\}$ and $c = \max\{-a, 0\}$). We then replace the form $p(z)$ in the variables $z$ by the even form $p(v^2 - w^2)$ in the variables $(v, w)$. This lifting operation preserves nonnegativity, but unfortunately it does not preserve positivity: even if $p(z)$ is pd, $p(v^2 - w^2)$ always has zeros (e.g., when $v = w$). Hence, though we now have access to an even form, we still cannot use Pólya’s result as $p(v^2 - w^2)$ is not positive. This is what leads us to consider the slightly more complicated form in (9).
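The lifting step discussed above can be checked on a small example. The sketch below substitutes a difference of squares for each variable of a form, then verifies that the result is even and that it vanishes whenever the two blocks of lifted variables coincide. The helper names, the labels `v` and `w` for the lifted variables, and the example form are assumptions made for illustration.

```python
def poly_mul(a, b):
    """Multiply polynomials stored as {exponent tuple: coefficient}."""
    out = {}
    for ea, ca in a.items():
        for eb, cb in b.items():
            e = tuple(i + j for i, j in zip(ea, eb))
            out[e] = out.get(e, 0) + ca * cb
    return {e: c for e, c in out.items() if c != 0}


def lift(form, n):
    """Return form(v^2 - w^2) as a polynomial in (v_1..v_n, w_1..w_n)."""
    result = {}
    for exps, coeff in form.items():
        term = {(0,) * (2 * n): coeff}
        for i, k in enumerate(exps):
            e_v = [0] * (2 * n)
            e_w = [0] * (2 * n)
            e_v[i] = 2          # v_i^2
            e_w[n + i] = 2      # w_i^2
            factor = {tuple(e_v): 1, tuple(e_w): -1}   # v_i^2 - w_i^2
            for _ in range(k):
                term = poly_mul(term, factor)
        for e, c in term.items():
            result[e] = result.get(e, 0) + c
    return {e: c for e, c in result.items() if c != 0}


def evaluate(poly, point):
    total = 0
    for e, c in poly.items():
        monomial = c
        for x, k in zip(point, e):
            monomial *= x ** k
        total += monomial
    return total


def is_even(poly):
    """True if every variable appears with an even power in every monomial."""
    return all(k % 2 == 0 for e in poly for k in e)


# p(x1, x2) = x1^2 + x1*x2 + x2^2 is positive definite but not even.
p = {(2, 0): 1, (1, 1): 1, (0, 2): 1}
lifted = lift(p, n=2)
```

Here `is_even(p)` is false while `is_even(lifted)` is true, and `evaluate(lifted, (1, 2, 1, 2))` is zero because `v = w` at that point. This is the loss of positivity that motivates a modified form.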
Theorem 4.2 (Powers and Reznick )
Let , , and write Denote the standard simplex by , i.e., . Assume that is a form of degree that is positive on and let
Define We have: