New Dependencies of Hierarchies in Polynomial Optimization

03/12/2019 ∙ by Adam Kurpisz, et al. ∙ ETH Zurich ∙ Berlin Institute of Technology (Technische Universität Berlin)

We compare four key hierarchies for solving Constrained Polynomial Optimization Problems (CPOP): the Sum of Squares (SOS), Sum of Diagonally Dominant Polynomials (SDSOS), Sum of Nonnegative Circuits (SONC), and Sherali Adams (SA) hierarchies. We prove a collection of dependencies among these hierarchies, both for general CPOPs and for optimization problems on the Boolean hypercube. Key results for the general case are that the SONC and SOS hierarchies are polynomially incomparable, while SDSOS is contained in SONC. A direct consequence is the non-existence of a Putinar-like Positivstellensatz for SDSOS. On the Boolean hypercube, we show as a main result that the Schmüdgen-like versions SDSOS*, SONC*, and SA* of the hierarchies are polynomially equivalent. Moreover, we show that SA* is contained in any Schmüdgen-like hierarchy that provides an O(n) degree bound.


1. Introduction

A Constrained Polynomial Optimization Problem (CPOP) is of the form

$$\min_{x \in \mathbb{R}^n} f(x) \quad \text{subject to} \quad g_i(x) \ge 0 \ \text{ for } i = 1, \dots, s,$$

where $f$ and $g_1, \dots, g_s$ are $n$-variate real polynomials. Solving CPOP is a crucial nonconvex optimization problem, which lies at the core of both theoretical and applied computer science. A special case of CPOP is a Binary Constrained Polynomial Optimization Problem (BCPOP), where the polynomials $\pm(x_j^2 - x_j)$, which force $x_j \in \{0,1\}$, are among the polynomials $g_i$ defining the feasibility set. Many important optimization problems belong to the BCPOP class. However, solving these is NP-hard in general.

A CPOP can be equivalently seen as the problem of maximizing a real $\lambda$ such that $f - \lambda$ is nonnegative over the semialgebraic set defined by the polynomials $g_1, \dots, g_s$. This is an interesting perspective since various techniques from real algebraic geometry provide methods for certifying nonnegativity of a real polynomial over semialgebraic sets. Theorems of this kind are called Positivstellensätze. They state that, under some assumptions, a polynomial that is positive (or nonnegative) over the feasibility set can be expressed in a particular algebraic way. Typically, this algebraic expression is a sum of nonnegative polynomials from a chosen ground set of nonnegative polynomials multiplied by the polynomials defining the feasibility set. Choosing a proper ground set of nonnegative polynomials is crucial from the perspective of optimization. Ideally, both testing membership in the ground set and deciding nonnegativity of a polynomial in the ground set should be efficiently doable. Moreover, fixing the maximum degree of the ground-set polynomials used in such a representation provides a family of algorithms parameterized by an integer $d$, which gives a sequence of lower bounds for the value of CPOP. If the ground set of polynomials is chosen properly, then the sequence of lower bounds converges in $d$ to the optimal value of CPOP.

One of the most successful approaches for constructing theoretically efficient algorithms is the Sum of Squares (SOS) method [GV01, Nes00, Par00, Sho87], known as Lasserre relaxation [Las01]. The method relies on Putinar’s Positivstellensatz [Put93] using sum of squares of polynomials as the ground set. Finding a degree SOS certificate for nonnegativity of can be performed by solving a semidefinite programming (SDP) formulation of size . Finally, for every (feasible) -variate hypercube optimization problem, with constraints of degree at most , there exists a degree SOS certificate, see e.g., [BS16].

The SOS algorithm is a frontier method in algorithm design. It was used to provide the best available algorithms for a variety of combinatorial optimization problems. The Lovász $\vartheta$-function [Lov79] for the Independent Set problem is implied by the SOS algorithm of degree 2. Moreover, the Goemans-Williamson relaxation [GW95] for the Max Cut problem and the Goemans-Linial relaxation for the Sparsest Cut problem (analyzed in [ARV09]) can be obtained by the SOS algorithm of degree 2 and 6, respectively. SOS was also proven to be a successful method for Maximum Constraint Satisfaction problems (Max CSP). For Max CSP, the SOS algorithm is as powerful as any SDP relaxation of comparable size [LRS15]. Furthermore, SOS was applied to problems in dictionary learning [BKS15, SS17], tensor completion and decomposition [BM16, HSSS16, PS17], and robust estimation [KSS18]. For other applications of the SOS method see e.g., [BRS11, BCG09, Chl07, CS08, CGM13, dlVKM07, GS11, MM09, Mas17, RT12], and the surveys [CT12, Lau03, Lau09].

From a practical perspective however, solving SDP problems is known to be very time consuming. Moreover, from a theoretical point of view, it is an open problem whether an SDP of size can be solved in time  [O’D16, RW17]. Hence, various methods have been proposed to choose different ground sets of polynomials to make a resulting problem easier to solve, but still effective.

In [AM14] Ahmadi and Majumdar propose an algorithmic framework by choosing the ground set of polynomials to be scaled diagonally-dominant polynomials (SDSOS). SDSOS polynomials can be seen as the binomial squares. Thus, the SDSOS algorithm is not stronger than the SOS algorithm. However, searching for a degree- SDSOS certificate can be performed using Second Order Conic Program (SOCP) of size ; see [AM14]. Since, in practice, an SOCP can be solved much faster than an SDP, the algorithm attracted a lot of attention and has been used to solve problems in Robotics and Control [AMT14, Leo18, PP15, SA16, ZFP18], Option Pricing [AM14], Power Flow [KGNSZ18, SSTL18], and Discrete Geometry [DL16].

An alternative approach, more tractable than SOS, was initiated by Sherali and Adams in [SA90]. The technique was first introduced as a method to tighten Linear Program (LP) relaxations for BCPOP problems; in such settings, finding the degree certificate can be done by solving an LP of size . The Sherali Adams (SA) algorithm arises from using the set of polynomials depending on at most variables, which are nonnegative on the Boolean hypercube. These polynomials are called -juntas. The SA algorithm was used to construct some of the most prominent algorithms with good asymptotic running time in combinatorial optimization [CLRS16, LR16, TZ17], logic [AM13], and other fields of computer science.

Finally, a method independent from SOS was introduced in [IdW16] using Sum of Nonnegative Circuit Polynomials (SONC) as a ground set. These polynomials form a full-dimensional cone in the cone of nonnegative polynomials, which is not contained in the SOS cone. For example, the well-known Motzkin polynomial is a nonnegative circuit polynomial, but not an SOS. Moreover, SONCs generalize polynomials which are certified to be nonnegative via the arithmetic-geometric mean inequality [Rez89]. SONC certificates of degree can be computed via a convex optimization program called Relative Entropy Programming (REP) of size [DIdW17, Theorem 5.3]; see also [CS16, CMW18]. Recently, an experimental comparison of SONC with the SOS method for unconstrained optimization was presented in [SdW18].

For all presented algorithms, one can define a potentially stronger algorithm without changing the corresponding ground set of polynomials, by using a more general construction for the certificate of nonnegativity. Such a certificate expresses a polynomial, which is nonnegative over a given semialgebraic set, as a sum of polynomials from the ground set multiplied by the product of polynomials defining the semialgebraic set; see section 2 for further details. We call the resulting systems , , and . Some of these extensions were intensively studied in the literature, see e.g., [GHP02, Wor15].

Our Results

In this paper, we provide an extensive comparison of the presented semialgebraic proof systems. More precisely, following the definitions in e.g., [BFI18], we analyze their polynomial comparability: Let and be semialgebraic proof systems. contains if for every semialgebraic set and a polynomial admitting a degree certificate of nonnegativity over in , admits also a degree certificate in . System strictly contains if contains but does not contain . Systems and are polynomially equivalent if contains and contains . Finally, systems and are polynomially incomparable if neither contains nor contains .

For a more detailed definition of proof systems and their comparability, see section 2.5.

In this article, we show the dependencies between the proof systems presented in fig. 1.

[Figure 1: diagrams of the relations between the proof systems. The diagrams for CPOP relate SONC, SDSOS, and SOS, with arrows labeled 3.2 and 4.1. The diagrams for BCPOP relate SONC, SDSOS, SOS, and SA, with arrows labeled 5.2, 5.3, and 5.4. Legend: polynomially incomparable; polynomially equivalent; contained in; strictly contained in; known; new contribution; implied new contribution.]

Figure 1. A visualization of our results. Left hand side: CPOP. Right hand side: BCPOP. Labels on the arrows refer to theorems.

In particular, in section 3, we show that for general CPOP problems the SOS proof system is polynomially incomparable with the SONC proof system. We also prove that the same relation holds for the SOS* and SONC* proof systems; see section 3.2. So far, it was only known that the cones of SOS and SONC polynomials are not contained in each other [IdW16, Proposition 7.2]; however, this has no direct implication for the relation between the SOS and the SONC methods for CPOP. Similarly, in a very recent result [CMW18], the authors point out that the SONC cone contains the SDSOS cone. In this paper, in section 4, we extend this result to CPOP problems by proving that the SONC proof system strictly contains the SDSOS proof system and that the same relation holds for SONC* and SDSOS*; see section 4.1. As a consequence, we conclude that there exists no Putinar-like Positivstellensatz for SDSOS; see section 4.1.

For the BCPOP we provide a general sufficient condition for a Schmüdgen-like proof system to contain the SA* proof system; see section 5. Combined with the results from section 5.2 and section 5.3, this proves the polynomial equivalence of SDSOS*, SONC*, and SA* on the Boolean hypercube. Moreover, by proving some properties of SONC, SDSOS, and SA polynomials in section 4.1 and section 5.2, we prove additional dependencies between the hierarchies in section 5.2, section 5.3, and section 5.4.

We remark that all results in this article concern the minimal degrees for certificates in a particular proof system as these are the standard way to measure the complexity of algorithms in theoretical computer science. Our results do not directly imply a particular behaviour of actual runtimes in an experimental setting, as these depend on various further factors other than the degree.

Acknowledgements

AK is supported by SNSF project PZ00P2174117, TdW is supported by the DFG grant WO 2206/1-1.

2. Preliminaries

In this section, we introduce the proof systems used in this article. Moreover, for the sake of clarity, we provide dual formulations for some of the presented proof systems for the BCPOP case. We begin by introducing basic notation. For any we denote and . Let and () be the set of nonnegative (positive) real numbers. Let $\mathbb{R}[\mathbf{x}] = \mathbb{R}[x_1, \dots, x_n]$ be the ring of $n$-variate real polynomials and for every we define the real zero set as . We denote the Newton polytope of by and the vertices of by . A lattice point is called even if it is in , and a term is called a monomial square if and is even.

In what follows we introduce different proof systems and their notation. Next to the specific sources that we provide later in the section, we refer the reader to introductory literature like [BPT13, Lau09, Las15, Mar08] on the mathematical side, and [Raz16, Rot13] on the computer science side. Moreover, we fix the notation

for a set of polynomials. Throughout the paper we assume that the cardinality of the set is polynomial in the size of . For a given , we define the corresponding semi-algebraic set

Furthermore, for any given semialgebraic set , we consider the set of nonnegative polynomials with respect to

For a given and a set of constraints , we define the corresponding constrained polynomial optimization problem (CPOP) as (see e.g., [BV04])

(CPOP)

Hence, corresponds to the feasibility region of the program (CPOP).

The problem (CPOP) is NP-hard in general. Thus, one chooses proper subsets such that, on the one hand, the corresponding polynomial optimization problem provides a lower bound on the value of (CPOP) and on the other hand, is computationally tractable. Such subsets are called certificates of nonnegativity. The choice of a suitable certificate of nonnegativity is crucial for obtaining a good lower bound for the problem (CPOP).

Let us be more specific. For a given the induced preprime is given by

Note that . Throughout the paper we assume that for a given is the cardinality of the set restricted to polynomials of degree at most . In order to relax (CPOP) to a finite size optimization problem we introduce polynomial hierarchies. Let be a collection of polynomials and let be a subset of . We define the following degree depending hierarchy of certificates of nonnegativity:

In several contexts it is more useful to consider the preprime of the constraints, i.e., . Every such hierarchy of polynomials yields a sequence of lower bounds given by the following optimization program:

(_^2d)
(_G^2d)

Throughout this paper we assume that the set is chosen such that is Archimedean, a property which is e.g., implied by the compactness of . In what follows we occasionally enforce compactness of by adding box constraints to with sufficiently large for .

Under this assumption we obtain from Krivine’s general Positivstellensatz [Kri64a, Kri64b], see also [Mar08, Theorem 5.4.4], the following Schmüdgen-type Positivstellensatz; see [Sch02, Theorem 5.1]:

Let be Archimedean and let such that is closed under addition. Let for all . Then there exists a such that .

For the SOS hierarchy this theorem was first shown by Schmüdgen in [Sch91].

In the following subsections we introduce some of the most prominent inner approximations of the cone .

2.1. Sum of Squares

The SOS method approximates the cone by using the set of sum of square polynomials instead of the entire set of nonnegative polynomials. Let be the set of (finite) sum of square polynomials (SOS). The SOS program of degree takes the following form:

(SOS_^2d)

analogously the SOS program of degree takes the form . For the SOS-hierarchy Putinar proved the following Positivstellensatz, which is an improvement of Schmüdgen’s Positivstellensatz.

[Putinar’s Positivstellensatz; [Put93]] Let be a set of polynomial constraints with being Archimedean, and let with for all . Then there exists a such that .

section 2.1 provides a sequence of cones that approximate from the inside, such that the values of give a sequence of lower bounds that converges in to the optimal value of (CPOP).

The program (SOS_^2d) can be solved using a semidefinite program (SDP) of size ; see e.g., [Las01, Nes00, Par00, Sho87]. This is implied by the following fact; see e.g., [Par00]. A polynomial is a SOS of degree if and only if there exists a positive semidefinite matrix , called the Gram matrix, such that , for being the vector of -variate monomials of total degree at most .
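The Gram matrix characterization can be illustrated on a small univariate instance. The following is a minimal sketch, not the SDP formulation itself: it only verifies a given candidate Gram matrix. The helper names `gram_to_coeffs` and `is_psd`, the restriction to one variable, and the example polynomial are assumptions made for illustration.

```python
import numpy as np

def gram_to_coeffs(Q):
    """Coefficients c[0..2d] of p(x) = z(x)^T Q z(x) with z(x) = (1, x, ..., x^d)."""
    Q = np.asarray(Q, dtype=float)
    d = Q.shape[0] - 1
    coeffs = np.zeros(2 * d + 1)
    for i in range(d + 1):
        for j in range(d + 1):
            coeffs[i + j] += Q[i, j]  # x^i * x^j contributes to x^(i+j)
    return coeffs

def is_psd(Q, tol=1e-9):
    """Check positive semidefiniteness of a symmetric matrix via its eigenvalues."""
    Q = np.asarray(Q, dtype=float)
    return bool(np.all(np.linalg.eigvalsh(Q) >= -tol))

# Illustrative example: p(x) = x^4 - 2x^2 + 1 = (x^2 - 1)^2,
# with Gram matrix v v^T for v = (-1, 0, 1) in the basis (1, x, x^2).
v = np.array([-1.0, 0.0, 1.0])
Q_example = np.outer(v, v)
```

Here `gram_to_coeffs(Q_example)` recovers the coefficient vector of $x^4 - 2x^2 + 1$ and `is_psd(Q_example)` confirms that the matrix is PSD, which together certify that the polynomial is an SOS. Searching for such a matrix, rather than verifying one, is what the SDP does.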

The size of the SDP program is . Moreover, for BCPOP problems, when hypercube constraints are incorporated in , it is known for that solves the problem exactly, i.e., ; see e.g., [BS14].

2.1.1. SOS - The dual perspective: Lasserre hierarchy

Consider a BCPOP. Let be such that , for some . By the hyperplane separation theorem for convex cones, there exists a hyperplane that separates from . Note that for BCPOP we can restrict to polynomials defined on the hypercube , i.e., to the vector space of multi-linear polynomials. The hyperplane is represented by the polynomial , which is a normal vector to the hyperplane, such that for every polynomial we have and . By scaling we can assume that . To every function we can associate a linear operator mapping polynomials to real numbers, defined by

which is called the pseudoexpectation. The dual problem to (SOS_^2d) is the program of degree . It takes the form

(¯SOS_^2d)

and is known as the Lasserre relaxation (of degree ) . It can be solved using an SDP of size  [Las01]. Analogously, the program of degree takes the form .

Problem (¯SOS_^2d) can be reformulated in terms of moments / localizing matrices. Consider , for and . Let and be the vector of coefficients of , such that . We can write

(1)

where is a real, symmetric matrix whose rows and columns are indexed by sets of size at most such that . For the matrix is called the moment matrix, and for all other it is called the localizing matrix for the constraint . Since for every real valued vector the requirement is equivalent to being positive semidefinite (PSD), denoted by , we can reformulate (¯SOS_^2d) as:

(2)

2.2. Scaled Diagonally Dominant Sum of Squares

In [AM14] Ahmadi and Majumdar proposed an approximation of the cone based on scaled diagonally-dominant polynomials (SDSOS), defined below in section 2.2.1. Let be the set of finite sums of scaled diagonally-dominant polynomials. We obtain the following program:

(_^2d)

analogously, the SDSOS program takes the form . Since for every , we have . Moreover, can be solved using Second Order Conic Programming (SOCP) of size ; see [AM14].

2.2.1. Scaled diagonally-dominant polynomials

We introduce the formal details for SDSOS certificates. A real symmetric matrix $A$ is called diagonally-dominant (dd) if for every $i$ we have $A_{ii} \ge \sum_{j \ne i} |A_{ij}|$. Moreover, $A$ is called scaled diagonally-dominant (sdd) if there exists a positive real diagonal matrix $D$ such that $DAD$ is dd. A polynomial $p$ of total degree $2d$ is scaled diagonally-dominant, denoted , if there exists an sdd matrix $A$ such that $p = z(x)^\top A z(x)$, for $z(x)$ being the vector of $n$-variate monomials of total degree at most $d$.

Every SDSOS polynomial is an SOS polynomial: By section 2.2.1, consider an sdd matrix , for being a dd matrix. By the Gershgorin circle theorem, the matrix is PSD. Moreover, . Since is a congruence transformation of , which does not change the sign of the eigenvalues, the matrix is also PSD.
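The argument above can be checked numerically on a small matrix. The following is a minimal sketch, assuming the standard definitions from [AM14]: a symmetric matrix $A$ is dd if $A_{ii} \ge \sum_{j \ne i} |A_{ij}|$ for all $i$, and sdd if $DAD$ is dd for some positive diagonal matrix $D$. The function names and the example matrix are hypothetical, and the scaling is supplied by hand rather than searched for.

```python
import numpy as np

def is_dd(A, tol=1e-12):
    """Diagonal dominance: A[i, i] >= sum over j != i of |A[i, j]| for every row i."""
    A = np.asarray(A, dtype=float)
    diag = np.diag(A)
    off_diag_sums = np.sum(np.abs(A), axis=1) - np.abs(diag)
    return bool(np.all(diag >= off_diag_sums - tol))

def is_sdd_with(A, scaling, tol=1e-12):
    """Check that D A D is dd for the given positive diagonal scaling D."""
    D = np.diag(np.asarray(scaling, dtype=float))
    return is_dd(D @ np.asarray(A, dtype=float) @ D, tol)

def min_eigenvalue(A):
    """Smallest eigenvalue of a symmetric matrix."""
    return float(np.linalg.eigvalsh(np.asarray(A, dtype=float)).min())

# Illustrative matrix: not dd (row 0 has 1 < 2), but dd after scaling by D = diag(2, 1).
A_example = np.array([[1.0, 2.0], [2.0, 5.0]])
```

For `A_example`, `is_dd` fails while `is_sdd_with(A_example, [2.0, 1.0])` succeeds, and `min_eigenvalue(A_example)` is positive, in line with the Gershgorin and congruence argument.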

Next, we provide a further characterization of SDSOS polynomials. We start by recalling the known characterization of diagonally dominant (dd) matrices.

[[BC75]] A symmetric matrix is dd if and only if

for and being a set of vectors, each with at most two nonzero entries at positions and which equal .

By section 2.2.1 and section 2.2.1, every -variate sdd polynomial of degree at most is of the form

where is the vector of -variate monomials of maximal degree , is a dd matrix, and is a positive diagonal matrix. Since every vector has at most two nonzero entries, both equal to , the SDSOS polynomial is always of the form , where are monomials and .

2.2.2. SDSOS - Dual perspective

For the BCPOP the dual of the problem (_^2d) is a relaxation of the problem (¯SOS_^2d). Indeed, as for formulation (¯SOS_^2d), conic duality theory can be used to transform program (_^2d) into its dual of the form

(_^2d)

for being a linear map, defined as in section 2.1.1. Analogously the program of degree takes the form .

As in (1), formulation (_^2d) can be transformed into matrix form. In this case we obtain a set of matrices that are required to be PSD. More formally, let . For being a real, symmetric matrix whose rows/columns are indexed with sets of size at most , let be the principal submatrix of of entries that lie in the rows and columns indexed by the sets in .

We obtain that is equivalent to:

(3)

For BCPOP, both (_^2d) and (3) are solvable via an SOCP of size . For more details we refer the reader to [AM17].

2.3. Sherali Adams

An alternative method to approximate the sum of squares cone is based on nonnegative polynomials that depend on a limited number of variables, called -juntas. The resulting program is called the Sherali Adams algorithm (SA) and was first introduced in [SA90] as a method to tighten the linear programming relaxations for 0/1 hypercube optimization problems. Thus, we assume throughout the section and whenever we consider (SA) that the hypercube constraints are contained in , meaning that .

For we denote and . Let

A nonnegative -junta is a function which depends only on at most input coordinates. It is easy to check that the set is precisely the set of nonnegative -juntas over the Boolean hypercube . The degree- Sherali Adams is the following problem:

(_^2d)

analogously SA takes the form . Note that the superscript in is (not ), because of the way SA was defined historically, providing that . However, this does not affect the polynomial equivalence between the proof systems; see section 1.

The program can be solved using the linear program (LP) of size .
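The notion of a nonnegative junta can be made concrete by brute force on a small hypercube. The following is a minimal sketch, assuming the standard Sherali Adams generators $\prod_{i \in I} x_i \prod_{j \in J} (1 - x_j)$ for disjoint index sets $I$ and $J$. This form is an assumption, since the set definition above did not survive extraction, and all function names are hypothetical.

```python
from itertools import product

def sa_generator(I, J):
    """Return x -> prod_{i in I} x_i * prod_{j in J} (1 - x_j) (assumed generator form)."""
    def g(x):
        value = 1
        for i in I:
            value *= x[i]
        for j in J:
            value *= 1 - x[j]
        return value
    return g

def relevant_coordinates(func, n):
    """Coordinates whose flip changes func at some point of {0,1}^n."""
    relevant = set()
    for x in product((0, 1), repeat=n):
        for k in range(n):
            y = list(x)
            y[k] = 1 - y[k]
            if func(x) != func(tuple(y)):
                relevant.add(k)
    return relevant

def is_nonnegative_junta(func, n, d):
    """True if func is nonnegative on {0,1}^n and depends on at most d coordinates."""
    nonnegative = all(func(x) >= 0 for x in product((0, 1), repeat=n))
    return nonnegative and len(relevant_coordinates(func, n)) <= d

# Illustrative generator x_0 * (1 - x_2) on the 4-dimensional hypercube.
g_example = sa_generator({0}, {2})
```

The generator `g_example` depends only on coordinates 0 and 2, so it is a nonnegative 2-junta but not a 1-junta. The enumeration is exponential in `n` and is meant only to illustrate the definition, not the LP.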

2.3.1. SA - Dual perspective

As in section 2.1.1 and section 2.2.2, one can use conic duality theory to transform the program (_^2d) into its dual of the form:

(_^2d)

The program (_^2d) is a linear system of size . Analogously, the program of degree takes the form .

2.4. Sum of Nonnegative Circuit

A method for approximating the cone , which is independent of SOS, is based on sums of nonnegative circuit polynomials (SONC), defined below in section 2.4.1. The technique was introduced by Iliman and the second author in [IdW16]. Let be the set of finite sums of nonnegative circuit polynomials. We consider the following program:

(_^2d)

analogously takes the form . As shown in [DIdW17, Theorem 4.8], for an arbitrary real polynomial that is strictly positive on a compact, basic closed semialgebraic set there exists a certificate of nonnegativity, i.e., the Schmüdgen-type Positivstellensatz section 2 applies to SONC. Moreover, searching through the space of degree certificates can be done via a relative entropy program (REP) [DIdW17] of size ; see also [CS17, CS16, CMW18]. REPs are convex optimization programs and are efficiently solvable with interior point methods; see e.g., [CS17, NN94] for more details.

2.4.1. Nonnegative Circuit Polynomials

We recall the most relevant statements about SONCs.

A polynomial is called a circuit polynomial if it is of the form

with , exponents , , and coefficients , , such that is a simplex with even vertices and the exponent is in the strict interior of .

For every circuit polynomial we define the corresponding circuit number as

One determines nonnegativity of circuit polynomials via its circuit number as follows:

[[IdW16], Theorem 3.8] Let be a circuit polynomial then is nonnegative if and only if and or and .
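The nonnegativity test via the circuit number can be tried on the Motzkin polynomial $x^4y^2 + x^2y^4 + 1 - 3x^2y^2$. The following is a minimal sketch, assuming the definition of [IdW16]: for a circuit polynomial with outer terms $f_{\alpha(j)} x^{\alpha(j)}$ and inner term $f_\beta x^\beta$, the circuit number is $\Theta_f = \prod_j (f_{\alpha(j)}/\lambda_j)^{\lambda_j}$, where the $\lambda_j$ are the barycentric coordinates of $\beta$ with respect to the vertices $\alpha(j)$. The function names are hypothetical, and the code assumes that $\beta$ lies in the strict interior of the simplex.

```python
import numpy as np

def barycentric_coordinates(vertices, beta):
    """Solve sum_j lam_j * alpha(j) = beta together with sum_j lam_j = 1."""
    V = np.asarray(vertices, dtype=float)          # one vertex per row
    M = np.vstack([V.T, np.ones(len(V))])          # affine system matrix
    rhs = np.append(np.asarray(beta, dtype=float), 1.0)
    lam, *_ = np.linalg.lstsq(M, rhs, rcond=None)
    return lam

def circuit_number(vertices, outer_coeffs, beta):
    """Theta_f = prod_j (f_alpha(j) / lam_j) ** lam_j (assumes all lam_j > 0)."""
    lam = barycentric_coordinates(vertices, beta)
    c = np.asarray(outer_coeffs, dtype=float)
    return float(np.prod((c / lam) ** lam))

def is_nonnegative_circuit(vertices, outer_coeffs, beta, inner_coeff, tol=1e-9):
    """Circuit number test: |f_beta| <= Theta_f, or f_beta >= -Theta_f if beta is even."""
    theta = circuit_number(vertices, outer_coeffs, beta)
    beta_is_even = all(int(b) % 2 == 0 for b in beta)
    if beta_is_even:
        return inner_coeff >= -theta - tol
    return abs(inner_coeff) <= theta + tol

# Motzkin polynomial: outer exponents (4,2), (2,4), (0,0) with coefficients 1, inner term -3 x^2 y^2.
motzkin_vertices = [(4, 2), (2, 4), (0, 0)]
motzkin_outer = [1.0, 1.0, 1.0]
motzkin_beta = (2, 2)
```

For the Motzkin polynomial the barycentric coordinates are all $1/3$, so $\Theta_f = 3$, and the inner coefficient $-3$ sits exactly on the boundary of the nonnegativity condition. Replacing it by $-3.1$ makes the test fail.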

Let

Following Reznick, we define maximal mediated sets; note that these objects are well-defined due to [Rez89, Theorem 2.2]

Let such that . We call a set (-)mediated if every element of is the midpoint of two distinct points in .

We define the maximal mediated set as the unique -mediated set which contains every other mediated set.

Let be a simplex. If , then we call an -simplex. If consist only of and the midpoints of the vertices, then we call an -simplex.

Generalizing a result by Reznick in [Rez89], Iliman and the second author proved that maximal mediated sets are exactly the correct object for determining whether a nonnegative circuit polynomial is a sum of squares.

[[IdW16], Theorem 5.2] Let be a nonnegative circuit polynomial with inner term . Then is a sum of squares if and only if is a sum of monomial squares or if .

Especially is always an SOS if is an -simplex, and is never an SOS if is an -simplex.

For further details about SONCs see e.g., [dW15, DIdW17, DKdW18, IdW16, SdW18]. A description of the dual of the SONC cone was recently provided in [DNT18], which we, however, do not need for the purpose of this article.

2.5. Comparing proof systems

In this section we introduce the notation used for comparing the proof systems presented in Section 2, from the proof complexity perspective. For a gentle introduction to proof complexity we refer the reader to e.g., [Raz16].

Following the notation in section 2: Let be a set of polynomials which we axiomatically assume to be nonnegative and be a set of polynomials, which define the semialgebraic set . The GEN proof system is the set of all algebraic derivations such that deducing nonnegativity of polynomials over . Analogously, the GEN* proof system is the set of all algebraic derivations such that deducing nonnegativity of polynomials over . The proof systems SOS, SOS*, SDSOS, SDSOS*, SA, SA*, SONC and SONC* are defined analogously.

The complexity of the certificate depends on the needed to certify the nonnegativity. Revising section 1 we say that a proof system contains a proof system if for every set of polynomials and a polynomial admitting a degree certificate of nonnegativity over in , admits also a degree certificate in . A system strictly contains if contains but does not contain . I.e., there exist at least one set and a polynomial nonnegative over such that admits a degree certificate in but for every does not admit a degree certificate in . Systems and are polynomially equivalent if contains and contains . Finally, systems and are polynomially incomparable if neither contains , nor contains . I.e., there exist sets , and polynomials , nonnegative over , , respectively, such that admits a degree certificate in but for every does not admit a degree certificate in and admits a degree certificate in but for every does not admit a degree certificate in .

3. SOS vs. SONC

It is well-known that the SOS cone and the SONC cone are not contained in each other [IdW16, Proposition 7.2]. This statement, however, does not determine whether these systems are polynomially equivalent for CPOPs. In this section we show that for every there exist CPOPs such that the difference between the minimal degrees of an SOS and a SONC certificate is arbitrarily large and vice versa.

3.1. SONC does not contain SOS

We consider the following family of polynomials:

We define the family of signed quadrics by .

It is obvious that every is SOS and that its zero set is the unit ball of the 1-norm, i.e., for all we have

(4)

The support of and is depicted together with their Newton polytopes in fig. 2.

Figure 2. The support and the Newton polytopes of and .

It is known that for every the function cannot be written as a combination of -juntas [Lee15, Theorem 1.12]. It is, however, also straightforward to conclude that for every the polynomial is not a SONC polynomial.

For all it holds that .

Proof.

By eq. 4 the real zero set is equal to the boundary of the -dimensional cross-polytope; see e.g., [Zie07]. In particular, it is an dimensional piecewise-linear set. A SONC, however, has at most many distinct real zeros by [IdW16, Corollary 3.9]. ∎

In [SdW18, Example 3.7] it is shown by a term by term inspection that is not a SONC. We point out that one could build on that argument and reprove fig. 2 inductively, using the fact that the support set of equals the support set of restricted to a specific -face of .

For every with and every there exist infinitely many systems such that