1 Introduction
A fundamental problem in mathematical optimization is to maximize a linear function over a convex constraint set. An important consideration for the development of tractable numerical algorithms for such problems is that the constraint set admit a compact description. A powerful paradigm for obtaining compact descriptions of convex sets is based on lifts over structured convex cones. More precisely, given a convex set $C$, an extended formulation or lift of $C$ over a (full-dimensional closed) convex cone $\mathcal{K}$ is a description of $C$ as the projection of an affine slice of the cone $\mathcal{K}$; i.e., $C = \pi(\mathcal{K} \cap L)$, where $\pi$ is a linear projection and $L$ is an affine subspace.
Given a lift, the problem of maximizing a linear function over $C$ reduces to solving a linear conic program over the cone $\mathcal{K}$. Linear conic programs (LCPs) capture many important classes of optimization problems; for instance, LCPs where the cone $\mathcal{K}$ is the $n$-dimensional nonnegative orthant $\mathbb{R}^n_+$ correspond to Linear Programs (LPs), LCPs where $\mathcal{K}$ is the cone of positive semidefinite (PSD) matrices correspond to Semidefinite Programs (SDPs), and LCPs where $\mathcal{K}$ is the $n$-dimensional second-order cone correspond to Second-Order Cone Programs (SOCPs). We often refer to extended formulations over these cones as LP-, SDP-, and SOCP-lifts, respectively.
Lifted descriptions are a powerful tool for optimization because many convex sets whose descriptions appear complex on the surface (for instance, because they are specified by a number of inequalities that is exponential in the ambient dimension) do in fact admit compact lifted descriptions. Examples of such constraint sets include the unit ball of the $\ell_1$-norm, the spanning tree polytope [20, 27], and the permutahedron [10].
The choice of cone also matters; the smallest known LP-lift for the stable set polytope of a perfect graph on $n$ vertices has quasipolynomial size $n^{O(\log n)}$ [28], whereas there exists an SDP-lift of size $O(n)$ using the theta body [21], and in fact it is the latter representation that gives rise to the only known polynomial-time algorithm for finding the largest stable set in a perfect graph. Moreover, the dimension of the cone is fundamentally linked to the computational complexity of solving the associated linear conic program. As such, for an extended formulation to be practically useful for optimization, it is important to seek extended formulations involving cones of low dimensionality.
Despite the usefulness of extended formulations, it is not immediately clear how one systematically searches for such descriptions. To this end, Yannakakis showed that LP-lifts of polytopes are fundamentally linked to the existence of a structured matrix factorization of a slack matrix [28], while Gouveia, Parrilo, and Thomas [11] extended this connection to lifts over more general cones.
We explain this connection more precisely. Given an entrywise nonnegative matrix $M \in \mathbb{R}^{m \times n}_+$ and a (full-dimensional convex) cone $\mathcal{K}$ lying in an inner product space $(\mathcal{V}, \langle \cdot, \cdot \rangle)$, the cone factorization problem concerns finding two collections of vectors $\{a_i\}_{i=1}^m$ in the cone $\mathcal{K}$ and $\{b_j\}_{j=1}^n$ in the dual cone $\mathcal{K}^*$ where

$M_{ij} = \langle a_i, b_j \rangle$ for all $i \in [m],\ j \in [n]$. (1)
In the case where $\mathcal{K} = \mathbb{R}^r_+$, the cone factorization problem reduces to the Nonnegative Matrix Factorization (NMF) problem, in which we seek collections of nonnegative vectors $\{a_i\}_{i=1}^m$ and $\{b_j\}_{j=1}^n \subseteq \mathbb{R}^r_+$ such that $M_{ij} = a_i^\top b_j$ for all $i$ and $j$. The smallest $r$ for which $M$ has an $r$-dimensional NMF is the nonnegative rank of the matrix $M$. NMFs were initially studied within the field of linear algebra (see, e.g., [4]), and later gained prominence as a dimensionality reduction tool providing interpretable parts-based representations of nonnegative data [18].
In the context of LP-lifts, Yannakakis [28] showed that the existence of a lift of size $r$ for a polytope $P$ is equivalent to the existence of an $r$-dimensional NMF of its slack matrix $S_P$, a nonnegative rectangular matrix whose rows are indexed by the facet-defining inequalities $h_i(x) \geq 0$ of $P$, whose columns are indexed by the extreme points $v_j$ of $P$, and whose $(i,j)$-entry is given by $h_i(v_j)$. Subsequently, Gouveia, Parrilo, and Thomas [11] showed that a polytope $P$ admits an extended formulation over a (full-dimensional) closed convex cone $\mathcal{K}$ if its slack matrix admits an exact cone factorization over $\mathcal{K}$. In fact, this relationship holds for convex sets beyond polyhedral ones, although it is then necessary to extend the notion of a slack matrix to that of a slack operator indexed over the infinite set of extreme points and/or facets.
Cone factorizations also have important applications beyond optimization. One prominent example is in quantum information science, where factorizations over the PSD cone are relevant to the quantum correlation generation problem [13] and to one-way quantum communication complexity [9]. Specifically, given a nonnegative matrix $M \in \mathbb{R}^{m \times n}_+$, the positive semidefinite matrix factorization (PSDMF) problem concerns computing two families of $r \times r$ PSD matrices $\{A_i\}_{i=1}^m$ and $\{B_j\}_{j=1}^n$ such that $M_{ij} = \mathrm{tr}(A_i B_j)$ for all $i, j$. The smallest $r$ for which $M$ admits an $r$-dimensional PSDMF is known as the PSD-rank of $M$ [7].
Computing cone factorizations.
The task of computing an exact cone factorization is in general intractable. Concretely, in the specific setting where $\mathcal{K}$ is the nonnegative orthant, Vavasis showed that even computing the nonnegative rank is NP-hard [26]. On the positive side, Arora et al. propose an algorithm for computing $r$-dimensional NMFs whose complexity is polynomial in the dimensions of the matrix, provided the rank parameter $r$ is held constant [1].
Despite the hardness of NMF, a wide range of numerical algorithms for computing (approximate) NMFs have been developed and deployed across data-analytic applications. Most, if not all, algorithmic approaches are based on the principle of alternating minimization. One of the most prominent approaches for computing NMFs is the Multiplicative Update (MU) algorithm proposed by Lee and Seung [19]. The update scheme is based on the majorization-minimization framework [17], and it operates by performing pointwise scaling by carefully selected nonnegative weights. Indeed, it is the simplicity of the MU scheme that drives its popularity. Alternative methods include variants of projected gradient methods and coordinate descent.
There is also a wide range of methods for computing PSDMFs, which operate in a similar vein by alternating minimization. In particular, Vandaele et al. propose algorithmic approaches based on the projected gradient method and coordinate descent [24], while the authors of [14, 15, 16] develop algorithms by drawing connections between PSDMFs and the affine rank minimization and phase retrieval problems from the signal processing literature. Recently, Soh and Varvitsiotis introduced the Matrix Multiplicative Update (MMU) method, the analogue of the MU scheme for computing PSDMFs, in which updates are performed by congruence scaling with appropriately chosen PSD matrices [22]. In particular, the MMU scheme retains the simplicity that the MU update scheme for NMF offers.
Contributions.
In this work, we introduce the symmetric-cone multiplicative update (SCMU) algorithm for computing cone factorizations in the case where the cone $\mathcal{K}$ is symmetric; that is, it is self-dual and homogeneous (i.e., the automorphism group of $\mathcal{K}$ acts transitively on its interior). The SCMU algorithm reduces to Lee and Seung's Multiplicative Update algorithm [19] for NMF when $\mathcal{K}$ is the nonnegative orthant, and to the Matrix Multiplicative Update algorithm of [22] for PSDMF when $\mathcal{K}$ is the cone of PSD matrices.
The SCMU scheme is based on the principles of the majorization-minimization framework, and is multiplicative in the sense that iterates are updated by applying an appropriately chosen automorphism of the interior of the cone $\mathcal{K}$. As a result, the SCMU algorithm ensures that iterates remain in the interior of the cone, provided it is initialized there. In terms of performance guarantees, we prove that the squared Euclidean loss is nonincreasing along the trajectories of the SCMU algorithm, and moreover that fixed points of our scheme correspond to first-order stationary points. Additionally, if the starting factorization lies in a direct sum of simple symmetric cones, the direct sum structure is preserved throughout the execution of the SCMU algorithm. As such, the SCMU algorithm also specifies a method for computing hybrid cone factorizations.
In terms of applications, when applied to computing cone factorizations of slack matrices, the SCMU algorithm provides a practical way to compute conic lifts of convex sets; for instance, it can be used to compute approximate SOCP-lifts when $\mathcal{K}$ is chosen to be a product of second-order cones. In particular, we give explicit examples where we apply the SCMU algorithm to compute SOCP-lifts of regular polygons.
Paper Organization.
In Section 2, we provide the necessary background material concerning Euclidean Jordan algebras and symmetric cones. In Section 3, we describe our approach and derive the multiplicative update algorithm for symmetric-cone factorizations. In Section 4, we show that the squared Euclidean loss is nonincreasing along the algorithm's trajectories and that fixed points satisfy the first-order optimality conditions. In Section 5, we conclude with numerical experiments focusing on lifts over the second-order cone.
2 Euclidean Jordan Algebras and Symmetric Cones
In this section, we provide brief background on symmetric cones needed to describe our algorithm and its analysis. Our discussion requires the formal language of Euclidean Jordan algebras (EJAs), from which symmetric cones arise. For further details and omitted proofs, we refer the reader to [5, 25].
Jordan Algebras. Let $\mathcal{V}$ be a finite-dimensional vector space endowed with a bilinear product $\circ : \mathcal{V} \times \mathcal{V} \to \mathcal{V}$. We say that the pair $(\mathcal{V}, \circ)$ forms a Jordan algebra if the following properties hold for all $x, y \in \mathcal{V}$:
$x \circ y = y \circ x$ (commutativity), and $x \circ (x^2 \circ y) = x^2 \circ (x \circ y)$ (the Jordan identity).
Here, we use the shorthand notation $x^2 := x \circ x$. In the remainder of this paper, we assume that the Jordan algebra has an identity element; that is, there exists $e \in \mathcal{V}$ such that $x \circ e = e \circ x = x$ for all $x \in \mathcal{V}$.
Euclidean Jordan Algebras. A Jordan algebra $(\mathcal{V}, \circ)$ over $\mathbb{R}$ which is equipped with an associative inner product $\langle \cdot, \cdot \rangle$ (i.e., one satisfying $\langle x \circ y, z \rangle = \langle y, x \circ z \rangle$ for all $x, y, z \in \mathcal{V}$) is called Euclidean.
Spectral decomposition and powers. Our algorithm requires the computation of square roots as well as inverses of Jordan algebra elements. To explain how these are defined, we require a spectral decomposition theorem for EJAs. In what follows, we describe a version of this theorem based on Jordan frames (see the Type II spectral decomposition [5, Theorem III.1.2]).
More concretely, an idempotent is an element $c \in \mathcal{V}$ satisfying $c^2 = c$. We say that an idempotent is primitive if it is nonzero and cannot be written as a sum of two nonzero idempotents. We say that a collection of primitive idempotents $\{c_1, \ldots, c_r\}$ forms a Jordan frame if they satisfy $c_i \circ c_j = 0$ if and only if $i \neq j$, and $\sum_{i=1}^r c_i = e$. It follows that distinct elements of a Jordan frame are orthogonal with respect to the inner product $\langle \cdot, \cdot \rangle$.
For every $x \in \mathcal{V}$ there exist a Jordan frame $\{c_1, \ldots, c_r\}$ and real numbers $\lambda_1, \ldots, \lambda_r$ with $x = \sum_{i=1}^r \lambda_i c_i$. Here, the value $r$ is called the rank of $\mathcal{V}$. The scalars $\lambda_i$ are called the eigenvalues of $x$ and, up to reordering and accounting for multiplicities, are uniquely specified.
Given $x \in \mathcal{V}$ with spectral decomposition $x = \sum_{i=1}^r \lambda_i c_i$, we define its $p$-th power as $x^p = \sum_{i=1}^r \lambda_i^p c_i$ whenever $\lambda_i^p$ exists for all $i$. We say that an element is invertible if all its eigenvalues are nonzero.
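In the spin-factor algebra underlying the second-order cone (see the classification below), an element $x = (t, u)$ has exactly two eigenvalues $t \pm \|u\|_2$ with Jordan frame $c_{1,2} = \tfrac{1}{2}(1, \pm u/\|u\|_2)$, so spectral powers admit closed forms. The sketch below (function names are our own, not from the paper) computes them and checks that the Jordan square root squares back to the original element:

```python
import numpy as np

def spin_spectral(x):
    """Spectral decomposition x = l1*c1 + l2*c2 in the spin-factor EJA,
    where x = (t, u) is stored as one vector with t = x[0], u = x[1:]."""
    t, u = x[0], x[1:]
    nu = np.linalg.norm(u)
    # if u = 0 the frame is degenerate; any unit direction works, we use 0
    w = u / nu if nu > 0 else np.zeros_like(u)
    c1 = 0.5 * np.concatenate(([1.0], w))
    c2 = 0.5 * np.concatenate(([1.0], -w))
    return (t + nu, t - nu), (c1, c2)

def spin_power(x, p):
    """x^p = l1^p c1 + l2^p c2, defined whenever l_i^p exists."""
    (l1, l2), (c1, c2) = spin_spectral(x)
    return l1**p * c1 + l2**p * c2

x = np.array([2.0, 1.0, 0.0])   # eigenvalues 3 and 1: interior of the cone
s = spin_power(x, 0.5)          # Jordan square root

# The Jordan product is (s0, su) o (t0, tv) = (s0*t0 + su.tv, s0*tv + t0*su),
# so s o s = (s.s, 2*s0*su) should recover x.
sq = np.concatenate(([s @ s], 2 * s[0] * s[1:]))
assert np.allclose(sq, x)
```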
The canonical trace inner product. In our description of an EJA so far, we have only assumed the existence of an inner product. There is in fact a canonical choice: given the spectral decomposition $x = \sum_{i=1}^r \lambda_i c_i$, define the trace $\mathrm{tr}(x)$ to be the sum of the eigenvalues $\sum_{i=1}^r \lambda_i$. Then, the bilinear mapping $(x, y) \mapsto \mathrm{tr}(x \circ y)$ is positive definite, symmetric, and also satisfies the associativity property (see, for instance, [5, Proposition II.4.3]). Furthermore, in the case where the EJA is simple (i.e., it cannot be expressed as the direct sum of smaller EJAs), any associative inner product is a positive scalar multiple of the canonical one ([5, Proposition III.4.1]). In the remainder of this paper, we use the notation $\langle x, y \rangle := \mathrm{tr}(x \circ y)$ to denote the canonical EJA inner product.
The Lyapunov transformation and the quadratic representation. Consider an EJA $(\mathcal{V}, \circ)$. As the algebra product is bilinear, given any $x \in \mathcal{V}$, there is a linear map $L_x$ satisfying $L_x y = x \circ y$ for all $y \in \mathcal{V}$. We note that it is fairly common in the literature to express the defining properties of an EJA in terms of the Lyapunov operator $L_x$. For instance, the Jordan identity is equivalent to the requirement that the operator $L_x$ commutes with $L_{x^2}$, while associativity of the inner product is equivalent to the requirement $\langle L_x y, z \rangle = \langle y, L_x z \rangle$; i.e., the operator $L_x$ is symmetric with respect to the trace inner product.
The quadratic representation of $x \in \mathcal{V}$ is defined by $Q_x := 2 L_x^2 - L_{x^2}$. The term quadratic alludes to the fact that $Q_x e = x^2$. The quadratic representation satisfies the following properties:
$Q_x e = x^2$, (2)
$Q_{x^p} = Q_x^p$ for all integers $p$ (and all real $p$ when $x \in \operatorname{int} \mathcal{K}_{\mathcal{V}}$), (3)
$Q_{Q_x y} = Q_x Q_y Q_x$ for all $x, y \in \mathcal{V}$, (4)
$Q_x x^{-1} = x$ for invertible $x$, (5)
$Q_{x^{-1}} = Q_x^{-1}$ for invertible $x$, (6)
$Q_x(\mathcal{K}_{\mathcal{V}}) = \mathcal{K}_{\mathcal{V}}$ and $Q_x(\operatorname{int} \mathcal{K}_{\mathcal{V}}) = \operatorname{int} \mathcal{K}_{\mathcal{V}}$ for $x \in \operatorname{int} \mathcal{K}_{\mathcal{V}}$; i.e., $Q_x$ is an automorphism of the cone of squares $\mathcal{K}_{\mathcal{V}}$ (defined below), (7)
$Q_x$ is self-adjoint with respect to the trace inner product, (8)
$Q_x$ is positive definite for $x \in \operatorname{int} \mathcal{K}_{\mathcal{V}}$. (9)
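In the algebra of real symmetric matrices, where the quadratic representation acts by congruence, $Q_X(Y) = XYX$ (see Section 3.1), two of the standard identities of $Q$ can be checked numerically. A minimal sketch on random symmetric matrices:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4

def sym(G):
    # symmetrize a random square matrix
    return (G + G.T) / 2

def Q(x):
    """Quadratic representation in the EJA of symmetric matrices: Q_x(y) = x y x."""
    return lambda y: x @ y @ x

x = sym(rng.standard_normal((n, n)))
y = sym(rng.standard_normal((n, n)))
z = sym(rng.standard_normal((n, n)))

# Q_x e = x^2, where e is the identity matrix.
assert np.allclose(Q(x)(np.eye(n)), x @ x)

# Q_{Q_x y} = Q_x Q_y Q_x, checked on the test point z:
# (xyx) z (xyx) = x (y (x z x) y) x.
assert np.allclose(Q(Q(x)(y))(z), Q(x)(Q(y)(Q(x)(z))))
```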
Cone of squares and symmetric cones. Given an inner product space $(\mathcal{V}, \langle \cdot, \cdot \rangle)$, we say that a cone $\mathcal{K} \subseteq \mathcal{V}$ is symmetric if:

$\mathcal{K}$ is self-dual; i.e., $\mathcal{K} = \mathcal{K}^* := \{ y \in \mathcal{V} : \langle x, y \rangle \geq 0 \text{ for all } x \in \mathcal{K} \}$, and

$\mathcal{K}$ is homogeneous; i.e., given $x, y \in \operatorname{int} \mathcal{K}$, there is an invertible linear map $A$ such that $A(\mathcal{K}) = \mathcal{K}$ and $A x = y$ (i.e., the automorphism group of $\mathcal{K}$ acts transitively on $\operatorname{int} \mathcal{K}$).
Given an EJA $(\mathcal{V}, \circ)$, the cone of squares is defined as the set $\mathcal{K}_{\mathcal{V}} := \{ x^2 : x \in \mathcal{V} \}$.
The set $\mathcal{K}_{\mathcal{V}}$ is self-dual with respect to the trace inner product, and hence closed and convex.
The key connection between symmetric cones and EJAs is that all symmetric cones arise as cones of squares of some EJA; see, e.g., [5, Theorem III.3.1]. In particular, the homogeneity property of the cone of squares follows from the existence of a scaling point; i.e., for any $x, y \in \operatorname{int} \mathcal{K}_{\mathcal{V}}$ it holds that
$Q_w x = y$ for $w = Q_{x^{-1/2}} \left( Q_{x^{1/2}} y \right)^{1/2}$. (10)
As both $x, y \in \operatorname{int} \mathcal{K}_{\mathcal{V}}$, it follows by (9) that $Q_{x^{-1/2}}$ and $Q_{(Q_{x^{1/2}} y)^{1/2}}$ are positive definite. Consequently, the product $Q_w = Q_{x^{-1/2}} Q_{(Q_{x^{1/2}} y)^{1/2}} Q_{x^{-1/2}}$ (recall (4)) is also positive definite and thus invertible. By (7), $Q_w$ defines an automorphism of $\mathcal{K}_{\mathcal{V}}$.
Classification of EJAs and Symmetric Cones. There is in fact a complete classification of all EJAs. Let $\mathbb{C}$, $\mathbb{H}$, and $\mathbb{O}$ denote the fields of complex numbers, quaternions, and octonions, respectively. Then every finite-dimensional EJA is isomorphic to a direct sum of the following simple EJAs:

Symmetric matrices over $\mathbb{R}$, or Hermitian matrices over $\mathbb{C}$ or $\mathbb{H}$, with the product $x \circ y = (xy + yx)/2$.

The space $\mathbb{R} \times \mathbb{R}^{n-1}$ with the product $(s, u) \circ (t, v) = (st + \langle u, v \rangle,\, s v + t u)$.

The space of 3-by-3 Hermitian matrices over $\mathbb{O}$ with the product $x \circ y = (xy + yx)/2$.

Subsequently, since all symmetric cones arise as cones of squares of some EJA, it follows that any symmetric cone corresponds to a direct sum of:

Symmetric PSD matrices over $\mathbb{R}$, or Hermitian PSD matrices over $\mathbb{C}$ or $\mathbb{H}$.

Second-order cones, i.e., $\mathcal{Q}^n := \{ (t, u) \in \mathbb{R} \times \mathbb{R}^{n-1} : t \geq \|u\|_2 \}$.

3-by-3 PSD matrices over the octonions $\mathbb{O}$.
In Table 1, we summarize basic information for the EJAs corresponding to the cone of positive semidefinite matrices and the second-order cone.
The Geometric Mean. Given a symmetric cone $\mathcal{K}$, the (metric) geometric mean of two elements $a, b \in \operatorname{int} \mathcal{K}$ is defined as the scaling point that takes $a^{-1}$ to $b$ [23], i.e.,
$a \,\#\, b := Q_{a^{1/2}} \left( Q_{a^{-1/2}} b \right)^{1/2}$. (11)
As established in (10), the geometric mean satisfies:
$Q_{a \# b}\, a^{-1} = b$. (12)
In fact, it turns out that the geometric mean is the unique element of $\operatorname{int} \mathcal{K}$ satisfying (12). Indeed, suppose $w \in \operatorname{int} \mathcal{K}$ satisfies $Q_w a^{-1} = b$. We then have
$Q_{a^{-1/2}} b = Q_{a^{-1/2}} Q_w Q_{a^{-1/2}}\, e = Q_{Q_{a^{-1/2}} w}\, e = \left( Q_{a^{-1/2}} w \right)^2$. (13)
As $Q_{a^{-1/2}} w$ and $(Q_{a^{-1/2}} b)^{1/2}$ are both elements of $\operatorname{int} \mathcal{K}$ (recall (7)), the equality in (13) implies that $Q_{a^{-1/2}} w = (Q_{a^{-1/2}} b)^{1/2}$ and thus $w = a \,\#\, b$.
Finally, the geometric mean satisfies the following useful properties (see, e.g., [23]):
$a \,\#\, b = b \,\#\, a \quad \text{and} \quad (a \,\#\, b)^{-1} = a^{-1} \,\#\, b^{-1}$. (14)
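For positive definite matrices, where the geometric mean takes the familiar closed form $A \,\#\, B = A^{1/2} (A^{-1/2} B A^{-1/2})^{1/2} A^{1/2}$ (see Section 3.1), the defining scaling-point property (12) and the symmetry of the mean can be verified numerically. A minimal sketch, using an eigendecomposition-based matrix square root:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4

def random_pd(k):
    # well-conditioned random positive definite matrix
    G = rng.standard_normal((k, k))
    return G @ G.T + k * np.eye(k)

def sqrtm_pd(a):
    # symmetric square root of a positive definite matrix via eigendecomposition
    w, v = np.linalg.eigh(a)
    return (v * np.sqrt(w)) @ v.T

def geometric_mean(a, b):
    # A # B = A^{1/2} (A^{-1/2} B A^{-1/2})^{1/2} A^{1/2}
    rh = sqrtm_pd(a)
    ri = np.linalg.inv(rh)
    return rh @ sqrtm_pd(ri @ b @ ri) @ rh

a, b = random_pd(n), random_pd(n)
g = geometric_mean(a, b)

# g is the scaling point taking a^{-1} to b: Q_g(a^{-1}) = g a^{-1} g = b.
assert np.allclose(g @ np.linalg.inv(a) @ g, b)
# Symmetry of the mean: a # b = b # a.
assert np.allclose(g, geometric_mean(b, a))
```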
Cone of squares | PSD matrices $\mathcal{S}^n_+$ | Second-order cone $\mathcal{Q}^n$
Ambient dimension | $n(n+1)/2$ | $n$
Identity | $I_n$ | $(1, 0)$
Trace | $\sum_{i=1}^n \lambda_i(X)$ | $2t$ for $x = (t, u)$
Rank (of interior) | $n$ | $2$
Eigendecomposition | spectral decomposition $X = \sum_i \lambda_i u_i u_i^\top$ | $x = (t + \|u\|_2) c_1 + (t - \|u\|_2) c_2$ with $c_{1,2} = \tfrac{1}{2}(1, \pm u / \|u\|_2)$
3 Deriving the Symmetric-cone Multiplicative Update Algorithm
Let $\mathcal{K}$ be a symmetric cone in $(\mathcal{V}, \langle \cdot, \cdot \rangle)$, and let $M \in \mathbb{R}^{m \times n}_+$ be an entrywise nonnegative matrix. Our goal is to compute vectors $\{a_i\}_{i=1}^m$ and $\{b_j\}_{j=1}^n$ belonging to the cone $\mathcal{K}$ such that $M_{ij} \approx \langle a_i, b_j \rangle$.
More formally, we frame our problem as a minimization instance over the squared loss objective:
$\min_{\{a_i\}, \{b_j\} \subseteq \mathcal{K}} \; \sum_{i=1}^m \sum_{j=1}^n \left( M_{ij} - \langle a_i, b_j \rangle \right)^2$. (15)
In order to (approximately) solve (15), we alternate between minimizing over the $a_i$'s and the $b_j$'s. Consider the subproblem corresponding to fixing the variables $\{b_j\}_{j=1}^n$ and minimizing over the $a_i$'s. The objective (15) is separable in the $a_i$'s. After dropping the index $i$ to simplify our notation, the problem simplifies to
$\min_{x \in \mathcal{K}} \; f(x) := \tfrac{1}{2} \left\| \mathcal{A}(x) - m \right\|_2^2$, (16)
where $m$ is a fixed vector (the corresponding row of $M$) and $\mathcal{A} : \mathcal{V} \to \mathbb{R}^n$ is the linear mapping $\mathcal{A}(x) = \left( \langle x, b_1 \rangle, \ldots, \langle x, b_n \rangle \right)$.
We solve the convex optimization problem (16) via the Majorization-Minimization (MM) approach (see, e.g., [17] and references therein). In order to solve a problem within the MM framework, we need to identify a family of auxiliary functions $g(\cdot \mid z)$, indexed by $z$, satisfying the conditions
$g(x \mid z) \geq f(x)$ for all $x$, and $g(z \mid z) = f(z)$. (17)
The MM update scheme is given by $x^{t+1} \in \operatorname{argmin}_x \; g(x \mid x^t)$.
Consequently, the objective is nonincreasing along the trajectory of the iterates $\{x^t\}$, which follows easily from the chain of inequalities $f(x^{t+1}) \leq g(x^{t+1} \mid x^t) \leq g(x^t \mid x^t) = f(x^t)$.
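The descent mechanism of MM can be illustrated with the simplest possible majorizer for a least-squares objective, $g(x \mid z) = f(z) + \langle \nabla f(z), x - z \rangle + \tfrac{L}{2}\|x - z\|^2$ with $L$ an upper bound on the Hessian norm, whose MM update is a gradient step (a sketch of the generic framework only, not of the SCMU majorizer):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((10, 5))
m = rng.standard_normal(10)

f = lambda x: 0.5 * np.linalg.norm(A @ x - m) ** 2
grad = lambda x: A.T @ (A @ x - m)

# Majorizer g(x|z) = f(z) + <grad f(z), x - z> + (L/2)||x - z||^2, valid
# whenever L >= ||A^T A||; its unconstrained minimizer is x <- z - grad f(z)/L.
L = np.linalg.norm(A.T @ A, 2)

x = np.zeros(5)
losses = [f(x)]
for _ in range(50):
    x = x - grad(x) / L   # MM update = minimize the majorizer
    losses.append(f(x))

# The objective is nonincreasing along the MM trajectory.
assert all(a >= b - 1e-12 for a, b in zip(losses, losses[1:]))
```

The SCMU derivation below follows the same template, but with a cone-adapted curvature operator in place of $L \cdot \mathrm{Id}$.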
Recall that in our setting, the goal is to minimize the function $f$ over the symmetric cone $\mathcal{K}$. As the objective is quadratic, the Taylor expansion of $f$ at the past iterate $x^t$ is exact:
$f(x) = f(x^t) + \langle \nabla f(x^t), x - x^t \rangle + \tfrac{1}{2} \langle x - x^t, \mathcal{A}^* \mathcal{A} (x - x^t) \rangle$. (18)
Suppose we restrict our search to quadratic auxiliary functions of the form
$g(x \mid x^t) = f(x^t) + \langle \nabla f(x^t), x - x^t \rangle + \tfrac{1}{2} \langle x - x^t, H (x - x^t) \rangle$ (19)
for some appropriate linear operator $H$. Using such an auxiliary function, to compute the next iterate we need to minimize the function in (19) over the cone:
$x^{t+1} \in \operatorname{argmin}_{x \in \mathcal{K}} \; g(x \mid x^t)$. (20)
Note that although the objective (19) is strictly convex whenever the choice of $H$ is positive definite, it is not immediately clear how one handles the constraint $x \in \mathcal{K}$. Instead, we show that it is possible to make a specific choice of $H$ such that the minimizer of the unconstrained auxiliary problem (20) lies in the interior of $\mathcal{K}$. In such a setting, we would have
$\min_{x \in \mathcal{K}} g(x \mid x^t) = \min_{x \in \mathcal{V}} g(x \mid x^t)$, (21)
and hence we can simply calculate $x^{t+1}$ as the unconstrained minimizer of (19), namely
$x^{t+1} = x^t - H^{-1} \nabla f(x^t)$. (22)
Furthermore, suppose we pick $H = Q_u^{-1}$ for some $u \in \operatorname{int} \mathcal{K}$ chosen so that
$Q_u \left( \mathcal{A}^* \mathcal{A}(x^t) \right) = x^t$. (23)
Then, the MM update is given by $x^{t+1} = x^t - Q_u \mathcal{A}^* \left( \mathcal{A}(x^t) - m \right) = Q_u \mathcal{A}^*(m)$.
Suppose that $b_1, \ldots, b_n \in \mathcal{K}$, and that the vector $m$ is nonzero. From (7), it follows that $\mathcal{A}^*(m) = \sum_{j=1}^n m_j b_j$ is a nonnegative linear combination of elements of $\mathcal{K}$, and hence $x^{t+1} = Q_u \mathcal{A}^*(m) \in \mathcal{K}$, with $x^{t+1} \in \operatorname{int} \mathcal{K}$ so long as $\mathcal{A}^*(m) \in \operatorname{int} \mathcal{K}$.
The equation (23) specifies $u$ uniquely; specifically, (23) states that $Q_u$ takes $\mathcal{A}^* \mathcal{A}(x^t)$ to $x^t$, which by the scaling-point interpretation of the geometric mean (recall (12)) implies that
$u = \left( \mathcal{A}^* \mathcal{A}(x^t) \right)^{-1} \#\; x^t$. (24)
Note that for (24) to exist, we require $x^t$ and $\mathcal{A}^* \mathcal{A}(x^t)$ to be in $\operatorname{int} \mathcal{K}$.
Finally, to check that the function given in (19) corresponding to the choice $H = Q_u^{-1}$ is indeed an auxiliary function, we need to verify that the two conditions in (17) are satisfied. The proof of this fact is deferred to the next section and is the main technical result of this paper.
Summarizing the preceding discussion, employing the MM approach to minimize the function $f$ over the symmetric cone $\mathcal{K}$ using an auxiliary function of the form (19) with $H = Q_u^{-1}$ leads to the following update rule:
$x^{t+1} = Q_{\left( \mathcal{A}^* \mathcal{A}(x^t) \right)^{-1} \#\; x^t} \; \mathcal{A}^*(m)$.
In a similar fashion, we get an update rule for the $b_j$'s when the $a_i$'s are fixed, where $\mathcal{A}$ is replaced by the linear map $x \mapsto \left( \langle x, a_1 \rangle, \ldots, \langle x, a_m \rangle \right)$.
We summarize our procedure for computing factorizations over symmetric cones in Algorithm 1.
Input: A nonnegative matrix $M \in \mathbb{R}^{m \times n}_+$
Output: Vectors $\{a_i\}_{i=1}^m, \{b_j\}_{j=1}^n \subseteq \mathcal{K}$ with $M_{ij} \approx \langle a_i, b_j \rangle$
3.1 Two important special cases
First, we specialize the SCMU algorithm to the setting $\mathcal{K} = \mathbb{R}^d_+$, in which case the SCMU algorithm gives an iterative method for computing NMFs. The nonnegative orthant is the cone of squares of the EJA $(\mathbb{R}^d, \circ)$, where the Jordan product is componentwise multiplication, i.e., $x \circ y = (x_1 y_1, \ldots, x_d y_d)$. Moreover, the trace of $x$ is the sum of its entries (its 1-norm on the nonnegative orthant), the Lyapunov transformation is $L_x = \mathrm{Diag}(x)$, and the quadratic representation is $Q_x = \mathrm{Diag}(x)^2$. Finally, the metric geometric mean (11) of $a, b \in \operatorname{int} \mathbb{R}^d_+$ is given componentwise by $(a \,\#\, b)_k = \sqrt{a_k b_k}$.
Putting everything together, the $k$-th coordinate of the vector $x^t$ is updated by:
$x^{t+1}_k = x^t_k \; \frac{\left( \mathcal{A}^*(m) \right)_k}{\left( \mathcal{A}^* \mathcal{A}(x^t) \right)_k}$. (25)
These updates correspond to the multiplicative update rule introduced by Lee and Seung in [19], which is one of the most widely used approaches for calculating NMFs.
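In matrix form, with nonnegative factors $W$ and $H$ approximating $M \approx WH$, the coordinatewise rule (25) becomes the familiar Lee-Seung ratio updates. A self-contained sketch (the small epsilon guarding the denominators is our own addition; dimensions are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)
m, n, r = 8, 6, 3
M = rng.random((m, n))          # entrywise nonnegative data
W = rng.random((m, r)) + 0.1    # initialize in the interior of the orthant
H = rng.random((r, n)) + 0.1
eps = 1e-12                     # guards against division by zero

loss = lambda: np.linalg.norm(M - W @ H) ** 2
losses = [loss()]
for _ in range(200):
    # Lee-Seung multiplicative updates: pointwise scaling by nonnegative
    # ratios, the orthant specialization of the SCMU update rule.
    H *= (W.T @ M) / (W.T @ W @ H + eps)
    W *= (M @ H.T) / (W @ H @ H.T + eps)
    losses.append(loss())

# The squared loss is nonincreasing and the factors stay nonnegative.
assert all(a >= b - 1e-9 for a, b in zip(losses, losses[1:]))
assert np.all(W >= 0) and np.all(H >= 0)
```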
Next, we specialize the SCMU algorithm for cone factorizations with PSD factors (with real entries). Letting $\mathcal{S}^d$ denote the space of $d \times d$ real symmetric matrices, the (real) PSD cone $\mathcal{S}^d_+$ is the cone of squares of the EJA $(\mathcal{S}^d, \circ)$, where the Jordan product is given by $X \circ Y = (XY + YX)/2$. In this setting, the Lyapunov operator and the quadratic representation are superoperators (i.e., linear operators acting on a vector space of linear operators), and are concretely given by
$L_X \, \mathrm{vec}(Y) = \tfrac{1}{2} (X \otimes I + I \otimes X) \, \mathrm{vec}(Y), \qquad Q_X \, \mathrm{vec}(Y) = (X \otimes X) \, \mathrm{vec}(Y),$
where $\mathrm{vec}(\cdot)$ is the vectorization operator. Using that $(X \otimes X) \, \mathrm{vec}(Y) = \mathrm{vec}(XYX)$, we get
$Q_X(Y) = XYX.$ (26)
Moreover, the trace is just the usual trace of a symmetric matrix (i.e., the sum of its eigenvalues). Lastly, using (26), the metric geometric mean (11) specializes to
$A \,\#\, B = A^{1/2} \left( A^{-1/2} B A^{-1/2} \right)^{1/2} A^{1/2},$
the usual geometric mean of two positive definite matrices; see, e.g., [3]. Putting everything together, we have that $U^t = \left( \mathcal{A}^* \mathcal{A}(X^t) \right)^{-1} \#\; X^t$, and thus the SCMU algorithm in the case $\mathcal{K} = \mathcal{S}^d_+$ specializes to:
$X^{t+1} = U^t \, \mathcal{A}^*(m) \, U^t.$
This is exactly the Matrix Multiplicative Update method derived in [22].
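A compact numerical sketch of this specialization (variable names are our own; it performs alternating half-steps that update only the $A_i$ factors with the $B_j$ fixed) illustrates the congruence-scaling update and the monotone decrease of the loss:

```python
import numpy as np

rng = np.random.default_rng(5)
r, m, n = 3, 4, 5

def random_pd(k):
    G = rng.standard_normal((k, k))
    return G @ G.T + 0.5 * np.eye(k)

def sqrtm_pd(a):
    # symmetric square root of a positive definite matrix via eigendecomposition
    w, v = np.linalg.eigh(a)
    return (v * np.sqrt(w)) @ v.T

def geo_mean(a, b):
    # A # B = A^{1/2} (A^{-1/2} B A^{-1/2})^{1/2} A^{1/2}
    rh = sqrtm_pd(a)
    ri = np.linalg.inv(rh)
    return rh @ sqrtm_pd(ri @ b @ ri) @ rh

# A target matrix that admits an exact PSDMF of size r, plus fixed factors B_j.
A_true = [random_pd(r) for _ in range(m)]
B = [random_pd(r) for _ in range(n)]
M = np.array([[np.trace(Ai @ Bj) for Bj in B] for Ai in A_true])

A = [random_pd(r) for _ in range(m)]   # fresh PD initialization for the A_i

def loss():
    return sum((M[i, j] - np.trace(A[i] @ B[j])) ** 2
               for i in range(m) for j in range(n))

losses = [loss()]
for _ in range(25):
    for i in range(m):
        X = A[i]
        AAX = sum(np.trace(X @ Bj) * Bj for Bj in B)  # adjoint map A*A(X), PD
        U = geo_mean(np.linalg.inv(AAX), X)           # scaling point, cf. (24)
        A[i] = U @ sum(M[i, j] * B[j] for j in range(n)) @ U  # X <- Q_U A*(m_i)
    losses.append(loss())

# Each congruence-scaling update is an MM step, so the loss never increases.
assert all(b <= a + 1e-8 * (1 + abs(a)) for a, b in zip(losses, losses[1:]))
```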
4 Performance Guarantees of the SCMU Algorithm
Our first result shows that, given a symmetric cone $\mathcal{K}$, the function defined in (19) with $H = Q_u^{-1}$ does indeed parameterize a family of auxiliary functions for $f$ over $\mathcal{K}$. To do so, we need to verify that the two properties given in (17) hold. We obviously have that $g(x^t \mid x^t) = f(x^t)$, and hence it remains to verify the domination property, namely that
$g(x \mid x^t) \geq f(x)$ for all $x \in \mathcal{K}$. (27)
In fact, we show in the next theorem that this bound holds for all $x \in \mathcal{V}$.
Theorem 4.1.
Let $b_1, \ldots, b_n \in \mathcal{K}$ and define $\mathcal{A} : \mathcal{V} \to \mathbb{R}^n$ to be the linear map
$\mathcal{A}(x) = \left( \langle x, b_1 \rangle, \ldots, \langle x, b_n \rangle \right),$
where $\mathcal{K}$ is the cone of squares of the EJA $(\mathcal{V}, \circ)$. Furthermore, let $z \in \operatorname{int} \mathcal{K}$ with $\mathcal{A}^* \mathcal{A}(z) \in \operatorname{int} \mathcal{K}$, and set $u = \left( \mathcal{A}^* \mathcal{A}(z) \right)^{-1} \#\; z$.
Then, we have that
$Q_u^{-1} \succeq \mathcal{A}^* \mathcal{A}$. (28)
In particular, this immediately implies that
$g(x \mid z) \geq f(x)$ for all $x \in \mathcal{V}$. (29)
Proof.
We first focus on simple EJAs. We need to show the validity of the generalized inequality (28) with respect to the Euclidean inner product. Nevertheless, as for simple EJAs any associative inner product is a positive multiple of the canonical one, it suffices to prove (28) for the canonical trace inner product.
The proof of (28) is broken down into two steps. First, we show that it suffices to consider the special case where $z = e$. More precisely, we first prove the inequality
$Q_{\left( \mathcal{A}^* \mathcal{A}(e) \right)^{1/2}} \succeq \mathcal{A}^* \mathcal{A}$ (30)
for all $b_1, \ldots, b_n \in \mathcal{K}$ such that $\mathcal{A}^* \mathcal{A}(e) \in \operatorname{int} \mathcal{K}$. Assuming that (30) holds, we let $\tilde{b}_j := Q_{z^{1/2}} b_j \in \mathcal{K}$, and let $\tilde{\mathcal{A}}$ be the corresponding linear map. Then
$\tilde{\mathcal{A}} = \mathcal{A} \, Q_{z^{1/2}}, \qquad \tilde{\mathcal{A}}^* \tilde{\mathcal{A}} = Q_{z^{1/2}} \, \mathcal{A}^* \mathcal{A} \, Q_{z^{1/2}}, \qquad \tilde{\mathcal{A}}^* \tilde{\mathcal{A}}(e) = Q_{z^{1/2}} \, \mathcal{A}^* \mathcal{A}(z).$
Thus, by (11) we have $u^{-1} = z^{-1} \#\, \mathcal{A}^* \mathcal{A}(z) = Q_{z^{-1/2}} \left( \tilde{\mathcal{A}}^* \tilde{\mathcal{A}}(e) \right)^{1/2}$.
By (30) we have $Q_{\left( \tilde{\mathcal{A}}^* \tilde{\mathcal{A}}(e) \right)^{1/2}} \succeq \tilde{\mathcal{A}}^* \tilde{\mathcal{A}}$. Consequently, we have
$Q_u^{-1} = Q_{u^{-1}} = Q_{z^{-1/2}} \, Q_{\left( \tilde{\mathcal{A}}^* \tilde{\mathcal{A}}(e) \right)^{1/2}} \, Q_{z^{-1/2}} \succeq Q_{z^{-1/2}} \, \tilde{\mathcal{A}}^* \tilde{\mathcal{A}} \, Q_{z^{-1/2}} = \mathcal{A}^* \mathcal{A},$
where in the second equality we used (4). To conclude the proof, we prove (30) by showing the following two properties in Lemma 4.2 and Lemma 4.3, respectively:
(31)  
(32) 
Assuming the validity of (31) and (32), it is now easy to conclude that (30) holds. By the definition of the map $\mathcal{A}$ and (14), we have
Consequently, (30) is equivalent to
(33)
Lastly, the proof of (33) follows from the following chain of inequalities:
where for the first inequality we use (32), for the second equality we use property (3) of the quadratic representation, and for the last inequality we use (31).
Lastly, we consider the case where the EJA is a direct sum of simple ones. In this case, the cone of squares is a direct sum of simple symmetric cones, i.e., $\mathcal{K} = \mathcal{K}_1 \oplus \cdots \oplus \mathcal{K}_k$, and the Jordan product acts blockwise on the summands. Then, it is easy to check that the operators $Q_u$ and $\mathcal{A}^* \mathcal{A}$ are separable with respect to the blocks corresponding to these individual simple EJAs. Thus, if the operator inequality holds for each individual block, it holds for the full-sized operators. ∎
Next we proceed with the proof of (31).
Lemma 4.2.
For any we have that
Proof.
We will show that
For any EJA the function
is jointly concave for fixed parameters [8, Theorem 3.1]. This is the extension of Lieb's Concavity Theorem to the more general setting of EJAs. Also, by [8, Lemma 3.1] we have
and thus, it follows that for any fixed the mapping
is concave; i.e., for any and we have
Setting and , and using that we get
which is exactly (31). ∎
Next we proceed with the proof of (32).
Lemma 4.3.
For any we have that
Proof.
For this, note that: