 # Eigendecomposition of Q in Equally Constrained Quadratic Programming

When eigenvalue decomposition is applied to the quadratic-term matrix of a class of linear equality constrained quadratic programs (EQP), there exists a linear mapping that projects optimal solutions between the new EQP formulation, in which Q is diagonalized, and the original formulation. Although the mapping requires a particular type of equality constraint, it generalizes to some real problems, such as the efficient frontier for portfolio allocation and classification with Least Squares Support Vector Machines (LSSVM). The established mapping could be useful for exploring optimal solutions in a subspace, although how to do so is not yet clear to the author. This work was inspired by a similar result proved for the unconstrained formulation in [1], but the current proof is much improved and generalized. To the author's knowledge, very few similar discussions appear in the literature.

## 1 Problem Formulation

Consider the equality constrained quadratic problem (EQP)

$$\underset{x}{\text{minimize}} \quad \tfrac{1}{2} x^T Q x + \lambda c^T x \tag{1}$$

$$\text{subject to} \quad Ax = b \tag{2}$$

where $Q$ is a symmetric positive semidefinite $N \times N$ matrix, $A$ is an $M \times N$ matrix ($M \le N$) that we assume has full rank $M$, $c$ is a vector in $\mathbb{R}^N$, $b$ is a vector in $\mathbb{R}^M$, and $\lambda$ is a scalar. We also assume that the optimal value is finite.

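The KKT conditions for the solution of the QP (1) and (2) give rise to the following linear system

$$\begin{bmatrix} Q & A^T \\ A & 0 \end{bmatrix} \begin{bmatrix} x^* \\ \beta^* \end{bmatrix} = \begin{bmatrix} -\lambda c \\ b \end{bmatrix} \tag{3}$$

where $\beta^*$ is the associated Lagrange multiplier.

We can rewrite (3) as:

$$\begin{bmatrix} x^* \\ \beta^* \end{bmatrix} = \begin{bmatrix} Q & A^T \\ A & 0 \end{bmatrix}^{-1} \begin{bmatrix} -\lambda c \\ b \end{bmatrix} \tag{4}$$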

According to the general form of block matrix inverse [2, 3] :

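$$M = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \tag{5}$$

$$M^{-1} = \begin{bmatrix} (A - BD^{-1}C)^{-1} & -A^{-1}B(D - CA^{-1}B)^{-1} \\ -D^{-1}C(A - BD^{-1}C)^{-1} & (D - CA^{-1}B)^{-1} \end{bmatrix} \tag{6}$$

$$M^{-1} = \begin{bmatrix} (A - BD^{-1}C)^{-1} & -(A - BD^{-1}C)^{-1}BD^{-1} \\ -(D - CA^{-1}B)^{-1}CA^{-1} & (D - CA^{-1}B)^{-1} \end{bmatrix} \tag{7}$$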
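The two block-inverse forms (6) and (7) can be checked numerically. The following is a minimal sketch, assuming NumPy and randomly generated blocks; the block sizes and the diagonal shift that keeps the blocks invertible are illustrative choices, not part of the original text.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 4, 2  # hypothetical block sizes

# Random blocks; A and D are shifted by a multiple of the identity so that
# they (and, in practice, both Schur complements) are comfortably invertible.
A = rng.standard_normal((n, n)) + 5 * np.eye(n)
B = rng.standard_normal((n, m))
C = rng.standard_normal((m, n))
D = rng.standard_normal((m, m)) + 5 * np.eye(m)

M = np.block([[A, B], [C, D]])
inv = np.linalg.inv

S_A = inv(A - B @ inv(D) @ C)   # (A - B D^{-1} C)^{-1}
S_D = inv(D - C @ inv(A) @ B)   # (D - C A^{-1} B)^{-1}

# Form (6) and form (7) differ only in how the off-diagonal blocks are written.
M_inv_6 = np.block([[S_A, -inv(A) @ B @ S_D],
                    [-inv(D) @ C @ S_A, S_D]])
M_inv_7 = np.block([[S_A, -S_A @ B @ inv(D)],
                    [-S_D @ C @ inv(A), S_D]])

err_6 = np.abs(M_inv_6 - inv(M)).max()
err_7 = np.abs(M_inv_7 - inv(M)).max()
print(err_6, err_7)
```

Both errors should be at the level of floating-point round-off, which confirms that the two forms describe the same inverse.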

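In particular, when $D = 0$ and both $A$ and $CA^{-1}B$ are square and invertible, we have:

$$M^{-1} = \begin{bmatrix} A^{-1} - A^{-1}B(CA^{-1}B)^{-1}CA^{-1} & A^{-1}B(CA^{-1}B)^{-1} \\ (CA^{-1}B)^{-1}CA^{-1} & -(CA^{-1}B)^{-1} \end{bmatrix} \tag{8}$$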

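The block matrices in (8) can be replaced by the EQP parameters in (4), with the block names on the left-hand side and the EQP parameters on the right-hand side:

$$A = Q, \quad B = A^T, \quad C = A \tag{9}$$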

so (4) can be written as

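$$\begin{aligned}
\begin{bmatrix} x^* \\ \beta^* \end{bmatrix}
&= \begin{bmatrix} Q^{\dagger} - Q^{\dagger}A^T(AQ^{\dagger}A^T)^{-1}AQ^{\dagger} & Q^{\dagger}A^T(AQ^{\dagger}A^T)^{-1} \\ (AQ^{\dagger}A^T)^{-1}AQ^{\dagger} & -(AQ^{\dagger}A^T)^{-1} \end{bmatrix} \begin{bmatrix} -\lambda c \\ b \end{bmatrix} \\
&= \begin{bmatrix} -\lambda\left(Q^{\dagger} - Q^{\dagger}A^T(AQ^{\dagger}A^T)^{-1}AQ^{\dagger}\right)c + Q^{\dagger}A^T(AQ^{\dagger}A^T)^{-1}b \\ -\lambda(AQ^{\dagger}A^T)^{-1}AQ^{\dagger}c - (AQ^{\dagger}A^T)^{-1}b \end{bmatrix}
\end{aligned} \tag{10}$$

where $Q^{\dagger}$ is the pseudo-inverse of $Q$. When $Q$ is positive definite, $Q^{\dagger} = Q^{-1}$.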

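So we have an analytical solution for the optimal $x$:

$$x^* = -\lambda\left(Q^{\dagger} - Q^{\dagger}A^T(AQ^{\dagger}A^T)^{-1}AQ^{\dagger}\right)c + Q^{\dagger}A^T(AQ^{\dagger}A^T)^{-1}b \tag{11}$$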
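As a sanity check, the closed form (11) can be compared with a direct solve of the KKT system (3). The sketch below assumes NumPy and a randomly generated, positive definite $Q$ (so that the pseudo-inverse coincides with the ordinary inverse); the sizes and the value of $\lambda$ are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
N, M = 6, 2          # hypothetical problem sizes
lam = 0.5            # scalar lambda in (1)

X = rng.standard_normal((N, 3 * N))
Q = X @ X.T / (3 * N)            # symmetric positive definite (almost surely)
A = rng.standard_normal((M, N))  # full row rank (almost surely)
c = rng.standard_normal(N)
b = rng.standard_normal(M)

# Direct solve of the KKT system (3).
K = np.block([[Q, A.T], [A, np.zeros((M, M))]])
sol = np.linalg.solve(K, np.concatenate([-lam * c, b]))
x_kkt = sol[:N]

# Closed form (11); Q is positive definite here, so pinv(Q) equals inv(Q).
Qp = np.linalg.pinv(Q)
G = np.linalg.inv(A @ Qp @ A.T)
x_closed = -lam * (Qp - Qp @ A.T @ G @ A @ Qp) @ c + Qp @ A.T @ G @ b

gap = np.abs(x_closed - x_kkt).max()              # agreement of the two solutions
feasibility = np.abs(A @ x_closed - b).max()      # constraint (2) residual
print(gap, feasibility)
```

Solving the KKT system directly is usually preferable numerically; the closed form is mainly useful for the algebraic manipulations that follow.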

## 2 Diagonalization of Q using Eigendecomposition

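Consider the eigenvalue decomposition of $Q$:

$$\underset{(N\times N)}{Q} = \underset{(N\times N)}{U}\;\underset{(N\times N)}{\Sigma}\;\underset{(N\times N)}{U^T} \tag{12}$$

where $U$ is the square matrix whose $i$-th column is the $i$-th eigenvector of $Q$, and $\Sigma$ is the diagonal matrix of eigenvalues

$$\Sigma = \operatorname{diag}(\sigma_1, \dots, \sigma_r, 0, \dots, 0) \tag{13}$$

The pseudo-inverse of $Q$, $Q^{\dagger}$, can be defined as

$$\underset{(N\times N)}{Q^{\dagger}} = \underset{(N\times N)}{U}\;\underset{(N\times N)}{\Sigma^{\dagger}}\;\underset{(N\times N)}{U^T} \tag{14}$$

where

$$\Sigma^{\dagger} = \operatorname{diag}(\sigma_1^{-1}, \dots, \sigma_r^{-1}, 0, \dots, 0) \tag{15}$$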

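As a special case, assume that $A$ reduces to a row vector $a$ in $\mathbb{R}^N$. We further construct a diagonal matrix $D$ as

$$D = \operatorname{diag}(d_1, \dots, d_N) \tag{16}$$

where

$$[d_1, \dots, d_N] = a\,(aU)^{-1} \tag{17}$$

$$a = [a_1, \dots, a_N], \quad a_r \in \{-1, +1\} \tag{18}$$

Here $(aU)^{-1}$ is read elementwise, that is, $d_i = a_i / (aU)_i$, which requires every entry of $aU$ to be nonzero.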

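Then, it is easy to prove that

$$aD^{-1} = aU \tag{19}$$

$$D^{-1}a^T = U^Ta^T \tag{20}$$

Assume $Q$ can be decomposed as $Q = XX^T$. We then transform $X$, $c$ and $Q$ by left-multiplying by $DU^T$:

$$\tilde{X} = DU^TX \tag{21}$$

$$\tilde{c} = DU^Tc \tag{22}$$

$$\tilde{Q} = DU^TQUD^T = DU^TU\Sigma U^TUD^T = D^2\Sigma \tag{23}$$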

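We have thus transformed the problem in (1) and (2) into a new form

$$\underset{\tilde{x}}{\text{minimize}} \quad \tfrac{1}{2}\tilde{x}^T\tilde{Q}\tilde{x} + \lambda\tilde{c}^T\tilde{x} \tag{24}$$

$$\text{subject to} \quad a^T\tilde{x} = b \tag{25}$$

where $\tilde{Q}$ is now a diagonal matrix.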
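The construction of $D$ and the resulting diagonal quadratic term can be illustrated numerically. This is a minimal sketch, assuming NumPy, a random $Q = XX^T$, and the elementwise reading $d_i = a_i/(aU)_i$ of (17); it checks the identities $aD^{-1} = aU$, $D^{-1}a^T = U^Ta^T$ and $\tilde{Q} = D^2\Sigma$.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 5                                # hypothetical dimension
X = rng.standard_normal((N, 4 * N))
Q = X @ X.T                          # Q = X X^T, symmetric positive definite here

sigma, U = np.linalg.eigh(Q)         # Q = U diag(sigma) U^T
a = rng.choice([-1.0, 1.0], size=N)  # constraint vector with entries in {-1, +1}

aU = a @ U
# (17) read elementwise: d_i = a_i / (aU)_i; every (aU)_i must be nonzero.
assert np.all(np.abs(aU) > 1e-12)
d = a / aU
D = np.diag(d)
D_inv = np.diag(1.0 / d)

err_19 = np.abs(a @ D_inv - a @ U).max()     # a D^{-1} = a U
err_20 = np.abs(D_inv @ a - U.T @ a).max()   # D^{-1} a^T = U^T a^T

# (23): the transformed quadratic term is the diagonal matrix D^2 Sigma.
Q_tilde = D @ U.T @ Q @ U @ D.T
ref = D @ D @ np.diag(sigma)
err_23 = np.abs(Q_tilde - ref).max() / np.abs(ref).max()   # relative error
print(err_19, err_20, err_23)
```

If an entry of $aU$ is very close to zero, the corresponding $d_i$ becomes large and the diagonalized problem can be badly scaled, so the guard in the code is worth keeping.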

###### Theorem 1.

Let $x^*$ be the solution of the problem defined in (1) and (2), where $A$ reduces to the vector $a$ defined in (18). Let $\tilde{x}^*$ be the solution of the problem defined in (24) and (25). Then there exists a linear transformation that maps $\tilde{x}^*$ back to $x^*$.

###### Proof.

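According to (11), the problem (24) and (25) analogously has the analytical solution

$$\tilde{x}^* = -\lambda\left(\tilde{Q}^{\dagger} - \tilde{Q}^{\dagger}a^T(a\tilde{Q}^{\dagger}a^T)^{-1}a\tilde{Q}^{\dagger}\right)\tilde{c} + \tilde{Q}^{\dagger}a^T(a\tilde{Q}^{\dagger}a^T)^{-1}b \tag{26}$$

From (23) and (15), we have

$$\tilde{Q}^{\dagger} = D^{-1}\Sigma^{\dagger}D^{-1} \tag{27}$$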

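Reusing the results in (19) and (20), we obtain the invariance property

$$a\tilde{Q}^{\dagger}a^T = aD^{-1}\Sigma^{\dagger}D^{-1}a^T = aU\Sigma^{\dagger}U^Ta^T = aQ^{\dagger}a^T \tag{28}$$

Applying (19), (20) and (28) in (26), together with $\tilde{c} = DU^Tc$ from (22) and $\tilde{Q}^{\dagger}$ from (27), we have

$$\tilde{x}^* = -\lambda\left(D^{-1}\Sigma^{\dagger}U^T - D^{-1}\Sigma^{\dagger}U^Ta^T(aQ^{\dagger}a^T)^{-1}aQ^{\dagger}\right)c + D^{-1}\Sigma^{\dagger}U^Ta^T(aQ^{\dagger}a^T)^{-1}b \tag{29}$$

Left-multiplying (29) by $UD$ and using $Q^{\dagger} = U\Sigma^{\dagger}U^T$ from (14), we have

$$UD\tilde{x}^* = -\lambda\left(Q^{\dagger} - Q^{\dagger}a^T(aQ^{\dagger}a^T)^{-1}aQ^{\dagger}\right)c + Q^{\dagger}a^T(aQ^{\dagger}a^T)^{-1}b = x^* \tag{30}$$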

Interestingly, $x^*$ and $\tilde{x}^*$ are both optimal solutions, and the transformations in (21) to (23) diagonalize the quadratic matrix $Q$. The constraint vector $a$ and the scalars $\lambda$ and $b$ are invariant. Because the EQP is solved by a linear system, the optimal solutions of the different formulations can be mapped onto each other by a linear transformation.
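Theorem 1 can be checked numerically by solving both formulations and mapping the diagonalized solution back with $UD$. The following is a sketch under stated assumptions: NumPy, a random positive definite $Q$, a random $\pm 1$ constraint vector, and a hypothetical helper `solve_eqp` that solves the KKT system (3) for a single equality constraint.

```python
import numpy as np

def solve_eqp(Q, a, c, b, lam):
    """Solve min 0.5 x^T Q x + lam c^T x  s.t.  a x = b via the KKT system (3)."""
    n = Q.shape[0]
    K = np.block([[Q, a.reshape(n, 1)],
                  [a.reshape(1, n), np.zeros((1, 1))]])
    rhs = np.concatenate([-lam * c, [b]])
    return np.linalg.solve(K, rhs)[:n]

rng = np.random.default_rng(3)
N, lam, b = 6, 0.7, 1.0              # hypothetical size and scalars
X = rng.standard_normal((N, 4 * N))
Q = X @ X.T / (4 * N)
c = rng.standard_normal(N)
a = rng.choice([-1.0, 1.0], size=N)

sigma, U = np.linalg.eigh(Q)
d = a / (a @ U)                      # diagonal of D from (17), elementwise division
D = np.diag(d)

Q_t = D @ D @ np.diag(sigma)         # (23): diagonal quadratic term
c_t = D @ U.T @ c                    # (22)

x_star = solve_eqp(Q, a, c, b, lam)        # original problem (1)-(2)
x_tilde = solve_eqp(Q_t, a, c_t, b, lam)   # diagonalized problem (24)-(25)
x_back = U @ D @ x_tilde                   # linear map back to the original space

rel_err = np.linalg.norm(x_back - x_star) / np.linalg.norm(x_star)
print(rel_err)
```

The relative error should be near machine precision; the same constraint vector `a` and the same scalars `lam` and `b` are passed to both solves, which mirrors the invariance noted above.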

## 3 Example 1: Efficient Frontier of Mutual Fund Portfolio

### 3.1 Comparing Efficient Frontiers (EFs) solved from different formulations

In modern portfolio theory, the efficient frontier is an investment portfolio which occupies the efficient parts of the risk-return spectrum. One of the popular forms of portfolio theory is to solve

$$\underset{x}{\text{minimize}} \quad \tfrac{1}{2}x^TQx - \lambda c^Tx \tag{31}$$

$$\text{subject to} \quad e^Tx = 1 \tag{32}$$

where $Q$ is the variance-covariance matrix, $c$ is the expected return, $\lambda$ is the risk aversion factor, $e$ is the all-ones vector, and $x$ is the portfolio allocation. The covariance matrix $Q$ is usually generated from a return matrix $X$ as $Q = XX^T$, and $c$ is the expected profit vector, the average of $X$; therefore the projections in (22) and (23) are valid here. Theorem 1 can be applied directly; the only changes are to include the negative sign of $\lambda$ and to replace $a$ with $e$.

Figure 1 shows four different efficient frontiers calculated using real portfolio data from the mutual fund LMPRX. The covariance matrix and expected return are calculated from twenty years of historical stock prices. First, the monthly profit is calculated as the end-of-month market closing price minus the beginning-of-month market opening price, divided by that opening price. Then, the covariance matrix is calculated on the time series of 240 monthly profit ratios, and the expected return vector is the mean of the profit ratios. In total, 99 assets (stocks) are included in the portfolio. The efficient frontiers are obtained by evaluating 1000 values of $\lambda$ evenly sampled from 0.1 to 100. Values on the x axis are risk and values on the y axis are return, both computed from the optimal allocation.
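The data preparation described above can be sketched as follows. The prices here are synthetic placeholders (the real LMPRX holdings data is not reproduced), and the array names `open_px` and `close_px` are hypothetical; only the profit-ratio formula, the covariance, and the mean follow the text.

```python
import numpy as np

rng = np.random.default_rng(7)
n_months, n_assets = 240, 5    # 20 years of months; the asset count is hypothetical

# Synthetic month-opening and month-end closing prices (placeholders for real data).
open_px = 50 + 10 * rng.random((n_months, n_assets))
close_px = open_px * (1 + 0.05 * rng.standard_normal((n_months, n_assets)))

# Monthly profit ratio: (closing price - opening price) / opening price.
profit = (close_px - open_px) / open_px

Q = np.cov(profit, rowvar=False)   # covariance matrix over assets
c = profit.mean(axis=0)            # expected return per asset
print(Q.shape, c.shape)
```

With 240 observations and far fewer assets, the sample covariance is positive definite in practice, which matches the setting used in the sections above.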

The blue curve represents the EF solved in the original space, without diagonalization of $Q$.

The red curve shows values calculated directly from the solutions $\tilde{x}^*$ of the formulation based on the diagonalized $\tilde{Q}$, so risk and return are evaluated with $\tilde{x}^*$, $\tilde{Q}$ and $\tilde{c}$. As shown, the EF is identical to the one in the original space, indicating that risk and return are invariant under diagonalization.

The black curve is the projection of $\tilde{x}^*$ to $x^*$ by left-multiplying by $UD$, so risk and return are evaluated with $UD\tilde{x}^*$, $Q$ and $c$. This can be seen as reconstructing the original solution from the diagonalized solution.

The green curve is a projection of $\tilde{x}^*$ obtained by left-multiplying by $U$ only. Because $D$ can be viewed as a rescaling matrix, $U$ plays the role of a rotation. Risk and return are here evaluated with $U\tilde{x}^*$, $Q$ and $c$.

The EFs of the blue, red and black curves are identical. The EF of the green curve has a very similar shape but a different scale compared to the original curve.
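The comparison of the blue, red and black curves can be reproduced on synthetic data. The sketch below assumes NumPy, random monthly returns in place of the LMPRX data, a coarser grid of $\lambda$ values, and risk measured as $x^TQx$ and return as $c^Tx$; these measures and the helper `solve_eqp` are assumptions made for illustration.

```python
import numpy as np

def solve_eqp(Q, a, c, lam):
    """min 0.5 x^T Q x - lam c^T x  s.t.  a x = 1, solved through its KKT system."""
    n = Q.shape[0]
    K = np.block([[Q, a.reshape(n, 1)],
                  [a.reshape(1, n), np.zeros((1, 1))]])
    return np.linalg.solve(K, np.concatenate([lam * c, [1.0]]))[:n]

rng = np.random.default_rng(4)
n_assets, n_months = 8, 240
R = 0.01 + 0.05 * rng.standard_normal((n_months, n_assets))  # synthetic returns
Q = np.cov(R, rowvar=False)      # covariance matrix
c = R.mean(axis=0)               # expected return
e = np.ones(n_assets)            # all-ones constraint vector

sigma, U = np.linalg.eigh(Q)
D = np.diag(e / (e @ U))         # (17) with a = e, elementwise division
Q_t = D @ D @ np.diag(sigma)     # diagonalized quadratic term
c_t = D @ U.T @ c

orig, diag, back = [], [], []
for lam in np.linspace(0.1, 100, 50):
    x = solve_eqp(Q, e, c, lam)          # original space ("blue")
    x_t = solve_eqp(Q_t, e, c_t, lam)    # diagonalized space ("red")
    x_b = U @ D @ x_t                    # mapped back ("black")
    orig.append((x @ Q @ x, c @ x))
    diag.append((x_t @ Q_t @ x_t, c_t @ x_t))
    back.append((x_b @ Q @ x_b, c @ x_b))
orig, diag, back = map(np.array, (orig, diag, back))

print(np.abs(orig - diag).max(), np.abs(orig - back).max())
```

Each array holds one (risk, return) pair per value of $\lambda$; the three arrays should agree up to round-off, which is the invariance shown by the identical curves.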

### 3.2 Comparing Efficient Frontiers (EFs) solved in different subspaces

Figure 2: Efficient Frontiers of Holding Assets of LMPRX projected in Subspaces

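If we assume the eigenvalues obtained in (13) are sorted as $\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_N$, we can reconstruct $Q$ using the $K$ eigenvectors corresponding to the $K$ largest eigenvalues:

$$\underset{(N\times N)}{Q} \approx \underset{(N\times K)}{U_k}\;\underset{(K\times K)}{\Sigma_k}\;\underset{(K\times N)}{U_k^T} \tag{33}$$

It is easy to see that equations (14) to (23) can be rewritten in terms of $U_k$ and $\Sigma_k$, and we obtain a new formulation of the form (24) and (25) where $\tilde{Q}$ is a $K \times K$ matrix and the other vectors $\tilde{x}$, $\tilde{c}$ and $a$ are in $\mathbb{R}^K$.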
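The rank-$K$ reconstruction in (33) can be sketched as follows, assuming NumPy and a random positive definite $Q$; the values of $N$ and $K$ are hypothetical. For a symmetric positive semidefinite matrix, the spectral-norm error of this truncation equals the largest discarded eigenvalue.

```python
import numpy as np

rng = np.random.default_rng(5)
N, K = 10, 4                          # hypothetical full and subspace dimensions
X = rng.standard_normal((N, 3 * N))
Q = X @ X.T / (3 * N)

sigma, U = np.linalg.eigh(Q)          # eigh returns eigenvalues in ascending order
order = np.argsort(sigma)[::-1]       # re-sort so that sigma_1 >= ... >= sigma_N
sigma, U = sigma[order], U[:, order]

U_k = U[:, :K]                        # N x K: eigenvectors of the K largest eigenvalues
Sigma_k = np.diag(sigma[:K])          # K x K
Q_k = U_k @ Sigma_k @ U_k.T           # rank-K reconstruction (33)

# Spectral-norm error; for symmetric PSD Q this equals sigma_{K+1}.
spectral_err = np.linalg.norm(Q - Q_k, 2)
print(spectral_err, sigma[K])
```

The discarded eigenvalues quantify how much of $Q$ the subspace formulation ignores, which gives some intuition for why small $K$ changes the resulting frontier.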

Experiments are carried out to see how the size of $K$ affects the resulting EFs. In Figure 2, the left panel shows the EF solved in the original asset space, where the risk aversion factor $\lambda$ is evenly sampled 1000 times from 0.1 to 100.

In the middle panel of Figure 2, $\tilde{x}^*$ is solved using 10 different values of $K$ evenly sampled from 5 to 95. When $K$ is 99, the result equals the diagonalized solution (red curve) shown in Figure 1. Risk and return values in the middle panel are calculated using the solutions of the diagonalized formulation.

The right panel of Figure 2 shows risk and return values calculated using the projected (reconstructed) solutions.

As shown, approximated subspace solutions lead to large differences in the final optimal solutions. The smaller the dimension of the selected subspace, the larger the difference between the results (compare the colored curves with the black curve of the original solution). When solving in a subspace, the risk and return values are still identical between results calculated in the subspace and those recovered in the original space (the middle and right panels have the same color-gradient curves).

## 4 Example 2: Decision Boundary of Least Square Support Vector Machines Classifier

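We show that the mappings can be generalized to other problems whose solutions are given by a linear system. As is well known [4], the dual problem of LSSVM is a linear system:

$$\begin{bmatrix} c \\ \alpha \end{bmatrix} = \begin{bmatrix} 0 & y^T \\ y & \Omega + I/\gamma \end{bmatrix}^{-1} \begin{bmatrix} 0 \\ e \end{bmatrix} \tag{34}$$

where $y$ is the binary label vector composed of -1 and +1 values, $\Omega$ is the kernel matrix, $\gamma$ is the regularization parameter, $e$ is the all-ones vector, $c$ is the bias term, and $\alpha$ is the vector of dual variables. The classifier in the dual space takes the form

$$y(x) = \operatorname{sign}\left[\sum_{k=1}^{N}\alpha_k y_k K(x, x_k) + c\right] \tag{35}$$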

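The variables in (4), written on the left-hand side below, can be substituted by the variables in (34), written on the right-hand side:

$$\begin{aligned}
A &= y && (36) \\
Q &= \Omega + I/\gamma && (37) \\
\lambda &= 1 && (38) \\
c &= e && (39) \\
b &= 0 && (40) \\
x &= \alpha && (41)
\end{aligned}$$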

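Denoting $\Omega_r = \Omega + I/\gamma$, which is now symmetric positive definite, we have a closed-form solution for the optimal $\alpha$:

$$\alpha^* = -\left(\Omega_r^{-1} - \Omega_r^{-1}y^T(y\Omega_r^{-1}y^T)^{-1}y\Omega_r^{-1}\right)e \tag{42}$$

Decomposing $\Omega_r$ by eigenvalue decomposition as $\Omega_r = U\Sigma U^T$ and constructing $D$ as in (16) to (18) with $a = y$, it is easy to prove that

$$yD^{-1} = yU \tag{43}$$

$$D^{-1}y^T = U^Ty^T \tag{44}$$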

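Unlike the previous example, where the projections need to be interpretable, here $\Omega_r$ and $e$ are independent, so we left-multiply $e$ by $DU^T$ and replace $\Omega_r$ with its eigenvalue decomposition:

$$\tilde{e} = DU^Te \tag{45}$$

$$\tilde{\Omega}_r = DU^T\Omega_rUD^T = DU^TU\Sigma U^TUD^T = D^2\Sigma \tag{46}$$

Following the proof in Section 2, it is easy to see that

$$\begin{aligned}
UD\tilde{\alpha}^* &= -\left[U\Sigma^{-1}U^T - U\Sigma^{-1}D^{-1}y^T(y\tilde{\Omega}_r^{-1}y^T)^{-1}yD^{-1}\Sigma^{-1}U^T\right]e \\
&= -\left[\Omega_r^{-1} - \Omega_r^{-1}y^T(y\Omega_r^{-1}y^T)^{-1}y\Omega_r^{-1}\right]e \\
&= \alpha^*
\end{aligned} \tag{47}$$

Figure 3: Decision boundaries of LSSVM solved in original formulation and diagonalized formulation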

The mappings established for the LSSVM binary classifier are validated on a toy problem by visualizing predicted labels and decision boundaries. In Figure 3, the first column shows the original toy data and the image plot of the RBF kernel. The second column compares the predictions of the two formulations: the top one shows predictions from (35) using dual variables solved in the original formulation, and the bottom one is based on the diagonalized formulation. The third column compares decision boundaries (top original vs. bottom diagonalized). It is easy to see that the solutions are identical and that, after the projection (47), the solutions of $\alpha$ are both sparse.
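A small numerical version of this validation can be sketched as follows. It solves the linear system (34) directly in both the original and the diagonalized form and maps the dual variables back with $UD$. The toy data, the RBF width, the value of $\gamma$, the helper `solve_lssvm`, and the choice $\Omega_{kl} = y_k y_l K(x_k, x_l)$ (following the usual LSSVM classifier formulation in [4]) are assumptions made for illustration.

```python
import numpy as np

def solve_lssvm(Omega_r, y, rhs):
    """Solve [[0, y], [y^T, Omega_r]] [c; alpha] = [0; rhs]; return (c, alpha)."""
    n = y.size
    K = np.block([[np.zeros((1, 1)), y.reshape(1, n)],
                  [y.reshape(n, 1), Omega_r]])
    sol = np.linalg.solve(K, np.concatenate([[0.0], rhs]))
    return sol[0], sol[1:]

rng = np.random.default_rng(6)
n, gamma, width = 12, 10.0, 1.0      # hypothetical sample size and hyperparameters

# Two Gaussian blobs with labels -1 and +1 (toy data).
pts = np.vstack([rng.normal(-1.0, 0.7, (n // 2, 2)),
                 rng.normal(1.0, 0.7, (n // 2, 2))])
y = np.concatenate([-np.ones(n // 2), np.ones(n // 2)])

# RBF kernel and Omega_r = Omega + I / gamma (symmetric positive definite).
sq = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1)
Kmat = np.exp(-sq / (2 * width ** 2))
Omega_r = np.outer(y, y) * Kmat + np.eye(n) / gamma

sigma, U = np.linalg.eigh(Omega_r)
D = np.diag(y / (y @ U))             # (16)-(18) with a = y, elementwise division
Omega_t = D @ D @ np.diag(sigma)     # (46): diagonal system matrix
e_t = D @ U.T @ np.ones(n)           # (45)

c0, alpha = solve_lssvm(Omega_r, y, np.ones(n))   # original formulation
c1, alpha_t = solve_lssvm(Omega_t, y, e_t)        # diagonalized formulation
alpha_back = U @ D @ alpha_t                      # projection as in (47)

rel_err = np.linalg.norm(alpha_back - alpha) / np.linalg.norm(alpha)
print(rel_err, abs(c0 - c1))
```

Both the mapped dual variables and the bias term should agree between the two formulations up to round-off, so the classifier (35) produces the same predictions.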

In this case, we cannot easily interpret solutions directly in the diagonalized space or explore subspace solutions, mainly because $y$ corresponds to the labels of the original data points. It might be possible to project the out-of-sample kernel matrix, which is used during prediction, onto the eigenspace of the kernel obtained from the training data. Such an approach needs more effort and development. In the previous portfolio allocation problem, we consider diagonalization as an exploration of the underlying factors of stock volatility, and the constraint vector $e$ indicates that, in both the original asset space and the factor space, the portfolio allocations should sum to 1.

## 5 Conclusions

This report investigates an interesting mapping that arises when eigenvalue decomposition is applied to the quadratic-term matrix in a special form of equality constrained QP problems. The current results hold only under very special conditions. First, $A$ in (2) can only be a single vector, and its elements must be -1 or +1. That allows the construction of the diagonal matrix $D$ in (16) to satisfy (19) and (20). Further generalizations may be possible but need further development. Because of the same -1 and +1 constraint, the LSSVM formulation is valid only for classification, where the label vector $y$ contains -1 and +1 values as class labels. A similar extension to regression problems still needs development. The key effort is to construct the scaling vector in (17) to reconstruct the original solution. Second, the mapping is only valid under equality constraints because it relies on the underlying linear system solution.

The main motivation for investigating such a mapping is the efficiency and robustness of solving large-scale EQPs when $Q$ is diagonal. In modern portfolio theory, finding optimal allocations in a factor space where the factors are independent (orthogonal) leads to interesting interpretations of risk and return with respect to market volatility.

The Matlab code of all experiments in this paper is available at http://github.com/cyberyu/eigeqp

## References

[1] Ji Tan, Principal Component Analysis and Portfolio Optimization (March 1, 2012). Available at SSRN: https://ssrn.com/abstract=2213687 or http://dx.doi.org/10.2139/ssrn.2213687

[2] Tzon-Tzer Lu and Sheng-Hua Shiou, Inverses of 2 × 2 block matrices, Computers & Mathematics with Applications, Volume 43, Issues 1–2, January 2002, Pages 119–129.

[3] Stephen Boyd and Lieven Vandenberghe, Convex Optimization, Cambridge University Press, first edition, 2004.

[4] J. A. K. Suykens and J. Vandewalle, Least Squares Support Vector Machine Classifiers, Neural Processing Letters, Volume 9, Issue 3, 1999, Pages 293–300.