 # Low-rank Approximation of Linear Maps

This work provides closed-form solutions and minimal achievable errors for a large class of low-rank approximation problems in Hilbert spaces. The proposed theorem generalizes previous results, obtained in the finite-dimensional case for the Frobenius norm, to linear bounded operators and $p$-th Schatten norms. The theorem is illustrated in various settings, including low-rank approximation problems with respect to the trace norm, the 2-induced norm or the Hilbert-Schmidt norm. The theorem also provides the basis for the design of tractable algorithms for kernel-based or continuous DMD.


## 1 Introduction

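Let $U$ and $V$ be two separable Hilbert spaces of dimension $\dim(U)$ and $\dim(V)$, possibly infinite. Let $\mathcal{B}(U,V)$ denote the class of linear bounded operators from $U$ to $V$ and $\mathcal{B}_k(U,V) = \{M \in \mathcal{B}(U,V) : \operatorname{rank}(M) \le k\}$, where $\operatorname{rank}(\cdot)$ denotes the rank operator. In this work, we are interested in characterizing the solutions of the following constrained optimization problem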

$$\operatorname*{arg\,min}_{M \in \mathcal{B}_k(V,V)} \|Y - M \circ X\|, \qquad X, Y \in \mathcal{S} \subseteq \mathcal{B}(U,V), \tag{1}$$

where $\|\cdot\|$ is some operator norm and $\mathcal{S}$ some subset to be specified in due time, and where the symbol $\circ$ denotes the operator composition. Problem (1) is non-convex due to the rank constraint and is in general infinite-dimensional. This raises the question of the tractability of problems of the form of (1).

In the last decade, there has been a surge of interest in low-rank solutions of linear matrix equations [3, 6, 8, 9, 11, 14]. Problems of the form of (1) can be viewed as generalizations of some of those matrix equations to the infinite-dimensional case. In the finite-dimensional case, certain instances with a very special structure admit closed-form solutions [2, 4, 10, 12], which can be computed in polynomial time. We mention that some authors have proposed tractable but sub-optimal solutions to some particular finite-dimensional [1, 7, 16, 17] and infinite-dimensional problem instances.

In this work, we show that some infinite-dimensional problems of the form of (1) also admit a closed-form solution. The proof relies on the well-known Schmidt-Eckart-Young-Mirsky theorem. The theorem presented in this work can be viewed as a direct generalization to the infinite-dimensional case of [4, Theorem 4.1]. It also generalizes the solution of approximation problems in the sense of the $p$-th order Schatten norm, and includes the Frobenius norm as a particular case.

## 2 Problem Statement and Solution

We begin by introducing some notations, then define the low-rank approximation problem and finally provide a closed-form solution and error norm.

### 2.1 Notations

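Let $\{e^U_i\}_{i=1}^{\dim(U)}$ and $\{e^V_i\}_{i=1}^{\dim(V)}$ be any ONBs of $U$ and $V$. The inner products in those spaces will be denoted by $\langle \cdot, \cdot \rangle_U$ and $\langle \cdot, \cdot \rangle_V$ and their induced norms by $\|\cdot\|_U$ and $\|\cdot\|_V$. Let $I$ be the identity operator. Let the singular value decomposition (SVD) of $M \in \mathcal{B}(U,V)$ be

$$M = \sum_{i=1}^{m} \sigma^M_i \varphi^M_i \langle \psi^M_i, \cdot \rangle_U,$$

where $\{\varphi^M_i\}_{i=1}^{m}$ and $\{\psi^M_i\}_{i=1}^{m}$ are respectively the left and right singular functions associated to the singular values $\{\sigma^M_i\}_{i=1}^{m}$ of $M$. The pseudo-inverse of $M$, denoted $M^\dagger$, is defined as

$$M^\dagger = \sum_{i=1}^{m} (\sigma^M_i)^\dagger \psi^M_i \langle \varphi^M_i, \cdot \rangle_V, \quad \text{where} \quad (\sigma^M_i)^\dagger = \begin{cases} (\sigma^M_i)^{-1} & \text{if } \sigma^M_i > 0, \\ 0 & \text{otherwise.} \end{cases}$$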
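In finite dimension, the definition above reduces to the usual SVD-based pseudo-inverse of a matrix. The following minimal sketch illustrates it under assumptions that are ours and not part of the original text: operators are real matrices of shape (dim V, dim U), and the helper name `pinv_from_svd` and the threshold `tol` (used to decide numerically whether a singular value is positive) are hypothetical choices.

```python
import numpy as np

def pinv_from_svd(M, tol=1e-12):
    """Pseudo-inverse built from the SVD, inverting only positive singular values."""
    # M = Phi @ diag(sigma) @ Psi_t (columns of Phi: left singular vectors,
    # rows of Psi_t: right singular vectors)
    Phi, sigma, Psi_t = np.linalg.svd(M, full_matrices=False)
    # (sigma_i)^dagger = 1 / sigma_i if sigma_i > 0 (numerically: > tol), else 0
    sigma_dagger = np.zeros_like(sigma)
    positive = sigma > tol
    sigma_dagger[positive] = 1.0 / sigma[positive]
    # M^dagger = sum_i sigma_i^dagger psi_i <phi_i, .>
    return Psi_t.T @ np.diag(sigma_dagger) @ Phi.T

rng = np.random.default_rng(0)
# Rank-deficient 4 x 5 example (rank 2) so that some singular values vanish
M_example = rng.standard_normal((4, 2)) @ rng.standard_normal((2, 5))
M_dagger = pinv_from_svd(M_example)
```

The resulting matrix can be checked against the Moore-Penrose conditions, for instance `M_example @ M_dagger @ M_example` should reproduce `M_example` up to rounding.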

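The $p$-th Schatten norm, denoted by $\|\cdot\|_{S,p}$, is defined for a given $p$ and any $M \in \mathcal{B}(U,V)$ as

$$\|M\|_{S,p} = \left( \sum_{i=1}^{\dim(U)} \|M e^U_i\|_V^p \right)^{\frac{1}{p}} = \left( \sum_{i=1}^{m} (\sigma^M_i)^p \right)^{\frac{1}{p}},$$

and the $p$-th Schatten class is $\mathcal{S}_p(U,V) = \{ M \in \mathcal{B}(U,V) : \|M\|_{S,p} < \infty \}$.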
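For matrices, the singular-value form of the definition above can be evaluated directly. The sketch below is only an illustration under our own assumptions (real matrices, a hypothetical helper name `schatten_norm`, and the convention that `p = np.inf` returns the largest singular value).

```python
import numpy as np

def schatten_norm(M, p):
    """p-th Schatten norm of a matrix, computed from its singular values."""
    sigma = np.linalg.svd(M, compute_uv=False)
    if np.isinf(p):
        # Limit p -> infinity: largest singular value
        return sigma.max()
    return np.sum(sigma ** p) ** (1.0 / p)

# Diagonal example whose singular values are 3 and 4
M_example = np.diag([3.0, 4.0])
norm_1 = schatten_norm(M_example, 1)         # 3 + 4
norm_2 = schatten_norm(M_example, 2)         # sqrt(3^2 + 4^2)
norm_inf = schatten_norm(M_example, np.inf)  # max(3, 4)
```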

### 2.2 Optimization Problem

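We are now ready to clarify the definition of problem (1). Let $X$ and $Y$ belong to the $p$-th Schatten class of operators from $U$ to $V$. We are interested in the solutions of the low-rank approximation problem

$$M^\star_k \in \operatorname*{arg\,min}_{M \in \mathcal{B}_k(V,V)} \|Y - M \circ X\|_{S,p}, \tag{2}$$

where the symbol $\circ$ denotes operator composition.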

### 2.3 Closed-Form Solution

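The following theorem states our result. The proof is detailed in Section 4.

###### Theorem 2.3

Problem (2) admits the optimal solution

$$M^\star_k = P_k \circ Y \circ X^\dagger,$$

where $P_k = \sum_{i=1}^{k} \varphi^Z_i \langle \varphi^Z_i, \cdot \rangle_V$ is the orthogonal projector onto the span of the $k$ leading left singular functions of $Z = Y \circ X^\dagger \circ X$. Moreover, the square of the approximation error is

$$\|Y - M^\star_k \circ X\|^2_{S,p} = \left( \sum_{i=k+1}^{m} (\sigma^Z_i)^p \right)^{\frac{2}{p}} + \left( \sum_{i=\operatorname{rank}(X)+1}^{m} \left( \sum_{j=1}^{m} (\sigma^Y_j)^2 \langle \psi^Y_j, \psi^X_i \rangle_U^2 \right)^{\frac{p}{2}} \right)^{\frac{2}{p}}, \tag{3}$$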

where .
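To make the statement concrete, here is a finite-dimensional sketch for $p = 2$, where the Schatten norm is the Frobenius norm. It relies on assumptions of ours: operators are real matrices of shape (dim V, dim U), $Z$ is taken to be $Y \circ X^\dagger \circ X$, $P_k$ projects onto the $k$ leading left singular vectors of $Z$, and the function name `low_rank_solution` is hypothetical.

```python
import numpy as np

def low_rank_solution(X, Y, k):
    """Candidate M_k = P_k Y X^+ for min ||Y - M X||_F over rank(M) <= k."""
    X_pinv = np.linalg.pinv(X)
    # Assumed definition of Z: Y restricted to the row space of X
    Z = Y @ X_pinv @ X
    Phi_Z, sigma_Z, _ = np.linalg.svd(Z, full_matrices=False)
    # Orthogonal projector onto the k leading left singular vectors of Z
    P_k = Phi_Z[:, :k] @ Phi_Z[:, :k].T
    return P_k @ Y @ X_pinv, sigma_Z

rng = np.random.default_rng(0)
n, N, k = 4, 7, 2                      # dim V, dim U, target rank
X = rng.standard_normal((n, N))
Y = rng.standard_normal((n, N))
M_k, sigma_Z = low_rank_solution(X, Y, k)

# Squared error measured in the Frobenius (Hilbert-Schmidt) norm
err_sq = np.linalg.norm(Y - M_k @ X, "fro") ** 2
# Two contributions for p = 2: discarded singular values of Z,
# and the part of Y acting outside the row space of X
tail = np.sum(sigma_Z[k:] ** 2)
residual = np.linalg.norm(Y @ (np.eye(N) - np.linalg.pinv(X) @ X), "fro") ** 2
```

In this setting `err_sq` should agree with `tail + residual` up to rounding, and no randomly drawn rank-$k$ matrix should achieve a smaller error.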

###### Remark 1 (Modified p-th Schatten Norm)

The result can be extended to an approximation in the sense of the modified $p$-th Schatten norm. In particular, for and for , this extension can be seen as the DMD counterpart to the POD problem with energy inner product presented in [13, Proposition 6.2]. Let us define this modified norm. We first need to introduce an additional norm for $V$ induced by an alternative inner product. For any $v_1, v_2 \in V$, we define

$$\langle v_1, v_2 \rangle_{V_K} = \langle v_1, K \circ v_2 \rangle_V,$$

where is compact and self-adjoint, i.e., . Since is self-adjoint, the SVD guarantees that can be decomposed as where , and that . The modified -th Schatten norm is then defined for any and as

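$$\|M\|_{S,K,p} = \left( \sum_{i=1}^{\dim(U)} \|K^{\frac{1}{2}} \circ M e^U_i\|_V^p \right)^{\frac{1}{p}},$$

which can be rewritten as

$$\|M\|_{S,K,p} = \left( \sum_{i=1}^{m} \left( \sum_{j=1}^{\dim(V)} \sigma^K_j (\sigma^M_i)^2 |\langle \varphi^M_i, \varphi^K_j \rangle_V|^2 \right)^{p/2} \right)^{\frac{1}{p}}.$$

An optimal solution of problem (2) in the sense of the norm $\|\cdot\|_{S,K,p}$ is then

$$M^\star_k = (K^{\frac{1}{2}})^\dagger \circ P'_k \circ K^{\frac{1}{2}} \circ Y \circ X^\dagger,$$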

where with .
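A finite-dimensional sketch of this weighted solution for $p = 2$ is given below. Several ingredients are assumptions of ours rather than statements of the remark: $K$ is taken symmetric positive definite so that $K^{1/2}$ is invertible, $P'_k$ is taken to be the projector onto the $k$ leading left singular vectors of $K^{1/2} \circ Y \circ X^\dagger \circ X$, and the helper names `weighted_low_rank_solution` and `weighted_error` are hypothetical.

```python
import numpy as np

def weighted_low_rank_solution(X, Y, K, k):
    """Sketch of M_k = (K^{1/2})^+ P'_k K^{1/2} Y X^+ for symmetric positive definite K."""
    # Symmetric square root of K from its eigendecomposition
    eigvals, Q = np.linalg.eigh(K)
    K_half = Q @ np.diag(np.sqrt(eigvals)) @ Q.T
    X_pinv = np.linalg.pinv(X)
    # Assumed definition of P'_k: leading left singular vectors of K^{1/2} Y X^+ X
    Z_K = K_half @ Y @ X_pinv @ X
    Phi, _, _ = np.linalg.svd(Z_K, full_matrices=False)
    P_k = Phi[:, :k] @ Phi[:, :k].T
    M_k = np.linalg.pinv(K_half) @ P_k @ K_half @ Y @ X_pinv
    return M_k, K_half

def weighted_error(X, Y, M, K_half):
    """Modified Schatten norm of Y - M X for p = 2, i.e. ||K^{1/2}(Y - M X)||_F."""
    return np.linalg.norm(K_half @ (Y - M @ X), "fro")

rng = np.random.default_rng(4)
n, N, k = 4, 7, 2                      # dim V, dim U, target rank
X = rng.standard_normal((n, N))
Y = rng.standard_normal((n, N))
A = rng.standard_normal((n, n))
K = A @ A.T + np.eye(n)                # symmetric positive definite weight
M_K, K_half = weighted_low_rank_solution(X, Y, K, k)
err_K = weighted_error(X, Y, M_K, K_half)
```

With an invertible $K^{1/2}$, the change of variable $N = K^{1/2} \circ M$ maps the weighted problem to an unweighted one with $Y$ replaced by $K^{1/2} \circ Y$, which is the reasoning this sketch follows.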

## 3 Some Particularizations

### 3.1 Trace Norm, Hilbert-Schmidt Norm and 2-Induced Norm

For $p = \infty$, the $p$-th Schatten norm corresponds to the 2-induced norm

$$\|M\|_{S,\infty} = \sup_{\{u \,:\, \|u\|_2 = 1\}} \|M u\|_{L^2} = \sup_i \sigma^M_i,$$

while for $p = 1$ and $p = 2$ we obtain the trace norm and the Hilbert-Schmidt norm, respectively. We have thus shown that $M^\star_k$ is the optimal solution of problem (2) for approximation in the sense of the trace norm, the Hilbert-Schmidt norm or the 2-induced norm.

### 3.2 Unconstrained DMD, Low-rank DMD and Kernel-Based DMD

If is full rank, or equivalently , then and the optimal approximation error simplifies to

$$\|Y - M^\star_k \circ X\|^2_{S,p} = \left( \sum_{i=k+1}^{m} (\sigma^Z_i)^p \right)^{\frac{2}{p}}.$$
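The simplified error can be checked numerically for the trace norm ($p = 1$), the Hilbert-Schmidt norm ($p = 2$) and the 2-induced norm ($p = \infty$). The sketch below assumes, as our own reading of the full-rank case, that $X$ is a real matrix with full column rank so that $X^\dagger \circ X = I$, and that $Z = Y \circ X^\dagger \circ X$; all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n, N, k = 6, 4, 2                       # dim V, dim U, target rank
X = rng.standard_normal((n, N))         # full column rank almost surely
Y = rng.standard_normal((n, N))

X_pinv = np.linalg.pinv(X)
Z = Y @ X_pinv @ X                      # equals Y when pinv(X) @ X is the identity
Phi_Z, sigma_Z, _ = np.linalg.svd(Z, full_matrices=False)
P_k = Phi_Z[:, :k] @ Phi_Z[:, :k].T
M_k = P_k @ Y @ X_pinv
E = Y - M_k @ X                         # approximation residual

# Errors measured with NumPy's built-in matrix norms
err_trace = np.linalg.norm(E, "nuc")    # p = 1
err_hs = np.linalg.norm(E, "fro")       # p = 2
err_induced = np.linalg.norm(E, 2)      # p = infinity

# Values predicted from the discarded singular values sigma_Z[k:]
pred_trace = sigma_Z[k:].sum()
pred_hs = np.sqrt((sigma_Z[k:] ** 2).sum())
pred_induced = sigma_Z[k:].max()
```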

If and , we recover the standard result for the unconstrained DMD problem .

In the case and , we recover the optimal result proposed in [4, Theorem 4.1] for low-rank DMD (or extended DMD in finite dimension). Sub-optimal solutions to this problem have been proposed in [1, 7, 18, 17].

In the case , the result characterizes the solution of low-rank approximation in reproducing kernel Hilbert spaces, on which kernel-based DMD relies. Theorem 2.3 justifies in this case the solution computed by the optimal kernel-based DMD algorithm proposed in . We note that the proposed solution has been already given in  for the infinite dimensional setting, but in the case where . Nevertheless, the solution provided by the authors is sub-optimal in the general case.

### 3.3 Continuous DMD

In the case , the result characterizes the solution of a continuous version of the DMD problem, where the number of snapshots is infinite. In particular, for , the problem is the DMD counterpart to the continuous POD problem presented in [13, Theorem 6.2]. Here, problem (2) is defined as follows: $X$ and $Y$ in (2) are compact Hilbert-Schmidt operators, defined by their kernels

$$X : g \mapsto \int_U k_X(u)\, g(u)\, d\mu(u) \quad \text{and} \quad Y : g \mapsto \int_U k_Y(u)\, g(u)\, d\mu(u),$$

where are the Hilbert-Schmidt kernels with supplied by the measure and so that . The solution and the optimal error are then characterized by Theorem 2.3.

## 4 Proof of Theorem 2.3

We will use the following extra notations in the proof. We define , where for any , . We thus have Finally, let and

### 4.1 Closed-Form Solution M⋆k

We begin by proving that problem (2) admits the solution .

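First, we remark that is full-rank () so that , with . Therefore, using the Pythagorean theorem, we have

$$\min_{M \in \mathcal{B}_k(V,V)} \|Y - M \circ X\|^2_{S,p} = \min_{M \in \mathcal{B}_k(V,V)} \left\{ \|(Y - M \circ X) \circ X^\dagger \circ X\|^2_{S,p} + \|(Y - M \circ X) \circ (I - X^\dagger \circ X)\|^2_{S,p} \right\}.$$

Since we have

$$X \circ X^\dagger \circ X = X \quad \text{thus} \quad \|M \circ X \circ (I - X^\dagger \circ X)\|^2_{S,p} = 0,$$

we obtain

$$\begin{aligned} \min_{M \in \mathcal{B}_k(V,V)} \|Y - M \circ X\|^2_{S,p} &= \min_{M \in \mathcal{B}_k(V,V)} \left\{ \|(Y - M \circ X) \circ X^\dagger \circ X\|^2_{S,p} + \|Y \circ (I - X^\dagger \circ X)\|^2_{S,p} \right\} \\ &= \min_{M \in \mathcal{B}_k(V,V)} \left\{ \|Y \circ X^\dagger \circ X - M \circ X\|^2_{S,p} + \|Y \circ (I - X^\dagger \circ X)\|^2_{S,p} \right\} \\ &= \min_{M \in \mathcal{B}_k(V,V)} \left\{ \|\tilde{Y} - M \circ \tilde{X}\|^2_{S,p} + \|Y \circ (I - X^\dagger \circ X)\|^2_{S,p} \right\}, \end{aligned} \tag{4}$$

where the last equality follows from the invariance of the $p$-th Schatten norm to unitary transforms and the fact that .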
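The splitting used in this first step can be verified numerically in the Frobenius case $p = 2$. The sketch below uses real matrices and an arbitrary $M$; these choices, and the variable names, are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(2)
n, N = 3, 6                             # dim V, dim U
X = rng.standard_normal((n, N))
Y = rng.standard_normal((n, N))
M = rng.standard_normal((n, n))         # arbitrary operator on V

Pi = np.linalg.pinv(X) @ X              # orthogonal projector onto the row space of X
I = np.eye(N)
R = Y - M @ X                           # residual Y - M X

lhs = np.linalg.norm(R, "fro") ** 2
on_range = np.linalg.norm(R @ Pi, "fro") ** 2          # part seen through X^+ X
off_range = np.linalg.norm(R @ (I - Pi), "fro") ** 2   # part seen through I - X^+ X
off_range_Y = np.linalg.norm(Y @ (I - Pi), "fro") ** 2 # same term without M X
```

Since `X @ Pi` equals `X`, the term `M @ X @ (I - Pi)` vanishes, so `off_range` and `off_range_Y` coincide and `lhs` equals `on_range + off_range_Y`.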

Second, from the Sylvester inequality, we get

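$$\min_{\tilde{M} \in \mathcal{B}_k(U,V)} \|\tilde{Y} - \tilde{M}\|^2_{S,p} \le \min_{M \in \mathcal{B}_k(V,V)} \|\tilde{Y} - M \circ \tilde{X}\|^2_{S,p},$$

and by invariance of the $p$-th Schatten norm to unitary transforms, we obtain

$$\underbrace{\min_{\Lambda \in \mathcal{B}_k(U,V)} \|\Lambda_{\tilde{Y}} - \Lambda\|^2_{S,p}}_{(P_1)} \le \underbrace{\min_{M \in \mathcal{B}_k(V,V)} \|\Lambda_{\tilde{Y}} - U^*_{\tilde{Y}} \circ M \circ \tilde{X} \circ V_{\tilde{Y}}\|^2_{S,p}}_{(P_2)},$$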

where .

Third, from the Schmidt-Eckart-Young-Mirsky theorem, problem ($P_1$) admits the solution

$$\Lambda^\star_k = \sum_{i=1}^{k} \sigma^{\tilde{Y}}_i e^V_i \langle e^U_i, \cdot \rangle_U.$$

Fourth, we remark that and that the truncation to terms of the SVD of corresponds to the operator yielding . Therefore, and we verify that

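$$U^*_{\tilde{Y}} \circ M^\star_k \circ \tilde{X} \circ V_{\tilde{Y}} = \Lambda^\star_k.$$

We deduce that the objective function of ($P_2$) reaches the minimum of the objective function of ($P_1$) at $M^\star_k$, i.e.,

$$\|\Lambda_{\tilde{Y}} - U^*_{\tilde{Y}} \circ M^\star_k \circ \tilde{X} \circ V_{\tilde{Y}}\|^2_{S,p} = \|\Lambda_{\tilde{Y}} - \Lambda^\star_k\|^2_{S,p}.$$

Finally, since the objective function of ($P_2$) reaches its lower bound at $M^\star_k$, $M^\star_k$ is a minimiser of ($P_2$). We then deduce from (4) that $M^\star_k$ is also a minimiser of problem (2).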

### 4.2 Characterization of the Optimal Error Norm

It remains to characterize the error norm (3). On the one hand, we have

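$$Y \circ (I - X^\dagger \circ X) = \sum_{i=1}^{m} \sum_{j=r+1}^{m} \sigma^Y_i \varphi^Y_i \langle \psi^X_j, \cdot \rangle_U \langle \psi^Y_i, \psi^X_j \rangle_U.$$

Since $\{\psi^X_\ell\}_\ell$ is an ONB of $U$, we can expand the norm and obtain

$$\begin{aligned} \|Y \circ (I - X^\dagger \circ X)\|^2_{S,p} &= \left( \sum_{\ell=1}^{\dim(U)} \|Y \circ (I - X^\dagger \circ X) \psi^X_\ell\|_V^p \right)^{2/p} \\ &= \left( \sum_{\ell=1}^{\dim(U)} \left\| \sum_{i=1}^{m} \sum_{j=r+1}^{m} \sigma^Y_i \varphi^Y_i \langle \psi^X_j, \psi^X_\ell \rangle_U \langle \psi^Y_i, \psi^X_j \rangle_U \right\|_V^p \right)^{2/p} \\ &= \left( \sum_{\ell=r+1}^{m} \left( \left\| \sum_{i=1}^{m} \sigma^Y_i \varphi^Y_i \langle \psi^Y_i, \psi^X_\ell \rangle_U \right\|_V^2 \right)^{p/2} \right)^{2/p} \\ &= \left( \sum_{\ell=r+1}^{m} \left( \sum_{i=1}^{m} \|\sigma^Y_i \varphi^Y_i \langle \psi^Y_i, \psi^X_\ell \rangle_U\|_V^2 \right)^{p/2} \right)^{2/p} \\ &= \left( \sum_{\ell=r+1}^{m} \left( \sum_{i=1}^{m} (\sigma^Y_i \langle \psi^Y_i, \psi^X_\ell \rangle_U)^2 \right)^{p/2} \right)^{2/p}, \end{aligned}$$

where, in order to obtain the last two equalities, we have exploited the fact that $\{\varphi^Y_i\}_i$ is an ONB of $V$. On the other hand, we have

$$\begin{aligned} \|\Lambda_{\tilde{Y}} - \Lambda^\star_k\|^2_{S,p} &= \left( \sum_{j=1}^{\dim(U)} \left\| \sum_{i=1}^{m} \sigma^{\tilde{Y}}_i e^V_i \langle e^U_i, e^U_j \rangle_U - \sum_{i=1}^{k} \sigma^{\tilde{Y}}_i e^V_i \langle e^U_i, e^U_j \rangle_U \right\|_V^p \right)^{2/p} \\ &= \left( \sum_{j=1}^{\dim(U)} \left\| \sum_{i=k+1}^{m} \sigma^{\tilde{Y}}_i e^V_i \langle e^U_i, e^U_j \rangle_U \right\|_V^p \right)^{2/p} \\ &= \left( \sum_{i=k+1}^{m} \|\sigma^{\tilde{Y}}_i e^V_i\|_V^p \right)^{2/p} \\ &= \left( \sum_{i=k+1}^{m} (\sigma^{\tilde{Y}}_i)^p \right)^{2/p}. \end{aligned}$$

Finally, from (4) and the two above expressions, we conclude

$$\begin{aligned} \|Y - M^\star_k \circ X\|^2_{S,p} &= \|\tilde{Y} - M^\star_k \circ \tilde{X}\|^2_{S,p} + \|Y \circ (I - X^\dagger \circ X)\|^2_{S,p} \\ &= \|\Lambda_{\tilde{Y}} - \Lambda^\star_k\|^2_{S,p} + \|Y \circ (I - X^\dagger \circ X)\|^2_{S,p} \\ &= \left( \sum_{i=k+1}^{m} (\sigma^{\tilde{Y}}_i)^p \right)^{2/p} + \left( \sum_{\ell=r+1}^{m} \left( \sum_{i=1}^{m} (\sigma^Y_i \langle \psi^Y_i, \psi^X_\ell \rangle_U)^2 \right)^{p/2} \right)^{2/p}. \qquad \square \end{aligned}$$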
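The second term of the error can be cross-checked numerically for $p = 2$. In the sketch below, which rests on our own assumptions, operators are real matrices, $r$ is taken to be the rank of $X$, and the index $\ell$ runs over all right singular vectors of $X$ beyond the first $r$ (that is, up to $\dim(U)$ in this finite-dimensional setting); variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
n, N = 3, 5                              # dim V, dim U
X = rng.standard_normal((n, N))
Y = rng.standard_normal((n, N))

# Full SVD of X: rows of Psi_X_t are the right singular vectors psi^X_l, l = 1..N
_, sigma_X, Psi_X_t = np.linalg.svd(X, full_matrices=True)
r = np.linalg.matrix_rank(X)
null_vectors = Psi_X_t[r:].T             # psi^X_l for l > r, shape (N, N - r)

# Thin SVD of Y: rows of Psi_Y_t are the right singular vectors psi^Y_i
_, sigma_Y, Psi_Y_t = np.linalg.svd(Y, full_matrices=False)

# Double sum: sum_{l > r} sum_i (sigma^Y_i <psi^Y_i, psi^X_l>)^2
inner = Psi_Y_t @ null_vectors           # entries <psi^Y_i, psi^X_l>
double_sum = np.sum((sigma_Y[:, None] * inner) ** 2)

# Direct evaluation of ||Y (I - X^+ X)||_F^2
direct = np.linalg.norm(Y @ (np.eye(N) - np.linalg.pinv(X) @ X), "fro") ** 2
```

Both quantities measure the energy of $Y$ on the orthogonal complement of the row space of $X$, so they should agree up to rounding.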

## 5 Conclusion

We have shown that there exists a closed-form optimal solution to the non-convex problem related to low-rank approximation of linear bounded operators in the sense of the $p$-th Schatten norm. This result generalizes to low-rank operators in Hilbert spaces the solutions obtained in the context of low-rank matrix approximation. As in the latter finite-dimensional case, the proposed closed-form solution takes the form of the orthogonal projection of the solution of the unconstrained problem onto a specific low-dimensional subspace. However, the proof is substantially different. It relies on the well-known Schmidt-Eckart-Young-Mirsky theorem. The proposed theorem is discussed and applied to various contexts, including low-rank approximation with respect to the trace norm, the 2-induced norm and the Hilbert-Schmidt norm, or kernel-based and continuous DMD.

## References

1. Chen, K.K., Tu, J.H., Rowley, C.W.: Variants of dynamic mode decomposition: boundary condition, Koopman, and Fourier analyses. Journal of Nonlinear Science 22(6), 887–915 (2012)
2. Eckart, C., Young, G.: The approximation of one matrix by another of lower rank. Psychometrika 1(3), 211–218 (1936)
3. Fazel, M.: Matrix rank minimization with applications. Ph.D. thesis, Stanford University (2002)
4. Héas, P., Herzet, C.: Low rank dynamic mode decomposition: Optimal solution in polynomial time. arXiv e-prints (2017)
5. Héas, P., Herzet, C.: Kernel methods for non-linear reduced modeling. arXiv e-prints (2018)
6. Jain, P., Meka, R., Dhillon, I.S.: Guaranteed rank minimization via singular value projection. In: Advances in Neural Information Processing Systems, pp. 937–945 (2010)
7. Jovanovic, M., Schmid, P., Nichols, J.: Low-rank and sparse dynamic mode decomposition. Center for Turbulence Research Annual Research Briefs, pp. 139–152 (2012)
8. Lee, K., Bresler, Y.: Guaranteed minimum rank approximation from linear observations by nuclear norm minimization with an ellipsoidal constraint. arXiv preprint (2009)
9. Lee, K., Bresler, Y.: ADMiRA: Atomic decomposition for minimum rank approximation. IEEE Transactions on Information Theory 56(9), 4402–4416 (2010)
10. Mesbahi, M., Papavassilopoulos, G.P.: On the rank minimization problem over a positive semidefinite linear matrix inequality. IEEE Transactions on Automatic Control 42(2), 239–243 (1997)
11. Mishra, B., Meyer, G., Bach, F., Sepulchre, R.: Low-rank optimization with trace norm penalty. SIAM Journal on Optimization 23(4), 2124–2149 (2013)
12. Parrilo, P.A., Khatri, S.: On cone-invariant linear matrix inequalities. IEEE Transactions on Automatic Control 45(8), 1558–1563 (2000)
13. Quarteroni, A., Manzoni, A., Negri, F.: Reduced basis methods for partial differential equations: an introduction, vol. 92. Springer (2015)
14. Recht, B., Fazel, M., Parrilo, P.A.: Guaranteed minimum-rank solutions of linear matrix equations via nuclear norm minimization. SIAM Review 52(3), 471–501 (2010)
15. Schmidt, E.: Zur Theorie der linearen und nichtlinearen Integralgleichungen. I. Teil: Entwicklung willkürlicher Funktionen nach Systemen vorgeschriebener. Mathematische Annalen 63, 433–476 (1907)
16. Tu, J.H., Rowley, C.W., Luchtenburg, D.M., Brunton, S.L., Kutz, J.N.: On dynamic mode decomposition: Theory and applications. Journal of Computational Dynamics 1(2), 391–421 (2014)
17. Williams, M.O., Kevrekidis, I., Rowley, C.: A data-driven approximation of the Koopman operator: Extending dynamic mode decomposition. Journal of Nonlinear Science 25(6), 1307–1346 (2015)
18. Williams, M.O., Rowley, C.W., Kevrekidis, I.G.: A kernel-based method for data-driven Koopman spectral analysis. Journal of Computational Dynamics 2(2), 247–265 (2015)
19. Zhu, K.: Operator Theory in Function Spaces, Second Edition. Mathematical Surveys and Monographs. American Mathematical Soc. (2007)