
# The perturbation analysis of nonconvex low-rank matrix robust recovery

In this paper, we propose a completely perturbed nonconvex Schatten p-minimization to address a model of completely perturbed low-rank matrix recovery. Based on the restricted isometry property, the paper generalizes the investigation to a complete perturbation model that accounts for not only noise but also perturbation, and gives the restricted isometry property condition that guarantees the recovery of low-rank matrices, together with the corresponding reconstruction error bound. In particular, the analysis reveals that when p decreases to 0 and a>1 for the complete perturbation and low-rank matrix, the condition is the optimal sufficient condition δ_2r<1. Numerical experiments show that the nonconvex Schatten p-minimization method outperforms the convex nuclear norm minimization approach in the completely perturbed scenario.

## 1 Introduction

Low-rank matrix recovery (LMR) is a rapidly developing topic attracting the interest of numerous researchers in the field of optimization and compressed sensing. Mathematically, we can describe it as follows:

$$y=\mathcal{A}(X), \tag{1.1}$$

where $\mathcal{A}$ is a known linear transformation, $y$ is a given observation vector, and $X\in\mathbb{R}^{m\times n}$ is the matrix to be recovered. The objective of LMR is to find the lowest-rank matrix consistent with $y$ and $\mathcal{A}$. If the observation $y$ is corrupted by noise $z$, model (1.1) takes the following form

$$\hat{y}=\mathcal{A}(X)+z, \tag{1.2}$$

where $\hat{y}$ is the noisy measurement, and $z$ is the additive noise independent of the matrix $X$. However, more general LMR models arise in which not only the linear measurement $y$ is contaminated by the noise vector $z$, but also the linear transformation $\mathcal{A}$ is perturbed by $\mathcal{E}$. In this completely perturbed setting, the linear transformation $\mathcal{A}$ is replaced by $\hat{\mathcal{A}}=\mathcal{A}+\mathcal{E}$. Complete perturbation arises in remote sensing [1], radar [2], source separation [3], etc. When $m=n$ and the matrix $X=\mathrm{diag}(x)$ is diagonal, models (1.1) and (1.2) degenerate to the compressed sensing models

$$y=Ax, \tag{1.3}$$

$$\hat{y}=Ax+z, \tag{1.4}$$

where $A$ is a measurement matrix and $x$ is an unknown sparse signal. We refer to problem (1.3) as sparse signal recovery. For the completely perturbed model, the convex nuclear norm minimization is frequently considered [4]:

$$\min_{\tilde{Z}\in\mathbb{R}^{m\times n}}\|\tilde{Z}\|_*\quad\text{s.t.}\quad\|\hat{\mathcal{A}}(\tilde{Z})-\hat{y}\|_2\leq\epsilon'_{\mathcal{A},r,y}, \tag{1.5}$$

where $\|\tilde{Z}\|_*$ is the nuclear norm of the matrix $\tilde{Z}$, that is, the sum of its singular values, and $\epsilon'_{\mathcal{A},r,y}$ is the total noise level. Problem (1.5) can be reduced to the $\ell_1$-minimization [5]

$$\min_{\tilde{z}\in\mathbb{R}^{n_1}}\|\tilde{z}\|_1\quad\text{s.t.}\quad\|\hat{A}\tilde{z}-\hat{y}\|_2\leq\epsilon'_{\mathcal{A},r,y}, \tag{1.6}$$

where $\|\tilde{z}\|_1$ is the $\ell_1$-norm of the vector $\tilde{z}$, that is, the sum of the absolute values of its entries.

Chartrand [7] showed that fewer measurements are required for exact reconstruction if the $\ell_1$-norm is replaced by the $\ell_p$ quasi-norm ($0<p<1$). There is a large body of work on reconstruction via the $\ell_p$-minimization [8], [9], [10], [11], [12], [13], [14], [15], [16], [17], [18], [19]. In [7], numerical simulations demonstrated that fewer measurements are needed for exact reconstruction than when $p=1$.

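In this paper, we are interested in the completely perturbed model for the nonconvex Schatten $p$-minimization ($0<p<1$)

$$\min_{\tilde{Z}\in\mathbb{R}^{m\times n}}\|\tilde{Z}\|_p^p\quad\text{s.t.}\quad\|\hat{\mathcal{A}}(\tilde{Z})-\hat{y}\|_2\leq\epsilon'_{\mathcal{A},r,y}, \tag{1.7}$$

where $\|\tilde{Z}\|_p$ is the Schatten $p$ quasi-norm of the matrix $\tilde{Z}$, that is, $\|\tilde{Z}\|_p=\left(\sum_i\sigma_i^p(\tilde{Z})\right)^{1/p}$ with $\sigma_i(\tilde{Z})$ being the $i$th singular value of $\tilde{Z}$. Problem (1.7) can be reduced to the $\ell_p$-minimization [6]

$$\min_{\tilde{z}\in\mathbb{R}^{M\times m}}\|\tilde{z}\|_p^p\quad\text{s.t.}\quad\|\hat{A}\tilde{z}-\hat{y}\|_2\leq\epsilon'_{\mathcal{A},r,y}, \tag{1.8}$$

where $\|\tilde{z}\|_p$ is the $\ell_p$ quasi-norm of the vector $\tilde{z}$. To the best of our knowledge, recent research considers only the unperturbed situation ($\mathcal{E}=0$), that is, the linear transformation $\mathcal{A}$ is not perturbed by $\mathcal{E}$ (for related work, see [22], [23], [24], [25], [26], [27], [28], [29]). From the perspective of applications, it is more practical to investigate the recovery of low-rank matrices in the scenario of complete perturbation.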
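As a concrete illustration of the objective in (1.7), the following minimal sketch evaluates $\|\tilde{Z}\|_p^p$ from the singular values. The function name `schatten_p_pow` and the cutoff `tol` (used to discard numerically zero singular values) are illustrative choices, not part of the paper.

```python
import numpy as np

def schatten_p_pow(Z, p, tol=1e-12):
    """Return ||Z||_p^p = sum_i sigma_i(Z)^p for 0 < p <= 1.

    Singular values below `tol` are treated as exact zeros (an assumption
    made here for numerical robustness; it is not part of the definition).
    """
    sigma = np.linalg.svd(Z, compute_uv=False)
    sigma = sigma[sigma > tol]
    return float(np.sum(sigma ** p))

Z = np.diag([3.0, 4.0, 0.0])
nuclear = schatten_p_pow(Z, 1.0)   # p = 1 gives the nuclear norm, 3 + 4
half = schatten_p_pow(Z, 0.5)      # sqrt(3) + sqrt(4)
```

For $p=1$ the value coincides with the nuclear norm in (1.5); as $p$ decreases, each nonzero singular value contributes closer to 1, so the sum approaches the rank.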

In this paper, based on the restricted isometry property (RIP), we characterize the performance of low-rank matrix reconstruction via the nonconvex Schatten $p$-minimization in the completely perturbed setting. The main contributions of this paper are as follows. First, we present a sufficient condition for reconstruction of low-rank matrices via the nonconvex Schatten $p$-minimization. Second, the estimation accuracy between the optimal solution and the original matrix is described by a total noise and a best $r$-rank approximation error. The result reveals stable and robust reconstruction of low-rank matrices in the presence of total noise. Third, numerical experiments are conducted to support the obtained results, and demonstrate that the performance of nonconvex Schatten $p$-minimization can be better than that of convex nuclear norm minimization in the completely perturbed model.

The rest of this paper is constructed as follows.

## 2 Notation and main results

Before presenting the main results, we first introduce the notion of the RIC of a linear transformation $\mathcal{A}$, which is as follows.

###### Definition 2.1.

The restricted isometry constant (RIC) $\delta_r$ of a linear transformation $\mathcal{A}$ is the smallest constant such that

$$(1-\delta_r)\|X\|_F^2\leq\|\mathcal{A}(X)\|_2^2\leq(1+\delta_r)\|X\|_F^2 \tag{2.1}$$

holds for all $r$-rank matrices $X$ (i.e., $\mathrm{rank}(X)\leq r$), where $\|X\|_F$ is the Frobenius norm of the matrix $X$.
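The RIC in Definition 2.1 is a supremum over all $r$-rank matrices and cannot be computed by sampling, but random trials give a lower bound that is useful for sanity checks. The sketch below assumes the linear transformation is stored as a matrix `A` acting on the row-major flattening of $X$; the function name, the Gaussian sampling scheme, and the trial count are illustrative assumptions.

```python
import numpy as np

def empirical_ric_lower_bound(A, m, n, r, trials=200, seed=0):
    """Monte-Carlo lower bound on delta_r in (2.1).

    Samples random rank-r matrices with unit Frobenius norm and records the
    largest deviation of ||A(X)||_2^2 from 1. The true RIC is at least this
    value; it may be much larger.
    """
    rng = np.random.default_rng(seed)
    worst = 0.0
    for _ in range(trials):
        X = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))
        X /= np.linalg.norm(X, "fro")
        energy = np.linalg.norm(A @ X.reshape(-1)) ** 2
        worst = max(worst, abs(energy - 1.0))
    return worst

rng = np.random.default_rng(1)
m, n, r, M = 8, 8, 2, 48
A = rng.standard_normal((M, m * n)) / np.sqrt(M)  # hypothetical Gaussian map
delta_lb = empirical_ric_lower_bound(A, m, n, r)
```

An identity map preserves every Frobenius norm, so the estimate for it is zero up to rounding.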

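Then we provide some notations similar to [4], which quantify the perturbations $\mathcal{E}$ and $z$ with the bounds:

$$\frac{\|\mathcal{E}\|_{op}}{\|\mathcal{A}\|_{op}}\leq\epsilon_{\mathcal{A}},\qquad\frac{\|\mathcal{E}\|_{op}^{(r)}}{\|\mathcal{A}\|_{op}^{(r)}}\leq\epsilon_{\mathcal{A}}^{(r)},\qquad\frac{\|z\|_2}{\|y\|_2}\leq\epsilon_y, \tag{2.2}$$

where $\|\mathcal{A}\|_{op}$ is the operator norm of the linear transformation $\mathcal{A}$, and $\|\mathcal{A}\|_{op}^{(r)}$ is its counterpart restricted to $r$-rank matrices. We also use the quantities

$$t_r=\frac{\|X_{[r]^c}\|_F}{\|X_{[r]}\|_F},\qquad s_r=\frac{\|X_{[r]^c}\|_*}{\sqrt{r}\,\|X_{[r]}\|_F},\qquad\kappa_{\mathcal{A}}^{(r)}=\frac{\sqrt{1+\delta_r}}{\sqrt{1-\delta_r}},\qquad\alpha_{\mathcal{A}}=\frac{\|\mathcal{A}\|_{op}}{\sqrt{1-\delta_r}}. \tag{2.3}$$

Here $X_{[r]}$ is the best $r$-rank approximation of the matrix $X$, whose singular values are the $r$ largest singular values of the matrix $X$, and $X_{[r]^c}=X-X_{[r]}$. With the notations and symbols above, we present our results for reconstruction of low-rank matrices via the completely perturbed nonconvex Schatten $p$-minimization.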
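The ratios $t_r$ and $s_r$ in (2.3) measure how far $X$ is from being exactly $r$-rank. A minimal sketch of how they could be computed with a truncated SVD follows; the helper names `best_rank_r` and `tail_ratios` are illustrative.

```python
import numpy as np

def best_rank_r(X, r):
    """Best r-rank approximation X_[r]: keep the r largest singular values."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r, :]

def tail_ratios(X, r):
    """Return (t_r, s_r) from (2.3) for a matrix X with X_[r] != 0."""
    head = best_rank_r(X, r)
    tail = X - head                      # X_[r]^c
    head_fro = np.linalg.norm(head, "fro")
    t_r = np.linalg.norm(tail, "fro") / head_fro
    s_r = np.linalg.norm(tail, "nuc") / (np.sqrt(r) * head_fro)
    return t_r, s_r

# Singular values 4, 2, 1, 1 with r = 2: the tail holds the two 1's.
t_r, s_r = tail_ratios(np.diag([4.0, 2.0, 1.0, 1.0]), 2)
```

Both ratios vanish when $X$ is exactly $r$-rank, which is the situation treated in Remark 2.3.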

###### Theorem 2.1.

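For given relative perturbations $\epsilon_{\mathcal{A}}$, $\epsilon_{\mathcal{A}}^{(r)}$, $\epsilon_{\mathcal{A}}^{(2ar)}$, and $\epsilon_y$ in (2.2), suppose the RIC for the linear transformation $\mathcal{A}$ fulfills

$$\delta_{2ar}<\frac{2+\sqrt{2}\,a^{1/2-1/p}}{\left(1+\sqrt{2}\,a^{1/2-1/p}\right)\left(1+\epsilon_{\mathcal{A}}^{(2ar)}\right)^2}-1 \tag{2.4}$$

for $a>1$ and that the general matrix $X$ meets

$$t_r+s_r<\frac{1}{\kappa_{\mathcal{A}}^{(r)}}. \tag{2.5}$$

Then a minimizer $X^*$ of problem (1.7) approximates the true matrix $X$ with errors

$$\|X-X^*\|_F^p\leq C_1\left(\epsilon'_{\mathcal{A},r,y}\right)^p+C_2\frac{\|X_{[r]^c}\|_p^p}{r^{1-p/2}}, \tag{2.6}$$

$$\|X-X^*\|_p^p\leq C'_1\,r^{1-p/2}\left(\epsilon'_{\mathcal{A},r,y}\right)^p+C'_2\|X_{[r]^c}\|_p^p, \tag{2.7}$$

where the total noise is

$$\epsilon'_{\mathcal{A},r,y}=\left[\frac{\epsilon_{\mathcal{A}}^{(r)}\kappa_{\mathcal{A}}^{(r)}+\epsilon_{\mathcal{A}}\alpha_{\mathcal{A}}t_r}{1-\kappa_{\mathcal{A}}^{(r)}(t_r+s_r)}+\epsilon_y\right]\|y\|_2, \tag{2.8}$$

and

$$C_1=\frac{2^p\left(1+a^{p/2-1}\right)\left(1+\hat{\delta}_{(a+1)r}\right)^{p/2}}{\left(1-\hat{\delta}_{(a+1)r}\right)^p-a^{p/2-1}\left(\hat{\delta}_{(a+1)r}^2+\hat{\delta}_{2ar}^2\right)^{p/2}}, \tag{2.9}$$

$$C_2=2a^{p/2-1}\left[1+\frac{\left(1+a^{p/2-1}\right)\left(\hat{\delta}_{(a+1)r}^2+\hat{\delta}_{2ar}^2\right)^{p/2}}{\left(1-\hat{\delta}_{(a+1)r}\right)^p-a^{p/2-1}\left(\hat{\delta}_{(a+1)r}^2+\hat{\delta}_{2ar}^2\right)^{p/2}}\right], \tag{2.10}$$

$$C'_1=\frac{2^{p+1}(1+a)^{1-p/2}\left(1+\hat{\delta}_{(a+1)r}\right)^{p/2}}{\left(1-\hat{\delta}_{(a+1)r}\right)^p-a^{p/2-1}\left(\hat{\delta}_{(a+1)r}^2+\hat{\delta}_{2ar}^2\right)^{p/2}}, \tag{2.11}$$

$$C'_2=2+\frac{4(1+a)^{1-p/2}a^{p/2-1}\left(\hat{\delta}_{(a+1)r}^2+\hat{\delta}_{2ar}^2\right)^{p/2}}{\left(1-\hat{\delta}_{(a+1)r}\right)^p-a^{p/2-1}\left(\hat{\delta}_{(a+1)r}^2+\hat{\delta}_{2ar}^2\right)^{p/2}}, \tag{2.12}$$

where $\hat{\delta}_{(a+1)r}$ and $\hat{\delta}_{2ar}$ denote the RICs of the perturbed transformation $\hat{\mathcal{A}}$ (see Lemma 3.1).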

###### Remark 2.1.

Theorem 2.1 gives a sufficient condition for the reconstruction of low-rank matrices via nonconvex Schatten $p$-minimization in the completely perturbed scenario. Condition (2.4) of the Theorem extends the assumption of the $\ell_p$ situation in [6] to the nonconvex Schatten $p$-minimization. Observe that as the value of $p$ becomes large, the bound on the RIC $\delta_{2ar}$ decreases, which reveals that a smaller value of $p$ leads to a weaker (less restrictive) reconstruction condition. Particularly, when $p\to0$ (in which case (1.7) degenerates to rank minimization), condition (2.4) becomes the RIP condition $\delta_{2ar}<2/\left(1+\epsilon_{\mathcal{A}}^{(2ar)}\right)^2-1$ for reconstruction of low-rank matrices via rank minimization. To the best of our knowledge, the current optimal recovery condition based on the RIP is $\delta_{2r}<1$, which ensures exact reconstruction of $r$-rank matrices via rank minimization [21]; therefore the Theorem extends that condition to the scenario with noise for low-rank matrices. Furthermore, when $m=n$ and the matrix $X$ is diagonal, the Theorem reduces to the case of compressed sensing given by [6].
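The dependence of the right-hand side of (2.4) on $p$ can be checked numerically. The following sketch simply evaluates that expression; the function name `ric_bound` and the argument name `eps_2ar` (standing for $\epsilon_{\mathcal{A}}^{(2ar)}$) are illustrative.

```python
import math

def ric_bound(a, p, eps_2ar=0.0):
    """Right-hand side of (2.4): the upper bound required of delta_{2ar}."""
    x = math.sqrt(2.0) * a ** (0.5 - 1.0 / p)
    return (2.0 + x) / ((1.0 + x) * (1.0 + eps_2ar) ** 2) - 1.0

# With a = 2, p = 1 and no perturbation, sqrt(2) * 2**(-1/2) = 1, so the
# bound is (2 + 1) / (1 + 1) - 1 = 0.5.
bound_p1 = ric_bound(2.0, 1.0)
bounds = [ric_bound(2.0, p) for p in (1.0, 0.5, 0.1)]
```

For fixed $a>1$ the bound increases toward 1 as $p$ decreases when there is no perturbation, and any positive `eps_2ar` lowers it.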

###### Remark 2.2.

Under the requirement (2.4), one can easily check that the condition (2.5) is satisfied. Besides, when the matrix $X$ is exactly $r$-rank (so that $t_r=s_r=0$), the condition (2.5) holds. Additionally, the inequalities (2.6) and (2.7) in Theorem 2.1, which use two kinds of metrics, provide upper bounds on the reconstruction error of nonconvex Schatten $p$-minimization. The estimates show that the reconstruction accuracy is controlled by the best $r$-rank approximation error and the total noise. In particular, when there is no noise (i.e., $\mathcal{E}=0$ and $z=0$), they show that an $r$-rank matrix can be exactly reconstructed via the nonconvex Schatten $p$-minimization. In (2.6), both the error bound noise constant $C_1$ and the error bound compressibility constant $C_2$ may depend on the value of $p$. Numerical simulations reveal that when we fix the other independent parameters, a smaller value of $p$ produces a smaller $C_1$ and a smaller $C_2$. For more details, see Fig. 1.
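To explore how $C_1$ and $C_2$ vary with the parameters, the constants (2.9) and (2.10) can be evaluated directly. This is a minimal sketch; the function name `error_constants` and the argument names `d1`, `d2` (standing for $\hat{\delta}_{(a+1)r}$ and $\hat{\delta}_{2ar}$) are illustrative, and the guard on the denominator is an added safety check rather than part of the theorem statement.

```python
def error_constants(a, p, d1, d2):
    """Evaluate C_1 (2.9) and C_2 (2.10).

    d1 stands for hat-delta_{(a+1)r}, d2 for hat-delta_{2ar}.
    """
    w = a ** (p / 2.0 - 1.0)
    cross = (d1 ** 2 + d2 ** 2) ** (p / 2.0)
    denom = (1.0 - d1) ** p - w * cross
    if denom <= 0.0:
        raise ValueError("denominator is not positive; the RIC values are too large")
    c1 = 2.0 ** p * (1.0 + w) * (1.0 + d1) ** (p / 2.0) / denom
    c2 = 2.0 * w * (1.0 + (1.0 + w) * cross / denom)
    return c1, c2

# With vanishing RICs the constants reduce to 2^p (1 + a^{p/2-1}) and 2 a^{p/2-1}.
c1, c2 = error_constants(a=2.0, p=1.0, d1=0.0, d2=0.0)
```

Sweeping `p` over a grid with the other arguments fixed gives curves of the kind the remark refers to; the sketch makes no claim about the specific values plotted in the paper's figure.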

###### Remark 2.3.

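When the matrix $X$ is a strictly $r$-rank matrix (i.e., $X=X_{[r]}$), a minimizer $X^*$ of problem (1.7) approximates the true matrix $X$ with errors

$$\|X-X^*\|_F\leq C_1^{1/p}\,\epsilon'_{\mathcal{A},r,y},\qquad\|X-X^*\|_p\leq {C'_1}^{1/p}\,r^{1/p-1/2}\,\epsilon'_{\mathcal{A},r,y},$$

where

$$\epsilon'_{\mathcal{A},r,y}=\left[\epsilon_{\mathcal{A}}^{(r)}\kappa_{\mathcal{A}}^{(r)}+\epsilon_y\right]\|y\|_2.$$

In the case of $\mathcal{E}=0$, that is, when there is no perturbation in the linear transformation $\mathcal{A}$, the total noise reduces to $\epsilon'_{\mathcal{A},r,y}=\epsilon_y\|y\|_2$. In the case that $m=n$, the matrix $X$ is diagonal (i.e., the results of the Theorem reduce to the case of compressed sensing), and $p=1$, our result contains that of the corresponding theorem in [5].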
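The relation between the general total noise (2.8) and its strictly $r$-rank special case above can be verified with a direct evaluation. The sketch below is a plain transcription of (2.8); the function and argument names are illustrative, and the numeric inputs are arbitrary placeholders rather than values from the paper.

```python
def total_noise(eps_A, eps_A_r, eps_y, kappa, alpha, t_r, s_r, norm_y):
    """Total noise (2.8); requires condition (2.5), i.e. t_r + s_r < 1/kappa."""
    slack = 1.0 - kappa * (t_r + s_r)
    if slack <= 0.0:
        raise ValueError("condition (2.5) is violated")
    return ((eps_A_r * kappa + eps_A * alpha * t_r) / slack + eps_y) * norm_y

# Strictly r-rank case: t_r = s_r = 0, so (2.8) collapses to
# (eps_A_r * kappa + eps_y) * ||y||_2.
eps = total_noise(eps_A=0.05, eps_A_r=0.05, eps_y=0.01,
                  kappa=1.2, alpha=3.0, t_r=0.0, s_r=0.0, norm_y=2.0)
```

When `t_r` and `s_r` are positive the denominator shrinks, so the total noise grows as $X$ moves away from being exactly $r$-rank.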

## 3 Proofs of the main results

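In this part, we provide the proofs of the main results. To this end, we need the following auxiliary lemmas. Firstly, we give Lemma 3.1, which states the RIP satisfied by the perturbed transformation $\hat{\mathcal{A}}$.

###### Lemma 3.1.

(RIP for $\hat{\mathcal{A}}$ [4]) Given the RIC $\delta_r$ associated with the linear transformation $\mathcal{A}$ and the relative perturbation $\epsilon_{\mathcal{A}}^{(r)}$ associated with the linear transformation $\mathcal{E}$, fix the constant $\hat{\delta}_{r,\max}=(1+\delta_r)\left(1+\epsilon_{\mathcal{A}}^{(r)}\right)^2-1$. Then the RIC $\hat{\delta}_r\leq\hat{\delta}_{r,\max}$ for $\hat{\mathcal{A}}=\mathcal{A}+\mathcal{E}$ is the smallest nonnegative constant such that

$$(1-\hat{\delta}_r)\|X\|_F^2\leq\|\hat{\mathcal{A}}(X)\|_2^2\leq(1+\hat{\delta}_r)\|X\|_F^2 \tag{3.1}$$

holds for all matrices $X$ that are $r$-rank.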

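We will employ the fact that $\hat{\mathcal{A}}$ maps low-rank orthogonal matrices to nearly orthogonal vectors, which is given by [20].

###### Lemma 3.2.

([20]) For all $X$, $Y$ satisfying $\langle X,Y\rangle=0$, and $\mathrm{rank}(X)\leq r_1$, $\mathrm{rank}(Y)\leq r_2$,

$$\left|\langle\hat{\mathcal{A}}(X),\hat{\mathcal{A}}(Y)\rangle\right|\leq\hat{\delta}_{r_1+r_2}\|X\|_F\|Y\|_F. \tag{3.2}$$

Moreover, the following lemma, which combines a lemma from [21] with a lemma from [25], will be used in the proof of the main result.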

###### Lemma 3.3.

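Assume that $X,Y\in\mathbb{R}^{m\times n}$ obey $X^\top Y=0$ and $XY^\top=0$. Let $0<p\leq1$. Then

$$\|X+Y\|_p^p=\|X\|_p^p+\|Y\|_p^p,\qquad\|X+Y\|_p\geq\|X\|_p+\|Y\|_p, \tag{3.3}$$

where $\|X\|_p$ and $\|Y\|_p$ stand for the nuclear norm of the matrix in the case of $p=1$.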
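Lemma 3.3 is easy to check numerically on a small example. The sketch below builds two matrices with disjoint row and column supports, which is one simple way (chosen here for illustration) to obtain $X^\top Y=0$ and $XY^\top=0$, and compares both sides of (3.3). The helper `schatten_p_pow` and its cutoff `tol` are illustrative.

```python
import numpy as np

def schatten_p_pow(Z, p, tol=1e-12):
    """||Z||_p^p computed from the singular values above a small cutoff."""
    sigma = np.linalg.svd(Z, compute_uv=False)
    sigma = sigma[sigma > tol]
    return float(np.sum(sigma ** p))

# X lives in the top-left block, Y in the bottom-right block.
X = np.zeros((4, 4)); X[:2, :2] = [[1.0, 2.0], [3.0, 4.0]]
Y = np.zeros((4, 4)); Y[2:, 2:] = [[0.0, 1.0], [1.0, 1.0]]
assert np.allclose(X.T @ Y, 0.0) and np.allclose(X @ Y.T, 0.0)

p = 0.5
lhs = schatten_p_pow(X + Y, p)                     # ||X + Y||_p^p
rhs = schatten_p_pow(X, p) + schatten_p_pow(Y, p)  # ||X||_p^p + ||Y||_p^p
```

Here the singular values of $X+Y$ are the union of those of $X$ and $Y$, which is why the $p$-th powers add exactly.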

For any matrix $X\in\mathbb{R}^{m\times n}$, we write the singular value decomposition (SVD) of $X$ as

$$X=U\,\mathrm{diag}(\sigma(X))\,V^\top,$$

where $\sigma(X)$ is the vector of the singular values of $X$, and $U$ and $V$ are respectively the left and right singular vector matrices of $X$.

###### Proof of the theorem 2.1.

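Let $X$ denote the original matrix to be recovered and $X^*$ denote the optimal solution of (1.7). Let $Z=X-X^*$, and, with $U$ and $V$ taken from the SVD of $X$, let the SVD of $U^\top ZV$ be given by

$$U^\top ZV=U_1\,\mathrm{diag}(\sigma(U^\top ZV))\,V_1^\top,$$

where $U_1$, $V_1$ are orthogonal matrices, and $\sigma(U^\top ZV)$ stands for the vector comprised of the singular values of $U^\top ZV$. Let $T_0$ be the set composed of the locations of the $r$ largest magnitudes of elements of $\sigma(X)$. We adopt a technique similar to [6] to partition $\sigma_{T_0^c}(U^\top ZV)$ into a sum of vectors $\sigma_{T_i}(U^\top ZV)$, $i\geq1$, where $T_1$ is the set composed of the locations of the $ar$ largest magnitudes of entries of $\sigma_{T_0^c}(U^\top ZV)$, $T_2$ is the set composed of the locations of the second $ar$ largest magnitudes of entries of $\sigma_{T_0^c}(U^\top ZV)$, and so forth (except possibly the last set). Then $Z=\sum_{i\geq0}Z_{T_i}$, where $Z_{T_i}$ is the part of $Z$ carrying the singular values indexed by $T_i$. One can easily verify that $Z_{T_i}^\top Z_{T_j}=0$ and $Z_{T_i}Z_{T_j}^\top=0$ for all $i\neq j$, and $\mathrm{rank}(Z_{T_0})\leq r$, $\mathrm{rank}(Z_{T_i})\leq ar$ for $i\geq1$. For simplicity, denote $T_{01}=T_0\cup T_1$. Then, we have (see (22) in [24] and the corresponding lemma in [27])

$$\|Z_{T_0^c}\|_p^p\leq\|Z_{T_0}\|_p^p+2\|X_{[r]^c}\|_p^p. \tag{3.4}$$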

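By the decomposition of $Z$, for each entry index $l$ of $\sigma_{T_i}(U^\top ZV)$ with $i\geq2$, it follows that

$$\left(\sigma_{T_i}(U^\top ZV)[l]\right)^p\leq\frac{\sum_{k=1}^{ar}\left(\sigma_{T_{i-1}}(U^\top ZV)[k]\right)^p}{ar}=\frac{\|\sigma_{T_{i-1}}(U^\top ZV)\|_p^p}{ar}=\frac{\|Z_{T_{i-1}}\|_p^p}{ar}, \tag{3.5}$$

which yields

$$\|Z_{T_i}\|_F^2\leq(ar)^{1-\frac{2}{p}}\|Z_{T_{i-1}}\|_p^2. \tag{3.6}$$

Thereby,

$$\|Z_{T_i}\|_F^p\leq(ar)^{\frac{p}{2}-1}\|Z_{T_{i-1}}\|_p^p. \tag{3.7}$$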
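The passage from (3.5) to (3.7) can be spelled out as follows. This is a sketch of the intermediate step, assuming each $T_i$ with $i\geq1$ indexes at most $ar$ singular values and writing $\sigma_l$ as shorthand (introduced here) for the $l$-th entry of $\sigma_{T_i}(U^\top ZV)$.

```latex
\begin{align*}
\sigma_l^{p} \le \frac{\|Z_{T_{i-1}}\|_p^p}{ar}
  \;&\Longrightarrow\; \sigma_l^{2} \le (ar)^{-\frac{2}{p}}\,\|Z_{T_{i-1}}\|_p^{2}, \\
\|Z_{T_i}\|_F^{2} = \sum_{l \in T_i} \sigma_l^{2}
  \;&\le\; ar\,(ar)^{-\frac{2}{p}}\,\|Z_{T_{i-1}}\|_p^{2}
  = (ar)^{1-\frac{2}{p}}\,\|Z_{T_{i-1}}\|_p^{2}, \\
\|Z_{T_i}\|_F^{p} = \left(\|Z_{T_i}\|_F^{2}\right)^{\frac{p}{2}}
  \;&\le\; (ar)^{\frac{p}{2}-1}\,\|Z_{T_{i-1}}\|_p^{p}.
\end{align*}
```

The first line raises (3.5) to the power $2/p$, the second sums over the at most $ar$ entries of $T_i$, and the third raises the result to the power $p/2$.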

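Notice that $Z_{T_i}^\top Z_{T_j}=0$ and $Z_{T_i}Z_{T_j}^\top=0$ for all $i\neq j$. By Lemma 3.3 and (3.7), we then get

$$\sum_{i\geq2}\|Z_{T_i}\|_F^p\leq(ar)^{\frac{p}{2}-1}\sum_{i\geq2}\|Z_{T_{i-1}}\|_p^p=(ar)^{\frac{p}{2}-1}\|Z_{T_0^c}\|_p^p. \tag{3.8}$$

By the inequality $\|Z_{T_0}\|_F\leq\|Z_{T_{01}}\|_F$ and Hölder's inequality, we get

$$\|Z_{T_0}\|_p^p\leq r^{1-\frac{p}{2}}\|Z_{T_{01}}\|_F^p. \tag{3.9}$$

From (3.4), (3.8), (3.9) and the inequality $\left(\sum_i x_i\right)^q\leq\sum_i x_i^q$, valid for nonnegative $x_i$ and every fixed $0<q\leq1$, it follows that

$$\|Z_{T_{01}^c}\|_F^p=\Big(\sum_{i\geq2}\|Z_{T_i}\|_F^2\Big)^{\frac{p}{2}}\leq\sum_{i\geq2}\|Z_{T_i}\|_F^p\leq(ar)^{\frac{p}{2}-1}\left(r^{1-\frac{p}{2}}\|Z_{T_{01}}\|_F^p+2\|X_{[r]^c}\|_p^p\right). \tag{3.10}$$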

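Since

$$\begin{aligned}\|\hat{\mathcal{A}}(Z_{T_{01}})\|_2^2&=\langle\hat{\mathcal{A}}(Z_{T_{01}}),\hat{\mathcal{A}}(Z_{T_{01}})\rangle\\&=\langle\hat{\mathcal{A}}(Z_{T_{01}}),\hat{\mathcal{A}}(Z)\rangle-\Big\langle\hat{\mathcal{A}}(Z_{T_{01}}),\sum_{i\geq2}\hat{\mathcal{A}}(Z_{T_i})\Big\rangle\\&\leq\|\hat{\mathcal{A}}(Z_{T_{01}})\|_2\|\hat{\mathcal{A}}(Z)\|_2+\sum_{i\geq2}\left|\langle\hat{\mathcal{A}}(Z_{T_{01}}),\hat{\mathcal{A}}(Z_{T_i})\rangle\right|,\end{aligned} \tag{3.11}$$

we get

$$\|\hat{\mathcal{A}}(Z_{T_{01}})\|_2^{2p}\overset{(a)}{\leq}\|\hat{\mathcal{A}}(Z_{T_{01}})\|_2^p\|\hat{\mathcal{A}}(Z)\|_2^p+\sum_{i\geq2}\left|\langle\hat{\mathcal{A}}(Z_{T_{01}}),\hat{\mathcal{A}}(Z_{T_i})\rangle\right|^p, \tag{3.12}$$

where (a) follows from the fact that $(x+y)^p\leq x^p+y^p$ for nonnegative $x$ and $y$.

Additionally, by the minimality of $X^*$ and the triangle inequality, we get

$$\|\hat{\mathcal{A}}(Z)\|_2\leq\|\hat{y}-\hat{\mathcal{A}}(X)\|_2+\|\hat{y}-\hat{\mathcal{A}}(X^*)\|_2\leq2\epsilon'_{\mathcal{A},r,y}. \tag{3.13}$$

Since $Z_{T_0}$ is $r$-rank and $Z_{T_1}$ is $ar$-rank, $Z_{T_{01}}$ is $(a+1)r$-rank. Applying the RIP of $\hat{\mathcal{A}}$ and combining it with (3.12) and (3.13), we get

$$\|\hat{\mathcal{A}}(Z_{T_{01}})\|_2^{2p}\leq\left(2\epsilon'_{\mathcal{A},r,y}\right)^p\left(1+\hat{\delta}_{(a+1)r}\right)^{\frac{p}{2}}\|Z_{T_{01}}\|_F^p+\sum_{i\geq2}\left|\langle\hat{\mathcal{A}}(Z_{T_{01}}),\hat{\mathcal{A}}(Z_{T_i})\rangle\right|^p. \tag{3.14}$$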

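Because $\langle Z_{T_{01}},Z_{T_i}\rangle=0$ for all $i\geq2$, and $Z_{T_i}$ is $ar$-rank, by Lemma 3.2 and (3.10), we get

$$\begin{aligned}\|\hat{\mathcal{A}}(Z_{T_{01}})\|_2^{2p}&\leq\left(2\epsilon'_{\mathcal{A},r,y}\right)^p\left(1+\hat{\delta}_{(a+1)r}\right)^{\frac{p}{2}}\|Z_{T_{01}}\|_F^p+\left(\hat{\delta}_{(a+1)r}\|Z_{T_0}\|_F+\hat{\delta}_{2ar}\|Z_{T_1}\|_F\right)^p\sum_{i\geq2}\|Z_{T_i}\|_F^p\\&\leq\left(2\epsilon'_{\mathcal{A},r,y}\right)^p\left(1+\hat{\delta}_{(a+1)r}\right)^{\frac{p}{2}}\|Z_{T_{01}}\|_F^p\\&\quad+\left(\hat{\delta}_{(a+1)r}\|Z_{T_0}\|_F+\hat{\delta}_{2ar}\|Z_{T_1}\|_F\right)^p(ar)^{\frac{p}{2}-1}\left(r^{1-\frac{p}{2}}\|Z_{T_{01}}\|_F^p+2\|X_{[r]^c}\|_p^p\right).\end{aligned} \tag{3.15}$$

From (2.4), one can easily check that

$$a^{\frac{p}{2}-1}\left(\hat{\delta}_{(a+1)r}^2+\hat{\delta}_{2ar}^2\right)^{\frac{p}{2}}<\left(1-\hat{\delta}_{(a+1)r}\right)^p. \tag{3.16}$$

By (3.15), (3.16), the lower RIP bound $\left(1-\hat{\delta}_{(a+1)r}\right)\|Z_{T_{01}}\|_F^2\leq\|\hat{\mathcal{A}}(Z_{T_{01}})\|_2^2$ and the Cauchy–Schwarz inequality, one can get

$$\begin{aligned}\|Z_{T_{01}}\|_F^p&\leq\frac{2^p\left(1+a^{\frac{p}{2}-1}\right)\left(1+\hat{\delta}_{(a+1)r}\right)^{\frac{p}{2}}}{\left(1-\hat{\delta}_{(a+1)r}\right)^p-a^{\frac{p}{2}-1}\left(\hat{\delta}_{(a+1)r}^2+\hat{\delta}_{2ar}^2\right)^{\frac{p}{2}}}\left(\epsilon'_{\mathcal{A},r,y}\right)^p\\&\quad+\frac{2a^{\frac{p}{2}-1}\left(\hat{\delta}_{(a+1)r}^2+\hat{\delta}_{2ar}^2\right)^{\frac{p}{2}}}{\left(1-\hat{\delta}_{(a+1)r}\right)^p-a^{\frac{p}{2}-1}\left(\hat{\delta}_{(a+1)r}^2+\hat{\delta}_{2ar}^2\right)^{\frac{p}{2}}}\cdot\frac{\|X_{[r]^c}\|_p^p}{r^{1-\frac{p}{2}}}\\&=:\beta\left(\epsilon'_{\mathcal{A},r,y}\right)^p+\gamma\frac{\|X_{[r]^c}\|_p^p}{r^{1-\frac{p}{2}}},\end{aligned} \tag{3.17}$$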

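consequently,

$$\|Z_{T_0}\|_p^p\leq r^{1-\frac{p}{2}}\|Z_{T_0}\|_F^p\leq\beta r^{1-\frac{p}{2}}\left(\epsilon'_{\mathcal{A},r,y}\right)^p+\gamma\|X_{[r]^c}\|_p^p. \tag{3.18}$$

Thus, from (3.10) and (3.17), we get

$$\|Z\|_F^p\leq\|Z_{T_{01}}\|_F^p+\|Z_{T_{01}^c}\|_F^p\leq C_1\left(\epsilon'_{\mathcal{A},r,y}\right)^p+C_2\frac{\|X_{[r]^c}\|_p^p}{r^{1-\frac{p}{2}}}; \tag{3.19}$$

in addition, combining (3.4) and (3.18), one can get

$$\|Z\|_p^p\leq\|Z_{T_0}\|_p^p+\|Z_{T_0^c}\|_p^p\leq C'_1r^{1-\frac{p}{2}}\left(\epsilon'_{\mathcal{A},r,y}\right)^p+C'_2\|X_{[r]^c}\|_p^p, \tag{3.20}$$

where the constants $C_1$, $C_2$, $C'_1$ and $C'_2$ are defined in Theorem 2.1. The proof is complete. ∎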

## 4 Numerical experiments

In this section, we carry out numerical experiments to verify our theoretical results. All experiments are implemented in MATLAB 2016a running on a PC with an Intel Core i7 processor (3.6 GHz) and 8 GB RAM. To solve the completely perturbed nonconvex Schatten $p$-minimization model, we employ the alternating direction method of multipliers (ADMM), which is often applied in compressed sensing and sparse approximation [30], [31], [32], [33]. The constrained optimization problem (1.7) can be transformed into an equivalent unconstrained form

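$$\min_{\tilde{Z}\in\mathbb{R}^{m\times n}}\lambda\|\tilde{Z}\|_p^p+\frac{1}{2}\|\hat{A}\,\mathrm{vec}(\tilde{Z})-\hat{y}\|_2^2, \tag{4.1}$$

where $\lambda>0$ is a regularization parameter and $\mathrm{vec}(\tilde{Z})$ represents the vectorization of $\tilde{Z}$. Hence, the matrix $\hat{A}$ represents the linear map $\hat{\mathcal{A}}$. Then, introducing an auxiliary variable $W\in\mathbb{R}^{m\times n}$, the problem (4.1) can be equivalently turned into

$$\min_{W,\,\tilde{Z}\in\mathbb{R}^{m\times n}}\lambda\|W\|_p^p+\frac{1}{2}\|\hat{A}\,\mathrm{vec}(\tilde{Z})-\hat{y}\|_2^2\quad\text{s.t.}\quad\tilde{Z}=W. \tag{4.2}$$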
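For monitoring a solver, it can help to evaluate the unconstrained objective (4.1) directly. The following is a minimal sketch, written in Python rather than the MATLAB used for the paper's experiments; it assumes `A_hat` is the matrix form of the perturbed map acting on the column-stacked vectorization, and the function name and cutoff `tol` are illustrative.

```python
import numpy as np

def objective(Z, A_hat, y_hat, lam, p, tol=1e-12):
    """Value of (4.1): lam * ||Z||_p^p + 0.5 * ||A_hat vec(Z) - y_hat||_2^2."""
    sigma = np.linalg.svd(Z, compute_uv=False)
    sigma = sigma[sigma > tol]            # drop numerically zero singular values
    penalty = lam * float(np.sum(sigma ** p))
    residual = A_hat @ Z.reshape(-1, order="F") - y_hat  # column-stacking vec
    return penalty + 0.5 * float(residual @ residual)

# Toy check: with A_hat = I and y_hat = vec(Z) the data term vanishes,
# leaving lam * (1**0.5 + 4**0.5).
Z = np.diag([1.0, 4.0])
value = objective(Z, np.eye(4), Z.reshape(-1, order="F"), lam=0.1, p=0.5)
```

The same value is obtained from the split form (4.2) whenever the constraint $\tilde{Z}=W$ holds.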

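The augmented Lagrangian function is given by

$$L_\rho(\tilde{Z},W,Y)=\lambda\|W\|_p^p+\frac{1}{2}\|\hat{A}\,\mathrm{vec}(\tilde{Z})-\hat{y}\|_2^2+\langle Y,\tilde{Z}-W\rangle+\frac{\rho}{2}\|\tilde{Z}-W\|_F^2, \tag{4.3}$$

where $Y\in\mathbb{R}^{m\times n}$ is the dual variable, and $\rho>0$ is a penalty parameter. Then, ADMM applied to (4.3) consists of the following iterations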

 ~Zk+1 =argmin~Z12∥^Avec(~Z)−^y∥22+ρ2∥~Z−(Wk−Ykρ)∥2F, (4.4) Wk+1 =argminWλ∥W∥pp+ρ2∥~Zk+1−(W−Ykρ)∥2F, (4.5) Yk+1