# RIP-based performance guarantee for low-tubal-rank tensor recovery

The essential task of multi-dimensional data analysis focuses on tensor decomposition and the corresponding notion of rank. In this paper, by introducing the tensor singular value decomposition (t-SVD), we establish a regularized tensor nuclear norm minimization (RTNNM) model for low-tubal-rank tensor recovery. Many variants of the restricted isometry property (RIP) have proven to be crucial frameworks and analysis tools for the recovery of sparse vectors and low-rank tensors, so we define a novel tensor restricted isometry property (t-RIP) based on t-SVD. Our theoretical results show that any third-order tensor $\boldsymbol{\mathcal{X}}\in\mathbb{R}^{n_1\times n_2\times n_3}$ whose tubal rank is at most $r$ can be stably recovered from few measurements $\mathbf{y}=\boldsymbol{\mathcal{M}}(\boldsymbol{\mathcal{X}})+\mathbf{w}$ with a bounded noise constraint $\|\mathbf{w}\|_2\le\epsilon$ via the RTNNM model, if the linear map $\boldsymbol{\mathcal{M}}$ obeys the t-RIP with $\delta_{tr}^{\boldsymbol{\mathcal{M}}}<\sqrt{\frac{t-1}{n_3^2+t-1}}$ for certain fixed $t>1$. When $n_3=1$, our condition coincides with Cai and Zhang's sharp result from 2013 for low-rank matrix recovery via constrained nuclear norm minimization. As far as the authors are aware, a result of this kind has not previously been reported in the literature.

## 1 Introduction

Tensor models can make full use of multi-linear structure, and their use in place of traditional matrix-based models for analyzing multi-dimensional data (tensor data) has attracted wide attention. Low-rank tensor recovery is a representative problem: it is not only a natural mathematical generalization of compressed sensing and low-rank matrix recovery, but also underlies many applications that reconstruct intrinsically multi-dimensional data, including signal processing Liu2013Tensor ; Romera2013Multilinear , data mining M2011Applications , and many others Lu2018Tensor ; rauhut2017low ; Zhang2014Novel ; Lu2018Exact ; shi2013guarantees .

The purpose of low-rank tensor recovery is to reconstruct a low-rank tensor $\boldsymbol{\mathcal{X}}\in\mathbb{R}^{n_1\times n_2\times n_3}$ (this article considers only third-order tensors, without loss of generality) from noisy linear measurements $\mathbf{y}=\boldsymbol{\mathcal{M}}(\boldsymbol{\mathcal{X}})+\mathbf{w}$, where $\boldsymbol{\mathcal{M}}:\mathbb{R}^{n_1\times n_2\times n_3}\to\mathbb{R}^{m}$ is a random map with i.i.d. Gaussian entries and $\mathbf{w}\in\mathbb{R}^{m}$ is a vector of measurement errors. To be specific, we consider the following rank minimization problem

$$\min_{\boldsymbol{\mathcal{X}}\in\mathbb{R}^{n_1\times n_2\times n_3}} \operatorname{rank}(\boldsymbol{\mathcal{X}}),\quad \text{s.t.}\quad \|\mathbf{y}-\boldsymbol{\mathcal{M}}(\boldsymbol{\mathcal{X}})\|_2\le\epsilon, \tag{1}$$

where $\epsilon$ is a positive constant. The key to the low-rank tensor recovery problem is how to define the rank of a tensor. Unlike in the matrix case, there are different notions of tensor rank, induced by different tensor decompositions. Two classical decomposition strategies can be regarded as higher-order extensions of the matrix SVD: the CANDECOMP/PARAFAC (CP) decomposition kiers2000towards and the Tucker decomposition Tucker1966Some . The induced tensor ranks are called the CP rank and the Tucker rank, respectively. The Tucker decomposition is the most widely used decomposition method at present. Problem (1) is non-convex and NP-hard regardless of the choice of tensor decomposition. Based on the Tucker decomposition, the following convex surrogate model Liu2013Tensor has been studied:

$$\min_{\boldsymbol{\mathcal{X}}\in\mathbb{R}^{n_1\times n_2\times n_3}} \|\boldsymbol{\mathcal{X}}\|_{SNN},\quad \text{s.t.}\quad \|\mathbf{y}-\boldsymbol{\mathcal{M}}(\boldsymbol{\mathcal{X}})\|_2\le\epsilon, \tag{2}$$

where $\|\boldsymbol{\mathcal{X}}\|_{SNN}:=\sum_{i=1}^{3}\|\mathbf{X}_{\{i\}}\|_*$ is referred to as the Sum of Nuclear Norms (SNN), $\mathbf{X}_{\{i\}}$ denotes the mode-$i$ matricization of $\boldsymbol{\mathcal{X}}$, and $\|\mathbf{X}_{\{i\}}\|_*$ is the trace norm of the matrix $\mathbf{X}_{\{i\}}$. This popular approach (2), however, has its limitations. Firstly, the Tucker decomposition is highly non-unique. Secondly, SNN is not the convex envelope of $\sum_{i}\operatorname{rank}(\mathbf{X}_{\{i\}})$, so model (2) can be substantially suboptimal. Thirdly, the definition of SNN is inconsistent with the matrix case, so the existing analysis templates for low-rank matrix recovery cannot be generalized to low-rank tensor recovery.

More recently, based on the tensor-tensor product (t-product) and the tensor singular value decomposition (t-SVD) Kilmer2011Factorization ; Martin2013An ; Oseledets2011Tensor , which enjoy many properties similar to the matrix case, Kilmer et al. proposed the tensor multi-rank and tubal rank kilmer2013third (see Definition 2.8), and Semerci et al. developed a new tensor nuclear norm (TNN) Semerci2014Tensor . Continuing in this vein, Lu et al. gave a new and rigorous definition of the tensor average rank $\operatorname{rank}_a(\boldsymbol{\mathcal{X}})$ Lu2018Tensor (see Definition 2.9) and of the tensor nuclear norm $\|\boldsymbol{\mathcal{X}}\|_*$ Lu2018Tensor (see Definition 2.10), and proved that the convex envelope of $\operatorname{rank}_a(\boldsymbol{\mathcal{X}})$ is $\|\boldsymbol{\mathcal{X}}\|_*$ within the unit ball of the tensor spectral norm Lu2018Tensor . Furthermore, they pointed out that the low average rank assumption is weaker than the low CP rank and low Tucker rank assumptions, and that a tensor always has low average rank if it has low tubal rank induced by t-SVD. Since this computable tensor nuclear norm addresses the shortcomings of SNN, a convex tensor nuclear norm minimization (TNNM) model based on the low tubal rank assumption has been proposed in Lu2018Tensor , which solves

$$\min_{\boldsymbol{\mathcal{X}}\in\mathbb{R}^{n_1\times n_2\times n_3}} \|\boldsymbol{\mathcal{X}}\|_*,\quad \text{s.t.}\quad \|\mathbf{y}-\boldsymbol{\mathcal{M}}(\boldsymbol{\mathcal{X}})\|_2\le\epsilon, \tag{3}$$

where the tensor nuclear norm $\|\boldsymbol{\mathcal{X}}\|_*$ serves as the convex surrogate of the tensor average rank $\operatorname{rank}_a(\boldsymbol{\mathcal{X}})$. To facilitate algorithm design and meet the needs of practical applications, instead of the constrained TNNM (3), in this paper we present a theoretical analysis of the regularized tensor nuclear norm minimization (RTNNM) model, which takes the form

$$\min_{\boldsymbol{\mathcal{X}}\in\mathbb{R}^{n_1\times n_2\times n_3}} \|\boldsymbol{\mathcal{X}}\|_*+\frac{1}{2\lambda}\|\mathbf{y}-\boldsymbol{\mathcal{M}}(\boldsymbol{\mathcal{X}})\|_2^2, \tag{4}$$

where $\lambda$ is a positive parameter. According to zhang2016one , for any noise bound $\epsilon$ there exists a $\lambda>0$ such that the solution to the regularization problem (4) also solves the constrained problem (3), and vice versa. However, model (4) is more commonly used than model (3) when the noise level is not given or cannot be accurately estimated. There are many examples of solving the RTNNM problem (4) based on the tensor nuclear norm heuristic. For instance, by exploiting the t-SVD, Semerci et al. Semerci2014Tensor developed a tensor nuclear norm regularizer that can be handled by the alternating direction method of multipliers (ADMM) boyd2011distributed . Analogously, Lu et al. Lu2018Tensor and Zhang et al. Zhang2014Novel used the tensor nuclear norm to replace the tubal rank for low-rank tensor recovery from incomplete tensors (tensor completion) and for tensor robust principal component analysis (TRPCA). Both problems can be regarded as special cases of (4), and ADMM can also be applied to solve them. While applications of and algorithms for (4) are already well developed, only a few theoretical results on performance guarantees for low-tubal-rank tensor recovery are available so far. The restricted isometry property (RIP) introduced by Candès and Tao candes2005Decoding is one of the most widely used frameworks in sparse vector/low-rank matrix recovery. In this paper, we generalize the RIP to the tensor case based on t-SVD, with the aim of filling this gap in the study of low-tubal-rank tensor recovery.

Different tensor decompositions induce different notions of tensor rank, and hence also different notions of the tensor RIP. For example, in 2013, based on the Tucker decomposition Tucker1966Some , Shi et al. defined a tensor RIP shi2013guarantees as follows:

###### Definition 1.1.

Let $\mathfrak{S}_{(r_1,r_2,r_3)}$ denote the set of tensors $\boldsymbol{\mathcal{X}}\in\mathbb{R}^{n_1\times n_2\times n_3}$ whose Tucker rank is at most $(r_1,r_2,r_3)$ componentwise. The RIP constant $\delta_{(r_1,r_2,r_3)}$ of the linear operator $\boldsymbol{\mathcal{F}}$ is the smallest value such that

$$(1-\delta_{(r_1,r_2,r_3)})\|\boldsymbol{\mathcal{X}}\|_F^2\le\|\boldsymbol{\mathcal{F}}(\boldsymbol{\mathcal{X}})\|_2^2\le(1+\delta_{(r_1,r_2,r_3)})\|\boldsymbol{\mathcal{X}}\|_F^2$$

holds for all tensors $\boldsymbol{\mathcal{X}}\in\mathfrak{S}_{(r_1,r_2,r_3)}$.

Their theoretical results show that a tensor $\boldsymbol{\mathcal{X}}$ with Tucker rank $(r_1,r_2,r_3)$ can be exactly recovered in the noiseless case if $\boldsymbol{\mathcal{F}}$ satisfies the RIP with a sufficiently small constant; the precise bound is given in shi2013guarantees . This is the first work to extend RIP-based results from sparse vector recovery to the tensor case. In addition, in 2016, Rauhut et al. rauhut2017low induced three notions of the tensor RIP by utilizing the higher order singular value decomposition (HOSVD), the tensor train format (TT), and the general hierarchical Tucker decomposition (HT). These decompositions can be considered variants of the Tucker decomposition, whose uniqueness is not guaranteed, so all of the induced definitions of tensor RIP depend on a rank tuple, which differs greatly from the definition of matrix rank. In contrast, t-SVD is a higher-order tensor decomposition strategy with uniqueness and computability. So, based on t-SVD, we define a novel tensor restricted isometry property as follows:

###### Definition 1.2.

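(t-RIP) A linear map $\boldsymbol{\mathcal{M}}:\mathbb{R}^{n_1\times n_2\times n_3}\to\mathbb{R}^{m}$ is said to satisfy the t-RIP of order $r$ with tensor restricted isometry constant (t-RIC) $\delta_r^{\boldsymbol{\mathcal{M}}}$ if $\delta_r^{\boldsymbol{\mathcal{M}}}$ is the smallest value $\delta^{\boldsymbol{\mathcal{M}}}\in(0,1)$ such that

$$(1-\delta^{\boldsymbol{\mathcal{M}}})\|\boldsymbol{\mathcal{X}}\|_F^2\le\|\boldsymbol{\mathcal{M}}(\boldsymbol{\mathcal{X}})\|_2^2\le(1+\delta^{\boldsymbol{\mathcal{M}}})\|\boldsymbol{\mathcal{X}}\|_F^2 \tag{5}$$

holds for all tensors $\boldsymbol{\mathcal{X}}\in\mathbb{R}^{n_1\times n_2\times n_3}$ whose tubal rank is at most $r$.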
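To make inequality (5) concrete, the following minimal sketch samples the ratio $\|\mathbf{M}\operatorname{vec}(\boldsymbol{\mathcal{X}})\|_2^2/\|\boldsymbol{\mathcal{X}}\|_F^2$ over random tensors of tubal rank at most $r$. It is an illustration only: the matrix form of the map, the $N(0,1/m)$ scaling, the problem sizes, and the helper names are all assumptions, and random sampling can only give a lower bound on the true t-RIC, never a certificate.

```python
import numpy as np

def t_product(A, B):
    """t-product, computed slice by slice in the Fourier domain (third axis)."""
    Af = np.fft.fft(A, axis=2)
    Bf = np.fft.fft(B, axis=2)
    Cf = np.einsum("ijk,jlk->ilk", Af, Bf)
    return np.real(np.fft.ifft(Cf, axis=2))

def random_low_tubal_rank(n1, n2, n3, r, rng):
    """Tensor of tubal rank at most r, built as a t-product of Gaussian factors."""
    left = rng.standard_normal((n1, r, n3))
    right = rng.standard_normal((r, n2, n3))
    return t_product(left, right)

def isometry_ratios(M, shape, r, trials, rng):
    """Sample ||M vec(X)||_2^2 / ||X||_F^2 over random tubal-rank-r tensors."""
    ratios = np.empty(trials)
    for i in range(trials):
        X = random_low_tubal_rank(*shape, r, rng)
        ratios[i] = np.sum((M @ X.reshape(-1)) ** 2) / np.sum(X ** 2)
    return ratios

rng = np.random.default_rng(0)
shape, r, m = (8, 8, 3), 2, 400          # hypothetical problem sizes
n = int(np.prod(shape))
M = rng.standard_normal((m, n)) / np.sqrt(m)   # assumed N(0, 1/m) entries
ratios = isometry_ratios(M, shape, r, trials=50, rng=rng)
delta_lower = np.max(np.abs(ratios - 1.0))     # lower bound on the t-RIC only
print(f"empirical lower bound on the t-RIC: {delta_lower:.3f}")
```

Ratios close to 1 are consistent with (5) on the sampled tensors; certifying the t-RIP would require control over all tensors of tubal rank at most $r$, which sampling cannot provide.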

Our definition of the tensor RIP has the same form as the vector RIP candes2005Decoding and the matrix RIP Candes2011Tight . In other words, the vector RIP and the matrix RIP are low-dimensional versions of our t-RIP when $n_2=n_3=1$ and $n_3=1$, respectively, so some existing analysis tools and techniques can also be used in the tensor case. At the same time, existing theoretical results provide a useful reference. For constrained sparse vector/low-rank matrix recovery, different conditions on the restricted isometry constant (RIC) have been introduced and studied in the literature candes2008restricted ; Candes2011Tight ; Cai2013Sharp , etc. Among these sufficient conditions, Cai and Zhang Cai2013Sparse showed that for any given $t\ge 4/3$, the sharp vector RIC $\delta_{tk}<\sqrt{(t-1)/t}$ (matrix RIC $\delta_{tr}<\sqrt{(t-1)/t}$) ensures exact recovery in the noiseless case and stable recovery in the noisy case for $k$-sparse signals and matrices with rank at most $r$. In addition, Zhang and Li Zhang2018A obtained another part of the sharp condition, that is $\delta_{tk}<t/(4-t)$ with $0<t<4/3$. The results mentioned above are currently the best in the field. For unconstrained sparse vector recovery, as far as we know, Zhu Zhu2008Stable first studied this kind of problem in 2008 and pointed out that $k$-sparse signals can be recovered stably under an explicit RIC condition. Next, in 2015, Shen et al. Shen2015Stable obtained a sufficient condition under redundant tight frames. Recently, Ge et al. Ge2018Stable proved that $k$-sparse signals can be stably recovered when the noise vector satisfies a bounded noise constraint and the RIC satisfies a suitable condition. Although there is no similar result for unconstrained low-rank matrix recovery, the results presented in this paper also cover the matrix case when $n_3=1$.

Equipped with the t-RIP, in this paper we aim to construct sufficient conditions for stable low-tubal-rank tensor recovery and to obtain an upper bound on the error of solving (4). The rest of the paper is organized as follows. In Section 2, we introduce some notations and definitions. In Section 3, we give some key lemmas. In Section 4, our main result is presented. In Section 5, some numerical experiments are conducted to support our analysis. The conclusion is given in Section 6. Finally, Appendices A and B provide the proofs of Lemma 3.2 and Lemma 3.3, respectively.

## 2 Notations and Preliminaries

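We use lowercase letters for entries, e.g. $x$, boldface letters for vectors, e.g. $\mathbf{x}$, capitalized boldface letters for matrices, e.g. $\mathbf{X}$, and capitalized boldface calligraphic letters for tensors, e.g. $\boldsymbol{\mathcal{X}}$. For a third-order tensor $\boldsymbol{\mathcal{X}}$, $\boldsymbol{\mathcal{X}}(i,:,:)$, $\boldsymbol{\mathcal{X}}(:,i,:)$ and $\boldsymbol{\mathcal{X}}(:,:,i)$ represent the $i$th horizontal, lateral, and frontal slice. The frontal slice $\boldsymbol{\mathcal{X}}(:,:,i)$ can also be denoted as $\mathbf{X}^{(i)}$. The tube is denoted as $\boldsymbol{\mathcal{X}}(i,j,:)$. We denote the Frobenius norm as $\|\boldsymbol{\mathcal{X}}\|_F=\sqrt{\sum_{ijk}|x_{ijk}|^2}$. Some matrix norms are also needed. We denote by $\|\mathbf{X}\|_F=\|\boldsymbol{\sigma}(\mathbf{X})\|_2$ the Frobenius norm of $\mathbf{X}$ and by $\|\mathbf{X}\|_*=\sum_i\sigma_i(\mathbf{X})=\|\boldsymbol{\sigma}(\mathbf{X})\|_1$ the nuclear norm of $\mathbf{X}$, where the $\sigma_i(\mathbf{X})$ are the singular values of $\mathbf{X}$ and $\boldsymbol{\sigma}(\mathbf{X})$ represents the singular value vector of $\mathbf{X}$. Given a positive integer $\kappa$, we denote $[\kappa]=\{1,2,\dots,\kappa\}$ and $\Gamma^c=[\kappa]\setminus\Gamma$ for any $\Gamma\subset[\kappa]$.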

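For a third-order tensor $\boldsymbol{\mathcal{X}}\in\mathbb{R}^{n_1\times n_2\times n_3}$, let $\bar{\boldsymbol{\mathcal{X}}}$ be the discrete Fourier transform (DFT) along the third dimension of $\boldsymbol{\mathcal{X}}$, i.e., $\bar{\boldsymbol{\mathcal{X}}}=\operatorname{fft}(\boldsymbol{\mathcal{X}},[\,],3)$. Similarly, $\boldsymbol{\mathcal{X}}$ can be calculated from $\bar{\boldsymbol{\mathcal{X}}}$ by $\boldsymbol{\mathcal{X}}=\operatorname{ifft}(\bar{\boldsymbol{\mathcal{X}}},[\,],3)$. Let $\bar{\mathbf{X}}\in\mathbb{C}^{n_1n_3\times n_2n_3}$ be the block diagonal matrix with each block on the diagonal being a frontal slice $\bar{\mathbf{X}}^{(i)}$ of $\bar{\boldsymbol{\mathcal{X}}}$, i.e.,

$$\bar{\mathbf{X}}=\operatorname{bdiag}(\bar{\boldsymbol{\mathcal{X}}})=\begin{pmatrix}\bar{\mathbf{X}}^{(1)} & & & \\ & \bar{\mathbf{X}}^{(2)} & & \\ & & \ddots & \\ & & & \bar{\mathbf{X}}^{(n_3)}\end{pmatrix},$$

and let $\operatorname{bcirc}(\boldsymbol{\mathcal{X}})\in\mathbb{R}^{n_1n_3\times n_2n_3}$ be the block circular matrix, i.e.,

$$\operatorname{bcirc}(\boldsymbol{\mathcal{X}})=\begin{pmatrix}\mathbf{X}^{(1)} & \mathbf{X}^{(n_3)} & \cdots & \mathbf{X}^{(2)}\\ \mathbf{X}^{(2)} & \mathbf{X}^{(1)} & \cdots & \mathbf{X}^{(3)}\\ \vdots & \vdots & \ddots & \vdots\\ \mathbf{X}^{(n_3)} & \mathbf{X}^{(n_3-1)} & \cdots & \mathbf{X}^{(1)}\end{pmatrix}.$$

The operator $\operatorname{unfold}$ and its inverse operator $\operatorname{fold}$ are, respectively, defined as

$$\operatorname{unfold}(\boldsymbol{\mathcal{X}})=\begin{pmatrix}\mathbf{X}^{(1)}\\ \mathbf{X}^{(2)}\\ \vdots\\ \mathbf{X}^{(n_3)}\end{pmatrix},\qquad \operatorname{fold}(\operatorname{unfold}(\boldsymbol{\mathcal{X}}))=\boldsymbol{\mathcal{X}}.$$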

Then tensor-tensor product (t-product) between two third-order tensors can be defined as follows.

###### Definition 2.1.

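(t-product Kilmer2011Factorization ) For tensors $\boldsymbol{\mathcal{A}}\in\mathbb{R}^{n_1\times n_2\times n_3}$ and $\boldsymbol{\mathcal{B}}\in\mathbb{R}^{n_2\times n_4\times n_3}$, the t-product $\boldsymbol{\mathcal{A}}\star\boldsymbol{\mathcal{B}}$ is defined to be a tensor of size $n_1\times n_4\times n_3$,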

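$$\boldsymbol{\mathcal{A}}\star\boldsymbol{\mathcal{B}}=\operatorname{fold}(\operatorname{bcirc}(\boldsymbol{\mathcal{A}})\cdot\operatorname{unfold}(\boldsymbol{\mathcal{B}})).$$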
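The definition can be checked numerically. The sketch below implements `bcirc`, `unfold`, and `fold` literally and compares the result with the equivalent slice-wise product in the Fourier domain; the function names and test sizes are illustrative choices, not taken from the source.

```python
import numpy as np

def bcirc(A):
    """Block circulant matrix: block (i, j) is frontal slice (i - j) mod n3."""
    n1, n2, n3 = A.shape
    out = np.zeros((n1 * n3, n2 * n3))
    for i in range(n3):
        for j in range(n3):
            out[i * n1:(i + 1) * n1, j * n2:(j + 1) * n2] = A[:, :, (i - j) % n3]
    return out

def unfold(B):
    """Stack the frontal slices of B vertically."""
    return np.concatenate([B[:, :, k] for k in range(B.shape[2])], axis=0)

def fold(mat, n3):
    """Inverse of unfold: split a tall matrix back into n3 frontal slices."""
    rows = mat.shape[0] // n3
    return np.stack([mat[k * rows:(k + 1) * rows, :] for k in range(n3)], axis=2)

def t_product(A, B):
    """t-product exactly as in the definition."""
    return fold(bcirc(A) @ unfold(B), A.shape[2])

def t_product_fft(A, B):
    """Same product via the DFT along the third axis (circular convolution theorem)."""
    Af = np.fft.fft(A, axis=2)
    Bf = np.fft.fft(B, axis=2)
    return np.real(np.fft.ifft(np.einsum("ijk,jlk->ilk", Af, Bf), axis=2))

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3, 5))
B = rng.standard_normal((3, 2, 5))
C = t_product(A, B)
print(C.shape, np.allclose(C, t_product_fft(A, B)))
```

The Fourier-domain form avoids building the $n_1n_3\times n_2n_3$ block circulant matrix, which is why it is usually preferred in practice.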
###### Definition 2.2.

(Conjugate transpose Kilmer2011Factorization ) The conjugate transpose of a tensor $\boldsymbol{\mathcal{A}}$ of size $n_1\times n_2\times n_3$ is the $n_2\times n_1\times n_3$ tensor $\boldsymbol{\mathcal{A}}^*$ obtained by conjugate transposing each of the frontal slices and then reversing the order of transposed frontal slices 2 through $n_3$.

###### Definition 2.3.

(Identity tensor Kilmer2011Factorization ) The identity tensor $\boldsymbol{\mathcal{I}}\in\mathbb{R}^{n\times n\times n_3}$ is the tensor whose first frontal slice is the $n\times n$ identity matrix, and whose other frontal slices are all zeros.

###### Definition 2.4.

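(Orthogonal tensor Kilmer2011Factorization ) A tensor $\boldsymbol{\mathcal{Q}}\in\mathbb{R}^{n\times n\times n_3}$ is orthogonal if it satisfies $\boldsymbol{\mathcal{Q}}^*\star\boldsymbol{\mathcal{Q}}=\boldsymbol{\mathcal{Q}}\star\boldsymbol{\mathcal{Q}}^*=\boldsymbol{\mathcal{I}}$.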

###### Definition 2.5.

(F-diagonal tensor Kilmer2011Factorization ) A tensor is called F-diagonal if each of its frontal slices is a diagonal matrix.

###### Theorem 2.6.

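(t-SVD Kilmer2011Factorization ) Let $\boldsymbol{\mathcal{X}}\in\mathbb{R}^{n_1\times n_2\times n_3}$. The t-SVD factorization of the tensor $\boldsymbol{\mathcal{X}}$ is

$$\boldsymbol{\mathcal{X}}=\boldsymbol{\mathcal{U}}\star\boldsymbol{\mathcal{S}}\star\boldsymbol{\mathcal{V}}^*,$$

where $\boldsymbol{\mathcal{U}}\in\mathbb{R}^{n_1\times n_1\times n_3}$ and $\boldsymbol{\mathcal{V}}\in\mathbb{R}^{n_2\times n_2\times n_3}$ are orthogonal, and $\boldsymbol{\mathcal{S}}\in\mathbb{R}^{n_1\times n_2\times n_3}$ is an F-diagonal tensor. Figure 1 illustrates the t-SVD factorization.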
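A common way to compute the t-SVD is to take a matrix SVD of every frontal slice in the Fourier domain and transform back. The sketch below follows that route for real tensors; it is a minimal illustration rather than the authors' implementation, and the helper names are hypothetical. Conjugate symmetry of the DFT is enforced explicitly so that the returned factors are real.

```python
import numpy as np

def t_product(A, B):
    """t-product via slice-wise products in the Fourier domain."""
    Af, Bf = np.fft.fft(A, axis=2), np.fft.fft(B, axis=2)
    return np.real(np.fft.ifft(np.einsum("ijk,jlk->ilk", Af, Bf), axis=2))

def t_transpose(A):
    """Transpose each frontal slice, then reverse the order of slices 2..n3."""
    At = np.transpose(A, (1, 0, 2))
    return np.concatenate([At[:, :, :1], At[:, :, :0:-1]], axis=2)

def t_svd(X):
    """t-SVD of a real tensor: X = U * S * V^T under the t-product."""
    n1, n2, n3 = X.shape
    k_min = min(n1, n2)
    Xf = np.fft.fft(X, axis=2)
    Uf = np.zeros((n1, n1, n3), dtype=complex)
    Sf = np.zeros((n1, n2, n3), dtype=complex)
    Vf = np.zeros((n2, n2, n3), dtype=complex)
    idx = np.arange(k_min)
    for k in range(n3 // 2 + 1):
        slice_k = Xf[:, :, k]
        if k == (n3 - k) % n3:        # self-conjugate slice: real up to round-off
            slice_k = slice_k.real
        U, s, Vh = np.linalg.svd(slice_k)
        Uf[:, :, k], Vf[:, :, k] = U, Vh.conj().T
        Sf[idx, idx, k] = s
    for k in range(n3 // 2 + 1, n3):  # remaining slices by conjugate symmetry
        Uf[:, :, k] = Uf[:, :, n3 - k].conj()
        Sf[:, :, k] = Sf[:, :, n3 - k]
        Vf[:, :, k] = Vf[:, :, n3 - k].conj()
    to_real = lambda T: np.real(np.fft.ifft(T, axis=2))
    return to_real(Uf), to_real(Sf), to_real(Vf)

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 3, 5))
U, S, V = t_svd(X)
X_rec = t_product(t_product(U, S), t_transpose(V))
print(np.allclose(X, X_rec))
```

The reconstruction check uses only the definitions above: `t_transpose` plays the role of the conjugate transpose for real tensors, and `U`, `V` are orthogonal in the sense of Definition 2.4.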

###### Remark 2.7.

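For $\kappa=\min(n_1,n_2)$, the t-SVD of $\boldsymbol{\mathcal{X}}$ can be written

$$\boldsymbol{\mathcal{X}}=\sum_{i=1}^{\kappa}\boldsymbol{\mathcal{U}}_{\boldsymbol{\mathcal{X}}}(:,i,:)\star\boldsymbol{\mathcal{S}}_{\boldsymbol{\mathcal{X}}}(i,i,:)\star\boldsymbol{\mathcal{V}}_{\boldsymbol{\mathcal{X}}}(:,i,:)^*.$$

The diagonal vector of the first frontal slice of $\boldsymbol{\mathcal{S}}_{\boldsymbol{\mathcal{X}}}$ is denoted as $\mathbf{s}_{\boldsymbol{\mathcal{X}}}$. The best $r$-term approximation of $\boldsymbol{\mathcal{X}}$ with tubal rank at most $r$ is denoted by

$$\boldsymbol{\mathcal{X}}_{\max(r)}=\arg\min_{\operatorname{rank}_t(\tilde{\boldsymbol{\mathcal{X}}})\le r}\|\boldsymbol{\mathcal{X}}-\tilde{\boldsymbol{\mathcal{X}}}\|_F=\sum_{i=1}^{r}\boldsymbol{\mathcal{U}}_{\boldsymbol{\mathcal{X}}}(:,i,:)\star\boldsymbol{\mathcal{S}}_{\boldsymbol{\mathcal{X}}}(i,i,:)\star\boldsymbol{\mathcal{V}}_{\boldsymbol{\mathcal{X}}}(:,i,:)^*,$$

and $\boldsymbol{\mathcal{X}}_{-\max(r)}=\boldsymbol{\mathcal{X}}-\boldsymbol{\mathcal{X}}_{\max(r)}$. In addition, for an index set $\Gamma$, we have

$$\boldsymbol{\mathcal{X}}_{\Gamma}=\sum_{i\in\Gamma}\boldsymbol{\mathcal{U}}_{\boldsymbol{\mathcal{X}}}(:,i,:)\star\boldsymbol{\mathcal{S}}_{\boldsymbol{\mathcal{X}}}(i,i,:)\star\boldsymbol{\mathcal{V}}_{\boldsymbol{\mathcal{X}}}(:,i,:)^*.$$

###### Definition 2.8.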

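(Tensor tubal rank kilmer2013third ) For $\boldsymbol{\mathcal{X}}\in\mathbb{R}^{n_1\times n_2\times n_3}$, the tensor tubal rank, denoted as $\operatorname{rank}_t(\boldsymbol{\mathcal{X}})$, is defined as the number of nonzero singular tubes of $\boldsymbol{\mathcal{S}}$, where $\boldsymbol{\mathcal{S}}$ is from the t-SVD of $\boldsymbol{\mathcal{X}}=\boldsymbol{\mathcal{U}}\star\boldsymbol{\mathcal{S}}\star\boldsymbol{\mathcal{V}}^*$. We can write

$$\operatorname{rank}_t(\boldsymbol{\mathcal{X}})=\sharp\{i:\boldsymbol{\mathcal{S}}(i,i,:)\ne\mathbf{0}\}=\sharp\{i:\boldsymbol{\mathcal{S}}(i,i,1)\ne 0\}.$$

###### Definition 2.9.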

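(Tensor average rank Lu2018Tensor ) For $\boldsymbol{\mathcal{X}}\in\mathbb{R}^{n_1\times n_2\times n_3}$, the tensor average rank, denoted as $\operatorname{rank}_a(\boldsymbol{\mathcal{X}})$, is defined as

$$\operatorname{rank}_a(\boldsymbol{\mathcal{X}})=\frac{1}{n_3}\operatorname{rank}(\operatorname{bcirc}(\boldsymbol{\mathcal{X}}))=\frac{1}{n_3}\operatorname{rank}(\operatorname{bdiag}(\bar{\boldsymbol{\mathcal{X}}})).$$

###### Definition 2.10.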

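(Tensor nuclear norm Lu2018Tensor ) Let $\boldsymbol{\mathcal{X}}=\boldsymbol{\mathcal{U}}\star\boldsymbol{\mathcal{S}}\star\boldsymbol{\mathcal{V}}^*$ be the t-SVD of $\boldsymbol{\mathcal{X}}\in\mathbb{R}^{n_1\times n_2\times n_3}$. The tensor nuclear norm of $\boldsymbol{\mathcal{X}}$ is defined as $\|\boldsymbol{\mathcal{X}}\|_*:=\sum_{i=1}^{r}\boldsymbol{\mathcal{S}}(i,i,1)$, where $r=\operatorname{rank}_t(\boldsymbol{\mathcal{X}})$.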

###### Proposition 2.11.

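For a third-order tensor $\boldsymbol{\mathcal{X}}\in\mathbb{R}^{n_1\times n_2\times n_3}$, we have the following properties:

$$\|\boldsymbol{\mathcal{X}}\|_F=\frac{1}{\sqrt{n_3}}\|\bar{\mathbf{X}}\|_F, \tag{6}$$

$$\|\boldsymbol{\mathcal{X}}\|_*=\frac{1}{n_3}\|\bar{\mathbf{X}}\|_*, \tag{7}$$

$$\operatorname{rank}(\bar{\mathbf{X}})\le n_3\operatorname{rank}_t(\boldsymbol{\mathcal{X}}). \tag{8}$$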
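Properties (6)-(8) can be checked on a random low-tubal-rank tensor. The sketch below works directly with the singular values of the Fourier-domain frontal slices, which are the diagonal blocks of $\bar{\mathbf{X}}$; the sizes, the tolerance `1e-8`, and the variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n1, n2, n3, r = 6, 5, 4, 2

# Low-tubal-rank tensor: t-product of Gaussian factors, formed in the Fourier domain.
Af = np.fft.fft(rng.standard_normal((n1, r, n3)), axis=2)
Bf = np.fft.fft(rng.standard_normal((r, n2, n3)), axis=2)
X = np.real(np.fft.ifft(np.einsum("ijk,jlk->ilk", Af, Bf), axis=2))

Xf = np.fft.fft(X, axis=2)
# Row k holds the singular values of the k-th diagonal block of bdiag(Xf).
sv_f = np.stack([np.linalg.svd(Xf[:, :, k], compute_uv=False) for k in range(n3)])

# Singular tubes S(i, i, :) are the inverse DFT of the slice-wise singular values.
S_tubes = np.real(np.fft.ifft(sv_f, axis=0))
tnn = S_tubes[0].sum()                       # sum_i S(i, i, 1), Definition 2.10

tol = 1e-8
tubal_rank = int(np.sum(sv_f.max(axis=0) > tol))   # nonzero singular tubes
rank_bdiag = int(np.sum(sv_f > tol))               # rank of the block diagonal matrix

fro_ok = np.isclose(np.linalg.norm(X), np.linalg.norm(Xf) / np.sqrt(n3))   # (6)
tnn_ok = np.isclose(tnn, sv_f.sum() / n3)                                   # (7)
rank_ok = rank_bdiag <= n3 * tubal_rank                                     # (8)
print(fro_ok, tnn_ok, rank_ok, tubal_rank)
```

Property (6) is Parseval's identity for the unnormalized DFT, and (7) holds because the first entry of an inverse DFT is the mean of the transformed values.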

## 3 Some Key Lemmas

We present the following lemmas, which will play a key role in proving our sufficient conditions for low-tubal-rank tensor recovery.

###### Lemma 3.1.

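Cai2013Sparse For a positive number $\phi$ and a positive integer $s$, define the polytope $T(\phi,s)\subset\mathbb{R}^n$ by

$$T(\phi,s)=\{\mathbf{v}\in\mathbb{R}^n:\|\mathbf{v}\|_\infty\le\phi,\ \|\mathbf{v}\|_1\le s\phi\}.$$

For any $\mathbf{v}\in\mathbb{R}^n$, define the set of sparse vectors $U(\phi,s,\mathbf{v})\subset\mathbb{R}^n$ by

$$U(\phi,s,\mathbf{v})=\{\mathbf{u}\in\mathbb{R}^n:\operatorname{supp}(\mathbf{u})\subseteq\operatorname{supp}(\mathbf{v}),\ \|\mathbf{u}\|_0\le s,\ \|\mathbf{u}\|_1=\|\mathbf{v}\|_1,\ \|\mathbf{u}\|_\infty\le\phi\}.$$

Then $\mathbf{v}\in T(\phi,s)$ if and only if $\mathbf{v}$ is in the convex hull of $U(\phi,s,\mathbf{v})$. In particular, any $\mathbf{v}\in T(\phi,s)$ can be expressed as

$$\mathbf{v}=\sum_{i=1}^{N}\gamma_i\mathbf{u}_i,$$

where $\mathbf{u}_i\in U(\phi,s,\mathbf{v})$ and $0\le\gamma_i\le 1$, $\sum_{i=1}^{N}\gamma_i=1$.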

This elementary technique introduced by T. Cai and A. Zhang Cai2013Sparse shows that any point in a polytope can be represented as a convex combination of sparse vectors and makes the analysis surprisingly simple.

The following lemma shows that a suitable t-RIP condition implies the robust null space property Foucart2014stability of the linear map $\boldsymbol{\mathcal{M}}$.

###### Lemma 3.2.

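Let the linear map $\boldsymbol{\mathcal{M}}$ satisfy the t-RIP of order $tr$ $(t>1)$ with t-RIC $\delta_{tr}^{\boldsymbol{\mathcal{M}}}\in(0,1)$. Then for any tensor $\boldsymbol{\mathcal{H}}\in\mathbb{R}^{n_1\times n_2\times n_3}$ and any subset $\Gamma\subset[\kappa]$ with $|\Gamma|=r$ and $\kappa=\min(n_1,n_2)$, it holds that

$$\|\boldsymbol{\mathcal{H}}_{\Gamma}\|_F\le\eta_1\|\boldsymbol{\mathcal{M}}(\boldsymbol{\mathcal{H}})\|_2+\frac{\eta_2}{\sqrt{r}}\|\boldsymbol{\mathcal{H}}_{\Gamma^c}\|_*, \tag{9}$$

where

$$\eta_1\triangleq\frac{2}{(1-\delta_{tr}^{\boldsymbol{\mathcal{M}}})\sqrt{1+\delta_{tr}^{\boldsymbol{\mathcal{M}}}}},\quad\text{and}\quad\eta_2\triangleq\frac{\sqrt{n_3}\,\delta_{tr}^{\boldsymbol{\mathcal{M}}}}{\sqrt{(1-(\delta_{tr}^{\boldsymbol{\mathcal{M}}})^2)(t-1)}}.$$

###### Proof.

See Appendix A.

In order to prove the main result, we still need the following lemma.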

###### Lemma 3.3.

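If the noisy measurements $\mathbf{y}=\boldsymbol{\mathcal{M}}(\boldsymbol{\mathcal{X}})+\mathbf{w}$ of the tensor $\boldsymbol{\mathcal{X}}\in\mathbb{R}^{n_1\times n_2\times n_3}$ are observed with noise level $\|\mathbf{w}\|_2\le\epsilon$, then for any subset $\Gamma\subset[\kappa]$ with $|\Gamma|=r$ and $\kappa=\min(n_1,n_2)$, the minimization solution $\hat{\boldsymbol{\mathcal{X}}}$ of (4) satisfies

$$\|\boldsymbol{\mathcal{M}}(\boldsymbol{\mathcal{H}})\|_2^2-2\epsilon\|\boldsymbol{\mathcal{M}}(\boldsymbol{\mathcal{H}})\|_2\le 2\lambda\left(\|\boldsymbol{\mathcal{H}}_{\Gamma}\|_*-\|\boldsymbol{\mathcal{H}}_{\Gamma^c}\|_*+2\|\boldsymbol{\mathcal{X}}_{\Gamma^c}\|_*\right), \tag{10}$$

and

$$\|\boldsymbol{\mathcal{H}}_{\Gamma^c}\|_*\le\|\boldsymbol{\mathcal{H}}_{\Gamma}\|_*+2\|\boldsymbol{\mathcal{X}}_{\Gamma^c}\|_*+\frac{\epsilon}{\lambda}\|\boldsymbol{\mathcal{M}}(\boldsymbol{\mathcal{H}})\|_2, \tag{11}$$

where $\boldsymbol{\mathcal{H}}\triangleq\hat{\boldsymbol{\mathcal{X}}}-\boldsymbol{\mathcal{X}}$.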
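One way to see how the two inequalities relate, sketched here under the assumption that (10) has already been established (the argument in Appendix B may be organized differently): since $\|\boldsymbol{\mathcal{M}}(\boldsymbol{\mathcal{H}})\|_2^2\ge 0$, dropping this term from the left-hand side of (10) gives

$$-2\epsilon\|\boldsymbol{\mathcal{M}}(\boldsymbol{\mathcal{H}})\|_2\le\|\boldsymbol{\mathcal{M}}(\boldsymbol{\mathcal{H}})\|_2^2-2\epsilon\|\boldsymbol{\mathcal{M}}(\boldsymbol{\mathcal{H}})\|_2\le 2\lambda\left(\|\boldsymbol{\mathcal{H}}_{\Gamma}\|_*-\|\boldsymbol{\mathcal{H}}_{\Gamma^c}\|_*+2\|\boldsymbol{\mathcal{X}}_{\Gamma^c}\|_*\right).$$

Dividing by $2\lambda>0$ and moving $\|\boldsymbol{\mathcal{H}}_{\Gamma^c}\|_*$ to the left-hand side yields

$$\|\boldsymbol{\mathcal{H}}_{\Gamma^c}\|_*\le\|\boldsymbol{\mathcal{H}}_{\Gamma}\|_*+2\|\boldsymbol{\mathcal{X}}_{\Gamma^c}\|_*+\frac{\epsilon}{\lambda}\|\boldsymbol{\mathcal{M}}(\boldsymbol{\mathcal{H}})\|_2,$$

which is exactly (11).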

## 4 Main Results

With preparations above, now we present our main result.

###### Theorem 4.1.

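For any observed vector $\mathbf{y}=\boldsymbol{\mathcal{M}}(\boldsymbol{\mathcal{X}})+\mathbf{w}$ of the tensor $\boldsymbol{\mathcal{X}}\in\mathbb{R}^{n_1\times n_2\times n_3}$ corrupted by an unknown noise $\mathbf{w}$ with bounded constraint $\|\mathbf{w}\|_2\le\epsilon$, if $\boldsymbol{\mathcal{M}}$ satisfies the t-RIP with

$$\delta_{tr}^{\boldsymbol{\mathcal{M}}}<\sqrt{\frac{t-1}{n_3^2+t-1}} \tag{12}$$

for certain $t>1$, then we have

$$\|\boldsymbol{\mathcal{M}}(\hat{\boldsymbol{\mathcal{X}}}-\boldsymbol{\mathcal{X}})\|_2\le C_1\|\boldsymbol{\mathcal{X}}_{-\max(r)}\|_*+C_2, \tag{13}$$

and

$$\|\hat{\boldsymbol{\mathcal{X}}}-\boldsymbol{\mathcal{X}}\|_F\le C_3\|\boldsymbol{\mathcal{X}}_{-\max(r)}\|_*+C_4, \tag{14}$$

where $\hat{\boldsymbol{\mathcal{X}}}$ is the solution to (4), and $C_i$, $i=1,2,3,4$, are given by

$$C_1=\frac{2}{\sqrt{r}\,\eta_1},\qquad C_2=2\sqrt{r}\,\eta_1\lambda+2\epsilon,$$

$$C_3=\frac{2\sqrt{r}\,\eta_1(2\sqrt{n_3r}+1+\eta_2)\lambda+2(\sqrt{n_3r}+\eta_2)\epsilon}{r\,\eta_1(1-\eta_2)\lambda},$$

$$C_4=\frac{(\sqrt{n_3r}+1)\eta_1\lambda+(\sqrt{n_3r}-\sqrt{n_3}\,\eta_2+\sqrt{n_3}+1)\epsilon}{(1-\eta_2)\lambda\,(2\sqrt{r}\,\eta_1\lambda+2\epsilon)^{-1}}.$$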

###### Remark 4.2.

We note that the obtained t-RIC condition (12) is related to the length $n_3$ of the third dimension. This is because the discrete Fourier transform (DFT) is performed along the third dimension of $\boldsymbol{\mathcal{X}}$. Further, we stress that this crucial quantity $n_3$ is rigorously deduced from the t-product and makes the tensor result consistent with the matrix case. For general problems, let $n_3$ be the smallest size of the three modes of the third-order tensor, e.g. $n_3=3$ for the third-order tensor from a color image, where the three frontal slices correspond to the R, G, B channels, or the depth mode for tensor data in 3-D face detection, which has column, row, and depth modes. In particular, when $n_3=1$, our model (4) reduces to low-rank matrix recovery and condition (12) degenerates to $\delta_{tr}^{\boldsymbol{\mathcal{M}}}<\sqrt{(t-1)/t}$, which has also been proved to be sharp by Cai et al. Cai2013Sharp for stable recovery via constrained nuclear norm minimization for $t\ge 4/3$. For unconstrained low-rank matrix recovery, the degenerated sufficient condition for $t>1$ and the error upper bound estimate have been derived in our previous work wang2019low . To the best of our knowledge, results like our Theorem 4.1 have not previously been reported in the literature.
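The dependence of condition (12) on $n_3$ is easy to tabulate. The helper below is a hypothetical convenience function, not part of the paper; it evaluates the right-hand side of (12) and shows the reduction to $\sqrt{(t-1)/t}$ when $n_3=1$.

```python
import math

def tric_bound(t, n3):
    """Right-hand side of condition (12): sqrt((t - 1) / (n3^2 + t - 1))."""
    if t <= 1:
        raise ValueError("condition (12) is stated for t > 1")
    if n3 < 1:
        raise ValueError("n3 must be a positive integer")
    return math.sqrt((t - 1) / (n3 ** 2 + t - 1))

for n3 in (1, 3):
    print(n3, round(tric_bound(2.0, n3), 4))
# n3 = 1 gives sqrt(1/2) ~ 0.7071 (the matrix case); n3 = 3 gives sqrt(1/10) ~ 0.3162
```

For a fixed $t$, the admissible range of the t-RIC shrinks as $n_3$ grows, which is consistent with the remark that the bound is tied to the length of the third dimension.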

###### Remark 4.3.

Theorem 4.1 not only offers a sufficient condition for stably recovering the tensor $\boldsymbol{\mathcal{X}}$ by solving (4), but also provides an upper bound on the error of recovering $\boldsymbol{\mathcal{X}}$ via the RTNNM model. This result clearly depicts the relationship among the reconstruction error, the best $r$-term approximation, the noise level $\epsilon$ and $\lambda$. Some special cases of Theorem 4.1 are worth studying. For example, one can associate the $\ell_2$-norm bounded noise level $\epsilon$ with the trade-off parameter $\lambda$ (such as $\epsilon=\lambda/2$) as in Ge2018Stable ; Shen2015Stable ; Candes2011Tight . This case is summarized in Corollary 4.4. Notice that we can take a $\lambda$ close to zero such that $\tilde{C}_2\lambda$ and $\tilde{C}_4\lambda$ in (16), (17) are close to zero for the noise-free case $\mathbf{w}=\mathbf{0}$. Then Corollary 4.4 shows that the tensor $\boldsymbol{\mathcal{X}}$ can be approximately recovered by solving (4) if $\|\boldsymbol{\mathcal{X}}_{-\max(r)}\|_*$ is small.

###### Corollary 4.4.

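Suppose that the noisy measurements $\mathbf{y}=\boldsymbol{\mathcal{M}}(\boldsymbol{\mathcal{X}})+\mathbf{w}$ of the tensor $\boldsymbol{\mathcal{X}}\in\mathbb{R}^{n_1\times n_2\times n_3}$ are observed with noise level $\|\mathbf{w}\|_2\le\epsilon=\lambda/2$. If $\boldsymbol{\mathcal{M}}$ satisfies the t-RIP with

$$\delta_{tr}^{\boldsymbol{\mathcal{M}}}<\sqrt{\frac{t-1}{n_3^2+t-1}} \tag{15}$$

for certain $t>1$, then we have

$$\|\boldsymbol{\mathcal{M}}(\hat{\boldsymbol{\mathcal{X}}}-\boldsymbol{\mathcal{X}})\|_2\le\tilde{C}_1\|\boldsymbol{\mathcal{X}}_{-\max(r)}\|_*+\tilde{C}_2\lambda, \tag{16}$$

and

$$\|\hat{\boldsymbol{\mathcal{X}}}-\boldsymbol{\mathcal{X}}\|_F\le\tilde{C}_3\|\boldsymbol{\mathcal{X}}_{-\max(r)}\|_*+\tilde{C}_4\lambda, \tag{17}$$

where $\hat{\boldsymbol{\mathcal{X}}}$ is the solution to (4), and $\tilde{C}_i$, $i=1,2,3,4$, are given by

$$\tilde{C}_1=\frac{2}{\sqrt{r}\,\eta_1},\qquad\tilde{C}_2=2\sqrt{r}\,\eta_1+1,$$

$$\tilde{C}_3=\frac{2\sqrt{r}\,\eta_1(2\sqrt{n_3r}+1+\eta_2)+\sqrt{n_3r}+\eta_2}{r\,\eta_1(1-\eta_2)},$$

$$\tilde{C}_4=\frac{2(\sqrt{n_3r}+1)\eta_1+\sqrt{n_3r}-\sqrt{n_3}\,\eta_2+\sqrt{n_3}+1}{2(1-\eta_2)(2\sqrt{r}\,\eta_1+1)^{-1}}.$$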

## 5 Numerical experiments

In this section, we present several numerical experiments to corroborate our analysis.

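We use $\mathbf{y}=\mathbf{M}\operatorname{vec}(\boldsymbol{\mathcal{X}})+\mathbf{w}$ to obtain the noisy linear measurements instead of $\mathbf{y}=\boldsymbol{\mathcal{M}}(\boldsymbol{\mathcal{X}})+\mathbf{w}$. Then the RTNNM model (4) can be reformulated as

$$\min_{\boldsymbol{\mathcal{X}}\in\mathbb{R}^{n_1\times n_2\times n_3}}\|\boldsymbol{\mathcal{X}}\|_*+\frac{1}{2\lambda}\|\mathbf{y}-\mathbf{M}\operatorname{vec}(\boldsymbol{\mathcal{X}})\|_2^2, \tag{18}$$

where $\mathbf{y},\mathbf{w}\in\mathbb{R}^{m}$, $\boldsymbol{\mathcal{X}}\in\mathbb{R}^{n_1\times n_2\times n_3}$, $\mathbf{M}\in\mathbb{R}^{m\times(n_1n_2n_3)}$ is a Gaussian measurement ensemble and $\operatorname{vec}(\boldsymbol{\mathcal{X}})$ denotes the vectorization of $\boldsymbol{\mathcal{X}}$. We adopt the alternating direction method of multipliers (ADMM) boyd2011distributed to solve this problem quickly and accurately. We first introduce an auxiliary variable $\boldsymbol{\mathcal{Z}}$ so that (18) becomes the constrained optimization problem

$$\min_{\boldsymbol{\mathcal{X}}\in\mathbb{R}^{n_1\times n_2\times n_3}}\|\boldsymbol{\mathcal{X}}\|_*+\frac{1}{2\lambda}\|\mathbf{y}-\mathbf{M}\operatorname{vec}(\boldsymbol{\mathcal{Z}})\|_2^2,\quad\text{s.t.}\quad\boldsymbol{\mathcal{X}}=\boldsymbol{\mathcal{Z}}.$$

After multiplying the objective by $\lambda$, the augmented Lagrangian function of the above constrained optimization problem is

$$L(\boldsymbol{\mathcal{X}},\boldsymbol{\mathcal{Z}},\boldsymbol{\mathcal{K}})=\lambda\|\boldsymbol{\mathcal{X}}\|_*+\frac{1}{2}\|\mathbf{y}-\mathbf{M}\operatorname{vec}(\boldsymbol{\mathcal{Z}})\|_2^2+\langle\boldsymbol{\mathcal{K}},\boldsymbol{\mathcal{X}}-\boldsymbol{\mathcal{Z}}\rangle+\frac{\rho}{2}\|\boldsymbol{\mathcal{X}}-\boldsymbol{\mathcal{Z}}\|_F^2,$$

where $\rho$ is a positive scalar and $\boldsymbol{\mathcal{K}}$ is the Lagrangian multiplier tensor. By minimizing the augmented Lagrangian function, we can obtain closed-form solutions for the variables $\boldsymbol{\mathcal{X}}$ and $\boldsymbol{\mathcal{Z}}$. A detailed update process is shown in Algorithm 1. In particular, according to Theorem 4.2 in Lu2018Tensor , the proximal operator in Step 3 of Algorithm 1 can be computed by the tensor Singular Value Thresholding (t-SVT) algorithm.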
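The updates implied by the augmented Lagrangian can be sketched as follows. This is a minimal reconstruction from the formula above, not a transcription of Algorithm 1 (which is not reproduced in this text): the update order, the fixed `rho`, the iteration count, the C-order vectorization, and all function names are assumptions. The $\boldsymbol{\mathcal{X}}$-update is a t-SVT step with threshold $\lambda/\rho$, and the $\boldsymbol{\mathcal{Z}}$-update solves the linear system $(\mathbf{M}^T\mathbf{M}+\rho\mathbf{I})\mathbf{z}=\mathbf{M}^T\mathbf{y}+\operatorname{vec}(\boldsymbol{\mathcal{K}})+\rho\operatorname{vec}(\boldsymbol{\mathcal{X}})$.

```python
import numpy as np

def t_svt(Y, tau):
    """Prox of tau * TNN: soft-threshold singular values of each Fourier slice."""
    Yf = np.fft.fft(Y, axis=2)
    Xf = np.empty_like(Yf)
    for k in range(Y.shape[2]):
        U, s, Vh = np.linalg.svd(Yf[:, :, k], full_matrices=False)
        Xf[:, :, k] = (U * np.maximum(s - tau, 0.0)) @ Vh
    return np.real(np.fft.ifft(Xf, axis=2))

def rtnnm_admm(M, y, shape, lam, rho=1.0, iters=300):
    """ADMM sketch for lam * ||X||_* + 0.5 * ||y - M vec(Z)||^2 subject to X = Z."""
    n = int(np.prod(shape))
    X, Z, K = np.zeros(shape), np.zeros(shape), np.zeros(shape)
    lhs = M.T @ M + rho * np.eye(n)          # system matrix of the Z-update
    Mty = M.T @ y
    for _ in range(iters):
        X = t_svt(Z - K / rho, lam / rho)    # X-update: proximal step (t-SVT)
        z = np.linalg.solve(lhs, Mty + K.reshape(-1) + rho * X.reshape(-1))
        Z = z.reshape(shape)                 # Z-update: regularized least squares
        K = K + rho * (X - Z)                # dual ascent on the multiplier
    return X

# Small synthetic check with hypothetical sizes and noise level.
rng = np.random.default_rng(0)
shape, r = (4, 4, 3), 1
Af = np.fft.fft(rng.standard_normal((shape[0], r, shape[2])), axis=2)
Bf = np.fft.fft(rng.standard_normal((r, shape[1], shape[2])), axis=2)
X_true = np.real(np.fft.ifft(np.einsum("ijk,jlk->ilk", Af, Bf), axis=2))

n = int(np.prod(shape))
m = 4 * n
M = rng.standard_normal((m, n)) / np.sqrt(m)
y = M @ X_true.reshape(-1) + 0.01 * rng.standard_normal(m)

X_hat = rtnnm_admm(M, y, shape, lam=1e-3)
rel_err = np.linalg.norm(X_hat - X_true) / np.linalg.norm(X_true)
print(f"relative error: {rel_err:.2e}")
```

The example deliberately uses many measurements ($m=4n$) so that the least-squares term is well conditioned; it is meant to exercise the updates, not to reproduce the sampling regime of the experiments.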

### 5.2 Experiment Results

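All numerical experiments are run on a PC with 4 GB of RAM and an Intel Core i5-4200M (2.5 GHz). To reduce the effect of randomness, we repeat each test 50 times and report the average result.

First, we generate a tubal rank $r$ tensor $\boldsymbol{\mathcal{X}}\in\mathbb{R}^{n_1\times n_2\times n_3}$ as a product $\boldsymbol{\mathcal{X}}=\boldsymbol{\mathcal{X}}_1\star\boldsymbol{\mathcal{X}}_2$, where $\boldsymbol{\mathcal{X}}_1\in\mathbb{R}^{n_1\times r\times n_3}$ and $\boldsymbol{\mathcal{X}}_2\in\mathbb{R}^{r\times n_2\times n_3}$ are two tensors with entries independently sampled from a standard Gaussian distribution. Next, we generate a measurement matrix $\mathbf{M}\in\mathbb{R}^{m\times(n_1n_2n_3)}$ with i.i.d. Gaussian entries. Using $\boldsymbol{\mathcal{X}}$ and $\mathbf{M}$, the measurements $\mathbf{y}$ are produced by $\mathbf{y}=\mathbf{M}\operatorname{vec}(\boldsymbol{\mathcal{X}})+\mathbf{w}$, where $\mathbf{w}$ is Gaussian white noise with mean $0$ and variance $\sigma^2$. We uniformly evaluate the recovery performance of the model by the signal-to-noise ratio (SNR) in decibels (dB) (the greater the SNR, the better the reconstruction). The key to studying the RTNNM model (4) is to explain the relationship among the reconstruction error, the noise level $\epsilon$ and $\lambda$. Therefore, we design two sets of experiments, Case 1 and Case 2, to explain it. The number of samples in all experiments is set as in Lu2018Exact .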

All SNR values for different noise levels and regularization parameters in the two cases are provided in Table 1 and Table 2, with the best results highlighted in bold. Two consistent conclusions hold for low-tubal-rank tensor recovery at different scales. For a fixed regularization parameter $\lambda$, as the standard deviation $\sigma$ increases (the greater $\sigma$, the greater the noise level $\epsilon$), the SNR gradually decreases. This trend is especially pronounced for smaller $\lambda$. In addition, for each fixed noise level, a smaller regularization parameter $\lambda$ corresponds to a larger SNR, which means the low-tubal-rank tensor can be better recovered. However, in both Case 1 and Case 2, this increment levels off once $\lambda$ reaches a case-specific value, and these values are taken as the optimal regularization parameters of the RTNNM model (4) in Case 1 and Case 2, respectively. We plot the data in Table 1 and Table 2 as Figure 2(a) and Figure 2(b), respectively, which shows the results of the above analysis at a glance. Thus, these experiments clearly demonstrate the quantitative correlation among the reconstruction error, the noise level $\epsilon$ and $\lambda$.

## 6 Conclusion

In this paper, a heuristic notion of tensor restricted isometry property (t-RIP) has been introduced based on the tensor singular value decomposition (t-SVD). Compared with other definitions shi2013guarantees ; rauhut2017low , it is more representative as a higher-order generalization of the traditional RIP for vector and matrix recovery, since the forms and properties of t-RIP and t-SVD are consistent with the vector/matrix case. This point is crucial because it guarantees that our theoretical investigation can be carried out in a similar way as for sparse vector/low-rank matrix recovery. A sufficient condition was presented, based on the RTNNM model, for stably recovering a given low-tubal-rank tensor that is corrupted with $\ell_2$-norm bounded noise. However, this condition only considers the $\delta_{tr}^{\boldsymbol{\mathcal{M}}}$ of the map $\boldsymbol{\mathcal{M}}$ when $t$ is limited to $t>1$. In the future, we hope to provide a complete answer for $\delta_{tr}^{\boldsymbol{\mathcal{M}}}$ when $0<t\le 1$. Another important topic is to establish the guarantee for stable recovery based on (4) in terms of the required number of measurements.

## Acknowledgment

This work was supported by National Natural Science Foundation of China (Grant Nos. 61673015, 61273020), Fundamental Research Funds for the Central Universities (Grant Nos. XDJK2018C076, SWU1809002) and China Postdoctoral Science Foundation (Grant No. 2018M643390).

## Appendix A Proof of Lemma 3.2

###### Proof.

Step 1: Sparse Representation of a Polytope.

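Without loss of generality, assume that $tr$ is an integer for a given $t>1$. Next we divide the index set $\Gamma^c$ into two disjoint subsets, that is,

$$\Gamma_1=\{i\in\Gamma^c:\boldsymbol{\mathcal{S}}_{\boldsymbol{\mathcal{H}}}(i,i,1)>\phi\},\qquad\Gamma_2=\{i\in\Gamma^c:\boldsymbol{\mathcal{S}}_{\boldsymbol{\mathcal{H}}}(i,i,1)\le\phi\},$$

where $\phi\triangleq\frac{\|\boldsymbol{\mathcal{H}}_{\Gamma^c}\|_*}{(t-1)r}$. Clearly,

$$\Gamma_1\cup\Gamma_2=\Gamma^c\quad\text{and}\quad\Gamma_1\cap\Gamma=\emptyset,$$

which implies that $\boldsymbol{\mathcal{H}}=\boldsymbol{\mathcal{H}}_{\Gamma}+\boldsymbol{\mathcal{H}}_{\Gamma_1}+\boldsymbol{\mathcal{H}}_{\Gamma_2}$ and $\|\boldsymbol{\mathcal{H}}_{\Gamma}\|_F\le\|\boldsymbol{\mathcal{H}}_{\Gamma\cup\Gamma_1}\|_F$, respectively. In order to prove (9), we only need to check

$$\|\boldsymbol{\mathcal{H}}_{\Gamma\cup\Gamma_1}\|_F\le\eta_1\|\boldsymbol{\mathcal{M}}(\boldsymbol{\mathcal{H}})\|_2+\frac{\eta_2}{\sqrt{r}}\|\boldsymbol{\mathcal{H}}_{\Gamma^c}\|_*. \tag{19}$$

Let $\|\boldsymbol{\mathcal{H}}_{\Gamma_1}\|_*=\|\mathbf{s}_{\boldsymbol{\mathcal{H}}_{\Gamma_1}}\|_1$, where $\mathbf{s}_{\boldsymbol{\mathcal{H}}_{\Gamma_1}}$ denotes the diagonal vector of the first frontal slice of $\boldsymbol{\mathcal{S}}_{\boldsymbol{\mathcal{H}}_{\Gamma_1}}$, whose element $\boldsymbol{\mathcal{S}}_{\boldsymbol{\mathcal{H}}_{\Gamma_1}}(i,i,1)=\boldsymbol{\mathcal{S}}_{\boldsymbol{\mathcal{H}}}(i,i,1)$ for $i\in\Gamma_1$ and $\boldsymbol{\mathcal{S}}_{\boldsymbol{\mathcal{H}}_{\Gamma_1}}(i,i,1)=0$ otherwise. Since all non-zero entries of the vector $\mathbf{s}_{\boldsymbol{\mathcal{H}}_{\Gamma_1}}$ have magnitude larger than $\phi$, we have

$$\|\mathbf{s}_{\boldsymbol{\mathcal{H}}_{\Gamma_1}}\|_1=\|\boldsymbol{\mathcal{H}}_{\Gamma_1}\|_*>|\Gamma_1|\frac{\|\boldsymbol{\mathcal{H}}_{\Gamma^c}\|_*}{(t-1)r}\ge|\Gamma_1|\frac{\|\boldsymbol{\mathcal{H}}_{\Gamma_1}\|_*}{(t-1)r}=\frac{|\Gamma_1|}{(t-1)r}\|\mathbf{s}_{\boldsymbol{\mathcal{H}}_{\Gamma_1}}\|_1.$$

Namely $|\Gamma_1|<(t-1)r$. Besides, we also have

$$\|\mathbf{s}_{\boldsymbol{\mathcal{H}}_{\Gamma_2}}\|_1=\|\boldsymbol{\mathcal{H}}_{\Gamma_2}\|_*=\|\boldsymbol{\mathcal{H}}_{\Gamma^c}\|_*-\|\boldsymbol{\mathcal{H}}_{\Gamma_1}\|_*\le((t-1)r-|\Gamma_1|)\phi$$

and

$$\|\mathbf{s}_{\boldsymbol{\mathcal{H}}_{\Gamma_2}}\|_\infty\triangleq\max_{i\in\Gamma_2}\boldsymbol{\mathcal{S}}_{\boldsymbol{\mathcal{H}}}(i,i,1)\le\phi.$$

Now, since $\mathbf{s}_{\boldsymbol{\mathcal{H}}_{\Gamma_2}}\in T(\phi,(t-1)r-|\Gamma_1|)$, applying Lemma 3.1,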