# An Optimal Restricted Isometry Condition for Exact Sparse Recovery with Orthogonal Least Squares

The orthogonal least squares (OLS) algorithm is widely used in sparse recovery, subset selection, and function approximation. In this paper, we analyze the performance guarantee of OLS. Specifically, we show that if a sampling matrix $\Phi$ has unit $\ell_2$-norm columns and satisfies the restricted isometry property (RIP) of order $K+1$ with $\delta_{K+1} < C_K$, where $C_K = 1/\sqrt{K}$ for $K = 1$, $1/\sqrt{K + 1/4}$ for $K = 2$, $1/\sqrt{K + 1/16}$ for $K = 3$, and $1/\sqrt{K}$ for $K \ge 4$, then OLS exactly recovers any $K$-sparse vector $x$ from its measurements $y = \Phi x$ in $K$ iterations. Furthermore, we show that the proposed guarantee is optimal in the sense that OLS may fail the recovery under $\delta_{K+1} \ge C_K$. Additionally, we show that if the columns of a sampling matrix are $\ell_2$-normalized, then the proposed condition is also an optimal recovery guarantee for the orthogonal matching pursuit (OMP) algorithm. We also establish a recovery guarantee of OLS in the more general case where a sampling matrix might not have unit $\ell_2$-norm columns. Moreover, we analyze the performance of OLS in the noisy case. Our result demonstrates that under a suitable constraint on the minimum magnitude of nonzero elements in an input signal, the proposed RIP condition ensures that OLS identifies the support exactly.


## I Introduction

Orthogonal least squares (OLS) is a popular algorithm for sparse recovery, subset selection, and function approximation [1, 2]. The main goal of the OLS algorithm is to reconstruct a high-dimensional $K$-sparse vector $x \in \mathbb{R}^n$ from a small number of linear measurements

$$y = \Phi x, \tag{1}$$

where $\Phi$ is the sampling (measurement) matrix. To solve this problem, OLS identifies the support (the index set of nonzero entries) of $x$ sequentially so as to minimize the residual power. Specifically, in each iteration, OLS identifies the column of $\Phi$ that leads to the most significant reduction in the $\ell_2$-norm of the residual and then adds the index of that column to the estimated support. The contribution of the columns indexed by the enlarged support is then removed from $y$, generating an updated residual for the upcoming iteration (see Table I).
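The iteration described above can be sketched in a few lines of plain Python. This is a minimal illustration rather than the paper's Table I: the function name `ols`, the column-list representation of $\Phi$, the Gram-Schmidt bookkeeping, and the smallest-index tie-breaking are all assumptions made here for the example.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def ols(columns, y, K, tol=1e-12):
    """Greedy OLS sketch. `columns` holds the columns of Phi as lists of floats."""
    support = []
    basis = []            # orthonormal basis of the span of the selected columns
    residual = list(y)
    for _ in range(K):
        best_j, best_drop, best_q = None, -1.0, None
        for j in range(len(columns)):
            if j in support:
                continue
            # Part of column j orthogonal to the columns already selected.
            p = list(columns[j])
            for q in basis:
                c = dot(q, p)
                p = [pi - c * qi for pi, qi in zip(p, q)]
            norm_p = math.sqrt(dot(p, p))
            if norm_p <= tol:
                continue  # column already lies in the selected span
            q_new = [pi / norm_p for pi in p]
            # Reduction in residual power obtained by adding column j.
            drop = dot(residual, q_new) ** 2
            if drop > best_drop:  # strict comparison: ties keep the smaller index
                best_j, best_drop, best_q = j, drop, q_new
        if best_j is None:
            break
        support.append(best_j)
        basis.append(best_q)
        c = dot(residual, best_q)
        residual = [ri - c * qi for ri, qi in zip(residual, best_q)]
    return support, residual

# Toy example (hypothetical data): x = [0, 2, -1, 0], so y = 2*col1 - col2.
columns = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.6, 0.8, 0.0]]
y = [0.0, 2.0, -1.0]
support, residual = ols(columns, y, K=2)
print(support, residual)  # support is [1, 2] (0-based) with a zero residual
```

Once the support is known, the nonzero entries of $x$ follow from an ordinary least-squares fit on the selected columns; that step is omitted here to keep the sketch short.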

As a framework to analyze the performance of the OLS algorithm, the restricted isometry property (RIP) has been widely employed [4, 5, 3, 6, 7, 8]. A matrix $\Phi$ is said to satisfy the RIP of order $K$ if there exists a constant $\delta \in (0, 1)$ such that

$$(1-\delta)\|x\|_2^2 \le \|\Phi x\|_2^2 \le (1+\delta)\|x\|_2^2 \tag{2}$$

for any $K$-sparse vector $x$ [9]. In particular, the minimum $\delta$ satisfying (2) is called the RIP constant and is denoted by $\delta_K$. In [3], it was shown that OLS guarantees the exact recovery of any $K$-sparse vector in $K$ iterations if the sampling matrix has unit $\ell_2$-norm columns and obeys the RIP with

$$\delta_{K+1} < \frac{1}{\sqrt{K+2}}. \tag{3}$$

This condition has been improved to [6]

$$\delta_{K+1} < \frac{1}{\sqrt{K+1}}. \tag{4}$$

On the other hand, it has been reported in [6] that when $K = 2$, there exists a counterexample for which OLS fails the reconstruction under

$$\delta_{K+1} = \frac{1}{\sqrt{K+\frac{1}{4}}}.$$

Based on the counterexample, it has been conjectured that for all $K$, a recovery guarantee of OLS cannot be weaker than

$$\delta_{K+1} < \frac{1}{\sqrt{K+\frac{1}{4}}}. \tag{5}$$

An aim of this paper is to bridge the gap between (4) and (5). Towards this end, we first put forth an improved performance guarantee of OLS. Specifically, we show that OLS exactly recovers any $K$-sparse vector in $K$ iterations, provided that the sampling matrix has unit $\ell_2$-norm columns and satisfies the RIP of order $K+1$ with (Theorem 1)

$$\delta_{K+1} < C_K = \begin{cases} \dfrac{1}{\sqrt{K}}, & K = 1, \\[2mm] \dfrac{1}{\sqrt{K+\frac{1}{4}}}, & K = 2, \\[2mm] \dfrac{1}{\sqrt{K+\frac{1}{16}}}, & K = 3, \\[2mm] \dfrac{1}{\sqrt{K}}, & K \ge 4. \end{cases} \tag{6}$$
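To get a feel for the threshold $C_K$ (equal to $1/\sqrt{K}$ for $K = 1$ and $K \ge 4$, $1/\sqrt{K + 1/4}$ for $K = 2$, and $1/\sqrt{K + 1/16}$ for $K = 3$, as stated in the abstract), the short sketch below tabulates it next to the earlier bound $1/\sqrt{K+1}$ from (4). The helper name `c_k` is introduced here purely for illustration.

```python
import math

def c_k(K):
    """Threshold C_K in the condition delta_{K+1} < C_K."""
    if K < 1:
        raise ValueError("K must be a positive integer")
    if K == 2:
        return 1 / math.sqrt(K + 1 / 4)    # equals 2/3
    if K == 3:
        return 1 / math.sqrt(K + 1 / 16)   # equals 4/7
    return 1 / math.sqrt(K)                # K = 1 or K >= 4

for K in range(1, 7):
    # Columns: K, the threshold C_K, and the earlier bound 1/sqrt(K+1) from (4).
    print(K, round(c_k(K), 4), round(1 / math.sqrt(K + 1), 4))
```

For every $K$ the first value exceeds the second, so a matrix that satisfies (4) also satisfies the new condition, while the converse need not hold.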

The significance of our result lies in the fact that it not only improves on the existing result in (4), but also states an optimal recovery guarantee of OLS. By optimality, we mean that if the condition is violated, then there exists a counterexample for which OLS fails the recovery. In fact, for any positive integer $K$ and constant

$$\delta^* \in [C_K, 1),$$

there always exist an $\ell_2$-normalized matrix $\Phi$ with $\delta_{K+1} = \delta^*$ and a $K$-sparse vector $x$ such that $x$ cannot be recovered from $y = \Phi x$ by OLS (Theorem 2).

We note that the aforementioned results are based on the assumption that the columns of a sampling matrix are $\ell_2$-normalized. This assumption might not be satisfied in some applications (e.g., when $\Phi$ is a Gaussian random matrix [9]). For this case, we establish a condition on general (not necessarily $\ell_2$-normalized) sampling matrices for the success of OLS, where by success we mean the exact reconstruction of the underlying sparse signal (Theorem 3). By comparing our result with existing ones, we show that our result outperforms the existing one by a large margin.

We also provide a performance guarantee of OLS in the noisy scenario. Specifically, we study a sufficient condition under which OLS identifies the support of $x$ accurately in the presence of measurement noise. Our result states that under a suitable condition on the minimum magnitude of nonzero elements in $x$, (6) ensures that OLS reconstructs the support of $x$ accurately (Theorem 4). We show that the proposed sufficient condition is better (more relaxed) than existing ones.

Finally, we extend our analyses to orthogonal matching pursuit (OMP) [10, 11], which is the most famous sparse recovery algorithm. Our results demonstrate that if an $\ell_2$-normalized sampling matrix is employed, then (6) is also an optimal performance guarantee of OMP in the noiseless scenario (Theorems 5 and 6). We also show that under a proper constraint on the minimum magnitude of nonzero entries in $x$, the proposed RIP condition (6) ensures that OMP identifies the support of $x$ exactly in the noisy scenario (Theorem 7). Through these results, we also demonstrate that the RIP-based performance guarantee of OMP can be improved by employing an $\ell_2$-normalized sampling matrix.

## II Preliminaries

We first summarize the notations used in this paper.

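• $\Omega = \{1, \ldots, n\}$;

• $T = \operatorname{supp}(x) = \{i \in \Omega : x_i \neq 0\}$ is the support of $x$;

• For any $S \subset \Omega$, $|S|$ is the cardinality of $S$ and $T \setminus S = \{i : i \in T,\ i \notin S\}$;

• $x_S \in \mathbb{R}^{|S|}$ is the restriction of $x$ to the elements indexed by $S$;

• $\phi_i$ is the $i$-th column of $\Phi$;

• $\Phi_S$ is the submatrix of $\Phi$ with the columns indexed by $S$;

• $\operatorname{span}(\Phi_S)$ is the column space of $\Phi_S$;

• If $\Phi_S$ has full column rank, then $\Phi_S^\dagger = (\Phi_S'\Phi_S)^{-1}\Phi_S'$ is the pseudoinverse of $\Phi_S$, where $\Phi_S'$ is the transpose of $\Phi_S$;

• $\sigma_{\min}(\Phi_S)$ is the smallest singular value of $\Phi_S$;

• $0_{d_1 \times d_2}$ and $1_{d_1 \times d_2}$ are $(d_1 \times d_2)$-dimensional matrices with entries being zeros and ones, respectively;

• $\mathrm{Id}_d$ is the $d$-dimensional identity matrix;

• $P_S = \Phi_S\Phi_S^\dagger$ and $P_S^\perp = \mathrm{Id}_m - P_S$, where $m$ is the number of rows of $\Phi$, are the orthogonal projections onto $\operatorname{span}(\Phi_S)$ and its orthogonal complement, respectively.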

We next give some lemmas useful in our analysis. The first lemma is about the monotonicity of the RIP constant.

###### Lemma 1 ([9], [13, Lemma 1]).

If a matrix $\Phi$ satisfies the RIP of orders $K_1$ and $K_2$ with $K_1 \le K_2$, then $\delta_{K_1} \le \delta_{K_2}$.

The second lemma is often called the modified RIP of a projected matrix.

###### Lemma 2 ([14, Lemma 1]).

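Let $T, J \subset \Omega$. If a matrix $\Phi$ satisfies the RIP of order $|T \cup J|$, then for any $x \in \mathbb{R}^n$ with $\operatorname{supp}(x) \subset T$,

$$(1-\delta_{|T\cup J|})\|x_{T\setminus J}\|_2^2 \le \|P_J^\perp \Phi x\|_2^2 \le (1+\delta_{|T\cup J|})\|x_{T\setminus J}\|_2^2.$$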

The third lemma gives an equivalent form to the identification rule of OLS in Table I.

###### Lemma 3 ([2, Theorem 1]).

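Consider the system model in (1). Let $T^k$ and $r^k$ be the estimated support and the residual generated in the $k$-th iteration of the OLS algorithm, respectively. Then the index $t^{k+1}$ chosen in the $(k+1)$-th iteration of OLS satisfies

$$t^{k+1} = \arg\max_{j \in \Omega\setminus T^k} \frac{|\langle r^k, \phi_j\rangle|}{\|P_{T^k}^\perp \phi_j\|_2}. \tag{7}$$

One can deduce from (7) that OLS picks a support index in the $(k+1)$-th iteration (i.e., $t^{k+1} \in T$) if and only if

$$\max_{j \in T\setminus T^k} \frac{|\langle r^k, \phi_j\rangle|}{\|P_{T^k}^\perp \phi_j\|_2} > \max_{j \in \Omega\setminus (T\cup T^k)} \frac{|\langle r^k, \phi_j\rangle|}{\|P_{T^k}^\perp \phi_j\|_2}. \tag{8}$$

To examine whether OLS is successful in the $(k+1)$-th iteration, therefore, it suffices to check whether (8) holds.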
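The reason a rule of the form (7) matches the "largest reduction of the residual" description of OLS can be traced in one line. The following is a sketch of the standard projection argument, not a reproduction of the proof in [2]; it assumes $r^k = P_{T^k}^\perp y$ and $P_{T^k}^\perp \phi_j \neq 0$.

```latex
% Adding index j to T^k enlarges the span by the single direction P^\perp_{T^k} \phi_j,
% so the new residual power is the old one minus the energy along that direction:
\left\| P^\perp_{T^k \cup \{j\}}\, y \right\|_2^2
  = \|r^k\|_2^2
    - \frac{\langle r^k,\, P^\perp_{T^k}\phi_j \rangle^2}{\|P^\perp_{T^k}\phi_j\|_2^2}
  = \|r^k\|_2^2
    - \frac{\langle r^k,\, \phi_j \rangle^2}{\|P^\perp_{T^k}\phi_j\|_2^2}.
% The last step uses r^k = P^\perp_{T^k} r^k and the symmetry of the projection.
```

Minimizing the left-hand side over $j$ is therefore the same as maximizing $|\langle r^k, \phi_j\rangle| / \|P_{T^k}^\perp \phi_j\|_2$.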

The next lemma plays a key role in bounding the right-hand side of (8).

###### Lemma 4.

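Consider the system model in (1) where $\Phi$ has unit $\ell_2$-norm columns. Let $T^k$ be a subset of $T$ and $r^k = P_{T^k}^\perp y$. If $\Phi$ obeys the RIP of order $K+1$ with

$$\delta_{K+1} \le \frac{1}{2}, \tag{9}$$

then

$$\max_{j \in \Omega\setminus T} \frac{|\langle r^k, \phi_j\rangle|}{\|P_{T^k}^\perp \phi_j\|_2} \le \frac{\delta_{K+1}\|r^k\|_2^2}{\|x_{T\setminus T^k}\|_2}. \tag{10}$$

###### Proof.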

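Let $j \in \Omega\setminus T$ and $t = \|x_{T\setminus T^k}\|_2 \dfrac{|\langle r^k, \phi_j\rangle|}{\|P_{T^k}^\perp \phi_j\|_2}$. Then, what we need to show is

$$t \le \delta_{K+1}\|r^k\|_2^2 \tag{11}$$

under (9). Let $\Psi = P_{T^k}^\perp [\Phi_{T\setminus T^k}\ \ \phi_j]$ and

$$u = \begin{bmatrix} x_{T\setminus T^k} \\[1mm] -\dfrac{\operatorname{sgn}(\phi_j' r^k)\,\|x_{T\setminus T^k}\|_2}{\|P_{T^k}^\perp \phi_j\|_2} \end{bmatrix}, \qquad v = \begin{bmatrix} x_{T\setminus T^k} \\[1mm] \dfrac{\operatorname{sgn}(\phi_j' r^k)\,\|x_{T\setminus T^k}\|_2}{\|P_{T^k}^\perp \phi_j\|_2} \end{bmatrix}, \tag{12}$$

where $\operatorname{sgn}$ is the signum function. Noting that $r^k = P_{T^k}^\perp \Phi_{T\setminus T^k} x_{T\setminus T^k}$ and $\langle r^k, P_{T^k}^\perp \phi_j\rangle = \langle r^k, \phi_j\rangle$, we have

$$\|\Psi u\|_2^2 = \left\| r^k - \operatorname{sgn}(\phi_j' r^k)\,\|x_{T\setminus T^k}\|_2 \cdot \frac{P_{T^k}^\perp \phi_j}{\|P_{T^k}^\perp \phi_j\|_2} \right\|_2^2 = \|r^k\|_2^2 - 2t + \|x_{T\setminus T^k}\|_2^2 \tag{13}$$

and

$$\|\Psi v\|_2^2 = \left\| r^k + \operatorname{sgn}(\phi_j' r^k)\,\|x_{T\setminus T^k}\|_2 \cdot \frac{P_{T^k}^\perp \phi_j}{\|P_{T^k}^\perp \phi_j\|_2} \right\|_2^2 = \|r^k\|_2^2 + 2t + \|x_{T\setminus T^k}\|_2^2. \tag{14}$$

Also, from Lemma 2, we have

$$\|\Psi u\|_2^2 \ge (1-\delta_{K+1})\left(1 + \frac{1}{\|P_{T^k}^\perp \phi_j\|_2^2}\right)\|x_{T\setminus T^k}\|_2^2 \tag{15}$$

and

$$\|\Psi v\|_2^2 \le (1+\delta_{K+1})\left(1 + \frac{1}{\|P_{T^k}^\perp \phi_j\|_2^2}\right)\|x_{T\setminus T^k}\|_2^2. \tag{16}$$

By combining (13)-(16), we obtain

$$\|r^k\|_2^2 - 2t \ge \left(1 - \left(1 + \|P_{T^k}^\perp \phi_j\|_2^2\right)\delta_{K+1}\right)\frac{\|x_{T\setminus T^k}\|_2^2}{\|P_{T^k}^\perp \phi_j\|_2^2} \tag{17}$$

and

$$\|r^k\|_2^2 + 2t \le \left(1 + \left(1 + \|P_{T^k}^\perp \phi_j\|_2^2\right)\delta_{K+1}\right)\frac{\|x_{T\setminus T^k}\|_2^2}{\|P_{T^k}^\perp \phi_j\|_2^2}. \tag{18}$$

Note that since $\Phi$ has unit $\ell_2$-norm columns, $\|P_{T^k}^\perp \phi_j\|_2 \le \|\phi_j\|_2 = 1$, and thus we have

$$1 - \left(1 + \|P_{T^k}^\perp \phi_j\|_2^2\right)\delta_{K+1} \ge 1 - 2\delta_{K+1} \overset{(a)}{\ge} 0,$$

where (a) is from (9). Then, by combining (17) and (18), we have

$$\left(\|r^k\|_2^2 - 2t\right)\left(1 + \left(1 + \|P_{T^k}^\perp \phi_j\|_2^2\right)\delta_{K+1}\right) \ge \left(\|r^k\|_2^2 + 2t\right)\left(1 - \left(1 + \|P_{T^k}^\perp \phi_j\|_2^2\right)\delta_{K+1}\right),$$

which is equivalent to

$$2t \le \left(1 + \|P_{T^k}^\perp \phi_j\|_2^2\right)\delta_{K+1}\|r^k\|_2^2.$$

Finally, by exploiting $\|P_{T^k}^\perp \phi_j\|_2 \le 1$, we obtain the desired result (11). ∎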

###### Remark 1.

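The bound in (10) is tight in the sense that equality in (10) is attainable. To see this, we consider the following example:

$$x = \begin{bmatrix} 0 \\ 1_{K\times 1} \end{bmatrix} \quad \text{and} \quad \Phi = \begin{bmatrix} \sqrt{\dfrac{K-1}{K}} & 0_{1\times K} \\[2mm] \dfrac{1}{K} 1_{K\times 1} & \mathrm{Id}_K \end{bmatrix}.$$

We now take a look at the left- and right-hand sides of (10) when $k = 0$. Since $T^0 = \emptyset$, $r^0 = y$, and $\Omega\setminus T = \{1\}$, the left-hand side of (10) is

$$\max_{j \in \Omega\setminus T} \frac{|\langle r^0, \phi_j\rangle|}{\|P_{T^0}^\perp \phi_j\|_2} = |\langle y, \phi_1\rangle| = 1.$$

Furthermore, since the RIP constant $\delta_{K+1}$ of $\Phi$ is $1/\sqrt{K}$ (see [19, p.3655]), the right-hand side of (10) is also

$$\frac{\delta_{K+1}\|r^0\|_2^2}{\|x_{T\setminus T^0}\|_2} = \frac{\delta_{K+1}\|y\|_2^2}{\|x_T\|_2} = 1.$$

Therefore, the bound in (10) is tight.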
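The equality claimed in this remark is easy to confirm numerically. The sketch below assumes the example reads $x = [0;\ 1_{K\times 1}]$ with $\Phi$ having first column $[\sqrt{(K-1)/K};\ (1/K)1_{K\times 1}]$ followed by the columns of $[0_{1\times K};\ \mathrm{Id}_K]$, and it takes $\delta_{K+1} = 1/\sqrt{K}$, which is consistent with the Gram matrix of this $\Phi$ having extreme eigenvalues $1 \pm 1/\sqrt{K}$. The function name `remark1_sides` is made up for the illustration.

```python
import math

def remark1_sides(K):
    """Left- and right-hand sides of (10) at k = 0 for the example of Remark 1."""
    # First column of Phi; the remaining K columns are standard basis vectors.
    first_col = [math.sqrt((K - 1) / K)] + [1 / K] * K
    assert abs(sum(v * v for v in first_col) - 1.0) < 1e-12  # unit l2-norm
    x = [0.0] + [1.0] * K              # support T = {2, ..., K+1}
    y = [0.0] + [1.0] * K              # y = Phi x = sum of the last K columns
    lhs = abs(sum(a * b for a, b in zip(y, first_col)))      # |<y, phi_1>|
    delta = 1 / math.sqrt(K)           # assumed RIP constant delta_{K+1}
    norm_y_sq = sum(v * v for v in y)
    norm_x = math.sqrt(sum(v * v for v in x))
    rhs = delta * norm_y_sq / norm_x
    return lhs, rhs

for K in range(4, 8):                  # K >= 4 keeps delta_{K+1} <= 1/2, as (9) requires
    print(K, remark1_sides(K))
```

Both values agree with $1$ up to floating-point rounding for each $K$ in the loop.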

###### Remark 2.

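Lemma 4 is motivated by [6, Lemma 4], where the inequality

$$\max_{j \in \Omega\setminus T} |\langle r^k, \phi_j\rangle| \le \frac{\|r^k\|_2^2}{\sqrt{\alpha}\,\|x_{T\setminus T^k}\|_2} \tag{19}$$

was established under

$$\delta_{K+1} \le \frac{1}{\sqrt{\alpha+1}}. \tag{20}$$

We note that our result in Lemma 4 outperforms this result in the following aspects:

1. Under the same RIP condition ($\delta_{K+1} \le 1/2$), a tighter upper bound of $\max_{j \in \Omega\setminus T} |\langle r^k, \phi_j\rangle|$ can be established using Lemma 4. By applying [6, Lemma 4] with $\alpha = 3$, we have

$$\max_{j \in \Omega\setminus T} |\langle r^k, \phi_j\rangle| \le \frac{\|r^k\|_2^2}{\sqrt{3}\,\|x_{T\setminus T^k}\|_2} \tag{21}$$

under $\delta_{K+1} \le 1/2$. Under the same condition on $\delta_{K+1}$, it can be deduced from Lemma 4 that

$$\max_{j \in \Omega\setminus T} |\langle r^k, \phi_j\rangle| \overset{(a)}{\le} \max_{j \in \Omega\setminus T} \frac{|\langle r^k, \phi_j\rangle|}{\|P_{T^k}^\perp \phi_j\|_2} \le \frac{\delta_{K+1}\|r^k\|_2^2}{\|x_{T\setminus T^k}\|_2} \le \frac{\|r^k\|_2^2}{2\|x_{T\setminus T^k}\|_2}, \tag{22}$$

where (a) is because $\|P_{T^k}^\perp \phi_j\|_2 \le \|\phi_j\|_2 = 1$ for each $j \in \Omega\setminus T$. Clearly, the bound in (22) is tighter than that in (21) by a factor of $2/\sqrt{3}$.

2. In [6], by choosing $\alpha$ appropriately in (19), the inequality

$$\max_{j \in \Omega\setminus T} \frac{|\langle r^k, \phi_j\rangle|}{\|P_{T^k}^\perp \phi_j\|_2} < \frac{\|r^k\|_2^2}{\sqrt{|T\setminus T^k|}\,\|x_{T\setminus T^k}\|_2} \tag{23}$$

is established under

$$\delta_{K+1} < \frac{1}{\sqrt{K+1}}. \tag{24}$$

We mention that the inequality (23) can be established under a weaker RIP condition using Lemma 4. Specifically, when $K \ge 4$, (23) can be obtained from Lemma 4 under

$$\delta_{K+1} < \frac{1}{\sqrt{K}} \tag{25}$$

because

$$\max_{j \in \Omega\setminus T} \frac{|\langle r^k, \phi_j\rangle|}{\|P_{T^k}^\perp \phi_j\|_2} \le \frac{\delta_{K+1}\|r^k\|_2^2}{\|x_{T\setminus T^k}\|_2} < \frac{\|r^k\|_2^2}{\sqrt{K}\,\|x_{T\setminus T^k}\|_2} \le \frac{\|r^k\|_2^2}{\sqrt{|T\setminus T^k|}\,\|x_{T\setminus T^k}\|_2}.$$

Clearly, (25) is less restrictive than (24) for every such sparsity level.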

## III Optimal Performance Guarantee of OLS

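In this section, we present an optimal RIP condition for the success of the OLS algorithm when an $\ell_2$-normalized sampling matrix is employed. First, we present a sufficient condition ensuring the success of OLS.

###### Theorem 1.

Let $\Phi$ be a sampling matrix having unit $\ell_2$-norm columns. If $\Phi$ satisfies the RIP of order $K+1$ with

$$\delta_{K+1} < C_K = \begin{cases} \dfrac{1}{\sqrt{K}}, & K = 1, \\[2mm] \dfrac{1}{\sqrt{K+\frac{1}{4}}}, & K = 2, \\[2mm] \dfrac{1}{\sqrt{K+\frac{1}{16}}}, & K = 3, \\[2mm] \dfrac{1}{\sqrt{K}}, & K \ge 4, \end{cases} \tag{26}$$

then the OLS algorithm exactly reconstructs any $K$-sparse vector $x$ from its samples $y = \Phi x$ in $K$ iterations.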

###### Proof.

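We show that the OLS algorithm picks a support index in each iteration. In other words, we show that $T^k \subset T$ for all $k \in \{0, \ldots, K\}$. In doing so, we have $T^K = T$, and thus OLS can recover $x$ accurately:

$$(x^K)_{T^K} = \Phi_{T^K}^\dagger y = \Phi_{T^K}^\dagger \Phi_T x_T = \Phi_T^\dagger \Phi_T x_T = x_T. \tag{27}$$

First, we consider the case $k = 0$. This case is trivial since

$$T^0 = \emptyset \subset T.$$

Next, we assume that $T^k \subset T$ for some integer $k$ ($0 \le k < K$) and then show that OLS picks a support index in the $(k+1)$-th iteration. As mentioned, $t^{k+1} \in T$ if and only if the condition (8) holds. Since the left-hand side of (8) satisfies [6, Proposition 2]

$$\max_{j \in T\setminus T^k} \frac{|\langle r^k, \phi_j\rangle|}{\|P_{T^k}^\perp \phi_j\|_2} \ge \frac{\|r^k\|_2^2}{\sqrt{K-k}\,\|x_{T\setminus T^k}\|_2},$$

it suffices to show that

$$\max_{j \in \Omega\setminus T} \frac{|\langle r^k, \phi_j\rangle|}{\|P_{T^k}^\perp \phi_j\|_2} < \frac{\|r^k\|_2^2}{\sqrt{K-k}\,\|x_{T\setminus T^k}\|_2}. \tag{28}$$

Towards this end, we consider three cases: (i) $K = 1$, (ii) $K \in \{2, 3\}$, and (iii) $K \ge 4$.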

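• Case (i), $K = 1$: In this case, $k = 0$ and thus we need to show that

$$\max_{j \in \Omega\setminus T} |\langle y, \phi_j\rangle| < \frac{\|y\|_2^2}{\|x_T\|_2}. \tag{29}$$

Without loss of generality, we assume that

$$x = \begin{bmatrix} c \\ 0_{(n-1)\times 1} \end{bmatrix}$$

for some $c \neq 0$. Then, $y = c\phi_1$ and thus the right-hand side of (29) is simply $|c|$. As a result, it suffices to show that

$$\max_{j \in \Omega\setminus T} |\langle y, \phi_j\rangle| < |c|. \tag{30}$$

Let $j \in \Omega\setminus T$ and let $\theta$ be the angle between $\phi_1$ and $\phi_j$ ($0 \le \theta \le \pi$). Then, by [22, Lemma 2.1], we have

$$|\cos\theta| \le \delta_2$$

and hence

$$|\langle y, \phi_j\rangle| = |\langle c\phi_1, \phi_j\rangle| = |c\cos\theta| \le |c|\,\delta_2. \tag{31}$$

Using this together with (26), which gives $\delta_2 < C_1 = 1$, we obtain the desired result (30).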

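• Case (ii), $K \in \{2, 3\}$: Let $j \in \Omega\setminus T$ and

$$q = \sqrt{K-k}\,\|x_{T\setminus T^k}\|_2 \frac{|\langle r^k, \phi_j\rangle|}{\|P_{T^k}^\perp \phi_j\|_2}.$$

Then, our task is to show that

$$q < \|r^k\|_2^2. \tag{32}$$

Let $\Psi = P_{T^k}^\perp [\Phi_{T\setminus T^k}\ \ \phi_j]$ and

$$w = \begin{bmatrix} x_{T\setminus T^k} \\[1mm] -\dfrac{\operatorname{sgn}(\phi_j' r^k)\sqrt{K-k}\,\|x_{T\setminus T^k}\|_2}{2\|P_{T^k}^\perp \phi_j\|_2} \end{bmatrix}. \tag{33}$$

Then, by noting that $r^k = P_{T^k}^\perp \Phi_{T\setminus T^k} x_{T\setminus T^k}$ and $\langle r^k, P_{T^k}^\perp \phi_j\rangle = \langle r^k, \phi_j\rangle$, we have

$$\|\Psi w\|_2^2 = \left\| r^k - \frac{\operatorname{sgn}(\phi_j' r^k)\sqrt{K-k}\,\|x_{T\setminus T^k}\|_2}{2} \cdot \frac{P_{T^k}^\perp \phi_j}{\|P_{T^k}^\perp \phi_j\|_2} \right\|_2^2 = \|r^k\|_2^2 - q + \frac{(K-k)\|x_{T\setminus T^k}\|_2^2}{4}. \tag{34}$$

Also, from Lemma 2, we have

$$\|\Psi w\|_2^2 \ge (1-\delta_{K+1})\left(1 + \frac{K-k}{4\|P_{T^k}^\perp \phi_j\|_2^2}\right)\|x_{T\setminus T^k}\|_2^2 \overset{(a)}{\ge} (1-\delta_{K+1})\left(1 + \frac{K-k}{4}\right)\|x_{T\setminus T^k}\|_2^2, \tag{35}$$

where (a) is because $\|P_{T^k}^\perp \phi_j\|_2 \le \|\phi_j\|_2 = 1$. By combining (34) and (35), we obtain

$$\begin{aligned} \|r^k\|_2^2 - q &\ge \left(1 - \left(1 + \frac{K-k}{4}\right)\delta_{K+1}\right)\|x_{T\setminus T^k}\|_2^2 \\ &\overset{(a)}{>} \left(1 - \left(1 + \frac{K-k}{4}\right)C_K\right)\|x_{T\setminus T^k}\|_2^2 \\ &\ge \left(1 - \left(1 + \frac{K}{4}\right)C_K\right)\|x_{T\setminus T^k}\|_2^2 \\ &\overset{(b)}{=} 0, \end{aligned} \tag{36}$$

where (a) follows from (26) and (b) is because

$$\left(1 + \frac{K}{4}\right)C_K = 1$$

for each $K \in \{2, 3\}$ (see (26)). Thus, we have (32), which is the desired result.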

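• Case (iii), $K \ge 4$: In this case, the RIP constant $\delta_{K+1}$ of $\Phi$ satisfies

$$\delta_{K+1} \overset{(a)}{<} \frac{1}{\sqrt{K}} \le \frac{1}{2},$$

where (a) is from (26). Then, by applying Lemma 4, we obtain

$$\max_{j \in \Omega\setminus T} \frac{|\langle r^k, \phi_j\rangle|}{\|P_{T^k}^\perp \phi_j\|_2} \le \frac{\delta_{K+1}\|r^k\|_2^2}{\|x_{T\setminus T^k}\|_2} \overset{(a)}{<} \frac{\|r^k\|_2^2}{\sqrt{K}\,\|x_{T\setminus T^k}\|_2} \le \frac{\|r^k\|_2^2}{\sqrt{K-k}\,\|x_{T\setminus T^k}\|_2}, \tag{37}$$

where (a) is from (26). This completes the proof. ∎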

One can observe from (31) that (30) holds if $\phi_1$ and $\phi_j$ are linearly independent. Indeed, if $\phi_1$ and $\phi_j$ are linearly independent, then $\theta$ cannot be zero or $\pi$; thus $|\cos\theta| < 1$ and then $|\langle y, \phi_j\rangle| < |c|$. This implies that when $K = 1$, OLS ensures the perfect recovery of $x$ if any two columns of $\Phi$ are linearly independent (i.e., $\delta_2 < 1$). We note that this condition is the fundamental limit (i.e., the minimum requirement on $\Phi$) to guarantee the exact sparse recovery [15, Theorem 2].
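A small numerical illustration of the $K = 1$ case follows. The three columns below are hypothetical and deliberately almost parallel, so that $\delta_2$ is close to (but below) $1$; the helper names are introduced only for this sketch. For unit-norm columns, $\delta_2$ equals the largest absolute inner product between two distinct columns, and the first OLS pick reduces to the index maximizing $|\langle y, \phi_j\rangle|$.

```python
import math

def unit(v):
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def delta_2(columns):
    """RIP constant of order 2 for unit-norm columns: the largest |<phi_i, phi_j>|."""
    n = len(columns)
    return max(abs(dot(columns[i], columns[j]))
               for i in range(n) for j in range(i + 1, n))

def first_pick(columns, y):
    """Index maximizing |<y, phi_j>|, i.e. the first OLS pick for unit-norm columns."""
    scores = [abs(dot(col, y)) for col in columns]
    return max(range(len(columns)), key=lambda j: scores[j]), scores

# Hypothetical, highly coherent but pairwise non-parallel columns.
columns = [unit([1.0, 0.0]), unit([1.0, 0.05]), unit([-1.0, 0.2])]
c = -3.0
y = [c * a for a in columns[1]]   # x has the single nonzero entry c at index 1
pick, scores = first_pick(columns, y)
print(delta_2(columns), pick, scores)
```

Even though the columns are nearly collinear, the correct index attains the strictly largest score, in line with the observation that linear independence of any two columns suffices when $K = 1$.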

There have been previous efforts to analyze a recovery guarantee of the OLS algorithm [3, 6, 7, 8]. So far, the best result states that OLS guarantees the exact reconstruction under [6, Theorem 1]

$$\delta_{K+1} < \frac{1}{\sqrt{K+1}}. \tag{38}$$

Clearly, the proposed guarantee (26) is less restrictive than this condition for every sparsity level $K$. One might wonder about the key difference between our analysis and previous ones. One major concern in the analysis of OLS lies in dealing with the denominator term $\|P_{T^k}^\perp \phi_j\|_2$ in (7), and various efforts have been made to handle this term. In [3, eq. (E.7)], the inequality

$$\|P_{T^k}^\perp \phi_j\|_2 \ge \sqrt{1 - \frac{\delta_{|T^k|+1}^2}{1 - \delta_{|T^k|}}}$$

was developed and employed to establish (3). This inequality was recently improved to [6, 16]

$$\|P_{T^k}^\perp \phi_j\|_2 \ge \sqrt{1 - \delta_{|T^k|+1}^2},$$

which leads to the improved performance guarantee (38). While previous studies have focused on constructing a tight lower bound of $\|P_{T^k}^\perp \phi_j\|_2$ and using the bound as an estimate of $\|P_{T^k}^\perp \phi_j\|_2$, we do not approximate this term, which avoids any loss caused by its relaxation. In fact, by properly defining some vectors that incorporate $\|P_{T^k}^\perp \phi_j\|_2$ (e.g., $u$ and $v$ in (12)), the inequality (10) could be obtained without any relaxation on $\|P_{T^k}^\perp \phi_j\|_2$, which eventually leads us to the improved guarantee (26).

We now demonstrate that the proposed guarantee (26) is optimal. This argument is established by showing that if $\delta_{K+1} \ge C_K$, then there exists a counterexample for which OLS fails the recovery.

###### Theorem 2.

For any positive integer $K$ and constant

$$\delta^* \in [C_K, 1), \tag{39}$$

there always exist an $\ell_2$-normalized matrix $\Phi$ with $\delta_{K+1} = \delta^*$ and a $K$-sparse vector $x$ such that the OLS algorithm fails to recover $x$ from $y = \Phi x$ in $K$ iterations.
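For $K = 2$, the statement can be probed numerically with the $3 \times 3$ Gram matrix that appears as (40) in the proof, which has unit diagonal and off-diagonal entries $\pm\delta^*/2$. The sketch below is only an illustration under two assumptions that are not taken from the text: the test vector is chosen here as $x = [0\ 1\ 1]'$, and ties are broken toward the smallest index. Since the columns have unit norm, the first OLS pick is the index maximizing $|\langle y, \phi_j\rangle| = |(\Phi'\Phi x)_j|$, so the Gram matrix alone is enough.

```python
def first_iteration_scores(delta_star):
    """|<y, phi_j>| for j = 0, 1, 2, computed from the Gram matrix in (40)."""
    h = delta_star / 2
    gram = [[1.0, h, h],
            [h, 1.0, -h],
            [h, -h, 1.0]]
    x = [0.0, 1.0, 1.0]   # assumed test vector; its support is {1, 2} (0-based)
    # <phi_j, y> = (Phi' Phi x)_j, so Phi itself is not needed.
    return [abs(sum(g * xi for g, xi in zip(row, x))) for row in gram]

def first_pick(delta_star):
    scores = first_iteration_scores(delta_star)
    return max(range(3), key=lambda j: scores[j])   # ties go to the smallest index

for d in (0.5, 0.6, 0.7, 0.9):
    print(d, first_iteration_scores(d), first_pick(d))
```

For the values below $2/3$ the pick lies in the support, while for the values above $2/3$ index $0$, which is outside the support, is selected, so recovery within two iterations is impossible. The boundary value $\delta^* = 2/3$ itself hinges on the tie-breaking rule and is not reliable to test in floating point.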

###### Proof.

It is enough to consider the case where $K \ge 2$, since there is no $\delta^*$ satisfying (39) when $K = 1$ (because $C_1 = 1$). In our proof, we consider three cases: (i) $K = 2$, (ii) $K = 3$, and (iii) $K \ge 4$.

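• Case (i), $K = 2$: We consider $x = [0\ \ 1\ \ 1]'$ and an $\ell_2$-normalized matrix $\Phi$ satisfying (see Appendix A for details)

$$\Phi'\Phi = \begin{bmatrix} 1 & \dfrac{\delta^*}{2} & \dfrac{\delta^*}{2} \\[2mm] \dfrac{\delta^*}{2} & 1 & -\dfrac{\delta^*}{2} \\[2mm] \dfrac{\delta^*}{2} & -\dfrac{\delta^*}{2} & 1 \end{bmatrix}. \tag{40}$$

First, we compute the RIP constant $\delta_3$ of $\Phi$. Note that the eigenvalues $\lambda_1, \lambda_2, \lambda_3$ of $\Phi'\Phi$ are

$$\lambda_1 = \lambda_2 = 1 + \frac{\delta^*}{2}, \qquad \lambda_3 = 1 - \delta^*.$$

Then, by exploiting the connection between the eigenvalues of $\Phi'\Phi$ and the RIP constant of $\Phi$ [13, Remark 1], we have

$$\delta_3 = \max_{i \in \{1,2,3\}} |\lambda_i - 1| = \delta^*.$$

In short, $\Phi$ is an $\ell_2$-normalized matrix satisfying the RIP with $\delta_{K+1} = \delta^*$. We now take a look at the first iteration of the OLS algorithm. Note that

$$|\langle y, \phi_j\rangle| = \begin{cases} \delta^*, & j = 1, \\[1mm] 1 - \dfrac{\delta^*}{2}, & j = 2, 3, \end{cases}$$

and

$$\delta^* \ge C_2 = \left.\frac{1}{\sqrt{K+\frac{1}{4}}}\right|_{K=2} = \frac{2}{3}$$

by (39). Thus, the index chosen in the first iteration would be $t^1 = 1 \notin T$ (when $\delta^* = 2/3$, $t^1 = 1$ by the tie-breaking rule of OLS; see Table I),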

and hence OLS cannot recover in