 # Orthogonal Matching Pursuit with Tikhonov and Landweber Regularization

The Orthogonal Matching Pursuit (OMP) for compressed sensing iterates over a scheme of support augmentation and signal estimation. We present two novel matching pursuit algorithms with intrinsic regularization of the signal estimation step that do not rely on a priori knowledge of the signal's sparsity. An iterative approach allows for a hardware efficient implementation of our algorithm, and enables real-world applications of compressed sensing. We provide a series of numerical examples that demonstrate a good performance, especially when the number of measurements is relatively small.

## Authors

##### This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

## 1 Introduction

A well-known theorem in signal processing is the Nyquist-Shannon theorem: It states that any band-limited signal can be exactly recovered by sampling with a rate of no more than two times its highest frequency. In many applications, however, signals only carry little information compared to the space where the signal is acquired in. Compressed sensing—we refer to [CW08] for a review—relies on the assumption that the gap between the dimension of the signal and its information content is expressed through the signal’s sparsity, i.e. the number of its non-zero entries. Compressed sensing has demonstrated its capabilities in many applications such as medical imaging [LDP07], wireless communications [LT10], and the Internet of Things [LXW13], see also [QBIN13] for a review.

### 1.1 Compressed sensing and the restricted isometry property

The sampling process itself is described by the application of a linear functional to the signal. In the discrete setting, the acquisition of samples of an unknown signal can be written as

 Ax=y

with a sampling matrix . We say that is -sparse if . Given a set with , we denote by the matrix that consists of the columns of indexed by . Similarly, we denote by

the vector that consists of the elements of

indexed by .

It is easy to see that every -sparse signal cannot be reconstructed by measurements. Indeed, given a sampling matrix with , a basic result of linear algebra yields

 rank(A)≤m<2k,

and one can find a -sparse vector with . Decomposing into two -sparse vectors with , yields

 Ax−Ay=A(x−y)=Aw=0,

i.e. . We have shown that does not map all pairs of different -sparse signals to pairs of different samples, and exact signal recovery is impossible.

The previous example shows that should be injective, at least on the set of sparse signals. This idea motivates the following definition due to CT05 [CT05].

###### Definition 1.

Let and . Then, the restricted isometry constant (RIC) of is defined as the smallest such that

 (1−δk)∥x∥22≤∥Ax∥22≤(1+δk)∥x∥22 (1)

for all -sparse .

The restricted isometry constant of order is hard to verify for a given sampling matrix as it requires the computation of submatrices consisting of columns taken from

. Surprisingly, many random matrices have a small RIC with a high probability. If, for example, the entries of

are sampled independently from a standard Gaussian distribution, the matrix

has a small RIC of order with high probability, if

 m≥Cklog(N/k). (2)

We refer to [CT06] for more details.

### 1.2 Signal reconstruction without noise

Naturally, conditions on the sensing matrix and reconstruction algorithms come in pairs. One famous example due to CT05 is the basis pursuit (BP) [CT05, C08]: If , , and , then every -sparse is the unique solution of

 minz∈RN∥z∥1 s.t. Az=y

with input . The bound on has been further improved in [F10]. The BP can be understood as the convex relaxation of an -pseudonorm functional [T06].

Another class of reconstruction algorithms are greedy pursuit algorithms that can be motivated by the following: If the sensing matrix was an isometry, then , and the signal can be recovered from . In this ideal scenario, the support of can be recovered by . If is not an isometry, we call the observation vector of . The orthogonal mating pursuit (OMP) due to TG07 [TG07] iteratively adds the coordinate of the biggest value in magnitude of to the recovered support set of . The new signal estimate is then computed by projecting onto the column space of , and the next observation is given by , where is the residual. The OMP is summarized in Algorithm 1.

Note that Step 3 of OMP is equivalent to computing the pseudoinverse , provided that has a trivial kernel. In particular, the new signal estimate is obtained by

 xS ←A†Sy, xSC ←0.

TG07 proved the following non-uniform recovery result for the OMP algorithm [TG07]: Given a Gaussian sensing matrix , the support of every -sparse signal is recovered by OMP with input with high probability, if

 m≥CklogN (3)

holds. Using the restricted isometry condition, WZWT17 [WZWT17] derived a uniform sharp condition for exact support recovery with OMP: If satisfies

 δk+1<1√k+1, (4)

then the support of every -sparse signal is exactly recovered by OMP with input within iterations. Conversely, for every sparsity level , there exist a -sparse signal , and a sensing matrix with

 δk+1=1√k+1,

such that OMP cannot recover the support of within iterations.

Compared to solving the minimization problem of BP, matching pursuit algorithms are known to have a low computational complexity [NV09, TG07]: They add exactly one coordinate per iteration to the support estimate and solve the projection problem for the new signal estimate. They are therefore good candidates for efficient hardware implementations.

### 1.3 Signal reconstruction in the presence of noise

A naturally arising question is the robustness of signal recovery in the presence of noise. In this scenario, the signal acquisition reads as

 y=Ax+v,

where is an unknown noise term. There are several robustness results for BP [CRT06, CDS98, CT06] and OMP [WZWT17, CW11, W15]. In particular, for the OMP algorithm, a second assumptions besides (4) needs to be made to guarantee successful signal recovery. Namely, if satisfies (4) and

 mini∈supp(x)|xi|>2ε1−√K+1δk+1,

then the support of every -sparse signal is recovered by OMP with input with the stopping rule within iterations, where is the OMP residual (see Algorithm 1), and is the noise energy. A similar sufficient condition for exact recovery can be found, see [WZWT17] for more details.

Similar to the basis pursuit, regularization has been introduced to the orthogonal matching pursuit by various means. We have identified two major strands in the available literature: The refinement of the support set augmentation and the regularization of the projection step. Most of the literature focuses on regularization of the support augmentation, i.e. Part 1 in Algorithm 1:

• The regularized OMP (ROMP) due to NV09 [NV09, NV10] computes the observation vector of the residual , and selects up to support indices from a trusted interval of coefficient magnitude. Namely, ROMP will select a set of the largest coefficients in magnitude, and seek a subset such that the smallest coefficient is not bigger than twice the largest coefficient selected, i.e.

 |ui|≤2|uj| for all i,j∈J0.

If there are multiple such sets , ROMP will choose the one with the maximal energy . The set is then added to the set , which completes the support augmentation. The remainder of the algorithm is similar to the OMP.

• The compressive sampling matching pursuit (CoSaMP) due to NT09 [NT09] uses the coordinates of the largest coefficients in magnitude of the observation vector as estimate for . A -sparse signal is then estimated by the least squares problem

 ξT=arg minz∈R2k∥ATz−y∥2,

and a -sparse approximation for is given by the largest entries of in magnitude. The remainder of the algorithm is similar to the OMP.

• Finally, the hard thresholding pursuit (HTP) due to F11 [F11] selects the coordinates of the largest coefficients in as support estimate , and then projects onto . These steps are iterated with an arbitrary -sparse initialization for until a halting criterion is met.

For a fixed sparsity level , the reconstruction error of these algorithms is linear in , see [NV10, NT09, F11]. The same holds true for the BP [CRT06].

To the author’s best knowledge, there is only one algorithm that applies regularization to the signal estimation in OMP (Step 3 in Algorithm 1): The stochastic gradient pursuit (SGP) algorithm due to LCHW17 [LCHW17] replaces the computation of the pseudoinverse in Step 3 with the least mean squares (LMS) estimate of . This approach is motivated by the LMS adaptive filter due to WMLJ76 [WMLJ76], whereby the rows of are considered to be the stochastic input to a digital filter with desired outputs , and plays the role of that filter’s unknown weight vector. The objective of the LMS algorithm is to minimize the mean-squared error

 E[(yℓ−uℓxS)2]=E[y2ℓ]−2E[yℓuℓ]xS+xTSE[uTℓuℓ]xS

by stochastic gradient descent, where the expectation is taken over all rows of

with equal probability. The unknown expectations in the gradient

 ∇xSE[(yℓ−uℓxS)2]=−2E[yℓuℓ]+2E[uTℓuℓ]xS

are approximated by gradients of single samples. We refer to [WMLJ76] for more details. The SGP is summarized in Algorithm 2.

Note that the regularization of the support augmentation relies on knowledge about the sparsity level of the unknown signal . This information is required in every iteration when the next support set estimate of appropriate size has to be determined. In contrast, both OMP and SGP add no more than one coordinate per iteration to the support estimate, and the SGP algorithm uses the sparsity information only to find an upper bound for the LMS step size , see Algorithm 2.

### 1.4 Contributions

The contributions of this work are as follows:

1. Introduce two other means of regularization in the signal estimation step: We could identify only one publication [LCHW17] where the signal estimation step of OMP is regularized. We broaden this picture and introduce two other well-established regularization methods in the signal estimation step of the OMP, namely Tikhonov regularization and Landweber iteration. This approach does not rely on a priori knowledge of the signal’s sparsity level.

2. Hardware feasibility: In many application areas, limitations of sensor size and battery lifetime create the need for efficient hardware implementations [SM12]. The SGP algorithm’s LMS estimate is tailored to be a hardware efficient reconstruction method [LCHW17]. This work broadens the class of iterative hardware efficient algorithms where regularization an intrinsically built-in feature of the proposed algorithm.

3. Provide uniform comparison: This work provides a series of numerical experiments that serve as a proof of concept for the previously made claims. We thereby employ the same stopping criterion for all algorithms, making a fair comparison is possible. As a by-product, this stopping criterion improves the stopping criterion for SGP as proposed in [LCHW17].

## 2 Regularization of the pseudoinverse

Most of the reviewed literature employs regularization in the support augmentation step of OMP, i.e. Part 1 of Algorithm 1. This work aims to regularize the computation of the signal estimate, i.e. Part 2 of Algorithm 1. We assume throughout this whole section that , and . The latter holds true with high probability for many matrices with random entries, see e.g. [BG09].

We begin with the noise-free scenario. Given the exact measurements and a support set estimate , the signal estimate is given by

 xS =arg minz∈R|S|∥y−ASz∥2=A†Sy, xSC =0.

If , it is easy to see that the signal estimate corresponds to the true signal. If only an approximation of with for is available, the OMP signal estimate reads as

 xεS =arg minz∈R|S|∥yε−ASz∥2=A†Syε (5) xεSC =0.

It is a standard result from numerical linear algebra that, depending on the condition number of the matrix , the measurement error amplifies by the action of the pseudoinverse. The condition number is defined as

 κ=σmax(AS)σmin(AS),

where and

are the biggest and smallest non-zero singular values of

, respectively. If has the restricted isometry constant of order , equation (1) implies that the singular values of lie between and , and therefore

 κ≤√1+δk√1−δk.

In the light of (3), a favorable condition number is reached, when the number of measurements is large, compared to the ambient dimension and the RIC-order . The effect of regularizing the signal estimate in the OMP is therefore stronger in the regime where .

There are two standard approaches due to Tikhonov and Landweber that deal with unstable solutions under data perturbations. One famous regularization method is Tikhonov regularization, where the estimate is replaced by

 xα,εS =(A∗SAS+αI)−1A∗Syε, (6) xα,εSC =0,

for some .

The regularized signal estimate is an approximation of the pseudoinverse in the following sense [EHN00]:

###### Theorem 2.

Let be defined as in (6) with . Let and . For , the noise amplification is controlled by the regularization parameter as

 ∥xα,0−xα,ε∥≤ε√α.

If is such that

 limε→0α(ε)=0andlimε→0ε2α(ε)=0,

then

 limε→0xα(ε),ε=A†y.

A straightforward extension of the signal estimation of the OMP algorithm is to replace the direct signal estimate (5) by its regularized counterpart (6). The Tikhonov regularized OMP (T-OMP) is summarized in Algorithm 3. Note that the T-OMP algorithm relies on the computation of a matrix inverse, which may prevent a hardware efficient implementation.

Another well-known regularization method is the Landweber iteration. The new signal estimate reads as

 xℓ,εS =xℓ−1,εS+ωA∗S(yε−ASxℓ−1,εS), (7) xℓ,εSC =0,

where , and . The Landweber iteration is an approximation of the pseudoinverse in the following sense [EHN00]:

###### Theorem 3.

If , and , are two iteration sequences given by (7), then

 xℓ,0→A†y as ℓ→∞,and∥xℓ,0−xℓ,ε∥≤√ℓε.

This iterative regularization does not rely matrix inversion and therefore allows for a hardware efficient implementation. Note that the constraint can be satisfied by choosing which in turn is easy to compute. The Landweber regularized OMP (L-OMP) is summarized in Algorithm 4. Note that the inner loop (Step 3b) is initialized with in the first iteration of the outer loop. In the subsequent iterations, the inner loop takes the previous signal estimate as initialization for the Landweber iteration.

Remark: A similar approach can be found in [F11], where the same iterative scheme is used in the Fast Hard Thresholding Pursuit (FHTP). Similar ideas are discussed in [NT09] for the CoSaMP. However, note that both algorithms rely on a priori information on the sparsity level for the support augmentation step.

## 3 Numerical results

In this section, we evaluate the performance of the proposed T-OMP and L-OMP algorithms by a numerical experiment. We generate a random sensing matrix where each entry is taken independently from a standard Gaussian distribution. Following [LCHW17], the columns of are then normalized. For the ambient dimension, we set , and we consider numbers of measurement . The sparsity level is fixed at , and the non-zeros entries of the signal vector

are taken independently from a uniform distribution supported on

. The noise level is measured via the signal-to-noise ratio (SNR), given by

 SNR=10log10(∥Ax∥22∥v∥22).

Each component of the noise vector is sampled independently from a standard Gaussian distribution. The noise vector is scaled thereafter such that the desired SNR is attained. The reconstruction performance is measured by normalized root-mean-square error (NRMSE) defined by

 NRMSE=1√N∥x−^x∥2Δ,

where is the true signal, is the reconstructed signal, and is the spread between the largest and smallest entry of . Figure 1: Reconstruction performance of the T-OMP algorithm with different regularization parameters α for different signal-to-noise ratios. The number of measurements is m=16. The curves for OMP, SGP, and CoSaMP serve as benchmark.

In order to make the presented algorithms’ performance comparable, we employ the same stopping rule in all our experiments: The iteration is terminated as soon as , where is the matching pursuit’s residual, and is the true energy of the generated noise. This stopping rule is somewhat of theoretical nature, since the true noise energy is unknown in applications. However, only a uniform halting criterion will allow for a fair competition between the algorithms assessed. Note that LCHW17 propose a fixed halting criterion which is independent from the magnitude of the noise. All implemented algorithms use the number of measurements as maximum number of iterations. The presented values are the average of 1,000 trials for each combination of method and noise level.

Figure 1 displays the reconstruction performance of the proposed T-OMP algorithm for . The presented algorithms SGP, CoSaMP and T-OMP outperform OMP, with the exception of CoSaMP in the high noise regime. Looking at the SGP algorithm, the performance gap between parameter choice (as proposed in [LCHW17]) and highlights the importance of a uniform stopping criterion for a fair algorithm comparison. Interestingly, T-OMP with and reaches a stable estimated support size as the noise level decreases. The same holds true for SGP with a fixed, noise-independent . Figure 2: Reconstruction performance of the L-OMP algorithm with different regularization parameters λ for different signal-to-noise ratios. The number of measurements is m=16. The curves for OMP, SGP, and CoSaMP serve as benchmark. Note that the curves for OMP (blue) and L-OMP with λ=100 (pink) are often overlapping.

Figure 2 displays the reconstruction performance of the proposed L-OMP algorithm in the same setting. For , we observe that L-OMP produces results almost identical to the OMP algorithm as the pseudo-inverse is very closely approximated. Surprisingly, the Landweber method with only one iteration in the inner loop, i.e. , finds a good reconstruction of the true signal in terms of NRMSE while it tends to misestimate the support size. For , we observe a good approximation of the OMP solution which cannot outperform CoSaMP and SGP. However, note that SGP requires iterations in its inner loop. Figure 3: Reconstruction performance of the T-OMP algorithm with different regularization parameters for different signal-to-noise ratios. The number of measurements is m=64. The curves for OMP, SGP, and CoSaMP serve as benchmark. Figure 4: Reconstruction performance of the L-OMP algorithm with different regularization parameters for different signal-to-noise ratios. The number of measurements is m=64. The curves for OMP, SGP, and CoSaMP serve as benchmark. Note that the curves for OMP (blue) and L-OMP with λ=100 (pink) are overlapping.

We now repeat the numerical experiments with a higher number of measurements, . In the light of (2

), the random matrix

is now more likely to have a good condition number. Furthermore, in the light of (3), the OMP algorithm has now a higher success probability. Indeed, Figures 34 confirm that the OMP algorithm cannot be outperformed on any noise level and regularization does not improve the reconstruction performance. Note that for an increasing SNR, the OMP now captures the correct support size. Looking at the iterative methods (Figure 4), the SGP tends to overestimate the support size. In this case, the Landweber method can be a good alternative as it approximates the OMP solution very well after only 10 iterations in the inner loop, while the SGP requires iterations in its LMS step. Note that this setup reproduces the results of [LCHW17]. Clearly, the proposed choice of in the SGP does not demonstrate the algorithm’s full reconstruction capabilities. Figure 5: Reconstruction performance of the T-OMP algorithm with different regularization parameters for different signal-to-noise ratios. The number of measurements is m=32. The curves for OMP, SGP, and CoSaMP serve as benchmark. Figure 6: Reconstruction performance of the L-OMP algorithm with different regularization parameters for different signal-to-noise ratios. The number of measurements is m=32. The curves for OMP, SGP, and CoSaMP serve as benchmark. Note that the curves for OMP (blue) and L-OMP with λ=100 (pink) are overlapping.

Finally, Figures 56 show the same simulation with measurements. Both T-OMP and L-OMP outperform OMP in the high and medium noise regime. One can easily see that the choice leads to a poor overall performance due to a too strong regularization. T-OMP with has the best average success rates—at the price of an overestimated support. The L-OMP reaches the best overall performance with only one iteration (), which demonstrates the advantages of regularized methods over the direct computation of the pseudoinverse.

## 4 Conclusion

In this work, we have derived two extensions of the OMP algorithm for compressed sensing based on Tikhonov regularization and Landweber iteration. A series of numerical experiments confirms the positive effect of regularization, especially in situations where the sampling matrix does not act like an almost-isometry on the set of sparse vectors. In particular, in situations where the number of measurements is comparably small, T-OMP and L-OMP outperform OMP and CoSaMP with a unified stopping criterion. Unlike CoSaMP, the proposed algorithms no not rely on a priori information of the signal’s sparsity. The L-OMP algorithm is an alternative iterative method for signal reconstruction that allows for a hardware-friendly implementation in the spirit of [LCHW17], as it renounces the computation of a pseudoinverse or the solution of a linear system, and shows good results after only few steps.

An important open question for applications are good parameter choice rules or heuristics for the regularization parameters in T-OMP and L-OMP, and halting criteria that lead to provable recovery guarantees. Parameter choice rules and halting criteria usually rely on

a priori information on the signal such as the sparsity or noise energy, such that the fine-tuning of T-OMP and L-OMP remains a highly application-specific problem.

The OMP algorithm and its robust modifications iterate over a scheme of support augmentation and signal estimation. With this modularity in mind, our regularization approach can be employed within the framework of other matching pursuits as well. For example, a study of CoSaMP or ROMP can be of future interest, where both the support augmentation (by the algorithms’ corresponding techniques) and the signal estimation (by our proposed approach) are simultaneously regularized. By this fused approach, even higher reconstruction rates than proposed in this work can possibly be achieved.