Adaptive-Rate Sparse Signal Reconstruction With Application in Compressive Background Subtraction

03/11/2015 ∙ by João F. C. Mota, et al. ∙ EPFL ∙ Carnegie Mellon University

We propose and analyze an online algorithm for reconstructing a sequence of signals from a limited number of linear measurements. The signals are assumed sparse, with unknown support, and evolve over time according to a generic nonlinear dynamical model. Our algorithm, based on recent theoretical results for ℓ_1-ℓ_1 minimization, is recursive and computes the number of measurements to be taken at each time on-the-fly. As an example, we apply the algorithm to compressive video background subtraction, a problem that can be stated as follows: given a set of measurements of a sequence of images with a static background, simultaneously reconstruct each image while separating its foreground from the background. The performance of our method is illustrated on sequences of real images: we observe that it allows a dramatic reduction in the number of measurements with respect to state-of-the-art compressive background subtraction schemes.


I Introduction

Consider the problem of reconstructing a sequence of sparse signals from a limited number of measurements. Let  be the signal at time  and  be the vector of signal measurements at time , where . Assume the signals evolve according to the dynamical model

(1a)
(1b)

where  is modeling noise and  is a sensing matrix. In (1a), is a known, but otherwise arbitrary, map that describes  as a function of past signals. We assume that each  and  is sparse, i.e., it has a small number of nonzero entries. Our goal is to reconstruct the signal sequence  from the measurement sequence . We require the reconstruction scheme to be recursive (or online), i.e., is reconstructed before acquiring measurements of any future signal , , and also to use a minimal number of measurements. We formalize the problem as follows.

Problem statement. Given two unknown sparse sequences  and  satisfying (1), design an online algorithm that 1) uses a minimal number of measurements  at time , and 2) perfectly reconstructs each  from  acquired as in (1b), and possibly , .
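To make the setting concrete, the following is a minimal simulation sketch of the model in (1), written with generic notation of our own choosing (x_k for the signal, y_k = A_k x_k for the measurements, ε_k for the sparse modeling noise); the map, dimensions, and sparsity levels are arbitrary illustrative choices, not the ones used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, K = 500, 10                      # signal dimension and horizon (arbitrary)

def sparse_vector(n, s):
    """Random vector with s nonzero entries."""
    v = np.zeros(n)
    idx = rng.choice(n, size=s, replace=False)
    v[idx] = rng.standard_normal(s)
    return v

def f_k(x_prev):
    """Toy nonlinear map; (1a) allows dependence on all past signals."""
    return np.sign(x_prev) * np.sqrt(np.abs(x_prev))

x_k = sparse_vector(n, s=20)        # initial sparse signal
signals, matrices, measurements = [], [], []
for k in range(K):
    m_k = 120                       # fixed here; Algorithm 1 computes it on-the-fly
    A_k = rng.standard_normal((m_k, n)) / np.sqrt(m_k)   # Gaussian sensing matrix
    signals.append(x_k)
    matrices.append(A_k)
    measurements.append(A_k @ x_k)  # (1b): noiseless measurements
    x_k = f_k(x_k) + sparse_vector(n, s=5)               # (1a): sparse modeling noise
```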

Note that our setting immediately generalizes from the case where each  is sparse to the case where  has a sparse representation in a linear, invertible transform. (If  is not sparse but  is, where  is an invertible matrix, then redefine  as the composition  and  as . The signal  satisfies (1) with  and .)

I-A Applications

Many problems require estimating a sequence of signals from a sequence of measurements satisfying the model in (1). These include classification and tracking in computer vision systems [2, 3], radar tracking [4], dynamic MRI [5], and several tasks in wireless sensor networks [6].

Our application focus, however, is compressive background subtraction [7]. Background subtraction is a key task for detecting and tracking objects in a video sequence and it has been applied, for example, in video surveillance [8, 9], traffic monitoring [10, 11], and medical imaging [12, 13]. Although there are many background subtraction techniques, e.g., [14, 3, 15], most of them assume access to full frames and, thus, are inapplicable in compressive video sensing [16, 17, 18], a technology used in cameras where sensing is expensive (e.g., infrared, UV wavelengths).

In compressive video sensing, one has access not to full frames as in conventional video, but only to a small set of linear measurements of each frame, as in (1b). Cevher et al. [7] noticed that background subtraction is possible in this context if the foreground pixels, i.e., those associated with a moving object, occupy a small area in each frame. Assuming the background image is known beforehand, compressed sensing techniques [19, 20] such as ℓ_1-norm minimization allow reconstructing each foreground. This not only reconstructs the original frame (by adding the reconstructed foreground to the known background), but also performs background subtraction as a by-product [7].

We mention that, with the exception of [21, 22], most approaches to compressive video sensing and to compressive background subtraction assume a fixed number of measurements for all frames [16, 7, 23, 24, 17, 18]. If this number is too small, reconstruction of the frames fails. If it is too large, reconstruction succeeds, but at the cost of spending unnecessary measurements in some or all frames. The work in [21, 22] addresses this problem with an online scheme that uses cross validation to compute the number of required measurements. Given a reconstructed foreground, [21, 22] estimates the area of the true foreground using extra cross-validation measurements. Then, assuming that foreground areas of two consecutive frames are the same, the phase diagram of the sensing matrix, which was computed beforehand, gives the number of measurements for the next frame. This approach, however, fails to use information from past frames in the reconstruction process, information that, as we will see, can be used to significantly reduce the number of measurements.

I-B Overview of our approach and contributions

Overview. Our approach to adaptive-rate signal reconstruction is based on the recent theoretical results of [25, 26], which characterize the performance of sparse reconstruction schemes in the presence of side information. The scheme we are most interested in is ℓ_1-ℓ_1 minimization:

(2)    minimize_x   ‖x‖_1 + β‖x − w‖_1    subject to    Ax = y

where x is the optimization variable and ‖·‖_1 is the ℓ_1-norm. In (2), y is a vector of measurements and β is a positive parameter. The vector w is assumed known and is the so-called prior or side information: a vector similar to the vector that we want to reconstruct, say . Note that if we set  in (2), we obtain basis pursuit [27], a well-known problem for reconstructing sparse signals which is at the core of the theory of compressed sensing [20, 19]. Problem (2) generalizes basis pursuit by integrating the side information w. The work in [25, 26] shows that, if w has reasonable quality and the entries of A are drawn from an i.i.d. Gaussian distribution, the number of measurements required by (2) to reconstruct  is much smaller than the number of measurements required by basis pursuit. Furthermore, the theory in [25, 26] establishes that β = 1 is an optimal choice, irrespective of any problem parameter. This makes the reconstruction problem (2) parameter-free.
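For illustration, here is a minimal sketch of the ℓ_1-ℓ_1 program (2) using the generic convex solver CVXPY; the function name and the default β = 1 (the value suggested by the theory above) are our choices, and this is not the implementation used by the authors.

```python
import cvxpy as cp

def l1_l1_reconstruct(A, y, w, beta=1.0, sigma=None):
    """Solve min ||x||_1 + beta*||x - w||_1 s.t. Ax = y,
    or, if sigma is given, s.t. ||Ax - y||_2 <= sigma (noisy case, cf. (9))."""
    x = cp.Variable(A.shape[1])
    objective = cp.Minimize(cp.norm1(x) + beta * cp.norm1(x - w))
    if sigma is None:
        constraints = [A @ x == y]                      # noiseless measurements
    else:
        constraints = [cp.norm(A @ x - y, 2) <= sigma]  # bounded measurement noise
    cp.Problem(objective, constraints).solve()
    return x.value
```

Setting w to the zero vector makes the objective a scaled ℓ_1-norm, so the program reduces to basis pursuit, consistent with the remark above.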

We address the problem of recursively reconstructing a sequence of sparse signals satisfying (1) as follows. Assuming the measurement matrix is Gaussian (although Gaussian matrices are hard to implement in practical systems, they have optimal performance; other, more practical matrices with similar performance exist, e.g., [28, 29]), we propose an algorithm that uses (2) with β = 1 to reconstruct each signal . And, building upon the results of [25, 26], we equip our algorithm with a mechanism to automatically compute an estimate of the number of required measurements. As an application, we consider compressive background subtraction and show how to generate side information from past frames.

Contributions. We summarize our contributions as follows:

  • We propose an adaptive-rate algorithm for reconstructing sparse sequences satisfying the model in (1).

  • We establish conditions under which our algorithm reconstructs a finite sparse sequence with large probability.

  • We describe how to apply the algorithm to compressive background subtraction problems, using motion-compensated extrapolation to predict the next image to be acquired. In other words, we show how to generate side information.

  • Given that images predicted by motion-compensated extrapolation are known to exhibit Laplacian noise, we then characterize the performance of (2) under this model.

  • Finally, we show the impressive performance of our algorithm for performing compressive background subtraction on a sequence of real images.

Besides the incorporation of a scheme to compute a minimal number of measurements on-the-fly, there is another aspect that makes our algorithm fundamentally different from prior work. As overviewed in Section II, most prior algorithms for reconstructing dynamical sparse signals work well only when the sparsity pattern of  varies slowly with time. Our algorithm, in contrast, operates well even when the sparsity pattern of  varies arbitrarily between consecutive time instants, as shown by our theory and experiments. What is required to vary slowly is the “quality” of the prediction given by each  (i.e., the quality of the side information) and, to a lesser extent, not the sparsity pattern of  but only its sparsity, i.e., the number of nonzero entries.

I-C Organization

Section II overviews related work. In Section III, we state the results from [25, 26] that are used by our algorithm. Section IV describes the algorithm and establishes reconstruction guarantees. Section V concerns the application to compressive background subtraction. Experimental results illustrating the performance of our algorithm are shown in Section VI, and Section VII concludes the paper. The appendix contains the proofs of our results.

II Related work

There is an extensive literature on reconstructing time-varying signals from limited measurements. Here, we provide an overview by referring to a few landmark papers.

The Kalman filter. The classical solution to estimate a sequence of signals satisfying (1) or, in the control terminology, the state of a dynamical system, is the Kalman filter [30]. The Kalman filter is an online algorithm that is least-squares optimal when the model is linear, i.e., , and the sequence  is Gaussian and independent across time. Several extensions are available when these assumptions do not hold [31, 32, 33]. The Kalman filter and its extensions, however, are inapplicable to our scenario, as they do not easily integrate the additional knowledge that the state is sparse.

Dynamical sparse signal reconstruction. Some prior work incorporates signal structure, such as sparsity, into online reconstruction procedures. For example, [34, 35] adapts a Kalman filter to estimate a sequence of sparse signals. Roughly, one keeps an estimate of the signal’s support at each time instant and uses the Kalman filter to compute the (nonzero) signal values. When a change in the support is detected, the estimate of the support is updated using compressed sensing techniques. The work in [34, 35], however, assumes that the support varies very slowly and does not provide any strategy to update (or compute) the number of measurements; indeed, the number of measurements is assumed constant along time. Related work that also assumes a fixed number of measurements includes [36], which uses approximate belief propagation, and [37], which integrates sparsity knowledge into a Kalman filter via a pseudo-measurement technique. The works in [38, 39] and [40] propose online algorithms, named GROUSE and PETRELS, respectively, for estimating signals that lie on a low-dimensional subspace. Their model can be seen as a particular case of (1), where each map  is linear and depends only on the previous signal. Akin to most prior work, both GROUSE and PETRELS assume that the rank of the underlying subspace (i.e., the sparsity of ) varies slowly with time, and fail to provide a scheme to compute the number of measurements.

We briefly overview the work in [41], which is probably the closest to ours. Three dynamical reconstruction schemes are studied in [41]. The one with the best performance is

(3)

where  and  is the Euclidean ℓ_2-norm. Problem (3) is the Lagrangian version of the problem we obtain by replacing the constraints of (2) with , where  is a bound on the measurement noise; see problem (9) below. For  in a given range, the solutions of (9) and (3) coincide. This is why the approach in [41] is so closely related to ours. Nevertheless, using (9) has two important advantages: first, in practice, it is easier to obtain bounds on the measurement noise  than it is to tune ; second, and more importantly, problem (9) has well-characterized reconstruction guarantees [25, 26]. It is exactly those guarantees that enable our scheme for computing the number of measurements online. The work in [41], as most prior work, assumes a fixed number of measurements for all signals.
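The relation between the constrained and penalized formulations invoked here is the standard one. Schematically, writing f(x) = ‖x‖_1 + β‖x − w‖_1 (notation ours, since the symbols are not reproduced above):

```latex
\min_{x}\; f(x) \ \ \text{s.t.}\ \ \|Ax - y\|_{2} \le \sigma
\qquad\text{and}\qquad
\min_{x}\; f(x) + \frac{\mu}{2}\,\|Ax - y\|_{2}^{2}
```

have, under mild conditions, the same solutions for a suitable multiplier μ = μ(σ) > 0, which is why tuning the penalty weight in (3) plays the role of choosing the noise bound in (9); the exact weighting used in [41] may differ.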

III Preliminaries: Static signal reconstruction using ℓ_1-ℓ_1 minimization

This section reviews some results from [25], namely reconstruction guarantees for (2) in a static scenario, i.e., when we estimate just one signal, not a sequence. As mentioned before, β = 1 is an optimal choice: it not only minimizes the bounds in [25], but also leads to the best results in practice. This is the reason why we use β = 1 henceforth.

ℓ_1-ℓ_1 minimization. Let  be a sparse vector, and assume we have  linear measurements of : , where . Denote the sparsity of  by , where  is the cardinality of a set. Assume we have access to a signal  similar to  (in the sense that  is small) and suppose we attempt to reconstruct  by solving the ℓ_1-ℓ_1 minimization problem (2) with β = 1:

(4)    minimize_x   ‖x‖_1 + ‖x − w‖_1    subject to    Ax = y

The number of measurements that problem (4) requires to reconstruct  is a function of the “quality” of the side information . Quality in [25] is measured in terms of the following parameters:

(5a)
(5b)

Note that the components of  that contribute to  are the ones on the support of ; thus, .

Theorem 1 (Th. 1 in [25]).

Let  be the vector to reconstruct and  the side information, respectively. Assume  and that there exists at least one index  for which . Let the entries of  be i.i.d. Gaussian with zero mean and variance . If

(6)

then, with probability at least , is the unique solution of (4).

Theorem 1 establishes that if the number of measurements is larger than (6) then, with high probability, (4) reconstructs  perfectly. The bound in (6) is a function of the signal dimension  and sparsity , and of the quantities  and , which depend on the signs of the entries of  and , but not on their magnitudes. When  approximates  reasonably well, the bound in (6) is much smaller than the bound for basis pursuit in [42] (recall that basis pursuit is (2) with ):

(7)

Namely, [42] establishes that if (7) holds and if  has i.i.d. Gaussian entries with zero mean and variance  then, with probability similar to the one in Theorem 1, is the unique solution to basis pursuit. Indeed, if  and  is larger than a small negative constant, then (6) is much smaller than (7). Note that, in practice, the quantities , , and  are unknown, since they depend on the unknown signal . In the next section, we propose an online scheme to estimate them using past signals.

Noisy case. Theorem 1 has a counterpart for noisy measurements, which we state informally; see [25] for details. Let , where . Let also  be as in Theorem 1 with

(8)

where . Let  be any solution of

(9)    minimize_x   ‖x‖_1 + ‖x − w‖_1    subject to    ‖Ax − y‖_2 ≤ σ

Then, with overwhelming probability, , i.e., (9) reconstructs  stably. Our algorithm, described in the next section, adapts easily to the noisy scenario, but we provide reconstruction guarantees only for the noiseless case.

1:, a positive sequence , and estimates and of the sparsity of and , respectively.
2:
3:
4:for the first two time instants  do
5:     Set
6:     Generate Gaussian matrix
7:     Acquire  measurements of :
8:     Find  such that
9:end for
10:Set  and compute
11:Set
12:Set
13:
14:
15:for each time instant  do
16:     Set
17:     Generate Gaussian matrix
18:     Acquire measurements of :
19:     Set and find such that
20:     Compute
21:     Set
22:     Update
23:end for
Algorithm 1 Adaptive-Rate Sparse Signal Reconstruction

IV Online sparse signal estimation

Algorithm 1 describes our online scheme for reconstructing a sparse sequence  satisfying (1). Although described for a noiseless measurement scenario, the algorithm adapts to the noisy scenario in a straightforward way, as discussed later. Such an adaptation is essential when using it on a real system, e.g., a single-pixel camera [43].

IV-A Algorithm description

The algorithm consists of two parts: the initialization, where the first two signals  and  are reconstructed using basis pursuit, and the online estimation, where the remaining signals are reconstructed using ℓ_1-ℓ_1 minimization.

Part I: Initialization. In steps 4-9, we compute the number of measurements  and  according to the bound in (7), and then reconstruct  and  via basis pursuit. The expressions for  and  in step 5 require estimates  and  of the sparsity of  and , which are given as input to the algorithm. Henceforth, variables with hats refer to estimates. Steps 10-12 initialize the estimator : during Part II of the algorithm, should approximate the right-hand side of (6) for , i.e., with , , and , where the subscript  indicates that these are parameters associated with .

Part II: Online estimation. The loop in Part II starts by computing the number of measurements as , where , an input to the algorithm, is a (positive) safeguard parameter. We take more measurements from  than the ones prescribed by , because  is only an approximation to the bound in (6), as explained next. After acquiring measurements of , we reconstruct it as  via ℓ_1-ℓ_1 minimization with β = 1 (step 19). Next, in step 20, we compute the sparsity  and the quantities in (5),  and , for . If the reconstruction of  is perfect, i.e., , then all these quantities match their true values. In that case,  in step 21 will also match the true value of the bound in (6). Note, however, that the bound for , , is computed only after  is reconstructed. Consequently, the number of measurements used in the acquisition of , , is a function of the bound (6) for . Since the bounds for  and  might differ, we take more measurements than the ones specified by  by a factor , as in step 16. Also, we mitigate the effect of failed reconstructions by filtering  with an exponential moving average filter, in step 22. Indeed, if reconstruction fails for some , the resulting  might differ significantly from the true bound in (6). The role of the filter is to smooth out such variations.
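The following Python sketch mirrors Part II of Algorithm 1 in simplified form. The helpers acquire, predict, and reconstruct are placeholders (reconstruct could be the CVXPY routine sketched in Section III), bound_estimate is a crude stand-in for the right-hand side of (6), and the oversampling factor and filter weight values are purely illustrative.

```python
import numpy as np

def bound_estimate(x_hat, w, n):
    # Placeholder for the right-hand side of (6): the true expression depends on
    # the sparsity of x_hat and on the quality parameters (5) of the side
    # information w, which are not reproduced here.
    s = max(int(np.count_nonzero(np.abs(x_hat) > 1e-6)), 1)
    return 2.0 * s * np.log(n / s) + 1.4 * s + 1.0      # crude proxy, not (6)

def adaptive_rate_loop(acquire, predict, reconstruct, n, phi_hat, K,
                       delta=0.2, alpha=0.5, seed=0):
    """Sketch of Part II of Algorithm 1 (noiseless case).

    acquire(k, A_k)      -> y_k = A_k @ x_k (from a sensor or a simulation)
    predict(history)     -> side information w_k built from past reconstructions
    reconstruct(A, y, w) -> solution of the l1-l1 problem (4)
    phi_hat              -> bound estimate produced by Part I (steps 10-12)
    """
    rng = np.random.default_rng(seed)
    history = []
    for k in range(3, K + 1):
        m_k = int(np.ceil((1 + delta) * phi_hat))             # step 16: oversample
        A_k = rng.standard_normal((m_k, n)) / np.sqrt(m_k)    # step 17: Gaussian matrix
        y_k = acquire(k, A_k)                                 # step 18: measurements
        w_k = predict(history)                                # side information
        x_hat = reconstruct(A_k, y_k, w_k)                    # step 19: l1-l1 minimization
        phi_k = bound_estimate(x_hat, w_k, n)                 # steps 20-21
        phi_hat = alpha * phi_hat + (1 - alpha) * phi_k       # step 22: moving average
        history.append(x_hat)
    return history
```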

Extension to the noisy case. Algorithm 1 can be easily extended to the scenario where the acquisition process is noisy, i.e., . Assume that  is arbitrary noise, but has bounded magnitude, i.e., we know  such that . In that case, the constraint in the reconstruction problems in steps 8 and 19 should be replaced by . The other modification is in steps 11 and 21, whose expressions for  are multiplied by  as in (8). Our reconstruction guarantees, however, hold only for the noiseless case.

Remarks. We will see in the next section that Algorithm 1 works well when each  is chosen according to the prediction quality of : the worse the prediction quality, the larger  should be. In practice, it may be more convenient to make  constant, as we do in our experiments in Section VI. Note that the conditions under which our algorithm performs well differ from the majority of prior work. For example, the algorithms in [34, 35, 7, 36, 21, 37, 44, 40, 38, 39, 22] work well when the sparsity pattern of  varies slowly between consecutive time instants. Our algorithm, in contrast, works well when the quality parameters  and  of the side information and also the sparsity  of  vary slowly; in other words, when the quality of the prediction of  varies slowly.

IV-B Reconstruction guarantees

The following result bounds the probability with which Algorithm 1 with  perfectly reconstructs a finite-length sequence . The idea is to rewrite the condition that (6) applied to  is times larger than (6) applied to . If that condition holds for the entire sequence then, using Theorem 1 and assuming that the matrices  are drawn independently, we can bound the probability of successful reconstruction. The proof is in Appendix A.

Lemma 2.

Let , , and fix . Let also, for all ,

(10)

where . Assume , for , i.e., that the initial sparsity estimates  and  are not smaller than the true sparsity of  and . Assume also that the matrices  in Algorithm 1 are drawn independently. Then, the probability (over the sequence of matrices ) that Algorithm 1 reconstructs  perfectly in all time instants is at least

(11)

When the conditions of Lemma 2 hold, the probability of perfect reconstruction decreases with the length  of the sequence, albeit at a very slow rate: for example, if  is as small as , then (11) equals  for , and  for . If  is larger, these numbers are even closer to .

Interpretation of (10). As shown in the proof, condition (10) is equivalent to , where  is (6) applied to . To get more insight about this condition, rewrite it as

(12)

where

Suppose  and  are signals for which . In that case, , and condition (12) tells us that the oversampling factor  should be larger than the relative variation of  from time  to time . In general, the magnitude of  and  can be significant, since they approach zero at a relatively slow rate, . Hence, those terms should not be ignored.

Remarks on the noisy case. There is an inherent difficulty in establishing a counterpart of Lemma 2 for the noisy measurement scenario: namely, the quality parameters  and  in (5) are not continuous functions of . So, no matter how close a reconstructed signal is to the original one, their quality parameters can differ arbitrarily. And, for the noisy measurement case, we can never guarantee that the reconstructed and the original signals are equal; at most, if (8) holds, they are within a distance , for .

So far, we have considered  and  to be deterministic sequences. In the next section, we will model  (and thus ) as a Laplacian stochastic process.

V Compressive Video Background Subtraction

We now consider the application of our algorithm to compressive video background subtraction. We start by modeling the problem of compressive background subtraction as the estimation of a sequence of sparse signals satisfying (1). Our background subtraction system, based on Algorithm 1, is then introduced. Finally, we establish reconstruction guarantees for our scheme when  in (1a) is Laplacian noise.

V-A Model

Let  be a sequence of images, each with resolution , and let  with  be the (column-major) vectorization of the th image. At time instant , we collect  linear measurements of : , where  is a measurement matrix. We decompose each image  as , where  is the th foreground image, typically sparse, and  is the background image, assumed known and the same in all images. Let  and  be vectorizations of  and , respectively. Because the background image is known, we take measurements of it using : . Then, as suggested in [7], we subtract  from :

(13)

This equation tells us that, although we cannot measure the foreground image  directly, we can still construct a vector of measurements of it, , as if we could. Given that  is usually sparse, the theory of compressed sensing tells us that it can be reconstructed by solving, for example, basis pursuit [20, 19]. Specifically, if  has sparsity  and the entries of  are realizations of i.i.d. zero-mean Gaussian random variables with variance , then  measurements suffice to reconstruct  perfectly [42] [cf. (7)].

Notice that (13) is exactly the measurement equation in (1b). Regarding equation (1a), we will use it to model the estimation of the foreground of each frame, , from previous foregrounds, . We use a motion-compensated extrapolation technique, as explained in Subsection V-C. This technique is known to produce image estimates with an error well modeled as Laplacian and, thus, each  is expected to be small. This aligns perfectly with the way we integrate side information in our reconstruction scheme: namely, the second term in the objective of the optimization problem in step 19 of Algorithm 1 is nothing but .
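As a small illustration of (13), the sketch below forms foreground measurements from the frame measurements and the known background, and then reconstructs the foreground with the hypothetical l1_l1_reconstruct helper from Section III; the variable names are our own.

```python
import numpy as np

def foreground_measurements(A_k, frame_vec, background_vec):
    """Measurements of the sparse foreground, obtained without observing it:
    A_k @ frame - A_k @ background = A_k @ foreground, cf. (13)."""
    z_k = A_k @ frame_vec          # compressive measurements of the current frame
    y_b = A_k @ background_vec     # measurements of the known background, same matrix
    return z_k - y_b

# Usage sketch (w_k is the side information from motion-compensated extrapolation):
#   y_k   = foreground_measurements(A_k, x_k, b)
#   f_hat = l1_l1_reconstruct(A_k, y_k, w_k)   # sparse foreground estimate
#   x_hat = f_hat + b                          # reconstructed frame
```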

V-B Our background subtraction scheme

Fig. 1 shows the block diagram of our compressive background subtraction scheme and, essentially, translates Algorithm 1 into a diagram. The scheme does not apply to the reconstruction of the first two frames, which are reconstructed as in [7], i.e., by solving basis pursuit; this corresponds to Part I of Algorithm 1. The scheme in Fig. 1 depicts Part II of Algorithm 1. The motion extrapolation module constructs a motion-compensated prediction  of the current frame, , by using the two past (reconstructed) frames,  and . Motion estimation is performed in the image domain () rather than in the foreground domain (), as the former contains more texture, thereby yielding a more accurate motion field. Next, the background frame  is subtracted from  to obtain a prediction of the foreground , i.e., the side information . These two operations are modeled in Algorithm 1 with the function , which takes a set of past reconstructed signals (in our case,  and , to which we add , obtaining  and , respectively), and outputs the side information . This is one of the inputs of the ℓ_1-ℓ_1 block, which solves the optimization problem (4). To obtain the other input, i.e., the set of foreground measurements , we proceed as specified in equation (13): we take measurements  of the current frame and, using the same matrix, we take measurements of the background . Subtracting them, we obtain . The output of the ℓ_1-ℓ_1 module is the estimated foreground , from which we obtain the estimate of the current frame as .


Figure 1: Block diagram of Algorithm 1 when applied to background subtraction. The main blocks are highlighted.

V-C Motion-compensated extrapolation

To obtain an accurate prediction , we use a motion-compensated extrapolation technique similar to what is used in distributed video coding for generating decoder-based motion-compensated predictions [45, 46, 47]. Our technique is illustrated in Fig. 2. In the first stage, we perform forward block-based motion estimation between the reconstructed frames  and . The block matching algorithm is performed with half-pel accuracy and considers a block size of  pixels and a search range of  pixels. The required interpolation for half-pel motion estimation is performed using the 6-tap filter of H.264/AVC [48]. In addition, we use the ℓ_1-norm (or sum of absolute differences, SAD) as error metric. The resulting motion vectors are then spatially smoothed by applying a weighted vector-median filter [49]. The filtering improves the spatial coherence of the resulting motion field by removing outliers (i.e., motion vectors that are far from the true motion field). Assuming linear motion between  and , and  and , we linearly project the motion vectors between  and  to obtain , our estimate of ; see Fig. 2. During motion compensation, pixels in  that belong to overlapping prediction blocks are estimated as the average of their corresponding motion-compensated pixel predictors in . Pixels in uncovered areas (i.e., for which no motion-compensated predictor is available) are estimated by averaging the three neighboring pixel values in  (up, left, and up-left positions, following a raster scan of the frame) and the corresponding pixel in .
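A much-simplified sketch of the extrapolation step is given below: integer-pel block matching with an SAD criterion and a linear projection of the motion vectors. Half-pel refinement, the H.264/AVC interpolation filter, the weighted vector-median smoothing, and the averaging rules for overlapped and uncovered pixels described above are all omitted; the block size and search range are arbitrary choices.

```python
import numpy as np

def motion_extrapolate(prev2, prev1, block=8, search=8):
    """Simplified motion-compensated extrapolation (integer-pel, SAD matching).

    Estimates the motion from prev2 (frame k-2) to prev1 (frame k-1) block by
    block, then linearly projects it one step forward to predict frame k."""
    H, W = prev1.shape
    pred = np.array(prev1, dtype=float, copy=True)   # fallback: previous frame
    for r in range(0, H - block + 1, block):
        for c in range(0, W - block + 1, block):
            ref = prev1[r:r + block, c:c + block].astype(float)
            best, best_vec = np.inf, (0, 0)
            for dr in range(-search, search + 1):
                for dc in range(-search, search + 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr <= H - block and 0 <= cc <= W - block:
                        cand = prev2[rr:rr + block, cc:cc + block].astype(float)
                        sad = np.abs(ref - cand).sum()     # L1 (SAD) error metric
                        if sad < best:
                            best, best_vec = sad, (dr, dc)
            # Best match in prev2 sits at offset best_vec, so the motion from
            # k-2 to k-1 is -best_vec; project it linearly to frame k.
            mr, mc = -best_vec[0], -best_vec[1]
            tr = min(max(r + mr, 0), H - block)
            tc = min(max(c + mc, 0), W - block)
            pred[tr:tr + block, tc:tc + block] = ref
    return pred
```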


Figure 2: Scheme of motion-compensated extrapolation. We use the motion between matching blocks in and to create an estimate  of frame .

V-D Reconstruction guarantees for Laplacian modeling noise

It is well known that the noise produced by a motion-compensated prediction module, such as the one just described, is Laplacian [45, 50]. In our model, that corresponds to each  in (1a) being Laplacian. We assume each  is independent of the matrix of measurements .

Model for . As in [45, 50, 51] (and references therein), we assume that  is independent from , for , and that the entries of each  are independent and have zero mean. The probability distribution of  is then

(14)

where , and  is the parameter of the distribution of . The entries of , although independent, are not identically distributed, since they have possibly different parameters . The variance  of each component  is given by .
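For reference, the standard zero-mean Laplacian density and its variance, which is presumably the form taken by (14) with per-component parameters (the notation λ_{k,i} and ε_{k,i} is ours):

```latex
p\bigl(\epsilon_{k,i}\bigr) \;=\; \frac{\lambda_{k,i}}{2}\,
  \exp\!\bigl(-\lambda_{k,i}\,\lvert\epsilon_{k,i}\rvert\bigr),
\qquad
\operatorname{Var}\bigl(\epsilon_{k,i}\bigr) \;=\; \frac{2}{\lambda_{k,i}^{2}} .
```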

Resulting model for . The sequence  being stochastic implies that  is also stochastic. Indeed, if each  in (1) is measurable, then  is a sequence of random variables. Given the independence across time and across components of the sequence , the distribution of  given  is also Laplacian, yet not necessarily with zero-mean. That is, for  and ,

(15)

where  is the th component of . In words, the distribution of each component of  conditioned on all past realizations , , is Laplacian with mean  and parameter . Furthermore, it is independent from the other components.

Reconstruction guarantees. Note that  and being stochastic processes implies that the quantities in (5), which we will denote with  and  for signal , are random variables. Hence, at each time , the conditions of Theorem 1, namely that and that there is at least one index  such that , become events, and may or may not hold. We now impose conditions on the variances  that guarantee the conditions of Theorem 1 are satisfied and, thus, that - minimization reconstructs  perfectly, with high probability. Given a set , we use  to denote its complement in .

Theorem 3.

Let  be given. Let  have distribution (14), where the variance of component  is . Define , and the sets and . Assume , that is, there exists  such that  and . Assume  is generated as in Theorem 1 with a number of measurements

(16)

for some , where . Let  denote the solution of the ℓ_1-ℓ_1 minimization problem (4). Then,

(17)

The proof is in Appendix B. By assuming each component  is Laplacian with parameter  (independent from the other components), Theorem 3 establishes a lower bound on the number of measurements that guarantee perfect reconstruction of  with probability as in (17). Note that all the quantities in (16) are deterministic. This contrasts with the direct application of Theorem 1 to the problem, since the right-hand side of (6) is a function of the random variables , , and . The assumption implies , which means that some components of  have zero variance and, hence, are equal to zero with probability . Note that, provided the variances  are known, all the quantities in (16), and consequently in (17), are known.

The proof of Theorem 3 uses the fact that the sparsity of  is with probability . This implies that the bound in (16) is always smaller than the one for basis pursuit in (7) whenever . Since , this holds if .

We state without proof a consequence of Theorem 3 that is obtained by reasoning as in Lemma 2:

Corollary 4.

Let  be a stochastic process where  has distribution (14) and each  is independent from , . Assume that  is generated as in (1a) and consider Algorithm 1 with  at iteration . Assume that  and  are independent. Assume also, for , that

(18)

where , and the quantities ,