Exact Reconstruction Conditions for Regularized Modified Basis Pursuit

08/16/2011 ∙ by Wei Lu, et al. ∙ Iowa State University of Science and Technology

In this correspondence, we obtain exact recovery conditions for regularized modified basis pursuit (reg-mod-BP) and discuss when the obtained conditions are weaker than those for modified-CS or for basis pursuit (BP). The discussion is also supported by simulation comparisons. Reg-mod-BP provides a solution to the sparse recovery problem when both an erroneous estimate of the signal's support, denoted by T, and an erroneous estimate of the signal values on T are available.


I Introduction

In this work, we obtain sufficient conditions for exact recovery of regularized modified basis pursuit (reg-mod-BP) and discuss when the obtained conditions are weaker than those for modified compressive sensing [2] or for basis pursuit (BP) [3, 4]. Reg-mod-BP was briefly introduced in our earlier work [2] as a solution to the sparse recovery problem when both an erroneous estimate of the signal's support, denoted by T, and an erroneous estimate of the signal values on T, denoted by μ̂_T, are available. The problem is precisely defined in Sec. I-A. Reg-mod-BP, given in (11), tries to find a vector that is sparsest outside the set T, among all solutions that are close enough to μ̂_T on T and satisfy the data constraint. In practical applications, T and μ̂_T may be available from prior knowledge. Alternatively, in recursive reconstruction applications, e.g. recursive dynamic MRI [5, 2], recursive compressive sensing (CS) based video compression [6, 7], or recursive projected CS (ReProCS) [8, 9] based video layering, one can use the support and signal estimate from the previous time instant for this purpose.

Basis pursuit (BP) was introduced in [3] as a practical (polynomial complexity) solution to the problem of reconstructing an m-length sparse vector, x, with support denoted by N, from an n-length measurements' vector, y := Ax, when n < m. BP solves the following convex (actually linear) program:

(1)   min_b ||b||_1 subject to y = A b
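As a concrete illustration (not from the paper), (1) can be handed directly to a generic convex solver. The sketch below uses Python with cvxpy (the paper's own experiments used the CVX package in MATLAB); the sizes and the random test signal are our own choices:

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
m, n, s = 256, 100, 10                    # signal length, measurements, support size

# Sparse test signal and a column-normalized random Gaussian matrix.
x = np.zeros(m)
support = rng.choice(m, size=s, replace=False)
x[support] = rng.choice([-1.0, 1.0], size=s)
A = rng.standard_normal((n, m))
A /= np.linalg.norm(A, axis=0)
y = A @ x

# BP, eq. (1): minimize ||b||_1 subject to y = A b.
b = cp.Variable(m)
cp.Problem(cp.Minimize(cp.norm1(b)), [A @ b == y]).solve()
print("BP recovery error:", np.linalg.norm(b.value - x))
```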

The recent CS literature has provided strong exact recovery results for BP that are either based on the restricted isometry property (RIP) [4, 10] or that use the geometry of convex polytopes to obtain "exact recovery thresholds" on the number of measurements, n, needed for exact recovery with high probability [11, 12]. BP is often just referred to as CS in recent works, and our work also occasionally does this.

In recent work [2], we introduced the problem of sparse reconstruction with partial and partly erroneous support knowledge, denoted by T, and proposed a solution called modified compressive sensing (mod-CS). We obtained exact reconstruction conditions for mod-CS and showed when they are weaker than those for BP. Mod-CS tries to find the solution that is sparsest outside the set T among all solutions of y = Ab, i.e. it solves

(2)   min_b ||b_{T^c}||_1 subject to y = A b

Ideally the above should be referred to as mod-BP but, since we used the term mod-CS when we introduced it, we retain that name here. Similar problems were also studied in parallel work by von Borries et al. [13] and Khajehnejad et al. [14]. In [14], the authors assumed a probabilistic prior on the support, solved the following weighted ℓ1 problem and obtained exact recovery thresholds similar to those in [12]:

(3)   min_b γ ||b_T||_1 + ||b_{T^c}||_1 subject to y = A b
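Continuing the sketch above (same A, y, x and support), mod-CS (2) and the weighted ℓ1 program (3) differ from BP only in the objective; T (a partly correct support estimate) and the weight gamma are illustrative choices of ours:

```python
# Mod-CS, eq. (2): no cost on b_T, l1 cost outside T.
T = support[:8]                           # pretend we know 8 of the 10 true indices
Tc = np.setdiff1d(np.arange(m), T)
b = cp.Variable(m)
cp.Problem(cp.Minimize(cp.norm1(b[Tc])), [A @ b == y]).solve()

# Weighted l1, eq. (3): a small weight gamma on T instead of zero weight.
gamma = 0.1
b2 = cp.Variable(m)
cp.Problem(cp.Minimize(cp.norm1(b2[Tc]) + gamma * cp.norm1(b2[T])),
           [A @ b2 == y]).solve()
```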

In another related work [15], Wang et al. showed how to iteratively improve recovery of a single signal by solving BP in the first iteration, obtaining a support estimate, solving (2) with this support estimate and repeating this. They also obtained exact recovery guarantees for a single iteration.

Another related idea is CS-diff or CS-residual, which recovers the residual signal x − μ̂ by solving (1) with y replaced by y − Aμ̂. This is related to our earlier least squares CS-residual (LS-CS) and Kalman filtered CS (KF-CS) ideas [5, 16]. However, as explained in [2], the residual signals used by all these methods have a support size that is equal to or slightly larger than that of x (except if μ̂_T = x_T exactly). As a result, these do not achieve exact recovery with fewer measurements. The limitations of some other variants of this idea are also discussed in detail in [17]. Reg-mod-BP may also be interpreted as a Bayesian or a model-based CS approach. Recent work in this area includes [18, 19, 20].

This paper is organized as follows. We introduce reg-mod-BP in Sec. II. In Sec. III, we obtain the exact reconstruction result, discuss its implications and give the key lemmas leading to its proof. Simulation comparisons are given in Sec. IV and conclusions in Sec. V.

I-A Notation and Problem Definition

For a set T, T^c denotes its complement and ∅ denotes the empty set. We use |T| to denote the cardinality of a set T; the same notation is also used for the absolute value of a scalar. The meaning is clear from context.

For a vector v, (v)_T, or just v_T, denotes a sub-vector containing the elements of v with indices in T. ||v||_k means the ℓk norm of the vector v. The notation v ⪰ 0 (v ≻ 0) means that each element of the vector v is greater than or equal to (strictly greater than) zero. Similarly, v ⪯ 0 (v ≺ 0) means each element is less than or equal to (strictly less than) zero. We define the sign pattern, sgn(v), as:

(4)   sgn(v)_i := v_i / |v_i| if v_i ≠ 0, and sgn(v)_i := 0 otherwise

We use ' for matrix transpose. For a matrix A, A_T denotes the sub-matrix containing the columns of A with indices in T. Also, ||A|| is the induced 2-norm of A.

Our goal is to solve the sparse reconstruction problem, i.e. reconstruct an m-length sparse vector, x, with support, N, from an n-length measurement vector,

(5)   y := A x

when an erroneous estimate of the signal's support, denoted by T, and an erroneous estimate of the signal values on T, denoted by μ̂_T, are available. The support estimate, T, can be rewritten as

(6)   T = (N ∪ Δ_e) ∖ Δ

where Δ := N ∖ T and Δ_e := T ∖ N are the errors (Δ contains the misses while Δ_e contains the extras) in the support estimate.

The signal value estimate, μ̂, is assumed to be zero along T^c, i.e. μ̂_{T^c} = 0, and it satisfies

(7)   ||(x − μ̂)_T||_∞ ≤ ρ

The restricted isometry constant (RIC) [4], δ_s, for a matrix A, is defined as the smallest positive real number satisfying (1 − δ_s) ||c||² ≤ ||A_S c||² ≤ (1 + δ_s) ||c||² for all subsets S of cardinality |S| ≤ s and all real vectors c of length |S|. The restricted orthogonality constant (ROC) [4], θ_{s,ṡ}, is defined as the smallest positive real number satisfying |c₁' A_{S₁}' A_{S₂} c₂| ≤ θ_{s,ṡ} ||c₁|| ||c₂|| for all disjoint sets S₁, S₂ with |S₁| ≤ s and |S₂| ≤ ṡ, and for all vectors c₁, c₂ of length |S₁|, |S₂|, respectively. Both δ_s and θ_{s,ṡ} are non-decreasing functions of s and of s, ṡ, respectively [4].
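Since δ_s has no closed form, it is usually estimated by brute force over all size-s column subsets, which is exponential in s (a point Remark 4 returns to). A minimal numpy sketch for a small matrix, assuming columns indexed from 0:

```python
from itertools import combinations
import numpy as np

def ric(A: np.ndarray, s: int) -> float:
    """Brute-force RIC delta_s: largest deviation from 1 of the eigenvalues
    of A_S' A_S over all column subsets S of size s."""
    delta = 0.0
    for S in combinations(range(A.shape[1]), s):
        eig = np.linalg.eigvalsh(A[:, S].T @ A[:, S])   # ascending eigenvalues
        delta = max(delta, abs(eig[0] - 1.0), abs(eig[-1] - 1.0))
    return delta
```

By eigenvalue interlacing, subsets of size smaller than s cannot give a larger deviation, so enumerating only size-s subsets suffices.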

We will frequently use the following functions of the RIC and ROC of A in Sec. III:

(8)
(9)

For the matrix A, and for any set T̃ for which A_{T̃} is full rank, we define the matrix A_{T̃}† as

(10)   A_{T̃}† := (A_{T̃}' A_{T̃})^{-1} A_{T̃}'
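(10) is just the Moore-Penrose pseudo-inverse of the column submatrix; a small numpy check (the function name is ours):

```python
import numpy as np

def sub_pinv(A: np.ndarray, T: np.ndarray) -> np.ndarray:
    """A_T^dagger = (A_T' A_T)^{-1} A_T', eq. (10); needs A_T full column rank."""
    AT = A[:, T]
    return np.linalg.solve(AT.T @ AT, AT.T)   # agrees with np.linalg.pinv(AT)
```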

II Regularized Modified Basis Pursuit

Mod-CS, given in (2), puts no cost on b_T and no explicit constraint on it except y = Ab. Thus, when very few measurements are available, b_T can become larger than required in order to satisfy y = Ab with the smallest ||b_{T^c}||_1. A similar, though smaller, bias will also occur with (3) when γ < 1. However, if a signal value estimate on T, μ̂_T, is also available, one can use that to constrain b_T. One way to do this, as suggested in [2], is to add a multiple of ||b_T − μ̂_T||² to the mod-CS cost. However, as we saw from simulations, while this does achieve lower reconstruction error, it cannot achieve exact recovery with fewer measurements (smaller n) than mod-CS [2]. The reason is that it puts a cost on the entire distance from μ̂_T and so encourages elements on the extras set, Δ_e, to be closer to μ̂_{Δ_e}, which is nonzero.

On the other hand, if we instead use the ℓ∞ distance from μ̂_T, and add it as a constraint, then, at least in certain situations, we can achieve exact recovery with a smaller n than mod-CS. Thus, we study

(11)   min_b ||b_{T^c}||_1 subject to y = A b and ||b_T − μ̂_T||_∞ ≤ ρ

and call it reg-mod-BP. We see from simulations that, whenever one or more of the inequality constraints is active at x, i.e. |x_i − μ̂_i| = ρ for some i ∈ T, (11) does achieve exact recovery with fewer measurements than mod-CS. We use this observation to derive a better exact recovery result below¹.

¹One can also try to constrain the ℓ2 distance instead of the ℓ∞ distance. When the ℓ2 constraint is active, one should again need a smaller n for exact recovery. When we check this via simulations, this does happen, but since there is at most one active constraint, the reduction in the required n is small compared to what is achieved by (11) and hence we do not study this further.
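In the same cvxpy style as the earlier sketches (reusing A, y, x, T, Tc, rng from above), (11) adds a single ℓ∞ ball constraint on b_T; mu_hat_T and rho below are illustrative stand-ins for μ̂_T and ρ:

```python
# Reg-mod-BP, eq. (11): mod-CS plus an l_infinity constraint around mu_hat on T.
rho = 0.1
mu_hat_T = x[T] + rng.uniform(-rho, rho, size=len(T))   # noisy value estimate on T

b = cp.Variable(m)
cp.Problem(cp.Minimize(cp.norm1(b[Tc])),
           [A @ b == y,
            cp.norm_inf(b[T] - mu_hat_T) <= rho]).solve()
```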

III Exact Reconstruction Conditions

In this section, we obtain exact reconstruction conditions for reg-mod-BP by exploiting the above fact. We give the result and discuss its implications in Sec. III-A. The key lemmas leading to its proof are given in Sec. III-B and the proof outline in Sec. III-C.

III-A Exact Reconstruction Result

Let us begin by defining the two types of active sets (sets of indices for which the inequality constraint is active), T_{a,+} and T_{a,−}, and the inactive set, T_{in}, as follows:

(12)   T_{a,+} := {i ∈ T : (x − μ̂)_i = ρ},  T_{a,−} := {i ∈ T : (x − μ̂)_i = −ρ},  T_{in} := {i ∈ T : |(x − μ̂)_i| < ρ}

In the result below, we try to find sets S_+ ⊆ T_{a,+} and S_− ⊆ T_{a,−} so that |S_+| + |S_−| is maximized while S_+ and S_− satisfy certain constraints. We call these the "good" sets. We define the "bad" subset of T as T_b := T ∖ (S_+ ∪ S_−). As we will see, the smaller the size of this bad set, the weaker are our exact recovery conditions.

Theorem 1 (Exact Recovery Conditions)

Consider recovering a sparse vector, x, with support N, from y := Ax by solving (11). The support estimate, T, and the misses and extras in it, Δ, Δ_e, satisfy (6). The signal estimate, μ̂, satisfies (7), i.e. ||(x − μ̂)_T||_∞ ≤ ρ. Define the sizes of the good sets S_+ and S_− as

(13)   k_+ := |S_+|,  k_− := |S_−|

The true is the unique minimizer of (11) if

  1. , and

  2. where

    A_{T̃}† is specified in (10), the functions used are defined in (8) and (9), and the good sets S_+, S_− are defined in (12).

Notice that the left hand side of condition 2 is a non-decreasing function of |T_b|. Thus, finding the largest possible sets S_+ and S_− ensures that the condition is the weakest. The reason for defining S_+ and S_− in the above fashion will become clear in the proof of Lemma 2.

Notice also that the first condition of the above result ensures that A_T' A_T is positive definite and thus invertible, so that A_T† is always well defined. Since the RIC and ROC are non-decreasing functions of the set sizes, the first condition also ensures that the corresponding bounds hold for all smaller sets.

Remark 1 (Applicability)

A practical case where some of the inequality constraints will be active with nonzero probability is when dealing with quantized signals and quantized signal estimates. If the range of values that the signal estimate can take given the signal (or vice versa) is known, the smallest valid choice of ρ is easily computed. We show some examples in Sec. IV. In general, even if just the range of values both can take is known, we can compute ρ. The fewer the number of values that (x − μ̂)_i can take, the larger will be the expected size of the active sets. Also, condition 2 of the theorem will then hold for non-empty good sets with nonzero probability.
Some real applications where quantized signals and signal estimates occur are recursive CS based video compression [6, 7] (the original video itself is quantized) or recursive projected CS (ReProCS) [8, 9] based extraction of moving or deforming foreground objects (e.g. a person moving towards a camera) from very large but correlated noise (e.g. very similar looking but slowly changing backgrounds), particularly when the videos are coarsely quantized (low bit rate). A common example where low bit rate videos occur is mobile telephony. In any of these applications, if we know a bound on the maximum change of the sparse signal's value from one time instant to the next, that bound can serve as ρ.

Remark 2 (Comparison with BP, mod-CS, other results)

The worst case for Theorem 1 is when both the good sets S_+ and S_− are empty, either because no constraint is active (T_{a,+} and T_{a,−} are both empty) or because condition 2 does not hold for any pair of subsets of T_{a,+} and T_{a,−}. In this case, T_b = T and so the required sufficient conditions are the same as those of mod-CS [2, Theorem 1]. A small extra requirement is that μ̂ satisfies (7). Thus, in the worst case, Theorem 1 holds under the same conditions on A (needs the same number of measurements) as mod-CS [2]. In [2], we have already argued that the mod-CS result holds under weaker conditions than the results for BP [4, 10] as long as the sizes of the support errors, |Δ| and |Δ_e|, are small compared to the support size, |N|, and hence the same can be said about Theorem 1. For example, we argued that when the support errors are a small fraction of the support size (numbers taken from a recursive dynamic MRI application), the mod-CS conditions are weaker than those of BP. Small |Δ| and |Δ_e| is a valid assumption in recursive recovery applications like recursive dynamic MRI, recursive CS based video compression, or ReProCS based foreground extraction from large but correlated background noise.
Moreover, if some inequality constraints are active and condition 2 holds, as in the case of quantized signals and signal estimates, Theorem 1 holds under weaker conditions on A than the mod-CS result.
As noted by an anonymous reviewer, our exact recovery conditions require knowledge of the signal itself (through the active sets). However, this is an issue with many results in sparse recovery, e.g. [21], and especially those that use more prior knowledge, e.g. [18].

Remark 3 (Small reconstruction error)

The reconstruction error of reg-mod-BP is significantly smaller than that of mod-CS, weighted ℓ1 or BP, even when none of the constraints is active, as long as ρ is small (see Table III). On the other hand, the exact recovery conditions do not depend on the value of ρ, but only on the sizes of the good subsets of the active sets. This is also observed in our simulations. In Table III, we show results for one value of ρ. Even when we tried larger values of ρ, the exact reconstruction probability and the smallest n needed for exact reconstruction remained the same, but the reconstruction error increased.

Remark 4 (Computation complexity)

Finding the best S_+ and S_− requires that one check all possible subsets of T_{a,+} and T_{a,−} and find the pair with the largest sum of sizes that satisfies condition 2. To do this, one would start with S_+ = T_{a,+}, S_− = T_{a,−}; compute the relevant quantities and check if condition 2 holds; if it does not, remove one element from S_+ and check condition 2; then remove an element from S_− and check again; and keep doing this until one finds a pair for which condition 2 holds. In the worst case, one will need to check condition 2 a number of times that is exponential in the sizes of the active sets. However, the complexity of computing the RIC or any of the ROCs is anyway exponential in the set sizes. In summary, computing the conditions of Theorem 1 has complexity that is exponential in the support size, but the same is true for all sparse recovery results that use the RIC. We should mention though that, for certain random matrices, e.g. random Gaussian, there are results that upper bound the RIC values with high probability, e.g. see [4]. However, the resulting bounds are usually quite loose.
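The search described above can be written generically, with condition 2 of Theorem 1 abstracted behind a caller-supplied predicate (condition2 below is a placeholder, since the condition's constants depend on the RIC/ROC quantities in (8) and (9)):

```python
from itertools import combinations
from typing import Callable, Sequence, Tuple

Sets = Tuple[Tuple[int, ...], Tuple[int, ...]]

def best_good_sets(Ta_plus: Sequence[int], Ta_minus: Sequence[int],
                   condition2: Callable[..., bool]) -> Sets:
    """Find subsets S_plus of Ta_plus and S_minus of Ta_minus with the
    largest total size for which condition2(S_plus, S_minus) holds."""
    np_, nm = len(Ta_plus), len(Ta_minus)
    for total in range(np_ + nm, -1, -1):        # largest total size first
        for kp in range(min(total, np_), -1, -1):
            km = total - kp
            if km > nm:
                continue
            for Sp in combinations(Ta_plus, kp):
                for Sm in combinations(Ta_minus, km):
                    if condition2(Sp, Sm):
                        return Sp, Sm            # maximal by construction
    return (), ()                                # condition never holds
```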

III-B Proof of Theorem 1: Key Lemmas

Our overall proof strategy is similar to that of [4] for BP and of [2] for mod-CS. We first find a set of sufficient conditions on an n-length vector, w, that help ensure that x is the unique minimizer of (11). This is done in Lemma 1. Next, we find sufficient conditions that the measurement matrix A should satisfy so that one such w can be found. This is done in an iterative fashion in the theorem's proof. The proof uses Lemma 2 at the zeroth iteration, followed by applications of Lemma 3 at later iterations.

To obtain the sufficient conditions on w, as suggested in [4], we first write out the Karush-Kuhn-Tucker (KKT) conditions for x to be a minimizer of (11) [22, Chapter 5]. By strengthening these a little, we get a set of sufficient conditions for x to be the unique minimizer. The necessary conditions for x to be a minimizer are: there exists an n-length vector, w (the Lagrange multiplier for the constraint y = Ab), a vector, λ_+, and a vector, λ_− (the multipliers for the two one-sided versions of the ℓ∞ constraint), such that (s.t.)

  1. every element of λ_+ and λ_− is non-negative, i.e. λ_+ ⪰ 0 and λ_− ⪰ 0,

  2. (A'w)_Δ = sgn(x_Δ), |(A'w)_j| ≤ 1 for all j ∉ T ∪ Δ, (A'w)_T = λ_+ − λ_−, (λ_+)_i = 0 for all i ∉ T_{a,+}, and (λ_−)_i = 0 for all i ∉ T_{a,−}.

As we will see in the proof of Lemma 1, strengthening the inequality |(A'w)_j| ≤ 1 to a strict one for all j ∉ T ∪ Δ, keeping the other conditions the same, and requiring that δ_{|T|+|Δ|} < 1 gives us a set of sufficient conditions.

Lemma 1

Let T_b, S_+, S_− be as defined in Theorem 1. x is the unique minimizer of (11) if δ_{|T|+|Δ|} < 1 and if we can find an n-length vector, w, s.t.

  1. (A'w)_{S_+} ⪰ 0, (A'w)_{S_−} ⪯ 0, (A'w)_{T_b} = 0,

  2. (A'w)_Δ = sgn(x_Δ),

  3. |(A'w)_j| < 1 for all j ∉ T ∪ Δ.

Recall that T_{a,+}, T_{a,−} and T_{in} are defined in (12) and S_+, S_−, T_b in Theorem 1.

Proof: The proof is given in Appendix -A.

Notice that the first condition above is weaker than the corresponding condition of Lemma 1 of mod-CS [2] (which requires A_T'w = 0), while the other two are the same. Next, we try to obtain sufficient conditions on the measurement matrix, A (on its RICs and ROCs), to ensure that such a w can be found. This is done by using Lemmas 2 and 3 given below. Lemma 2 helps ensure that the first two conditions of Lemma 1 hold and provides the starting point for ensuring that the third condition also holds. Then, Lemma 3, applied iteratively, helps ensure that the third condition also holds.

Lemma 2

Assume the conditions of Theorem 1. Then there exists an n-length vector, w, and an "exceptional" set, E, disjoint with T ∪ Δ, s.t.

  1. (A'w)_{S_+} ⪰ 0, (A'w)_{S_−} ⪯ 0, (A'w)_{T_b} = 0,

  2. (A'w)_Δ = sgn(x_Δ),

  3. |E| is bounded and |(A'w)_j| is bounded for all j ∉ T ∪ Δ ∪ E,

  4. ||A_E' w|| and ||w|| are bounded.

Recall that the functions used in the bounds are defined in (8) and (9), and that T_b, S_+, S_− are defined in Theorem 1.

Notice that, under the assumed conditions, the denominators appearing in the bounds are positive. We call E an "exceptional" set because, except on E, |(A'w)_j| is bounded everywhere on (T ∪ Δ)^c. This notion is taken from [4]. Notice also that the first two conditions of the above lemma are one way to satisfy the first two conditions of Lemma 1.

Proof: The proof is given in Appendix -B. Since the good sets S_+, S_− are appropriately defined (they satisfy condition 2 of Theorem 1), the first two conditions hold. The rest of the proof bounds ||w||, and finds the set E so that |(A'w)_j| is bounded for all j ∉ T ∪ Δ ∪ E and ||A_E' w|| is also bounded.

Lemma 3 (Lemma 2 of [2])

Assume the conditions of Theorem 1 on the set sizes. Let S̃ be a set that is disjoint with T ∪ Δ and let c be a vector supported on S̃. Then there exists an n-length vector, w̃, and a set, Ẽ, disjoint with T ∪ Δ ∪ S̃, s.t. (i) A_{S̃}' w̃ = c_{S̃}, (ii) A_{T∪Δ}' w̃ = 0, (iii) |Ẽ| and |(A'w̃)_j| for all j ∉ T ∪ Δ ∪ S̃ ∪ Ẽ are bounded, and (iv) ||w̃|| is bounded.
Recall that the functions used in the bounds are defined in (8) and (9), and the remaining quantities in Theorem 1.

Proof: The proof of Lemma 3 is given in [2] and also in Appendix C of [23].

Notice that, under the assumed conditions, the denominators appearing in the bounds are positive.

III-C Proof Outline of Theorem 1

The proof is very similar to that of [2]; hence, we give only the outline here. The complete proof is in [23]. At iteration zero, we apply Lemma 2 to get a w₀ and an exceptional set E₀, disjoint with T ∪ Δ. Lemma 2 can be applied because condition 1 of the theorem holds. At iteration r, we apply Lemma 3 with S̃ = E_{r−1} and an appropriately chosen c (so that the exceptional directions from the previous iteration get cancelled) to get a w_r and an exceptional set E_r, disjoint with T ∪ Δ ∪ E_{r−1}. Lemma 3 can be applied because condition 1 of the theorem holds. Define w as the limit of the resulting series. We then argue that, if condition 2 of the theorem holds, w is well defined and satisfies the conditions of Lemma 1. Applying Lemma 1, the result follows.

IV Numerical Experiments

In this section, we show two types of numerical experiments. The first simulates quantized signals and signal estimates. This is the case where some constraints are active with nonzero probability; the good sets are then also non-empty with nonzero probability. Hence, for a given small enough n, reg-mod-BP has significantly higher exact reconstruction probability than both mod-CS [2] and weighted ℓ1 [14], and much higher than BP [3, 4]. Alternatively, it also requires a significantly reduced n for exact reconstruction with probability one. In computing the exact reconstruction probability, we average over the distributions of the support sets, the signal and the signal estimate, as also done in [2, 4]. All numbers are computed based on 100 Monte Carlo simulations. To compute the smallest n needed, we tried various values of n for each algorithm and recorded the smallest n required for exact recovery always (in all 100 simulations).

We also do a second simulation where signal estimates are not quantized.

In the following steps, notation of the form z ∼ {z₁, z₂, z₃} means that z is equally likely to take each of the listed values, and unif(−a, a) generates a scalar uniform random variable in the range (−a, a). Writing such a distribution "for all i" means that the z_i are identically distributed according to it and are mutually independent (i.i.d.).

                     BP      mod-CS   weighted ℓ1   reg-mod-BP
P(exact)       (4)   0       0.18     0.16          0.64
N-RMSE         (4)   1.011   0.059    0.060         0.029
smallest n/m   (4)   0.39    0.21     0.21          0.18
P(exact)       (10)  0       0.18     0.16          0.39
N-RMSE         (10)  1.011   0.059    0.060         0.032
smallest n/m   (10)  0.4     0.21     0.21          0.20
TABLE I: Quantized signals and signal estimates, for the two parameter choices (4 and 10) described in the text. P(exact) is the exact reconstruction probability at a fixed n; smallest n/m is the smallest measurement fraction needed for exact recovery in all 100 runs.
                 BP      mod-CS   weighted ℓ1   reg-mod-BP
P(exact)         0       0.26     0.26          0.57
N-RMSE           0.967   0.152    0.152         0.082
smallest n/m     0.4     0.21     0.21          0.20
TABLE II: Quantized signals and signal estimates: case 2 (3-bit quantized images).
                 BP      mod-CS   weighted ℓ1   reg-mod-BP
N-RMSE(0.18)     0.961   0.0175   0.0177        0.0123
N-RMSE(0.11)
TABLE III: The non-quantized case. N-RMSE(c) denotes the error when n = c·m.

For the quantized case, x was an m-length sparse vector with support size |N| and support estimate error sizes |Δ| = |Δ_e|. We generated the matrix A once as an n × m random Gaussian matrix (generate a matrix with i.i.d. zero mean Gaussian entries and normalize each column to unit ℓ2 norm). The following steps were repeated 100 times.

  1. The support set, N, of size |N|, was generated uniformly at random from [1, m]. The support misses set, Δ, of size |Δ|, was generated uniformly at random from the elements of N. The support extras set, Δ_e, also of size |Δ|, was generated uniformly at random from the elements of N^c. The support estimate is then T = (N ∪ Δ_e) ∖ Δ, and thus |T| = |N|.

  2. We generated x_i from the chosen discrete distribution for i ∈ N ∖ Δ and for i ∈ Δ, independently of each other, and set x_i = 0 for i ∈ N^c. We generated μ̂_T = x_T + ν_T, where ν_i was a discrete random variable supported on [−ρ, ρ] for i ∈ T. We tried two choices of ρ. Notice that, for a given ρ, the constraint is active exactly when |x_i − μ̂_i| equals ρ; the fewer the values that ν_i can take, the more likely this is, and thus the larger the expected size of the active sets.

  3. We generated y := Ax. We solved reg-mod-BP given in (11); BP given in (1); mod-CS given in (2); and weighted ℓ1 given in (3) with various choices of the weight γ. We used the CVX optimization package, http://www.stanford.edu/boyd/cvx/, which uses a primal-dual interior point method for solving the ℓ1 minimization problem. A condensed sketch of these steps is given below.
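As mentioned in step 3, here is a condensed Python/cvxpy sketch of one run of this experiment; the sizes, distributions, and the exactness tolerance are illustrative stand-ins for the paper's values:

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(1)
m, n, s, e, rho = 256, 60, 26, 2, 4.0    # illustrative: |N|=s, |Delta|=|Delta_e|=e

A = rng.standard_normal((n, m))
A /= np.linalg.norm(A, axis=0)

def one_trial() -> bool:
    # Step 1: support N, misses Delta, extras Delta_e, estimate T.
    N = rng.choice(m, size=s, replace=False)
    Delta = rng.choice(N, size=e, replace=False)
    Delta_e = rng.choice(np.setdiff1d(np.arange(m), N), size=e, replace=False)
    T = np.setdiff1d(np.union1d(N, Delta_e), Delta)
    Tc = np.setdiff1d(np.arange(m), T)

    # Step 2: integer-valued signal and a quantized value estimate on T.
    x = np.zeros(m)
    x[N] = rng.integers(1, 8, size=s).astype(float)
    mu_hat_T = x[T] + rng.integers(-int(rho), int(rho) + 1, size=len(T))

    # Step 3: measure and solve reg-mod-BP, eq. (11).
    y = A @ x
    b = cp.Variable(m)
    cp.Problem(cp.Minimize(cp.norm1(b[Tc])),
               [A @ b == y, cp.norm_inf(b[T] - mu_hat_T) <= rho]).solve()
    return np.linalg.norm(b.value - x) / np.linalg.norm(x) < 1e-5

print("P(exact):", np.mean([one_trial() for _ in range(100)]))
```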

We computed the exact reconstruction probability, P(exact), as the number of times the solution was equal to x ("equal" was defined as the normalized error being below a small threshold) divided by 100. For weighted ℓ1, we computed P(exact) for each choice of γ and recorded the largest one; this corresponded to the best γ. We tabulate results in Table I. In the first row, we record P(exact) for all the methods at a fixed n. We also record the Monte Carlo averages of the sizes of the active sets, of the good sets, and of the bad set. In the second row, we record the normalized root mean squared error (N-RMSE). In the third row, we record the smallest n (as a fraction of m) needed for exact recovery. In the next three rows, we repeat the same things for the second choice of ρ.

As can be seen, the good sets are about half the size of the active sets. As ρ is increased, the active sets shrink and hence the good sets shrink (the bad set grows); thus P(exact) decreases and the smallest required n increases. Also, for mod-CS and weighted ℓ1, P(exact) is significantly smaller than for reg-mod-BP, while the required n is larger.

Next, we simulated a more realistic scenario: the case of 3-bit quantized images (both x_i and μ̂_i take integer values between 0 and 7). Here again the sizes were as before, and the sets N, Δ, and Δ_e were generated as before. We generated x_i as quantized values on N and set x_i = 0 on N^c, and generated μ̂_T by adding quantized noise to x_T and clipping, where clip(z) maps any value more than 7 to 7 and any value less than zero to zero. Clearly, in this case, (7) holds. We record our results in Table II. Similar conclusions as before can be drawn.

Finally, we simulated the non-quantized case, with the same sizes as before. We generated x_i as before on N and set x_i = 0 on N^c. The signal estimate was μ̂_T = x_T + ν_T, where ν_i ∼ unif(−ρ, ρ). We tabulate our results in Table III. Since x − μ̂ is a real-valued vector (not quantized), the probability of any constraint being active is zero. Thus, as expected, P(exact) and the smallest required n are the same for reg-mod-BP, mod-CS and weighted ℓ1, though significantly better than for BP. However, the N-RMSE for reg-mod-BP is significantly lower than that of mod-CS and weighted ℓ1 as well, particularly for smaller n.

V Conclusions

In this work, we obtained sufficient exact recovery conditions for reg-mod-BP, (11), and discussed their implications. Our main conclusion is that if some of the inequality constraints are active and if even a subset of the set of active constraints satisfies certain conditions (condition 2 of Theorem 1), then reg-mod-BP achieves exact recovery under weaker conditions than what mod-CS needs. A practical situation where this happens is when both the signal and its estimate are quantized. In other cases, the conditions are only as weak as those for mod-CS. In either case, they are much weaker than those for BP as long as T is a good support estimate. From simulations, we see that, even without any active constraints, the reg-mod-BP reconstruction error is much lower than that of mod-CS or weighted ℓ1.

-A Proof of Lemma 1

Denote a minimizer of (11) by b*. Since y = Ax and x satisfies (7), x is feasible for (11). Thus,

(15)   ||(b*)_{T^c}||_1 ≤ ||x_{T^c}||_1

Next, we use the conditions on w given in Lemma 1 and the fact that x is supported on T ∪ Δ to show that this inequality can hold only with equality and hence that b* = x. Notice that

(16)
(17)
(18)
(19)
(20)
(21)

In the above, the inequality in (16) follows because x_i = 0 for i ∉ T ∪ Δ and because b* is feasible. Inequality (17) uses the fact that |a + b| ≥ |a| − |b| for any two scalars a and b, and that |(A'w)_j| < 1 for j ∉ T ∪ Δ. In (18), the first equality uses