The springback penalty for robust signal recovery

We propose a new penalty, named the springback penalty, for constructing models to recover an unknown signal from incomplete and inaccurate measurements. Mathematically, the springback penalty is a weakly convex function, and it bears various theoretical and computational advantages of both the benchmark convex ℓ_1 penalty and many of its non-convex surrogates that have been well studied in the literature. For the recovery model using the springback penalty, we establish the exact and stable recovery theory for sparse and nearly sparse signals, respectively, and derive an easily implementable difference-of-convex algorithm. In particular, we show its theoretical superiority to some existing models, with a sharper recovery bound in some scenarios where the level of measurement noise is large or the amount of measurements is limited, and we demonstrate its numerical robustness regardless of the varying coherence of the sensing matrix. Because of its theoretical guarantee of recovery under severely limited measurements, its computational tractability, and its numerical robustness for ill-conditioned sensing matrices, the springback penalty is particularly favorable for scenarios where the incomplete and inaccurate measurements are collected by coherence-hidden or coherence-static sensing hardware.




1 Introduction

Signal recovery aims at recovering an unknown signal from its measurements, which are often incomplete and inaccurate due to technical, economical, or physical restrictions. Mathematically, a signal recovery problem can be expressed as estimating an unknown x ∈ R^n from an underdetermined linear system

b = Ax + e,    (1.1)

where A ∈ R^{m×n} is a full row-rank matrix representing a sensing matrix (e.g., a projection or transformation matrix), b ∈ R^m is a vector of measurements, e ∈ B^σ is some unknown but bounded noise perturbation in R^m, and the number m of measurements is considerably smaller than the size n of the signal x. The set B^σ := {e ∈ R^m : ||e||_2 ≤ σ} encodes both the cases of noise-free (σ = 0) and noisy (σ > 0) measurements.
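As a concrete instance of this setup, the following sketch generates an s-sparse signal, a Gaussian sensing matrix, and bounded-noise measurements; all dimensions and the noise level are illustrative choices of ours, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, s, sigma = 64, 256, 5, 1e-2       # illustrative sizes with m << n

# s-sparse ground-truth signal
x_true = np.zeros(n)
support = rng.choice(n, size=s, replace=False)
x_true[support] = rng.standard_normal(s)

# Gaussian sensing matrix (full row rank with probability 1) and bounded noise
A = rng.standard_normal((m, n)) / np.sqrt(m)
e = rng.standard_normal(m)
e *= sigma / np.linalg.norm(e)          # rescale so that ||e||_2 = sigma exactly

b = A @ x_true + e                      # incomplete, inaccurate measurements
```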

Physically, a signal of interest, or its coefficients under a certain transformation, is often sparse (see, e.g., [4]). Hence, it is natural to seek a sparse solution of the underdetermined linear system (1.1), though it has infinitely many solutions. We say that x is s-sparse if ||x||_0 ≤ s, where ||x||_0 counts the number of nonzero entries of x. To find the sparsest solution of (1.1), one may consider solving the following minimization problem:

min_x ||x||_0  s.t.  Ax − b ∈ B^σ,    (1.2)

in which ||x||_0 serves as a penalty term for the sparsity, and it is referred to as the ℓ_0 penalty for convenience. Due to the discrete and discontinuous nature of the ℓ_0 penalty, the model (1.2) is NP-hard [4]. This means the model (1.2) is computationally intractable, and this difficulty has inspired many alternatives to the ℓ_0 penalty in the literature. A fundamental proxy of the model (1.2) is the basis pursuit (BP) problem proposed in [13]:

min_x ||x||_1  s.t.  Ax − b ∈ B^σ.    (1.3)

In this convex model, the term ||x||_1 is called the ℓ_1 penalty hereafter. Recall that ||x||_1 is the convex envelope of ||x||_0 (see, e.g., [31]), and it induces sparsity most efficiently among all convex penalties (see [4]). The BP problem (1.3) has been intensively studied in voluminous papers since the seminal works [6, 7, 14], in which various conditions have been comprehensively explored for exact recovery via the convex model (1.3). Numerically, the model (1.3) and its unconstrained version (see (5.2)) can be solved easily by benchmark algorithms such as the classic augmented Lagrangian method [23, 30], FISTA [2], and many others.

The BP problem (1.3) is fundamental for signal recovery, but its solution may be over-penalized because the ℓ_1 penalty tends to underestimate high-amplitude components of the solution, as analyzed in [16]. Hence, it is reasonable to consider non-convex alternatives to the ℓ_1 penalty and upgrade the model (1.3) to achieve more accurate recovery. In the literature, some non-convex penalties have been well studied, such as the smoothly clipped absolute deviation (SCAD) [16], the capped ℓ_1 penalty [44], the transformed ℓ_1 penalty [26, 43], and the ℓ_p penalty with p ∈ (0, 1) [10, 11]. Besides, one particular penalty is the minimax concave penalty (MCP) proposed in [41], and it has been widely shown to be effective for reducing the bias caused by the ℓ_1 penalty [41]. Moreover, the so-called ℓ_{1-2} penalty has been studied in the literature, e.g., [15, 25, 39, 40], to mention a few. Some of these penalties will be summarized in Section 2. In a nutshell, convex penalties are more tractable in the senses of theoretical analysis and numerical computation, while they are less effective for achieving the desired sparsity (i.e., their approximation to the ℓ_0 penalty is less accurate). Non-convex penalties are generally the opposite.

Considering the pros and cons of various penalties, our main motivation is to find a weakly convex penalty that can keep some favorable features of both the ℓ_1 penalty and its non-convex alternatives, so that the resulting model for signal recovery is preferable in the senses of both theoretical analysis and numerical computation. More precisely, we propose the springback penalty

R_α^SPB(x) := ||x||_1 − (α/2)||x||_2^2,    (1.4)

where α > 0 is a model parameter and it should be chosen meticulously. We will show later that a larger α implies a tighter stable recovery bound. On the other hand, a too large α may lead to negative values of R_α^SPB(x). Thus, a reasonable upper bound on α should be considered to ensure the well-definedness of the springback penalty (1.4). In the following, we will see that, if the matrix A is well-conditioned (e.g., when A is drawn from a Gaussian matrix ensemble), then the requirement on α is quite loose; while if A is ill-conditioned (e.g., A is drawn from an oversampled partial DCT matrix ensemble), then generally the upper bound on α should be better discerned for the sake of designing an algorithm with theoretically provable convergence. We refer to Theorems 3.2, 4.1, 5.2.2, and Section 6.2 for more precise discussions on the determination of α for the springback penalty (1.4), theoretically and numerically. With the springback penalty (1.4), we propose the following model for signal recovery:

min_x R_α^SPB(x)  s.t.  Ax − b ∈ B^σ.    (1.5)
Mathematically, the springback penalty (1.4) is a weakly convex function, and thus the springback-penalized model (1.5) can be intuitively regarded as an “average” of the convex BP model (1.3) and the mentioned non-convex surrogates. Recall that a function f is α-weakly convex if f + (α/2)||·||_2^2 is convex. One advantage of the model (1.5) is that various results developed in the literature of weakly convex optimization problems (e.g., [21, 28]) can be used for both theoretical analysis and algorithmic design. Indeed, the weak convexity of the springback penalty (1.4) enables us to derive sharper recovery results with fewer measurements and to design some efficient algorithms easily.
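To make the penalty (1.4) and the weak-convexity claim concrete, the following sketch (an illustrative implementation of ours; the function name is not from the paper) evaluates the penalty and checks numerically that adding (α/2)||x||_2^2 back yields the convex ℓ_1 norm:

```python
import numpy as np

def springback(x, alpha):
    """Springback penalty R_alpha^SPB(x) = ||x||_1 - (alpha/2)*||x||_2^2, cf. (1.4)."""
    x = np.asarray(x, dtype=float)
    return np.abs(x).sum() - 0.5 * alpha * np.dot(x, x)

x = np.array([3.0, -1.0, 0.0, 2.0])   # ||x||_1 = 6, ||x||_2^2 = 14
print(springback(x, 0.1))             # approx. 5.3 (= 6 - 0.05*14)

# alpha-weak convexity: springback + (alpha/2)*||.||^2 is exactly the convex l1 norm
alpha = 0.3
rng = np.random.default_rng(1)
for _ in range(100):
    z = rng.standard_normal(10)
    assert abs(springback(z, alpha) + 0.5 * alpha * np.dot(z, z)
               - np.abs(z).sum()) < 1e-10
```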

The rest of this paper is organized as follows. In the next section, we summarize some preliminaries for further analysis. In Sections 3 and 4, we establish the exact and stable recovery theory of the springback-penalized model (1.5) for sparse and nearly sparse signals, respectively. We also theoretically compare the springback penalty (1.4) with some other penalties in these two sections. In Section 5, we design a difference-of-convex algorithm (DCA) for the springback-penalized model (1.5) and study its convergence. Some numerical results are reported in Section 6 to verify our theoretical assertions, and some conclusions are drawn in Section 7.

2 Preliminaries

In this section, we summarize some preliminaries that will be used for further analysis.

2.1 Notations

For any x, y ∈ R^n, let ⟨x, y⟩ be their inner product, and let supp(x) := {i : x_i ≠ 0} be the support of x. Let I be an identity matrix whose dimension is clear in accordance with the context. Let T (possibly with some super/subscripts) be an index set, and |T| the cardinality of T. For x ∈ R^n and an index set T, let x_T be the vector with the same entries as x on the indices in T and zero entries on the indices in the complement T^c, and let A_T be the submatrix of A with column indices in T. For x ∈ R^n, sgn(x) is the sign function of x applied componentwise. For a convex function f, ∂f(x) denotes the subdifferential of f at x.

2.2 A glance at various penalties

In the literature, there are a variety of convex and non-convex penalties. Below we list five of the most important ones, with x ∈ R^n.

  • The ℓ_1 penalty [4, 13]: R(x) = ||x||_1.

  • The ℓ_p penalty with parameter p ∈ (0, 1) [10, 11]: R(x) = ||x||_p^p = Σ_i |x_i|^p.

  • The transformed ℓ_1 (TL1) with parameter a > 0 [26, 43]: R(x) = Σ_i (a + 1)|x_i| / (a + |x_i|).

  • The minimax concave penalty (MCP) with parameters λ > 0 and γ > 1 [41]: R(x) = Σ_i ρ(x_i), where ρ(t) = λ|t| − t²/(2γ) if |t| ≤ γλ, and ρ(t) = γλ²/2 otherwise.

  • The ℓ_{1-2} penalty [15, 40]: R(x) = ||x||_1 − ||x||_2.
2.3 Relationship among various penalties

For any nonzero vector x, the springback penalty R_α^SPB(x) tends to the ℓ_1 penalty ||x||_1 as α → 0. Besides, R_α^SPB is reduced to the MCP in [41] within the ℓ_∞-ball {x : ||x||_∞ ≤ 1/α} if λ = 1 and γ = 1/α. The springback penalty appears to be a resemblance of the ℓ_{1-2} penalty, but their difference is many-sided. For instance, the gradient of ||x||_2 is not defined at the origin, whereas ||x||_2^2 is differentiable everywhere.

Figure 1 displays some scalar (one-dimensional) penalties, including the ℓ_1 penalty, the ℓ_p penalty, the transformed ℓ_1 penalty, the MCP, and the springback penalty with two choices of α. The ℓ_{1-2} penalty is not plotted, as it is none other than zero in the one-dimensional case. To give a better visual comparison, we scale them to attain the point (1, 1). It is shown in Figure 1 that the springback penalty is close to the ℓ_1 penalty for small |x|. The springback penalty, in fact, coincides with the MCP for |x| ≤ 1/α if we do not scale them. The behavior of this penalty for |x| > 1/α attracts our interest because the penalty turns around and heads towards the x-axis. According to Figure 1, this behavior is clearer in terms of the thresholding operator corresponding to the proximal mapping of the springback penalty, whose mathematical description is given in Section 2.4.

Figure 1: Scalar penalties and the corresponding thresholding operators (representing proximal mappings): the ℓ_1 penalty and the soft thresholding operator; the ℓ_p penalty, whose proximal mapping has no closed-form expression (hence no thresholding operator plotted); the transformed ℓ_1 penalty, whose proximal mapping can be expressed explicitly by a thresholding operator given in [42]; the MCP and the firm thresholding operator; and two springback penalties with two choices of α, and the springback thresholding operator.

As mentioned, the proposed springback penalty (1.4) balances the approximation quality of the ℓ_0 penalty and the tractability in analysis and computation, and it is in between the convex and non-convex penalties. More specifically, it is in between the ℓ_1 penalty and the MCP in [41]. For any x, we can always find a parameter γ for the MCP (with λ = 1) such that ||x||_∞ ≤ γ, with a resulting penalty in the form of ||x||_1 − (1/(2γ))||x||_2^2. This penalty inherits the approximation quality of the ℓ_0 penalty from the MCP in [41], as well as the analytical and computational advantages of the ℓ_1 penalty. Inasmuch as this penalty, we consider the more general penalty (1.4), in which 1/γ is replaced by a more flexible parameter α.

2.4 Proximal mappings and thresholding operators

For a function h: R^n → (−∞, +∞], as defined in [29], the proximal mapping of h is defined as

prox_{τh}(y) = argmin_x { h(x) + (1/(2τ))||x − y||_2^2 },    (2.1)

where τ > 0 is a regularization parameter. In (2.1), we slightly abuse the notation “=”. This mapping takes a vector y ∈ R^n and maps it into a subset of R^n, which might be empty, a singleton, or a set with multiple vectors; the image of y under this mapping is a singleton if the function h is proper, closed, and convex [1]. For a given optimization model, if the proximal mapping of its objective function has a closed-form expression, then usually it is important and necessary to consider how to take advantage of this feature for algorithmic design.

When the proximal mapping of a penalty can be represented explicitly, the closed-form representation is often called a thresholding operator or a shrinkage operator in the literature. For example, as analyzed in [42], with the soft thresholding operator

S_τ(y)_i = sgn(y_i) · max{|y_i| − τ, 0},

which has been widely used in various areas such as compressed sensing and image processing, the proximal mapping (2.1) of the ℓ_1 penalty can be expressed explicitly as prox_{τ||·||_1}(y) = S_τ(y).
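In code, the soft thresholding operator is one line; this sketch (an illustrative implementation of ours) also checks it against a brute-force minimization of the one-dimensional prox objective:

```python
import numpy as np

def soft(y, tau):
    """Soft thresholding: componentwise sgn(y_i) * max(|y_i| - tau, 0)."""
    y = np.asarray(y, dtype=float)
    return np.sign(y) * np.maximum(np.abs(y) - tau, 0.0)

# brute-force check that soft(y, tau) minimizes |x| + (x - y)^2 / (2*tau) in 1-D
tau, y0 = 0.5, 1.3
grid = np.linspace(-3.0, 3.0, 60001)
obj = np.abs(grid) + (grid - y0) ** 2 / (2.0 * tau)
print(soft(np.array([y0]), tau)[0], grid[np.argmin(obj)])   # both approx. 0.8
```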

The proximal mapping of a non-convex penalty, in general, does not have a closed-form expression; such cases include the ℓ_{1-2} penalty and the ℓ_p penalty with p ∈ (0, 1). However, there are some particular non-convex penalties whose proximal mappings can still be represented explicitly, for instance, the transformed ℓ_1 penalty [42] and the MCP [41]. In particular, with the following firm thresholding operator

F_{λ,γ}(y)_i = 0 if |y_i| ≤ λ;  sgn(y_i) · γ(|y_i| − λ)/(γ − 1) if λ < |y_i| ≤ γλ;  y_i if |y_i| > γλ,

which was first proposed in [19], it was further studied in [41] that the proximal mapping (2.1) of the MCP can be expressed explicitly by a firm thresholding operator for the case of orthonormal design. More specifically, the proximal mapping (2.1) of the MCP is a firm thresholding operator in that case.
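For reference, here is a sketch of the firm thresholding operator in the common (λ, γ) parameterization of the MCP (an illustrative implementation of the operator from [19], not code from the paper):

```python
import numpy as np

def firm(y, lam, gamma):
    """Firm thresholding: 0 for |y| <= lam, the identity for |y| > gamma*lam,
    and a linear ramp steeper than the identity in between."""
    y = np.asarray(y, dtype=float)
    ramp = np.sign(y) * gamma * np.maximum(np.abs(y) - lam, 0.0) / (gamma - 1.0)
    return np.where(np.abs(y) <= gamma * lam, ramp, y)

y = np.array([0.5, 1.5, 4.0])
print(firm(y, lam=1.0, gamma=3.0))   # [0.  0.75  4.]
```

Note that the ramp meets the identity continuously at |y| = γλ, which is what distinguishes the firm operator from soft thresholding.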

Below, we show that, for the springback penalty (1.4) with a well-chosen τ, its proximal mapping can also be expressed explicitly.

Definition 2.1

The springback thresholding operator T_{α,τ}: R^n → R^n is defined componentwise as

T_{α,τ}(y)_i = sgn(y_i) · max{|y_i| − τ, 0} / (1 − ατ).    (2.2)

Proposition 2.1

If ατ < 1, then the proximal mapping of the springback penalty (1.4) can be represented explicitly as

prox_{τ R_α^SPB}(y) = T_{α,τ}(y).    (2.3)

Proof.  When ατ < 1, the objective in (2.1) is the sum of the convex term ||x||_1 and the quadratic term (1/(2τ))||x − y||_2^2 − (α/2)||x||_2^2, and the assumption ensures the Hessian (1/τ − α)I of the latter to be positive definite. Thus, the optimization problem occurring in (2.1) is convex and separable across components. When x_i = 0, the optimality condition 0 ∈ ∂|x_i| − y_i/τ holds if and only if |y_i| ≤ τ. When x_i ≠ 0, the optimality condition sgn(x_i) − αx_i + (x_i − y_i)/τ = 0, which is equivalent to

(1/τ − α) x_i = y_i/τ − sgn(x_i),

gives x_i = sgn(y_i)(|y_i| − τ)/(1 − ατ) for |y_i| > τ. It follows that the minimizer coincides with (2.2), which establishes (2.3). Hence, the assertion is proved.

Recall that the springback penalty (1.4) is a weakly convex function. Its thresholding operator defined in (2.2) is also in between the soft and firm thresholding operators. For a given input, a compromising γ could be chosen large enough such that the firm thresholding operator never enters its identity regime, and a certain compromise between the soft and firm thresholding operators is reached. In this case, we have a particular springback thresholding operator with α = 1/γ. If 1/γ is replaced by a more general α > 0, then the springback thresholding operator (2.2) is recovered.
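Assuming τα < 1 as in Proposition 2.1, the springback thresholding operator amounts to soft thresholding followed by the rescaling 1/(1 − τα), which springs the output back toward (and, for large inputs, past) the input. A sketch of ours comparing the two operators:

```python
import numpy as np

def soft(y, tau):
    return np.sign(y) * np.maximum(np.abs(y) - tau, 0.0)

def springback_threshold(y, tau, alpha):
    """Springback thresholding, cf. (2.2); requires tau * alpha < 1."""
    assert tau * alpha < 1.0
    return soft(np.asarray(y, dtype=float), tau) / (1.0 - tau * alpha)

y = np.array([0.3, 1.0, 5.0])
tau, alpha = 0.5, 0.4                        # tau * alpha = 0.2 < 1
print(soft(y, tau))                          # [0.   0.5  4.5]
print(springback_threshold(y, tau, alpha))   # [0.   0.625  5.625] -- springs past 5
```

The rescaling makes the slope beyond the threshold exceed 1, so high-amplitude components are no longer systematically underestimated as they are by the soft operator.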

2.5 Rationale of the name

Springback is a concept in applied mechanics (see, e.g., [36]). Figure 1 gives more explanations for naming (1.4) the springback penalty. Figure 1 displays the thresholding operators for y ≥ 0, including the soft thresholding operator, the transformed ℓ_1 thresholding operator, the firm thresholding operator, and the springback thresholding operator. The transformed ℓ_1 thresholding operator enforces inputs below its threshold to be 0, and then its outputs approach y as y increases. All the other thresholding operators enforce inputs below the threshold to be 0. Beyond the threshold, the soft thresholding operator subtracts the threshold from y and thus causes the ℓ_1 penalty to underestimate high-amplitude components; the firm thresholding operator's outputs jump from 0 and rise with a slope steeper than 1 until y exceeds γλ, after which its output is y itself. For the springback thresholding operator, its outputs jump from 0 once y exceeds the threshold, and afterwards they still keep going along the previous jumping trajectory.

In applied mechanics, springback is related to the process of bending some materials. When bending is done, the residual stresses cause the material to spring back towards its original shape, so the material must be over-bent to achieve the proper bending angle. Note that the soft thresholding operator always underestimates high-amplitude components, and the components ||x||_1 and (α/2)||x||_2^2 in the springback penalty are decoupled. If we deem the soft thresholding operator a process of over-bending, which stems from the component ||x||_1, then the output of the soft thresholding operator will be sprung back toward the input, which is achieved separately in consideration of the component (α/2)||x||_2^2. Such a springback process occurs for both positive and negative inputs. The springback behavior is more obvious for inputs with larger absolute values, and this coincides with the behavior of the springback penalty in Figure 1. That is, once |x| exceeds 1/α, the penalty turns around and heads towards the x-axis. This process may also be explained as a compensation, via the term (α/2)||x||_2^2, for the underestimation caused by ||x||_1.

3 Springback-penalized model for sparse signal recovery

In this section, we focus on the recovery of a sparse signal using the springback-penalized model (1.5). After reviewing some basic knowledge of compressed sensing, we identify conditions for exact and robust recovery using the springback-penalized model (1.5).

3.1 Compressed sensing basics

In some seminal compressed sensing papers such as [5, 14], recovery conditions have been established for the BP model (1.3). These conditions rely on the restricted isometry property (RIP) of the sensing matrix A, as proposed in [8].

Definition 3.1

For an index set T and an integer s with |T| ≤ s, the s-restricted isometry constant (RIC) of A is the smallest δ_s ∈ (0, 1) such that

(1 − δ_s)||x||_2^2 ≤ ||A_T x||_2^2 ≤ (1 + δ_s)||x||_2^2

for all subsets T with |T| ≤ s and all x ∈ R^{|T|}. The matrix A is said to satisfy the s-restricted isometry property (RIP) with δ_s.
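Computing the RIC exactly is intractable in general (it involves all supports of size s), but the defining inequality can be probed by sampling. The sketch below (illustrative, ours) reports the worst observed deviation of ||A_T x||_2^2 / ||x||_2^2 from 1 over random supports, which is a Monte Carlo lower bound on δ_s:

```python
import numpy as np

def ric_lower_bound(A, s, trials=2000, seed=0):
    """Monte Carlo lower bound on the s-RIC of A: the largest observed
    | ||A_T x||_2^2 / ||x||_2^2 - 1 | over sampled supports T with |T| = s."""
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    worst = 0.0
    for _ in range(trials):
        T = rng.choice(n, size=s, replace=False)
        x = rng.standard_normal(s)
        ratio = np.linalg.norm(A[:, T] @ x) ** 2 / np.dot(x, x)
        worst = max(worst, abs(ratio - 1.0))
    return worst

# a matrix with orthonormal columns satisfies the inequality with delta = 0
print(ric_lower_bound(np.eye(10), 3, trials=100))   # 0.0 up to rounding

A = np.random.default_rng(2).standard_normal((64, 256)) / np.sqrt(64)
print(ric_lower_bound(A, 5))   # well below 1 for this Gaussian ensemble
```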

Denoting by x* the minimizer of the BP problem (1.3), if A satisfies an appropriate RIP condition (e.g., δ_{2s} < √2 − 1 in [6]), then for an s-sparse x̄, one has

||x* − x̄||_2 ≤ C σ,    (3.1)

where C is a constant which may only depend on the RIC of A. We refer to [6, 7] for more details. If the measurements are noise-free, i.e., σ = 0, then (3.1) implies exact recovery. Exact recovery is guaranteed only in the idealized situation where x̄ is s-sparse and the measurements are noise-free. If the measurements are perturbed by some noise, then the recovery bound (3.1) is usually referred to as the robust recovery result with respect to the measurement noise. In more realistic scenarios, we can only claim that x̄ is close to an s-sparse vector, and the measurements may also be contaminated. In such cases, we can recover x̄ with an error controlled by its distance to s-sparse vectors, and it was proved in [6] that

||x* − x̄||_2 ≤ C_0 σ + C_1 ||x̄ − x̄_s||_1 / √s,    (3.2)

where x̄_s is the truncated vector corresponding to the s largest values of x̄ (in absolute value), and C_0 and C_1 are two constants which may only depend on the RIC of A. The bound (3.2) is usually referred to as the stable recovery result. Recovery conditions for models with other penalties are usually not as extensive as for the BP model (1.3). Under the framework of the RIP or some generalized versions, the recovery theory for the BP model (1.3) has been generalized to the ℓ_p-penalized model in [10, 18]. With the unique representation property of A, stable recovery results for the MCP-penalized model were derived in [38], and an upper bound for ||A(x* − x̄)||_2, but not for ||x* − x̄||_2, was obtained.

3.2 Recovery guarantee using the springback-penalized model

Still denoting by x* the minimizer of the springback-penalized model (1.5), we have the following exact and robust recovery results of the model (1.5) for an s-sparse x̄.

Theorem 3.2 (recovery of sparse signals). Let x̄ be an unknown s-sparse vector to be recovered. For a given sensing matrix A, let b = Ax̄ + e be a vector of measurements with ||e||_2 ≤ σ, and let δ_{2s} and δ_{3s} be the 2s- and 3s-RIC's of A, respectively. Suppose that A satisfies a suitable RIP condition in terms of these RIC's and that α satisfies


then the minimizer x* of the problem (1.5) satisfies x* = x̄ when σ = 0; and it satisfies


when σ > 0, where


Proof.  Let h := x* − x̄, and let T be the support of x̄. It is clear that |T| ≤ s and x̄_{T^c} = 0. On the one hand, we know that

On the other hand, it holds that

Then, we have that

We continue by arranging the indices in T^c in order of decreasing magnitudes (in absolute value) of h, and then dividing T^c into subsets of size s. Set T^c = T_1 ∪ T_2 ∪ ⋯, i.e., T_1 contains the indices of the s largest entries (in absolute value) of h_{T^c}, T_2 contains the indices of the next s largest entries (in absolute value) of h_{T^c}, and so on. The cardinality of the last subset may be less than s. Denoting T_{01} := T ∪ T_1 and using the RIP of A, we have

As the magnitude of every entry of h indexed by T_{j+1} is less than the average of the magnitudes of the entries indexed by T_j, there holds ||h_{T_{j+1}}||_2 ≤ ||h_{T_j}||_1 / √s for each j ≥ 1. Then, we have

Together with the inequalities above, we have

Thus, it holds that


Note that

and it can be written as

With the assumption on α, the coefficient of the quadratic term in (3.6) is positive, and thus we have


If σ = 0, then ||Ah||_2 = 0. If σ > 0, then the condition (3.3) on α guarantees

where we use the Cauchy–Schwarz inequality. Hence the desired inequality also holds.

When σ = 0, the inequality renders ||h_{T_{01}}||_2 = 0, which implies h = 0. Thus x* = x̄. When σ > 0, the inequality

leads to an upper bound on ||h||_2, which implies (3.4).

In the analysis of signal recovery models with various convex and non-convex penalties, such as the ℓ_1 penalty [7, 10] and the ℓ_{1-2} penalty [39, 40], a linear lower bound for ||h_{T_{01}}||_2 is derived in one way or another. The proof of Theorem 3.2 mainly follows the idea of [7], but we derive a quadratic lower bound for the term ||h_{T_{01}}||_2. Thus, it is worth noting that our results cannot be reduced to the result of the BP model (1.3) as α → 0. Indeed, the quadratic bound (3.6) in our proof reduces to a linear bound as α → 0, which then leads to the same results as the BP model (1.3). However, we handle our final quadratic bound by removing its linear and constant terms, and hence the obtained result cannot be reduced to the result of the BP model (1.3) as α → 0.

Besides, the condition (3.3) on α is required for the springback-penalized model (1.5). It is impossible to choose an α satisfying (3.3) in advance, unless we have an a priori estimation of x* before solving the problem (1.5). Thus, the condition (3.3) can be interpreted as a posterior verification in the sense that it can be verified once x* is obtained by solving the problem (1.5).

Remark 3.1 (Posterior verification)

In practice, we solve the springback-penalized model (1.5) numerically and thus obtain an approximate solution, denoted by x̃, subject to a preset accuracy ε; that is, ||x̃ − x*||_2 ≤ ε. Then, the posterior verification (3.3) is guaranteed if

3.3 On the exact and robust recovery

In Theorem 3.2, we establish conditions for exact and robust recovery using the springback-penalized model (1.5). Table 1 lists the exact recovery conditions for five other popular models in the literature. In particular, the springback-penalized model (1.5) and the ℓ_1-penalized model, i.e., the BP model (1.3), have the same RIP condition. This condition is more stringent than that of the ℓ_p-penalized model (p ∈ (0, 1)) but weaker than those of the transformed ℓ_1- and ℓ_{1-2}-penalized models. Besides the RIP condition, there is an additional assumption for the ℓ_{1-2}-penalized model, which was first derived in [40] and slightly improved in [39] as

Note that this additional assumption was shown in [39, 40] for both the noise-free and noisy cases.

Penalty | RIP condition
ℓ_p (p ∈ (0, 1)) [10] |
transformed ℓ_1 [43] |
ℓ_{1-2} [39, 40] |
Table 1: Exact recovery conditions for recovery models with various penalties.

We then discuss robust recovery results. If α → 0, then the result (3.4) cannot provide any information, as its right-hand side blows up. However, for an appropriate α, the bound (3.4) is informative and attractive. The robust recovery results of the ℓ_1-, ℓ_p-, transformed ℓ_1-, and ℓ_{1-2}-penalized models were shown to be linear with respect to the level of noise [7, 10, 39, 40, 43], in the sense of

||x* − x̄||_2 ≤ C σ,    (3.8)

where C is some constant. Thus, under the conditions of Theorem 3.2, the bound (3.4) for the springback-penalized model (1.5) is tighter than (3.8) in the sense of


if the level of noise σ satisfies


Assume that the recovery conditions listed in Table 1 are satisfied for each model, respectively. Then, we can summarize in Table 2 the corresponding ranges of σ such that the robust recovery bound (3.4) of the springback-penalized model (1.5) is tighter than all the others in the sense of (3.9).

Penalty | Range of the noise level σ
ℓ_1 [6, 7] |
ℓ_p (p ∈ (0, 1)) [33] |
transformed ℓ_1 [43] |
Table 2: Ranges of the level of noise σ such that the bound (3.4) is tighter than (3.8) in the sense of (3.9).

These ranges on σ look complicated. To get a better idea, we consider a toy example with particular values of s, the RIC's, and α for the springback penalty (1.4), and of a for the transformed ℓ_1 penalty. Then, the springback-penalized model (1.5) gives a tighter bound in the sense of (3.9) than the other penalized models discussed above over corresponding ranges of σ, respectively.

Can we further improve the robust recovery result (3.4) in Theorem 3.2? The following proposition suggests a potential improvement. Moreover, without any requirement on α, this proposition also means that, even if the posterior verification (3.3) is violated sometimes, the springback-penalized model (1.5) may still give a good recovery. Note that this proposition is only of conceptual sense, because its assumption is not verifiable. Nevertheless, it helps us discern a possibility of achieving a better recovery bound than (3.4).

Proposition 3.1

Let x̄ be an unknown s-sparse vector to be recovered. For a given sensing matrix A, let b = Ax̄ + e be a vector of measurements with ||e||_2 ≤ σ, and let δ_{2s} and δ_{3s} be the 2s- and 3s-RIC's of A, respectively. Let x* be the minimizer of the problem (1.5), and assume that the unverifiable condition mentioned above holds. Suppose that A satisfies the same RIP condition as in Theorem 3.2; then x* = x̄ when σ = 0; and x* satisfies