Sparse regularization of inverse problems by biorthogonal frame thresholding

09/20/2019 · Jürgen Frikel, et al. · Leopold-Franzens-Universität Innsbruck

We analyze sparse frame-based regularization of inverse problems by means of a biorthogonal frame decomposition (BFD) for the forward operator. The BFD allows us to define a non-iterative (direct) frame-thresholding approach, which we show to provide a convergent regularization method with linear convergence rates. We compare these results to the well-known analysis and synthesis variants of sparse ℓ^1-regularization, which are usually implemented through iterative schemes. If the frame is a basis (non-redundant case), the three versions of sparse regularization, namely the synthesis and analysis variants of ℓ^1-regularization as well as BFD thresholding, are equivalent. However, in the redundant case, these three approaches are pairwise different.


1 Introduction

This paper is concerned with inverse problems of the form

(1.1)   $y^\delta = \mathbf{A} x + \xi,$

where $\mathbf{A} \colon \mathcal{D}(\mathbf{A}) \subseteq \mathbb{X} \to \mathbb{Y}$ is a linear operator between Hilbert spaces $\mathbb{X}$ and $\mathbb{Y}$, and $\xi$ denotes the data distortion (noise). We allow unbounded operators and assume that $\mathcal{D}(\mathbf{A})$ is dense. Moreover, we assume that the unknown object $x$ is an element of a closed subspace $\mathbb{M} \subseteq \mathcal{D}(\mathbf{A})$ on which $\mathbf{A}$ is bounded. We are particularly interested in problems where (1.1) is ill-posed, in which case the solution of (1.1) (if it exists) is either not unique or the solution operator is not continuous (hence, the solution process is unstable with respect to data perturbations). In order to stabilize the inversion of (1.1), one has to apply regularization methods, cf. [11, 23]. The basic idea of regularization is to include a-priori information about the unknown object in the solution process.

In this paper, we use sparsity based regularization, where the a-priori assumption on the unknown object is sparsity of $x$ with respect to a frame $(u_\lambda)_{\lambda\in\Lambda}$ of $\mathbb{M}$, cf. [23, 14, 7, 17, 21, 3]. That is, we regularize the recovery of $x$ from measurements (1.1) by enforcing sparsity of $x$ with respect to a suitably chosen frame. Sparse regularization is well investigated and has been applied to many different imaging problems, and by now many algorithms are available that implement sparse regularization. However, when dealing with frames, there are at least two fundamentally different concepts implementing sparsity, namely the synthesis and the analysis variant. The reason for this lies in the fact that expansions of $x$ with respect to frames are not unique (in contrast to orthonormal basis expansions). In the synthesis variant, it is assumed that the unknown $x$ is a sparse linear combination of frame elements, whereas in the analysis variant it is required that the inner products of $x$ with the elements of a given frame are sparse. The difference between these approaches has been pointed out clearly in [10].

Sparse regularization is widely used in inverse problems as it provides good regularization results and is able to preserve or emphasize features (e.g., edges) in the reconstruction. However, this often comes at the price of speed, since most of the algorithms implementing sparse regularization are based on variational formulations that are solved by iterative schemes.

In the present paper, we investigate a third variant of sparse regularization that is based on an operator-adapted biorthogonal frame decomposition (BFD) of the unknown object, cf. [9, 4, 6], and which allows us to define a direct (non-iterative) sparse regularization method. In the noise-free case ($\xi = 0$), explicit reproducing formulas for the unknown object can be derived from the BFD, where the frame coefficients of $x$ are calculated directly from the data $y = \mathbf{A}x$. In the presence of noise ($\xi \neq 0$), regularized versions of those formulas are obtained by applying component-wise soft-thresholding to the calculated coefficients, where the soft-thresholding operator is defined as follows:

Definition 1.1 (Soft-thresholding).

Let $\Lambda$ be some index set.

  • For $a \in \mathbb{R}$ and $\alpha \ge 0$ let $\operatorname{soft}_\alpha(a) := \operatorname{sign}(a)\,\max\{|a| - \alpha, 0\}$.

  • For $c = (c_\lambda)_{\lambda\in\Lambda} \in \ell^2(\Lambda)$ and thresholds $\alpha = (\alpha_\lambda)_{\lambda\in\Lambda} \in [0,\infty)^\Lambda$ we define the component-wise soft-thresholding $\mathbb{S}_\alpha$ by

    (1.2)   $\mathbb{S}_\alpha(c) := \bigl(\operatorname{soft}_{\alpha_\lambda}(c_\lambda)\bigr)_{\lambda\in\Lambda}.$

Here and below we set $\operatorname{sign}(a) := a/|a|$ for $a \neq 0$ and $\operatorname{sign}(0) := 0$. The advantage of the BFD variant of sparse regularization lies in the fact that it admits a non-iterative (direct) and fast implementation that can easily be realized for several inverse problems.
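For concreteness, the following is a minimal numerical sketch of the component-wise soft-thresholding operator from Definition 1.1; the function name and the NumPy representation of the coefficient sequence are our own illustrative choices.

```python
# Minimal sketch of component-wise soft-thresholding (Definition 1.1),
# with coefficients stored as a NumPy array indexed by the index set Lambda.
import numpy as np

def soft(c, alpha):
    """Return sign(c) * max(|c| - alpha, 0), applied component-wise.

    `alpha` may be a scalar or an array of per-coefficient thresholds
    (e.g., alpha * w_lambda for weights w).
    """
    return np.sign(c) * np.maximum(np.abs(c) - alpha, 0.0)
```

Coefficients with magnitude below the threshold are set to zero and all others are shrunk toward zero, which is the source of the stability gained in Section 4.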

We point out that the three variants of sparse regularization mentioned above are equivalent if orthonormal bases are used instead of frames, but they are fundamentally different in the redundant case.

As the main theoretical results in this paper, we show that the third variant of sparse regularization, which we call BFD-thresholding, defines a convergent regularization method, and we derive linear convergence rates for sparse solutions. For the basis case, the same results follow from existing results on ℓ^1-regularization [7, 14, 15]. In the redundant case, the results follow from [15] for the synthesis approach and from [16] for the analysis approach. In the case of BFD-thresholding, we are not aware of any existing results concerning convergence analysis or convergence rates.

Outline

This paper is organized as follows. In Section 2 we define biorthogonal frame decompositions of operators and give several examples of biorthogonal frame expansions for various operators using wavelet, curvelet, and shearlet frames. In Section 3 we review the convergence theory of ℓ^1-regularization and the corresponding convergence rates. In Section 4, we show that BFD-thresholding is a convergent regularization method and derive its convergence rates.

2 Biorthogonal frame decomposition

In this section, we introduce the concept of operator-adapted biorthogonal frame decompositions (BFD) and discuss some classical examples of such BFDs for the 2D Radon transform and for the forward operator of photoacoustic tomography with a flat observation surface.

2.1 Formal definition

We define the operator-adapted biorthogonal frame decomposition as a generalization of the wavelet-vaguelette decomposition and the biorthogonal curvelet or shearlet decompositions to general frames, cf. [9, 4, 6].

Definition 2.1 (Biorthogonal frame decomposition (BFD)).

Let $\mathbb{M} \subseteq \mathbb{X}$ be a closed subspace and let $(\kappa_\lambda)_{\lambda\in\Lambda}$ be a family of positive numbers. For a linear operator $\mathbf{A} \colon \mathcal{D}(\mathbf{A}) \subseteq \mathbb{X} \to \mathbb{Y}$, we call $(u_\lambda, v_\lambda, \kappa_\lambda)_{\lambda\in\Lambda}$ a biorthogonal frame decomposition (BFD) for $\mathbf{A}$, if the following conditions hold:

  (D1) $(u_\lambda)_{\lambda\in\Lambda}$ is a frame of $\mathbb{M}$,

  (D2) $(v_\lambda)_{\lambda\in\Lambda}$ is a frame of $\overline{\operatorname{ran}(\mathbf{A})}$,

  (D3) $\forall \lambda \in \Lambda \colon \mathbf{A}^* v_\lambda = \kappa_\lambda u_\lambda$.

Remark 2.2.

The BFD generalizes the singular value decomposition (SVD) and the wavelet-vaguelette decomposition (WVD) (cf. [9]) as it allows the systems $(u_\lambda)_{\lambda\in\Lambda}$ and $(v_\lambda)_{\lambda\in\Lambda}$ to be non-orthogonal and redundant. Note that by (D2) and (D3), the frame $(u_\lambda)_{\lambda\in\Lambda}$ satisfies $u_\lambda \in \operatorname{ran}(\mathbf{A}^*)$, where we have made use of the identity $u_\lambda = \kappa_\lambda^{-1} \mathbf{A}^* v_\lambda$. For typical inverse problems this yields a notable smoothness assumption on the involved elements of the frame $(u_\lambda)_{\lambda\in\Lambda}$.

Although the SVD has proven itself to be a useful tool for analyzing and solving inverse problems, it also has the following drawbacks: First, the ONBs provided by the SVD (though optimally adapted to the operator under consideration) in many cases do not provide sparse representations of the signals of interest and, hence, are not suitable for use in sparse regularization. In particular, frames that do provide sparse representations (such as wavelets or wavelet-like systems) are usually not part of an SVD. Second, the SVD is often very hard to compute or not known analytically for many practical applications.

To overcome some of those difficulties, wavelet-vaguelette decompositions were introduced. Nevertheless, this concept builds upon expansions of signals with respect to orthogonal wavelet systems, which may not provide an optimal sparse representation of the signals of interest, e.g., signals with sharp edges. Thus, by allowing general frames, the BFD offers great flexibility in the choice of a suitable function system for sparse regularization while retaining the advantages of operator-adapted decompositions.

Definition 2.3.

For a frame $(u_\lambda)_{\lambda\in\Lambda}$ of $\mathbb{M}$, the synthesis operator $\mathbf{U} \colon \ell^2(\Lambda) \to \mathbb{M}$ is defined as $\mathbf{U} c := \sum_{\lambda\in\Lambda} c_\lambda u_\lambda$, and the corresponding analysis operator is defined as its adjoint $\mathbf{U}^* \colon \mathbb{M} \to \ell^2(\Lambda)$, $\mathbf{U}^* x = (\langle x, u_\lambda \rangle)_{\lambda\in\Lambda}$.

In what follows, the synthesis operator of a frame will always be denoted by the corresponding upper case letter; e.g., if $(v_\lambda)_{\lambda\in\Lambda}$ is a frame, then $\mathbf{V}$ denotes the corresponding synthesis operator and $\mathbf{V}^*$ the analysis operator.

In order to simplify the notation, we will also refer to $(u_\lambda, v_\lambda, \kappa_\lambda)_{\lambda\in\Lambda}$ simply as a BFD instead of using the full notation.

If a BFD exists for an operator $\mathbf{A}$, it immediately gives rise to a reproducing formula

(2.1)   $\forall x \in \mathbb{M} \colon \quad x = \sum_{\lambda\in\Lambda} \kappa_\lambda^{-1} \langle \mathbf{A}x, v_\lambda \rangle\, \bar{u}_\lambda = \bar{\mathbf{U}}\, \mathbf{M}_{1/\kappa}\, \mathbf{V}^* \mathbf{A} x,$

where $(\bar{u}_\lambda)_{\lambda\in\Lambda}$ is the dual frame to $(u_\lambda)_{\lambda\in\Lambda}$ (cf. [5]) and $\bar{\mathbf{U}}$ the corresponding synthesis operator. Moreover, $\mathbf{M}_{1/\kappa}$ denotes the operator performing pointwise division by $(\kappa_\lambda)_{\lambda\in\Lambda}$, i.e.,

(2.2)   $\mathbf{M}_{1/\kappa}\, (c_\lambda)_{\lambda\in\Lambda} := (c_\lambda / \kappa_\lambda)_{\lambda\in\Lambda}.$

Hence, from given (clean) data $y = \mathbf{A}x$, one can calculate the frame coefficients $\langle x, u_\lambda \rangle$ of $x$ and obtain a reconstruction via (2.1). The key to the practical use of this reproducing formula is the efficient implementation of the analysis and synthesis operators $\mathbf{V}^*$ and $\bar{\mathbf{U}}$, respectively. For particular cases, we will provide efficient and easy-to-implement algorithms for the evaluation of $\mathbf{V}^*$ and $\bar{\mathbf{U}}$.
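As an illustration, here is a minimal sketch of how the reproducing formula (2.1) can be evaluated numerically, assuming the analysis operator $\mathbf{V}^*$, the dual-frame synthesis operator $\bar{\mathbf{U}}$ and the quasi-singular values are supplied by the user (all names are placeholders):

```python
import numpy as np

def bfd_reconstruct(y, analysis_v, synthesis_u_bar, kappa):
    """Evaluate the reproducing formula (2.1) for clean data y = A x.

    analysis_v      -- callable computing (<y, v_lambda>)_lambda
    synthesis_u_bar -- callable synthesizing with the dual frame of (u_lambda)
    kappa           -- NumPy array of quasi-singular values kappa_lambda
    """
    coeffs = analysis_v(y) / kappa   # pointwise division: M_{1/kappa} V* y
    return synthesis_u_bar(coeffs)   # synthesis with the dual frame
```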

Note that the reproducing formula (2.1) (similarly to the SVD) reveals the ill-posedness of the operator equation through the decay of the quasi-singular values $(\kappa_\lambda)_{\lambda\in\Lambda}$. A regularized version of the reproducing formula (2.1) can be obtained by incorporating soft-thresholding of the frame coefficients. In Section 4, we present a complete analysis of this approach, as we are not aware of any results for general BFD-thresholding in the context of regularization theory.

We now provide several examples of BFDs, including the wavelet-vaguelette decomposition and the biorthogonal curvelet decomposition [4] for the Radon transform, as well as a BFD for the forward operator of photoacoustic tomography with a flat observation surface.

2.2 Radon transform

Definition 2.4 (Radon transform).

The Radon transform $\mathbf{R}$ is defined by

(2.3)   $\mathbf{R}f(s, \theta) := \int_{\mathbb{R}} f(s\theta + t\theta^\perp)\, \mathrm{d}t \quad \text{for } (s, \theta) \in \mathbb{R} \times S^1.$

It is well known that the Radon transform is bounded on $L^2(\Omega)$ for any bounded domain $\Omega \subseteq \mathbb{R}^2$, see [20]. Let $\mathcal{F}$ be the Fourier transform with respect to the first component and consider the Riesz potential $I^\gamma$ [20], defined by

(2.4)   $\mathcal{F}(I^\gamma g)(\sigma, \theta) := |\sigma|^{-\gamma}\, (\mathcal{F}g)(\sigma, \theta)$

for $\gamma < 1$. The following hold:

  (R1) The commutation relation $\mathbf{R}\, I^\gamma = I^\gamma\, \mathbf{R}$, where the Riesz potential on the right-hand side acts in the first (radial) variable.

  (R2) The filtered backprojection formula $\operatorname{Id} = \tfrac{1}{4\pi}\, \mathbf{R}^*\, I^{-1}\, \mathbf{R}$.

  (R3) The isometry property $\|\tfrac{1}{\sqrt{4\pi}}\, I^{-1/2}\, \mathbf{R} f\|_{L^2} = \|f\|_{L^2}$.

Using these ingredients, one can obtain a BFD for the Radon transform as follows:

Example 2.5.

(BFD for the Radon transform.) Let $(u_\lambda)_{\lambda\in\Lambda}$ be either an orthonormal basis of wavelets with compact support, a (band-limited) curvelet tight frame, or a shearlet tight frame, where each index $\lambda$ carries a scale index $j$. Then $(u_\lambda, v_\lambda, \kappa_\lambda)_{\lambda\in\Lambda}$ is a BFD for the Radon transform with

(2.5)   $v_\lambda := \frac{\kappa_\lambda}{4\pi}\, I^{-1}\, \mathbf{R}\, u_\lambda,$
(2.6)   $\kappa_\lambda := 2^{-j/2}.$

These results have been obtained in [9] for wavelet bases, in [4] for curvelet systems and in [6] for the shearlet frame. All cases are shown in a similar manner and basically follow from (R1), (R2) and the fact that the resulting systems $(v_\lambda)_{\lambda\in\Lambda}$ form frames of $\overline{\operatorname{ran}(\mathbf{R})}$ for any of the considered systems. The limited data case has been studied in [12].

Equation (2.5) implies

(2.7)   $\forall \lambda \in \Lambda \colon \quad \langle y, v_\lambda \rangle = \kappa_\lambda\, \Bigl\langle \tfrac{1}{4\pi}\, \mathbf{R}^* I^{-1} y,\; u_\lambda \Bigr\rangle.$

This gives an efficient numerical algorithm for the evaluation of $\mathbf{V}^* y$ provided that the analysis operator $\mathbf{U}^*$ is associated with an efficient algorithm. This is in particular the case for the wavelet, shearlet and curvelet frames considered above.

Remark 2.6.

We would like to note that in order to define a BFD for the case of curvelets or shearlets, one needs to consider the Radon transform on subspaces of $L^2(\mathbb{R}^2)$ consisting of functions that are defined on unbounded domains (since band-limited curvelets or shearlets have non-compact support). However, since the Radon transform is an unbounded operator on $L^2(\mathbb{R}^2)$, the reproducing formula (2.1) will not hold any more in general. The reproducing formula is at least available for the case that the object can be represented as a finite linear combination of curvelets or shearlets (cf. [4] and [6]).

Another possibility would be to consider projections of the curvelet or shearlet frames onto the subspace $\mathbb{M}$, which would yield a frame for this space (cf. [5]), and then define the BFD in the same way as above. Because the Radon transform is continuous on $\mathbb{M}$, the reproducing formula (2.1) will then hold for general linear combinations.

Algorithm 2.7.

(Computing BFD coefficients for the Radon transform.) Let $(u_\lambda)_{\lambda\in\Lambda}$ be a wavelet, shearlet or curvelet frame and define $(v_\lambda)_{\lambda\in\Lambda}$ by (2.5).

  1. Input: data $y = \mathbf{R}x$.

  2. Compute the filtered backprojection $\tilde{x} := \tfrac{1}{4\pi}\, \mathbf{R}^* I^{-1} y$.

  3. Compute $(\langle \tilde{x}, u_\lambda \rangle)_{\lambda\in\Lambda}$ via the wavelet, curvelet or shearlet transform.

  4. Apply the rescaling $\langle y, v_\lambda \rangle = \kappa_\lambda\, \langle \tilde{x}, u_\lambda \rangle$.

  5. Output: coefficients $(\langle y, v_\lambda \rangle)_{\lambda\in\Lambda}$.
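The following is a hedged Python sketch of Algorithm 2.7 for discrete data. It uses scikit-image's filtered backprojection as a stand-in for $\tfrac{1}{4\pi}\, \mathbf{R}^* I^{-1}$ and an orthonormal wavelet transform from PyWavelets; discretization and normalization constants are implementation-dependent, and the mapping of decomposition levels to the scale index $j$ is a simplification.

```python
import numpy as np
import pywt
from skimage.transform import iradon

def radon_bfd_coefficients(sinogram, theta, wavelet="db4", levels=4):
    # Step 2: filtered backprojection (ramp filter) of the data.
    fbp = iradon(sinogram, theta=theta, filter_name="ramp")
    # Step 3: wavelet analysis coefficients of the backprojected image.
    coeffs = pywt.wavedec2(fbp, wavelet=wavelet, level=levels)
    # Step 4: rescale detail coefficients at scale j by kappa_j = 2^{-j/2},
    # cf. (2.6); here j increases toward finer scales.
    rescaled = [coeffs[0]]  # approximation (coarsest) part kept as is
    for j, details in enumerate(coeffs[1:], start=1):
        kappa_j = 2.0 ** (-j / 2)
        rescaled.append(tuple(kappa_j * d for d in details))
    return rescaled
```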

2.3 Inversion of the wave equation

We consider a planar geometry, which has been considered in our previous work [13]. Let $C_c^\infty(\mathbb{H})$ denote the space of smooth functions that are compactly supported in the open half space $\mathbb{H} := \{x \in \mathbb{R}^d \colon x_d > 0\}$. For $f \in C_c^\infty(\mathbb{H})$ consider the initial value problem

(2.8)   $\partial_t^2 p = \Delta p \;\text{ on } \mathbb{R}^d \times (0,\infty), \qquad p(\,\cdot\,, 0) = f, \qquad \partial_t p(\,\cdot\,, 0) = 0.$

The trace map $f \mapsto p|_{\partial\mathbb{H} \times (0,\infty)}$, where $p$ is the solution of (2.8), is known to be an isometry from $L^2(\mathbb{H})$ to $L^2(\partial\mathbb{H} \times (0,\infty))$ (after suitable normalization), see [2, 13, 19]. In particular, the operator is continuous.

Definition 2.8 (Forward operator for the wave equation).

We define $\mathbf{A} \colon L^2(\mathbb{H}) \to L^2(\partial\mathbb{H} \times (0,\infty))$ by $\mathbf{A} f := p|_{\partial\mathbb{H} \times (0,\infty)}$ for $f \in C_c^\infty(\mathbb{H})$, where $p$ is the solution of (2.8), and by extending it by continuity to $L^2(\mathbb{H})$.

The isometry property implies that any frame $(u_\lambda)_{\lambda\in\Lambda}$ of $L^2(\mathbb{H})$ gives a BFD by setting $v_\lambda := \mathbf{A} u_\lambda$ and $\kappa_\lambda := 1$. This, in particular, yields a wavelet-vaguelette decomposition and a biorthogonal curvelet decomposition for the wave equation.

Example 2.9.

(BFD for the wave equation.) Let $(u_\lambda)_{\lambda\in\Lambda}$ be either a wavelet frame, a curvelet frame or a shearlet frame of $L^2(\mathbb{H})$. Then $(u_\lambda, v_\lambda, \kappa_\lambda)_{\lambda\in\Lambda}$ is a BFD for $\mathbf{A}$ with

(2.9)   $v_\lambda := \mathbf{A} u_\lambda,$
(2.10)  $\kappa_\lambda := 1.$

As noted in [13], this result directly follows from the isometry property and the associated inversion formula $\operatorname{Id} = \mathbf{A}^* \mathbf{A}$.

The isometry property also gives an efficient numerical algorithm for computing the analysis coefficients $(\langle y, v_\lambda \rangle)_{\lambda\in\Lambda}$ with respect to the frame $(v_\lambda)_{\lambda\in\Lambda}$, in the case that $(u_\lambda)_{\lambda\in\Lambda}$ is associated with an efficient transform algorithm.

Algorithm 2.10.

(Computing BFD coefficients for the wave equation.) Let $(u_\lambda)_{\lambda\in\Lambda}$ be a wavelet, curvelet or shearlet frame and define $(v_\lambda)_{\lambda\in\Lambda}$ by (2.9).

  1. Input: data $y = \mathbf{A}f$.

  2. Compute $\tilde{f} := \mathbf{A}^* y$.

  3. Compute $(\langle \tilde{f}, u_\lambda \rangle)_{\lambda\in\Lambda}$ via the wavelet, curvelet or shearlet transform.

  4. Output: coefficients $(\langle y, v_\lambda \rangle)_{\lambda\in\Lambda} = (\langle \tilde{f}, u_\lambda \rangle)_{\lambda\in\Lambda}$.

The algorithm described above can be used for any problem where the forward operator is an isometry, as the sketch below illustrates. In the case of the wavelet transform, this simple procedure has been previously used in [13].
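A minimal sketch of Algorithm 2.10 for a generic isometric forward operator; the callable `adjoint` (standing in for $\mathbf{A}^*$, e.g., a time-reversal solver for (2.8)) and the wavelet choice are placeholders.

```python
import pywt

def isometry_bfd_coefficients(y, adjoint, wavelet="db4", levels=4):
    # Step 2: apply the adjoint; since A is an isometry, A* A = Id,
    # so A* y recovers f from clean data y = A f.
    f_tilde = adjoint(y)
    # Step 3: analysis coefficients; kappa_lambda = 1, so no rescaling.
    return pywt.wavedec2(f_tilde, wavelet=wavelet, level=levels)
```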

3 Sparse ℓ^1-regularization

There are two fundamentally different and well-studied instances of sparse frame-based regularization, namely ℓ^1-analysis regularization and ℓ^1-synthesis regularization. They are defined by

(3.1)   $\mathcal{A}_\alpha(y^\delta) := \operatorname*{arg\,min}_{x \in \mathbb{X}}\; \tfrac12 \|\mathbf{A}x - y^\delta\|^2 + \alpha \sum_{\lambda\in\Lambda} w_\lambda\, |\langle x, u_\lambda \rangle|,$

(3.2)   $\mathcal{S}_\alpha(y^\delta) := \bar{\mathbf{U}} \Bigl( \operatorname*{arg\,min}_{c \in \ell^2(\Lambda)}\; \tfrac12 \|\mathbf{A}\bar{\mathbf{U}} c - y^\delta\|^2 + \alpha \sum_{\lambda\in\Lambda} w_\lambda\, |c_\lambda| \Bigr),$

respectively, with weights $(w_\lambda)_{\lambda\in\Lambda} \in (0,\infty)^\Lambda$.

Definition 3.1.

We call $x \in \mathbb{X}$ sparse with respect to the frame $(u_\lambda)_{\lambda\in\Lambda}$ if the set $\{\lambda \in \Lambda \colon \langle x, u_\lambda \rangle \neq 0\}$ is finite. Likewise, $c \in \ell^2(\Lambda)$ is called sparse if $\{\lambda \in \Lambda \colon c_\lambda \neq 0\}$ is finite.

If $c$ is sparse, we write $\operatorname{sign}(c) := (\operatorname{sign}(c_\lambda))_{\lambda\in\Lambda}$, where $\operatorname{sign}$ is the multi-valued signum function defined by $\operatorname{sign}(a) := \{a/|a|\}$ for $a \neq 0$ and $\operatorname{sign}(0) := [-1, 1]$. We will use the notation

(3.3)   $\|c\|_{1,w} := \sum_{\lambda\in\Lambda} w_\lambda\, |c_\lambda|.$

Any element in the set $\operatorname*{arg\,min} \{\|\mathbf{U}^* x\|_{1,w} \colon \mathbf{A}x = y\}$ is called a $\|\cdot\|_{1,w}$-minimizing solution of $\mathbf{A}x = y$. Note that $\|\cdot\|_{1,w}$-minimizing solutions exist whenever there is any solution $x$ with $\|\mathbf{U}^* x\|_{1,w} < \infty$, as follows from [23, Theorem 3.25].

Below we recall well-posedness and convergence results for both variants. These results hold under the following quite weak assumptions:

  (A1) $\mathbf{A} \colon \mathbb{X} \to \mathbb{Y}$ is bounded linear;

  (A2) $\mathbf{U}$, $\bar{\mathbf{U}}$ are synthesis operators of frames $(u_\lambda)_{\lambda\in\Lambda}$, $(\bar{u}_\lambda)_{\lambda\in\Lambda}$ of $\mathbb{X}$;

  (A3) $w = (w_\lambda)_{\lambda\in\Lambda}$ satisfies $\inf_{\lambda\in\Lambda} w_\lambda > 0$.

For certain sparse elements, we will state linear error estimates which have been derived in [14, 16]. See [7, 17, 21, 3, 23] for some further works on sparse ℓ^1-regularization, and [8, 18, 22] for wavelet regularization methods.

3.1 ℓ^1-analysis regularization

Let us define the ℓ^1-analysis Tikhonov functional by

(3.4)   $\mathcal{T}_{\alpha, y^\delta}(x) := \tfrac12 \|\mathbf{A}x - y^\delta\|^2 + \alpha\, \|\mathbf{U}^* x\|_{1,w}.$

Then we have $\mathcal{A}_\alpha(y^\delta) = \operatorname*{arg\,min}\, \mathcal{T}_{\alpha, y^\delta}$.

Proposition 3.2 (Convergence of analysis regularization).

Let (A1)–(A3) be satisfied, suppose $y \in \mathbb{Y}$, $\alpha > 0$, and $y^\delta \in \mathbb{Y}$ with $\|y - y^\delta\| \le \delta$, and choose $\alpha > 0$.

  • Existence: The functional $\mathcal{T}_{\alpha, y^\delta}$ has at least one minimizer.

  • Stability: Let $(y_k)_{k\in\mathbb{N}}$ converge to $y^\delta$ and let $x_k \in \operatorname*{arg\,min}\, \mathcal{T}_{\alpha, y_k}$. There exists a subsequence $(x_{k(m)})_{m\in\mathbb{N}}$ of $(x_k)_{k\in\mathbb{N}}$ and a minimizer $x_\alpha^\delta$ of $\mathcal{T}_{\alpha, y^\delta}$ such that $x_{k(m)} \to x_\alpha^\delta$. If the minimizer of $\mathcal{T}_{\alpha, y^\delta}$ is unique, then $x_k \to x_\alpha^\delta$.

  • Convergence: Assume $y = \mathbf{A}x$ for some $x$ with $\|\mathbf{U}^* x\|_{1,w} < \infty$ and suppose $(y_k)_{k\in\mathbb{N}}$ satisfies $\|y - y_k\| \le \delta_k$ with $\delta_k \to 0$. Consider a parameter choice $(\alpha_k)_{k\in\mathbb{N}}$ such that $\alpha_k \to 0$ and $\delta_k^2 / \alpha_k \to 0$, and let $x_k \in \operatorname*{arg\,min}\, \mathcal{T}_{\alpha_k, y_k}$. There is a $\|\cdot\|_{1,w}$-minimizing solution $x^+$ of $\mathbf{A}x = y$ and a subsequence $(x_{k(m)})_{m\in\mathbb{N}}$ with $x_{k(m)} \to x^+$. If the $\|\cdot\|_{1,w}$-minimizing solution is unique, then $x_k \to x^+$.

Proof.

See [14, Propositions 5, 6 and 7]. ∎

In order to derive convergence rates, one has to make additional assumptions on the exact solution $x^+$ to be recovered. Besides sparsity, this requires a certain interplay between the frame $(u_\lambda)_{\lambda\in\Lambda}$ and the forward operator $\mathbf{A}$:

  (A4) $\mathbf{U}^* x^+$ is sparse;

  (A5) there exists $\eta \in \mathbb{Y}$ such that $\mathbf{A}^* \eta \in \partial \|\mathbf{U}^*(\,\cdot\,)\|_{1,w}(x^+)$;

  (A6) $\mathbf{A}$ is injective on $\mathbb{X}_{\Lambda^+} := \operatorname{span}\{\bar{u}_\lambda \colon \lambda \in \Lambda^+\}$, where $\Lambda^+ := \operatorname{supp}(\mathbf{U}^* x^+)$.

Assumption (A5) is a so-called source condition and the main restrictive assumption. It requires that there exists an element $\eta \in \mathbb{Y}$ that satisfies the smoothness assumption $\mathbf{A}^* \eta \in \partial \|\mathbf{U}^*(\,\cdot\,)\|_{1,w}(x^+)$. Because $\mathbf{U}^* x^+$ is sparse, the space $\mathbb{X}_{\Lambda^+}$ is finite dimensional. Therefore, Condition (A6) requires injectivity only on a certain finite dimensional subspace.

Proposition 3.3 (Convergence rates for analysis regularization).

Suppose (A1)–(A6) hold. Then, for the parameter choice $\alpha \sim \delta$, there is a constant $C > 0$ such that for all $y^\delta$ with $\|\mathbf{A}x^+ - y^\delta\| \le \delta$ and every minimizer $x_\alpha^\delta$ of $\mathcal{T}_{\alpha, y^\delta}$ we have $\|x_\alpha^\delta - x^+\| \le C\,\delta$.

Proof.

See [16, Theorem III.8]. ∎

3.2 Synthesis regularization

Let us denote the ℓ^1-synthesis Tikhonov functional by

(3.5)   $\mathcal{T}^{\mathrm{syn}}_{\alpha, y^\delta}(c) := \tfrac12 \|\mathbf{A}\bar{\mathbf{U}} c - y^\delta\|^2 + \alpha\, \|c\|_{1,w}.$

Then it holds that $\mathcal{S}_\alpha(y^\delta) = \bar{\mathbf{U}}\,(\operatorname*{arg\,min}\, \mathcal{T}^{\mathrm{syn}}_{\alpha, y^\delta})$. Synthesis regularization can be seen as analysis regularization for the coefficient inverse problem $\mathbf{A}\bar{\mathbf{U}} c = y$ with the identity as analysis operator. Using Proposition 3.2, we therefore have the following result.

Proposition 3.4 (Convergence of synthesis regularization).

Let (A1)–(A3) be satisfied, suppose $y \in \mathbb{Y}$, $\alpha > 0$, and $y^\delta \in \mathbb{Y}$ with $\|y - y^\delta\| \le \delta$, and take $\alpha > 0$.

  • Existence: The functional $\mathcal{T}^{\mathrm{syn}}_{\alpha, y^\delta}$ has at least one minimizer.

  • Stability: Let $(y_k)_{k\in\mathbb{N}}$ converge to $y^\delta$ and let $c_k \in \operatorname*{arg\,min}\, \mathcal{T}^{\mathrm{syn}}_{\alpha, y_k}$. There exists a subsequence $(c_{k(m)})_{m\in\mathbb{N}}$ and a minimizer $c_\alpha^\delta$ of $\mathcal{T}^{\mathrm{syn}}_{\alpha, y^\delta}$ such that $c_{k(m)} \to c_\alpha^\delta$. If the minimizer of $\mathcal{T}^{\mathrm{syn}}_{\alpha, y^\delta}$ is unique, then $c_k \to c_\alpha^\delta$.

  • Convergence: Assume $y = \mathbf{A}\bar{\mathbf{U}}c$ for some $c$ with $\|c\|_{1,w} < \infty$ and suppose $\|y - y_k\| \le \delta_k$ with $\delta_k \to 0$. Consider a parameter choice $(\alpha_k)_{k\in\mathbb{N}}$ with $\alpha_k \to 0$ and $\delta_k^2 / \alpha_k \to 0$, and let $c_k \in \operatorname*{arg\,min}\, \mathcal{T}^{\mathrm{syn}}_{\alpha_k, y_k}$. There exists a $\|\cdot\|_{1,w}$-minimizing solution $c^+$ of $\mathbf{A}\bar{\mathbf{U}}c = y$ and a subsequence $(c_{k(m)})_{m\in\mathbb{N}}$ with $c_{k(m)} \to c^+$. If the $\|\cdot\|_{1,w}$-minimizing solution is unique, then $c_k \to c^+$.

Proof.

Follows from Proposition 3.2 with $\mathbf{A}\bar{\mathbf{U}}$ in place of $\mathbf{A}$ and the identity in place of $\mathbf{U}^*$. ∎

We have linear convergence rates under the following additional assumptions on the element $x^+$ to be recovered:

  (S1) $x^+ = \bar{\mathbf{U}} c^+$, where $c^+ \in \ell^2(\Lambda)$ is sparse;

  (S2) there exists $\eta \in \mathbb{Y}$ such that $(\mathbf{A}\bar{\mathbf{U}})^* \eta \in \partial \|\cdot\|_{1,w}(c^+)$;

  (S3) $\mathbf{A}\bar{\mathbf{U}}$ is injective on $\operatorname{span}\{\mathbf{e}_\lambda \colon \lambda \in \operatorname{supp}(c^+)\}$, where $(\mathbf{e}_\lambda)_{\lambda\in\Lambda}$ denotes the standard basis of $\ell^2(\Lambda)$.
Proposition 3.5 (Convergence rates for synthesis regularization).

Suppose that (A1)–(A3) and (S1)–(S3) hold. Then, for the parameter choice $\alpha \sim \delta$, there is a constant $C > 0$ such that for all $y^\delta$ with $\|\mathbf{A}x^+ - y^\delta\| \le \delta$ and every minimizer $c_\alpha^\delta$ of $\mathcal{T}^{\mathrm{syn}}_{\alpha, y^\delta}$ we have $\|c_\alpha^\delta - c^+\| \le C\,\delta$.

Proof.

Follows from Proposition 3.3. ∎

Because $\bar{\mathbf{U}}$ is bounded, the above convergence results can be transferred to convergence in the signal space $\mathbb{X}$. In particular, we have stability and convergence under the assumptions made in Proposition 3.4, and linear convergence rates $\|\bar{\mathbf{U}} c_\alpha^\delta - x^+\| \le \|\bar{\mathbf{U}}\|\, C\,\delta$ under those of Proposition 3.5.

3.3 Sparse regularization using an SVD

In the special case that $(u_\lambda)_{\lambda\in\Lambda}$ is part of an SVD, analysis and synthesis regularization are equivalent and can be computed explicitly by soft-thresholding of the expansion coefficients.

Theorem 3.6 (Equivalence in the SVD case).

Let $(u_\lambda, v_\lambda, \kappa_\lambda)_{\lambda\in\Lambda}$ be an SVD for $\mathbf{A}$, that is, $(u_\lambda)_{\lambda\in\Lambda}$ and $(v_\lambda)_{\lambda\in\Lambda}$ are orthonormal bases of $\mathbb{M}$ and $\overline{\operatorname{ran}(\mathbf{A})}$ with $\mathbf{A} u_\lambda = \kappa_\lambda v_\lambda$. Let $\alpha > 0$ and consider (3.1), (3.2) with $w_\lambda = 1$. Then

$\mathcal{A}_\alpha(y^\delta) = \mathcal{S}_\alpha(y^\delta) = \sum_{\lambda\in\Lambda} \operatorname{soft}_{\alpha/\kappa_\lambda^2}\bigl(\kappa_\lambda^{-1} \langle y^\delta, v_\lambda \rangle\bigr)\, u_\lambda,$

i.e., both variants equal the soft-thresholding estimator in the SVD system.

Proof.

Because $(v_\lambda)_{\lambda\in\Lambda}$ is an orthonormal basis of $\overline{\operatorname{ran}(\mathbf{A})}$, we have $\mathbf{A}x = \sum_{\lambda} \kappa_\lambda \langle x, u_\lambda \rangle v_\lambda$, which implies that $\mathcal{A}_\alpha(y^\delta) = \mathcal{S}_\alpha(y^\delta)$. Now let $x$ be any minimizer of the ℓ^1-analysis Tikhonov functional $\mathcal{T}_{\alpha, y^\delta}$ and write $c_\lambda := \langle x, u_\lambda \rangle$. Let $\mathbf{P}$ denote the orthogonal projection onto $\overline{\operatorname{ran}(\mathbf{A})}$. We have

$\mathcal{T}_{\alpha, y^\delta}(x) = \tfrac12 \|(\operatorname{Id} - \mathbf{P})\, y^\delta\|^2 + \sum_{\lambda\in\Lambda} \Bigl( \tfrac12 \bigl(\kappa_\lambda c_\lambda - \langle y^\delta, v_\lambda \rangle\bigr)^2 + \alpha\, |c_\lambda| \Bigr).$

The latter sum is minimized by component-wise soft-thresholding, $c_\lambda = \operatorname{soft}_{\alpha/\kappa_\lambda^2}\bigl(\kappa_\lambda^{-1} \langle y^\delta, v_\lambda \rangle\bigr)$. This shows the claimed identity and concludes the proof. ∎

In the case that $(u_\lambda, v_\lambda, \kappa_\lambda)_{\lambda\in\Lambda}$ is a redundant BFD expansion and not an SVD, the estimators (3.1), (3.2), and the soft-thresholding estimator

(3.6)   $y^\delta \mapsto \sum_{\lambda\in\Lambda} \operatorname{soft}_{\alpha w_\lambda}\bigl(\kappa_\lambda^{-1} \langle y^\delta, v_\lambda \rangle\bigr)\, \bar{u}_\lambda$

are all non-equivalent. Further, in this case (3.1) and (3.2) have to be computed by iterative minimization algorithms. This requires repeated application of the forward operator and its adjoint and is therefore time consuming, as the sketch below illustrates. In the following section, we study BFD thresholding, which generalizes the soft-thresholding estimator of Theorem 3.6 to redundant systems in the form (3.6). Despite the non-equivalence to ℓ^1-regularization, we are able to derive the same type of convergence results and linear convergence rates as for the analysis and synthesis variants of ℓ^1-regularization.
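For comparison with the direct method of Section 4, the following sketch shows the standard iterative soft-thresholding algorithm (ISTA, cf. [7]) for the synthesis problem (3.2); all operators are assumed to be given as callables, and the step size must be at most the reciprocal of the squared operator norm of $\mathbf{A}\bar{\mathbf{U}}$.

```python
import numpy as np

def ista_synthesis(y, A, At, U, Ut, alpha, w, step, n_iter=200):
    """Minimize 0.5*||A(U(c)) - y||^2 + alpha*sum(w*|c|) by ISTA.

    A/At -- forward operator and its adjoint
    U/Ut -- frame synthesis operator and its adjoint (analysis)
    step -- step size, step <= 1 / ||A o U||^2
    """
    c = np.zeros_like(Ut(At(y)))
    for _ in range(n_iter):
        grad = Ut(At(A(U(c)) - y))       # gradient of the data-fit term
        c = c - step * grad              # gradient descent step
        t = step * alpha * w             # per-coefficient threshold
        c = np.sign(c) * np.maximum(np.abs(c) - t, 0.0)  # soft-thresholding
    return U(c)                          # synthesized signal estimate
```

Note that every iteration applies the forward operator and its adjoint once, which is exactly the per-iteration cost that the direct BFD thresholding of Section 4 avoids.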

4 Regularization via BFD thresholding

Throughout this section we fix the following assumptions

  (B1) $\mathbf{A} \colon \mathbb{X} \to \mathbb{Y}$ is bounded linear;

  (B2) $(u_\lambda, v_\lambda, \kappa_\lambda)_{\lambda\in\Lambda}$ is a BFD for $\mathbf{A}$;

  (B3) $w = (w_\lambda)_{\lambda\in\Lambda}$ satisfies $\inf_{\lambda\in\Lambda} w_\lambda > 0$.

In this section we show the well-posedness, convergence and convergence rates for BFD soft-thresholding.

4.1 BFD soft-thresholding

Any BFD gives the explicit inversion formula $x = \bar{\mathbf{U}}\, \mathbf{M}_{1/\kappa}\, \mathbf{V}^* \mathbf{A} x$ from (2.1), where $\mathbf{M}_{1/\kappa}$ is defined by (2.2). For ill-posed problems we have $\inf_{\lambda\in\Lambda} \kappa_\lambda = 0$, and therefore the above reproducing formula is unstable when applied to noisy data $y^\delta$ instead of $y = \mathbf{A}x$. Below we stabilize the inversion by including the soft-thresholding operation.

Definition 4.1 (BFD soft-thresholding).

Let $(u_\lambda, v_\lambda, \kappa_\lambda)_{\lambda\in\Lambda}$ be a BFD for $\mathbf{A}$. We define the nonlinear BFD soft-thresholding estimator $\mathcal{B}_\alpha \colon \mathbb{Y} \to \mathbb{X}$ by

(4.1)   $\mathcal{B}_\alpha(y^\delta) := \bar{\mathbf{U}}\, \mathbb{S}_{\alpha w}\, \mathbf{M}_{1/\kappa}\, \mathbf{V}^* y^\delta = \sum_{\lambda\in\Lambda} \operatorname{soft}_{\alpha w_\lambda}\bigl(\kappa_\lambda^{-1} \langle y^\delta, v_\lambda \rangle\bigr)\, \bar{u}_\lambda.$

If $(u_\lambda)_{\lambda\in\Lambda}$ and $(v_\lambda)_{\lambda\in\Lambda}$ are ONBs, then Theorem 3.6 shows that (3.1) and (3.2) are equivalent to (4.1) (with correspondingly chosen thresholds). In the case of general frames, $\mathcal{A}_\alpha$, $\mathcal{S}_\alpha$ and $\mathcal{B}_\alpha$ are all different.

As the main result in this paper, we show that BFD soft-thresholding yields the same theoretical guarantees as ℓ^1-regularization. Assuming efficient implementations of $\mathbf{V}^*$ and $\bar{\mathbf{U}}$, the BFD estimator has the advantage that it can be calculated non-iteratively and is therefore much faster than $\mathcal{A}_\alpha$ and $\mathcal{S}_\alpha$; see the sketch below.
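Combining the pieces from Sections 1 and 2, here is a minimal sketch of the estimator (4.1); as before, the analysis and dual-frame synthesis operators are assumed to be supplied as callables, and all names are placeholders.

```python
import numpy as np

def bfd_soft_thresholding(y_delta, analysis_v, synthesis_u_bar,
                          kappa, w, alpha):
    """Direct (non-iterative) BFD soft-thresholding estimator (4.1)."""
    coeffs = analysis_v(y_delta) / kappa   # M_{1/kappa} V* y^delta
    t = alpha * w                          # thresholds alpha * w_lambda
    coeffs = np.sign(coeffs) * np.maximum(np.abs(coeffs) - t, 0.0)
    return synthesis_u_bar(coeffs)         # synthesis with the dual frame
```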

Consider the ℓ^1-Tikhonov functional for the multiplication operator $\mathbf{M}_\kappa \colon \ell^2(\Lambda) \to \ell^2(\Lambda)$, $\mathbf{M}_\kappa c := (\kappa_\lambda c_\lambda)_{\lambda\in\Lambda}$,

(4.2)   $\mathcal{T}^{\kappa}_{\alpha, z}(c) := \tfrac12 \|\mathbf{M}_\kappa c - z\|^2 + \alpha \sum_{\lambda\in\Lambda} \kappa_\lambda^2\, w_\lambda\, |c_\lambda|,$

where the weights $\kappa_\lambda^2 w_\lambda$ are chosen such that the minimizer of (4.2) is exactly the thresholded coefficient vector appearing in (4.1). The proof strategy used in this paper is based on the following lemma.

Lemma 4.2 (ℓ^1-minimization for multiplication operators).

Let $z \in \ell^2(\Lambda)$ and $\alpha > 0$. Then the following hold:

  1. $\mathcal{T}^{\kappa}_{\alpha, z}$ has a unique minimizer.

  2. $\operatorname*{arg\,min}\, \mathcal{T}^{\kappa}_{\alpha, z} = \mathbb{S}_{\alpha w}\bigl(\mathbf{M}_{1/\kappa}\, z\bigr)$.

  3. $\mathbb{S}_{\alpha w} \circ \mathbf{M}_{1/\kappa} \colon \ell^2(\Lambda) \to \ell^2(\Lambda)$ is continuous.

  4. If $\|\mathbf{M}_{1/\kappa}\, z\|_{1,w} < \infty$ and $\|z - z_k\| \le \delta_k$ with $\delta_k \to 0$, then $\mathbb{S}_{\alpha_k w}(\mathbf{M}_{1/\kappa}\, z_k) \to \mathbf{M}_{1/\kappa}\, z$ for any parameter choice $(\alpha_k)_{k\in\mathbb{N}}$ with $\alpha_k \to 0$ and $\delta_k^2 / \alpha_k \to 0$.

Proof.

Because the standard basis $(\mathbf{e}_\lambda)_{\lambda\in\Lambda}$ together with $(\kappa_\lambda)_{\lambda\in\Lambda}$ forms an SVD for $\mathbf{M}_\kappa$, Items 1 and 2 follow from Theorem 3.6, the equivalence of ℓ^1-regularization and soft-thresholding in the SVD case. Item 3 follows from Proposition 3.4 together with the uniqueness of the minimizer. Moreover, the equality shown in Item 2 combined with the convergence part of Proposition 3.4 shows Item 4. ∎

Note that the continuity of $\mathbb{S}_{\alpha w} \circ \mathbf{M}_{1/\kappa}$ (see Item 3 in the above lemma) is not obvious, as it is the composition of soft-thresholding with the unbounded operator $\mathbf{M}_{1/\kappa}$ on $\ell^2(\Lambda)$. The characterization in Item 2 as the minimizer of the ℓ^1-Tikhonov functional, together with the existing stability results for ℓ^1-Tikhonov regularization, yields an elegant way to obtain the continuity of $\mathbb{S}_{\alpha w} \circ \mathbf{M}_{1/\kappa}$. Verifying the continuity directly would also be possible but seems a harder task. A similar comment applies to the proof of Theorem 4.3, where we use the convergence of ℓ^1-Tikhonov regularization.

4.2 Convergence analysis

In this section, we show that BFD soft-thresholding is well-posed and convergent.

Theorem 4.3 (Well-posedness and convergence).

Let (B1)–(B3) be satisfied, suppose $y \in \mathbb{Y}$, $\alpha > 0$, and let $y^\delta \in \mathbb{Y}$ satisfy $\|y - y^\delta\| \le \delta$.

  1. Existence and stability: $\mathcal{B}_\alpha \colon \mathbb{Y} \to \mathbb{X}$ is well-defined and continuous.

  2. Convergence: Assume $y = \mathbf{A}x$ for some $x \in \mathbb{M}$ with $\|\mathbf{M}_{1/\kappa}\, \mathbf{V}^* y\|_{1,w} < \infty$, suppose $(y_k)_{k\in\mathbb{N}}$ satisfies $\|y - y_k\| \le \delta_k$ with $\delta_k \to 0$, and consider a parameter choice $(\alpha_k)_{k\in\mathbb{N}}$ with $\alpha_k \to 0$ and $\delta_k^2 / \alpha_k \to 0$. Then $\mathcal{B}_{\alpha_k}(y_k) \to x$.

Proof.

1: According to Lemma 4.2, the mapping $\mathbb{S}_{\alpha w} \circ \mathbf{M}_{1/\kappa}$ is well defined and continuous. Moreover, by definition we have $\mathcal{B}_\alpha = \bar{\mathbf{U}} \circ \mathbb{S}_{\alpha w} \circ \mathbf{M}_{1/\kappa} \circ \mathbf{V}^*$, which implies existence and stability of BFD thresholding.
2: Using the reproducing formula $x = \bar{\mathbf{U}}\, \mathbf{M}_{1/\kappa}\, \mathbf{V}^* y$, we have

(4.3)   $\|\mathcal{B}_{\alpha_k}(y_k) - x\| \le \|\bar{\mathbf{U}}\| \cdot \|\mathbb{S}_{\alpha_k w}(\mathbf{M}_{1/\kappa}\, \mathbf{V}^* y_k) - \mathbf{M}_{1/\kappa}\, \mathbf{V}^* y\|.$

Moreover, $\|\mathbf{V}^* y_k - \mathbf{V}^* y\| \le \|\mathbf{V}\|\, \delta_k$. Therefore, Proposition 3.4 (applied to the multiplication operator $\mathbf{M}_\kappa$) and the equality shown in Lemma 4.2 imply $\mathbb{S}_{\alpha_k w}(\mathbf{M}_{1/\kappa}\, \mathbf{V}^* y_k) \to \mathbf{M}_{1/\kappa}\, \mathbf{V}^* y$ for $k \to \infty$. Together with (4.3) this yields Item 2 and completes the proof. ∎

4.3 Convergence rates

Next we derive linear convergence rates for sparse solutions. Let us denote by $\operatorname{supp}(c) := \{\lambda \in \Lambda \colon c_\lambda \neq 0\}$ the support of $c \in \ell^2(\Lambda)$. To derive the convergence rates, we assume the following for the exact solution $x^+$ to be recovered:

  (B4) $c^+ := \mathbf{M}_{1/\kappa}\, \mathbf{V}^* \mathbf{A} x^+$ is sparse.

  (B5) $x^+ = \bar{\mathbf{U}}\, c^+$, i.e., the reproducing formula (2.1) holds for $x^+$.

Note that assumptions (B4), (B5) imply that the source condition

$\exists \eta \in \ell^2(\Lambda) \colon \quad \mathbf{M}_\kappa^* \eta \in \partial\Bigl(\textstyle\sum_{\lambda\in\Lambda} \kappa_\lambda^2\, w_\lambda\, |\,\cdot\,|\Bigr)(c^+)$

is satisfied for some element $\eta$ that can be chosen such that $\eta_\lambda = \kappa_\lambda w_\lambda \operatorname{sign}(c^+_\lambda)$ for $\lambda \in \operatorname{supp}(c^+)$ and $\eta_\lambda = 0$ otherwise. Moreover, it follows that $\mathbf{M}_\kappa$ is injective on $\operatorname{span}\{\mathbf{e}_\lambda \colon \lambda \in \operatorname{supp}(c^+)\}$ with $(\mathbf{e}_\lambda)_{\lambda\in\Lambda}$ the standard basis of $\ell^2(\Lambda)$. Because $\operatorname{supp}(c^+)$ is finite, this space is finite dimensional.

Theorem 4.4 (Convergence rates).

Suppose that (B1)–(B5) hold. Then, for the parameter choice $\alpha \sim \delta$, there is a constant $C > 0$ such that for all $y^\delta$ with $\|y - y^\delta\| \le \delta$ we have $\|\mathcal{B}_\alpha(y^\delta) - x^+\| \le C\,\delta$.

Proof.

As in the proof of Theorem 4.3, one obtains

(4.4)   $\|\mathcal{B}_\alpha(y^\delta) - x^+\| \le \|\bar{\mathbf{U}}\| \cdot \|\mathbb{S}_{\alpha w}(\mathbf{M}_{1/\kappa}\, \mathbf{V}^* y^\delta) - c^+\|,$
(4.5)   $\|\mathbf{V}^* y^\delta - \mathbf{M}_\kappa c^+\| \le \|\mathbf{V}\|\, \delta.$

According to the considerations below (B4), (B5), the conditions (A4)–(A6) are satisfied for the operator $\mathbf{M}_\kappa$ in place of $\mathbf{A}$ and $c^+$ in place of $x^+$. The convergence rates result in Proposition 3.3, estimate (4.5), and the identity shown in Lemma 4.2 imply $\|\mathbb{S}_{\alpha w}(\mathbf{M}_{1/\kappa}\, \mathbf{V}^* y^\delta) - c^+\| \le C'\,\delta$. Together with (4.4) this implies $\|\mathcal{B}_\alpha(y^\delta) - x^+\| \le C\,\delta$ and concludes the proof. ∎

5 Conclusion

To overcome the inherent ill-posedness of inverse problems, regularization methods incorporate available prior information about the unknowns to be reconstructed. In this context, a useful prior is sparsity with respect to a certain frame. There are at least two different regularization strategies implementing sparsity with respect to a frame, namely ℓ^1-analysis regularization and ℓ^1-synthesis regularization. In this paper, we analyzed BFD-thresholding as a third variant of sparse regularization. One advantage of BFD-thresholding compared to other sparse regularization methods is its non-iterative nature, leading to fast algorithms. Besides having a BFD, actually computing the BFD soft-thresholding estimator (4.1) requires the dual frame $(\bar{u}_\lambda)_{\lambda\in\Lambda}$. While in the general situation the BFD and the dual frame have to be computed numerically, we have shown that for many practical examples (see Section 2) they are known explicitly and efficient algorithms are available for their numerical evaluation.

The BFD approach presented in this paper is well studied in the context of statistical estimation using certain multi-scale systems. However, its analysis in the context of regularization theory had not been given so far. In this paper we closed this gap and presented a complete convergence analysis of BFD-thresholding as a regularization method.

Acknowledgement

The work of M.H. has been supported by the Austrian Science Fund (FWF), project P 30747-N32.

References

  • [2] A. L. Buhgeim and V. B. Kardakov. Solution of an inverse problem for an elastic wave equation by the method of spherical means. Sibirsk. Mat. Z., 19(4):749–758, 1978.
  • [3] S. Bürger, J. Flemming, and B. Hofmann. On complex-valued deautoconvolution of compactly supported functions with sparse Fourier representation. Inverse Probl., 32(10):104006, 2016.
  • [4] E. J. Candès and D. Donoho. Recovering edges in ill-posed inverse problems: Optimality of curvelet frames. Ann. Statist., 30(3):784–842, 2002.
  • [5] O. Christensen. An introduction to frames and Riesz bases. Applied and Numerical Harmonic Analysis. Birkhäuser Boston Inc., Boston, MA, 2003.
  • [6] F. Colonna, G. Easley, K. Guo, and D. Labate. Radon transform inversion using the shearlet representation. Appl. Comput. Harmon. Anal., 29(2):232–250, 2010.
  • [7] I. Daubechies, M. Defrise, and C. De Mol. An iterative thresholding algorithm for linear inverse problems with a sparsity constraint. Comm. Pure Appl. Math., 57(11):1413–1457, 2004.
  • [8] V. Dicken and P. Maass. Wavelet-Galerkin methods for ill-posed problems. J. Inverse Ill-Posed Probl., 4:203–221, 1996.
  • [9] D. L. Donoho. Nonlinear solution of linear inverse problems by wavelet-vaguelette decomposition. Appl. Comput. Harmon. Anal., 2(2):101–126, 1995.
  • [10] M. Elad, P. Milanfar, and R. Rubinstein. Analysis versus synthesis in signal priors. Inverse Problems, 23(3):947, 2007.
  • [11] H. W. Engl, M. Hanke, and A. Neubauer. Regularization of Inverse Problems. Kluwer Academic Publishers, Dordrecht, 1996.
  • [12] J. Frikel. Sparse regularization in limited angle tomography. Appl. Comput. Harmon. Anal., 34(1):117–141, 2013.
  • [13] J. Frikel and M. Haltmeier. Efficient regularization with wavelet sparsity constraints in photoacoustic tomography. Inverse Problems, 34(2):024006, 2018.
  • [14] M. Grasmair, M. Haltmeier, and O. Scherzer. Sparse regularization with ℓ^q penalty term. Inverse Problems, 24(5):055020, 13, 2008.
  • [15] M. Grasmair, M. Haltmeier, and O. Scherzer. Necessary and sufficient conditions for linear convergence of ℓ^1-regularization. Comm. Pure Appl. Math., 64(2):161–182, 2011.
  • [16] M. Haltmeier. Stable signal reconstruction via ℓ^1-minimization in redundant, non-tight frames. IEEE Trans. Sig. Processing, 61(2):420–426, 2013.
  • [17] D. A. Lorenz. Convergence rates and source conditions for Tikhonov regularization with sparsity constraints. J. Inverse Ill-Posed Probl., 16(5):463–478, 2008.
  • [18] A.K. Louis, P. Maass, and A. Rieder. Wavelets. Theorie und Anwendungen. Teubner, Stuttgart, 1998.
  • [19] E. K. Narayanan and Rakesh. Spherical means with centers on a hyperplane in even dimensions. Inverse Probl., 26(3):035014, 2010.
  • [20] F. Natterer. The Mathematics of Computerized Tomography, volume 32 of Classics in Applied Mathematics. SIAM, Philadelphia, 2001.
  • [21] R. Ramlau and G. Teschke. A Tikhonov-based projection iteration for nonlinear ill-posed problems with sparsity constraints. Numer. Math., 104(2):177–203, 2006.
  • [22] A. Rieder. A wavelet multilevel method for ill-posed problems stabilized by Tikhonov regularization. Numer. Math., 75:501–522, 1997.
  • [23] O. Scherzer, M. Grasmair, H. Grossauer, M. Haltmeier, and F. Lenzen. Variational methods in imaging. Applied Mathematical Sciences, 167, 2009.