The Ivanov regularized Gauss-Newton method in Banach space with an a posteriori choice of the regularization radius

In this paper we consider the iteratively regularized Gauss-Newton method, where regularization is achieved by Ivanov regularization, i.e., by imposing a priori constraints on the solution. We propose an a posteriori choice of the regularization radius, based on an inexact Newton / discrepancy principle approach, prove convergence and convergence rates under a variational source condition as the noise level tends to zero, and provide an analysis of the discretization error. Our results are valid in general, possibly nonreflexive Banach spaces, including, e.g., L^∞ as a preimage space. The theoretical findings are illustrated by numerical experiments.

1 Introduction

Consider an inverse problem given as a nonlinear ill-posed operator equation

(1)   F(x) = y,

where the possibly nonlinear operator F with domain D(F) ⊆ X maps between real Banach spaces X and Y. The task is to recover x (or actually an approximation of it), given noisy observations y^δ of y. Due to the ill-posedness of (1), i.e., the lack of continuous invertibility of F, the problem needs to be regularized (see, e.g., [1, 6, 9, 23, 25, 27, 28, 31, 33, 35, 36], and the references therein).

Throughout this paper we will assume that an exact solution x† of (1) exists, i.e., F(x†) = y, and that the deterministic noise level δ in the estimate

(2)   ‖y^δ − y‖ ≤ δ

is known.

Ivanov regularization, i.e., the use of a priori bounds for regularizing ill-posed problems, has been known for a long time, partly also under the name method of quasi solutions [5, 16, 17, 18, 32], and has been revisited and further analyzed recently [4, 22, 26, 29].
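In its basic form (stated here only for orientation, in the notation of (1), (2) and with a convex functional R and radius ρ as introduced below), Ivanov regularization, or the method of quasi solutions, determines an approximation as a residual minimizer under an a priori bound on R:

\[
x_\rho^\delta \in \operatorname*{argmin}_{x \in D(F)} \ \|F(x) - y^\delta\| \qquad \text{subject to} \qquad R(x) \le \rho .
\]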

Here, we are particularly interested in the IRGN-Ivanov method, i.e., we define Newton type iterates in an Ivanov regularized way as

(3)   x_{n+1}^δ ∈ argmin { ‖F(x_n^δ) + K(x_n^δ)(x − x_n^δ) − y^δ‖ : x ∈ D(F), R(x) ≤ ρ_n },

where K(x_n^δ) denotes a linearization of F at x_n^δ (see Section 2), with stopping index n_* according to the discrepancy principle

(4)   n_* := min { n ∈ ℕ₀ : ‖F(x_n^δ) − y^δ‖ ≤ τδ }

for τ chosen sufficiently large independently of δ. (This includes the theoretical value n_* = ∞ in case the set over which the minimum is taken is empty, which we will anyway exclude later on, though.) Regularization is defined by a proper and convex functional R. The example

(5)   R(x) = ‖x − x_0‖^q

with some norm ‖·‖ defined on X, some exponent q ≥ 1 and some a priori guess x_0 is frequently used in practice and will also be focused on in some of our results (e.g., Proposition 2 below). In case of ‖·‖ being the L^∞ norm, the inequality in (3) becomes a pointwise bound constraint and efficient methods for such minimization problems can be employed, cf., e.g., [4, 20] in the context of ill-posed problems. Since sums of convex functions are again convex, R may be composed as a sum of different terms. For instance, additional a priori information on x might be incorporated in a strict sense by adding to R the indicator function of some convex set.
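To make the pointwise bound constrained subproblem concrete, the following minimal Python sketch (not the implementation used in this paper's numerical experiments; all names, the Euclidean residual norm, and the use of scipy.optimize.lsq_linear are illustrative assumptions) solves one linearized step of the form (3) for the case R(x) = ‖x − x_0‖_∞ on a finite dimensional discretization, where the Ivanov constraint R(x) ≤ ρ becomes a box constraint.

import numpy as np
from scipy.optimize import lsq_linear

def ivanov_gauss_newton_step(K, F_xn, y_delta, x_n, x0, rho):
    """One Ivanov-regularized Gauss-Newton step for R(x) = ||x - x0||_inf.

    Minimizes ||K (x - x_n) + F_xn - y_delta||_2 over the box
    x0 - rho <= x <= x0 + rho (the Ivanov constraint R(x) <= rho).
    """
    # Rewrite as a bound-constrained linear least squares problem in x:
    # minimize ||K x - b||_2 with b = y_delta - F_xn + K x_n.
    b = y_delta - F_xn + K @ x_n
    res = lsq_linear(K, b, bounds=(x0 - rho, x0 + rho))
    return res.x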

The method (3), (4) has already been considered in [24], with a choice of the regularization radii that refers to the value R(x†) of the regularization functional at the exact solution. Since this information might not be available in some practical applications, we here propose a choice of ρ_n that does not require knowledge of R(x†). Instead, for fixed n and x_n^δ, the regularization radius ρ_n is chosen in an a posteriori fashion according to the inexact Newton type rule

(6)

(see also [10, 30] in the context of the Levenberg-Marquardt method in Hilbert spaces) provided

(7)

In this case we set . We will show that (7) is in fact always satisfied. As compared to [24], we will here prove that the regularization radius chosen according to (6) satisfies ρ_n ≤ R(x†), cf. (11), which implies that the regularization acts in a more restrictive, hence more strongly stabilizing way.
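For orientation, inexact Newton rules of this type (cf. [10, 30]) require the linearized residual attained with the chosen radius to lie within fixed fractions of the current nonlinear residual; the threshold names below are illustrative and need not coincide with the constants appearing in (6) and (10):

\[
\underline{\theta}\,\|F(x_n^\delta)-y^\delta\| \;\le\; \big\|F(x_n^\delta)+K(x_n^\delta)\big(x_{n+1}^\delta(\rho_n)-x_n^\delta\big)-y^\delta\big\| \;\le\; \overline{\theta}\,\|F(x_n^\delta)-y^\delta\|,
\qquad 0<\underline{\theta}\le\overline{\theta}<1 ,
\]

where x_{n+1}^δ(ρ_n) denotes a minimizer of (3) with radius ρ_n.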

We point out that the proof of well-definedness of the regularization radius strongly relies on recent results from [4]. Moreover, we emphasize that due to the non-additive structure of the regularization used here, the analysis available so far (e.g., [1, 23, 31]) does not apply. For the same reason, we did not find a way to extend the Levenberg-Marquardt method (e.g., [10, 19]) to the Ivanov regularized setting.

1: Choose constants according to (9), (4) and (10), respectively.
2: Choose starting point x_0^δ := x_0. Set n = 0.
3: if ‖F(x_0) − y^δ‖ ≤ τδ then
4:     n_* = 0
5: else
6:     while ‖F(x_n^δ) − y^δ‖ > τδ do
7:         compute the minimizer x_{n+1}^δ of (3) according to the rule (6) for ρ_n
8:         n ← n + 1
9:     end while
10:     n_* = n
Algorithm 1 Ivanov-Iteratively Regularized Gauss-Newton
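The following Python sketch mirrors the outer loop of Algorithm 1; it reuses the hypothetical routine ivanov_gauss_newton_step from above, realizes an interval rule of the type (6) by bisection over the radius (exploiting the monotone dependence of the linearized residual on the radius, cf. [4]), and stops by the discrepancy principle (4). The callables F and K, the chosen thresholds, and rho_max are illustrative assumptions, not the paper's implementation.

import numpy as np

def irgn_ivanov(F, K, y_delta, delta, x0, tau=2.0, theta_lo=0.6, theta_up=0.8,
                rho_max=1e3, max_iter=50, bisect_steps=30):
    """Outer IRGN-Ivanov loop: discrepancy stopping plus bisection choice of rho_n."""
    x = x0.copy()
    for _ in range(max_iter):
        residual = np.linalg.norm(F(x) - y_delta)
        if residual <= tau * delta:              # discrepancy principle (4)
            break
        Kn, Fx = K(x), F(x)                      # linearization and residual data at x_n

        def linearized_residual(rho):
            x_trial = ivanov_gauss_newton_step(Kn, Fx, y_delta, x, x0, rho)
            return np.linalg.norm(Kn @ (x_trial - x) + Fx - y_delta), x_trial

        # Bisection on rho: enlarging the radius decreases the linearized residual.
        lo, hi = 0.0, rho_max
        for _ in range(bisect_steps):
            rho = 0.5 * (lo + hi)
            r_lin, x_trial = linearized_residual(rho)
            if r_lin > theta_up * residual:
                lo = rho                         # residual too large: enlarge the radius
            elif r_lin < theta_lo * residual:
                hi = rho                         # residual too small: shrink the radius
            else:
                break
        x = x_trial
    return x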

The remainder of this paper is organized as follows. In Section 2 we provide a convergence analysis in the sense of a regularization method, along with rates under variational source conditions, which is carried over to the discretized setting under certain accuracy requirements in Section 3. Section 4 is devoted to implementation details and numerical experiments. The final Section 5 contains a summary and an outlook.

2 Convergence analysis

For analyzing convergence of this method, we impose the following conditions. Assume that a solution x† of (1) exists such that R(x†) < ∞.
Moreover, let topologies T_X on X and T_Y on Y exist such that

  1. R is proper, convex, and T_X-lower semicontinuous;

  2. for all ρ > 0, the sublevel set B_ρ defined by

    (8)   B_ρ := { x ∈ D(F) : R(x) ≤ ρ }

    is compact with respect to T_X;

  3. bounded sets in Y are T_Y-compact and the norm in Y is T_Y-lower semicontinuous;

  4. for all x ∈ D(F), the linear operator K(x) is T_X-to-T_Y closed, i.e., for any sequence (x_k)_{k∈ℕ} in X, convergence x_k → x̄ with respect to T_X together with T_Y-convergence of (K(x)x_k)_{k∈ℕ} to some ȳ implies K(x)x̄ = ȳ.

In case of the regularization functional R being defined by a norm as in (5), compactness of sublevel sets typically holds in the sense of weak or weak* sequential compactness (if X is reflexive or the dual of a separable space, respectively), provided D(F) is closed with respect to this topology as well. For example, X may be a Lebesgue or Sobolev space with summability index in (1, ∞], the space of regular Radon measures, or the space of functions with bounded total variation on some domain. Thus, the topologies T_X, T_Y will typically be weak or weak* topologies, or possibly also strong topologies arising from compact embeddings. Note that in general, we do not make explicit use of any norm on X here, but only use the topology T_X from item 2 above and the structure induced by the bounded sublevel sets B_ρ.

As a constraint on the nonlinearity of the forward operator F, we impose the tangential cone condition

(9)   ‖F(x) − F(x̃) − K(x̃)(x − x̃)‖ ≤ c_tc ‖F(x) − F(x̃)‖

for all x, x̃ ∈ B_ρ and some constant c_tc. Here, B_ρ is defined as in (8) and K as in Assumption 2. Also, K(x) is some linearization of F (not necessarily a Gâteaux or Fréchet derivative of F); the only requirements on K are (9) and boundedness of K(x) for all x ∈ B_ρ. In order to verify (9) in case R in the definition of B_ρ is given by a norm (5), one typically restricts the radius ρ (which basically determines the size of the constant c_tc) to be small, which will usually be possible also here, assuming closeness of an exact solution to the a priori guess x_0. In case ρ needs to be large, since R(x†) is large, closeness of the elements of B_ρ to each other can be enforced by considering R as the sum of some regularization functional and the indicator function of a sufficiently small neighborhood (possibly with respect to a different topology than the one used for regularization, and independent of the regularization radius) around x_0.

Moreover, we assume that the thresholds for the choice of ρ_n are chosen such that

(10)

Let Assumption 2 as well as condition (9) on F be satisfied. Moreover, consider a family of data (y^δ)_{δ>0} satisfying (2) and let, for each δ > 0, the stopping index n_* be defined by (4) with τ sufficiently large, which enables a choice of the thresholds in (6) such that (10) holds.

  1. For any n, ρ > 0, and y^δ, the iterate x_{n+1}^δ is well defined by (3);

  2. For any n, under condition (7), and if x_n^δ ∈ B_{R(x†)},

    1. the choice of ρ_n according to (6) is well defined,

    2. for any solution x̄ of (1), the estimate

      (11)   ρ_n ≤ R(x̄)

      holds,

    3. the estimate

      (12)

      holds with some constant independent of n and δ (cf. (18)),

    4. (7) is satisfied with n replaced by n+1.

    In particular, for fixed δ and y^δ, by induction and the fact that (7) holds for n = 0, condition (7) remains valid throughout the iteration, and the iterates according to Algorithm 1 are well-defined and remain in B_{R(x†)}.

  3. The stopping index n_* according to (4) is finite.

  4. We have T_X-subsequential convergence as δ → 0, i.e., (x_{n_*}^δ)_δ has a T_X-convergent subsequence and the limit of any T_X-convergent subsequence solves (1). If the solution of (1) is unique in B_{R(x†)}, then x_{n_*}^δ → x† with respect to T_X as δ → 0.

Proof.

The existence (i) of a minimizer of (3) for fixed n, ρ, and y^δ follows by the direct method of the calculus of variations (note that the setting differs from the one in [24] in that we do not assume admissibility of x† here): The cost functional

(13)

is bounded from below and the admissible set is nonempty. Hence, there is a minimizing sequence in the admissible set. By T_X-compactness of the sublevel set B_ρ, this sequence has a T_X-convergent subsequence with limit x*. Boundedness of the corresponding image sequence under K(x_n^δ) yields T_Y-convergence of another (not relabelled) subsequence to some limit, which by the assumed T_X-to-T_Y closedness of K(x_n^δ) coincides with K(x_n^δ)x*. T_Y-lower semicontinuity of the norm in Y yields

which implies that x* is a minimizer of (3).

For proving (ii), fix n as in (ii) and assume (7) to hold. Defining the distance function d_n(ρ) as the minimal value of (3) with regularization radius ρ, and using this notation, we have

On the other hand, by (9), we have, for any solution x̄ of (1),

(14)

Thus, using continuity of the distance mapping ρ ↦ d_n(ρ), see [4], together with the Intermediate Value Theorem, we have existence of

(15)

such that (6) holds. To show that (15) holds for any ρ_n satisfying (6), observe that the lower bound in (6) means that

(16)

hence, by the monotone decrease of d_n (cf. [4]), a combination of (14) and (16) implies

Thus, (11) holds and therefore the iterates remain in B_{R(x†)}.

The residuals can be estimated as follows

(17)

which implies

where

(18)

Thus we have (12) with the constant as in (18). Using (9) and (18), we therefore get

(19)

To see (iii), observe that from (12) it follows that ‖F(x_n^δ) − y^δ‖ ≤ τδ as soon as

hence the stopping index n_* defined by (4) is indeed finite.

Item (iv) now follows by standard arguments from our assumption on T_X-compactness of the sublevel sets. In fact, let (δ_k)_{k∈ℕ} be an arbitrary sequence converging to zero. By (4) and the fact that the iterates remain in B_{R(x†)}, Assumption 1 yields existence of a T_X-convergent subsequence with limit x̄. Using (2) and (4), we have existence of a (not relabelled) subsequence whose images converge to a limit, which by Assumption 2 (c) coincides with F(x̄), hence

Thus, x̄ solves (1).

In case of uniqueness, convergence of the whole sequence follows by a subsequence-subsequence argument. ∎

While Theorem 2 only gives weak (i.e., T_X-subsequential) convergence for general regularization functionals, more can be said in the special but practically relevant case that R is defined by a norm.

Under the assumptions of Theorem 2, with R defined by a norm as in (5), we have, for the iterates defined by Algorithm 1, that

Hence, if R is defined via the norm of a Kadets-Klee space X, and T_X is the corresponding weak topology, we even have (subsequential) norm convergence in place of T_X-convergence of the iterates to an R-minimizing solution. A Kadets-Klee (also called Radon-Riesz) space is a normed space in which, for any sequence (x_k)_{k∈ℕ}, convergence of the norms ‖x_k‖ → ‖x‖ together with weak convergence x_k ⇀ x implies strong convergence ‖x_k − x‖ → 0.
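In symbols, the Radon-Riesz property used here reads

\[
x_k \rightharpoonup x \quad\text{and}\quad \|x_k\| \to \|x\| \qquad\Longrightarrow\qquad \|x_k - x\| \to 0 ;
\]

standard examples of spaces with this property are Hilbert spaces and, more generally, uniformly convex spaces such as L^p with 1 < p < ∞.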

Proof.

From [4, Corollary 2.6] we conclude that for n < n_*, the minimizer x_{n+1}^δ of (3) lies on the boundary of the feasible set, provided the corresponding condition of that corollary holds. The latter can easily be verified by noting that otherwise one would obtain a residual small enough to contradict (6) and (4) due to (7). Thus, we have

(21)

If there exists a subsequence (δ_k)_{k∈ℕ} such that for all k the iteration is already stopped at n_* = 0, we have

hence x_0 solves (1) and

This implies trivial convergence of the subsequence to x_0 in this case.

Otherwise, for any subsequence (δ_k)_{k∈ℕ} and any k, we have n_*(δ_k) ≥ 1, hence in the next to last step (7) holds and by (21),

for

In this case, for any x̄ solving (1), by (11) we also have

(22)

and by Assumption 2, there exists a solution x† such that the right hand side is finite. Thus, again by Assumption 2, the sequence of iterates has a convergent subsequence with limit x̃ which, by the closedness assumption and (4), solves (1). On the other hand, from lower semicontinuity of R we conclude

(23)

This together with (22) implies

i.e., since x̄ was an arbitrary solution of (1),

Using again (23) and (22) with x̄ = x̃, we end up with

A subsequence-subsequence argument yields the assertion. ∎

To obtain convergence rates in the Bregman distance with respect to R,

D_ξ(x, x†) := R(x) − R(x†) − ⟨ξ, x − x†⟩,

for some ξ in the subdifferential ∂R(x†), we make use of a variational source condition (cf., e.g., [2, 7, 8, 13, 14, 15]) at some solution x† of (1)

(24)

for some index function φ (i.e., φ monotonically increasing with φ(0) = 0); for this purpose, the subdifferential ∂R(x†) is assumed to be nonempty.
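For orientation (the precise formulation of (24) may differ), variational source conditions are commonly stated in a form such as

\[
\beta\, D_{\xi}(x, x^{\dagger}) \;\le\; R(x) - R(x^{\dagger}) + \varphi\big(\|F(x) - F(x^{\dagger})\|\big) \qquad \text{for all } x \in B_\rho ,
\]

with some β ∈ (0, 1] and the index function φ as above; under conditions of this type, rates of the form D_ξ(x_{n_*}^δ, x†) = O(φ(δ)) as δ → 0 are the typical outcome for discrepancy-type stopping rules.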

Under the assumptions of Theorem 2 and the variational source condition (24), the iterate x_{n_*}^δ satisfies the convergence rate

(25)
Proof.

This is a consequence of [22, Proposition 2.9] and (11) as well as (4). For completeness of exposition, we provide the short proof here.

3 Convergence of discretized approximations

We now consider a discretized version for the actual numerical solution of (3), arising from restriction of the minimization to finite dimensional subspaces containing the starting point x_0 and leading to discretized iterates and an approximate version of the forward operator:

(26)

with stopping index according to the discretized discrepancy principle

(27)

for an appropriately chosen constant, cf. (4). Here the sub- and superscripts indicate that the discretized operators are approximations of F and of its linearization K, respectively, obtained, e.g., by finite element discretizations on computational grids that may differ from step to step. In particular, they may be coarse at the beginning of the Newton iteration and emerge by successive mesh refinement during the iteration. Note that the discretized linearization is not necessarily the derivative of the discretized forward operator.
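A typical way of writing such a discretized step (the notation X_h^n for the finite dimensional subspace and F_h^n, K_h^n for the discretized operators is ours, used here for illustration only) is

\[
x_{n+1}^{\delta,h} \in \operatorname*{argmin}_{x \in X_h^n}\ \big\|F_h^n(x_n^{\delta,h}) + K_h^n(x_n^{\delta,h})\,(x - x_n^{\delta,h}) - y^\delta\big\|
\quad\text{subject to}\quad R(x)\le \rho_n ,
\]

with the iteration stopped as soon as the computable discrete residual falls below a multiple of the noise level.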

The regularization radius ρ_n is chosen according to the following discretized version of (6) (relying on actually computed quantities):

(28)

for provided

(29)

in this case we set . Again we will show that (29) is always satisfied.

The tangential cone condition can usually not be expected to be transferable from the continuous to the discretized setting, as already the simple setting of a composition of F with a projection operator shows, since the right hand side will usually be too weak to estimate the (projected) first order Taylor remainder. This can also be seen from the fact that the adjoint range invariance condition (with some bounded linear operator close to the identity) that is often used to verify the tangential cone condition does not imply its projected version. Thus, in order to be able to employ the continuous version (9), we also define the auxiliary continuous iterates (for an illustration, see [21, Figure 1]):

(30)

and the corresponding regularization radius is given by

(31)

provided (29) above holds.

As we will show now, this still allows us to prove closeness to some projection (e.g., a metric one) of an exact solution onto the finite dimensional space used in the n-th step, provided certain accuracy requirements are met. Note that the discretization level may depend on n and will typically get finer with increasing n in order to enable convergence in the sense of condition (45) below.

Let the assumptions of Theorem 2 be satisfied and assume that the discretization error estimates

(32)
(33)
(34)
(35)

for any solution of (1) hold with

(36)