Convergence results and low order rates for nonlinear Tikhonov regularization with oversmoothing penalty term

by Bernd Hofmann, et al.

For the Tikhonov regularization of ill-posed nonlinear operator equations in a Hilbert scale setting, the convergence of regularized solutions is studied. We include the case of oversmoothing penalty terms, which means that the exact solution does not belong to the domain of definition of the considered penalty functional. In this case, we try to close a gap in the present theory, where Hölder-type convergence rates results have been proven under corresponding source conditions, but assertions on norm convergence of regularized solutions without source conditions are completely missing. A result of the present work is to provide sufficient conditions for convergence under a priori and a posteriori regularization parameter choice strategies, without any additional smoothness assumption on the solution. The obtained error estimates moreover allow us to prove low order convergence rates under associated (for example logarithmic) source conditions.




1 Introduction

This paper is concerned with nonlinear operator equations of the form


where is a nonlinear operator between infinite-dimensional Hilbert spaces and with norms . We suppose that the right-hand side is approximately given as satisfying the deterministic noise model


with the noise level . Throughout the paper, it is assumed that the considered equation (1.1) has a solution and is (at least locally at ) ill-posed (cf. [12]).

For finding stable approximations to the solution of equation (1.1), we consider the Tikhonov regularization, where the regularized solutions are minimizers of the extremal problem


with a regularization parameter . In this context, is assumed to be a norm of a densely defined subspace of , which is stronger than the original norm in . Throughout this paper we suppose that the initial guess occurring in the penalty term of satisfies the condition


Precisely, we define the stronger norm by a generator , which is a self-adjoint and positive definite unbounded linear operator with dense domain , i.e. we have for some constant


This allows us to introduce norms


where for , and for . The fractional powers are defined by means of the resolution of the identity generated by the inverse operator , see, e.g., [6, Section 2.3]. Note that the system of spaces , equipped with the respective norms, is strongly related to the Hilbert scale generated by the operator . However, for , topological completion of the spaces with respect to the norm is not needed in our setting and thus is omitted.
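The family of norms can be made concrete in a sequence-space model. The following is a minimal sketch, assuming (this is not taken from the text) that the generator acts diagonally on coefficient sequences with eigenvalues k = 1, 2, ..., truncated to n coefficients. The names `scale_norm`, `example_element`, `tau` and `decay` are illustrative only.

```python
import math

def scale_norm(u, tau):
    """Norm ||B^tau u|| for a hypothetical diagonal generator B e_k = k e_k.

    u is a list of coefficients u_1, ..., u_n (a truncation of a sequence).
    """
    return math.sqrt(sum((k ** tau * uk) ** 2 for k, uk in enumerate(u, start=1)))

def example_element(n, decay=0.6):
    # Coefficients u_k = k^(-decay): square-summable for decay > 0.5,
    # while k * u_k is not square-summable for decay <= 1.5.
    return [k ** (-decay) for k in range(1, n + 1)]

if __name__ == "__main__":
    for n in (10**2, 10**3, 10**4):
        u = example_element(n)
        # The tau = 0 norm stays bounded in n, the tau = 1 norm keeps growing.
        print(n, scale_norm(u, 0.0), scale_norm(u, 1.0), scale_norm(u, -1.0))
```

In this toy model the element with decay 0.6 has a bounded norm for index 0 as the truncation grows, whereas its norm for index 1 grows without bound. This mimics an element of the basic space that does not belong to the domain of the generator.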

In the present work, we discuss the nonlinear Tikhonov regularization (1.3) in particular with an oversmoothing penalty term, where we have , or in other words This continues studies started in papers [9, 10] and [7], where convergence rates and numerical case studies are provided for a priori and a posteriori parameter choices, respectively, under certain smoothness assumptions on and structural conditions on . Under the same structural conditions, which are also similar to those in the corresponding seminal paper for linear operator equations by Natterer [18], we present as the novelty of this paper convergence results based on the Banach–Steinhaus theorem without needing any smoothness assumptions. The error estimates derived in the context of convergence assertions moreover allow us to prove low order convergence rates under associated (for example logarithmic) source conditions.

The outline of the remainder is as follows: in Section 2, we introduce Hilbert scales and formulate the basic assumptions, and in addition we establish well-posedness of Tikhonov regularization used in our setting. Then in Section 3, we introduce auxiliary elements needed for the proof of the convergence results, and in addition we provide first error estimates for Tikhonov regularization which are based on those auxiliary elements and which are needed for the subsequent convergence proofs. The regularizing properties of an a priori parameter choice as well as a discrepancy principle are considered in Section 4. The suggested discrepancy principle is considered in a form that is suitable for misfit functionals which may depend discontinuously on the regularization parameter . Finally, as a byproduct of derived error estimates, we can prove low order convergence rates in Section 5.

2 Prerequisites and assumptions

2.1 Main assumptions

In the following assumption we briefly summarize the structural properties of the operator , of its domain , in particular with respect to the solution of equation (1.1). For examples of nonlinear inverse problems that satisfy these assumptions (or at least substantial parts of them), we refer to [5, 7] and to the appendices of the papers [10, 25].

Assumption 2.1.
  (a) The operator is sequentially continuous on with respect to the weak topologies of the Hilbert spaces and .

  (b) The domain of definition is a closed and convex subset of .

  (c) Let be a non-empty set.

  (d) Let the solution to equation (1.1) with right-hand side be an interior point of the domain .

  (e) Let the data satisfy the noise model (1.2) and let the initial guess satisfy (1.4).

  (f) Let and let there exist finite constants such that the inequality chain


    holds true for all .

Remark 2.2.

From item (f) (left-hand inequality) of Assumption 2.1 we have for that is the uniquely determined solution to equation (1.1) in the set . For there is no solution at all to (1.1) in . But in both cases alternative solutions with and cannot be excluded.

2.2 Properties of regularized solutions of the Tikhonov regularization

For , let the minimizers of the Tikhonov functional be denoted by , i.e. we have and evidently by definition of the penalty term .

Example 2.3.

Let in this example with be a bounded linear operator with non-closed range , and for simplicity let . In this setting, Tikhonov regularized solutions solve the linear operator equation


In the special situation of an injective operator and of a scale generator with , this gives

and Assumption 2.1 is satisfied with then. The oversmoothing case here means that . This situation is discussed in the analysis of fractional Tikhonov regularization, and we refer for example to [3, 8, 15]. In Natterer’s paper [18] the analog

to the inequality chain (2.1) is the basis for error estimates and convergence rates results for linear operator equations. The constant characterizes here the degree of ill-posedness of the problem.
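For the linear setting of this example, a minimal sketch in Python may help. It assumes a diagonal model that is not part of the text: the generator B has eigenvalues k, the forward operator is A = B^(-a) with singular values k^(-a), and the initial guess is zero. It further assumes that the normal equation has the usual form (A*A + alpha B^2) u = A* f_delta for the penalty alpha ||B u||^2. All function names are hypothetical.

```python
def tikhonov_diagonal(f_delta, alpha, a=1.0):
    """Minimize sum_k (k^-a u_k - f_k)^2 + alpha * (k u_k)^2 componentwise.

    Hypothetical model: B e_k = k e_k and A = B^(-a), zero initial guess.
    """
    u = []
    for k, fk in enumerate(f_delta, start=1):
        ak = k ** (-a)   # singular value of A
        bk = float(k)    # eigenvalue of B
        # Componentwise solution of (A*A + alpha B^2) u = A* f_delta.
        u.append(ak * fk / (ak ** 2 + alpha * bk ** 2))
    return u

def functional(u, f_delta, alpha, a=1.0):
    """Value of the Tikhonov functional in the same diagonal model."""
    misfit = sum((k ** (-a) * uk - fk) ** 2
                 for k, (uk, fk) in enumerate(zip(u, f_delta), start=1))
    penalty = sum((k * uk) ** 2 for k, uk in enumerate(u, start=1))
    return misfit + alpha * penalty
```

Since the functional is strictly convex in this model, any perturbation of the returned coefficients increases its value, which gives a simple sanity check.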

The extremal problem (1.3) for finding regularized solutions is well-posed with respect to existence of minimizers and their stability in a sense specified in the following proposition. This follows by standard results from regularization theory (cf., e.g., [22, Chapter 2.6], [23, 24] and [21, Section 4.1.1]). So we give a sketch of proof only.

Proposition 2.4.

Let Assumption 2.1 be satisfied.

  • There exists for all a minimizer of the Tikhonov functional in the set .

  • Each minimizing sequence of over has a subsequence that converges strongly in to a minimizer of the Tikhonov functional.

  • For every the regularized solutions are stable with respect to small perturbations in the data .


The basic ingredients needed for the proof are as follows:

  • The operator , when considered as , is sequentially continuous with respect to the weak topologies on and . This implies that the misfit functional is sequentially continuous with respect to the weak topology on .

  • The set is weakly closed in .

  • The stabilizing functional is sequentially weakly lower semicontinuous on .

The statement in the second item follows from the two facts that (i) a linear self-adjoint operator is weakly closed, and that (ii) each closed convex subset of a Hilbert space is weakly closed.

From these ingredients, it follows that each minimizing sequence of the Tikhonov functional has a subsequence which converges weakly in to a minimizer , and the corresponding subsequence of converges to .        

Remark 2.5.

We note that the minimizer of the Tikhonov functional may be non-unique. For an example, see [22, Example 1 in Chapter 2.6]. On the other hand, it should be mentioned that the properties of Tikhonov regularization in Hilbert spaces are well investigated when the penalty functional in the Tikhonov functional is replaced by , cf., e.g., [6, Chapter 10] or [20, Section 3.1] and the references therein, respectively.

One of the two main goals of this study is to discuss convergence results for the Tikhonov regularization with oversmoothing penalty, i.e.  (note, however, that this is not explicitly required anywhere), and the regularization error is still measured in the norm of . This continues former studies like [7] under the assumption for some . In contrast to those papers, the focus of the present work is, although also not explicitly required anywhere, on the case for each , consequently on the situation characterized by . On the other hand, we also mention convergence assertions for with under the inequality chain (2.1).

3 Auxiliary elements and preparatory results

3.1 Auxiliary elements

In this section we consider auxiliary elements which are needed to verify our convergence results. As a preparation, we introduce the bounded, injective, selfadjoint, positive semidefinite linear operator


where the operator obeying the condition (1.5) is defined in Section 1, and is introduced by item (f) of Assumption 2.1. Note that the range of is not closed and hence zero is an accumulation point of the spectrum of . In this context, we also mention that is equivalent to , which means that obeys a power-type source condition with some source element . In the case , i.e. if , but for all , it was shown in [16] and [11] that there exist an index function (for example of logarithmic type, cf. [13]) and a source element such that a (low order) source condition is satisfied. Following [17], we call a function an index function if it is continuous, non-decreasing and satisfies the limit condition .

The auxiliary elements based on the operator from (3.1) are defined as follows:


where the solution of the operator equation (1.1) and the corresponding initial guess are as introduced above. The basic properties of the auxiliary elements are summarized in Lemma 3.1.
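The role of such auxiliary elements can be illustrated numerically. The sketch below assumes, as a guess at the classical construction rather than a quotation of (3.2), that the auxiliary element has the form u_bar + G (G + alpha I)^(-1) (u_dagger - u_bar) with a bounded positive operator G. The diagonal spectrum k^(-4), the slowly decaying solution and all names are hypothetical choices.

```python
import math

def auxiliary_element(u_dagger, u_bar, alpha, g):
    """u_bar + G (G + alpha I)^(-1) (u_dagger - u_bar) for diagonal G = diag(g)."""
    return [ub + gk / (gk + alpha) * (ud - ub)
            for ud, ub, gk in zip(u_dagger, u_bar, g)]

def l2_norm(v):
    return math.sqrt(sum(x * x for x in v))

def approximation_error(alpha, n=2000):
    g = [k ** (-4.0) for k in range(1, n + 1)]         # hypothetical spectrum of G
    u_dagger = [k ** (-0.6) for k in range(1, n + 1)]  # rough solution, slow decay
    u_bar = [0.0] * n                                   # zero initial guess
    u_hat = auxiliary_element(u_dagger, u_bar, alpha, g)
    return l2_norm([x - y for x, y in zip(u_hat, u_dagger)])

if __name__ == "__main__":
    for alpha in (1e-2, 1e-4, 1e-6, 1e-8):
        print(alpha, approximation_error(alpha))
```

In this model the distance between the auxiliary element and the solution shrinks as alpha decreases, although no source condition is imposed on the solution coefficients. The speed of the decay is not controlled, which is consistent with convergence without rates.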

In order to specify the limit behaviour of different positive functions occurring in error estimates, we use in the sequel a collection of non-negative functions named and defined for with the property


to be supposed for all indices . Consequently, we have for all that as . Note that pairwise products and linear combinations with non-negative constants can again be written as such a function as

Lemma 3.1.

There are functions for satisfying (3.3) such that the auxiliary elements from (3.2) have the following properties:

  (a) as ,

  (b) as ,

  (c) as .


We show first that


holds for all and . It is well known that

Then the interpolation inequality implies the estimate


Note that the operator is selfadjoint and positive semidefinite, and thus the fractional powers are well-defined.

In addition, for fixed and any with chosen so small that , we have, from (3.5) with replaced by ,


where . Since for arbitrary , the range of the operator is dense in , i.e., , due to (3.5) and (3.6) the Banach–Steinhaus theorem (cf., e.g., [14, Problem 10.1] or [19, Theorem 1.1.4]) may be applied to the operators for , with being fixed. This finally gives the asymptotics (3.4) as .

For the functions


the statements of the lemma are now easily obtained from (3.4) and the following three representations,

3.2 Some estimates for oversmoothing Tikhonov regularization

Lemma 3.2.

Let Assumption 2.1 be satisfied. Then there is a function for satisfying (3.3) such that we have for all


For small enough we have , because item (a) of Lemma 3.1 holds and is an interior point of . Thus

The first term on the right-hand side of the latter estimate can be written as

This is a consequence of item (b) of Lemma 3.1. The second term on the right-hand side of the latter estimate attains the form

based on item (c) of Lemma 3.1. This yields the function

which completes the proof of the lemma.        

Corollary 3.3.

Let Assumption 2.1 be satisfied. Then there is a function for satisfying (3.3) and a constant such that we have for all


It follows from the left-hand estimate in (2.1) and Lemma 3.2 that

and hence the required assertion of the corollary with and        

The error is now bounded by the following series of estimates. Using the triangle inequality and Lemma 3.1, we obtain


and below we consider the term in more detail. From the interpolation inequality for bounded linear, self-adjoint and positive semidefinite operators on Hilbert spaces, cf. [6, (2.49)], it follows


Both terms on the right-hand side of the estimate (3.11) can be estimated by using Corollary 3.3 and Lemma 3.1 in the following manner: Precisely, we find with and the estimates

Thus we can continue estimating (3.11). Introducing and , we obtain

From the latter estimate and (3.10), the following proposition now immediately follows by considering there.

Proposition 3.4.

Let Assumption 2.1 be satisfied. Then there is a function for satisfying (3.3) and a constant such that we have for all


The inequality (3.12), which is valid for arbitrary noise levels and regularization parameters , allows us to formulate in the subsequent section sufficient conditions for the convergence of the error norm of the regularized solutions.

4 Convergence results

4.1 Main theorem

The following main theorem is an immediate consequence of the error estimate (3.12) derived in the preceding section.

Theorem 4.1.

Let Assumption 2.1 be satisfied. Then for any a priori parameter choice and any a posteriori parameter choice , the regularized solutions converge in the norm of the Hilbert space to the solution of the operator equation (1.1) for , i.e. , whenever

Remark 4.2.



If only the borderline condition (4.2) holds instead of (4.1), then convergence cannot be derived in this way, because in that case the second term on the right-hand side of inequality (3.12) does not tend to zero.

Remark 4.3.

The convergence result of Theorem 4.1 applies to both situations: (a) the classical case and (b) the case of oversmoothing penalties . Since this theorem is based directly on formula (3.12), an inspection of the proof of this formula from Lemma 3.2 to Proposition 3.4 shows that both inequalities in (2.1) (left-hand inequality and right-hand inequality) as structural conditions on the nonlinearity of in a neighbourhood of are needed for the convergence result of Theorem 4.1. Nevertheless, for the oversmoothing case (b), this is real progress, since the convergence result of Theorem 4.1 does not use any form of additional smoothness on , and convergence assertions without smoothness assumptions like for some (cf. [10]) have so far been missing from the literature for the oversmoothing case.

As is well known, in case (a) with the parameter choice condition

which is stronger than (4.1), is always sufficient for convergence of regularized solutions, and inequalities occurring in (2.1) represent only tools for obtaining convergence rates. On the other hand, in the limit situation of choosing the regularization parameter, one needs the left-hand inequality in (2.1) for obtaining convergence rates, and this inequality occurs here as a conditional stability estimate (cf. [7, Prop. 3], [5, Theorem 1.1] and references therein). Convergence is then a consequence of derived convergence rates.

4.2 A priori parameter choice of monomial type

In this subsection we consider in light of Theorem 4.1 the a priori parameter choice


for exponents . Then condition (4.1) is satisfied if and only if , and the borderline condition (4.2) holds if and only if . This gives the following proposition.

Proposition 4.4.

For the a priori choice (4.3) of the regularization parameter , condition (4.1) in Theorem 4.1 holds if and only if . For all , the choice yields convergence.

We can distinguish the -intervals  (A): ,  (B): , and (C): for (4.3). Then we have as in situation (A) and in situation (B). Note that both situations also occur and yield convergent regularized solutions in the oversmoothing case . This is a bit surprising, because the behaviour


occurring in situation (C) was supposed in the literature to be typical for the case of oversmoothing penalties. Namely as is seen in [9], convergence rate results of the form

are obtained under the two-sided structural condition (2.1) and in particular under the smoothness assumption for whenever the a priori parameter choice of type (4.3) with prescribed exponent applies. Evidently, this prescribed satisfies the conditions (4.4) and for all . It is important to note that coincides with the borderline case which, however, is not sufficient for convergent regularized solutions.
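The three situations can be checked with a few lines of arithmetic. The sketch below assumes that the monomial choice (4.3) has the form alpha = delta^kappa and that the quantity compared across situations (A) to (C) is the ratio delta^2 / alpha. Both are assumptions about the notation, and the exponent name `kappa` is a label chosen here. Exponents below, equal to, and above 2 serve as representatives.

```python
def ratio_delta2_over_alpha(delta, kappa):
    """Return delta^2 / alpha for the monomial choice alpha = delta^kappa."""
    alpha = delta ** kappa
    return delta ** 2 / alpha

if __name__ == "__main__":
    for kappa in (1.0, 2.0, 3.0):
        values = [ratio_delta2_over_alpha(10.0 ** (-j), kappa) for j in range(1, 5)]
        # kappa < 2: ratio tends to zero; kappa = 2: ratio stays at one;
        # kappa > 2: ratio blows up as delta tends to zero.
        print(kappa, values)
```

The ratio equals delta^(2 - kappa), so its limit behaviour as delta tends to zero is decided solely by the sign of 2 - kappa.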

4.3 A discrepancy principle

For the specification of an appropriate discrepancy principle, the behaviour of the misfit functional needs to be described, for fixed. The basic properties are summarized in the following proposition.

Proposition 4.5.

Let Assumption 2.1 be satisfied. Then for fixed, the function is non-decreasing, with


We have .


We start with the verification of the first statement of the proposition. As a preparation, we show that the function is non-increasing. Indeed, for fixed we have

and thus . The first statement of the proposition is now easily obtained: for we have

and thus .

Next we consider the latter statement of the proposition. There holds


and thus in particular as . The estimate (1.5) implies as , which implies the latter statement of the proposition.

The first statement in (4.5) follows directly from Lemma 3.2, and we finally consider the second statement in (4.5). From (4.6) we already know that . Conversely, sequential weak continuity of the operator implies weak convergence as , and thus . This completes the proof of the proposition.        
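The monotonicity statements of the proposition can be observed in a simple linear model. The following is a minimal sketch, assuming a diagonal setting that is not part of the text: forward operator A with singular values k^(-a), generator B with eigenvalues k, and zero initial guess. The functions `minimizer`, `misfit` and `penalty` are hypothetical helpers.

```python
import math

def minimizer(f_delta, alpha, a=1.0):
    # Componentwise Tikhonov minimizer for A = diag(k^-a), B = diag(k), u_bar = 0.
    return [k ** (-a) * fk / (k ** (-2 * a) + alpha * k ** 2)
            for k, fk in enumerate(f_delta, start=1)]

def misfit(f_delta, alpha, a=1.0):
    """Residual norm ||A u_alpha - f_delta|| of the regularized solution."""
    u = minimizer(f_delta, alpha, a)
    return math.sqrt(sum((k ** (-a) * uk - fk) ** 2
                         for k, (uk, fk) in enumerate(zip(u, f_delta), start=1)))

def penalty(f_delta, alpha, a=1.0):
    """Penalty norm ||B u_alpha|| of the regularized solution."""
    u = minimizer(f_delta, alpha, a)
    return math.sqrt(sum((k * uk) ** 2 for k, uk in enumerate(u, start=1)))

if __name__ == "__main__":
    data = [k ** -2.0 for k in range(1, 51)]
    for alpha in (1e-6, 1e-4, 1e-2, 1.0, 100.0):
        print(alpha, misfit(data, alpha), penalty(data, alpha))
```

In this model the residual norm grows with alpha and stays below the norm of the data, while the penalty norm shrinks, in line with the qualitative behaviour described above.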

Remark 4.6.

Notice that the proof of Proposition 4.5 in fact makes no use of the fundamental estimates (2.1) on the smoothing property of . Notice also that the statement of Proposition 4.5 is quite similar to related results for Tikhonov regularization with non-oversmoothing penalty, cf. [1], [22, Section 2.6] and [24, Section 6.7].

It follows from Proposition 4.5 that the following version of the discrepancy principle (cf. [23, 24]) is implementable. It determines, for each noise level , an approximation . Possible discontinuities of the misfit functional are taken into account.

Algorithm 4.7 (Discrepancy principle).
  • If holds, then choose , i.e., .

  • Otherwise choose a finite parameter such that


    where and are finite constants, and let .

Algorithm 4.7 can be realized by the following strategy.

Remark 4.8 (Sequential discrepancy principle).

Practically, a parameter satisfying condition (4.7) can be determined, e.g., by choosing a constant and an initial guess and proceeding then as follows:

  • If holds then, with the notation , proceed for until is satisfied for the first time; define then.

  • If holds then, with the notation , proceed for until is satisfied for the first time; define then.
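The two-branch search of the remark can be sketched as follows. This is one plausible reading, not the authors' exact formulation: it assumes that the target is a parameter alpha with misfit(alpha) <= b * delta < misfit(theta * alpha), that the search moves along a geometric grid with ratio theta > 1, and that the infinite parameter is chosen when the initial guess already meets the discrepancy bound. The diagonal linear model and the names `b`, `theta`, `alpha0` and `max_steps` are hypothetical.

```python
import math

def misfit(f_delta, alpha, a=1.0):
    """Residual norm ||A u_alpha - f_delta|| in the diagonal model
    A = diag(k^-a), B = diag(k), zero initial guess (all hypothetical)."""
    return math.sqrt(sum((alpha * k ** 2 * fk / (k ** (-2 * a) + alpha * k ** 2)) ** 2
                         for k, fk in enumerate(f_delta, start=1)))

def sequential_discrepancy(f_delta, delta, b=1.5, theta=2.0, alpha0=1.0, max_steps=200):
    """Return alpha with misfit(alpha) <= b*delta < misfit(theta*alpha), or
    math.inf if the zero initial guess already satisfies ||f_delta|| <= b*delta."""
    if math.sqrt(sum(fk * fk for fk in f_delta)) <= b * delta:
        return math.inf
    alpha = alpha0
    if misfit(f_delta, alpha) > b * delta:
        # Decrease alpha geometrically until the misfit drops to b*delta.
        for _ in range(max_steps):
            alpha /= theta
            if misfit(f_delta, alpha) <= b * delta:
                return alpha
    else:
        # Increase alpha geometrically while the next grid point is still
        # admissible; return the last admissible value.
        for _ in range(max_steps):
            if misfit(f_delta, theta * alpha) > b * delta:
                return alpha
            alpha *= theta
    raise RuntimeError("no admissible parameter found within max_steps")
```

Both branches stop at a grid point whose misfit is at most b * delta while the next larger grid point violates that bound, so the result does not rely on continuity of the misfit in alpha. The step limit is only a safeguard for this sketch.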

The regularizing properties of Algorithm 4.7 are stated in the following theorem.

Theorem 4.9.

Let Assumption 2.1 be satisfied. For the a posteriori parameter choice introduced in Algorithm 4.7, we have


For an arbitrary countable noise level set having the origin as only accumulation point, we consider the following three cases: (a) for each , (b) as , and (c) for each , . Below we show that in each of those three cases, (4.8) holds, if the noise level in addition satisfies , respectively. The main statement of the theorem then follows by arguing for subsequences. Note that in cases (a) and (c), the second statement in (4.8) trivially holds.

  (a) The case for means for , and thus , and then for .

  (b) Suppose that as . From Lemma 3.2, we obtain

    and thus as . Proposition 3.4 then yields as .

  (c) Next suppose that both for , and

    • We first observe that

      so as . Note that the asymptotics (4.9) is not needed for this result.

    • There holds


      This easily follows from (4.6) and (4.9), in combination with the estimate .

    • We next show


      Estimates (4.10) and (1.5) imply as , and thus we obtain weak convergence

      for some elements . Those two weak convergence statements initially hold for subsequences only, but without loss of generality we may assume that they hold for the whole considered system.

      Since the operator is weakly closed, we have

      The operator is sequentially weakly continuous, thus

      Algorithm 4.7 implies as , so , and the lower bound in (2.1) finally gives , which means (4.11).

      The interpolation inequality then gives

      This completes the proof of the theorem.        

Note that in the oversmoothing situation , case (c) in the proof of Theorem 4.9 does not emerge, cf. (4.11). This fact is, in a similar setting, already observed in [10, Lemma 1].

Remark 4.10.

Notice that the situation (b) in the proof of Theorem 4.9 is the regular case in applications. The case (c) is an exceptional case which, in the non-oversmoothing case, can be excluded, if the exact penalization veto is satisfied. This veto had been introduced in the paper [1]; see also [2].

5 Low order convergence rates

Our convergence assertion established in the main theorem of Subsection 4.1 is due to the error estimate (3.12) derived in Section 3. The presented sufficient conditions for convergence are based on the Banach–Steinhaus theorem and do not need any form of solution smoothness. In other words, the case is included, where does not satisfy a power-type source condition. However, as already mentioned above, there exists at least a source condition of lower order for the solution element . Precisely, there is always an index function and a source element such that


Based on formula (3.12) and taking into account the representations (3.7), (3.8) and (3.9), we can derive for such source condition low order convergence rates in the case of oversmoothing penalties as a byproduct of the studies presented in Section 3. We will outline this in the following. First we obtain from [4, Prop. 3.3] the following lemma, where we refer to [17] for the concept of qualification of a regularization method.

Lemma 5.1.

An index function is a qualification of the classical Tikhonov regularization related to the operator , which means that there are positive constants and such that

whenever for some the quotient function is non-increasing for .

Corollary 5.2.

If, for the index function and all exponents , the quotient functions are strictly increasing for sufficiently small , then the index function is for all a qualification of the classical Tikhonov regularization related to the operator .


We have that for all the quotient function with is non-increasing for sufficiently small . Consequently there are according to Lemma 5.1 positive constants and depending on such that

Theorem 5.3.

Let Assumption 2.1 and the source condition (5.1) be satisfied, where it is supposed that for all the index function has a strictly increasing quotient function for sufficiently small . Then we have, for some positive constant and from (3.12) and for all and sufficiently small , the error estimate


The functions and from Lemma 3.1 satisfy

These properties are immediate consequences of (5.2), taking into account the three representations (3.7), (3.8) and (3.9). Since the function in the error estimate (3.12) can be estimated from above by a linear combination of the functions and , there is a positive constant such that holds for sufficiently small .        

This provides us directly with the following low order convergence rate result.

Corollary 5.4.

Set under the assumptions of Theorem 5.3 and . Then we have

Example 5.5.

In this example, we consider source conditions (5.1) of logarithmic type with the function

which is strictly concave for sufficiently small and can be extended to as an index function. It is evident for all that the quotient function is strictly increasing for sufficiently small and Corollary 5.2 applies. This yields the error estimate (5.3) written as

For the a priori choice