This paper is concerned with nonlinear operator equations of the form
where is a nonlinear operator between infinite-dimensional Hilbert spaces and with norms . We suppose that the right-hand side is approximately given as satisfying the deterministic noise model
For finding stable approximations to the solution of equation (1.1), we consider the Tikhonov regularization, where the regularized solutions are minimizers of the extremal problem
with a regularization parameter . In this context, is assumed to be a norm of a densely defined subspace of , which is stronger than the original norm in . Throughout this paper we suppose that the initial guess occurring in the penalty term of satisfies the condition
Precisely, we define the stronger norm by a generator , which is a self-adjoint and positive definite unbounded linear operator with dense domain , i.e. we have for some constant
This allows us to introduce norms
where for , and for . The fractional powers are defined by means of the resolution of the identity generated by the inverse operator , see, e.g., [6, Section 2.3]. Note that the system of spaces , equipped with the respective norms, is strongly related to the Hilbert scale generated by the operator . However, for , topological completion of the spaces with respect to the norm is not needed in our setting and thus is omitted.
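As a finite-dimensional illustration only, the following sketch evaluates norms of Hilbert-scale type. It assumes that the norms take the usual form of the norm of a fractional power of the generator applied to the element, and it replaces the unbounded generator by a symmetric positive definite matrix; the names `fractional_power` and `scale_norm` and the toy matrix are hypothetical and do not occur in the paper. The fractional power is computed from an eigendecomposition, which is the matrix analogue of the spectral calculus mentioned above.

```python
import numpy as np

def fractional_power(B, tau):
    """Return B**tau for a symmetric positive definite matrix B (assumption)."""
    eigvals, eigvecs = np.linalg.eigh(B)
    # Scale column j of the eigenvector matrix by eigvals[j]**tau.
    return (eigvecs * eigvals**tau) @ eigvecs.T

def scale_norm(B, x, tau):
    """Norm of B**tau applied to x, a matrix stand-in for the scale norm."""
    return float(np.linalg.norm(fractional_power(B, tau) @ x))

# Hypothetical toy generator with smallest eigenvalue 1.
B = np.diag([1.0, 4.0, 9.0])
x = np.array([1.0, 1.0, 1.0])

norm_0 = scale_norm(B, x, 0.0)     # plain Euclidean norm
norm_half = scale_norm(B, x, 0.5)
norm_1 = scale_norm(B, x, 1.0)     # norm of B x
```

Since all eigenvalues of this toy matrix are at least one, the computed norms grow with the exponent, which mirrors the statement that the norm with the larger index is the stronger one.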
In the present work, we discuss the nonlinear Tikhonov regularization (1.3) in particular with an oversmoothing penalty term, where we have , or in other words This continues studies started in papers [9, 10] and , where convergence rates and numerical case studies are provided for a priori and a posteriori parameter choices, respectively, under certain smoothness assumptions on and structural conditions on . Under the same structural conditions, which are also similar to those in the corresponding seminal paper for linear operator equations by Natterer , we present as the novelty of this paper convergence results based on the Banach–Steinhaus theorem without needing any smoothness assumptions. The error estimates derived in the context of convergence assertions moreover allow us to prove low order convergence rates under associated (for example logarithmic) source conditions.
The outline of the remainder is as follows: in Section 2, we introduce Hilbert scales and formulate the basic assumptions, and in addition we establish well-posedness of Tikhonov regularization used in our setting. Then in Section 3, we introduce auxiliary elements needed for the proof of the convergence results, and in addition we provide first error estimates for Tikhonov regularization which are based on those auxiliary elements and which are needed for the subsequent convergence proofs. The regularizing properties of an a priori parameter choice as well as a discrepancy principle are considered in Section 4. The suggested discrepancy principle is considered in a form that is suitable for misfit functionals which may depend discontinuously on the regularization parameter . Finally, as a byproduct of derived error estimates, we can prove low order convergence rates in Section 5.
2 Prerequisites and assumptions
2.1 Main assumptions
In the following assumption we briefly summarize the structural properties of the operator , of its domain , in particular with respect to the solution of equation (1.3). For examples of nonlinear inverse problems that satisfy these assumptions (or at least substantial parts of them), we refer to [5, 7] and to the appendices of the papers [10, 25].
The operator is sequentially continuous on with respect to the weak topologies of the Hilbert spaces and .
The domain of definition is a closed and convex subset of .
Let be a non-empty set.
Let the solution to equation (1.1) with right-hand side be an interior point of the domain .
Let and let there exist finite constants such that the inequality chain
holds true for all .
2.2 Properties of regularized solutions of the Tikhonov regularization
Let for minimizers of the Tikhonov functional denote by , i.e. we have and evidently by definition of the penalty term .
Let in this example with be a bounded linear operator with non-closed range , and for simplicity let . In this setting, Tikhonov regularized solutions solve the linear operator equation
In the special situation of an injective operator and of a scale generator with , this gives
and Assumption 2.1 is satisfied with then. The oversmoothing case here means that . This situation is discussed in the analysis of fractional Tikhonov regularization, and we refer for example to [3, 8, 15]. In Natterer’s paper  the analog
to the inequality chain (2.1) is the basis for error estimates and convergence rates results for linear operator equations. The constant characterizes here the degree of ill-posedness of the problem.
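For the linear setting of this example, a matrix sketch may help to fix ideas. It assumes a zero initial guess and a penalty given by the squared norm of the generator applied to the unknown, so that the minimizer solves the standard normal equation with the system matrix "adjoint of the operator times the operator, plus the regularization parameter times the square of the generator". The function names, the diagonal toy matrices and all numerical values are hypothetical choices for illustration.

```python
import numpy as np

def tikhonov_linear(A, B, y_delta, alpha):
    """Minimizer of ||A x - y_delta||^2 + alpha * ||B x||^2 (zero initial guess assumed)."""
    lhs = A.T @ A + alpha * (B.T @ B)
    return np.linalg.solve(lhs, A.T @ y_delta)

def tikhonov_functional(A, B, y_delta, alpha, x):
    """Value of the quadratic Tikhonov functional at x."""
    return float(np.linalg.norm(A @ x - y_delta) ** 2
                 + alpha * np.linalg.norm(B @ x) ** 2)

# Hypothetical diagonal toy problem: A smooths, B penalizes high indices.
n = 50
k = np.arange(1, n + 1, dtype=float)
A = np.diag(1.0 / k)
B = np.diag(k ** 2)
x_true = 1.0 / k
y_delta = A @ x_true          # noise-free data, for simplicity
alpha = 1e-8

x_alpha = tikhonov_linear(A, B, y_delta, alpha)
```

In this toy problem the first matrix equals the generator raised to the power -1/2, which is one simple way to mimic a forward operator that is a negative fractional power of the scale generator.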
The extremal problem (1.3) for finding regularized solutions is well-posed with respect to existence of minimizers and their stability in a sense specified in the following proposition. This follows by standard results from regularization theory (cf., e.g., [22, Chapter 2.6], [23, 24] and [21, Section 4.1.1]). We therefore give only a sketch of the proof.
Let Assumption 2.1 be satisfied.
There exists for all a minimizer of the Tikhonov functional in the set .
Each minimizing sequence of over has a subsequence that converges strongly in to a minimizer of the Tikhonov functional.
For every the regularized solutions are stable with respect to small perturbations in the data .
The basic ingredients needed for the proof are as follows:
The operator , when considered as , is sequentially continuous with respect to the weak topologies on and . This implies that the misfit functional is sequentially continuous with respect to the weak topology on .
The set is weakly closed in .
The stabilizing functional is sequentially weakly lower continuous on .
The statement in the second item follows from the two facts that (i) a linear self-adjoint operator is weakly closed, and that (ii) each closed convex subset of a Hilbert space is weakly closed.
From these ingredients, it follows that each minimizing sequence of the Tikhonov functional has a subsequence which converges weakly in to a minimizer , and the corresponding subsequence of converges to .
We note that the minimizer of the Tikhonov functional may be non-unique. For an example, see [22, Example 1 in Chapter 2.6]. On the other hand, it should be mentioned that the properties of Tikhonov regularization in Hilbert spaces are well investigated when the penalty functional in the Tikhonov functional is replaced by , cf., e.g., [6, Chapter 10] or [20, Section 3.1] and the references therein, respectively.
One of the two main goals of this study is to discuss convergence results for the Tikhonov regularization with oversmoothing penalty, i.e. (note, however, that this is not explicitly required anywhere), and the regularization error is still measured in the norm of . This continues former studies like  under the assumption for some . In contrast to those papers, the focus of the present work is, although also not explicitly required anywhere, on the case for each , consequently on the situation characterized by . On the other hand, we also mention convergence assertions for with under the inequality chain (2.1).
3 Auxiliary elements and preparatory results
3.1 Auxiliary elements
In this section we consider auxiliary elements which are needed to verify our convergence results. As a preparation, we introduce the bounded, injective, selfadjoint, positive semidefinite linear operator
where the operator obeying the condition (1.5) is defined in Section 1, and is introduced by item (f) of Assumption 2.1. Note that the range of is not closed and hence zero is an accumulation point of the spectrum of . In this context, we also mention that is equivalent to , which means that obeys a power-type source condition with some source element . In the case , i.e. if , but for all , then it was shown in  and  that there exist an index function (for example of logarithmic type, cf. ) and a source element such that a (low order) source condition is satisfied. Here, following , a function is called an index function if it is continuous, non-decreasing and satisfies the limit condition .
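The defining properties of an index function can be checked numerically for a candidate of logarithmic type. The sketch below assumes the common form of the reciprocal of a power of log(1/t), which is not spelled out at this point of the text; the function name `log_index_function` and the exponent `mu` are hypothetical.

```python
import numpy as np

def log_index_function(t, mu=1.0):
    """Assumed logarithmic index function: log(1/t)**(-mu) for 0 < t < 1."""
    t = np.asarray(t, dtype=float)
    return np.log(1.0 / t) ** (-mu)

# Grid of small positive arguments.
t = np.logspace(-12, -1, 200)
phi = log_index_function(t, mu=1.0)

# Non-decreasing on the grid.
is_monotone = bool(np.all(np.diff(phi) >= 0))

# Decay towards zero is very slow compared with any power of t.
phi_tiny = float(log_index_function(1e-300))
power_tiny = (1e-300) ** 0.5
```

The comparison at the end indicates why such source conditions are called low order: the logarithmic function is still of moderate size at arguments where a power function is already negligible.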
The auxiliary elements based on the operator from (3.1) are defined as follows:
In order to specify the limit behaviour of different positive functions occurring in error estimates, we use in the sequel a collection of non-negative functions named and defined for with the property
to be supposed for all indices . Consequently, we have for all that as . Note that pairwise products and linear combinations with non-negative constants can again be written as such a function as
We show first that
holds for all and . It is well known that
Then the interpolation inequality implies the estimate
Note that the operator is selfadjoint and positive semidefinite, and thus the fractional powers are well-defined.
In addition, for fixed and any with chosen so small such that , we have, from (3.5) with replaced by ,
where . Since for arbitrary , the range of the operator is dense in , i.e., , due to (3.5) and (3.6) the Banach–Steinhaus theorem (cf., e.g., [14, Problem 10.1] or [19, Theorem 1.1.4]) may be applied to the operators for , with being fixed. This finally gives the asymptotics (3.4) as .
For the functions
the statements of the lemma are now easily obtained from (3.4) and the following three representations,
3.2 Some estimates for oversmoothing Tikhonov regularization
For small enough we have , because item (a) of Lemma 3.1 holds and is an interior point of . Thus
The first term on the right-hand side of the latter estimate can be written as
which completes the proof of the lemma.
and hence the required assertion of the corollary with and
The error is now estimated by the following series of error estimates. Using the triangle inequality and Lemma 3.1, we obtain
and below we consider the term in more detail. From the interpolation inequality for bounded linear, self-adjoint and positive semidefinite operators on Hilbert spaces, cf. [6, (2.49)], it follows
Thus we can continue estimating (3.11). Introducing and , we obtain
From the latter estimate and (3.10), the following proposition now immediately follows by considering there.
The inequality (3.12), which is valid for arbitrary noise levels and regularization parameters , allows us to formulate in the subsequent section sufficient conditions for the convergence of the error norm of the regularized solutions.
4 Convergence results
4.1 Main theorem
The following main theorem is an immediate consequence of the error estimate (3.12) derived in the preceding section.
then convergence cannot be derived in this way, because in that borderline case the second term on the right-hand side of inequality (3.12) does not tend to zero.
The convergence result of Theorem 4.1 applies to both situations: (a) the classical case and (b) the case of oversmoothing penalties . Since this theorem is based directly on formula (3.12), an inspection of the proof of this formula from Lemma 3.2 to Proposition 3.4 shows that both inequalities in (2.1) (left-hand inequality and right-hand inequality) as structural conditions on the nonlinearity of in a neighbourhood of are needed for the convergence result of Theorem 4.1. Nevertheless, for the oversmoothing case (b), this is real progress, since the convergence result of Theorem 4.1 does not use any form of additional smoothness on , and such convergence assertions without smoothness assumptions like for some (cf. ) have so far been missing in the literature for the oversmoothing case.
As is well known, in case (a) with the parameter choice condition
which is stronger than (4.1), is always sufficient for convergence of regularized solutions, and inequalities occurring in (2.1) represent only tools for obtaining convergence rates. On the other hand, in the limit situation of choosing the regularization parameter, one needs the left-hand inequality in (2.1) for obtaining convergence rates, and this inequality occurs here as a conditional stability estimate (cf. [7, Prop. 3], [5, Theorem 1.1] and references therein). Convergence is then a consequence of derived convergence rates.
4.2 A priori parameter choice of monomial type
In this subsection we consider in light of Theorem 4.1 the a priori parameter choice
We can distinguish the -intervals (A): , (B): , and (C): for (4.3). Then we have as in situation (A) and in situation (B). Note that both situations also occur and yield convergent regularized solutions in the oversmoothing case . This is a bit surprising, because the behaviour
occurring in situation (C) was supposed in the literature to be typical for the case of oversmoothing penalties. Namely as is seen in , convergence rate results of the form
are obtained under the two-sided structural condition (2.1) and in particular under the smoothness assumption for whenever the a priori parameter choice of type (4.3) with prescribed exponent applies. Evidently, this prescribed satisfies the conditions (4.4) and for all . It is important to note that coincides with the borderline case which, however, is not sufficient for convergent regularized solutions.
4.3 A discrepancy principle
For the specification of an appropriate discrepancy principle, the behaviour of the misfit functional needs to be described, for fixed. The basic properties are summarized in the following proposition.
Let Assumption 2.1 be satisfied. Then for fixed, the function is non-decreasing, with
We have .
We start with the verification of the first statement of the proposition. As a preparation, we show that the function is non-increasing. Indeed, for fixed we have
and thus . The first statement of the proposition is now easily obtained: for we have
and thus .
Next we consider the latter statement of the proposition. There holds
and thus in particular as . The estimate (1.5) implies as , which implies the latter statement of the proposition.
The first statement in (4.5) follows directly from Lemma 3.2, and we finally consider the second statement in (4.5). From (4.6) we already know that . Conversely, sequential weak continuity of the operator implies weak convergence as , and thus . This completes the proof of the proposition.
Notice that the proof of Proposition 4.5 makes in fact no use of the fundamental estimates (2.1) on the smoothing property of . Notice also that the statement of Proposition 4.5 is quite similar to related results for Tikhonov regularization with non-oversmoothing penalty, cf. , [22, Section 2.6] and [24, Section 6.7].
It follows from Proposition 4.5 that the following version of the discrepancy principle (cf. [23, 24]) is implementable. It determines, for each noise level , an approximation . Possible discontinuities of the misfit functional are taken into account.
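The monotonicity of the misfit with respect to the regularization parameter can be observed in a small linear experiment. The sketch below is only an illustration under simplifying assumptions: a linear diagonal forward operator, a zero initial guess, a quadratic penalty built from a diagonal matrix, and a deterministic noise vector of prescribed norm. All names and numerical values are hypothetical.

```python
import numpy as np

def misfit(A, B, y_delta, alpha):
    """Residual norm of the linear Tikhonov minimizer (zero initial guess assumed)."""
    x_alpha = np.linalg.solve(A.T @ A + alpha * (B.T @ B), A.T @ y_delta)
    return float(np.linalg.norm(A @ x_alpha - y_delta))

# Hypothetical diagonal toy problem.
n = 30
k = np.arange(1, n + 1, dtype=float)
A = np.diag(1.0 / k)
B = np.diag(k ** 2)

# Deterministic perturbation with Euclidean norm exactly delta.
delta = 1e-3
signs = np.where(k % 2 == 0, 1.0, -1.0)
y_delta = A @ (1.0 / k) + delta * signs / np.sqrt(n)

# Misfit on a logarithmic grid of regularization parameters.
alphas = np.logspace(-10, 2, 25)
misfits = np.array([misfit(A, B, y_delta, a) for a in alphas])
```

On this grid the misfit values increase with the regularization parameter and stay below the norm of the data, which is the misfit of the zero initial guess in this toy setting.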
Algorithm 4.7 (Discrepancy principle).
If holds, then choose , i.e., .
Otherwise choose a finite parameter such that
where and are finite constants, and let .
Algorithm 4.7 can be realized by the following strategy.
Remark 4.8 (Sequential discrepancy principle).
Practically, a parameter satisfying condition (4.7) can be determined, e.g., by choosing a constant and an initial guess and proceeding then as follows:
If holds then, with the notation , proceed for until is satisfied for the first time; define then.
If holds then, with the notation , proceed for until is satisfied for the first time; define then.
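The two-branch search of Remark 4.8 can be sketched as follows. Since the displayed conditions are not reproduced here, the sketch rests on assumptions: the threshold is a constant `b` times the noise level, the trial parameters form a geometric sequence with ratio `theta` greater than one, the downward branch returns the first parameter whose misfit does not exceed the threshold, and the upward branch returns the last parameter before the threshold is reached. The function `sequential_discrepancy`, its arguments and the linear toy problem are hypothetical.

```python
import math
import numpy as np

def sequential_discrepancy(misfit, delta, alpha0=1.0, theta=2.0, b=1.5, max_steps=200):
    """Geometric search for a parameter with misfit close to b * delta (assumed form)."""
    target = b * delta
    alpha = alpha0
    if misfit(alpha) >= target:
        # Misfit too large: decrease alpha until the threshold is met.
        for _ in range(max_steps):
            alpha /= theta
            if misfit(alpha) <= target:
                return alpha
        raise RuntimeError("threshold not reached while decreasing alpha")
    # Misfit already small: increase alpha as long as the threshold is not reached.
    for _ in range(max_steps):
        larger = alpha * theta
        if misfit(larger) >= target:
            return alpha
        alpha = larger
    return math.inf  # threshold never reached; corresponds to keeping the initial guess

# Hypothetical linear toy problem with zero initial guess.
n = 30
k = np.arange(1, n + 1, dtype=float)
A = np.diag(1.0 / k)
B = np.diag(k ** 2)
delta = 1e-3
signs = np.where(k % 2 == 0, 1.0, -1.0)
y_delta = A @ (1.0 / k) + delta * signs / np.sqrt(n)

def toy_misfit(alpha):
    x_alpha = np.linalg.solve(A.T @ A + alpha * (B.T @ B), A.T @ y_delta)
    return float(np.linalg.norm(A @ x_alpha - y_delta))

alpha_star = sequential_discrepancy(toy_misfit, delta, alpha0=1.0, theta=2.0, b=1.5)
```

The search only compares misfit values with the threshold, so it does not rely on continuity of the misfit in the regularization parameter; the step bound `max_steps` is a practical safeguard added for this sketch.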
The regularizing properties of Algorithm 4.7 are stated in the following theorem.
For an arbitrary countable noise level set having the origin as only accumulation point, we consider the following three cases: (a) for each , (b) as , and (c) for each , . Below we show that in each of those three cases, (4.8) holds, if the noise level in addition satisfies , respectively. The main statement of the theorem then follows by arguing for subsequences. Note that in cases (a) and (c), the second statement in (4.8) trivially holds.
The case for means for , and thus , and then for .
Next suppose that both for , and
We first observe that
so as . Note that the asymptotics (4.9) is not needed for this result.
We next show
for some elements . Those two weak convergence statements initially hold for subsequences only, but then without loss of generality we may assume that they hold for the whole considered system.
Since the operator is weakly closed, we have
The operator is sequentially weakly continuous, thus
The interpolation inequality then gives
This completes the proof of the theorem.
5 Low order convergence rates
Our convergence assertion established in the main theorem formulated in Section 4 is due to the error estimate (3.12) derived in Section 3. The presented sufficient conditions for convergence are based on the Banach–Steinhaus theorem and do not need any form of solution smoothness. In other words, the case is included, where does not satisfy a power-type source condition. However, as already mentioned above, there exists at least a source condition of lower order for the solution element . Precisely, there is always an index function and a source element such that
Based on formula (3.12) and taking into account the representations (3.7), (3.8) and (3.9), we can derive for such source conditions low order convergence rates in the case of oversmoothing penalties as a byproduct of the studies presented in Section 3. We will outline this in the following. First we obtain from [4, Prop. 3.3] the following lemma, where we refer to  for the concept of qualification of a regularization method.
An index function is a qualification of the classical Tikhonov regularization related to the operator , which means that there are positive constants and such that
whenever for some the quotient function is non-increasing for .
If, for the index function and all exponents , the quotient functions are strictly increasing for sufficiently small , then the index function is for all a qualification of the classical Tikhonov regularization related to the operator .
We have that for all the quotient function with is non-increasing for sufficiently small . Consequently there are according to Lemma 5.1 positive constants and depending on such that
Let Assumption 2.1 and the source condition (5.1) be satisfied, where it is supposed that for all the index function has a strictly increasing quotient function for sufficiently small . Then we have, for some positive constant and from (3.12) and for all and sufficiently small , the error estimate
The functions and from Lemma 3.1 satisfy
These properties are immediate consequences of (5.2) taking into account the three representations (3.7), (3.8) and (3.9). Since the function in the error estimate (3.12) can be estimated from above by a linear combination of the functions and , there is a positive constant such that holds for sufficiently small .
This provides us directly with the following low order convergence rate result.
Set under the assumptions of Proposition 5.3 and . Then we have
In this example, we consider source conditions (5.1) of logarithmic type with the function
which is strictly concave for sufficiently small and can be extended to as an index function. It is evident for all that the quotient function is strictly increasing for sufficiently small and Corollary 5.2 applies. This yields the error estimate (5.3) written as
For the a priori choice