# Tikhonov functionals with a tolerance measure introduced in the regularization

We consider a modified Tikhonov-type functional for the solution of ill-posed nonlinear inverse problems. Motivated by applications in the field of production engineering, we allow small deviations in the solution, which are modeled through a tolerance measure in the regularization term of the functional. The existence, stability, and weak convergence of minimizers are proved for such a functional, as well as the convergence rates in the Bregman distance. We present an example for illustrating the effect of tolerances on the regularized solution and examine parameter choice rules for finding the optimal regularization parameter for the assumed tolerance value. In addition, we discuss the prospect of reconstructing sparse solutions when tolerances are incorporated in the regularization functional.

## Authors

Published 11/02/2019.


## 1 Introduction

The classical inverse problem is described by an operator equation of the form

 $F(u) = v, \qquad (1)$

where $F \colon \mathcal{D}(F) \subseteq U \to V$ is a linear or nonlinear operator between Hilbert or Banach spaces $U$ and $V$. In the case of ill-posedness, we resort to regularization methods for approximating the true solution $u^{\dagger}$ of (1). The most developed and widely used method for solving ill-posed inverse problems is Tikhonov regularization, see [35, 36]. Some of the classical results on Tikhonov regularization can be found in [4, 9, 15, 17, 22, 23, 30, 33]. Here, the regularized solution is defined as a minimizer of the Tikhonov functional

 $\mathcal{T}_{\alpha}^{\delta}(u) = \|F(u) - v^{\delta}\|_{V}^{p} + \alpha \mathcal{R}_{q}(u), \qquad (2)$

which consists of a discrepancy term and a regularization term (also called penalty term). Through the regularization term we are able to include a-priori knowledge about the true solution.

In recent years, the concept of sparsity has proven a powerful tool, especially in applications, see for instance [5, 8, 15, 22, 29]. In this case the true solution has a sparse representation in a given basis or frame for the parameter space $U$, i.e., only a few coefficients are different from zero. It turns out that in many applications one has to choose between classical and sparse regularization. A further challenge, resulting from real-world applications, is to allow some deviations in the data. In [11], Tikhonov functionals incorporating tolerances in the discrepancy term were studied for the solution of inverse problems. The authors proposed an altered Tikhonov functional of the form

 $\mathcal{T}_{\alpha,\varepsilon}^{\delta}(u) = \|d_{\varepsilon}(F(u) - v^{\delta})\|_{V}^{p} + \alpha \mathcal{R}_{q}(u), \qquad (3)$

where $d_{\varepsilon}$ denotes the $\varepsilon$-insensitive distance. This approach makes sense, e.g., in production engineering: in the case of surface treatment, tolerances for the quality of the end product or for the measurement accuracy are often specified. These methods have been successfully applied to the problem of process design in micro production and to applications in image processing; in addition to the original reference, we refer the reader to [12] and [13]. For linear operators, the case $p = 1$ combined with a quadratic penalty is a generalization of Support Vector Regression (SVR), which can be used for treating ill-posed inverse problems, see for instance [34]. Furthermore, in [24] a rigorous analysis incorporating discrepancy terms with tolerances for solving linear integral equations was presented, under a semi-discrete setting in reproducing kernel Hilbert spaces (RKHS).

Inspired by the great potential of such approaches in applications, in our work we examine the effect of tolerances in the regularization term of Tikhonov functionals. Including these inside the penalty term means that the solution will eventually lie inside a confidence interval. An application of interest is the development of new structural materials. In this case, the goal is to find appropriate values for a set of production parameters, like chemical composition, heating or cooling, to finally obtain materials satisfying certain properties. The desired properties of the new materials are given in the form of intervals, or in the form of a so-called performance profile; for further reading refer to [27].

### 1.1 Regularization functional with tolerances

As discussed in the introduction, the $\varepsilon$-insensitive function comes from the theory of SVR, for further reading see [24, 32, 37], and was first introduced by Cortes and Vapnik in [7]. For a given $\varepsilon \ge 0$ the function is defined as

 $d_{\varepsilon}(x) := |x|_{\varepsilon} = \max\{|x| - \varepsilon, 0\}. \qquad (4)$

In Figure 1(a), $d_{\varepsilon}$ as given in (4) is plotted in comparison to the absolute value function, while Figure 1(b) shows their subdifferentials. In the following, we often use the term tolerance function when referring to the $\varepsilon$-insensitive function. Two analogous definitions are used within this work, which differ in $\varepsilon$ being a sequence or a function. We follow the definition in [11, Definition 1] and define the $\varepsilon$-insensitive modulus.

###### Definition 1 (ε-modulus function).

For $x \in \mathbb{R}^{n}$ and a tolerance vector $\varepsilon \in \mathbb{R}^{n}_{\ge 0}$ we define the $\varepsilon$-insensitive modulus componentwise as

 $d_{\varepsilon,n}(x)_{i} := d_{\varepsilon_{i}}(x_{i}), \quad i = 1, \dots, n. \qquad (5)$

For $f \colon \Omega \to \mathbb{R}^{n}$, with $\Omega \subset \mathbb{R}^{m}$, we define the $\varepsilon$-insensitive modulus function by

 $d_{\varepsilon,\Omega}(f)(\cdot) := d_{\varepsilon,n}(f(\cdot)). \qquad (6)$

For simplicity of notation, we write $d_{\varepsilon}$ for all cases. In both definitions given in (5) and (6) the equation (4) is applied pointwise. Analogously, using Definition 1 pointwise in the $L_{q}$-induced norm, we obtain a distance function in $L_{q}(\Omega)$.
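As a quick numerical illustration (our own sketch, not code from the paper), the componentwise modulus of Definition 1 can be implemented in a few lines; `d_eps` below realizes (4)/(5) for a scalar or vector tolerance.

```python
import numpy as np

def d_eps(x, eps):
    """epsilon-insensitive modulus of (4)/(5): max(|x_i| - eps_i, 0).

    `eps` may be a scalar tolerance or a vector of nonnegative
    tolerances, one per component; it is applied componentwise."""
    return np.maximum(np.abs(np.asarray(x, dtype=float)) - eps, 0.0)

# values inside the tolerance band are mapped to 0,
# values outside are shrunk toward the band by eps:
# d_eps([-2.0, 0.3, 1.5], 0.5) has components 1.5, 0.0, 1.0
```

Note that the flat region $\{|x| \le \varepsilon\}$ is exactly what later produces the set-valued subdifferential in Section 3.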

###### Definition 2 (Lq,ε-insensitive measure).

Let $\Omega \subset \mathbb{R}^{m}$ be bounded and closed and let $1 \le q < \infty$. The $L_{q,\varepsilon}$-insensitive measure is defined via

 $\|u \,|\, L_{q}(\Omega)\|_{\varepsilon} = \|u\|_{L_{q,\varepsilon}} = \|u\|_{q,\varepsilon} := \Big( \int_{\Omega} d_{\varepsilon}(u(x))^{q} \, \mathrm{d}x \Big)^{1/q}. \qquad (7)$

Our definition agrees with the one given in [11], and for the case of a function-valued tolerance $\varepsilon$ we further have to assume that $\varepsilon$ is bounded. For notational simplicity of our subsequent analysis, the measure in (7) will often be denoted by $\|u\|_{q,\varepsilon}$.

In regularization methods we often assume a reference solution which is included in the penalty term as a-priori information on the true solution of the problem. Denoting the reference solution by $u^{*}$ and including the tolerances, our penalty term is of the form

 $\mathcal{R}_{q,\varepsilon}(u) := \|u - u^{*}\|_{q,\varepsilon}^{q} = \int_{\Omega} \big( \max\{ |u(x) - u^{*}(x)| - \varepsilon, 0 \} \big)^{q} \, \mathrm{d}x, \qquad (8)$

where $1 \le q < \infty$ and $\Omega$ is a bounded set in $\mathbb{R}^{m}$. Since $u^{*}$ does not affect our theoretical analysis, for simplicity we assume it to be zero, and we only consider it later in our numerical results.

The functional $\mathcal{R}_{q,\varepsilon}$ is weakly lower semi-continuous and fulfills the following inequalities,

 $\|u\|_{L_{q,\varepsilon}} \le \|u\|_{L_{q}}, \qquad (9)$

 $\|u\|_{L_{q}} \le \|u\|_{L_{q,\varepsilon}} + \|\varepsilon\|_{L_{q}}, \qquad (10)$

which have been proved in [11]. Furthermore, $\mathcal{R}_{q,\varepsilon}$ is continuous and convex for $q \ge 1$, whereas for $q > 1$ it is strictly convex outside the tolerance region. By (9) it is obvious that $\mathcal{R}_{q,\varepsilon}(u) \le \|u\|_{L_{q}}^{q} < \infty$ for $u \in L_{q}(\Omega)$ and, therefore, $\mathcal{R}_{q,\varepsilon}$ is well defined.
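Inequalities (9) and (10) can be checked numerically on a grid; the sketch below (an illustration under our own discretization, not code from the paper) approximates the $L_{q,\varepsilon}$-measure (7) by a Riemann sum with a constant tolerance.

```python
import numpy as np

def norm_q_eps(u, eps, q, dx):
    """Riemann-sum approximation of the L_{q,eps}-measure (7)."""
    return (np.sum(np.maximum(np.abs(u) - eps, 0.0) ** q) * dx) ** (1.0 / q)

rng = np.random.default_rng(0)
u = rng.normal(size=200)     # samples of u on a uniform grid over Omega
eps, q, dx = 0.3, 2, 0.01

lq_eps = norm_q_eps(u, eps, q, dx)       # ||u||_{q,eps}
lq = norm_q_eps(u, 0.0, q, dx)           # plain ||u||_{L_q}  (eps = 0)
eps_q = (np.sum(np.full_like(u, eps) ** q) * dx) ** (1.0 / q)  # ||eps||_{L_q}

assert lq_eps <= lq              # inequality (9)
assert lq <= lq_eps + eps_q      # inequality (10)
```

Both inequalities follow from the pointwise bound $|u| \le d_{\varepsilon}(u) + \varepsilon$ together with Minkowski's inequality, which is exactly what the discrete check reflects.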

###### Proposition 3.

Let $\varepsilon \in L_{q}(\Omega)$ for $1 \le q < \infty$. The regularization functional $\mathcal{R}_{q,\varepsilon}$ given by (8) is coercive.

###### Proof.

This follows directly from inequality (10), since taking $\|u\|_{L_{q}} \to \infty$ leads to the conclusion that $\mathcal{R}_{q,\varepsilon}(u) = \|u\|_{q,\varepsilon}^{q} \to \infty$.∎

### 1.2 Tikhonov functional with tolerance in regularization term

Assuming $U = L_{q}(\Omega)$ over a bounded set $\Omega \subset \mathbb{R}^{m}$ and $V$ to be a reflexive Banach space, we consider an altered Tikhonov functional including the tolerance function described in the previous section in the regularization term, that is,

 $\mathcal{J}_{\alpha,\varepsilon}^{\delta}(u) := \|F(u) - v^{\delta}\|_{V}^{p} + \alpha \mathcal{R}_{q,\varepsilon}(u). \qquad (11)$

Here $F \colon \mathcal{D}(F) \subseteq U \to V$ is a nonlinear operator between $U$ and $V$, and the noisy data $v^{\delta}$ are created with additive noise with noise level $\delta$ and are such that $\|v - v^{\delta}\|_{V} \le \delta$. The regularization term $\mathcal{R}_{q,\varepsilon}$ for $1 \le q < \infty$ includes the tolerance $\varepsilon$ and is given by (8). We aim at investigating the analytical properties of minimizers of (11). Moreover, we examine the connection between tolerances in parameter space and sparsity regularization. The following assumption remains valid throughout the paper.
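For intuition, the effect of the tolerance in the penalty can be reproduced on a toy discrete problem. The sketch below is our own illustration, with a hypothetical linear forward matrix `A`, $p = q = 2$, and a constant tolerance; it minimizes a discrete version of (11) by plain gradient descent, using the componentwise derivative of $d_{\varepsilon}(\cdot)^2$ for the penalty.

```python
import numpy as np

def penalty_grad(u, eps):
    # derivative of sum_i max(|u_i| - eps, 0)^2:
    # zero inside the tolerance band, 2*sign(u_i)*(|u_i| - eps) outside
    return 2.0 * np.sign(u) * np.maximum(np.abs(u) - eps, 0.0)

def minimize_J(A, v_delta, alpha, eps, steps=5000, lr=1e-2):
    """Gradient descent on J(u) = ||A u - v||_2^2 + alpha * R_{2,eps}(u)."""
    u = np.zeros(A.shape[1])
    for _ in range(steps):
        grad = 2.0 * A.T @ (A @ u - v_delta) + alpha * penalty_grad(u, eps)
        u -= lr * grad
    return u

# toy example: A = I, so the components decouple
u_hat = minimize_J(np.eye(3), np.array([2.0, 0.1, -1.0]), alpha=1.0, eps=0.5)
# components inside the band (|v_i| <= eps) are not penalized at all,
# components outside are pulled toward the band: approx [1.25, 0.1, -0.75]
```

The toy run shows the qualitative behaviour discussed above: within the tolerance band the penalty exerts no pull, so the data value is reproduced exactly, while values outside the band are shrunk toward it.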

###### Assumption 4.
• Let $F$ be weakly sequentially closed with respect to the weak topologies on $U$ and $V$.

• The set $\mathcal{D} := \mathcal{D}(F) \cap \operatorname{dom}(\mathcal{R}_{q,\varepsilon})$ is non-empty. Note that this assumption implies that $\mathcal{J}_{\alpha,\varepsilon}^{\delta}$ is proper.

Furthermore, in the proofs of convergence and convergence rates of the minimizers of $\mathcal{J}_{\alpha,\varepsilon}^{\delta}$, we use the concept of an $\mathcal{R}$-minimizing solution.

###### Definition 5 (R-minimizing solution).

The element $u^{\dagger} \in \mathcal{D}$ is called an $\mathcal{R}$-minimizing solution if $F(u^{\dagger}) = v$ and $\mathcal{R}_{q,\varepsilon}(u^{\dagger}) = \min\{ \mathcal{R}_{q,\varepsilon}(u) : u \in \mathcal{D},\ F(u) = v \}$.

## 2 Well-posedness

We begin with the existence of minimizers of $\mathcal{J}_{\alpha,\varepsilon}^{\delta}$. Then, we continue with results on the stability of minimizers, i.e., we prove that the minimizers depend continuously on the data. In the following results we use the next lemma, which can be found in [15].

###### Lemma 6.

Let $1 \le q < \infty$ and let $\alpha > 0$ be fixed. Assume that $(v_{k})_{k \in \mathbb{N}}$ is a bounded sequence in $V$ and that there exist a sequence $(u_{k})_{k \in \mathbb{N}} \subset \mathcal{D}$ and a constant $C > 0$ such that $\|F(u_{k}) - v_{k}\|_{V}^{p} + \alpha \mathcal{R}_{q,\varepsilon}(u_{k}) \le C$ for all $k \in \mathbb{N}$. Then, there exist $\tilde{u} \in \mathcal{D}$ and a subsequence $(u_{k_{j}})_{j \in \mathbb{N}}$ such that $u_{k_{j}} \rightharpoonup \tilde{u}$ and $F(u_{k_{j}}) \rightharpoonup F(\tilde{u})$.

###### Proof.

The proof of this Lemma is omitted as it follows with similar steps as in [15, Lemma 4]. ∎

In the theorems, we closely follow the concept in [15] and [23] and prove them for our Tikhonov functional with tolerances incorporated in the regularization term.

###### Theorem 7 (Existence).

Assume that $\varepsilon > 0$ is fixed. For $\alpha > 0$ and for every $v^{\delta} \in V$, the functional $\mathcal{J}_{\alpha,\varepsilon}^{\delta}$ has a minimizer in $\mathcal{D}$.

###### Proof.

Let $(u_{k})_{k \in \mathbb{N}} \subset \mathcal{D}$ satisfy $\lim_{k} \mathcal{J}_{\alpha,\varepsilon}^{\delta}(u_{k}) = \inf_{u \in \mathcal{D}} \mathcal{J}_{\alpha,\varepsilon}^{\delta}(u)$. From Lemma 6, there exists a subsequence $(u_{k_{j}})_{j}$ weakly converging to some $\tilde{u} \in \mathcal{D}$ such that $F(u_{k_{j}}) \rightharpoonup F(\tilde{u})$. From the weak lower semi-continuity of the norm and of $\mathcal{R}_{q,\varepsilon}$, and the fact that $F$ is weakly sequentially closed, it follows that

 $\mathcal{J}_{\alpha,\varepsilon}^{\delta}(\tilde{u}) \le \liminf_{j} \|F(u_{k_{j}}) - v^{\delta}\|_{V}^{p} + \alpha \liminf_{j} \mathcal{R}_{q,\varepsilon}(u_{k_{j}}) \le \liminf_{j} \big\{ \|F(u_{k_{j}}) - v^{\delta}\|_{V}^{p} + \alpha \mathcal{R}_{q,\varepsilon}(u_{k_{j}}) \big\} \le \limsup_{j} \mathcal{J}_{\alpha,\varepsilon}^{\delta}(u_{k_{j}}) \le \mathcal{J}_{\alpha,\varepsilon}^{\delta}(u), \quad \forall u \in \mathcal{D}.$

Therefore, $\mathcal{J}_{\alpha,\varepsilon}^{\delta}(\tilde{u}) \le \mathcal{J}_{\alpha,\varepsilon}^{\delta}(u)$ for any $u \in \mathcal{D}$, which means that $\tilde{u}$ is a minimizer of $\mathcal{J}_{\alpha,\varepsilon}^{\delta}$. ∎

###### Notation.

If any of the ingredients $\alpha$, $\varepsilon$, $v^{\delta}$ is taken as a (sub)sequence, the functional will be denoted including the respective (sub)sequence in its shorthand notation; e.g., given a sequence of noisy data $(v_{k})_{k \in \mathbb{N}}$, we will write $\mathcal{J}_{\alpha,\varepsilon}^{v_{k}}$ for denoting the functional $u \mapsto \|F(u) - v_{k}\|_{V}^{p} + \alpha \mathcal{R}_{q,\varepsilon}(u)$.

The next theorem concerns the stability of minimizers of $\mathcal{J}_{\alpha,\varepsilon}^{\delta}$; namely, for fixed $\alpha > 0$ we prove that the minimizers depend continuously on the data $v^{\delta}$.

###### Theorem 8 (Stability for fixed ε > 0).

Assume $\alpha > 0$ and $\varepsilon > 0$ fixed. Let $(v_{k})_{k \in \mathbb{N}} \subset V$ converge to some $v^{\delta} \in V$ and let

 $u_{k} \in \operatorname{argmin}\{ \mathcal{J}_{\alpha,\varepsilon}^{v_{k}}(u) : u \in \mathcal{D} \}.$

Then, there exists a subsequence $(u_{k_{j}})_{j}$ which converges weakly to a minimizer $\tilde{u}$ of the functional $\mathcal{J}_{\alpha,\varepsilon}^{\delta}$. Moreover, we have that $\lim_{j} \mathcal{R}_{q,\varepsilon}(u_{k_{j}}) = \mathcal{R}_{q,\varepsilon}(\tilde{u})$.

###### Proof.

Since $(u_{k})_{k}$ is a sequence of minimizers of $\mathcal{J}_{\alpha,\varepsilon}^{v_{k}}$, it holds that $\mathcal{J}_{\alpha,\varepsilon}^{v_{k}}(u_{k}) \le \mathcal{J}_{\alpha,\varepsilon}^{v_{k}}(u)$ for any $u \in \mathcal{D}$. From Lemma 6, there exists a subsequence $(u_{k_{j}})_{j}$ weakly converging to some $\tilde{u} \in \mathcal{D}$ such that $F(u_{k_{j}}) \rightharpoonup F(\tilde{u})$. Moreover, from the weak lower semi-continuity of the norm and of $\mathcal{R}_{q,\varepsilon}$ there holds

 $\|F(\tilde{u}) - v^{\delta}\|_{V}^{p} \le \liminf_{j} \|F(u_{k_{j}}) - v_{k_{j}}\|_{V}^{p} \quad \text{and} \quad \mathcal{R}_{q,\varepsilon}(\tilde{u}) \le \liminf_{j} \mathcal{R}_{q,\varepsilon}(u_{k_{j}}). \qquad (12)$

Combining the above, we get

 $\mathcal{J}_{\alpha,\varepsilon}^{\delta}(\tilde{u}) \le \liminf_{j} \|F(u_{k_{j}}) - v_{k_{j}}\|_{V}^{p} + \alpha \liminf_{j} \mathcal{R}_{q,\varepsilon}(u_{k_{j}}) \le \liminf_{j} \big\{ \|F(u_{k_{j}}) - v_{k_{j}}\|_{V}^{p} + \alpha \mathcal{R}_{q,\varepsilon}(u_{k_{j}}) \big\} = \liminf_{j} \mathcal{J}_{\alpha,\varepsilon}^{v_{k_{j}}}(u_{k_{j}}). \qquad (13)$

On the other hand, for any $u \in \mathcal{D}$, we see that

 $\mathcal{J}_{\alpha,\varepsilon}^{\delta}(u) = \lim_{k} \mathcal{J}_{\alpha,\varepsilon}^{v_{k}}(u) \ge \limsup_{j} \mathcal{J}_{\alpha,\varepsilon}^{v_{k_{j}}}(u_{k_{j}}) \ge \liminf_{j} \mathcal{J}_{\alpha,\varepsilon}^{v_{k_{j}}}(u_{k_{j}}). \qquad (14)$

From (13) and (14) we conclude that $\mathcal{J}_{\alpha,\varepsilon}^{\delta}(\tilde{u}) \le \mathcal{J}_{\alpha,\varepsilon}^{\delta}(u)$ for any $u \in \mathcal{D}$, that is, $\tilde{u}$ is a minimizer of $\mathcal{J}_{\alpha,\varepsilon}^{\delta}$. Moreover, the weak lower semi-continuity of the norm and of $\mathcal{R}_{q,\varepsilon}$ implies that $\lim_{j} \mathcal{R}_{q,\varepsilon}(u_{k_{j}}) = \mathcal{R}_{q,\varepsilon}(\tilde{u})$. ∎

###### Remark 9.

In [15, Proposition 6], the authors additionally prove that the minimizers converge in norm for their functional. In our case, such a result cannot be inferred, as weak convergence is not preserved under the nonlinearity of the tolerance function. That is, assuming $u_{k_{j}} \rightharpoonup \tilde{u}$, we cannot prove that $d_{\varepsilon}(u_{k_{j}}) \rightharpoonup d_{\varepsilon}(\tilde{u})$. In order to obtain norm convergence, one could impose a further assumption on the convergence of the penalty values. However, we choose not to make this additional assumption, as it is quite restrictive.

###### Theorem 10 (Weak convergence for fixed ε > 0).

Let $\varepsilon > 0$ be fixed. Assume that $F(u) = v$ attains a solution in $\mathcal{D}$ and that $\alpha \colon (0, \infty) \to (0, \infty)$ satisfies

 $\alpha(\delta) \to 0 \quad \text{and} \quad \frac{\delta^{p}}{\alpha(\delta)} \to 0, \quad \text{as } \delta \to 0.$

Let $(\delta_{k})_{k \in \mathbb{N}}$ be a sequence of noise levels with $\delta_{k} \to 0$ and let $(v_{k})_{k \in \mathbb{N}} \subset V$ satisfy $\|v - v_{k}\|_{V} \le \delta_{k}$. Moreover, let $\alpha_{k} := \alpha(\delta_{k})$ and

 $u_{k} \in \operatorname{argmin}\{ \mathcal{J}_{\alpha_{k},\varepsilon}^{v_{k}}(u) : u \in \mathcal{D} \}.$

Then, there exist an $\mathcal{R}$-minimizing solution $u^{\dagger}$ of $F(u) = v$ and a subsequence $(u_{k_{j}})_{j}$ with $u_{k_{j}} \rightharpoonup u^{\dagger}$.

###### Proof.

Let $\tilde{u} \in \mathcal{D}$ be any solution of $F(u) = v$. From the definition of $u_{k}$ it follows that

 $\mathcal{J}_{\alpha_{k},\varepsilon}^{v_{k}}(u_{k}) = \|F(u_{k}) - v_{k}\|_{V}^{p} + \alpha_{k} \mathcal{R}_{q,\varepsilon}(u_{k}) \le \delta_{k}^{p} + \alpha_{k} \mathcal{R}_{q,\varepsilon}(\tilde{u}).$

It can be easily seen that $\|F(u_{k}) - v_{k}\|_{V}^{p} \le \delta_{k}^{p} + \alpha_{k} \mathcal{R}_{q,\varepsilon}(\tilde{u})$ and, together with the assumptions on $\alpha$ and $(\delta_{k})_{k}$, we conclude that $\lim_{k} \|F(u_{k}) - v_{k}\|_{V} = 0$. For the penalty term we have $\mathcal{R}_{q,\varepsilon}(u_{k}) \le \delta_{k}^{p}/\alpha_{k} + \mathcal{R}_{q,\varepsilon}(\tilde{u})$, which yields

 $\limsup_{k} \mathcal{R}_{q,\varepsilon}(u_{k}) \le \mathcal{R}_{q,\varepsilon}(\tilde{u}), \qquad (15)$

when using the definition of the limit superior. Let $\alpha_{\max} := \sup_{k} \alpha_{k}$; from the previous inequalities there exists $M > 0$ such that

 $\|F(u_{k}) - v_{k}\|_{V}^{p} + \alpha_{\max} \mathcal{R}_{q,\varepsilon}(u_{k}) \le M < \infty, \quad \forall k \in \mathbb{N}.$

Therefore, Lemma 6 guarantees the existence of a subsequence $(u_{k_{j}})_{j}$ and some $u^{\dagger} \in \mathcal{D}$ such that $u_{k_{j}} \rightharpoonup u^{\dagger}$ and $F(u_{k_{j}}) \rightharpoonup F(u^{\dagger})$. Since

 $\|F(u_{k_{j}}) - v\|_{V} = \|F(u_{k_{j}}) - v_{k_{j}} + v_{k_{j}} - v\|_{V} \le \|F(u_{k_{j}}) - v_{k_{j}}\|_{V} + \|v_{k_{j}} - v\|_{V} \to 0,$

it follows that $F(u^{\dagger}) = v$, i.e., $u^{\dagger}$ is a solution of (1). From the weak lower semi-continuity of $\mathcal{R}_{q,\varepsilon}$ and the fact that (15) holds for any $\tilde{u}$ solving $F(u) = v$, we conclude

 $\mathcal{R}_{q,\varepsilon}(u^{\dagger}) \le \liminf_{j} \mathcal{R}_{q,\varepsilon}(u_{k_{j}}) \le \limsup_{j} \mathcal{R}_{q,\varepsilon}(u_{k_{j}}) \le \mathcal{R}_{q,\varepsilon}(\tilde{u}).$

This shows that $u^{\dagger}$ is an $\mathcal{R}$-minimizing solution of $F(u) = v$ and $u_{k_{j}} \rightharpoonup u^{\dagger}$. ∎

### 2.1 Stability and convergence for vanishing tolerances

In the previous results we always assumed a positive constant tolerance $\varepsilon > 0$. In this section, we consider a nonnegative tolerance sequence $(\varepsilon_{k})_{k \in \mathbb{N}}$ such that $\varepsilon_{k} \to 0$. When the limit point of the tolerances is $0$, we observe that $\mathcal{R}_{q,\varepsilon}$ reduces to

 $\mathcal{R}_{q,0}(u) = \int_{\Omega} |u(x)|^{q} \, \mathrm{d}x =: \mathcal{R}_{q}(u). \qquad (16)$

Therefore, in the limit we obtain minimizers of the generalized Tikhonov functional. For that reason, the minimizer of $\mathcal{J}_{\alpha,0}^{\delta}$ is denoted by $\tilde{u}$.
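The limit (16) can also be observed numerically: for a fixed discrete $u$, the tolerance penalty increases monotonically toward the classical $q$-th power of the norm as the tolerance shrinks. A minimal sketch (our own, with $q = 2$ and $u^{*} = 0$):

```python
import numpy as np

def R_q_eps(u, eps, q=2, dx=1.0):
    # discrete version of the penalty (8) with reference solution u* = 0
    return np.sum(np.maximum(np.abs(u) - eps, 0.0) ** q) * dx

u = np.array([1.0, -0.2, 0.6])
vals = [R_q_eps(u, e) for e in (0.5, 0.1, 0.01, 0.0)]
# monotonically increasing toward R_q(u) = 1^2 + 0.2^2 + 0.6^2 = 1.4
```

Monotonicity here reflects that $d_{\varepsilon}(x)$ is nonincreasing in $\varepsilon$, so smaller tolerances can only enlarge the penalty.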

###### Theorem 11 (Stability for ε_k → 0).

Assume $\alpha > 0$ fixed. Let $(v_{k})_{k \in \mathbb{N}} \subset V$ converge to some $v^{\delta} \in V$, let $(\varepsilon_{k})_{k \in \mathbb{N}}$ be a tolerance sequence converging to $0$ and let

 $u_{k} \in \operatorname{argmin}\{ \mathcal{J}_{\alpha,\varepsilon_{k}}^{v_{k}}(u) : u \in \mathcal{D} \}.$

Then, there exist a subsequence $(u_{k_{j}})_{j}$ and a minimizer $\tilde{u}$ of the functional $\mathcal{J}_{\alpha,0}^{\delta}$ such that $u_{k_{j}} \rightharpoonup \tilde{u}$.

###### Proof.

The minimizing property of $u_{k}$ gives that $\mathcal{J}_{\alpha,\varepsilon_{k}}^{v_{k}}(u_{k}) \le \mathcal{J}_{\alpha,\varepsilon_{k}}^{v_{k}}(u)$ for all $u \in \mathcal{D}$. Lemma 6 guarantees the existence of a subsequence of $(u_{k})_{k}$, denoted by $(u_{k_{j}})_{j}$, which converges weakly to some $\tilde{u} \in \mathcal{D}$ and is such that $F(u_{k_{j}}) \rightharpoonup F(\tilde{u})$. From the weak lower semi-continuity of the norm and of $\mathcal{R}_{q}$, and the fact that $\varepsilon_{k_{j}} \to 0$, we have that

 $\mathcal{J}_{\alpha,0}^{\delta}(\tilde{u}) \le \liminf_{j} \|F(u_{k_{j}}) - v_{k_{j}}\|_{V}^{p} + \alpha \liminf_{j} \mathcal{R}_{q,\varepsilon_{k_{j}}}(u_{k_{j}}) \le \liminf_{j} \mathcal{J}_{\alpha,\varepsilon_{k_{j}}}^{v_{k_{j}}}(u_{k_{j}}). \qquad (17)$

On the other hand, since $u_{k}$ is a minimizer of $\mathcal{J}_{\alpha,\varepsilon_{k}}^{v_{k}}$, for any $u \in \mathcal{D}$ we have

 $\mathcal{J}_{\alpha,0}^{\delta}(u) = \lim_{k} \mathcal{J}_{\alpha,\varepsilon_{k}}^{v_{k}}(u) \ge \limsup_{j} \mathcal{J}_{\alpha,\varepsilon_{k_{j}}}^{v_{k_{j}}}(u_{k_{j}}) \ge \liminf_{j} \mathcal{J}_{\alpha,\varepsilon_{k_{j}}}^{v_{k_{j}}}(u_{k_{j}}) \overset{(17)}{\ge} \mathcal{J}_{\alpha,0}^{\delta}(\tilde{u}).$

Hence, based on the notation in (16), we obtain

 $\mathcal{J}_{\alpha}^{\delta}(u) := \mathcal{J}_{\alpha,0}^{\delta}(u) \ge \mathcal{J}_{\alpha,0}^{\delta}(\tilde{u}) =: \mathcal{J}_{\alpha}^{\delta}(\tilde{u}),$

for all $u \in \mathcal{D}$, implying that $\tilde{u}$ is a minimizer of $\mathcal{J}_{\alpha,0}^{\delta}$. Moreover, since both the norm and $\mathcal{R}_{q}$ are weakly lower semi-continuous, it follows that $\lim_{j} \mathcal{R}_{q,\varepsilon_{k_{j}}}(u_{k_{j}}) = \mathcal{R}_{q}(\tilde{u})$. Then, with the use of [15, Lemma 2] we conclude that $u_{k_{j}} \to \tilde{u}$ with respect to the $L_{q}$-norm. ∎

###### Theorem 12 (Convergence for ε_k → 0).

Let $(\varepsilon_{k})_{k \in \mathbb{N}}$ be a tolerance sequence converging to $0$. We assume that $F(u) = v$ attains a solution in $\mathcal{D}$ and that $\alpha \colon (0, \infty) \to (0, \infty)$ satisfies

 $\alpha(\delta) \to 0 \quad \text{and} \quad \frac{\delta^{p}}{\alpha(\delta)} \to 0, \quad \text{as } \delta \to 0.$

Let $(\delta_{k})_{k \in \mathbb{N}}$ be a sequence of noise levels with $\delta_{k} \to 0$ and let $(v_{k})_{k \in \mathbb{N}} \subset V$ satisfy $\|v - v_{k}\|_{V} \le \delta_{k}$. Moreover, let $\alpha_{k} := \alpha(\delta_{k})$ and

 $u_{k} \in \operatorname{argmin}\{ \mathcal{J}_{\alpha_{k},\varepsilon_{k}}^{v_{k}}(u) : u \in \mathcal{D} \}.$

Then, there exist an $\mathcal{R}_{q}$-minimizing solution $u^{\dagger}$ of $F(u) = v$ and a subsequence $(u_{k_{j}})_{j}$ with $u_{k_{j}} \rightharpoonup u^{\dagger}$.

###### Proof.

Let $\tilde{u} \in \mathcal{D}$ be any solution of $F(u) = v$. The minimizing property of $u_{k}$ implies

 $\mathcal{J}_{\alpha_{k},\varepsilon_{k}}^{v_{k}}(u_{k}) \le \mathcal{J}_{\alpha_{k},\varepsilon_{k}}^{v_{k}}(\tilde{u}) = \|F(\tilde{u}) - v_{k}\|_{V}^{p} + \alpha_{k} \mathcal{R}_{q,\varepsilon_{k}}(\tilde{u}) = \|v - v_{k}\|_{V}^{p} + \alpha_{k} \mathcal{R}_{q,\varepsilon_{k}}(\tilde{u}) \le \delta_{k}^{p} + \alpha_{k} \mathcal{R}_{q,\varepsilon_{k}}(\tilde{u}).$

Therefore, it follows that $\|F(u_{k}) - v_{k}\|_{V}^{p} \le \delta_{k}^{p} + \alpha_{k} \mathcal{R}_{q,\varepsilon_{k}}(\tilde{u})$. Then, taking the limit for $k \to \infty$ yields $\lim_{k} \|F(u_{k}) - v_{k}\|_{V} = 0$, since we assumed that $\alpha(\delta) \to 0$ and $\delta^{p}/\alpha(\delta) \to 0$ as $\delta \to 0$. In a similar way, for the penalty term we have

 $\alpha_{k} \mathcal{R}_{q,\varepsilon_{k}}(u_{k}) \le \mathcal{J}_{\alpha_{k},\varepsilon_{k}}^{v_{k}}(u_{k}) \le \delta_{k}^{p} + \alpha_{k} \mathcal{R}_{q,\varepsilon_{k}}(\tilde{u}),$

that is, $\mathcal{R}_{q,\varepsilon_{k}}(u_{k}) \le \delta_{k}^{p}/\alpha_{k} + \mathcal{R}_{q,\varepsilon_{k}}(\tilde{u})$. Taking the limit superior as $k \to \infty$ we obtain

 $\limsup_{k} \mathcal{R}_{q,\varepsilon_{k}}(u_{k}) \le \limsup_{k} \Big\{ \frac{\delta_{k}^{p}}{\alpha_{k}} + \mathcal{R}_{q,\varepsilon_{k}}(\tilde{u}) \Big\} = \mathcal{R}_{q,0}(\tilde{u}), \qquad (18)$

which is true for any solution $\tilde{u}$ of $F(u) = v$. With $\alpha_{1} := \sup_{k} \alpha_{k}$ and the previous calculation, there exists a constant $M > 0$ such that

 $\|F(u_{k}) - v_{k}\|_{V}^{p} + \alpha_{1} \mathcal{R}_{q,\varepsilon_{k}}(u_{k}) \le M < \infty, \quad \forall k \in \mathbb{N}.$

From Lemma 6, there exists a subsequence $(u_{k_{j}})_{j}$ weakly convergent to some $u^{\dagger} \in \mathcal{D}$ such that $F(u_{k_{j}}) \rightharpoonup F(u^{\dagger})$. Since

 $\|F(u_{k_{j}}) - v\|_{V}^{p} = \|F(u_{k_{j}}) - v_{k_{j}} + v_{k_{j}} - v\|_{V}^{p} \le \big( \|F(u_{k_{j}}) - v_{k_{j}}\|_{V} + \|v_{k_{j}} - v\|_{V} \big)^{p} \to 0, \qquad (19)$

it follows that $F(u^{\dagger}) = v$.
From the weak lower semi-continuity of $\mathcal{R}_{q}$, the fact that $\varepsilon_{k_{j}} \to 0$, and (18), we obtain that

 $\mathcal{R}_{q,0}(u^{\dagger}) \le \liminf_{j} \mathcal{R}_{q,\varepsilon_{k_{j}}}(u_{k_{j}}) \le \limsup_{j} \mathcal{R}_{q,\varepsilon_{k_{j}}}(u_{k_{j}}) \le \mathcal{R}_{q,0}(\tilde{u}),$

for all $\tilde{u}$ such that $F(\tilde{u}) = v$. Using the notation in (16), we conclude that $\mathcal{R}_{q}(u^{\dagger}) \le \mathcal{R}_{q}(\tilde{u})$ for all $\tilde{u}$ such that $F(\tilde{u}) = v$. Hence, $u^{\dagger}$ is an $\mathcal{R}_{q}$-minimizing solution of $F(u) = v$. Due to (18), the fact that $u_{k_{j}} \rightharpoonup u^{\dagger}$, and [15, Lemma 2], we further conclude that $u_{k_{j}} \to u^{\dagger}$ with respect to the $L_{q}$-norm. ∎

## 3 Convergence rates

In this section we present results on the convergence rates of minimizers of the functional (11). Since we assume the parameter space to be a Banach space, we adopt the standard approach in Banach space settings and use the Bregman distance to estimate the difference between the regularized solution $u_{\alpha}^{\delta}$ and the ground truth $u^{\dagger}$. Some standard results on convergence rates are found in [6, 10, 15, 22, 25], while convergence rates results using the Bregman distance exist in [14, 17, 33]. Moreover, for estimating the distance between $F(u_{\alpha}^{\delta})$ and the data, we use the usual norm of the Banach space $V$.

The definition of the Bregman distance for $\mathcal{R}_{q,\varepsilon}$ requires the subdifferential of the functional at an element $u \in L_{q}(\Omega)$, which is given by

 $\partial \mathcal{R}_{q,\varepsilon}(u) := \{ z \in L_{q}(\Omega)^{*} \ : \ \forall w \in L_{q}(\Omega), \ \mathcal{R}_{q,\varepsilon}(w) \ge \mathcal{R}_{q,\varepsilon}(u) + \langle z, w - u \rangle_{L_{q}(\Omega)^{*} \times L_{q}(\Omega)} \},$

where $L_{q}(\Omega)^{*}$ denotes the dual space of $L_{q}(\Omega)$ and $\langle \cdot, \cdot \rangle$ the dual pairing between $L_{q}(\Omega)^{*}$ and $L_{q}(\Omega)$. Particularly for (finite) $n$-dimensional problems, like the numerical example presented in the next section, the $\varepsilon$-insensitive measure appearing in the regularization functional is defined by $\mathcal{R}_{q,\varepsilon}(u) = \|d_{\varepsilon}(u)\|_{q}^{q} = \sum_{i=1}^{n} |d_{\varepsilon}(u_{i})|^{q}$. Using the classical subdifferential rules, for $q = 1$ we compute the subdifferential

 $\partial \mathcal{R}_{1,\varepsilon}(u) = \partial \|d_{\varepsilon}(u)\|_{1} = \partial \sum_{i=1}^{n} |d_{\varepsilon}(u_{i})| = \sum_{i=1}^{n} \partial |d_{\varepsilon}(u_{i})| \qquad (20)$

with $i$-th sum component given by

 $\partial |d_{\varepsilon}(u_{i})| = \begin{cases} \{-1\} & \text{if } u_{i} < -\varepsilon \\ [-1, 0] & \text{if } u_{i} = -\varepsilon \\ \{0\} & \text{if } |u_{i}| < \varepsilon \\ [0, 1] & \text{if } u_{i} = \varepsilon \\ \{1\} & \text{if } u_{i} > \varepsilon. \end{cases} \qquad (21)$

Similarly, for $q = 2$ we have

 $\partial \mathcal{R}_{2,\varepsilon}(u) = \partial \|d_{\varepsilon}(u)\|_{2}^{2} = \partial \sum_{i=1}^{n} |d_{\varepsilon}(u_{i})|^{2} = \sum_{i=1}^{n} \partial |d_{\varepsilon}(u_{i})|^{2} \qquad (22)$

with $i$-th sum component computed as

 $\partial |d_{\varepsilon}(u_{i})|^{2} = \begin{cases} 2(u_{i} + \varepsilon) & \text{if } u_{i} < -\varepsilon \\ 0 & \text{if } |u_{i}| \le \varepsilon \\ 2(u_{i} - \varepsilon) & \text{if } u_{i} > \varepsilon. \end{cases} \qquad (23)$

Note that the tolerance function is applied in a componentwise sense for computing the above subdifferentials. The previous computations are confirmed by the subdifferential formula for general $q \ge 1$,

 $\partial \mathcal{R}_{q,\varepsilon}(u) = \sum_{i=1}^{n} \partial |d_{\varepsilon}(u_{i})|^{q} \quad \text{with} \quad \partial |d_{\varepsilon}(u_{i})|^{q} = q \, |d_{\varepsilon}(u_{i})|^{q-1} \, \partial |d_{\varepsilon}(u_{i})|, \qquad (24)$

where $\partial |d_{\varepsilon}(u_{i})|$ is determined by (21).

It is worth noting that if the tolerance is not scalar but given as a vector $\varepsilon \in \mathbb{R}^{n}_{\ge 0}$ with positive entries, then $\varepsilon_{i}$ appears in place of $\varepsilon$ in all of the above calculations. Given the subdifferential of $\mathcal{R}_{q,\varepsilon}$, we proceed with the Bregman distance and the convergence rates.
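The selection (21) can be validated numerically against the defining subgradient inequality: for $g \in \partial |d_{\varepsilon}(u_{i})|$ one must have $|d_{\varepsilon}(w_{i})| \ge |d_{\varepsilon}(u_{i})| + g (w_{i} - u_{i})$ for all $w_{i}$. A small sketch (our own check, choosing $0$ at the kinks $|u_{i}| = \varepsilon$, which is an admissible selection):

```python
import numpy as np

def abs_eps(x, eps):
    return np.maximum(np.abs(x) - eps, 0.0)

def subgrad_q1(u, eps):
    """A selection from (21): sign(u_i) outside the band, 0 for |u_i| <= eps
    (at the kinks the subdifferential is an interval containing 0)."""
    g = np.sign(u)
    g[np.abs(u) <= eps] = 0.0
    return g

rng = np.random.default_rng(1)
u = rng.normal(size=50)
eps = 0.4
g = subgrad_q1(u, eps)
for _ in range(100):
    w = rng.normal(size=50)
    # componentwise subgradient inequality for the convex map |.|_eps
    assert np.all(abs_eps(w, eps) >= abs_eps(u, eps) + g * (w - u) - 1e-12)
```

The same kind of check applies to (23), which is single valued since the map $x \mapsto d_{\varepsilon}(x)^{2}$ is differentiable.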

###### Definition 13 (Bregman distance).

Let $1 \le q < \infty$. Also, let $\mathcal{R}_{q,\varepsilon} \colon L_{q}(\Omega) \to [0, \infty]$ be a convex and proper functional with subdifferential $\partial \mathcal{R}_{q,\varepsilon}$. Considering an element $\xi \in \partial \mathcal{R}_{q,\varepsilon}(u)$, the Bregman distance of $\mathcal{R}_{q,\varepsilon}$ at $u$ is defined by

 $D_{\xi}^{\varepsilon}(\tilde{u}, u) := \mathcal{R}_{q,\varepsilon}(\tilde{u}) - \mathcal{R}_{q,\varepsilon}(u) - \langle \xi, \tilde{u} - u \rangle_{L_{q}(\Omega)^{*} \times L_{q}(\Omega)}, \qquad (25)$

for $\tilde{u} \in L_{q}(\Omega)$, and it is only defined on the Bregman domain

 $D_{B}^{\varepsilon}(\mathcal{R}_{q,\varepsilon}) := \{ u \in \operatorname{dom}(\mathcal{R}_{q,\varepsilon}) \ : \ \partial \mathcal{R}_{q,\varepsilon}(u) \ne \emptyset \}.$

For notational simplicity, we use the usual inner product notation for the dual pairing. Since we work in Banach spaces, there should be no confusion with the notation of inner products in Hilbert spaces. Moreover, when writing $D_{\xi}^{\varepsilon}(\tilde{u}, u)$ for $u, \tilde{u} \in L_{q}(\Omega)$, we mean that there exists $\xi \in \partial \mathcal{R}_{q,\varepsilon}(u)$ such that (25) holds for this choice of $\xi$.
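In the finite-dimensional case with $q = 2$ the subdifferential is single valued, so the Bregman distance (25) is straightforward to evaluate. The sketch below (our own illustration) computes it and checks the two basic properties $D_{\xi}^{\varepsilon}(u, u) = 0$ and $D_{\xi}^{\varepsilon}(\tilde{u}, u) \ge 0$.

```python
import numpy as np

def d_eps(x, eps):
    return np.maximum(np.abs(x) - eps, 0.0)

def R2(u, eps):
    # discrete R_{2,eps}(u) = sum_i d_eps(u_i)^2
    return np.sum(d_eps(u, eps) ** 2)

def grad_R2(u, eps):
    # single-valued subdifferential for q = 2, cf. (23)
    return 2.0 * np.sign(u) * d_eps(u, eps)

def bregman(u_tilde, u, eps):
    """Bregman distance (25) of R_{2,eps} with xi = grad_R2(u, eps)."""
    xi = grad_R2(u, eps)
    return R2(u_tilde, eps) - R2(u, eps) - np.dot(xi, u_tilde - u)

rng = np.random.default_rng(2)
u = rng.normal(size=20)
u_tilde = rng.normal(size=20)
assert bregman(u, u, 0.3) == 0.0           # distance to itself vanishes
assert bregman(u_tilde, u, 0.3) >= -1e-12  # nonnegative by convexity
```

As with the classical $\ell_1$ penalty, $D_{\xi}^{\varepsilon}$ is not symmetric and vanishes on whole regions (here, whenever both arguments lie inside the tolerance band), so it is a weaker distance measure than a norm.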

The classical process for proving convergence rates requires an additional assumption on the smoothness of $F$ (a restriction of its nonlinearity), as well as a source condition (general source conditions are discussed in [18, 30]), which allows the estimation of the duality pairing appearing in the Bregman distance. Both are included in the following assumption.

###### Assumption 14 (Smoothness of F and source condition).

Assume that the following hold:

1. The operator $F$ is Gâteaux differentiable at $u^{\dagger}$ and $F'(u^{\dagger})$ denotes its Gâteaux derivative.

2. There exists a constant $\gamma > 0$ such that

 $\|F(u) - F(u^{\dagger}) - F'(u^{\dagger})(u - u^{\dagger})\|_{V} \le \gamma \, D_{\xi}^{\varepsilon}(u, u^{\dagger})$

for all $u \in \mathcal{B}_{\rho}(u^{\dagger}) \cap \mathcal{D}$, with a sufficiently large radius $\rho > 0$.

3. There exists $\omega \in V^{*}$ such that $\xi = F'(u^{\dagger})^{*} \omega$ with $\xi \in \partial \mathcal{R}_{q,\varepsilon}(u^{\dagger})$.

###### Theorem 15 (Convergence rates).

Let $1 \le q < \infty$ and $p \ge 1$. Moreover, we consider that Assumptions 4 and 14 hold. Assume noisy data $v^{\delta}$ such that $\|v - v^{\delta}\|_{V} \le \delta$ and that there exists an $\mathcal{R}$-minimizing solution $u^{\dagger}$ of (1) in the Bregman domain $D_{B}^{\varepsilon}(\mathcal{R}_{q,\varepsilon})$. For a minimizer $u_{\alpha}^{\delta}$ of (11), we prove the following estimates:

• If $p > 1$ and $\gamma \|\omega\|_{V^{*}} < 1$,

 $D_{\xi}^{\varepsilon}(u_{\alpha}^{\delta}, u^{\dagger}) \le \frac{1}{1 - \gamma \|\omega\|_{V^{*}}} \Big( \frac{\delta^{p}}{\alpha} + \delta \|\omega\|_{V^{*}} + \frac{\alpha^{p^{*} - 1} \|\omega\|_{V^{*}}^{p^{*}}}{p^{*}} \Big),$

• If $p = 1$ and $\alpha \|\omega\|_{V^{*}} < 1$,

 $D_{\xi}^{\varepsilon}(u_{\alpha}^{\delta}, u^{\dagger}) \le \frac{\delta \, (1 + \alpha \|\omega\|_{V^{*}})}{\alpha \, (1 - \gamma \|\omega\|_{V^{*}})},$

with $p^{*}$ being the conjugate of $p$ such that $1/p + 1/p^{*} = 1$.

Moreover, we have:

1. For $p > 1$ and the choice $\alpha = c \, \delta^{p-1}$ with fixed $c > 0$, it holds that $D_{\xi}^{\varepsilon}(u_{\alpha}^{\delta}, u^{\dagger}) = \mathcal{O}(\delta)$.

2. For $p = 1$ and the choice of a sufficiently small constant $\alpha$, it holds that $D_{\xi}^{\varepsilon}(u_{\alpha}^{\delta}, u^{\dagger}) = \mathcal{O}(\delta)$.

###### Proof.

We start by comparing the functional values at $u_{\alpha}^{\delta}$ and $u^{\dagger}$. From the minimizing property of $u_{\alpha}^{\delta}$, we obtain

 $\|F(u_{\alpha}^{\delta}) - v^{\delta}\|_{V}^{p} + \alpha \mathcal{R}_{q,\varepsilon}(u_{\alpha}^{\delta}) \le \|F(u^{\dagger}) - v^{\delta}\|_{V}^{p} + \alpha \mathcal{R}_{q,\varepsilon}(u^{\dagger}) \le \delta^{p} + \alpha \mathcal{R}_{q,\varepsilon}(u^{\dagger}).$

Then, by reordering and gathering terms we use the Bregman distance $D_{\xi}^{\varepsilon}(u_{\alpha}^{\delta}, u^{\dagger})$, which yields

 $\|F(u_{\alpha}^{\delta}) - v^{\delta}\|_{V}^{p} + \alpha D_{\xi}^{\varepsilon}(u_{\alpha}^{\delta}, u^{\dagger}) \le \delta^{p} - \alpha \langle \xi, u_{\alpha}^{\delta} - u^{\dagger} \rangle. \qquad (26)$

In the next step we employ the source condition (iii) of Assumption 14 for rewriting the last term, which results into

 $- \alpha \langle \xi, u_{\alpha}^{\delta} - u^{\dagger} \rangle = - \alpha \langle \omega, F'(u^{\dagger})(u_{\alpha}^{\delta} - u^{\dagger}) \rangle.$

Now, we focus on the dual pairing of the last term, for which we have

 $|\langle \omega, F'(u^{\dagger})(u_{\alpha}^{\delta} - u^{\dagger}) \rangle| \le \|\omega\|_{V^{*}} \, \|F'(u^{\dagger})(u_{\alpha}^{\delta} - u^{\dagger})\|_{V}.$

Adding and subtracting $F(u_{\alpha}^{\delta}) - F(u^{\dagger})$ inside the last term and using the triangle inequality yields

 $\|F'(u^{\dagger})(u_{\alpha}^{\delta} - u^{\dagger})\|_{V} \le \|F(u_{\alpha}^{\delta}) - F(u^{\dagger}) - F'(u^{\dagger})(u_{\alpha}^{\delta} - u^{\dagger})\|_{V} + \|F(u_{\alpha}^{\delta}) - F(u^{\dagger})\|_{V}.$

Furthermore, we use the smoothness assumption on $F$ defined in (ii) of Assumption 14 to write

 $\|F'(u^{\dagger})(u_{\alpha}^{\delta} - u^{\dagger})\|_{V} \le \gamma \, D_{\xi}^{\varepsilon}(u_{\alpha}^{\delta}, u^{\dagger}) + \|F(u_{\alpha}^{\delta}) - F(u^{\dagger})\|_{V}. \qquad (27)$

In addition, we can estimate the term $\|F(u_{\alpha}^{\delta}) - F(u^{\dagger})\|_{V}$. We add and subtract $v^{\delta}$ and use the triangle inequality to conclude

 $\|F(u_{\alpha}^{\delta}) - F(u^{\dagger})\|_{V} \le \|F(u_{\alpha}^{\delta}) - v^{\delta}\|_{V} + \|v^{\delta} - F(u^{\dagger})\|_{V} \le \|F(u_{\alpha}^{\delta}) - v^{\delta}\|_{V} + \delta. \qquad (28)$

Substituting the estimates (27), (28) into (26), we have

 $\|F(u_{\alpha}^{\delta}) - v^{\delta}\|_{V}^{p} + \alpha (1 - \gamma \|\omega\|_{V^{*}}) D_{\xi}^{\varepsilon}(u_{\alpha}^{\delta}, u^{\dagger}) \le \delta^{p} + \alpha \delta \|\omega\|_{V^{*}} + \alpha \|\omega\|_{V^{*}} \|F(u_{\alpha}^{\delta}) - v^{\delta}\|_{V}. \qquad (29)$

For $p = 1$, rearranging (29) yields

 $(1 - \alpha \|\omega\|_{V^{*}}) \|F(u_{\alpha}^{\delta}) - v^{\delta}\|_{V} + \alpha (1 - \gamma \|\omega\|_{V^{*}}) D_{\xi}^{\varepsilon}(u_{\alpha}^{\delta}, u^{\dagger}) \le \delta \, (1 + \alpha \|\omega\|_{V^{*}}).$

For $\alpha$ sufficiently small such that $\alpha \|\omega\|_{V^{*}} < 1$, the first term is nonnegative. Moreover, the second term is nonnegative by assumption, since $\gamma \|\omega\|_{V^{*}} < 1$. Therefore, we can derive the following estimates