1 Introduction
Variational methods in imaging have developed into a quite universal and flexible tool, allowing for highly successful approaches to tasks like denoising, deblurring, inpainting, segmentation, superresolution, disparity, and optical flow estimation. The overall structure of such approaches is of the form
$$ \hat u \in \operatorname*{argmin}_{u} \; D(Ku, f) + \alpha R(u), $$
where the functional $D$ is a data fidelity term depending on some input data $f$ and measuring the deviation of $Ku$ from it, and $R$ is a regularization functional. Moreover, $K$ is an (often linear) forward operator modeling the dependence of the data on an underlying image $u$, and $\alpha$ is a positive regularization parameter. While $D$ is often smooth and (strictly) convex, current practice almost exclusively uses nonsmooth regularization functionals. The majority of successful techniques use nonsmooth, convex functionals such as the total variation and generalizations thereof, cf. [28, 31, 40], or $\ell_1$-norms of coefficients arising from scalar products with some frame system, cf. [73] and references therein.
The efficient solution of such variational problems in imaging demands appropriate algorithms. Taking into account the specific structure as a sum of two very different terms to be minimized, splitting algorithms are a quite canonical choice. Consequently, this field has revived the interest in techniques like operator splittings or augmented Lagrangians. In this chapter we shall provide an overview of currently developed methods and recent results, as well as some computational studies comparing different methods and illustrating their success in applications.
We start with a very general viewpoint in the first sections, discussing basic notation as well as properties of proximal maps and of firmly nonexpansive and averaged operators, which form the basis of further convergence arguments. Then we proceed to a discussion of several state-of-the-art algorithms and their (theoretical) convergence properties. After a section on issues related to the use of analogous iterative schemes for ill-posed problems, we present some practical convergence studies in numerical examples related to PET and spectral CT reconstruction.
2 Notation
In the following we summarize the notation and definitions that will be used throughout this chapter:

$x_+ := \max\{0, x\}$, whereby the maximum operation has to be interpreted componentwise.

$\iota_C$ is the indicator function of a set $C$, given by $\iota_C(x) := 0$ for $x \in C$ and $\iota_C(x) := +\infty$ for $x \notin C$.

$\Gamma_0(\mathbb{R}^N)$ is the set of proper, convex, and lower semicontinuous functions mapping from $\mathbb{R}^N$ into the extended real numbers $(-\infty, +\infty]$.

$\operatorname{dom} f := \{x \in \mathbb{R}^N : f(x) < +\infty\}$ denotes the effective domain of $f$.

$\partial f(x_0) := \{p \in \mathbb{R}^N : f(x) \ge f(x_0) + \langle p, x - x_0 \rangle \ \forall x \in \mathbb{R}^N\}$ denotes the subdifferential of $f$ at $x_0$ and is the set consisting of the subgradients of $f$ at $x_0$. If $f$ is differentiable at $x_0$, then $\partial f(x_0) = \{\nabla f(x_0)\}$. Conversely, if $\partial f(x_0)$ contains only one element, then $f$ is differentiable at $x_0$ and this element is just the gradient of $f$ at $x_0$. By Fermat’s rule, $\hat x$ is a global minimizer of $f$ if and only if $0 \in \partial f(\hat x)$.

$f^*(p) := \sup_{x \in \mathbb{R}^N} \{\langle p, x \rangle - f(x)\}$ is the (Fenchel) conjugate of $f$. For proper $f$, we have $f^{**} = f$ if and only if $f \in \Gamma_0(\mathbb{R}^N)$. If $f$ is positively homogeneous, i.e., $f(\alpha x) = \alpha f(x)$ for all $\alpha > 0$ and all $x$, then
$$ f^*(p) = \iota_{C_f}(p), \qquad C_f := \{ p \in \mathbb{R}^N : \langle p, x \rangle \le f(x) \ \forall x \in \mathbb{R}^N \}. $$
In particular, the conjugate functions of the norms $\|\cdot\|_p$, $p \in [1, \infty]$, are given by
(1) $\qquad \big(\|\cdot\|_p\big)^* = \iota_{B_q(1)},$
where $\frac1p + \frac1q = 1$ and, as usual, $p = 1$ corresponds to $q = \infty$ and conversely, and $B_q(\lambda) := \{x \in \mathbb{R}^N : \|x\|_q \le \lambda\}$ denotes the ball of radius $\lambda$ with respect to the $\ell_q$-norm.
3 Proximal Operator
The algorithms proposed in this chapter to solve various problems in digital image analysis and restoration have in common that they basically reduce to the evaluation of a series of proximal problems. Therefore we start with this kind of problem. For a comprehensive overview of proximal algorithms we refer to [132].
3.1 Definition and Basic Properties
For $\lambda > 0$ and $f \in \Gamma_0(\mathbb{R}^N)$, the proximal operator $\operatorname{prox}_{\lambda f} : \mathbb{R}^N \to \mathbb{R}^N$ of $\lambda f$ is defined by

(2) $\qquad \operatorname{prox}_{\lambda f}(x) := \operatorname*{argmin}_{y \in \mathbb{R}^N} \left\{ \frac{1}{2\lambda} \|x - y\|_2^2 + f(y) \right\}.$

It compromises between minimizing $f$ and being near to $x$, where $\lambda$ is the trade-off parameter between these terms. The Moreau envelope or Moreau-Yosida regularization ${}^\lambda f$ is given by
$$ {}^\lambda f(x) := \min_{y \in \mathbb{R}^N} \left\{ \frac{1}{2\lambda} \|x - y\|_2^2 + f(y) \right\}. $$
A straightforward calculation shows that ${}^\lambda f = \big( f^* + \tfrac{\lambda}{2} \|\cdot\|_2^2 \big)^*$. The Moreau envelope can be considered as a smooth approximation of $f$. The following theorem ensures that the minimizer in (2) exists, is unique, and can be characterized by a variational inequality. For the proof we refer to [8].
Theorem 3.1.
Let $f \in \Gamma_0(\mathbb{R}^N)$ and $\lambda > 0$. Then,

i) For any $x \in \mathbb{R}^N$, there exists a unique minimizer $\hat x = \operatorname{prox}_{\lambda f}(x)$ of (2).

ii) $\hat x$ is a minimizer of $f$ if and only if it is a fixed point of $\operatorname{prox}_{\lambda f}$, i.e., $\hat x = \operatorname{prox}_{\lambda f}(\hat x)$.

iii) The Moreau envelope ${}^\lambda f$ is continuously differentiable with gradient
(4) $\qquad \nabla {}^\lambda f(x) = \frac{1}{\lambda} \big( x - \operatorname{prox}_{\lambda f}(x) \big).$
iv) The sets of minimizers of $f$ and ${}^\lambda f$ are the same.
Rewriting (4) as $\operatorname{prox}_{\lambda f}(x) = x - \lambda \nabla {}^\lambda f(x)$, we can interpret $\operatorname{prox}_{\lambda f}(x)$ as a gradient descent step with step size $\lambda$ for minimizing ${}^\lambda f$.
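As a small numerical illustration of (2) and the gradient formula (4), consider $f = |\cdot|$ on $\mathbb{R}$: its proximal operator is the scalar soft-thresholding function and its Moreau envelope is the Huber function. The following sketch assumes NumPy; the helper names are ours, not from the chapter.

```python
import numpy as np

def prox_abs(x, lam):
    """Proximal operator of f(y) = |y|: soft-thresholding with threshold lam."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def moreau_env_abs(x, lam):
    """Moreau envelope of |.|: min_y { (x-y)^2/(2 lam) + |y| } (the Huber function)."""
    p = prox_abs(x, lam)
    return (x - p) ** 2 / (2.0 * lam) + np.abs(p)

x, lam = 2.5, 1.0
p = prox_abs(x, lam)              # -> 1.5
# gradient formula (4): d/dx envelope = (x - prox(x)) / lam
grad = (x - p) / lam              # -> 1.0, i.e. sign(x) outside [-lam, lam]
# fixed point property (Theorem 3.1 ii): the minimizer 0 of |.| is a fixed point
assert prox_abs(0.0, lam) == 0.0
```

A finite-difference check of the envelope's derivative at $x = 2.5$ reproduces `grad`, consistent with (4).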
Example 3.2.
Theorem 3.3 (Moreau decomposition).
For $f \in \Gamma_0(\mathbb{R}^N)$ and $\lambda > 0$ the following decomposition holds:
$$ x = \operatorname{prox}_{\lambda f}(x) + \lambda \operatorname{prox}_{\frac{1}{\lambda} f^*}\!\left(\frac{x}{\lambda}\right). $$
For a proof we refer to [141, Theorem 31.5].
Remark 3.4 (Proximal operator and resolvent).
The subdifferential operator $\partial f : \mathbb{R}^N \to 2^{\mathbb{R}^N}$ is a set-valued function. For $f \in \Gamma_0(\mathbb{R}^N)$, we have by Fermat’s rule and subdifferential calculus that $\hat x = \operatorname{prox}_{\lambda f}(x)$ if and only if
$$ x \in \hat x + \lambda\, \partial f(\hat x) = (I + \lambda\, \partial f)(\hat x), $$
which implies by the uniqueness of the proximum that $\hat x = (I + \lambda\, \partial f)^{-1}(x)$. In particular, $(I + \lambda\, \partial f)^{-1}$ is a single-valued operator, which is called the resolvent of the set-valued operator $\lambda\, \partial f$. In summary, the proximal operator of $\lambda f$ coincides with the resolvent of $\lambda\, \partial f$, i.e.,
$$ \operatorname{prox}_{\lambda f} = (I + \lambda\, \partial f)^{-1}. $$
3.2 Special Proximal Operators
Algorithms involving the solution of proximal problems are only efficient if the corresponding proximal operators can be evaluated efficiently. In the following we collect frequently appearing proximal mappings in image processing. For epigraphical projections see [12, 47, 88].
3.2.1 Orthogonal Projections
The proximal operator generalizes the orthogonal projection operator. The orthogonal projection of $x \in \mathbb{R}^N$ onto a nonempty, closed, convex set $C \subseteq \mathbb{R}^N$ is given by
$$ \Pi_C(x) := \operatorname*{argmin}_{y \in C} \|x - y\|_2 $$
and can be rewritten for any $\lambda > 0$ as
$$ \Pi_C(x) = \operatorname{prox}_{\lambda \iota_C}(x). $$
Some special sets are considered next.
Affine set
$C := \{x \in \mathbb{R}^N : Ax = b\}$ with $A \in \mathbb{R}^{m \times N}$, $b \in \mathbb{R}^m$.
In case of $\operatorname*{argmin}_y \frac12 \|x - y\|_2^2$ subject to $Ay = b$, we substitute $z := y - x$, which leads to
$$ \operatorname*{argmin}_{z} \tfrac12 \|z\|_2^2 \quad \text{subject to} \quad Az = b - Ax. $$
This can be directly solved, see [20], and leads after back-substitution to
$$ \Pi_C(x) = x + A^\dagger (b - Ax), $$
where $A^\dagger$ denotes the Moore-Penrose inverse of $A$.
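The affine projection formula above can be sketched in a few lines with NumPy's pseudoinverse (the function name is ours):

```python
import numpy as np

def proj_affine(x, A, b):
    """Orthogonal projection onto {y : A y = b} via the Moore-Penrose inverse."""
    return x + np.linalg.pinv(A) @ (b - A @ x)

# hyperplane x1 + x2 = 1 in R^2
A = np.array([[1.0, 1.0]])
b = np.array([1.0])
p = proj_affine(np.array([1.0, 1.0]), A, b)   # -> [0.5, 0.5]
```

The projection is idempotent and its output satisfies the constraint $Ap = b$, which gives a quick sanity check.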
Halfspace
$C := \{x \in \mathbb{R}^N : \langle a, x \rangle \le b\}$ with $a \in \mathbb{R}^N \setminus \{0\}$, $b \in \mathbb{R}$.
A straightforward computation gives
$$ \Pi_C(x) = x - \frac{\big( \langle a, x \rangle - b \big)_+}{\|a\|_2^2}\, a. $$
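The halfspace projection translates directly into code (a NumPy sketch; the name is ours): points inside the halfspace are left untouched, points outside are moved along $a$ onto the boundary.

```python
import numpy as np

def proj_halfspace(x, a, b):
    """Orthogonal projection onto the halfspace {y : <a, y> <= b}."""
    excess = max(float(a @ x) - b, 0.0)   # zero if x already satisfies the constraint
    return x - (excess / float(a @ a)) * a

a, b = np.array([1.0, 0.0]), 0.0          # halfspace x1 <= 0
p = proj_halfspace(np.array([2.0, 3.0]), a, b)   # -> [0.0, 3.0]
```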
Box and Nonnegative Orthant
$C := [l, u] := \{x \in \mathbb{R}^N : l_k \le x_k \le u_k, \ k = 1, \dots, N\}$ with $l, u \in \mathbb{R}^N$.
The projection can be applied componentwise and gives
$$ \big( \Pi_C(x) \big)_k = \min\big\{ \max\{x_k, l_k\},\, u_k \big\}, \qquad k = 1, \dots, N. $$
For $l = 0$ and $u = +\infty$ we get the orthogonal projection onto the nonnegative orthant,
$$ \Pi_C(x) = x_+ . $$
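Both the box projection and its nonnegative-orthant special case reduce to componentwise clipping (NumPy sketch; names ours):

```python
import numpy as np

def proj_box(x, l, u):
    """Componentwise projection onto the box [l, u]."""
    return np.minimum(np.maximum(x, l), u)

def proj_nonneg(x):
    """Special case l = 0, u = +inf: projection onto the nonnegative orthant x_+."""
    return np.maximum(x, 0.0)

p = proj_box(np.array([-2.0, 0.5, 3.0]), 0.0, 1.0)   # -> [0.0, 0.5, 1.0]
```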
Probability Simplex
$C := \big\{ x \in \mathbb{R}^N : \mathbf 1^{\mathrm T} x = \sum_{k=1}^N x_k = 1, \ x \ge 0 \big\}$.
Here we have
$$ \Pi_C(x) = (x - \mu \mathbf 1)_+ , $$
where $\mu \in \mathbb{R}$ has to be determined such that $f(\mu) := \mathbf 1^{\mathrm T} (x - \mu \mathbf 1)_+ = 1$. Now $\mu$ can be found, e.g., by bisection with starting interval $[\max_k x_k - 1, \max_k x_k]$ or by a method similar to those described in Subsection 3.2.2 for projections onto the $\ell_1$-ball. Note that $f(\mu)$ is a linear spline function with knots $x_1, \dots, x_N$, so that $\mu$ is completely determined if we know the two knots neighboring $\mu$.
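The bisection variant described above can be sketched as follows (NumPy; the function name and the fixed iteration count are our choices). The starting interval works because $f(\max_k x_k - 1) \ge 1$ and $f(\max_k x_k) = 0$:

```python
import numpy as np

def proj_simplex(x, n_iter=60):
    """Projection onto the probability simplex: (x - mu)_+ with mu found by
    bisection so that the components of (x - mu)_+ sum to one."""
    lo, hi = np.max(x) - 1.0, np.max(x)       # f(lo) >= 1 >= f(hi) = 0
    for _ in range(n_iter):
        mu = 0.5 * (lo + hi)
        if np.sum(np.maximum(x - mu, 0.0)) >= 1.0:
            lo = mu                            # mu still too small
        else:
            hi = mu                            # mu too large
    return np.maximum(x - 0.5 * (lo + hi), 0.0)

p = proj_simplex(np.array([2.0, 0.0]))         # -> approx [1.0, 0.0]
```

Since $f$ is piecewise linear and monotone, the bisection converges geometrically; sort-based methods find $\mu$ exactly in $O(N \log N)$.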
3.2.2 Vector Norms
We consider the proximal operator of $f := \lambda \|\cdot\|_p$, $p \in [1, \infty]$. By the Moreau decomposition in Theorem 3.3, regarding $(\lambda \|\cdot\|_p)^* = \iota_{B_q(\lambda)}$ by (1), we obtain
$$ \operatorname{prox}_{\lambda \|\cdot\|_p}(x) = x - \Pi_{B_q(\lambda)}(x), $$
where $\frac1p + \frac1q = 1$. Thus the proximal operator can be computed simply by the projection onto the $\ell_q$-ball $B_q(\lambda)$. In particular, it follows for $p = 1, 2, \infty$:

$p = 2$:
$$ \operatorname{prox}_{\lambda \|\cdot\|_2}(x) = \begin{cases} \left( 1 - \dfrac{\lambda}{\|x\|_2} \right) x & \text{if } \|x\|_2 > \lambda, \\ 0 & \text{otherwise.} \end{cases} $$

$p = 1$: For $x \in \mathbb{R}^N$,
$$ \operatorname{prox}_{\lambda \|\cdot\|_1}(x) = S_\lambda(x), $$
where $S_\lambda(x) := \big( \operatorname{sgn}(x_k)\, (|x_k| - \lambda)_+ \big)_{k=1}^N$ denotes the componentwise soft-shrinkage with threshold $\lambda$.

$p = \infty$:
$$ \operatorname{prox}_{\lambda \|\cdot\|_\infty}(x) = x - \Pi_{B_1(\lambda)}(x), $$
where, for $\|x\|_1 > \lambda$,
$$ \Pi_{B_1(\lambda)}(x) = S_\mu(x) \quad \text{with} \quad \mu := \frac{1}{m} \left( \sum_{k=1}^{m} \tilde x_k - \lambda \right), $$
where $\tilde x_1 \ge \tilde x_2 \ge \dots \ge \tilde x_N$ are the sorted absolute values of the components of $x$ and $m$ is the largest index such that $\tilde x_m - \frac1m \big( \sum_{k=1}^m \tilde x_k - \lambda \big)$ is positive, see also [59, 62]. Another method follows similar lines as the projection onto the probability simplex in the previous subsection.
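The case $p = 1$ and the Moreau decomposition behind it can be checked numerically: the projection onto the $\ell_\infty$-ball $B_\infty(\lambda)$ is componentwise clipping, so soft-shrinkage must equal $x - \Pi_{B_\infty(\lambda)}(x)$ (NumPy sketch; names ours).

```python
import numpy as np

def soft_shrink(x, lam):
    """prox of lam * ||.||_1: componentwise soft-shrinkage S_lam."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def proj_linf_ball(x, lam):
    """Projection onto the l_inf ball of radius lam: componentwise clipping."""
    return np.clip(x, -lam, lam)

x, lam = np.array([3.0, -0.5, 1.0]), 1.0
s = soft_shrink(x, lam)                       # -> [2.0, 0.0, 0.0]
# Moreau decomposition: prox_{lam ||.||_1}(x) = x - P_{B_inf(lam)}(x)
assert np.allclose(s, x - proj_linf_ball(x, lam))
```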
Further, grouped/mixed $\ell_2$-$\ell_p$ norms are defined for $x = (x_1, \dots, x_g) \in \mathbb{R}^{gd}$ with $x_j \in \mathbb{R}^d$, $j = 1, \dots, g$, by
$$ \|x\|_{2,p} := \big\| \big( \|x_1\|_2, \dots, \|x_g\|_2 \big) \big\|_p . $$
For the $\ell_2$-$\ell_1$ norm we see that
$$ \operatorname{prox}_{\lambda \|\cdot\|_{2,1}}(x) $$
can be computed separately for each $x_j$, which results, by the above considerations for the $\ell_2$-norm, for each $j$ in
$$ \operatorname{prox}_{\lambda \|\cdot\|_{2,1}}(x) = \left( \left( 1 - \frac{\lambda}{\|x_j\|_2} \right)_+ x_j \right)_{j=1}^{g} . $$
The procedure for evaluating $\operatorname{prox}_{\lambda \|\cdot\|_{2,1}}$ is sometimes called coupled or grouped shrinkage.
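Grouped shrinkage applied block by block can be sketched as follows, with the groups stored as rows of a matrix (NumPy; the function name is ours). Groups whose $\ell_2$-norm is at most $\lambda$ are set to zero; the others are shrunk radially.

```python
import numpy as np

def grouped_shrink(x, lam):
    """prox of lam * ||.||_{2,1} for x of shape (g, d): apply
    (1 - lam/||x_j||_2)_+ * x_j to each group (row) x_j."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    scale = np.maximum(1.0 - lam / np.maximum(norms, 1e-30), 0.0)
    return scale * x

x = np.array([[3.0, 4.0],    # ||x_1||_2 = 5   -> scaled by (1 - 1/5)
              [0.1, 0.0]])   # ||x_2||_2 = 0.1 -> set to zero
out = grouped_shrink(x, 1.0)
```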
Finally, we provide the following rule from [53, Prop. 3.6].
Lemma 3.5.
Let , where is differentiable at with . Then .
3.2.3 Matrix Norms
Next we deal with proximation problems involving matrix norms. For $X \in \mathbb{R}^{m \times n}$ and $\lambda > 0$, we are looking for
(6) $\qquad \operatorname{prox}_{\lambda \|\cdot\|}(X) = \operatorname*{argmin}_{Y \in \mathbb{R}^{m \times n}} \left\{ \frac{1}{2\lambda} \|X - Y\|_F^2 + \|Y\| \right\},$
where $\|\cdot\|_F$ is the Frobenius norm and $\|\cdot\|$ is any unitarily invariant matrix norm, i.e., $\|X\| = \|U X V^{\mathrm T}\|$ for all unitary matrices $U \in \mathbb{R}^{m \times m}$, $V \in \mathbb{R}^{n \times n}$. Von Neumann (1937) [169] has characterized the unitarily invariant matrix norms as those matrix norms which can be written in the form
$$ \|X\| = g(\sigma(X)), $$
where
$\sigma(X)$ is the vector of singular values of $X$
and $g$ is a symmetric gauge function, see [175]. Recall that $g$ is a symmetric gauge function if it is a positively homogeneous convex function which vanishes at the origin and fulfills
$$ g(x) = g(\varepsilon_1 x_{k_1}, \dots, \varepsilon_n x_{k_n}) $$
for all $\varepsilon_j \in \{-1, 1\}$ and all permutations $(k_1, \dots, k_n)$ of indices. An analogous result was given by Davis [60] for symmetric matrices, where the singular value decomposition is replaced by the eigendecomposition
and the singular values by the eigenvalues.
We are interested in the Schatten-$p$ norms for $p = 1, 2, \infty$, which are defined for $X \in \mathbb{R}^{m \times n}$ and $t := \min\{m, n\}$ by
$$ \|X\|_1 := \sum_{i=1}^{t} \sigma_i(X) \ (\text{nuclear norm}), \qquad \|X\|_2 := \Big( \sum_{i=1}^{t} \sigma_i(X)^2 \Big)^{1/2} = \|X\|_F, \qquad \|X\|_\infty := \max_{i=1,\dots,t} \sigma_i(X) \ (\text{spectral norm}). $$
The following theorem shows that the solution of (6) reduces to a proximal problem for the vector norm of the singular values of $X$. Another proof for the special case of the nuclear norm can be found in [37].
Theorem 3.7.
Let
$$ X = U \Sigma V^{\mathrm T} $$
be the singular value decomposition of $X$
and $\|\cdot\|$ a unitarily invariant matrix norm. Then $\operatorname{prox}_{\lambda \|\cdot\|}(X)$ in (6) is given by $\hat X = U \hat\Sigma V^{\mathrm T}$, where the singular values $\hat\sigma := \sigma(\hat\Sigma)$ in $\hat\Sigma$ are determined by
(7) $\qquad \hat\sigma := \operatorname*{argmin}_{\sigma \in \mathbb{R}^t} \left\{ \frac{1}{2\lambda} \|\sigma(X) - \sigma\|_2^2 + g(\sigma) \right\} = \operatorname{prox}_{\lambda g}(\sigma(X))$
with the symmetric gauge function $g$ corresponding to $\|\cdot\|$.
Proof.
By Fermat’s rule we know that the solution $\hat X$ of (6) is determined by
(8) $\qquad 0 \in \hat X - X + \lambda\, \partial \|\hat X\|$
and from [175] that
(9) $\qquad \partial \|X\| = \operatorname{conv}\big\{ U D V^{\mathrm T} : X = U \Sigma V^{\mathrm T}, \ D = \operatorname{diag}(d), \ d \in \partial g(\sigma(X)) \big\}.$
We now construct the unique solution of (8). Let $\hat\sigma$ be the unique solution of (7). By Fermat’s rule, $\hat\sigma$ satisfies $0 \in \hat\sigma - \sigma(X) + \lambda\, \partial g(\hat\sigma)$, and consequently there exists $d \in \partial g(\hat\sigma)$ such that
$$ \sigma(X) = \hat\sigma + \lambda\, d. $$
By (9) we see that $\hat X := U \hat\Sigma V^{\mathrm T}$ is a solution of (8). This completes the proof. ∎
For the special matrix norms considered above, we obtain by the previous subsection
$$ \operatorname{prox}_{\lambda \|\cdot\|}(X) = U \operatorname{diag}\big( \operatorname{prox}_{\lambda \|\cdot\|_p}(\sigma(X)) \big) V^{\mathrm T}, \qquad p \in \{1, 2, \infty\}; $$
in particular, for the nuclear norm ($p = 1$) the singular values are soft-shrunken with threshold $\lambda$.
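The nuclear-norm case, often called singular value thresholding, can be sketched as follows (NumPy; the helper name `svt` is ours):

```python
import numpy as np

def svt(X, lam):
    """prox of lam * nuclear norm: soft-threshold the singular values
    (Theorem 3.7 with the gauge function g = ||.||_1)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - lam, 0.0)) @ Vt

X = np.diag([3.0, 0.5])
Y = svt(X, 1.0)              # singular values 3, 0.5 -> thresholded to 2, 0
```

For any input, the singular values of `svt(X, lam)` equal the soft-shrunken singular values of `X`, which is exactly the statement of Theorem 3.7.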
4 Fixed Point Algorithms and Averaged Operators
An operator $T : \mathbb{R}^N \to \mathbb{R}^N$ is contractive if it is Lipschitz continuous with a Lipschitz constant $L < 1$, i.e., there exists a norm $\|\cdot\|$ on $\mathbb{R}^N$ such that
$$ \|Tx - Ty\| \le L \|x - y\| \qquad \forall x, y \in \mathbb{R}^N. $$
In case $L = 1$, the operator is called nonexpansive. A function $T : \mathbb{R}^N \to \mathbb{R}^N$ is firmly nonexpansive if it fulfills for all $x, y \in \mathbb{R}^N$ one of the following equivalent conditions [12]:
(10) $\qquad \|Tx - Ty\|_2^2 \le \langle Tx - Ty, \, x - y \rangle, \qquad \|Tx - Ty\|_2^2 \le \|x - y\|_2^2 - \|(I - T)x - (I - T)y\|_2^2.$
In particular, we see by the Cauchy-Schwarz inequality that a firmly nonexpansive function is nonexpansive.
Lemma 4.1.
For $f \in \Gamma_0(\mathbb{R}^N)$ and $\lambda > 0$, the proximal operator $\operatorname{prox}_{\lambda f}$ is firmly nonexpansive. In particular, the orthogonal projection onto nonempty, closed, convex sets is firmly nonexpansive.
Proof.
The Banach fixed point theorem guarantees that a contraction $T$ has a unique fixed point and that the Picard sequence
(11) $\qquad x^{(k+1)} := T x^{(k)}$
converges to this fixed point for every initial element $x^{(0)}$. However, in many applications the contraction property is too restrictive in the sense that we often do not have a unique fixed point. Indeed, it is quite natural in many cases that the reached fixed point depends on the starting value $x^{(0)}$. Note that if $T$ is continuous and $(x^{(k)})_k$ is convergent, then it converges to a fixed point of $T$. In the following, we denote by $\operatorname{Fix}(T)$ the set of fixed points of $T$. Unfortunately, we do not have convergence of $(x^{(k)})_k$ just for nonexpansive operators, as the following example shows.
Example 4.2.
In $\mathbb{R}^2$ we consider the reflection operator
$$ R(x_1, x_2) := (x_1, -x_2). $$
Obviously, $R$ is nonexpansive and we only have convergence of $(R^k x)_k$ if $x_2 = 0$. A possibility to obtain a ’better’ operator is to average $R$, i.e., to build
$$ T := \tfrac12 (I + R). $$
By
(12) $\qquad Tx = x \;\Leftrightarrow\; \tfrac12 (x + Rx) = x \;\Leftrightarrow\; Rx = x$
we see that $T$ and $R$ have the same fixed points. Moreover, since $T(x_1, x_2) = (x_1, 0)$, the sequence $(T^k x)_k$ converges to a fixed point of $R$ for every $x \in \mathbb{R}^2$.
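The reflection example can be checked directly in a few lines (NumPy sketch): the Picard iteration of $R$ cycles with period two, while the averaged operator reaches the fixed point set in one step.

```python
import numpy as np

R = lambda x: np.array([x[0], -x[1]])   # reflection: nonexpansive, Fix R = {x2 = 0}
T = lambda x: 0.5 * (x + R(x))          # averaged operator T = (I + R)/2

x = np.array([1.0, 2.0])
# Picard iteration of R oscillates: a period-2 orbit, hence no convergence ...
assert np.allclose(R(R(x)), x)
# ... while the averaged operator maps x straight to the fixed point (x1, 0)
assert np.allclose(T(x), [1.0, 0.0])
```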
An operator $T : \mathbb{R}^N \to \mathbb{R}^N$ is called averaged if there exists a nonexpansive mapping $R$ and a constant $\alpha \in (0, 1)$ such that
$$ T = (1 - \alpha) I + \alpha R. $$
Following (12) we see that
$$ \operatorname{Fix}(T) = \operatorname{Fix}(R). $$
Historically, the concept of averaged mappings can be traced back to [106, 113, 149], where the name ’averaged’ was not used yet. Results on averaged operators can also be found, e.g., in [12, 36, 52].
Lemma 4.3 (Averaged, (Firmly) Nonexpansive and Contractive Operators).

Every averaged operator is nonexpansive.

A contractive operator $T$ with Lipschitz constant $L < 1$ is averaged with respect to all parameters $\alpha \in \left[ \frac{L+1}{2}, 1 \right)$.

An operator is firmly nonexpansive if and only if it is averaged with $\alpha = \frac12$.
Proof.
i) Let $T = (1 - \alpha) I + \alpha R$ be averaged. Then the first assertion follows by
$$ \|Tx - Ty\| \le (1 - \alpha) \|x - y\| + \alpha \|Rx - Ry\| \le \|x - y\|. $$
ii) We define the operator $R := \frac{1}{\alpha} \big( T - (1 - \alpha) I \big)$. It holds for all $x, y \in \mathbb{R}^N$ that
$$ \|Rx - Ry\| \le \frac{1}{\alpha} \big( \|Tx - Ty\| + (1 - \alpha) \|x - y\| \big) \le \frac{L + 1 - \alpha}{\alpha} \|x - y\|, $$
so $R$ is nonexpansive if $\alpha \ge \frac{L+1}{2}$.
iii)
With
$$ R := 2T - I, \quad \text{i.e.,} \quad T = \tfrac12 (I + R), $$
we obtain the following equalities
$$ \|Rx - Ry\|_2^2 = \|2(Tx - Ty) - (x - y)\|_2^2 = 4 \|Tx - Ty\|_2^2 - 4 \langle Tx - Ty, x - y \rangle + \|x - y\|_2^2 $$
and therefore after reordering
$$ \|Rx - Ry\|_2^2 - \|x - y\|_2^2 = 4 \big( \|Tx - Ty\|_2^2 - \langle Tx - Ty, x - y \rangle \big). $$
If $R$ is nonexpansive, then the last expression is $\le 0$ and consequently (10) holds true, so that $T$ is firmly nonexpansive. Conversely, if $T$ fulfills (10), then
$$ \|Rx - Ry\|_2^2 \le \|x - y\|_2^2, $$
so that $R$ is nonexpansive. This completes the proof. ∎
By the following lemma averaged operators are closed under composition.
Lemma 4.4 (Composition of Averaged Operators).

Suppose that $T$ is averaged with respect to $\alpha \in (0, 1)$. Then, it is also averaged with respect to any other parameter $\tilde\alpha \in (\alpha, 1)$.

Let $T_1, T_2 : \mathbb{R}^N \to \mathbb{R}^N$ be averaged operators. Then, $T_2 \circ T_1$ is also averaged.
Proof.
i) By assumption, $T = (1 - \alpha) I + \alpha R$ with $R$ nonexpansive. We have
$$ T = (1 - \tilde\alpha) I + \tilde\alpha \left( \left( 1 - \frac{\alpha}{\tilde\alpha} \right) I + \frac{\alpha}{\tilde\alpha} R \right) =: (1 - \tilde\alpha) I + \tilde\alpha \tilde R, $$
and for all $x, y \in \mathbb{R}^N$ it holds that
$$ \|\tilde R x - \tilde R y\| \le \left( 1 - \frac{\alpha}{\tilde\alpha} \right) \|x - y\| + \frac{\alpha}{\tilde\alpha} \|R x - R y\| \le \|x - y\|. $$
So, $\tilde R$ is nonexpansive.
ii) By assumption there exist nonexpansive operators $R_1, R_2$
and $\alpha_1, \alpha_2 \in (0, 1)$ such that
$$ T_2 \circ T_1 = (1 - \alpha_1)(1 - \alpha_2)\, I + (1 - \alpha_2) \alpha_1\, R_1 + \alpha_2\, R_2 \circ T_1 = (1 - \alpha) I + \alpha R $$
with $\alpha := \alpha_1 + \alpha_2 - \alpha_1 \alpha_2 \in (0, 1)$ and $R := \frac{1}{\alpha} \big( (1 - \alpha_2) \alpha_1 R_1 + \alpha_2\, R_2 \circ T_1 \big)$. The concatenation of two nonexpansive operators is nonexpansive. Finally, the convex combination $R$ of two nonexpansive operators is nonexpansive, so that $T_2 \circ T_1$ is indeed averaged. ∎
An operator $T : \mathbb{R}^N \to \mathbb{R}^N$ is called asymptotically regular if it holds for all $x \in \mathbb{R}^N$ that
$$ \lim_{k \to \infty} \big( T^{k+1} x - T^k x \big) = 0. $$
Note that this property does not imply convergence; even boundedness cannot be guaranteed. As an example consider the partial sums of the harmonic series, $x^{(k)} := \sum_{j=1}^{k} \frac{1}{j}$: the successive differences $\frac{1}{k+1}$ tend to zero while the sequence itself diverges.
Theorem 4.5 (Asymptotic Regularity of Averaged Operators).
Let $T = (1 - \alpha) I + \alpha R$ be an averaged operator with respect to the nonexpansive mapping $R$ and the parameter $\alpha \in (0, 1)$. Assume that $\operatorname{Fix}(T) \neq \emptyset$. Then, $T$ is asymptotically regular.
Proof.
Let $\hat x \in \operatorname{Fix}(T)$ and $x^{(k+1)} := T x^{(k)}$ for some starting element $x^{(0)}$. Since $T$ is nonexpansive, i.e., $\|x^{(k+1)} - \hat x\| \le \|x^{(k)} - \hat x\|$, we obtain that the sequence
(13) $\qquad \big( \|x^{(k)} - \hat x\| \big)_k$
is decreasing and bounded below, hence convergent. Using the identity $\|(1 - \alpha) a + \alpha b\|_2^2 = (1 - \alpha) \|a\|_2^2 + \alpha \|b\|_2^2 - \alpha (1 - \alpha) \|a - b\|_2^2$ it follows