1 Introduction
Nonlinear diffusion processes are well known and widely used for image noise removal. Roughly speaking, the idea is to combine effective noise reduction by diffusion with the preservation of edges and other important image features. Amongst the better-known approaches are those where the diffusion coefficient depends inversely on the image gradient [15, 18].
The use of nonlinear complex diffusion filters (NCDF), where the image is represented by a complex function and the filtering process is governed by a diffusion equation with a complex-valued diffusion coefficient, was investigated in [12]. Those filters, which bring the advantage of using the imaginary part of the solution as an edge detector, avoiding the computation of the gradient to control the diffusion coefficient, can be successfully applied to denoising, in particular to despeckling of medical images [6, 16].
A complex diffusion equation can be written as a cross-diffusion system for the real and imaginary parts. The use of general cross-diffusion systems for image processing, which encompass the complex-diffusion equations as particular cases, was investigated in [2]. Theoretical studies of the corresponding initial-boundary value problems were presented in [1] and [2] for the linear and nonlinear cases, respectively, where well-posedness and some scale-space representation properties were derived under hypotheses on the cross-diffusion coefficient matrix. Beyond these conditions, however, there is little insight into the form that these coefficients should take in practical applications. In [3], the same authors applied the cross-diffusion model to several examples related to the problem of reducing speckle noise in optical coherence tomography (OCT) images. The conclusion is that the performance of the model is greatly influenced by the choice of the cross-diffusion coefficient matrix.
All these methods based on diffusion filtering depend strongly on some crucial parameters (e.g. [2, 17]). The parameters that lead to the most effective methods vary depending on the acquisition methods, the nature of the images and the associated noise profile. Furthermore, diffusion filters require the specification of a stopping time. To circumvent this issue, a reaction term can be introduced that keeps the steady-state solution close to the original image [18].
The challenge lies in the formulation of robust models with optimal parameters corresponding to each architecture. The use of neural networks for image processing tasks has set the pace of current research in this area. However, the intrinsic “black box” nature of such algorithms is a drawback, particularly in the context of medical applications. In [10] the authors use a learning framework inspired by nonlinear reaction-diffusion processes where the filters and the influence functions are learned simultaneously from training data. The results are promising. There are, however, some important open questions: the numerical solution is not related to the original reaction-diffusion problem and, more importantly, the qualitative properties of the computed solution are not derived. Moreover, the method does not cover promising models such as those based on cross-diffusion. We propose the use of deep learning techniques such as backpropagation [13] and the Adam algorithm [14] to optimize the parameters of a system of PDEs for a specific task. In this work, we focus on nonlinear cross-diffusion with reaction filters in order to optimize the reaction parameter and the influence functions of the cross-diffusion matrix for the removal of Gaussian noise.
The optimization process we suggest accounts for the stability of the underlying numerical method, which is crucial for obtaining successful results. We first direct our attention to synthetic images in order to develop our learning strategy. The methodology consists of the parametrization of the functions that govern the cross-diffusion system, the computation of the gradient of a cost function with respect to these parameters, and the application of the backpropagation technique, widely used in supervised learning of neural networks, to learn the optimal parameters. Experiments on real noisy images show that the training process significantly improves the performance of the filters.
Our contributions can be summarized as follows:

we derive the stability conditions for a broad class of cross-diffusion models;

we build a recurrent neural network equivalent to a cross-diffusion system, whose neurons are the points of the spatial discretization and whose weights are the nonlinear diffusion coefficients;

we obtain improved stable cross-diffusion models for image denoising through backpropagation.
The article is organized as follows. In Section 2 the fully discrete cross-diffusion-with-reaction model is presented. In Section 3 the stability conditions are derived. Section 4 is devoted to the learning model. We present the numerical experiments in Section 5. We end the paper with a section dedicated to conclusions.
2 Cross-diffusion reaction model
In this section we present a fully discrete cross-diffusion reaction model for image restoration. The image is represented by a two-component vector field, , and the restoration process is governed by the nonlinear cross-diffusion reaction system
(1) 
where is the domain of interest, and are the given initial conditions for and , and denotes the outward normal vector to the boundary . The cross-diffusion matrix of the model is given by
(2) 
is a time-dependent nonnegative parameter.
The domain is discretized by the points , where
for two given integers , and . This spatial mesh on is denoted by and . Points halfway between two adjacent grid points are denoted by , , where is the canonical basis, that is, is the standard basis unit vector in the th direction.
For the discretization in time, we consider a mesh with time step , , where .
We denote by the value of a mesh function at the point . For the formulation of the finite difference approximations, we use the centered finite difference quotients in the th spatial direction, for ,
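Since the difference-quotient formulas above are stated only symbolically here, the following minimal NumPy sketch (with hypothetical names, not the paper's notation) illustrates the kind of first-order quotient, centred at the half-grid points, that such schemes use:

```python
import numpy as np

def half_point_diff(u, h, axis=0):
    """First-order difference quotient (u_{i+1} - u_i) / h along `axis`;
    the result is naturally indexed by the half-grid points i + 1/2."""
    return np.diff(u, axis=axis) / h
```

Composing two such quotients (one forward, one backward) produces the usual three-point second-difference stencil in the interior of the mesh.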
In order to formulate the discrete cross-diffusion restoration problem, let be a discrete real-valued function standing for the grey-level values on of the noisy image to be restored. From , an initial distribution , for the cross-diffusion, is required. It is given by two real-valued functions , which can be selected following different criteria. A simple choice, which will be considered in the experiments presented later in this paper, consists of taking and , . A more detailed discussion of the initial data can be found in [1, 2].
Let , such that . Given the initial solution , the numerical solution of (1) at the time is obtained considering the following finite difference scheme
(3)  
(4) 
where
and . If then (3) is an explicit method. If then method (3) is semi-implicit and its solution is obtained by solving a system of linear equations. In the next section we derive the stability properties of both the explicit and semi-implicit methods. Later, in Sections 4.2 and 4.3, we discuss their roles in our learning strategy: to optimize the cross-diffusion model we will use the explicit scheme ; to denoise the images we will use the semi-implicit discretization of the optimized model.
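Because the discrete system is displayed only symbolically above, the distinction between the explicit and semi-implicit variants can be illustrated with a generic theta-weighted step for a linear(ized) evolution u' = A u; the operator `A` here is a hypothetical stand-in, not the actual cross-diffusion operator of (3)-(4):

```python
import numpy as np

def theta_step(u, A, dt, theta=0.5):
    """One time step of u' = A u with a theta-weighted scheme:
    theta = 0 is the explicit method (a matrix-vector product only);
    theta > 0 is semi-implicit (one linear solve per time step)."""
    n = u.shape[0]
    if theta == 0.0:
        return u + dt * (A @ u)
    rhs = u + (1.0 - theta) * dt * (A @ u)
    return np.linalg.solve(np.eye(n) - theta * dt * A, rhs)
```

The explicit branch is cheap per step but, as shown in Section 3, carries a much more restrictive stability condition than the semi-implicit branch.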
Equations (3) and (4), for such that , involve points of the form and , which do not belong to and are not yet defined. For those points and , we take the solution values in and , respectively. This corresponds to the usual discretization of the homogeneous Neumann boundary conditions on with central finite differences.
3 Stability of the numerical scheme
We now investigate the stability of the finite difference scheme (3)–(4). The approach generalizes the strategies used in [4] to study the stability of the particular case of complex-diffusion equations written as a cross-diffusion system.
For each , we define the rectangle and denote by the measure of . We consider the discrete inner products
and
The corresponding norms are denoted by , and , respectively. For we define .
To simplify the notation and where it is clear from the context, we write in what follows instead of or , for
Multiplying both sides of equations (3) and (4) by and , respectively, according to the discrete inner product , and using summation by parts, we get
We can write
and then
(5)  
For any we have that
(6)  
3.1 Stability of the semi-implicit scheme
We start by considering the case where . With the assumption that
(7)  
which corresponds to the natural assumption that the diffusion matrix is uniformly positive definite, from (5) and (6) we get
Assuming that
(8) 
for some , where is an arbitrarily chosen constant, we get
If (8) holds, then using Duhamel's principle ([9], Lemma 4.1 in Appendix B) we get
and we conclude that the method is stable.
3.2 Stability of the explicit scheme
We now consider the case where . Applying the norm to both sides of equations (3) and (4), and using the inequalities and for , we obtain, respectively,
(9)  
for any and
(10) 
Let . In order to obtain a stable scheme we impose that
(11)  
for some .
If (11) holds, then
(12)  
We now take and . From (12) and using Duhamel's principle we get
and we conclude that the method is stable.
Note that we can rewrite (11) as
(13) 
where is a square matrix of dimension . In order for (13) to hold we require
to be positive semidefinite, that is, the eigenvalues of
must all be nonnegative. Using Gershgorin's theorem, a sufficient condition for the method to be stable is to impose that the influence functions satisfy, for some ,
(14)  
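The Gershgorin argument used here can be made concrete with a small sketch: a generic sufficient test that all eigenvalues of a matrix are nonnegative, via Gershgorin discs. This is a generic check, not the specific condition (14) on the influence functions:

```python
import numpy as np

def gershgorin_nonnegative(M):
    """Sufficient (not necessary) check that all eigenvalues of M are
    nonnegative: every Gershgorin disc, centred at M[i, i] with radius
    equal to the sum of |M[i, j]| over j != i, must lie in [0, inf)."""
    d = np.diag(M)
    radii = np.abs(M).sum(axis=1) - np.abs(d)
    return bool(np.all(d - radii >= 0.0))
```

Because the test is only sufficient, it may reject matrices that are in fact positive semidefinite; this conservatism is exactly what makes it usable as an a priori stability constraint.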
4 Learning model
Our goal is to optimize the parameters of our model using deep learning techniques. In order to adapt the cross-diffusion scheme (3)–(4) to a neural network architecture, we concatenate and into a single column vector with entries:
The th iteration of the scheme (3)–(4) can now be written in the vector formulation
(15)  
with , and where the first difference matrices () are block matrices:
and denote the backward difference operators with respect to and , respectively. The matrices and are diagonal matrices. Each entry of and depends on the function for and on the function for , at time . The entries of and follow the same pattern, with and replaced by and , respectively. It remains to define the difference matrices and :
where and denote the forward difference operators with respect to and , respectively.
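As a concrete illustration of the one-dimensional building blocks (the actual block matrices above also encode the cross-coupling between the two components), backward and forward difference matrices with the Neumann convention of Section 2 might be assembled as follows; the function names are assumptions:

```python
import numpy as np

def backward_diff_matrix(n, h):
    """(D^- u)_i = (u_i - u_{i-1}) / h; the first row is zeroed,
    reflecting u_0 = u_{-1} under homogeneous Neumann conditions."""
    D = (np.eye(n) - np.eye(n, k=-1)) / h
    D[0, :] = 0.0
    return D

def forward_diff_matrix(n, h):
    """(D^+ u)_i = (u_{i+1} - u_i) / h; the last row is zeroed,
    reflecting u_n = u_{n-1} under homogeneous Neumann conditions."""
    D = (np.eye(n, k=1) - np.eye(n)) / h
    D[-1, :] = 0.0
    return D
```

Applying the backward matrix after the forward one reproduces the standard three-point second-difference stencil at interior points, which is the divergence-form structure the scheme relies on.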
4.1 Parameterization of the influence functions
An important aspect of our learning strategy is the parametrization of arbitrary influence functions. We parameterize the influence functions of the cross-diffusion matrix (2) through Gaussian radial basis functions (RBFs) [8]. Here we consider that they depend only on the second component of the vector , i.e., on the component which plays the role of edge detector. Each of these functions has the expression
(16) 
where , for , are equidistant points in the range of the edge-detector values, is a positive scalar (the so-called scale of the RBF), and are, for each , the interpolants of . A good approximation requires a careful balance between and .
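A Gaussian RBF expansion in the spirit of (16) can be sketched as follows; the function name and the exact normalisation are assumptions, since the displayed formula is elided above:

```python
import numpy as np

def rbf_influence(v, centers, weights, sigma):
    """Weighted sum of Gaussian bumps at equidistant centres:
    g(v) = sum_j weights[j] * exp(-(v - centers[j])^2 / (2 * sigma^2))."""
    v = np.asarray(v, dtype=float)
    phi = np.exp(-(v[..., None] - centers) ** 2 / (2.0 * sigma ** 2))
    return phi @ weights
```

In the learning model the weights are the trainable parameters, while the scale sigma controls the trade-off between smoothness of the fitted influence function and locality of each basis function.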
In the learning model, we consider in (15) , where is a positive scalar.
4.2 Constrained optimization
In order to optimize the influence functions of the cross-diffusion matrix, and , and also the scalar which characterizes the reaction term, a training cost function and a training dataset are required. We will use a set of grayscale images which will serve as the basis for our training set. We fix the time step and a number of steps , which results in a cross-diffusion stopping time of . From the learning viewpoint, we will have a layered convolutional neural network. We aim to minimize the loss function
(17) 
with , , where and are, respectively, the th iterate of (15) for the th corrupted image and the noncorrupted image, is the learning batch size, and is the set of parameters we want to optimize. As such, the set of parameters is
and we must now obtain to solve the minimization problem.
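The batch loss (17) amounts to an average squared restoration error over the training batch; a minimal sketch, assuming the final iterates and the clean images are stored as NumPy arrays, could read:

```python
import numpy as np

def batch_loss(restored, clean):
    """Mean over the batch of the squared l2 distance between the final
    iterate of each corrupted image and its noncorrupted counterpart."""
    M = len(restored)  # learning batch size
    return sum(np.sum((u - g) ** 2) for u, g in zip(restored, clean)) / M
```

The gradient of this loss with respect to the parameters is then propagated backwards through the unrolled iterations of (15), exactly as in standard recurrent-network training.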
We need to guarantee that the learning procedure yields a stable scheme. For that, we make use of the stability conditions derived in Section 3. Although the use of the explicit method (15) with can bring some advantages in terms of computational effort, its stability conditions are much more restrictive than those of the semi-implicit method (15) with . Moreover, since we need to impose the stability conditions on the values of , the nonlinearities in the stability condition (14) corresponding to the explicit scheme raise some issues that we want to avoid here. For that reason, our goal is to learn a semi-implicit scheme for image denoising.
To obtain a stable semi-implicit scheme, instead of minimizing the loss function (17) we seek the solution of the constrained optimization problem
(18a)  
s.t.  (18b)  
(18c) 
These constraints ensure that (7) holds. We first note that (18b)–(18c) are not in the standard formulation over the values of . As the radial basis functions (16) are strictly positive, we replace (18b)–(18c) with
(19a)  
(19b)  
(19c)  
(19d) 
using the fact that (19a)–(19d) implies (18b)–(18c). To solve the inequality-constrained problem we define the augmented Lagrangian as
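Since the displayed definition is elided here, the following generic sketch shows a standard augmented-Lagrangian form (Rockafellar's formulation) for inequality constraints of the type c_i >= 0, with multipliers lam and penalty parameter mu; the names, and the choice of this particular formulation, are assumptions rather than the paper's exact definition:

```python
import numpy as np

def augmented_lagrangian(f_val, c, lam, mu):
    """Rockafellar's augmented Lagrangian for constraints c_i >= 0:
    f + (1 / (2 * mu)) * sum_i ( max(0, lam_i - mu * c_i)^2 - lam_i^2 ).
    Satisfied constraints with zero multipliers contribute nothing;
    violated constraints are penalized quadratically."""
    c = np.asarray(c, dtype=float)
    lam = np.asarray(lam, dtype=float)
    t = np.maximum(0.0, lam - mu * c)
    return f_val + np.sum(t ** 2 - lam ** 2) / (2.0 * mu)
```

In practice one alternates: minimize this function in the parameters (e.g. with Adam), then update the multipliers and, if needed, increase mu.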