Nonlinear diffusion filters have proven to be effective for multiscale image processing in the past two decades. The most influential nonlinear image diffusion model was proposed by Perona and Malik (PM) 
where is a non-increasing edge diffusivity function. The PM formulation can produce visually pleasing image denoising results and preserve the image structure. However, noise cannot be removed at the edges/lines of the image. Weickert  thus replaced the edge diffusivity function with an anisotropic diffusion tensor T and proposed the following tensor-based diffusion formulation
This filter can reduce noise along edges and simultaneously preserve image structure. Compared with the pure edge-based descriptors, tensor can measure the image geometry in the neighbourhood of each image pixel, thus offering a richer description for an image. This makes it more appealing to use in various image processing techniques.
Technically, an image diffusion model is a partial differential equation (PDE), and minimising an energy functional of an image with the variational framework normally results in a PDE. Various image diffusion models can be thereby derived directly from the variational framework. For instance, if, the PM model (1.1) can be derived from the total variation energy functional using the gradient descent optimisation method. The variational approach has several advantages over the PDE/diffusion-based method , including easy integration of constraints imposed by the problems and employing powerful modern optimisation techniques such as primal-dual , fast iterative shrinkage-thresholding algorithm , and alternating direction method of multipliers .
Several works have investigated the tensor-based variational approach. Krajsek and Scharr  developed a linear anisotropic regularisation term that forms the basis of a tensor-valued energy functional for image denoising. Grasmair and Lenzen [8, 9] penalised image variation by introducing a diffusion tensor that depends on the structure tensor of the image. Roussous and Maragos 
considered a functional that utilises only the eigenvalues of the structure tensor. Freddie et al. proposed an extended anisotropic diffusion model, which is a tensor-based variational formulation, for colour image denoising. They discarded the Gaussian convolution when computing the structure tensor. Hence, the Euler-equation of the functional can be elegantly derived. They later introduced a tensor-based functional named the gradient energy total variation 
, which allows them to consider both eigenvalues and eigenvectors of the gradient energy tensor. Furthermore, Lefkimmiatis et al. considered the Schatten-norm of the structure tensor eigenvalues, which is similar to Roussous and Maragos’ work .
However, these existing works mentioned above only consider the standard first order total variation (FOTV) energy. A drawback of the FOTV model is that it favours piecewise-constant solutions. Thus, it can create strong staircase artefacts in the smooth regions of the restored image. Another drawback of the FOTV model is its use of gradient magnitude to penalise image variations at pixel locations x that relies only on x without taking into account the pixel neighbourhood. As such, the FOTV has difficulties inpainting images with large gaps. High order variational models have been widely applied to remedy these side effects. Among these is the second order total variation (SOTV) model [13, 14, 15, 16, 17]. Unlike the high order variational models, such as the Gaussian curvature , mean curvature , Euler’s elastica  etc., the SOTV is a convex high order extension of the FOTV, which guarantees a global solution. The SOTV is also more efficient to implement  than the total generalised variation (TGV) . However, the inpainting results of the model highly depend on the geometry of the inpainting region, and it also tends to blur the inpainted area [15, 16]. For image denoising, the model also tends to blur object edges due to the fact that it imposes too much regularity on the image.
In this paper we propose a tensor-weighted second order (TWSO) variational model for image inpainting and denoising. A novel regulariser has been developed for the TWSO that uses the Frobenius norm of the product of the SOTV Hessian matrix and an anisotropic tensor, so that the TWSO can eliminate the staircase effect whilst preserve the sharp edges/boundaries of objects in the denoised image. For image inpainting problems, the TWSO model is able to connect large gaps regardless of the geometry of the inpainting region and it would not introduce much blur to the inpainted image. As the proposed TWSO model is based on the variational framework, the ADMM can be adapted to solve the model efficiently. Extensive numerical results show that the new TWSO model outperforms the state-of-the-art approaches for both image inpainting and denoising.
The contributions of the paper are twofold: 1) a new second order variational image reconstruction model is proposed. To the best of our knowledge, this is the first time the Frobenius norm of the product of the Hessian matrix and a tensor has been used as a regulariser for variational image denoising and inpainting; 2) A fast ADMM algorithm is developed for image reconstruction based on a forward-backward finite difference scheme.
The rest of paper is organised as follows: Section 2 introduces the proposed TWSO model and the anisotropic tensor T; Section 3 presents the discretisation of the differential operators used for ADMM based on a forward-backfard finite difference scheme; Section 4 describes the ADMM for solving the variational model efficiently. Section 5 gives details of the experiments using the proposed TWSO model and the state-of-the-art approaches for image inpainting and denoising. Section 6 concludes the paper.
2 The TWSO Model for Image Processing
2.1 The TWSO model
where is a regularisation parameter. is the second order Hessian matrix of the form
and in (2.1) is the Frobenius norm of the Hessian matrix (2.2). This high order model (2.1) is able to remove the staircase artefact associated with the FOTV for image denoising, but it can blur object edges in the image. For inpainting, as investigated in [15, 16], though the SOTV has the ability to connect large gaps in the image, such ability depends on the geometry of the inpainting region, and it can blur the inpainted image. In order to remedy these side effects in both image impaiting and denoising, we propose a more flexible and generalised variational model, i.e., the Tensor-Weighted Second Order (TWSO) model, that can take advantages of both the tensor and the second order derivative. Specifically, the TWSO model is defined as
where denotes the and data fidelity terms (i.e. and ), and is a subset of (i.e. ). For image processing applications, the is normally a rectangle domain. For image inpainting, is the given image in , where is the inpainting region with missing or degraded information. The values of on the boundaries of need to be propagated into the inpainting region via minimisation of the weighted regularisation term of the TWSO model. For image denoising, is the noisy image and . In this case, the choice of depends on the type of noise found in , e.g. for Gaussian noise while for impulsive noise.
In the regularisation term of (2.3), T is a symmetric positive semi-definite 22 diffusion tensor whose four components are , , , and . They are computed from the input image . It is worth pointing out that the regularisation term is the Frobenius norm of a 2 by 2 tensor T multiplied by a 2 by 2 Hessian matrix , which has the form of
where the two orthogonal eigenvectors of T span the rotated coordinate system in which the gradient of the input image is computed and as such T can introduce orientations to the bounded Hessian regulariser . The eigenvalues of the tensor measure the degree of anisotropy in the regulariser and weight the four second order derivatives in in the two directions given by the eigenvectors of the structure tensor introduced in . As a result, the new tensor-weighted regulariser is powerful for image image inpainting and denoising, as illustrated in the experimental section. In the next section, we shall introduce the derivation of the tensor T.
2.2 Tensor Estimation
In , the author defined the structure tensor of an image
is a Gaussian kernel whose standard deviation isand is the smoothed version of the gradient convolved by . The use of as a structure descriptor is to make insensitive to noise but sensitive to change in orientation. The structure tensor is positive semi-definite and has two orthonormal eigenvectors (in the direction of gradient) and (in the direction of the isolevel lines). The corresponding eigenvalues and can be calculated from
where , and are the components of . They are given as
The eigenvalues of describe the -averaged contrast in the eigendirections, meaning: if , the image is in homogeneous area; if , it is on a straight line; and if , it is at objects’ corner. Based on the eigenvalues, we can define the following local structural coherence quantity
This quantity is large for linear structures and small for homogeneous areas in the image. With the derived structure tensor (2.5) we define a new tensor whose eigenvectors are parallel to the ones of and its eigenvalues and are chosen depending on different image processing applications. For denoising problems, we need to prohibit the diffusion across image edges and encourage strong diffusion along edges. We therefore consider the following two diffusion coefficients for the eigenvalues
with the gradient magnitude and the contrast parameter. For inpainting problems, we want to preserve linear structures and thus regularisation along isophotes of the image is appropriate. We therefore consider the weights that Weickert  used for enhancing the coherence of linear structures. With , being the eigenvalues of as before, we define
where , . The constant determines how steep the exponential function is. The structure threshold affects how the approach interprets local structures. The larger the parameter value is, the more coherent the method will be. With these eigenvalues, the regularisation is stronger in the neighbourhood of coherent structures (note that determines the radius of neighbourhood) and weaker in homogeneous areas, at corners, and in incoherent areas of the image.
3 Discretisation of differential operators
In order to implement ADMM for the proposed TWSO model, it is necessary to discretise the derivatives involved. Let denote the two dimensional grey scale image space of size MN, and and denote the coordinates along image column and row direction respectively. The discrete second order derivatives of at point along and directions can be then written as
In (3.1)-(3.5), we assume that satisfies the periodic boundary condition so that the fast Fourier transform (FFT) solver can be applied to solve (2.3) analytically. The numerical approximation of the second order divergence operator is based on the following expansion
where P is a 22 matrix whose four components are , , and , respectively. For more detailed description of the discretisation of other high order differential operators, we refer the reader to [21, 23].
Finally, we address the implementation problem of the first order derivatives of in (2.7). Since the differentiation and convolution are commutative, we can take the derivative and smooth the image in either order. In this sense, we have
Alternatively, the central finite difference scheme can be used to approximate and to satisfy the rotation-invariant property. Once all necessary discretisation is done, the numerical computation can be implemented.
4 Numerical Optimisation Algorithm
It is nontrivial to directly solve the TWSO model due to the facts: i) it is nonsmooth; ii) it couples the tensor T and Hessian matrix in as shown in (2.4), making the resulting high order Euler-Lagrange equation drastically difficult to discretise to solve computationally. To address these two difficulties, we present an efficient numerical algorithm based on ADMM to minimise the variational model (2.3).
4.1 Alternating direction method of multipliers (ADMM)
ADMM is combines the decomposability of the dual ascent with superior convergence properties of the method of multipliers. Recent research  unveils that ADMM is also closely related to Douglas-Rachford splitting, Spingarn’s method of partial inverses, Dykstra’s alternating projections, split Bregman iterative algorithm etc. Given a constrained optimisation problem
where , , , , , and both and are assumed to be convex. The augmented Lagrange function of problem (4.1) can be written as
where is an augmented Lagrangian multiplier and is an augmented penalty parameter. At the th iteration, ADMM attempts to solve problem (4.2) by iteratively minimising with respect to and , and updating accordingly. The resulting optimisation procedure is summarised in Algorithm 1.
|Algorithm 1: ADMM|
|1: Initialization: Set , and .|
|2: while a stopping criterion is not satisfied do|
|6: end while|
4.2 Application of ADMM to solve the TWSO model
We now use ADMM to solve the minimisation problem of the proposed TWSO model (2.3). The basic idea of ADMM is to first split the original nonsmooth minimisation problem into several subproblems by introducing some auxiliary variables, and then solve each subproblem separately. This numerical algorithm benefits from both solution stability and fast convergence.
In order to implement ADMM, one scalar auxiliary variable and two 22 matrix-valued auxiliary variables W and V are introduced to reformulate (2.3) into the following constraint optimisation problem
where , , and . The constraints and are applied to handle the non-smoothness of the -norm of the data fidelity term () and the weighted regularisation term respectively, whilst decouples T and in the TWSO. To guarantee an optimal solution, the above constrained problem (4.3) can be solved through ADMM summarised in Algorithm 1. Let be the augmented Lagrange functional of (4.3), which is defined as follows
where , d and b are the augmented Lagrangian multipliers, and , and are positive penalty constants controlling the weights of the penalty terms.
We will now decompose the optimisation problem (4.4) into four subproblems with respect to , , W and V, and then update the Lagrangian multipliers , d and b until the optimal solution is found and the process converges.
1) subproblem: This problem can be solved by considering the following minimisation problem
The solution of (4.5) depends on the choice of . Given the domain for image denoising or inpainting, the closed-form formulae for the minimisers under different conditions are
where . and symbols denote the component-wise multiplication and signum function, respectively.
2) subproblem: We then solve -subproblem by minimising the following problem.
whose closed-form can be obtained using the following FFT under the assumption of the circulant boundary condition (Note that to benefit from the fast FFT solver for image inpainting problems, the introduction of is compulsory due to the fact that )
where and respectively denote the discrete Fourier transform and inverse Fourier transform; is a second order divergence operator whose discrete form is defined in (3.6); “—” stands for the pointwise division of matrices. The values of the coefficient matrix equal , where and respectively stand for the image width and height, and and are the frequencies in the frequency domain. Note that in addition to FFT, AOS and Gauss-Seidel iteration can be applied to minimise the problem (4.7) with very low cost.
3) subproblem: We now solve the W-subproblem . Note that the unknown matrix-valued variable W is componentwise separable, which can be effectively solved through the analytical shrinkage operation, also known as the soft generalised thresholding equation
whose solution is given by
with the convention that .
4) The subproblem: Given fixed , , , , the solution of the V-subproblem is equivalent to solving the following least-square optimisation problem
which results in the following linear system with respect to each component in the variable
where we define , , ; , , and . Note that the closed-forms of and can be calculated from the first two equations in (4.10), whilst the last two equations lead to the analytical solutions of and .
5) , d and b update: At each iteration, we update the augmented Lagrangian multiplies , d and b as shown from Step 09 to 11 in Algorithm 2.
In summary, an ADMM-based iterative algorithm was developed to decompose the original nonsmooth minimisation problem into four simple subproblems, each of which has a closed-form solution or can be efficiently solved using the existing numerical methods (i.e. FFT and shrinkage). The overall numerical optimisation algorithm of our proposed method for image restoration can be summarised in .
|Algorithm 2: ADMM for the proposed TWSO model for image processing|
|01: Input: , , , and .|
|02: Initialise: , .|
|03: Calculate tensor T using eigenvalues defined in (2.9) or (2.10).|
|04: while some stopping criterion is not satisfied do|
|05: Compute according to (4.6).|
|06: Compute according to (4.8).|
|07: Compute according to (4.9).|
|08: Compute according to (4.10).|
|09: Update Lagrangian multiplier .|
|10: Update Lagrangian multiplier .|
|11: Update Lagrangian multiplier .|
|12: end while|
5 Numerical Experiments
We conduct numerical experiments to compare the proposed model with state-of-the-art approaches for image impainting and denoising. The metrics for quantitative evaluation of the different methods are the peak signal-to-noise ratio (PSNR) and structure similarity index map (SSIM). The higher PSNR and SSIM a method obtains the better the method will be. In order to maximise the performance of all the compared methods, we carefully adjust their built-in parameters such that the resulting PSNR and SSIM by these methods are maximised.
5.1 Image Inpainting
In this section, we test the capability of the proposed TWSO model for image inpainting. Several state-of-the-art inpainting methods will be compared with the TWSO model. They are the TV , TGV , SOTV [15, 16], and Euler’s elastica [26, 20] inpainting models. In order to obtain a better inpainting result, we iteratively refine the diffusion tensor T using the recovered image (i.e. in Algorithm 2). This refinement is crucial as it will provide more accurate tensor information for the next round iteration, thus leading to more pleasant inpainting results. We denote the inpainting domain as in the following experiments. The region in equation (2.3) is . We use for all examples in image inpainting.
Figure 2 illustrates the importance of the tensor T
in the proposed TWSO model. We compare the inpainted results by SOTV and TWSO models on some degraded images shown in the first row. From the second and third rows, it is clear that the TWSO performs much better than SOTV. SOTV introduces blurring to the inpainted regions, while TWSO recovers these shapes very well without causing much blurring. In addition to the capability of impainting large gaps, the TWSO can make smooth interpolation along the level curves of images on the inpainting domain. This validates the effectiveness of the tensor in TWSO for inpainting.
In Figure 3, we show how different geometry of the inpainting region affects the final results by the different methods. Column (b) illustrates that TV performs satisfactorily if the inpainting area is thin. However, it fails in images that contain large gaps. Column (c) indicates that the inpainting result by SOTV depends on the geometry of the inpainting region. It is able to impaint large gaps but at the price of introducing blurring. Mathematically, TGV is a combination of TV and SOTV, so its inpainted results, as shown in Column (d), are similar to those by TV and SOTV. TGV results also seem to depend on the geometry of the inpainting area. The last column shows that TWSO interpolates the images perfectly without any blurring and regardless of the inpainting region geometry. TWSO is even slightly better than Euler’s elastica, which has proven to be very effective in dealing with larger gaps.
We now show another inpainting example of a smooth image with large gaps to further illustrate the effectiveness of the proposed TWSO model. As seen from Figure 4, all the methods except the TWSO model fail in correctly impainting the large gaps in the image. Note that TWSO integrates the tensor T with its eigenvalues defined in (2.10), which makes it suitable for restoring linear structures, as shown in the second row of Figure 4.
Figure 5 is a comparison of all the methods inpainting a real image. Though TV inpainting produces piecewise constant reconstruction in the inpainting domain, its result seems to be more satisfactory than those of the SOTV and TGV. As seen from the second row, neither TV nor TGV is able to connect the gaps on the cloth. SOTV interpolates some gaps but blurs the connection region. Euler’s elastica performs better than TV, SOTV and TGV, but no better than the proposed TWSO model. Table 1 shows that the TWSO has is the most accurate among all the methods compared for the examples in Figure 3-5.
|PSNR test||SSIM value|
|Figure #||Figure 3||Figure 4||Figure 5||Figure 3||Figure 4||Figure 5|
|1st row||2nd row||3rd row||4th row||1st row||2nd row||3rd row||4th row|
|PSNR value (Mean SD)||SSIM value (Mean SD)|
Finally, we evaluate the TV, SOTV, TGV, Euler’s elastica and TWSO on the Berkeley database BSDS500 for image inpainting. We use 4 random masks (i.e. 40, 60, 80 and 90 pixels are missing) for each of the 100 images randomly selected from the database. The performance of each method for each mask is measured in terms of mean and standard derivation of PSNR and SSIM over all 100 images. The results are demonstrated in the following Table 2 and Figure 6. The accuracy of the inpainting results obtained by different methods decreases as the percentage of missing pixels region becomes larger. The highest averaged PSNR values are again achieved by the TWSO, demonstrating its effectiveness for image inpainting.
5.2 Image Denoising
For denoising images corrubpted by Gaussian noise, we use in (2.3) and the in (2.3) is the same as the image domain . We compare the proposed TWSO model with the Perona-Malik (PM) , coherent enhancing diffusion (CED) , total variation (TV) , second order total variation (SOTV) [13, 14, 15], total generalised variation (TGV) , and extended anisotropic diffusion model444Code: http://liu.diva-portal.org/smash/get/diva2:543914/SOFTWARE01.zip (EAD)  on both synthetic and real images.
Denoising results of a synthetic test image. (a): Clean data; (b): Image corrupted by 0.02 variance Gaussian noise; (c): PM result; (d): CED result; (e): TV result; (f): SOTV result; (g): TGV result; (h): EAD result; (i): TWSO result. The second row shows the isosurface rendering of the corresponding results above.
Figure 7 shows the denoised results on a synthetic image (a) by the different methods. The results by the PM and TV models, as shown in (c) and (e), have a jagged appearance (i.e. staircase artefact). However, the scale-space-based PM model shows much stronger staircase effect than the energy-based variational TV model for piecewise smooth image denoising. Due to the anisotropy, the result (d) by the CED method displays strong directional characteristics. Due to the high order derivatives involved, the SOTV, TGV and TWSO models can eliminate the staircase artefact. However, because the SOTV imposes too much regularity on the image, it smears the sharp edges of the objects in (f). Better results are obtained by the TGV, EAD and TWSO models, which neither produce the staircase artefact nor blur object edges, though TGV leaves some noise near the discontinuities and EAD over-smooths image edges, as shown in (g) and (h).
Figure 8 presents the denoised results on a real image (a) by the different methods. Both the CED and the proposed TWSO models use the anisotropic diffusion tensor T. CED distorts the image while TWSO does not. The reason for this is that TWSO uses the eigenvalues of T defined in (2.9), which has two advantages: i) it allows us to control the degree of the anisotropy of TWSO. Specifically, if the contrast parameter in (2.9) goes to infinity, the TWSO model degenerates to the pure isotropic SOTV model. The larger is, the less anisotropy TWSO has. However, the eigenvalues (2.10) used in CED are not able to adjust the anisotropy; ii) it can decide if there exists diffusion in TWSO along the direction parallel to the image gradient direction. The diffusion halts along the image gradient direction if in (2.9) is small. The diffusion is however encouraged if is large. The eigenvalue used in (2.10) however remains small (i.e. ) all the time, meaning that the diffusion in CED is always prohibited. CED thus only prefers the orientation that is perpendicular to the image gradient. This explains why CED distorts the image. In fact, by choosing a suitable , (2.9) allows TWSO to diffuse the noise along object edges and prohibit the diffusion across edges without showing any orientations.
|Figure 7||Figure 8|
|PSNR value||SSIM value||PSNR value||SSIM value|
In addition to qualitative evaluation of the different methods in Figure 7 and 8, we also calculate the PSNR and SSIM values for these methods and show them in Table 3 and Figure 9. These metrics show that the PDE-based methods (i.e. PM and CED) perform worse than the variational methods (i.e. TV, SOTV, TGV, EAD and TWSO). The TV model introduces staircase effect, SOTV blurs the image edges, and EAD tends to over-smooth the image structure. The proposed TWSO model performs well in all the cases, validating the effectiveness of it for image denoising.
We now evaluate the PM, CED, TV, SOTV, TGV, EAD and TWSO on the Berkeley database BSDS500 for image denoising. 100 images are first randomly selected from the database and each of the chosen images is corrupted by Gaussian noise with zero-mean and 5 variances ranging from 0.005 to 0.025 at 0.005 interval. The performance of each method for each noise variance is measured in terms of mean and standard derivation of PSNR and SSIM over all the 100 images. The final quantitative results are shown in Table 4 and Figure 10. The mean values of PSNR and SSIM obtained by the TWSO remain the largest in all the cases. The standard derivation of PSNR and SSIM are smaller and relatively stable, indicating that the proposed TWSO is robust against the increasing level of noise and performs the best among all the 7 methods compared for image denoising.
|PSNR value (Mean SD)||SSIM value (Mean SD)|
In addition to demonstrating TWSO for Gaussian noise removal, Figure 11 and Table 5 show the denoising results of images with salt-and-pepper noise by the different methods. Here, we use in (2.3) and compare our proposed TWSO method with another two state-of-the-art methods (i.e. media filter and TVL1 ) that have been widely employed in salt-and-pepper noise removal. As seen from the first and second row of Table 5, all three methods perform very well when the image contains low level noise. As the noise increases, the media filter and TVL1 become increasingly worse, introducing more and more artefacts to the resulting images. Both fail in restoring the castle when the image is degraded by 90 salt-and-pepper nosie. In contrast, TWSO has produced visually more pleasing results against increasing noise and its PSNR ans SSIM remain the highest in all the cases. In fact, the introduction of the tensor in TWSO offers richer neighbourhood information, making it powerful in denoising images corrupted by salt-and-pepper noise.
|PSNR value||SNR value|