A Trilateral Weighted Sparse Coding Scheme for Real-World Image Denoising

07/11/2018 · by Jun Xu, et al.

Most existing image denoising methods assume the corrupting noise to be additive white Gaussian noise (AWGN). However, the realistic noise in real-world noisy images is much more complex than AWGN and is hard to model with simple analytical distributions. As a result, many state-of-the-art denoising methods in the literature become much less effective when applied to real-world noisy images captured by CCD or CMOS cameras. In this paper, we develop a trilateral weighted sparse coding (TWSC) scheme for robust real-world image denoising. Specifically, we introduce three weight matrices into the data and regularisation terms of the sparse coding framework to characterise the statistics of realistic noise and image priors. TWSC can be reformulated as a linear equality-constrained problem and solved by the alternating direction method of multipliers. The existence and uniqueness of the solution and the convergence of the proposed algorithm are analysed. Extensive experiments demonstrate that the proposed TWSC scheme outperforms state-of-the-art denoising methods in removing realistic noise.







1 Introduction

Noise is inevitably introduced in imaging systems and may severely degrade the quality of acquired images. Removing noise from an acquired image is an essential step in photography and various computer vision tasks such as segmentation [1], HDR imaging [2], and recognition [3]. Image denoising aims to recover the clean image x from its noisy observation y = x + n, where n is the corrupting noise. This problem has been extensively studied in the literature, and numerous statistical image modeling and learning methods have been proposed in the past decades [4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26].

Most of the existing methods [4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 15, 16, 14, 17, 18, 19, 20] focus on additive white Gaussian noise (AWGN), and they can be categorized into dictionary learning based methods [4, 5], nonlocal self-similarity based methods [6, 7, 8, 9, 10, 11, 12, 13, 14], sparsity based methods [4, 5, 7, 8, 9, 10, 11], low-rankness based methods [12, 13], generative learning based methods [15, 16, 14], and discriminative learning based methods [17, 18, 19, 20], etc. However, the realistic noise in real-world images captured by CCD or CMOS cameras is much more complex than AWGN [21, 22, 23, 24, 27, 26, 28]: it can be signal dependent and varies with different cameras and camera settings (such as ISO, shutter speed, and aperture). In Fig. 1, we show a real-world noisy image from the Darmstadt Noise Dataset (DND) [29] and a synthetic AWGN image from the Kodak PhotoCD Dataset (http://r0k.us/graphics/kodak/). We can see that different local patches in the real-world noisy image show different noise statistics; e.g., the patches in the black and blue boxes show different noise levels although they are from the same white object. In contrast, all the patches from the synthetic AWGN image show homogeneous noise patterns. Besides, the realistic noise varies across channels as well as across local patches [22, 23, 24, 26]. In Fig. 2, we show a real-world noisy image captured by a Nikon D800 camera with ISO = 6400, its “Ground Truth” (please refer to Section 4.3), and their differences in the full color image as well as in each channel. The overall noise standard deviations (stds) in the Red, Green, and Blue channels are 5.8, 4.4, and 5.5, respectively. Besides, the realistic noise is inhomogeneous. For example, the stds of noise in the three boxes plotted in Fig. 2 (c) vary largely. Indeed, the noise in real-world noisy images is much more complex than AWGN. Though having shown promising performance on AWGN removal, many of the above mentioned methods [4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 15, 16, 14, 17, 18, 19, 20] become much less effective when dealing with the complex realistic noise shown in Fig. 2.

Figure 1: Comparison of noisy image patches in real-world noisy image (left) and synthetic noisy image with additive white Gaussian noise (right).







Figure 2: An example of realistic noise. (a) A real-world noisy image captured by a Nikon D800 camera with ISO = 6400; (b) the “Ground Truth” image (please refer to Section 4.3) of (a); (c) difference between (a) and (b) (amplified for better illustration); (d)-(f) red, green, and blue channels of (c), respectively. The standard deviations (stds) of noise in the three boxes (white, pink, and green) plotted in (c) are 5.2, 6.5, and 3.3, respectively, while the stds of noise in channels (d), (e), and (f) are 5.8, 4.4, and 5.5, respectively.

In the past decade, several denoising methods for real-world noisy images have been developed [21, 22, 23, 24, 26, 25]. Liu et al. [21] proposed to estimate the noise via a “noise level function” and remove the noise in each channel of the real image. However, processing each channel separately often achieves unsatisfactory performance and generates artifacts [5]. The methods of [22, 23] perform image denoising by concatenating the patches of the RGB channels into a single vector. However, the concatenation does not consider the different noise statistics among the channels. Besides, one of these methods further models the complex noise via a mixture of Gaussian distributions, which is time-consuming due to the use of variational Bayesian inference techniques. The method of [24] models the noise in a noisy image by a multivariate Gaussian and performs denoising by the Bayesian non-local means [30]. The commercial software Neat Image [25] estimates the global noise parameters from a flat region of the given noisy image and filters the noise accordingly. However, both of these methods [24, 25] ignore the local statistical property of the noise, which is signal dependent and varies across pixels. The method of [26] considers the different noise statistics in different channels, but ignores that the noise is signal dependent and has different levels in different local patches. To date, real-world image denoising remains a challenging problem in low level vision [29].

Sparse coding (SC) has been well studied in many computer vision and pattern recognition problems [31, 32, 33], including image denoising [4, 5, 10, 11, 14]. In general, given an input signal y and the dictionary D of coding atoms, the SC model can be formulated as

min_c ‖y − Dc‖₂² + λ‖c‖_p,  (1)

where c is the coding vector of the signal y over the dictionary D, λ is the regularization parameter, and p = 0 or 1 to enforce sparse regularization on c. Some representative SC based image denoising methods include K-SVD [4], LSSC [10], and NCSR [11]. Though effective in dealing with AWGN, SC based denoising methods are essentially limited by the data-fidelity term described by the ℓ₂ (or Frobenius) norm, which implicitly assumes white Gaussian noise and is not able to characterize signal dependent realistic noise.
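As a concrete illustration of model (1) with p = 1 (a sketch of ours, not the paper's code): when the dictionary is orthogonal, the ℓ₁-regularized sparse coding problem has a closed-form solution obtained by soft-thresholding the analysis coefficients.

```python
import numpy as np

def soft_threshold(x, tau):
    # Element-wise soft-thresholding, the proximal operator of the l1 norm.
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def sparse_code_orthogonal(y, D, lam):
    # For an orthogonal dictionary D, the problem
    #   min_c ||y - D c||_2^2 + lam * ||c||_1
    # has the closed-form minimizer c = soft_threshold(D^T y, lam / 2).
    return soft_threshold(D.T @ y, lam / 2.0)

rng = np.random.default_rng(0)
D, _ = np.linalg.qr(rng.standard_normal((16, 16)))  # random orthogonal dictionary
y = rng.standard_normal(16)                         # toy input signal
c = sparse_code_orthogonal(y, D, lam=1.0)
x_hat = D @ c                                       # denoised estimate
```

The shrinkage kills small (noise-dominated) coefficients while keeping large (signal-dominated) ones, which is the basic mechanism behind SC denoising.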

In this paper, we propose to lift the SC model (1) to a robust denoiser for real-world noisy images by utilizing the channel-wise statistics and locally signal dependent property of the realistic noise, as demonstrated in Fig. 2. Specifically, we propose a trilateral weighted sparse coding (TWSC) scheme for real-world image denoising. Two weight matrices are introduced into the data-fidelity term of the SC model to characterize the realistic noise property, and another weight matrix is introduced into the regularization term to characterize the sparsity priors of natural images. We reformulate the proposed TWSC scheme into a linear equality-constrained optimization program, and solve it under the alternating direction method of multipliers (ADMM) [34] framework. One step of our ADMM is to solve a Sylvester equation, whose unique solution is not always guaranteed. Hence, we provide theoretical analysis on the existence and uniqueness of the solution to the proposed TWSC scheme. Experiments on three datasets of real-world noisy images demonstrate that the proposed TWSC scheme achieves much better performance than the state-of-the-art denoising methods.

2 The Proposed Real-World Image Denoising Algorithm

2.1 The Trilateral Weighted Sparse Coding Model

The real-world image denoising problem is to recover the clean image from its noisy observation. Current denoising methods [4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16] are mostly patch based. Given a noisy image, a local patch of size p × p × 3 is extracted from it and stretched to a vector, denoted by y = [y_r⊤ y_g⊤ y_b⊤]⊤ ∈ R^{3p²}, where y_c ∈ R^{p²} is the corresponding patch in channel c, and c ∈ {r, g, b} is the index of the R, G, and B channels. For each local patch y, we search for its M most similar patches (including y itself) by Euclidean distance in a local window around it. By stacking the similar patches column by column, we form a noisy patch matrix Y = X + N ∈ R^{3p²×M}, where X and N are the corresponding clean and noise patch matrices, respectively. The noisy patch matrix can be written as Y = [Y_r⊤ Y_g⊤ Y_b⊤]⊤, where Y_c is the sub-matrix of channel c. Suppose that we have a dictionary D = [D_r⊤ D_g⊤ D_b⊤]⊤, where D_c is the sub-dictionary corresponding to channel c. In fact, the dictionary D can be learned from external natural images, or from the input noisy patch matrix Y.
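A minimal sketch of this patch grouping step (our own illustration for a single-channel image; the patch size, patch number, and window size below are placeholder values, not the paper's settings):

```python
import numpy as np

def group_similar_patches(img, i, j, p=6, M=10, win=15):
    # Form a noisy patch matrix Y by stacking the M patches most similar
    # (by Euclidean distance) to the reference patch at (i, j), searched
    # inside a local window around the reference. The reference itself is
    # a candidate with distance 0, so it always ends up in the group.
    H, W = img.shape
    ref = img[i:i + p, j:j + p].reshape(-1)
    cands, dists = [], []
    for r in range(max(0, i - win), min(H - p, i + win) + 1):
        for c in range(max(0, j - win), min(W - p, j + win) + 1):
            v = img[r:r + p, c:c + p].reshape(-1)
            cands.append(v)
            dists.append(float(np.sum((v - ref) ** 2)))
    order = np.argsort(dists)[:M]
    return np.stack([cands[k] for k in order], axis=1)  # shape: p^2 x M

rng = np.random.default_rng(1)
img = rng.standard_normal((40, 40))
Y = group_similar_patches(img, 10, 10)
```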

Under the traditional sparse coding (SC) framework [35], the sparse coding matrix C of Y over D can be obtained by

min_C ‖Y − DC‖_F² + λ‖C‖₁,  (2)

where λ is the regularization parameter. Once C is computed, the latent clean patch matrix X can be estimated as X̂ = DĈ. Though having achieved promising performance on additive white Gaussian noise (AWGN), the traditional SC based denoising methods [4, 5, 10, 11, 14] are very limited in dealing with realistic noise in real-world images captured by CCD or CMOS cameras. The reason is that the realistic noise is non-Gaussian and varies locally and across channels, which cannot be characterized well by the Frobenius norm in the SC model (2) [21, 24, 29, 36].

To account for the varying statistics of realistic noise in different channels and different patches, we introduce two weight matrices W₁ and W₂ to characterize the SC residual (Y − DC) in the data-fidelity term of Eq. (2). Besides, to better characterize the sparsity priors of natural images, we introduce a third weight matrix W₃, which is related to the distribution of the sparse coefficient matrix C, into the regularization term of Eq. (2). For the dictionary D, we learn it adaptively by applying the SVD [37] to the given data matrix Y as

Y = DSV⊤,  (3)

i.e., D is taken as the matrix of left singular vectors of Y. Note that in this paper we are not aiming at proposing a new dictionary learning scheme as [4] did. Once obtained from the SVD, the dictionary D is fixed and not updated iteratively. Finally, the proposed trilateral weighted sparse coding (TWSC) model is formulated as:

min_C ‖W₁(Y − DC)W₂‖_F² + ‖W₃C‖₁.  (4)

Note that the regularization parameter λ has been implicitly incorporated into the weight matrix W₃.
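The SVD-based dictionary construction of Eq. (3) can be sketched as follows (an illustration using the economy-size SVD; the dictionary is computed once per patch matrix and then kept fixed):

```python
import numpy as np

# The dictionary is taken from the SVD of the noisy patch matrix Y = D S V^T,
# so D is orthonormal, and the singular values in S are reused later to set
# the sparsity-regularisation weights.
rng = np.random.default_rng(2)
Y = rng.standard_normal((36, 10))                  # toy 3p^2 x M patch matrix
D, s, Vt = np.linalg.svd(Y, full_matrices=False)   # D: atoms, s: singular values
```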

2.2 The Setting of Weight Matrices

In this paper, we set the three weight matrices W₁, W₂, and W₃ as diagonal matrices and grant clear physical meanings to them. W₁ is a block diagonal matrix with three blocks, each of which has the same diagonal elements to describe the noise properties in the corresponding R, G, or B channel. Based on [38, 29, 36], the realistic noise in a local patch can be approximately modeled as Gaussian, and each diagonal element of W₂ is used to describe the noise variance in the corresponding patch y_i. Generally speaking, W₁ is employed to regularize the row discrepancy of the residual matrix (Y − DC), while W₂ is employed to regularize its column discrepancy. For the matrix W₃, each diagonal element is set based on the sparsity priors on C.

We determine the three weight matrices W₁, W₂, and W₃ by employing the maximum a-posteriori (MAP) estimation technique:

Ĉ = arg max_C ln P(C|Y) = arg max_C {ln P(Y|C) + ln P(C)}.  (5)

The log-likelihood term ln P(Y|C) is characterized by the statistics of the noise. According to [38, 29, 36], it can be assumed that the noise is independently and identically distributed (i.i.d.) in each channel and each patch with Gaussian distribution. Denote by y_{ci} and c_i the i-th columns of the matrices Y_c and C, respectively, and denote by σ_{ci} the noise std of y_{ci}. We have

P(Y|C) = ∏_{c∈{r,g,b}} ∏_{i=1}^{M} (2πσ_{ci}²)^{−p²/2} exp(−(2σ_{ci}²)^{−1}‖y_{ci} − D_c c_i‖₂²).  (6)

From the perspective of statistics [39], the set of {σ_{ci}} can be viewed as a contingency table created by the two variables σ_c and σ_i, and their relationship can be modeled by a log-linear model σ_{ci} = σ_c^{l₁}σ_i^{l₂}, where l₁ + l₂ = 1. Here we consider σ_c and σ_i of equal importance and empirically set l₁ = l₂ = 1/2. The estimation of σ_{ci} is thereby transferred to the estimation of σ_c and σ_i, which will be introduced in the experimental section (Section 4).
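The log-linear combination of channel-wise and patch-wise stds can be sketched as follows (illustrative numbers borrowed from Fig. 2; the l₁ = l₂ = 1/2 choice makes each entry the geometric mean of the two factors):

```python
import numpy as np

# sigma_ci = sigma_c**l1 * sigma_i**l2 with l1 = l2 = 1/2 builds a table of
# per-channel, per-patch noise levels from one channel factor and one patch
# factor -- the "contingency table" view of the two variables.
sigma_c = np.array([5.8, 4.4, 5.5])   # channel stds (R, G, B)
sigma_i = np.array([5.2, 6.5, 3.3])   # per-patch stds (three example patches)
sigma_ci = np.sqrt(sigma_c[:, None] * sigma_i[None, :])   # 3 x 3 table
```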

The sparsity prior is imposed on the coefficient matrix C: we assume that each column of C follows an i.i.d. Laplacian distribution. Specifically, for each entry c_{ij}, which is the coding coefficient of the i-th patch over the j-th atom of the dictionary D, we assume that it follows the distribution (2s_j)^{−1}exp(−s_j^{−1}|c_{ij}|), where s_j is the j-th diagonal element of the singular value matrix S in Eq. (3). Note that this sets the weight on |c_{ij}| in the regularization term as the inverse of the j-th singular value. This is because the larger the singular value s_j is, the more important the j-th atom (i.e., singular vector) in D should be, and hence the sparsity regularization on the coding coefficients over this singular vector should be weaker. The prior term in Eq. (5) becomes

P(C) = ∏_{j=1}^{3p²} ∏_{i=1}^{M} (2s_j)^{−1}exp(−s_j^{−1}|c_{ij}|).  (7)
Putting (7) and (6) into (5) and considering the log-linear model σ_{ci} = σ_c^{1/2}σ_i^{1/2}, we have

W₁ = diag(σ_r^{−1/2}I_{p²}, σ_g^{−1/2}I_{p²}, σ_b^{−1/2}I_{p²}),  W₂ = diag(σ_1^{−1/2}, ..., σ_M^{−1/2}),  W₃ = diag(λ/s_1, ..., λ/s_{3p²}),  (8)

so that the MAP estimation (5) becomes

Ĉ = arg min_C ‖W₁(Y − DC)W₂‖_F² + ‖W₃C‖₁,  (9)

which is exactly the TWSC model (4). Here I_{p²} is the p² × p² dimensional identity matrix. Note that the diagonal elements of W₁ and W₂ are determined by the noise standard deviations in the corresponding channels and patches, respectively. The stronger the noise in a channel or a patch, the less that channel or patch will contribute to the denoised output.
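A sketch of how such diagonal weight matrices could be assembled (illustrative only: this code uses plain inverse proportionality rather than the exact exponents and λ scaling of the derivation above):

```python
import numpy as np

def build_weights(sigma_rgb, sigma_patch, singvals, lam=0.5, p2=4):
    # Illustrative construction (not the paper's exact formulas):
    # W1: block diagonal, one p2-sized block per RGB channel, entries
    #     inversely proportional to that channel's noise std;
    # W2: diagonal, entries inversely proportional to each patch's noise std;
    # W3: diagonal, entries inversely proportional to the singular values of Y,
    #     so more important atoms receive a weaker sparsity penalty.
    W1 = np.diag(np.repeat(1.0 / np.asarray(sigma_rgb), p2))
    W2 = np.diag(1.0 / np.asarray(sigma_patch))
    W3 = np.diag(lam / np.asarray(singvals))
    return W1, W2, W3

# Channel stds, per-patch stds, and singular values are toy numbers.
W1, W2, W3 = build_weights([5.8, 4.4, 5.5], [5.2, 6.5, 3.3], [10.0, 2.0, 0.5])
```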

2.3 Model Optimization

Letting Yʷ = W₁Y and Dʷ = W₁D, we can transfer the weight matrix W₁ into the data-fidelity term of (4). Thus, the TWSC scheme (4) is reformulated as

min_C ‖(Yʷ − DʷC)W₂‖_F² + ‖W₃C‖₁.  (10)

To keep the notation simple, we remove the superscript w and still use Y and D in the following development. We employ the variable splitting method [40] to solve the problem (10). By introducing an augmented variable Z, the problem (10) is reformulated as a linear equality-constrained problem with two variables C and Z:

min_{C,Z} ‖(Y − DC)W₂‖_F² + ‖W₃Z‖₁  s.t.  C = Z.  (11)

Since the objective function is separable w.r.t. the two variables, the problem (11) can be solved under the alternating direction method of multipliers (ADMM) [34] framework. The augmented Lagrangian function of (11) is:

L(C, Z, Δ, ρ) = ‖(Y − DC)W₂‖_F² + ‖W₃Z‖₁ + ⟨Δ, C − Z⟩ + (ρ/2)‖C − Z‖_F²,  (12)

where Δ is the augmented Lagrangian multiplier and ρ > 0 is the penalty parameter. We initialize the matrix variables C₀, Z₀, and Δ₀ to conformable zero matrices and set ρ₀ > 0. Denote by (C_k, Z_k) and Δ_k the optimization variables and the Lagrange multiplier at iteration k (k = 0, 1, 2, ...), respectively. By taking derivatives of the Lagrangian function w.r.t. C and Z and setting the derivatives to zero, we can alternately update the variables as follows:
(1) Update C by fixing Z and Δ:

C_{k+1} = arg min_C ‖(Y − DC)W₂‖_F² + (ρ_k/2)‖C − Z_k + ρ_k^{−1}Δ_k‖_F².  (13)

This is a two-sided weighted least squares regression problem whose solution satisfies

D⊤DC + C·(ρ_k/2)W₂^{−2} = D⊤Y + (1/2)(ρ_kZ_k − Δ_k)W₂^{−2}.  (14)

Eq. (14) is a standard Sylvester equation (SE), of the form AC + CB = E with A = D⊤D, B = (ρ_k/2)W₂^{−2}, and E the right-hand side, which has a unique solution if and only if σ(A) ∩ σ(−B) = ∅, where σ(F) denotes the spectrum, i.e., the set of eigenvalues, of the matrix F [41]. We can rewrite the SE (14) as

(I_M ⊗ A + B⊤ ⊗ I_{3p²}) vec(C) = vec(E),  (15)

and the solution (if it exists) can be obtained via C = vec^{−1}((I_M ⊗ A + B⊤ ⊗ I_{3p²})^{−1} vec(E)), where vec(·) stacks the columns of a matrix into a vector and vec^{−1}(·) is the inverse of the vec-operator. Detailed theoretical analysis on the existence of the unique solution is given in Section 3.1.
(2) Update Z by fixing C and Δ:

Z_{k+1} = arg min_Z (ρ_k/2)‖Z − (C_{k+1} + ρ_k^{−1}Δ_k)‖_F² + ‖W₃Z‖₁.  (16)

Since W₃ is diagonal, problem (16) decomposes into entry-wise scalar problems

min_{z_{ji}} (ρ_k/2)(z_{ji} − t_{ji})² + w_j|z_{ji}|,  (17)

where T = C_{k+1} + ρ_k^{−1}Δ_k and w_j is the j-th diagonal element of W₃. Each scalar problem has a closed-form solution, yielding

Z_{k+1} = S_{W₃/ρ_k}(C_{k+1} + ρ_k^{−1}Δ_k),  (18)

where S_τ(x) = sign(x)·max(|x| − τ, 0) is the soft-thresholding operator, applied entry-wise with the threshold for row j given by w_j/ρ_k.

(3) Update Δ by fixing C and Z:

Δ_{k+1} = Δ_k + ρ_k(C_{k+1} − Z_{k+1}).  (19)

(4) Update ρ: ρ_{k+1} = μρ_k, where μ ≥ 1.

The above alternating updating steps are repeated until the convergence condition is satisfied or the number of iterations exceeds a preset threshold K. The ADMM algorithm converges when ‖C_{k+1} − Z_{k+1}‖_F ≤ Tol, ‖C_{k+1} − C_k‖_F ≤ Tol, and ‖Z_{k+1} − Z_k‖_F ≤ Tol are simultaneously satisfied, where Tol > 0 is a small tolerance number. We summarize the updating procedures in Algorithm 1.
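For intuition, here is a stripped-down ADMM with the same C/Z/Δ splitting but without the three weight matrices (our own sketch; in the full TWSC model the C-update is the Sylvester equation (14) rather than a plain regularized least-squares step):

```python
import numpy as np

def admm_sparse_coding(Y, D, lam=0.1, rho=1.0, iters=200):
    # Solve  min_{C,Z} ||Y - D C||_F^2 + lam*||Z||_1  s.t.  C = Z
    # with a fixed penalty rho, mirroring the structure of Algorithm 1.
    p, M = D.shape[1], Y.shape[1]
    C, Z, Delta = (np.zeros((p, M)) for _ in range(3))
    G = np.linalg.inv(2.0 * D.T @ D + rho * np.eye(p))       # precomputed
    for _ in range(iters):
        C = G @ (2.0 * D.T @ Y + rho * Z - Delta)            # C-update (LS)
        T = C + Delta / rho
        Z = np.sign(T) * np.maximum(np.abs(T) - lam / rho, 0)  # soft-threshold
        Delta = Delta + rho * (C - Z)                          # multiplier
    return Z

rng = np.random.default_rng(4)
D, _ = np.linalg.qr(rng.standard_normal((20, 20)))   # orthogonal dictionary
C0 = np.zeros((20, 5)); C0[:3, :] = rng.standard_normal((3, 5))  # sparse codes
Y = D @ C0 + 0.01 * rng.standard_normal((20, 5))     # noisy observations
Z = admm_sparse_coding(Y, D)
```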


Algorithm 1: Solve the TWSC Model (4) via ADMM
Input: Y, D, W₂, W₃, Tol, K, μ;
Initialization: C₀ = Z₀ = Δ₀ = 0, ρ₀ > 0, k = 0, T = False;
While (T == False) do
1. Update C_{k+1} by solving Eq. (13);
2. Update Z_{k+1} by soft thresholding (18);
3. Update Δ_{k+1} by Eq. (19);
4. Update ρ_{k+1} = μρ_k, where μ ≥ 1;
5. k = k + 1;
if (Converged) or (k ≥ K)
6.  T = True;
end if
end while
Output: Matrices C and Z.


Convergence Analysis. The convergence of Algorithm 1 is guaranteed since the overall objective function (11) is convex with a global optimal solution. In Fig. 3, we can see that the maximal values in the entries of |C_{k+1} − Z_{k+1}|, |C_{k+1} − C_k|, and |Z_{k+1} − Z_k| approach 0 simultaneously within 50 iterations.

Figure 3: The convergence curves of the maximal values in the entries of |C_{k+1} − Z_{k+1}| (blue line), |C_{k+1} − C_k| (red line), and |Z_{k+1} − Z_k| (yellow line). The test image is the one shown in Fig. 2 (a).

2.4 The Denoising Algorithm

Given a noisy color image, suppose that we have extracted N local patches {y_j} and grouped their similar patches into N noisy patch matrices {Y_j}, from which the clean patch matrices {X̂_j} are estimated. The patches in the matrices {X̂_j} are aggregated to form the denoised image x̂. To obtain better denoising results, we perform the above denoising procedure for several iterations. The proposed TWSC scheme based real-world image denoising algorithm is summarized in Algorithm 2.


Algorithm 2: Image Denoising by TWSC
Input: Noisy image y, maximal iteration number K;
Initialization: x̂⁽⁰⁾ = y;
for k = 1 : K do
1. Set y⁽ᵏ⁾ = x̂⁽ᵏ⁻¹⁾;
2. Extract local patches {y_j} from y⁽ᵏ⁾;
for each patch y_j do
3. Search its nonlocal similar patches to form Y_j;
4. Apply the TWSC scheme (4) to Y_j and obtain the estimated X̂_j;
end for
5. Aggregate {X̂_j} to form the image x̂⁽ᵏ⁾;
end for
Output: Denoised image x̂⁽ᴷ⁾.


3 Existence and Faster Solution of Sylvester Equation

The solution of the Sylvester equation (SE) (14) does not always exist, though the solution is unique if it exists. Besides, solving SE (14) is usually computationally expensive in high dimensional cases. In this section, we provide a sufficient condition to guarantee the existence of the solution to SE (14), as well as a faster solution of (14) to save the computational cost of Algorithms 1 and 2.

3.1 Existence of the Unique Solution

Before proving the existence of a unique solution of the SE (14), we first introduce the following theorem.

Theorem 3.1

Assume that A ∈ R^{p×p} and B ∈ R^{q×q} are both symmetric and positive semi-definite matrices. If at least one of A, B is positive definite, the Sylvester equation AX + XB = E has a unique solution for X ∈ R^{p×q}.

The proof of Theorem 3.1 can be found in the supplementary file. Then we have the following corollary.

Corollary 1

The SE (14) has a unique solution.


Since the matrices A = D⊤D and B = (ρ_k/2)W₂^{−2} in (14) are both symmetric and positive definite (the dictionary has been scaled by the positive definite W₁ in (10), and W₂ is diagonal with positive entries), according to Theorem 3.1, the SE (14) has a unique solution.

3.2 Faster Solution

The solution of the SE (14) is typically obtained by the Bartels–Stewart algorithm [42]. This algorithm first employs a QR factorization [43], implemented via the Gram–Schmidt process, to decompose the matrices A and B into Schur forms, and then solves the obtained triangular system by the back-substitution method [44]. However, since the matrix A is of 3p² × 3p² dimensions, it is computationally expensive to calculate its Schur form in this way. By exploiting the specific properties of our problem, we provide a faster yet exact solution for the SE (14).

Since the matrix A = D⊤D in (14) is symmetric and positive definite, it can be eigen-decomposed as A = UΛU⊤. Left-multiplying both sides of the SE (14) by U⊤, we get Λ(U⊤C) + (U⊤C)B = U⊤E. This can be viewed as an SE w.r.t. the matrix U⊤C, with a unique solution. Since the matrix B = (ρ_k/2)W₂^{−2} is diagonal and positive definite, the j-th column of U⊤C can be obtained by inverting the diagonal matrix Λ + b_jI element by element, where b_j is the j-th diagonal entry of B. Finally, the solution can be obtained via C = U(U⊤C). In this way, the cost of solving the SE (14) is dominated by one eigen-decomposition plus matrix multiplications instead of the much more expensive Schur decompositions, which is a huge computational saving.
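This faster solution can be sketched for a generic SE A X + X diag(b) = E with symmetric positive definite A and a positive diagonal right-hand matrix (our own illustration of the idea, not the paper's code):

```python
import numpy as np

def fast_sylvester(A, b, E):
    # Solve A X + X diag(b) = E when A is symmetric positive definite and
    # the second matrix is diagonal with positive entries b. Eigen-decompose
    # A = U Lam U^T once; each column j then satisfies
    #   (A + b_j I) x_j = e_j  =>  U^T x_j = (Lam + b_j)^{-1} U^T e_j,
    # so only element-wise divisions remain after one factorisation.
    lam, U = np.linalg.eigh(A)
    Et = U.T @ E
    Xt = Et / (lam[:, None] + b[None, :])   # diagonal inverse, per column
    return U @ Xt

rng = np.random.default_rng(5)
A = rng.standard_normal((6, 6)); A = A @ A.T + 6 * np.eye(6)  # SPD
b = np.abs(rng.standard_normal(4)) + 0.5                      # positive diag
E = rng.standard_normal((6, 4))
X = fast_sylvester(A, b, E)
```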

4 Experiments

To validate the effectiveness of our proposed TWSC scheme, we apply it to both synthetic additive white Gaussian noise (AWGN) corrupted images and real-world noisy images captured by CCD or CMOS cameras. To better demonstrate the roles of the three weight matrices in our model, we compare with a baseline method in which the weight matrices W₁ and W₂ are set as conformable identity matrices, while the matrix W₃ is set as in (8). We call this baseline method the weighted sparse coding (WSC).

4.1 Experimental Settings

Noise Level Estimation. For most image denoising algorithms, the standard deviation (std) of the noise should be given as a parameter. In this work, we provide an exploratory approach to solve this problem. Specifically, the noise std σ_c of channel c can be estimated by existing noise estimation methods [45, 46, 47]. In Algorithm 2, the noise std σ_i for the i-th patch of Y is initialized as

σ_i = ((σ_r² + σ_g² + σ_b²)/3)^{1/2}  (20)

and updated in the following iterations as

σ_i = (max(0, σ_i² − ‖y_i − x̂_i‖₂²/(3p²)))^{1/2},  (21)

where y_i is the i-th column in the patch matrix Y, and x̂_i is the i-th patch recovered in the previous iteration (please refer to Section 2.4).
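A sketch of one such iterative noise level update (an assumption of ours in the spirit of Eq. (21), not necessarily its exact form):

```python
import numpy as np

def update_patch_std(y, x_hat, sigma_init):
    # Iterative-regularisation style update: subtract the noise energy already
    # removed by the previous denoising pass from the initial variance
    # estimate, clipping at zero so the std stays real-valued.
    removed = float(np.mean((y - x_hat) ** 2))
    return float(np.sqrt(max(sigma_init ** 2 - removed, 0.0)))

sigma0 = 10.0
rng = np.random.default_rng(6)
y = rng.standard_normal(36) * sigma0   # a pure-noise toy patch
x_hat = 0.5 * y                        # toy estimate: half the energy removed
sigma1 = update_patch_std(y, x_hat, sigma0)
```

As denoising progresses, ‖y − x̂‖ grows and the estimated residual noise level shrinks, so later iterations regularize less aggressively.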

Implementation Details. We empirically set the ADMM parameters ρ₀ and μ, the maximal number K of ADMM iterations, and the window size for similar patch searching. The patch size p and the number M of similar patches are set according to the estimated noise level (different settings are used for different ranges of the noise std). All parameters are fixed in our experiments. We will release the code with the publication of this work.




Method: BM3D-SAPCA / LSSC / NCSR / WNNM / TNRD / DnCNN / WSC / TWSC
σ = 15: PSNR 32.42 32.27 32.19 32.43 32.27 32.59 32.06 32.34
        SSIM 0.8860 0.8849 0.8814 0.8841 0.8815 0.8879 0.8673 0.8846
σ = 25: PSNR 30.02 29.84 29.76 30.05 29.87 30.22 29.57 29.98
        SSIM 0.8364 0.8329 0.8293 0.8365 0.8314 0.8415 0.8179 0.8372
σ = 35: PSNR 28.48 28.26 28.17 28.51 28.33 28.66 28.01 28.49
        SSIM 0.7969 0.7908 0.7855 0.7958 0.7907 0.8021 0.7765 0.7987
σ = 50: PSNR 26.85 26.64 26.55 26.92 26.75 27.08 26.35 26.93
        SSIM 0.7481 0.7405 0.7391 0.7499 0.7415 0.7563 0.7258 0.7530
σ = 75: PSNR 24.74 24.77 24.66 25.15 24.97 25.24 24.54 25.15
        SSIM 0.6649 0.6746 0.6793 0.6903 0.6801 0.6931 0.6612 0.6949


Table 1: Average results of PSNR(dB) and SSIM of different denoising algorithms on 20 grayscale images corrupted by AWGN noise.

4.2 Results on AWGN Noise Removal

We first compare the proposed TWSC scheme with the leading AWGN denoising methods BM3D-SAPCA [9] (which usually performs better than BM3D [7]), LSSC [10], NCSR [11], WNNM [13], TNRD [19], and DnCNN [20] on 20 grayscale images commonly used in [7]. Note that TNRD and DnCNN are both discriminative learning based methods, and we use the models trained originally by the authors. Each noisy image is generated by adding AWGN to the clean image, with the std of the noise set as σ = 15, 25, 35, 50, and 75 in this paper. Note that in this experiment we set the weight matrix W₁ to the identity matrix since the input images are grayscale.

The averaged PSNR and SSIM [48] results are listed in Table 1. One can see that the proposed TWSC achieves performance comparable with WNNM, TNRD, and DnCNN in most cases. It should be noted that TNRD and DnCNN are trained on pairs of clean and synthetic noisy images, while TWSC only utilizes the tested noisy image. Besides, one can see that the proposed TWSC works much better than the baseline method WSC, which proves that the weight matrix W₂ can better characterize the noise statistics in local image patches. Due to limited space, we leave the visual comparisons of different methods to the supplementary file.

4.3 Results on Realistic Noise Removal

We evaluate the proposed TWSC scheme on three publicly available real-world noisy image datasets [49, 24, 29].

Dataset 1 is provided in [49], which includes around 20 real-world noisy images collected under uncontrolled environments. Since there is no “ground truth” for the noisy images, we only compare the visual quality of the denoised images by different methods.

Dataset 2 is provided in [24], which includes noisy images of 11 static scenes captured by Canon 5D Mark 3, Nikon D600, and Nikon D800 cameras. The real-world noisy images were collected under a controlled indoor environment. Each scene was shot 500 times under the same camera and camera setting. The mean image of the 500 shots is roughly taken as the “ground truth”, with which the PSNR and SSIM [48] can be computed. 15 cropped images are used to evaluate the different denoising methods. Recently, other datasets such as [50] have also been constructed by employing the strategies of this dataset.

Dataset 3 is the Darmstadt Noise Dataset (DND) [29], which includes 50 different pairs of images of the same scenes captured by Sony A7R, Olympus E-M10, Sony RX100 IV, and Huawei Nexus 6P. The real-world noisy images are collected under higher ISO values with shorter exposure time, while the “ground truth” images are captured under lower ISO values with adjusted longer exposure times. Since the captured images are of megapixel size, the authors cropped 20 bounding boxes from each image in the dataset, yielding 1000 test crops in total. However, the “ground truth” images are not publicly released, so we can only submit the denoising results to the authors' project website and obtain the PSNR and SSIM [48] results.

Comparison Methods. We compare the proposed TWSC method with CBM3D [8], TNRD [19], DnCNN [20], the commercial software Neat Image (NI) [25], and the state-of-the-art real image denoising methods “Noise Clinic” (NC) [22], CC [24], and MCWNNM [26]. We also compare with the method WSC described above as a baseline. The methods CBM3D and DnCNN can directly deal with color images, and their input noise std is set by Eq. (20). For TNRD, MCWNNM, and TWSC, we use [46] to estimate the noise std σ_c of each channel c ∈ {r, g, b}. For blind-mode DnCNN, we use the color version provided by the authors, and there is no need to estimate the noise std. Since TNRD is designed for grayscale images, we apply it to each channel of the real-world noisy images. TNRD achieves its best results on these datasets when the noise std of its trained models is set appropriately.

(a) Noisy

(b) CBM3D; (c) TNRD; (d) DnCNN; (e) NI; (f) NC; (g) MCWNNM; (h) WSC; (i) TWSC
Figure 4: Denoised images of the real noisy image Dog [49] by different methods. Note that the ground-truth clean image of the noisy input is not available.

Results on Dataset 1. Fig. 4 shows the denoised images of “Dog” (the method CC [24] is not compared since its testing code is not available). One can see that CBM3D, TNRD, DnCNN, NI, and NC generate noise-caused color artifacts across the whole image, while MCWNNM and WSC tend to slightly over-smooth the image. The proposed TWSC removes the noise more cleanly without over-smoothing the image details. These results demonstrate that the methods designed for AWGN are not effective at realistic noise removal. Though the NC and NI methods are specifically developed for real-world noisy images, their performance is not satisfactory. In comparison, the proposed TWSC works much better in removing the noise while maintaining the details (see the zoom-in window in “Dog”) than the other competing methods. More visual comparisons can be found in the supplementary file.

Results on Dataset 2. The average PSNR and SSIM results of the competing methods on the 15 cropped images are listed in Table 2. One can see that the proposed TWSC performs much better than the other competing methods, including the baseline WSC and the recently proposed CC and MCWNNM. Fig. 5 shows the denoised images of a scene captured by a Nikon D800 at ISO = 6400. One can see that the proposed TWSC method results in not only higher PSNR and SSIM measures, but also much better visual quality than the other methods. Due to limited space, we do not show the results of the baseline method WSC in the visual quality comparison. More visual comparisons can be found in the supplementary file.


Method: CBM3D / TNRD / DnCNN / NI / NC / CC / MCWNNM / WSC / TWSC
PSNR 35.19 36.61 33.86 35.49 36.43 36.88 37.70 37.36 37.81
SSIM 0.8580 0.9463 0.8635 0.9126 0.9364 0.9481 0.9542 0.9516 0.9586


Table 2: Average results of PSNR(dB) and SSIM of different denoising methods on 15 cropped real-world noisy images used in [24].

Results on Dataset 3. In Table 3, we list the average PSNR and SSIM results of the competing methods on the 1000 cropped images in the DND dataset [29]. We can see again that the proposed TWSC achieves much better performance than the other competing methods. Note that the “ground truth” images of this dataset have not been published, but one can submit the denoised images to the project website and obtain the PSNR and SSIM results. More results can be found on the website of the DND dataset (https://noise.visinf.tu-darmstadt.de/benchmark/#results_srgb). Fig. 6 shows the denoised images of a scene captured by a Nexus 6P camera. One can see that the proposed TWSC method yields better visual quality than the other denoising methods. More visual comparisons can be found in the supplementary file.


Method: CBM3D / TNRD / DnCNN / NI / NC / MCWNNM / WSC / TWSC
PSNR 32.14 34.15 32.41 35.11 36.07 37.38 36.81 37.94
SSIM 0.7773 0.8271 0.7897 0.8778 0.9013 0.9294 0.9165 0.9403


Table 3: Average results of PSNR(dB) and SSIM of different denoising methods on 1000 cropped real-world noisy images in [29].
(a) Noisy 29.63dB/0.7107; (b) CBM3D 31.12dB/0.7948; (c) TNRD 32.80dB/0.8959; (d) DnCNN 29.83dB/0.7204; (e) NI 31.28dB/0.7781; (f) Ground Truth; (g) NC 33.49dB/0.9024; (h) CC 34.61dB/0.9206; (i) MCWNNM 34.80dB/0.9217; (j) TWSC 35.47dB/0.9369
Figure 5: Denoised images of the real noisy image Nikon D800 ISO 6400 1 [24] by different methods. This scene was shot 500 times under the same camera and camera setting. The mean image of the 500 shots is roughly taken as the “Ground Truth”.

(a) Noisy

(b) CBM3D; (c) TNRD; (d) DnCNN; (e) NI; (f) NC; (g) MCWNNM; (h) WSC; (i) TWSC
Figure 6: Denoised images of the real noisy image “0001_2” captured by Nexus 6P [29] by different methods. Note that the ground-truth clean image of the noisy input is not publicly released yet.

Comparison on Speed. We compare the average computational time (in seconds) of the different methods (except CC) to process one image in the DND dataset [29]. The results are shown in Table 4. All experiments are run under the Matlab2014b environment on a machine with an Intel(R) Core(TM) i7-5930K CPU at 3.5GHz and 32GB RAM. The fastest speed is highlighted in bold. One can see that Neat Image (NI) is the fastest, spending about 1.1 seconds to process an image, while the proposed TWSC needs about 195 seconds. Note that Neat Image is a highly-optimized software with parallelization, and CBM3D, TNRD, and NC are implemented with compiled C++ mex-functions and parallelization, while DnCNN, MCWNNM, and the proposed WSC and TWSC are implemented purely in Matlab.


Method: CBM3D / TNRD / DnCNN / NI / NC / MCWNNM / WSC / TWSC
Time 6.9 5.2 79.5 1.1 15.6 208.1 188.6 195.2


Table 4: Average computational time (s) of different methods to process an image in the DND dataset [29].

4.4 Visualization of The Weight Matrices

The three diagonal weight matrices in the proposed TWSC model (4) have clear physical meanings, and it is interesting to analyze how they relate to the input image by visualizing the resulting matrices. To this end, we applied TWSC to the real-world (estimated noise stds of R/G/B: 11.4/14.8/18.4) and synthetic AWGN (std of all channels: 25) noisy images shown in Fig. 1. The final diagonal weight matrices W₁ and W₂ for two typical patch matrices from the two images are visualized in Fig. 7. One can see that the matrix W₁ reflects well the channel-wise noise levels in the images. Though the matrix W₂ is initialized as an identity matrix, it changes over the iterations since the noise in different patches is removed differently. For the real-world noisy image, the noise levels of different patches in Y are different, hence the elements of W₂ vary a lot. In contrast, the noise levels of the patches in the synthetic noisy image are similar, and thus the elements of W₂ are similar. The weight matrix W₃ is basically determined by the patch structure rather than the noise, and we do not plot it here.

Figure 7: Visualization of the weight matrices W₁ and W₂ on the real-world noisy image (left) and the synthetic noisy image (right) shown in Fig. 1.

5 Conclusion

The realistic noise in real-world noisy images captured by CCD or CMOS cameras is very complex due to the various factors in digital camera pipelines, making the real-world image denoising problem much more challenging than additive white Gaussian noise removal. We proposed a novel trilateral weighted sparse coding (TWSC) scheme to exploit the noise properties across different channels and local patches. Specifically, we introduced two weight matrices into the data-fidelity term of the traditional sparse coding model to adaptively characterize the noise statistics in each patch of each channel, and another weight matrix into the regularization term to better exploit the sparsity priors of natural images. The proposed TWSC scheme was solved under the ADMM framework, and the existence and uniqueness of the solution to the involved Sylvester equation were analyzed. Experiments demonstrated the superior performance of TWSC over existing state-of-the-art denoising methods, including those designed for realistic noise in real-world noisy images.


  • [1] Zhu, L., Fu, C.W., Brown, M.S., Heng, P.A.: A non-local low-rank framework for ultrasound speckle reduction. In: CVPR. (2017) 5650–5658
  • [2] Granados, M., Kim, K., Tompkin, J., Theobalt, C.: Automatic noise modeling for ghost-free HDR reconstruction. ACM Trans. Graph. 32(6) (2013) 1–10
  • [3] Nguyen, A., Yosinski, J., Clune, J.: Deep neural networks are easily fooled: High confidence predictions for unrecognizable images. In: CVPR. (2015) 427–436
  • [4] Elad, M., Aharon, M.: Image denoising via sparse and redundant representations over learned dictionaries. IEEE Transactions on Image Processing 15(12) (2006) 3736–3745
  • [5] Mairal, J., Elad, M., Sapiro, G.: Sparse representation for color image restoration. IEEE Transactions on Image Processing, 17(1) (2008) 53–69
  • [6] Buades, A., Coll, B., Morel, J.M.: A non-local algorithm for image denoising. In: CVPR. (2005) 60–65
  • [7] Dabov, K., Foi, A., Katkovnik, V., Egiazarian, K.: Image denoising by sparse 3-D transform-domain collaborative filtering. IEEE Transactions on Image Processing 16(8) (2007) 2080–2095
  • [8] Dabov, K., Foi, A., Katkovnik, V., Egiazarian, K.: Color image denoising via sparse 3D collaborative filtering with grouping constraint in luminance-chrominance space. In: ICIP, IEEE (2007) 313–316
  • [9] Dabov, K., Foi, A., Katkovnik, V., Egiazarian, K.: BM3D image denoising with shape-adaptive principal component analysis. In: SPARS. (2009)
  • [10] Mairal, J., Bach, F., Ponce, J., Sapiro, G., Zisserman, A.: Non-local sparse models for image restoration. In: ICCV. (2009) 2272–2279
  • [11] Dong, W., Zhang, L., Shi, G., Li, X.: Nonlocally centralized sparse representation for image restoration. IEEE Transactions on Image Processing 22(4) (2013) 1620–1630
  • [12] Dong, W., Shi, G., Li, X.: Nonlocal image restoration with bilateral variance estimation: A low-rank approach. IEEE Transactions on Image Processing 22(2) (2013) 700–711
  • [13] Gu, S., Zhang, L., Zuo, W., Feng, X.: Weighted nuclear norm minimization with application to image denoising. In: CVPR, IEEE (2014) 2862–2869
  • [14] Xu, J., Zhang, L., Zuo, W., Zhang, D., Feng, X.: Patch group based nonlocal self-similarity prior learning for image denoising. In: ICCV. (2015) 244–252
  • [15] Roth, S., Black, M.J.: Fields of experts. International Journal of Computer Vision 82(2) (2009) 205–229
  • [16] Zoran, D., Weiss, Y.: From learning models of natural image patches to whole image restoration. In: ICCV. (2011) 479–486
  • [17] Burger, H.C., Schuler, C.J., Harmeling, S.: Image denoising: Can plain neural networks compete with BM3D? In: CVPR. (2012) 2392–2399
  • [18] Schmidt, U., Roth, S.: Shrinkage fields for effective image restoration. In: CVPR. (June 2014) 2774–2781
  • [19] Chen, Y., Yu, W., Pock, T.: On learning optimized reaction diffusion processes for effective image restoration. In: CVPR. (2015) 5261–5269
  • [20] Zhang, K., Zuo, W., Chen, Y., Meng, D., Zhang, L.: Beyond a Gaussian denoiser: Residual learning of deep cnn for image denoising. IEEE Transactions on Image Processing (2017)
  • [21] Liu, C., Szeliski, R., Kang, S.B., Zitnick, C.L., Freeman, W.T.: Automatic estimation and removal of noise from a single image. IEEE TPAMI 30(2) (2008) 299–314
  • [22] Lebrun, M., Colom, M., Morel, J.M.: Multiscale image blind denoising. IEEE Transactions on Image Processing 24(10) (2015) 3149–3161
  • [23] Zhu, F., Chen, G., Heng, P.A.: From noise modeling to blind image denoising. In: CVPR. (June 2016)
  • [24] Nam, S., Hwang, Y., Matsushita, Y., Kim, S.J.: A holistic approach to cross-channel image noise modeling and its application to image denoising. In: CVPR. (2016) 1683–1691
  • [25] ABSoft, N.: Neat Image. https://ni.neatvideo.com/home
  • [26] Xu, J., Zhang, L., Zhang, D., Feng, X.: Multi-channel weighted nuclear norm minimization for real color image denoising. In: ICCV. (2017)
  • [27] Xu, J., Ren, D., Zhang, L., Zhang, D.: Patch group based bayesian learning for blind image denoising. Asian Conference on Computer Vision (ACCV) New Trends in Image Restoration and Enhancement Workshop (2016) 79–95
  • [28] Xu, J., Zhang, L., Zhang, D.: External prior guided internal prior learning for real-world noisy image denoising. IEEE Transactions on Image Processing 27(6) (June 2018) 2996–3010
  • [29] Plötz, T., Roth, S.: Benchmarking denoising algorithms with real photographs. In: CVPR. (2017)
  • [30] Kervrann, C., Boulanger, J., Coupé, P.: Bayesian non-local means filter, image redundancy and adaptive dictionaries for noise removal. International Conference on Scale Space and Variational Methods in Computer Vision (2007) 520–532
  • [31] Wright, J., Yang, A., Ganesh, A., Sastry, S., Ma, Y.: Robust face recognition via sparse representation. IEEE TPAMI 31(2) (2009) 210–227
  • [32] Yang, J., Yu, K., Gong, Y., Huang, T.: Linear spatial pyramid matching using sparse coding for image classification. In: CVPR. (2009) 1794–1801
  • [33] Yang, J., Wright, J., Huang, T., Ma, Y.: Image super-resolution via sparse representation. IEEE Transactions on Image Processing 19(11) (2010) 2861–2873
  • [34] Boyd, S., Parikh, N., Chu, E., Peleato, B., Eckstein, J.: Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn. 3(1) (January 2011) 1–122
  • [35] Tibshirani, R.: Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society. Series B (Methodological) (1996) 267–288
  • [36] Khashabi, D., Nowozin, S., Jancsary, J., Fitzgibbon, A.W.: Joint demosaicing and denoising via learned nonparametric random fields. IEEE Transactions on Image Processing 23(12) (2014) 4968–4981
  • [37] Eckart, C., Young, G.: The approximation of one matrix by another of lower rank. Psychometrika 1(3) (1936) 211–218
  • [38] Leung, B., Jeon, G., Dubois, E.: Least-squares luma-chroma demultiplexing algorithm for bayer demosaicking. IEEE Transactions on Image Processing 20(7) (2011) 1885–1894
  • [39] McCullagh, P.: Generalized linear models. European Journal of Operational Research 16(3) (1984) 285–292
  • [40] Eckstein, J., Bertsekas, D.P.: On the Douglas–Rachford splitting method and the proximal point algorithm for maximal monotone operators. Mathematical Programming 55(1) (1992) 293–318
  • [41] Simoncini, V.: Computational methods for linear matrix equations. SIAM Review 58(3) (2016) 377–441
  • [42] Bartels, R.H., Stewart, G.W.: Solution of the matrix equation AX + XB = C. Commun. ACM 15(9) (1972) 820–826
  • [43] Golub, G., Van Loan, C.: Matrix Computations (3rd Ed.). Johns Hopkins University Press (1996)
  • [44] Bareiss, E.: Sylvester’s identity and multistep integer-preserving gaussian elimination. Mathematics of Computation 22(103) (1968) 565–578
  • [45] Liu, X., Tanaka, M., Okutomi, M.: Single-image noise level estimation for blind denoising. IEEE Transactions on Image Processing 22(12) (2013) 5226–5237
  • [46] Chen, G., Zhu, F., Pheng, A.H.: An efficient statistical method for image noise level estimation. In: ICCV. (December 2015)
  • [47] Sutour, C., Deledalle, C.A., Aujol, J.F.: Estimation of the noise level function based on a nonparametric detection of homogeneous image regions. SIAM Journal on Imaging Sciences 8(4) (2015) 2622–2661
  • [48] Wang, Z., Bovik, A.C., Sheikh, H.R., Simoncelli, E.P.: Image quality assessment: from error visibility to structural similarity. IEEE Transactions on Image Processing 13(4) (2004) 600–612
  • [49] Lebrun, M., Colom, M., Morel, J.M.: The noise clinic: a blind image denoising algorithm. http://www.ipol.im/pub/art/2015/125/ Accessed 28 January 2015
  • [50] Xu, J., Li, H., Liang, Z., Zhang, D., Zhang, L.: Real-world noisy image denoising: A new benchmark. CoRR abs/1804.02603 (2018)