Artifact reduction for separable non-local means

10/26/2017 ∙ by Sanjay Ghosh, et al. ∙ Indian Institute of Science

It was recently demonstrated [J. Electron. Imaging, 25(2), 2016] that one can perform fast non-local means (NLM) denoising of one-dimensional signals using a method called lifting. The cost of lifting is independent of the patch length, which dramatically reduces the run-time for large patches. Unfortunately, it is difficult to directly extend lifting for non-local means denoising of images. To bypass this, the authors proposed a separable approximation in which the image rows and columns are filtered using lifting. The overall algorithm is significantly faster than NLM, and the results are comparable in terms of PSNR. However, the separable processing often produces vertical and horizontal stripes in the image. This problem was previously addressed by using a bilateral filter-based post-smoothing, which was effective in removing some of the stripes. In this letter, we demonstrate that stripes can be mitigated in the first place simply by involving the neighboring rows (or columns) in the filtering. In other words, we use a two-dimensional search (similar to NLM), while still using one-dimensional patches (as in the previous proposal). The novelty is in the observation that one can use lifting for performing two-dimensional searches. The proposed approach produces artifact-free images, whose quality and PSNR are comparable to NLM, while being significantly faster.




1 Introduction

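We consider the problem of denoising grayscale images corrupted with additive white Gaussian noise. A popular denoising method is the non-local means (NLM) algorithm [1], where image patches are used to perform pixel aggregation. While NLM is no longer the state-of-the-art, it is still used in the image processing community due to its simplicity, decent denoising performance, and the availability of fast implementations. The NLM of an image $u = (u(\boldsymbol{i}))_{\boldsymbol{i} \in \Omega}$, where $\Omega \subset \mathbb{Z}^2$ is the image domain, is given by [1]

$$\hat{u}(\boldsymbol{i}) = \frac{\sum_{\boldsymbol{j} \in S(\boldsymbol{i})} w(\boldsymbol{i},\boldsymbol{j})\, u(\boldsymbol{j})}{\sum_{\boldsymbol{j} \in S(\boldsymbol{i})} w(\boldsymbol{i},\boldsymbol{j})}, \tag{1}$$

where $S(\boldsymbol{i}) = \boldsymbol{i} + [-S, S]^2$ is a search window around the pixel of interest $\boldsymbol{i}$. The weights are set to be

$$w(\boldsymbol{i},\boldsymbol{j}) = \exp\left(-\frac{1}{h^2} \sum_{\boldsymbol{k} \in P} \big(u(\boldsymbol{i}+\boldsymbol{k}) - u(\boldsymbol{j}+\boldsymbol{k})\big)^2\right), \tag{2}$$

where $h$ is a smoothing parameter and $P = [-K, K]^2$ is a two-dimensional patch.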
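To make the definitions in (1) and (2) concrete, here is a minimal brute-force sketch in Python. It assumes a square search window of half-width `S` and a square patch of half-width `K`; the function name `nlm_pixel`, the list-of-rows image layout, and the border clamping are illustrative assumptions rather than details from the paper. The four nested loops show why the direct per-pixel cost grows with both the window area and the patch area.

```python
import math

def nlm_pixel(u, i1, i2, S, K, h):
    """Brute-force NLM estimate at pixel (i1, i2), following (1) and (2).

    u is a grayscale image stored as a list of rows. Out-of-range indices
    are clamped to the border (an assumed boundary convention).
    """
    rows, cols = len(u), len(u[0])

    def px(r, c):
        # Clamp coordinates so that patches near the border stay defined.
        return u[min(max(r, 0), rows - 1)][min(max(c, 0), cols - 1)]

    num = den = 0.0
    for j1 in range(i1 - S, i1 + S + 1):          # search window rows
        for j2 in range(i2 - S, i2 + S + 1):      # search window columns
            # Squared distance between two (2K+1) x (2K+1) patches.
            dist = 0.0
            for k1 in range(-K, K + 1):
                for k2 in range(-K, K + 1):
                    diff = px(i1 + k1, i2 + k2) - px(j1 + k1, j2 + k2)
                    dist += diff * diff
            w = math.exp(-dist / (h * h))
            num += w * px(j1, j2)
            den += w
    return num / den  # den >= 1 because the center pixel has weight 1


# Small usage example on a synthetic 6 x 6 ramp image.
image = [[float(r + c) for c in range(6)] for r in range(6)]
print(nlm_pixel(image, 3, 3, S=2, K=1, h=10.0))
```

Because the weights are positive and normalized, the output at each pixel is a convex combination of the pixels in the search window.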

A direct implementation of (1) has a per-pixel complexity of $O(S^2K^2)$, where $S$ and $K$ denote the half-widths of the search window and the patch [1]. Several computational tricks and approximations have been proposed to speed up the direct implementation [2, 3, 4, 5, 6, 7, 8]. One particular way to speed up NLM is to use a separable approximation, which is a standard trick in the image processing literature [9, 10, 11, 12]. In separable filtering, the rows are processed first, followed by the columns (or in the reverse order). Of course, if the original filter is non-separable, then the output of separable filtering is generally different from that of the original filter, since a natural image typically contains diagonal details [12]. This is the case with NLM, since expression (2) is not separable. The present focus is on a recent separable approximation of NLM [13]. At the core of this proposal is a method called lifting, which computes the NLM of a one-dimensional signal using $O(S)$ operations per sample. In other words, the complexity of lifting is independent of the patch length $K$. Extending lifting for NLM denoising of images, however, turns out to be a difficult task. Therefore, we proposed a separable approximation, called separable NLM (SNLM) [13], in which the rows and columns of the image are independently filtered using lifting. In particular, we separately computed the “rows-then-columns” and “columns-then-rows” filtering, which were then optimally combined. The per-pixel complexity of SNLM is $O(S)$, which is a dramatic reduction compared to the $O(S^2K^2)$ complexity of NLM.

A flip side of SNLM (as is the case with other separable formulations [14]) is that vertical and horizontal stripes are often induced in the processed image. The stripes are more prominent along the last filtered dimension [14]. In SNLM, this problem was alleviated using the optimal recombination mentioned above, followed by a bilateral filter-based post-smoothing. In this work, we demonstrate that the stripes can be mitigated in the first place simply by involving the neighboring rows (or columns) in the filtering. In other words, we use a two-dimensional search (similar to classical NLM [1]), while still using one-dimensional patches (as done previously [13]). The present novelty is in the observation that one can use lifting for performing a two-dimensional search. In particular, the per-pixel complexity of the proposed approach is $O(S^2)$, where $S$ is the half-width of the search window. This is higher than our previous proposal, but still substantially lower than that of classical NLM. Importantly, the proposed approach no longer exhibits the visible artifacts that are otherwise obtained using SNLM.

The rest of the paper is organized as follows. We recall the SNLM algorithm in Section 2 and its fast implementation using lifting. We also illustrate the artifact problem with an example. The proposed solution is presented in Section 3, along with some algorithmic details. In Section 4, we report the denoising performance of our approach and compare it with classical NLM and SNLM. We end the paper with some concluding remarks in Section 5.

Figure 1: Denoising of Peppers at noise standard deviation $\sigma$. We see stripes in (c), in which both the patch and search window are one-dimensional (both are along rows). As seen in (d), the stripes can however be reduced using a two-dimensional search in place of the one-dimensional counterpart (though we still see some noise). The image obtained by further processing (d) using a two-dimensional search and one-dimensional patches (along columns) is shown in (e). The visual quality and PSNR (mentioned below each image) of (e) are comparable to NLM. In (f), we have reversed the order of processing: we first use one-dimensional patches along columns and then along rows (the search is two-dimensional). Notice that the order (RC/CR) has no visible impact on the final output. Also notice that residual stripes can be seen in SNLM.

2 Separable Non-Local Means

To set up the context, we briefly recall the SNLM algorithm [13]. Suppose we have a one-dimensional signal $f = (f(i))_{1 \le i \le N}$, corresponding to a row or column. The one-dimensional analogue of (1) is given by

$$\hat{f}(i) = \frac{\sum_{j=-S}^{S} w(i, i+j)\, f(i+j)}{\sum_{j=-S}^{S} w(i, i+j)}, \tag{3}$$

$$w(i,j) = \exp\left(-\frac{1}{h^2} \sum_{k=-K}^{K} \big(f(i+k) - f(j+k)\big)^2\right), \tag{4}$$

where $S$ and $K$ are the half-lengths of the search window and the patch, and $h$ is a smoothing parameter. In other words, both the search window and patch are one-dimensional in this case. It was observed in our previous work that the weights can be computed using $O(1)$ operations with respect to $K$. In particular, consider the matrices:

$$C(i,j) = f(i)\, f(j), \tag{5}$$

$$\bar{C}(i,j) = \sum_{k=-K}^{K} C(i+k, j+k). \tag{6}$$

We see that $\bar{C}$ is the smoothed version of $C$, obtained by box filtering along its sub-diagonals. The important observation [13] is that we can write

$$\sum_{k=-K}^{K} \big(f(i+k) - f(j+k)\big)^2 = \bar{C}(i,i) + \bar{C}(j,j) - 2\,\bar{C}(i,j). \tag{7}$$

In particular, using this so-called lifting, we can compute the patch distance using just three samples of $\bar{C}$, one multiplication, and two additions. The computational gain comes from the fact that the box filtering in (6) can be computed using $O(1)$ operations with respect to $K$ using recursions [13]. Moreover, following the observation that not all samples of $\bar{C}$ are used in (3), an efficient mechanism for computing (and storing) just the required samples was proposed [13]. The per-pixel complexity of computing (3) using lifting reduces to $O(S)$ from the brute-force complexity of $O(SK)$.

Unfortunately, extending lifting to handle two-dimensional patches turns out to be difficult. Instead, we proposed to use separable filtering, where the rows (columns) are filtered using (3), followed by the columns (rows). The two distinct outputs are then optimally combined to get the final image. In fact, the reason behind this combination was to suppress artifacts in the form of stripes arising from the separable filtering. This is demonstrated with an example in Fig. 1, where we have compared NLM, SNLM, and the proposed approach. We used bilateral filtering to remove the stripes in SNLM, at an additional cost. However, the final image still has some residual artifacts.
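The lifting idea can be checked numerically with a short sketch. The code below assumes that the lifted matrix is the box-filtered outer product of the signal with itself, that is, `Cbar[i][j]` is the sum of `f[i+k] * f[j+k]` over `k` in `[-K, K]`, which is one reading of the description above. The names `lift` and `patch_distance` are hypothetical. For simplicity the sketch builds the full matrix using prefix sums along each diagonal rather than only the banded entries, and the identity is exact only for indices at least `K` samples away from the signal boundary.

```python
def lift(f, K):
    """Box-filter the outer product f(i) * f(j) along each diagonal.

    Prefix sums give every window sum in O(1), independent of K.
    Entries whose window leaves the signal are truncated (zero padding).
    """
    n = len(f)
    Cbar = [[0.0] * n for _ in range(n)]
    for d in range(-(n - 1), n):                 # d = j - i selects a diagonal
        lo, hi = max(0, -d), min(n, n - d)       # valid range of i on it
        prefix = [0.0]
        for i in range(lo, hi):
            prefix.append(prefix[-1] + f[i] * f[i + d])
        for i in range(lo, hi):
            a = max(i - K, lo) - lo              # window start (clipped)
            b = min(i + K + 1, hi) - lo          # window end (clipped)
            Cbar[i][i + d] = prefix[b] - prefix[a]
    return Cbar


def patch_distance(Cbar, i, j):
    """Squared patch distance from three lifted samples."""
    return Cbar[i][i] + Cbar[j][j] - 2.0 * Cbar[i][j]


# Usage: compare the lifted distance with the direct sum for one pair.
f = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0, 5.0, 3.0]
K = 2
Cbar = lift(f, K)
i, j = 3, 6
direct = sum((f[i + k] - f[j + k]) ** 2 for k in range(-K, K + 1))
print(patch_distance(Cbar, i, j), direct)
```

Once `Cbar` is available, each patch distance costs a constant number of operations regardless of the patch length, which is where the saving over the direct sum comes from.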

3 Proposed Approach

We see fewer stripes in Fig. 1(d) precisely because we use a two-dimensional search. In other words, we use a cross between classical NLM and SNLM in which we use (8) for the aggregation and (4) for the weights. The two-dimensional search results in the averaging of pixels from across rows (and columns). This does not happen in SNLM, which causes the stripes to appear in Fig. 1(c).

Figure 2: Illustration of the idea behind the proposed method (see text for details).

The working of our proposal is explained in Fig. 2. The pixel of interest in this case is the pixel at position $\boldsymbol{i} = (i_1, i_2)$, marked with a red dot. The search window of length $2S+1$ is marked with a green bounding box. Two neighboring pixels at locations $\boldsymbol{j}$ and $\boldsymbol{j}'$ are marked with red dots. The former pixel is on a neighboring row, while the latter is on the same row as the pixel of interest. Similar to SNLM [13], we can consider either horizontal or vertical patches. For our example, the patches (of length $2K+1$) are aligned with the image rows; they are marked with light blue rectangles. For our proposal, the denoising at $\boldsymbol{i}$ is performed using the formula:

$$\hat{u}(\boldsymbol{i}) = \frac{\sum_{\boldsymbol{j} \in S(\boldsymbol{i})} w(\boldsymbol{i},\boldsymbol{j})\, u(\boldsymbol{j})}{\sum_{\boldsymbol{j} \in S(\boldsymbol{i})} w(\boldsymbol{i},\boldsymbol{j})}, \tag{8}$$

$$w(\boldsymbol{i},\boldsymbol{j}) = \exp\left(-\frac{1}{h^2} \sum_{k=-K}^{K} \big(u(i_1, i_2+k) - u(j_1, j_2+k)\big)^2\right), \tag{9}$$

where $\boldsymbol{i} = (i_1, i_2)$ and $\boldsymbol{j} = (j_1, j_2)$. To compute (8), we group the neighboring patches into two categories: (i) patches with row index $i_1$, e.g., the patch at $\boldsymbol{j}' = (i_1, j_2')$ in Fig. 2, and (ii) patches with a different row index, e.g., the patch at $\boldsymbol{j} = (j_1, j_2)$ in the figure. Let $f = (u(i_1, n))_{1 \le n \le N}$ and $g = (u(j_1, n))_{1 \le n \le N}$ be the $i_1$-th and $j_1$-th row, where $N$ is the length of a row (see Fig. 2). Similar to (5) and (6), we define the matrices:

$$C_{ff}(m,n) = f(m)\, f(n), \qquad C_{gg}(m,n) = g(m)\, g(n), \qquad C_{fg}(m,n) = f(m)\, g(n), \tag{10}$$

and the corresponding matrices $\bar{C}_{ff}$, $\bar{C}_{gg}$, and $\bar{C}_{fg}$, where, for example,

$$\bar{C}_{fg}(m,n) = \sum_{k=-K}^{K} C_{fg}(m+k, n+k). \tag{11}$$

As in (7), the (squared) distance between patches centered at $\boldsymbol{i}$ and $\boldsymbol{j}'$ is

$$\bar{C}_{ff}(i_2, i_2) + \bar{C}_{ff}(j_2', j_2') - 2\,\bar{C}_{ff}(i_2, j_2'). \tag{12}$$

On the other hand, the distance between patches centered at $\boldsymbol{i}$ and $\boldsymbol{j}$ is

$$\bar{C}_{ff}(i_2, i_2) + \bar{C}_{gg}(j_2, j_2) - 2\,\bar{C}_{fg}(i_2, j_2). \tag{13}$$

In other words, we can compute the distance between patches centered at $\boldsymbol{i}$ and $\boldsymbol{j}'$ using $\bar{C}_{ff}$. To compute the distance between patches centered at $\boldsymbol{i}$ and $\boldsymbol{j}$, we require the matrices $\bar{C}_{ff}$, $\bar{C}_{gg}$, and $\bar{C}_{fg}$. Moreover, using these matrices, we can compute patch distances for different $\boldsymbol{i}$, $\boldsymbol{j}$, and $\boldsymbol{j}'$, provided the row index of $\boldsymbol{i}$ and $\boldsymbol{j}'$ is $i_1$, and the row index of $\boldsymbol{j}$ is $j_1$.

Thus, an efficient way of computing (8) is to sequentially process the rows. For each row (fixed $i_1$), we compute $\bar{C}_{ff}$, $\bar{C}_{gg}$, and $\bar{C}_{fg}$, where $g$ corresponds to neighboring rows that are separated from row $i_1$ by at most $S$. We compute $2S+1$ matrices of the form $\bar{C}_{gg}$ (counting $\bar{C}_{ff}$) and another $2S$ matrices of the form $\bar{C}_{fg}$. As mentioned in Section 2, we can compute each matrix entry using $O(1)$ operations with respect to $K$. Moreover, as per the sum in (3), we only require entries within the diagonal band of width $2S+1$ of each matrix. The cost of computing the banded entries is thus $O(NS)$ for each matrix. The overall cost of processing $M$ rows is $O(MNS^2)$. The per-pixel complexity of computing (8) using the proposed approach is thus $O(S^2)$. We can efficiently compute (and store) the banded entries using the method in Section 2.2 of the original paper [13]. The main difference with SNLM is that we require a total of $4S+1$ matrices for processing each row, whereas just one matrix is required in SNLM.

As shown in Fig. 1(d), some residual noise can still be seen after the processing mentioned above. We perform a similar processing once more, except this time we use one-dimensional patches along columns. The visual quality and PSNR of the final image (Fig. 1(e)) are comparable to NLM (Fig. 1(h)). Moreover, we see from Figs. 1(e) and 1(f) that if we first use one-dimensional patches along columns and then along rows, then the outputs are similar. We empirically corroborate these observations in the next section. Therefore, we propose to first process the rows using (8) and then process the columns of the intermediate image using (8).

A precise description of the proposed approach for processing the (noisy) image along rows using lifting is provided in Algorithm 1. We then perform column processing on the intermediate image to obtain the final output of our algorithm. That is, we simply apply Algorithm 1 on the intermediate image, where we logically switch the rows and columns in the algorithm. Suppose $S_1$ and $S_2$ are the corresponding search windows for the row-aligned and column-aligned processing. Then we set the search parameter in Algorithm 1 as: $S = S_1$ for the row-aligned processing, and $S = S_2$ for the column-aligned processing.

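Data: Image $u$ of size $M \times N$, and parameters $S, K, h$.
Result: Row-processed image $\hat{u}$ of size $M \times N$ given by (8).
for $i_1 = 1, \ldots, M$ do
      % lifting
      for $n = 1, \ldots, N$ do
            $f(n) = u(i_1, n)$;
      end for
      Compute matrices $C_{ff}$ and $\bar{C}_{ff}$ using (10) and (11);
      for $j_1 = i_1 - S, \ldots, i_1 + S$ with $j_1 \neq i_1$ do
            for $n = 1, \ldots, N$ do
                  $g(n) = u(j_1, n)$;
            end for
            Compute matrices $\bar{C}_{gg}$ and $\bar{C}_{fg}$ for row $j_1$ using (10) and (11);
      end for
      % weight computation and pixel aggregation
      for $i_2 = 1, \ldots, N$ do
            Set $P = 0$ and $Q = 0$;
            % neighbors on the same row
            for $j_2 = i_2 - S, \ldots, i_2 + S$ do
                  Compute weight $w$ using (12) and (9);
                  $P = P + w\, u(i_1, j_2)$;
                  $Q = Q + w$;
            end for
            % neighbors on the other rows
            for $j_1 = i_1 - S, \ldots, i_1 + S$ with $j_1 \neq i_1$ do
                  for $j_2 = i_2 - S, \ldots, i_2 + S$ do
                        Compute weight $w$ using (13) and (9);
                        $P = P + w\, u(j_1, j_2)$;
                        $Q = Q + w$;
                  end for
            end for
            $\hat{u}(i_1, i_2) = P / Q$.
      end for
end for
Algorithm 1: Proposed processing along rows using lifting.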
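The row-processing step can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' Matlab implementation: it assumes the patch distance between two rows `f` and `g` is obtained from box-filtered products along diagonals (energy of each patch plus a cross term), replicates `K` border samples so that every patch is defined, and clips the search window at the image boundary. The names `banded_cross`, `denoise_rows`, `denoise`, `S_row`, and `S_col` are hypothetical.

```python
import math

def banded_cross(f, g, K, S):
    """band[d + S][i] = sum over k in [-K, K] of f[i+k] * g[i+d+k], |d| <= S.

    Only the diagonal band needed by the search window is computed;
    prefix sums make each entry O(1) with respect to K.
    """
    n = len(f)
    band = [[0.0] * n for _ in range(2 * S + 1)]
    for d in range(-S, S + 1):
        lo, hi = max(0, -d), min(n, n - d)
        prefix = [0.0]
        for i in range(lo, hi):
            prefix.append(prefix[-1] + f[i] * g[i + d])
        for i in range(lo, hi):
            a = max(i - K, lo) - lo
            b = min(i + K + 1, hi) - lo
            band[d + S][i] = prefix[b] - prefix[a]
    return band


def denoise_rows(u, S, K, h):
    """Two-dimensional search with one-dimensional (row-aligned) patches."""
    rows, n = len(u), len(u[0])
    # Replicate K border samples so that every patch is fully defined.
    padded = [[r[0]] * K + list(r) + [r[-1]] * K for r in u]
    # energy[r][a]: squared norm of the patch centered at padded index a.
    energy = [banded_cross(p, p, K, 0)[0] for p in padded]
    out = []
    for i1 in range(rows):
        num, den = [0.0] * n, [0.0] * n
        for j1 in range(max(0, i1 - S), min(rows, i1 + S + 1)):
            band = banded_cross(padded[i1], padded[j1], K, S)
            for i2 in range(n):
                for j2 in range(max(0, i2 - S), min(n, i2 + S + 1)):
                    a, b = i2 + K, j2 + K        # positions in padded rows
                    dist = energy[i1][a] + energy[j1][b] - 2.0 * band[b - a + S][a]
                    w = math.exp(-max(dist, 0.0) / (h * h))
                    num[i2] += w * u[j1][j2]
                    den[i2] += w
        out.append([p / q for p, q in zip(num, den)])
    return out


def transpose(u):
    return [list(col) for col in zip(*u)]


def denoise(u, S_row, S_col, K, h):
    """Rows first, then columns of the intermediate image."""
    intermediate = denoise_rows(u, S_row, K, h)
    return transpose(denoise_rows(transpose(intermediate), S_col, K, h))


# Usage on a small synthetic image.
image = [[float((3 * r + 5 * c) % 7) for c in range(6)] for r in range(5)]
result = denoise(image, S_row=2, S_col=2, K=1, h=4.0)
print(len(result), len(result[0]))
```

Column processing reuses the row routine on the transposed image, which mirrors the "logically switch the rows and columns" description. The sketch recomputes the cross-row bands for every pair of rows and does not attempt the storage optimizations of the original method.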
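| Method | House, $\sigma=5$ | House, $\sigma=10$ | House, $\sigma=20$ | House, $\sigma=30$ | House, $\sigma=50$ | Montage, $\sigma=5$ | Montage, $\sigma=10$ | Montage, $\sigma=20$ | Montage, $\sigma=30$ | Montage, $\sigma=50$ |
|---|---|---|---|---|---|---|---|---|---|---|
| Noisy | 34.1/83 | 28.1/60 | 22.1/34 | 18.6/22 | 14.2/12 | 34.2/83 | 28.1/61 | 22.1/36 | 18.6/25 | 14.1/15 |
| NLM [1] | 36.9/90 | 34.1/87 | 29.7/82 | 26.8/77 | 24.0/69 | 39.1/97 | 34.3/94 | 29.6/89 | 26.5/85 | 22.2/76 |
| Darbon et al. [4] | 36.1/90 | 31.4/75 | 26.1/51 | 22.8/36 | 18.6/21 | 38.4/89 | 30.9/76 | 25.8/52 | 22.6/38 | 18.6/24 |
| SNLM [13] | 36.6/89 | 33.6/86 | 29.4/81 | 26.5/76 | 23.7/69 | 39.3/97 | 34.6/94 | 29.7/89 | 26.7/84 | 22.5/77 |
| Proposed | 36.6/89 | 34.1/86 | 30.4/82 | 27.3/77 | 24.1/70 | 39.4/97 | 34.8/94 | 30.2/90 | 27.3/86 | 23.4/79 |
| BM3D [19] | 38.6/95 | 34.7/93 | 31.3/88 | 29.3/85 | 26.6/78 | 41.1/98 | 37.3/96 | 33.5/94 | 31.2/91 | 27.4/85 |

| Method | Boat, $\sigma=5$ | Boat, $\sigma=10$ | Boat, $\sigma=20$ | Boat, $\sigma=30$ | Boat, $\sigma=50$ | Man, $\sigma=5$ | Man, $\sigma=10$ | Man, $\sigma=20$ | Man, $\sigma=30$ | Man, $\sigma=50$ |
|---|---|---|---|---|---|---|---|---|---|---|
| Noisy | 34.1/97 | 28.1/90 | 22.1/73 | 18.6/59 | 14.2/41 | 34.1/99 | 28.1/97 | 22.1/91 | 18.6/85 | 14.1/70 |
| NLM [1] | 35.1/96 | 30.8/89 | 26.7/78 | 24.7/70 | 23.0/62 | 35.3/98 | 31.1/95 | 27.5/88 | 25.8/83 | 24.2/76 |
| Darbon et al. [4] | 34.4/97 | 30.3/94 | 25.4/82 | 22.4/71 | 18.4/53 | 35.1/97 | 30.4/98 | 25.6/95 | 22.5/90 | 18.5/80 |
| SNLM [13] | 35.0/96 | 30.7/89 | 26.6/77 | 24.5/69 | 22.7/61 | 35.3/98 | 31.0/95 | 27.2/87 | 25.4/83 | 23.8/75 |
| Proposed | 34.9/96 | 30.7/89 | 26.8/77 | 24.7/70 | 22.9/62 | 35.1/98 | 31.1/95 | 27.5/88 | 25.8/84 | 24.2/78 |
| BM3D [19] | 37.3/98 | 33.9/96 | 30.8/92 | 29.0/88 | 26.7/81 | 37.3/99 | 34.1/98 | 31.2/96 | 29.5/94 | 27.4/89 |

Table 1: Comparison of the denoising performances on various images [15] in terms of PSNR/SSIM at various noise standard deviations $\sigma$. The PSNRs are rounded to one decimal place, while the SSIMs (in %) are rounded to the nearest integer.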
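For reference, the PSNR figures in Table 1 can be reproduced in form with the standard definition, sketched below. The peak value of 255 is an assumption (the text does not state it), though it agrees with the "Noisy" row: an error of magnitude 20 at every pixel gives about 22.1 dB. The function name `psnr` and the list-of-rows image layout are illustrative.

```python
import math

def psnr(clean, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    count = len(clean) * len(clean[0])
    mse = sum(
        (a - b) ** 2
        for row_a, row_b in zip(clean, estimate)
        for a, b in zip(row_a, row_b)
    ) / count
    if mse == 0.0:
        return float("inf")  # identical images
    return 10.0 * math.log10(peak * peak / mse)


# Usage: a constant error of magnitude 20 mimics noise with sigma = 20.
clean = [[100.0, 100.0], [100.0, 100.0]]
noisy = [[120.0, 80.0], [80.0, 120.0]]
print(round(psnr(clean, noisy), 1))
```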
| Method | 7 | 10 | 12 | 7 | 10 | 12 |
|---|---|---|---|---|---|---|
| NLM [1] | 44 | 87 | 124 | 45 | 88 | 126 |
| Darbon et al. [4] | 0.33 | 0.60 | 0.84 | 0.33 | 0.62 | 0.85 |
| SNLM [13] | 0.31 | 0.39 | 0.45 | 0.32 | 0.40 | 0.46 |
| Proposed | 1.20 | 2.30 | 3.20 | 1.30 | 2.40 | 3.30 |

Table 2: Comparison of the run-time (in seconds) of the proposed approach with classical NLM, Darbon et al. [4], and SNLM [13] for a fixed image size. The computations were performed using Matlab on a 3.40 GHz Intel quad-core machine with 32 GB memory.
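As a quick arithmetic check on Table 2, the speedup of the proposed method over classical NLM can be computed column by column. The sketch below simply divides the tabulated run-times; the variable names are illustrative.

```python
# Run-times in seconds, copied from the NLM and Proposed rows of Table 2.
nlm_seconds = [44, 87, 124, 45, 88, 126]
proposed_seconds = [1.20, 2.30, 3.20, 1.30, 2.40, 3.30]

# Speedup of the proposed method over NLM for each column.
speedups = [t_nlm / t_new for t_nlm, t_new in zip(nlm_seconds, proposed_seconds)]
for s in speedups:
    print(f"{s:.1f}x")
```

With these numbers the ratios fall between roughly 35 and 39.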
Figure 3: Denoising of Man [15] at . Notice that stripes can be seen in (e) after smoothing (d) using a bilateral-filter. The PSNR/SSIM values with reference to the clean image are also provided. The result from our proposal (f) is visually similar to classical NLM (c). The runtime for NLM, SNLM, and the proposed method are 335, 1.7, and 9.8 seconds. We used the parameter settings mentioned in the main text. The PSNR/SSIM between the proposed approximation (f) and the classical NLM (c) are , whereas the values are for SNLM (d).
Figure 4: Denoising of kodim23 [16] at . The result obtained through our proposal (h) is visually similar to classical NLM (c). The runtime for NLM, SNLM, and the proposed method are 335, 1.7, and 9.8 seconds. The PSNR/SSIM between the proposed approximation (h) and the classical NLM (c) are , whereas these values are for (d) and for (g). We have zoomed the region around the beak in (c), (d), (g), and (h). We can see some artifacts in (d) and residual noise in (g); the zooms in (c) and (h) are visually indistinguishable.

4 Experiments

The denoising performance of the proposed method is compared with NLM and SNLM in Table 1. We have used standard grayscale images from [15, 16] for our experiments. The Matlab implementation used to generate the results in this section is publicly available. The search windows for the three methods were set as follows. Let $S$ be the search window for NLM (which we take as reference). Following the original proposal [13], the window for SNLM is also set as $S$. For a fair comparison with NLM, we ensure that an equal number of pixels are averaged in both methods. This is achieved if . Moreover, following [14], we set . These equations uniquely determine $S_1$ and $S_2$ (up to an integer rounding). Moreover, we normalize the smoothing parameters in (2) and (9) using the relation . For the results in Table 1, we set , , , , and .

We notice from Table 1 that the proposed approach gives comparable results in terms of PSNR and SSIM [17]. A visual comparison of the denoising results is provided in Figs. 3 and 4. We can clearly see some stripes in the images obtained using SNLM, both with and without post-processing (see the boxed areas). In contrast, there are hardly any artifacts present in the denoised image obtained using our method. A timing comparison is provided in Table 2. While the proposed method is slower than SNLM (this is the price we pay for removing the stripes), it is nevertheless significantly faster than NLM.

We note that though the method of Darbon et al. [4] is generally faster than our current proposal, its denoising performance starts deteriorating with the increase in noise variance. This is evident from Table 1 and Fig. 4. We also note that NLM and SNLM fall short of KSVD [18] and BM3D [19] in terms of denoising performance. Nevertheless, NLM continues to be of interest due to its decent denoising capability [20, 21, 22, 23], and importantly, the availability of fast approximations. As reported by other authors [24], NLM is quite effective in preserving fine details, while successfully removing noise.

5 Conclusion

We proposed a method that uses the idea of lifting from previous work [13] to perform fast non-local means denoising of images. The proposed method does not give rise to undesirable artifacts (as was the case with the original proposal), and produces images whose denoising quality and PSNR/SSIM are comparable to non-local means. While this comes at the expense of added computation, the proposed method is nevertheless much faster than non-local means. In fact, the run-times reported in Table 2 correspond to a speedup of roughly 35 to 39 times over NLM.

6 Acknowledgements

The last author was supported by a Startup Grant from IISc and EMR Grant SB/S3/EECE/281/2016 from DST, Government of India.


  • [1] A. Buades, B. Coll, and J.-M. Morel, “A non-local algorithm for image denoising,” Proc. IEEE Conference on Computer Vision and Pattern Recognition, 2, pp. 60-65 (2005).
  • [2] M. Mahmoudi and G. Sapiro, “Fast image and video denoising via nonlocal means of similar neighborhoods,” IEEE Signal Processing Letters, 12(12), pp. 839-842 (2005).
  • [3] J. Wang, Y. Guo, Y. Ying, Y. Liu, and Q. Peng, “Fast non-local algorithm for image denoising,” Proc. IEEE International Conference on Image Processing, pp. 1429-1432 (2006).
  • [4] J. Darbon, A. Cunha, T. F. Chan, S. Osher, and G. J. Jensen, “Fast nonlocal filtering applied to electron cryomicroscopy,” Proc. IEEE International Symposium on Biomedical Imaging, pp. 1331-1334 (2008).
  • [5] A. Dauwe, B. Goossens, H. Luong, and W. Philips, “A fast non-local image denoising algorithm,” Proc. SPIE Electronic Imaging, 68(12), pp. 1331-1334 (2008).
  • [6] J. Orchard, M. Ebrahimi, and A. Wong, “Efficient nonlocal-means denoising using the SVD,” Proc. IEEE International Conference on Image Processing, pp. 1732-1735 (2008).
  • [7] V. Karnati, M. Uliyar, and S. Dey, “Fast non-local algorithm for image denoising,” Proc. IEEE International Conference on Image Processing, pp. 3873-3876 (2009).
  • [8] L. Condat, “A simple trick to speed up and improve the non-local means,” Research Report, HAL-00512801, (2010).
  • [9] P. M. Narendra, “A separable median filter for image noise smoothing,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 3, pp. 20-29 (1981).
  • [10] N. Fukushima, S. Fujita, and Y. Ishibashi, “Switching dual kernels for separable edge-preserving filtering,” IEEE International Conference on Acoustics, Speech and Signal Processing, (2015).
  • [11] T. Q. Pham and L. J. Van Vliet, “Separable bilateral filtering for fast video preprocessing,” Proc. IEEE International Conference on Multimedia and Expo, (2005).
  • [12] Y. S. Kim, H. Lim, O. Choi, K. Lee, J. D. K. Kim, and C. Kim, “Separable bilateral non-local means,” Proc. IEEE International Conference on Image Processing, pp. 1513-1516 (2011).
  • [13] S. Ghosh and K. N. Chaudhury, “Fast separable nonlocal means,” SPIE Journal of Electronic Imaging, 25(2), 023026 (2016).
  • [14] E. S. Gastal and M. M. Oliveira. “Domain transform for edge-aware image and video processing,” ACM Transactions on Graphics (ToG), 30(4), 69 (2011).
  • [15] BM3D Image Database.
  • [16] KODAK Image Database.
  • [17] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, “Image quality assessment: From error visibility to structural similarity,” IEEE Transactions on Image Processing, 13(4), pp. 600-612 (2004).
  • [18] M. Elad and M. Aharon, “Image denoising via sparse and redundant representations over learned dictionaries,” IEEE Transactions on Image Processing, 15(12), pp. 3736-3745 (2006).
  • [19] K. Dabov, A. Foi, V. Katkovnik, and K. Egiazarian, “Image denoising by sparse 3-D transform-domain collaborative filtering,” IEEE Transactions on Image Processing, 16(8), pp. 2080-2095 (2007).
  • [20] J. M. Batikian and M. Liebling, “Multicycle non-local means denoising of cardiac image sequences,” IEEE International Symposium on Biomedical Imaging, pp. 1071-1074 (2014).
  • [21] C. Chan, R. Fulton, R. Barnett, D.D. Feng, and S. Meikle, “Post-reconstruction nonlocal means filtering of whole-body PET with an anatomical prior,” IEEE Transactions on Medical Imaging, 33(3), pp. 636-650 (2014).
  • [22] G. Chen, P. Zhang, Y. Wu, D. Shen, and P.T. Yap, “Collaborative non-local means denoising of magnetic resonance images,” IEEE International Symposium on Biomedical Imaging, pp. 564-567 (2015).
  • [23] D. Zeng, J. Huang, H. Zhang, Z. Bian, S. Niu, Z. Zhang, Q. Feng, W. Chen, and J. Ma, “Spectral CT image restoration via an average image-induced nonlocal means filter,” IEEE Transactions on Biomedical Engineering, 63(5), pp. 1044-1057 (2016).
  • [24] G. Treece, “The bitonic filter: linear filtering in an edge-preserving morphological framework,” IEEE Transactions on Image Processing, 25(11), pp. 5199-5211 (2016).