# A Non-Local Means Filter for Removing the Poisson Noise

A new image denoising algorithm to deal with the Poisson noise model is given, which is based on the idea of Non-Local Mean. By using the "Oracle" concept, we establish a theorem to show that the Non-Local Means Filter can effectively deal with Poisson noise with some modification. Under the theoretical result, we construct our new algorithm called Non-Local Means Poisson Filter and demonstrate in theory that the filter converges at the usual optimal rate. The filter is as simple as the classic Non-Local Means and the simulation results show that our filter is very competitive.

## Authors

• 8 publications
• 4 publications
• 4 publications
• ### Robust non-local means filter for ultrasound image denoising

This paper introduces a new approach to non-local means image denoising....
09/26/2017 ∙ by Hamid Reza Shahdoosti, et al. ∙ 0

• ### NAMF: A Non-local Adaptive Mean Filter for Salt-and-Pepper Noise Removal

In this paper, a non-local adaptive mean filter (NAMF) is proposed, whic...
10/17/2019 ∙ by Houwang Zhang, et al. ∙ 15

• ### An algorithm for improving Non-Local Means operators via low-rank approximation

We present a method for improving a Non Local Means operator by computin...
11/20/2014 ∙ by Victor May, et al. ∙ 0

• ### Non-local Operational Anisotropic Diffusion Filter

High-frequency noise is present in several modalities of medical images....
12/11/2018 ∙ by Fábio A. M. Cappabianco, et al. ∙ 0

• ### On a non-local spectrogram for denoising one-dimensional signals

In previous works, we investigated the use of local filters based on par...
11/13/2013 ∙ by Gonzalo Galiano, et al. ∙ 0

• ### Oracle inequalities and minimax rates for non-local means and related adaptive kernel-based methods

This paper describes a novel theoretical characterization of the perform...
12/19/2011 ∙ by Ery Arias-Castro, et al. ∙ 0

• ### Monte Carlo non local means: Random sampling for large-scale image filtering

We propose a randomized version of the non-local means (NLM) algorithm f...
12/27/2013 ∙ by Stanley H. Chan, et al. ∙ 0

##### This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

## 1 Introduction

Noise is inevitable in any image device. A digital imaging system consists of an optical system followed by a photodetector and associated electrical filters. The photodetector converts the incident optical intensity to a detector current, i.e. photons to electrons. During the process, the true signals are contaminated by many different sources of noise. The Poisson noise appears in low-light conditions when the number of collected photons is small, such as night vision, medical imaging, underwater imaging, microscopic imaging, optical microscopy imaging and astronomy imaging. Such a noise is signal-dependent, and requires to adapt the usual denoising approaches.

The key challenge in Poisson intensity estimation problems is that the variances of the observed counts are different. As a result, many methods are introduced to transform the Poisson distributed noise to the data approximately Gaussian and homoscedastic. These methods are called Variance Stabilizing Transformations (VST), such as Anscombe root transformation (1948

ANSCOMBE1948TRANSFORMATION , and 1993 borovkov2000estimates ), multiscal VSTs (2008 ZHANG2008WAVELETS ), conditional variance stabilization (CVS) (2006 jansen2006multiscale ), or Haar-Fisz transformation (2004 FRYZLEWICZ2004HAAR and 2007 FRYZLEWICZ2007GOES ). Then we can deal with these data as Gaussian noise. Second, the noise is removed using a conventional denoising algorithm for additive white Gaussian noise, see for example Buades, Coll and Morel (2005 buades2005review ), Kervrann (2006 kervrann2006optimal ), Aharon and Elad and Bruckstein (2006 aharon2006rm ), Hammond and Simoncelli (2008 hammond2008image ), Polzehl and Spokoiny (2006 polzehl2006propagation ), Hirakawa and Parks (2006 hirakawa2006image ), Mairal, Sapiro and Elad (2008 mairal2008learning ), Portilla, Strela, Wainwright and Simoncelli (2003 portilla2003image ), Roth and Black (2009 roth2009fields ), Katkovnik, Foi, Egiazarian, and Astola (2010 Katkovnik2010local ), Dabov, Foi, Katkovnik and Egiazarian (2006 buades2009note ), Abraham, Abraham, Desolneux and Li-Thiao-Te (2007 Abraham2007significant ), and Jin, Grama and Liu (2011 JinGramaLiuowf ). After denoising, some inverse transformations, like Exact Unbiased Inverse (EUI) (2009 MAKITALO2009INVERSION and 2011 makitalo2011optimal ), are applied to the denoised signal, obtaining the estimate of the signal of interest. Many authors restore the Poisson noise by this type of methods with a three-step procedure (see boulanger2008non ; ZHANG2008WAVELETS ; lefkimmiatis2009bayesian ; luisier2010fast ).

Maxmum Likelihood (ML) estimation (1996 van1996comparing , 2009moon2009three ) and Similarity Measure (SM) (2006 alter2006intensity ) are also found to be effective since they can account for the special properties of the Poisson distribution. Others methods like as Complexity-Penalized Likelihood Estimation (CPLE) (2000 nowak2000statistical , 2005 kolaczyk2005multiscale ) and Total Variation (TV) seminorm (2009 beck2009fast ), have been introduced to deal with the Poisson noise. Le et al. ((le2007variational, )) have adapted the successful ROF model for total variation regularization to deal with Poisson noise. The gradient descent iteration for this model replaces the regularization parameter with a function.

The Non-Local Means Filter has been proposed by Buades et al (2005 buades2005review ) to denoise images damaged by additive white Gaussian noise. It is based on the similarity phenomenon existing very often in natural images, and assumes that there is enough redundant information (pixels having identical noise-free value) in the image to reduce the noise significantly. This filter is known to efficiently reduce the noise and to preserve structures. Some authors (see 2008 boulanger2008non , 2010 deledalle2010poisson ) combine the Non-Local Means method with other methods to restore the Poisson noise. Deledalle et al. (2010 (deledalle2010poisson, )) proposed an extension of the Non-Local Means for images damaged by Poisson noise. It is based on probabilistic similarities to compare noisy patches and patches of a pre-estimated image.

In this paper, a new image denoising algorithm to deal with the Poisson noise model is given, which is based on the idea of Non-Local Mean. Our main idea is as follows: we first obtain an ”Oracle” estimator by minimized a very tight upper bound of the Mean Square Error with changing the size of search window. The ”Oracle” estimator depends on the unknown target function (original image), whose concept is developed in Donoho and Johnstone donoho1994ideal . So the ”Oracle” estimator is not computable, but it can help us to find an available algorithm in mathematic theory. We second establish a theorem by the concept of the ”Oracle” to show that the Non-Local Means Filter can effectively deal with Poisson noise with some modification. Finally, replacing the unknown target function by some estimators, we construct our new algorithm called Non-Local Means Poisson Filter and demonstrate in statistic theory that the filter converges at the usual optimal rate. The filter is as simple as the classic Non-Local Means and the simulation results show that our filter is very competitive.

The remainder of this paper is organized as follow: we first introduce an ”Oracle” estimator for Poisson noise based on the idea of Non-Local Means, and present a theorem to show the rate of convergence of the ”Oracle” estimator in Section 2. We second construct an adaptive estimator according to the ”Oracle” estimator and obtain some convergence theorems of the estimator in Section 3. Finally, we demonstrate in Section 4 the ability of approach at restoring image contaminated by Poisson noise with a brief analysis.

## 2 The ”Oracle” estimator

### 2.1 Some notations

We suppose that the original image of the object being photographed is a integrable two-dimensional function , . Let the mean value of in a set be

 Λ(Bx)=N2∫Bxf(t)dt.

Typically we observe a discrete dataset of counts }, where

is a Poisson random variable of intensity

. We consider that if then is independent of . Suppose that , and . Then is a partition of the square . Using this partition we get a discrete function , . The denoising algorithm aims at estimating the underlying intensity profile discrete function . The image function is considered to be constant on each , . Therefore , . Furthermore, we can estimate the integrable function by the discrete function . Let

 Y(x)=N(Bx),x∈I. (1)

This model has been used effectively in many contexts. The Poisson noise model can be rewritten in the regression form

 Y(x)=f(x)+ϵ(x),x∈I, (2)

where . It is easy to see that and .

Let us set some notations to be used throughout the paper. The Euclidean norm of a vector

is denoted by The supremum norm of is denoted by The cardinality of a set is denoted . For a positive integer the uniform grid of pixels on the unit square is defined by

 I={1N,2N,⋯,N−1N,1}2. (3)

Each element of the grid will be called pixel. The number of pixels is For any pixel and a given the square window of pixels

 Ux0,h={x∈I:∥x−x0∥∞≤h} (4)

will be called search window at We naturally take as a multiple of ( for some ). The size of the square search window is the positive integer number

 M=(2Nh+1)2=card Ux0,h. (5)

For any pixel and a given a second square window of pixels will be called patch at . Like , the parameter is also taken as a multiple of . The size of the patch is the positive integer

 m=(2Nη+1)2=card Ux0,η. (6)

### 2.2 The Non-Local Means algorithm

The Non-Local Means algorithm (2005 buades2005review ) can be described as follows. For any ,

 ˜f=∑x∈Iw(x)Y(x), (7)

where the weights are given by

 w(x)=e−˜ρ2x0(x)/H2/∑x′∈Ie−˜ρ2x0(x′)/H2, (8)

with

 ˜ρ2x0=∑y∈Ux0,ηκ(y)|Y(y)−Y(Txy)|2∑y′∈Ux0,ηκ(y′).

Here is a bandwidth parameter, is given by (4), are some fixed kernel, and is the translation mapping:

 Tx:x0+y→x+y (9)

In practice the bandwidth parameter is often taken as a linear function of (see buades2005review ).

### 2.3 Oracle estimator

In order to adapt the Non-Local Means algorithm to the Poisson noise, we introduce an ”Oracle” estimator (for details on this concept see Donoho and Johnstone (1994 donoho1994ideal )). Denote

 f∗h=∑x∈Ux0,hw∗hY(x), (10)

where

 w∗h(x)=e−ρ2f,x0(x)H2(x0)/∑x′∈Ux0,he−ρ2f,x0(x′)H2(x0) (11)

with

 ρf,x0(x)≡|f(x)−f(x0)|, (12)

is a control function subject to

 γ=inf{H(x):x∈I}>0. (13)

It is obvious that

 ∑x∈Ux0,hw∗h(x)=1andw∗h(x)≥0. (14)

Note that the function characterizes the similarity of the image brightness at the pixel with respect to the pixel , therefore we shall call similarity function. The usual bias-variance decomposition (cf. e.g. (mandel1982use, ; terrell1992variable, ; fan1993local, )) of the Mean Squared Error (MSE)

 E(f(x0)−f∗h(x0))2 = ⎛⎜⎝∑x∈Ux0,hw∗h(x)(f(x)−f(x0))⎞⎟⎠2+∑x∈Ux0,hw∗h(x)2f(x) ≤ ⎛⎜⎝∑x∈Ux0,hw∗h(x)|f(x)−f(x0)|⎞⎟⎠2+∑x∈Ux0,hw∗h(x)2f(x).

The inequality (2.3) combining with (12) implies the following upper bound

 E(f(x0)−f∗h(x0))2≤g(w∗h(x)), (16)

where

 g(w)=⎛⎜⎝∑x∈Ux0,hw(x)ρf,x0(x)⎞⎟⎠2+∑x∈Ux0,hw(x)2f(x). (17)

We shall define a family of estimates by minimizing the function by changing the width of the search window. With a Poisson noise in low-light conditions, the upper bound of signal function is small, so we let . According to the similarity phenomenon existing very often in natural images, we suppose that the function satisfies the local Hölder condition

 |f(x)−f(y)|≤L∥x−y∥β∞,∀x,y∈Ux0,h+η, (18)

where and are constants, , and The following theorem gives the rate of convergence of the ”Oracle” estimator and the proper width of the search window.

###### Theorem 2.1

Assume that and . Suppose that the function satisfies the local Hölder condition (18) and be given by (10). Then

 E(f∗h(x0)−f(x0))2≤c0n−2β2β+2, (19)

where

 c0=22β+62β+2Γ2β2β+2L42β+2β2β2β+2. (20)

For the proof of this theorem see Section 6.1.

This theorem shows that at least from the practical point of view, it is justified to optimize the upper bound instead of optimizing the risk itself. The theorem also justifies that we can choose a small search window in place of the whole observed image to estimate a point, without loss of visual quality. That is why we only consider small search windows for the simulations of our algorithm.

## 3 Non-Local Means Poisson Filter

### 3.1 Construction of Non-Local Means Poisson Filter

With the theory of ”Oracle” estimator, we construct the Non-Local Means Poisson Filter. Let and be fixed numbers. Since , an obvious estimator of is given by

 1M∑y∈Ux0,η|Y(y)−Y(Txy)|2,

where is given by (9), and is estimated by , where

 ¯¯¯f(x0)=1M∑x∈Ux0,hf(x).

Define an estimated similarity function by

 ˆρ2x0(x)=⎛⎜⎝1M∑y∈Ux0,η|Y(y)−Y(Txy)|2−2¯¯¯f(x0)⎞⎟⎠+. (21)

The following theorem implies that it is reasonable to let be the estimator of .

###### Theorem 3.1

Assume that and . Suppose that the function satisfies the local Hölder condition (18) and is given by (21). Then there is a constant such that

 P{maxx∈Ux0,h∣∣ˆρ2x0(x)−ρ2f,x0(x)∣∣≥c2nα−12√lnn}≤O(n−1). (22)

For the proof of this theorem see Section 6.2.

As a result, it is natural to define an adaptive estimator by

 ˆfh(x0)=∑x∈Ux0,hˆwh(x)Y(x), (23)

where

 ˆwh=e−ˆρ2x0(x)H2/∑x′∈Ux0,he−ˆρ2x0(x′)H2. (24)

and given by (4).

### 3.2 Convergence theorem of Non-Local Means Poisson Filter

Now, we turn to the study of the convergence of the Non-Local Means Poisson Filter. Due to the difficulty in dealing with the dependence of the weights we shall consider a slightly modified version of the proposed algorithm: we divide the set of pixels into two independent parts, so that the weights are constructed from the one part, and the estimation of the target function is a weighted mean along the other part. More precisely, we split the set of pixels into two parts for any where

 I′x0={x0+(iN,jN)∈I:i+jiseven},

and

Define an estimated similarity function by

 ˆρ′2x0(x)=⎛⎜⎝1cardU′′x0,η∑y∈U′′x0,η|Y(y)−Y(Txy)|2−2¯¯¯f′(x0)⎞⎟⎠+,x∈U′x0,h (25)

where

 ¯¯¯f′(x)=1cardU′′x0,h∑y∈U′′x0,hY(y), (26)

and with given by (4). The adaptive estimator is denoted by

 ˆf′h(x0)=∑x∈U′x0,hˆw′h(x)Y(x), (27)

where and

 ˆw′h=e−ˆρ′2x0(x)H2(X0)/∑x′∈U′x0,he−ˆρ′2x0(x′)H2(x0). (28)

In the next theorem we prove that the Mean Squared Error of the estimator converges at the rate which is the usual optimal rate of convergence for a given Hölder smoothness (see e.g. Fan and Gijbels (1996 FanGijbels1996 )).

###### Theorem 3.2

Let , , and . Suppose that the function f satisfies the Hölder condition (18) and is given by (25). Then

 E(ˆf′h(x0)−f(x0))2≤c4n−2β2β+2, (29)

where

 c4=8⎛⎜⎝22β+62β+2Γ2β2β+2L42β+2β2β2β+2⎞⎟⎠2.

For the proof of this theorem see Section 6.2.

## 4 Simulation results

### 4.1 Computational algorithm

Throughout the simulations, we use the following algorithm to compute the Non-Local Means Poisson estimator The input values of the algorithm are (the image), and two numbers and are given by (6) and (5) respectively. In order to improve the results, we introduce a smoothed version of the estimated similarity distance

 ˆρ2κ,x0(x)=⎛⎜ ⎜ ⎜⎝∑y∈Ux0,ηκ(y)|Y(y)−Y(Txy)|2−2¯¯¯f(x0)∑y′∈Ux0,ηκ(y′)⎞⎟ ⎟ ⎟⎠+. (30)

As smoothing kernels we use the Gaussian kernel

 κg(y,hg)=exp(−N2∥y−x0∥222hg), (31)

where is a constant, and the following kernel: for ,

 κ0(y)=Nη∑k=max(1,j)1(2k+1)2, (32)

if for some . We shall also use the rectangular kernel

 κr(y)={1cardUx0,η,y∈Ux0,η,0,otherwise. (33)

For the simulation we use the kernel defined by (32). We have seen experimentally that when we take the filtering function as , where is a constant depending on the character of the image, to obtain a denoising of high visual quality. We mention that throughout the paper we symmetrize images near the boundary.

Algorithm Non-Local Means Poisson Filter (NLMPF)

Let be the parameters.

Repeat for each

- compute

Step 1

Step 2

If

compute

else

Note: we take .

### 4.2 Numerical performance of the Non-Local Means Poisson Filter

By simulations we found that the images with brightness between and (like Barbara) are well denoised by the first step, but for the low count levels images (with brightness less than ), the restored images by NLMPF are not smooth enough (see Figure 1). This explains why for the low count level images, we smooth the restored images by step 2.

Our experiments are done in the same way as in ZHANG2008WAVELETS and MAKITALO2009INVERSION to produce comparable results; we also use the same set of test images (all of in size): Spots , Galaxy , Ridges , Barbara , and Cells . The authors of ZHANG2008WAVELETS and MAKITALO2009INVERSION kindly provided us with their programs and the test images. A matlab implementation of the algorithms derived in this paper is available online. This unoptimized implemen-tation processes the set of test images seconds with a search window of size and patches of size , seconds with a search window of size and patches of size . The computational time is of about 10s per iteration on a image and Matlab on an Intel Pentium Dual CPU T3200 32-bit @ 2.00GHz CPU 3.00GHz.

Table 1 shows the NMISE values of images reconstructed by NLMPF, OWPNF (jin2012new, ), Poisson NLM (deledalle2010poisson, ), EUI+BM3D makitalo2011optimal , MS-VST+ ZHANG2008WAVELETS and MS-VST+B3 ZHANG2008WAVELETS . Our algorithm reach the best in the case of Galaxy, while OWPNF reach the best in the case of Spots; for Ridges, Barbara, and Cells, the method EUI+BM3D gives the best results, but our method is also very competitive. Table 2 shows the PSNR values of images reconstructed. Our algorithm also reach the best in the case of Galaxy. The method EUI+BM3D have the highest PSNR value. However, the most important evaluation criteria is the visual quality of restored image. Figures 2- 6 illustrate the visual quality of these denoised images. It is obvious that The visual quality of the outputs of our method have high visual quality and many details Remained. For example, in the case of restored images of Spots (cf. Figures 2), our algorithm and OWPNF remain most spots. We can see clearly spots at the third column (from left) in Figures 2 (c), while EUI+BM3D just remains spots, Poisson NLM Makes several spots sticking together, the images restored by MS-VST + 7/9 and MS-VST + B3 are not smooth enough. In the case of Galaxy (cf. Figures 3), visually, our algorithm best preserves the fine textures. In the other case, our method also lead to good result visually.

## 5 Conclusion

In this paper, we have present a new image denoising algorithm to deal with the Poisson noise model, which is based on the idea of Non-Local Mean. The ”Oracle” estimator is obtained by minimized a very tight upper bound of the Mean Square Error with changing the size of search window. It help to establish a theorem to show that the Non-Local Means Filter can effectively deal with Poisson noise with some modification. As a result, we successfully construct the new algorithm called Non-Local Means Poisson Filter and demonstrate in statistic theory that the filter converges at the usual optimal rate. The filter is as simple as the classic Non-Local Means and the simulation results show that our filter is very competitive. The idea of how to construct an algorithm for Poisson noise model is creative. With our idea, many algorithms to Remove Gaussian Noise could deal with the Poisson noise with some modification.

## 6 Appendix: Proofs of the main results

### 6.1 Proof of Theorem 2.1

Denoting for brevity

 (34)

and

 I2=f(x0)∑x∈Ux0,h(w∗h(x))2=f(x0)∑∥x−x0∥∞≤he−2ρ2f,x0(x)H2(x0)⎛⎜ ⎜⎝∑∥x−x0∥∞≤he−ρ2f,x0(x)H2(x0)⎞⎟ ⎟⎠2, (35)

then we have

 g(w∗h)=I1+I2. (36)

The conditions (13) and imply that for , we have

 L2∥x−x0∥2β∞H2(x)≤L2h2βγ2≤12. (37)

Noting that , is decreasing, and using one term Taylor expansion, the inequality (37) implies that

 ∑∥x−x0∥∞≤he−ρ2f,x0(x)H2(x0) ≥ ∑∥x−x0∥∞≤he−L2∥x−x0∥2β∞H2(x0)≥∑∥x−x0∥∞≤h(1−L2∥x−x0∥2β∞H2(x0)) (38) ≥ 2h2n.

Considering that , is increasing function,

 ∑∥x−x0∥∞≤he−ρ2f,x0(x)H2(x0)ρf,x0(x) ≤ ∑∥x−x0∥∞≤hL∥x−x0∥β∞e−L2∥x−x0∥2β∞H2(x0) (39) ≤ ∑∥x−x0∥∞≤hL∥x−x0∥β∞≤4Lhβ+2n.

The above three inequalities (34), (38) and (39) imply that

 I1≤4L2h2β. (40)

Taking into account the inequality

 ∑∥x−x0∥∞≤he−2ρ2f,x0(x)H2(x0)≤∑∥x−x0∥∞≤h1=4h2n,

(35) and (38), it is easily seen that

 I2≤Γh2n. (41)

Combining (36), (40), and (41), we give

 g(w∗h)≤4L2h2β+Γh2n. (42)

Let minimize the latter term of the above inequality (42). Then

 8βL2h2β−1−2Γh3n=0

from which we infer that

 h=(Γ4βL2)12β+2n−12β+2. (43)

Substituting (43) to (42) leads to

 g(w∗h)≤22β+62β+2Γ2β2β+2L42β+2β2β2β+2n−2β2β+2.

Therefore (47) implies (19).

### 6.2 Proof of Theorem 3.1

We shall use following lemma to finish the Proof of Theorem 3.1. The lemma can be deduced form the results in Borovkov borovkov2000estimates , see also Merlevede, Peligrad and Rio merleve2010bernstein

###### Lemma 1

If, for some and we have

 supEexp(δ|Xi|γ)≤K,i=1,...,n,

then there are two positive constants and depending only on and such that, for any

 P(n∑i=1Xi≥t)≤exp(−c1t2/n)+nexp(−c2tγ).

Proof of Theorem 3.1. Recall that and is given by (5) and (6) respectively. Consider that

 |ˆρ2x0(x)−ρ2f,x0(x)|≤∣∣∣1mS(x)+1mR(x)∣∣∣, (44)

where

 S(x)=∑y∈Ux0,ηZ(y),
 Z(y)=(Y(y)−Y(Txy))2−(f(y)−f(Txy))2−f(y)−f(Txy), (45)

and

 R(x)=∑y∈Ux0,η((f(y)−f(Txy))2+f(y)+f(Txy))−(f(x0)−f(x))2−2¯¯¯f(x0).

Since has the Poisson distribution, with mean and variance ,

 E(eY(x))=+∞∑k=0ekfk(x)e−f(x)k!=ef(x)e(e−1)f(x)≤eΓe(e−1)Γ. (46)

From the inequality (46), we easily deduce

 supE(e|Z(y)|1/2)≤supE(eY(y)+Y(Txy)+2Γ+2√Γ)≤(eΓ)2e2eΓ+2√Γ (47)

By Lemma 1, we see that there are two positive constants and such that for any ,

 P(1m|S(x)|≥z√m)≤exp(−c5z2)+mexp(−c6(√mz)12). (48)

Considering and , we have . Therefore, substituting into the inequality (48), we see that for large enough,

 P⎛⎜ ⎜⎝1m|S(x)|≥√1c5lnn2√m⎞⎟ ⎟⎠≤2exp(−lnn2)=2n2.

From this inequality we easily deduce that

 P⎛⎜ ⎜⎝maxx∈Ux0,h1m|S(x)|≥√1c5lnn2√m⎞⎟ ⎟⎠≤∑x∈Ux0,hP⎛⎜ ⎜⎝1m|S(x)|≥√1c5lnn2√m⎞⎟ ⎟⎠≤2n.

We arrive at

 P(B)≤c7n−1, (49)

where and is a constant depending only on and . It is easy to see that

 R(x)=O(nα−12). (50)

In the set , the inequality (50) implies that

 maxx∈Ux0,h|ˆρ2x0(x)−ρ2f,x0