1 Introduction

The problem of noise removal from a digitized image is one of the most fundamental in digital image processing, and various techniques have been proposed to deal with it. Among the most important methodologies are the wavelet-based image denoising methods, which have dominated the research in recent years [2, 3]. In this paper we propose a novel approach which, to our knowledge, has not been considered before: we employ the well-known and powerful tool of kernels.
In kernel methodology the notion of the Reproducing Kernel Hilbert Space (RKHS) plays a crucial role. An RKHS is a rich construct (roughly, a smooth function space equipped with an inner product), which has proven to be a very powerful tool for nonlinear processing [9, 11]. In the denoising problem, we exploit a useful property of an RKHS, the representer theorem. It states that the minimizer of any optimization task over the RKHS, with a cost function of a certain type, admits a finite representation in terms of the kernel. We recast the image denoising problem as an optimization task of this type and use the semi-parametric version of the representer theorem. The latter allows for explicit modeling of the edges in an image; in this way we can cope with the smoothness which is implicitly imposed by the "smooth" nature of the RKHS.
Although there has been some work exploring the use of kernels in the denoising problem, the methodology presented here is fundamentally different. In the kernel regression framework of Takeda et al., the original image is formulated as a Taylor series approximation around a center, and data-adaptive kernels are used, as weighting factors, to penalize distances away from that center. In a relatively similar context, kernels have been employed by other well-known denoising methods. Kernels were also used in the context of RKHS in [6, 5]. However, the obtained results were not satisfying, especially around edges. It is exactly this drawback that is addressed by our method.
2 Mathematical Preliminaries
We start with some basic definitions regarding RKHS. Let X be a non-empty set. Consider a Hilbert space H of real-valued functions defined on X, with a corresponding inner product ⟨·, ·⟩_H. We will call H a Reproducing Kernel Hilbert Space (RKHS) if there exists a function κ : X × X → R, known as the kernel, with the following two properties:

1. For every x ∈ X, κ(x, ·) belongs to H.
2. κ has the so-called reproducing property, i.e. f(x) = ⟨f, κ(x, ·)⟩_H for all f ∈ H; in particular, κ(x, y) = ⟨κ(x, ·), κ(y, ·)⟩_H.
It can be shown that the kernel generates the entire space H, i.e. H is the closure of span{κ(x, ·) : x ∈ X}. There are several kernels that are used in practice. In this work, we focus on one of the most widely used, the Gaussian kernel,

κ(x, y) = exp(−‖x − y‖² / (2σ²)),

due to some additional properties that it admits.
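The defining properties above imply that every Gram matrix built from the Gaussian kernel is symmetric and positive semi-definite. A minimal numerical sketch (NumPy; the helper names are ours, not the paper's):

```python
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel k(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return np.exp(-np.sum((x - y) ** 2) / (2.0 * sigma ** 2))

def gram_matrix(points, sigma=1.0):
    """Kernel (Gram) matrix K with K[i, j] = k(x_i, x_j)."""
    n = len(points)
    return np.array([[gaussian_kernel(points[i], points[j], sigma)
                      for j in range(n)] for i in range(n)])

# Pixel coordinates of a small 3x3 patch; the resulting Gram matrix is
# symmetric positive semi-definite, which is what makes k a valid kernel.
pts = [(i, j) for i in range(3) for j in range(3)]
K = gram_matrix(pts, sigma=1.5)
```

The positive semi-definiteness of K is exactly the condition (Mercer's condition) that guarantees the existence of the associated RKHS.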
One of the many powerful tools in kernel theory is the application of the semi-parametric representer theorem to regularized risk minimization problems:
Theorem 2.1 (Semi-parametric representer theorem). Denote by Ω : [0, ∞) → R a strictly monotonically increasing function, by X a set, and by c an arbitrary loss function. Furthermore, consider a set of M real-valued functions {ψ_j : X → R}, j = 1, …, M, with the property that the N × M matrix (ψ_j(x_i))_{i,j} has rank M. Then any f̃ := f + h, with f ∈ H and h ∈ span{ψ_1, …, ψ_M}, minimizing the regularized risk functional

c((x_1, y_1, f̃(x_1)), …, (x_N, y_N, f̃(x_N))) + Ω(‖f‖_H)

admits a representation of the form

f̃(·) = Σ_{i=1}^{N} α_i κ(x_i, ·) + Σ_{j=1}^{M} β_j ψ_j(·),    (1)

with α_i, β_j ∈ R.
Usually the regularization term takes the form Ω(‖f‖_H) = λ‖f‖²_H. In the case of the RKHS generated by the Gaussian kernel, one can show that

‖f‖²_H = Σ_{n=0}^{∞} (σ^{2n} / (n! 2^n)) ∫ (O^n f(x))² dx,    (2)

with O^{2n} = Δ^n and O^{2n+1} = ∇Δ^n, where Δ and ∇ denote the Laplacian and the gradient operators, respectively. Thus, we see that the regularization term "penalizes" the derivatives of the minimizer, of all orders. This results in a very smooth solution of the regularized risk minimization problem.
Note that, according to Theorem 2.1, the model of a function comprises two parts: one lying in the smooth RKHS, and another which gives rise to the second term in the expansion (1). It is exactly this latter term that is exploited by our method in order to explicitly model edges.
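To make the two-part structure of the expansion concrete, a toy sketch of evaluating such a semi-parametric model (the helper names and values are ours, for illustration only):

```python
import numpy as np

def gauss_k(x, y, sigma=1.0):
    """Gaussian kernel on points of R^2."""
    return np.exp(-np.sum((np.asarray(x) - np.asarray(y)) ** 2) / (2 * sigma ** 2))

def semiparametric_eval(x, centers, alpha, psis, beta, sigma=1.0):
    """Evaluate f~(x) = sum_i alpha_i k(x_i, x) + sum_j beta_j psi_j(x):
    a smooth RKHS part plus an explicit parametric part (used for edges)."""
    smooth = sum(a * gauss_k(c, x, sigma) for a, c in zip(alpha, centers))
    parametric = sum(b * psi(x) for b, psi in zip(beta, psis))
    return smooth + parametric

# Toy model: one kernel center and one parametric basis function psi(x) = x[0].
centers = [(0.0, 0.0)]
alpha = [2.0]
psis = [lambda x: x[0]]
beta = [1.0]
val = semiparametric_eval((0.0, 0.0), centers, alpha, psis, beta)
# at x = (0, 0): 2 * k((0,0),(0,0)) + 1 * 0 = 2.0
```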
| Image | Noise | noisy PSNR | Kernel Denoising | BiShrink | K-SVD | SKR | SKR | BM3D |
|-------|-------|------------|------------------|----------|-------|-----|-----|------|
| Boat  | 20%   | 18.56 dB   | 32.36 dB | 22.59 dB | 26.46 dB | 31.85 dB | 28.35 dB | 29.45 dB |
|       | 30%   | 16.77 dB   | 30.66 dB | 25.07 dB | 26.79 dB | 30.85 dB | 27.05 dB | 28.29 dB |
|       | 40%   | 15.52 dB   | 29.14 dB | 25.40 dB | 26.08 dB | 29.51 dB | 25.85 dB | 27.26 dB |
|       | 50%   | 14.55 dB   | 28.10 dB | 25.09 dB | 25.38 dB | 27.73 dB | 24.90 dB | 26.61 dB |
| Image | Noise | noisy PSNR | Kernel Denoising | BiShrink | BLS-GSM | K-SVD | SKR | SKR | BM3D |
|-------|-------|------------|------------------|----------|---------|-------|-----|-----|------|
| Lena  |       | 28.12 dB   | 33.98 dB | 34.33 dB | 35.60 dB | 35.47 dB | 32.66 dB | 35.32 dB | 35.93 dB |
|       |       | 22.14 dB   | 31.12 dB | 31.17 dB | 32.65 dB | 32.36 dB | 29.23 dB | 32.62 dB | 33.00 dB |
|       |       | 18.72 dB   | 29.11 dB | 29.35 dB | 30.50 dB | 30.30 dB | 26.60 dB | 30.71 dB | 31.21 dB |
3 Application to the Denoising Problem
Let f be the original image and g the noisy one (we consider them as continuous functions). Also, let f_{i,j} and g_{i,j} be the restrictions of f and g to the N × N square region centered at pixel (i, j) of each image accordingly (N is an odd number). Our task is to estimate f_{i,j} from the given samples of g_{i,j}. For simplicity, we drop the indices and consider f_{i,j} and g_{i,j} (which from now on will be written as f and g) as functions defined on that region (and zero elsewhere). The pixel values of the digitized image are given by sampling on the integer grid of the region, i.e. g_{k,l} = g(k, l), for k, l = 1, …, N.
We consider a set of real-valued functions of two variables suitable to represent edges; i.e., bivariate polynomials (which are controlled by their coefficients) and functions of the form ψ(x, y) = Erf(ax + by + c), where Erf is the error function,

Erf(z) = (2/√π) ∫_0^z e^{−t²} dt,

for several suitable choices of a, b and c (see figure 1). Thus, we formulate the regularized risk minimization problem as follows: minimize, over all f̃ = f + Σ_j β_j ψ_j with f ∈ H,

Σ_{k,l} |g_{k,l} − f̃(k, l)| + λ ‖f‖²_H.
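For illustration, a sketch of such edge-aware basis functions; the exact parametrization of the Erf terms here is our assumption, not necessarily the one used in the paper:

```python
import math

def erf_edge(a, b, c):
    """Hypothetical edge basis psi(x, y) = erf(a*x + b*y + c): a smoothed
    step across the line a*x + b*y + c = 0; larger |(a, b)| gives a
    sharper edge."""
    def psi(x, y):
        return math.erf(a * x + b * y + c)
    return psi

def poly_basis(coeffs):
    """Bivariate polynomial sum of c_mn * x^m * y^n, controlled by its
    coefficients, given as a dict {(m, n): c_mn}."""
    def p(x, y):
        return sum(c * x ** m * y ** n for (m, n), c in coeffs.items())
    return p

edge = erf_edge(a=5.0, b=0.0, c=0.0)            # sharp edge through x = 0
plane = poly_basis({(0, 0): 1.0, (1, 0): 0.5})  # 1 + 0.5*x
```

Since erf is odd, the edge function changes sign across the line it models, which is what lets the parametric part absorb a jump that the smooth RKHS part cannot represent.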
Taking a closer look at the regularization term, according to equation (2), one sees that we actually penalize the derivatives of the minimizer in a more influential fashion than the total variation scheme, which is often used in wavelet-based denoising and penalizes only the first-order derivatives. It turns out that in our method the use of the ℓ1 norm in the cost function, in combination with the regularization, results in sparse modeling with respect to the coefficients. It should be noted that using the ℓ1 norm in the regularization term as well leads to similar results.
The semi-parametric Theorem 2.1 ensures that the minimizer will have a finite representation of the form

f̃(·) = Σ_{k,l} α_{k,l} κ((k, l), ·) + Σ_j β_j ψ_j(·).
We solve this problem using Polyak's projected subgradient method. We fix the regularization parameter and adjust the remaining parameters so that they take small values around edges and large values in smooth areas. In particular, as the algorithm moves from one pixel to the next, it decides whether the corresponding pixel-centered region contains edges, using the mean gradient of the specific region, and then solves the corresponding minimization problem.
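To convey the flavor of the optimization step, a minimal projected subgradient sketch on a toy nonsmooth problem (this is not the paper's actual cost function):

```python
import numpy as np

def projected_subgradient(subgrad, project, x0, steps=200):
    """Polyak-style projected subgradient iteration: x <- P(x - t_k g_k),
    with diminishing step sizes t_k = 1/(k+1); returns the last iterate."""
    x = np.asarray(x0, dtype=float)
    for k in range(steps):
        g = subgrad(x)                 # any subgradient of the cost at x
        x = project(x - g / (k + 1.0)) # step, then project onto the feasible set
    return x

# Toy problem: minimize the nonsmooth f(x) = |x - 3| over the box [0, 2];
# the constrained minimizer is the boundary point x = 2.
subgrad = lambda x: np.sign(x - 3.0)
project = lambda x: np.clip(x, 0.0, 2.0)
x_star = projected_subgradient(subgrad, project, x0=np.array([0.0]))
# x_star -> [2.0]
```

The diminishing step size is what guarantees convergence for nonsmooth convex costs such as the ℓ1 data term, where plain gradient descent is not applicable.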
4 Experimental Results
Figure 2 and Tables 1, 2 show the results obtained using our algorithm on the Lena and Boat grayscale images. More experimental results, the code in C (for the proposed methodology), as well as details on the implementation may be found at http://cgi.di.uoa.gr/~stheodor/ker_den/index.htm. The results were compared with those obtained using several state-of-the-art wavelet-based denoising packages, which are available on the internet ([3, 10, 4, 1, 2]). The experiments show that the kernel approach performs equally well as the well-known BiShrink wavelet-based method in the presence of Gaussian noise. However, it significantly outperforms the other denoising methods when impulse noise or mixed noise is considered (see figure 2). This enhanced performance comes at the cost of higher complexity, contributed mainly by the per-pixel optimization step. More efficient optimization algorithms are currently being considered. Moreover, the whole setting lends itself to straightforward parallelization, when a parallel processing environment is available; this is also currently under consideration.
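For reference, the PSNR figure of merit reported in the tables can be computed as follows (a standard sketch, assuming 8-bit images with peak value 255):

```python
import numpy as np

def psnr(clean, noisy, peak=255.0):
    """PSNR in dB: 20 * log10(peak / RMSE) between a reference image and
    its degraded/denoised version."""
    clean = np.asarray(clean, dtype=float)
    noisy = np.asarray(noisy, dtype=float)
    mse = np.mean((clean - noisy) ** 2)
    return 20.0 * np.log10(peak / np.sqrt(mse))

# A flat patch with a known uniform error of 5 gray levels: MSE = 25,
# so PSNR = 20 * log10(255 / 5) ~ 34.15 dB.
a = np.full((8, 8), 100.0)
b = np.full((8, 8), 105.0)
val = psnr(a, b)
```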
5 Conclusions

A novel denoising algorithm was presented, based on the theory of RKHS. The semi-parametric representer theorem was exploited in order to cope with the smoothing around edges, which is a common problem in almost all denoising algorithms. The comparative study against other denoising techniques showed that significantly enhanced results are obtained in the cases of impulse noise and mixed noise.
References

-  A. Buades, B. Coll, and J. M. Morel. A review of image denoising algorithms, with a new one. SIAM Multiscale Modeling and Simulation, 4(2):490–530, 2005.
-  S. Cai and K. Li. Wavelet software at Brooklyn Poly. http://taco.poly.edu/WaveletSoftware/index.html.
-  K. Dabov, A. Foi, V. Katkovnik, and K. Egiazarian. Image denoising by sparse 3D transform-domain collaborative filtering. IEEE Trans. Image Process., 16(8):2080–2095, 2007.
-  M. Elad and M. Aharon. Image denoising via sparse and redundant representations over learned dictionaries. IEEE Trans. Image Process., 15(12):3736–3745, 2006.
-  K. Kim, M. O. Franz, and B. Schölkopf. Iterative kernel principal component analysis for image modeling. IEEE Trans. Pattern Anal. Mach. Intell., 27(9):1351–1366, 2005.
-  Support vector regression based image denoising. Image Vision Comput., 27:623–627, 2009.
-  B. T. Polyak. Introduction to Optimization. New York: Optimization Software, 1987.
-  J. Portilla, V. Strela, M. Wainwright, and E. P. Simoncelli. Image denoising using scale mixtures of Gaussians in the wavelet domain. IEEE Trans. Image Process., 12(11):1338–1351, 2003.
-  B. Schölkopf and A. J. Smola. Learning with Kernels. MIT Press, 2002.
-  H. Takeda, S. Farsiu, and P. Milanfar. Kernel regression for image processing and reconstruction. IEEE Trans. Image Process., 16(2):349–366, 2007.
-  S. Theodoridis and K. Koutroumbas. Pattern Recognition, 4th edition. Academic Press, 2009.