Sparse Norm Filtering

05/17/2013 · by Chengxi Ye, et al.

Optimization-based filtering smooths an image by minimizing a fidelity function while preserving edges through a sparse norm penalty on the gradients. It has obtained promising performance in practical problems such as detail manipulation, HDR compression and deblurring, and has thus received increasing attention in the fields of graphics, computer vision and image processing. This paper derives a new type of image filter, called the sparse norm filter (SNF), from optimization-based filtering. SNF has a very simple form, introduces a general class of filtering techniques, and explains several classic filters as special implementations of SNF, e.g. the averaging filter and the median filter. It has the advantages of being halo free and easy to implement, with time and memory costs comparable to those of the bilateral filter. Thus, it is more generic than a smoothing operator and can better adapt to different tasks. We validate the proposed SNF through a wide variety of applications including edge-preserving smoothing, outlier tolerant filtering, detail manipulation, HDR compression, non-blind deconvolution, image segmentation, and colorization.


1 Introduction

Image filtering plays a fundamental role in image processing, computer graphics and computer vision, and has been widely used to reduce noise and extract useful image structures. In particular, edge-preserving smoothing operations have been studied for decades and have been proven to be critical for a wide variety of applications including blurring, sharpening, stylization and edge detection.

In general, existing edge-preserving filtering techniques can be classified into the following two groups: weighted average filtering and optimization-based filtering.

Well-known techniques of weighted average filtering include anisotropic diffusion [21, 1] and bilateral filtering [28]. Anisotropic diffusion uses the gradients of each pixel to guide a diffusion process and avoids blurring across edges. The bilateral filter can be regarded as a non-local diffusion process that uses pixel intensities within a neighborhood to guide the diffusion. Both approaches can be implemented using explicit weighted averaging. Acceleration of weighted filtering has been a research hotspot in recent years [18, 23, 31, 9, 7].

Optimization-based filtering formulates edge-preserving filtering as an optimization problem that consists of a fidelity term and a penalty term [25, 5, 30]. Edge preservation is enforced by introducing a sparse norm penalty on the gradients; the cost function is therefore usually non-quadratic, and solving the system is more time-consuming [29] than weighted average filtering. Nevertheless, this framework often produces high quality results.

In this paper, we present a novel type of edge-preserving filter, called the sparse norm filter (SNF), derived from a sparse optimization problem. For each pixel, the filtering output minimizes its difference with its neighboring pixels, where the penalty is defined by a sparse norm. Although SNF is closely related to optimization-based filters and produces results of comparable quality, it is conceptually and computationally simpler.

SNF naturally preserves edges through the use of the sparse norm and is capable of producing halo-free filtering effects, a desirable property that current weighted average filtering techniques lack. We demonstrate many other favorable properties of this simple and versatile approach through a wide variety of applications. The teaser figure on the first page shows some of them: (a) smoothing and sharpening results that approximate the ℓ0 energy; note that the filtering result preserves edges and does not introduce halos; (b) sparse norm filtering used to remove salt-and-pepper noise; (c) a new way of seamless editing enabled by extending the filter to the non-sparse (ℓ2) norm. More detailed discussions of the applications are presented in the Applications section.

2 Background

One simple and classic way to smooth an image is to minimize the difference of each pixel with nearby ones, which can be formulated as

$$\min_{u_i} \sum_{j \in N(i)} (u_i - v_j)^2, \qquad (1)$$

where $v$ is the input image, $u$ is the filtering output, and $N(i)$ is the neighborhood of pixel $i$.

The solution of this optimization can be found by averaging the nearby pixels and is known as the box filter when we consider a square neighborhood

$$u_i = \frac{1}{|N(i)|} \sum_{j \in N(i)} v_j. \qquad (2)$$

The box filter can be calculated in linear time with the integral image technique [23]. However, it does not preserve the salient structures or edges in an image.
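As a reference, here is a minimal NumPy sketch of the integral-image (summed-area table) box filter; replicate padding is our own simplifying border choice, and the function name is ours:

```python
import numpy as np

def box_filter(v, radius):
    """O(n) box filter via the integral image (summed-area table).

    v      : 2-D float array (grayscale image).
    radius : half-width of the square window, so the window is (2*radius+1)^2.
    """
    size = 2 * radius + 1
    # Replicate-pad so every pixel sees a full window.
    p = np.pad(v.astype(np.float64), radius, mode='edge')
    # Summed-area table with a leading row/column of zeros.
    s = np.zeros((p.shape[0] + 1, p.shape[1] + 1))
    s[1:, 1:] = np.cumsum(np.cumsum(p, axis=0), axis=1)
    # Window sum from the four corners of the summed-area table.
    win = (s[size:, size:] - s[:-size, size:]
           - s[size:, :-size] + s[:-size, :-size])
    return win / (size * size)
```

SciPy's `uniform_filter` provides the same operation; the sketch only makes the constant-time-per-pixel trick explicit.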

Modern filtering techniques solve this problem by taking a weighted average of nearby pixels [21, 1, 28]. In the anisotropic diffusion framework, the neighborhood consists of the adjacent pixels, and the system has to be iterated tens of times to produce a good smoothing result. Most recent filtering techniques consider a larger neighborhood consisting of tens or hundreds of neighboring pixels, and the filtering is solved in one or a few rounds. Edges are preserved by constructing the weight matrix with the criterion that similar nearby pixels should be given higher weights. As an example, the bilateral filter [28] uses intensity to measure similarity and assigns weights by

$$w_{ij} = \exp\!\left(-\frac{\|x_i - x_j\|^2}{2\sigma_s^2}\right)\exp\!\left(-\frac{(v_i - v_j)^2}{2\sigma_r^2}\right), \qquad (3)$$

where $x_i$ denotes the spatial position of pixel $i$ and $\sigma_s$, $\sigma_r$ control the spatial and range support.

Edge-preserving image smoothing can also be achieved by solving the following optimization problem

$$\min_{u} \sum_i (u_i - v_i)^2 + \lambda \sum_i |\nabla u_i|^p. \qquad (4)$$

The penalty term controls the amount of smoothness of the output $u$ and the fidelity term controls the similarity with the input $v$. When $p = 2$, the optimization problem is the well-known Tikhonov regularization [27]. The explicit solution can be found by

$$u = (I + \lambda \nabla^{\mathsf T}\nabla)^{-1} v, \qquad (5)$$

where $\nabla$ denotes the discrete gradient operator in matrix form.

Since sparse norms have better tolerance for outliers than the ℓ2 norm, the optimization was later extended to total variation regularization with $p = 1$ [25] and to even sparser norms [5][30] for edge-preserving purposes. Solving these non-quadratic optimizations is more time-consuming, so variable splitting [29] is usually exploited to cast the original large optimization problem into several small sub-problems and alternately minimize each of them.

3 The Sparse Norm Filter

3.1 Definition

We propose SNF by generalizing (1), allowing the original ℓ2 norm to become a fractional norm. To preserve strong edges, we need to smooth the image while tolerating outlier pixels by assigning them lower weights. This type of adaptive weighting has been well explored in robust statistics [1], and we achieve it by exploiting sparse norms. SNF is defined by

$$\min_{u_i} \sum_{j \in N(i)} |u_i - v_j|^p. \qquad (6)$$

Minimizing this non-quadratic cost function when $p < 2$ is difficult. Especially when $p < 1$, the cost function is non-convex and conventional gradient descent-based algorithms are easily trapped in local minima.

In this paper, we consider two approximation strategies. The first strategy iteratively exploits the weighted least squares technique

$$\min_{u_i} \sum_{j \in N(i)} w_{ij}\,(u_i - v_j)^2. \qquad (7)$$

By taking the derivative, we find that the solution can be approximated by the weighted average $u_i = \sum_{j \in N(i)} w_{ij} v_j / \sum_{j \in N(i)} w_{ij}$, with $w_{ij} = |v_i - v_j|^{p-2}$. This solution can be understood as one iteration of the anisotropic diffusion process, with the diffusivity calculated at the current pixel intensities. This way of weighting naturally enforces fidelity with the input image. As in anisotropic diffusion, we can update the diffusivity each time we update the image with the weighted average filtering result. In practice, as with the bilateral filter, one iteration is usually sufficient because the diffusion is non-local. It is noteworthy that when $v_i = v_j$ the weight goes to infinity; in practice we avoid this by setting a threshold and clamping smaller pixel differences to it. We could also modify the optimization by weighting pixels according to spatial distance with a Gaussian-like weight, but we observe that treating all neighboring pixels equally is good enough in practice.
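Below is a minimal NumPy sketch of this one-pass weighted-average approximation. It assumes a grayscale float image, uses wrap-around borders via `np.roll` for brevity, and the function name and `eps` threshold are ours rather than the paper's:

```python
import numpy as np

def snf_weighted_average(v, radius=7, p=0.5, eps=1e-2):
    """One-pass weighted-average approximation of the sparse norm filter (Eq. 7).

    v      : 2-D float array (grayscale image).
    radius : half-width of the square neighborhood N(i).
    p      : sparse norm exponent (p < 2 preserves edges).
    eps    : lower clamp on |v_i - v_j| to keep weights finite.
    """
    num = np.zeros_like(v)
    den = np.zeros_like(v)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            vj = np.roll(v, shift=(dy, dx), axis=(0, 1))       # neighbor intensities
            w = np.maximum(np.abs(v - vj), eps) ** (p - 2)     # w_ij = |v_i - v_j|^(p-2)
            num += w * vj
            den += w
    return num / den
```

The `eps` clamp plays the role of the threshold mentioned above: larger values smooth more aggressively, smaller values preserve edges more strictly.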

The second strategy quantizes the solution into a set of discrete values. For each discrete value $d$, we calculate the cost $\sum_{j \in N(i)} |d - v_j|^p$ for each pixel $i$, which can be done efficiently with the box filter. We then compare the energies across the discrete values and select the minimum. A similar technique is used to approximate the median filter [31]. In this strategy, only discrete solutions at certain quantization levels are allowed, because the approximation is a brute-force search in the solution space. In practice, this strategy is preferable when images are contaminated by outliers, e.g. salt-and-pepper noise, where the first strategy would need a large number of iterations (if it converges at all) to reach a suitable solution. For example, if the center pixel is corrupted and we conduct one iteration of filtering, we will assign high weights to similar pixels that are potentially corrupted as well, so the obtained solution can be far from suitable.
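A sketch of the quantized brute-force strategy, using SciPy's `uniform_filter` as the box filter; the bin count and function name are our own choices:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def snf_brute_force(v, radius=7, p=0.5, n_bins=32):
    """Quantized brute-force minimizer of Eq. (6).

    For each candidate intensity d, the per-pixel energy
    sum_{j in N(i)} |d - v_j|^p is a box filter of |d - v|^p,
    so the whole search costs n_bins box filters.
    """
    size = 2 * radius + 1
    candidates = np.linspace(v.min(), v.max(), n_bins)
    best_energy = np.full(v.shape, np.inf)
    out = np.zeros_like(v)
    for d in candidates:
        # Box filter averages the pointwise penalty over the neighborhood.
        energy = uniform_filter(np.abs(d - v) ** p, size=size, mode='reflect')
        mask = energy < best_energy
        best_energy[mask] = energy[mask]
        out[mask] = d
    return out
```

With $b$ bins the cost is $b$ box filters, which matches the complexity discussion in the next subsection.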

Both strategies are valuable. The first strategy makes the results look natural to the eye and its effect is similar to the bilateral filter, while the second strategy can filter out outliers and its effect is similar to the median filter. In all experiments except outlier-tolerant filtering, we choose the first strategy.

3.2 Complexity

The sparse norm filter benefits from off-the-shelf acceleration methods [31, 7] and can be calculated in time linear in the number of pixels, $O(bn)$, where $b$ is the number of bins for quantization and $n$ is the number of pixels. For a grayscale image, the brute force solution can be calculated with $b$ box filters if we quantize the intensities into $b$ bins. For the weighted average solution, we can similarly quantize the center pixel intensities (in the weight term) into $b$ bins [31]; the weighted sum (numerator) and the sum of weights (denominator) can then each be calculated with box filters. In comparison, an excellent state-of-the-art filtering technique [9] uses 7 box filters. In our Matlab implementation, the box filter takes 0.04 seconds per mega-pixel, and the weighted average implementation of the sparse norm filter takes 0.5 to 1 second per mega-pixel depending on the parameter settings. Experiments are carried out on an Intel i7 3610QM CPU with 8 GB of memory. The pixel-level operations enjoy significant speedups in C++; for example, our direct single-thread C++ implementation of the box filter takes 0.01 seconds per mega-pixel. Filtering-based methods are faster than optimization-based methods [5, 30], which take 2-4 seconds per mega-pixel in the same environment.

3.3 Connections to Related Work

Optimization-based filters [25, 5, 30, 4] have been widely used in image enhancement tasks, e.g. denoising, edge-preserving smoothing and deconvolution [13, 12]. They share the form of (4), or in pixel-level notation

$$\min_{u} \sum_i |u_i - v_i|^2 + \lambda \sum_i \sum_{j \in N(i)} |u_i - u_j|^p. \qquad (8)$$

The norm of the fidelity term is usually the ℓ2 norm in existing works. SNF simplifies (8) by integrating the fidelity term into the sparse norm penalty: by setting $\lambda = 1$, changing the norm of the first term to the ℓp norm, and defining the neighborhood to contain the current pixel, the first term of (8) is absorbed into the second, and (8) reduces to the form of (6). This establishes the connection with optimization-based filtering.

It is noteworthy that in most optimization-based filters the neighborhood is small and contains only the adjacent pixels. By contrast, SNF extends the concept of neighborhood in a non-local way to potentially include many more pixels: we consider the difference of a pixel with all pixels in its neighborhood, not only its horizontal and vertical neighbors. This gives SNF an advantage over optimization-based filters: a one-pass approximation exists and is less likely to be trapped in poor local minima, thanks to the non-local diffusion.

In addition, SNF has close relationships with several well-known filters. Setting $p = 2$ reduces SNF to the averaging filter, or the box filter if we consider square neighborhoods. Setting $p = 1$ makes SNF equivalent to the median filter, which can be proved by taking the derivative of the original cost function. Letting $p$ approach 0 turns the sparse norm filter into the dominant mode filter [11].
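These special cases can be checked numerically on a toy neighborhood; in the sketch below a grid search stands in for the exact minimizer, so the printed values are approximate, and the helper name is ours:

```python
import numpy as np

def snf_scalar(values, p, grid=None):
    """Minimize sum_j |u - v_j|^p over a scalar u by grid search."""
    if grid is None:
        grid = np.linspace(values.min(), values.max(), 2001)
    energies = np.abs(grid[:, None] - values[None, :]) ** p
    return grid[np.argmin(energies.sum(axis=1))]

# A toy neighborhood: four 10s, three 12s, two large outliers.
v = np.array([10., 10., 10., 10., 12., 12., 12., 100., 100.])
print(snf_scalar(v, p=2))     # ~30.7  : the mean   (averaging/box filter)
print(snf_scalar(v, p=1))     # ~12    : the median (median filter)
print(snf_scalar(v, p=0.01))  # 10     : the mode   (dominant mode filter)
```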

4 Applications

4.1 Halo Free Edge Preserving Filtering and Detail Manipulation

Explicit filtering techniques are known to create faint light rims along strong edges, known as halo artifacts. This unrealistic effect has been widely discussed in [9, 5, 30]. This section shows that SNF can produce halo-free results. Fig. 1 compares representative edge-preserving smoothing techniques. Although all of the methods can produce high quality results, there are subtle differences. Optimization-based smoothing algorithms [5, 30] are more capable of producing halo-free results, but the output can occasionally be unexpected when the optimization is non-convex: in Fig. 1(c) the edges look overly smoothed, and ℓ0-smoothing preserves edges perfectly but also retains speckles. Traditional weighted average filtering techniques produce smoother results, but tend to produce halos near strong edges; these halos also lead to unnatural transitions in sharpening.

Figure 1: Filtering results. For each result, left: original image; middle: smoothing result; right: sharpening result. (a) Original image. (b)-(c) WLS [5] results with two parameter settings. (d)-(e) Guided image filter [9] results with two settings. (f)-(g) Bilateral filter [28] results with two settings. (h)-(i) ℓ0-smoothing [30] results with two settings. (j)-(l) SNF results with three settings.

By using SNF with $p < 2$, similar pixels are assigned larger weights than dissimilar pixels, so the filter is edge preserving. When $p$ approaches zero, the sparse norm approximates the ℓ0 energy and the filtering result exhibits no visible halo effects (Fig. 2), since pixels with different intensities are assigned much lower weights than pixels with similar intensities (Fig. 3(b)). The idea is also similar to the edge-stopping diffusion in the anisotropic diffusion framework [1].

Figure 2: Halo effects. (a) Original image. (b) Bilateral filter result. (c) Guided image filter result. (d)-(e) Our results with two parameter settings.

We use SNF to decompose an image $I$ into a base layer $B$ and a detail layer $D = I - B$. Here the base layer $B$ is the cartoon-like result of the sparse norm filter. Detail enhancement can be achieved by boosting the detail layer $D$. We demonstrate the results on a flower photo (Fig. 3(a)) by trying different combinations of the filtering radius and the norm (Fig. 4).
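A minimal sketch of the base/detail manipulation; a Gaussian blur stands in for the SNF base layer only to keep the snippet self-contained, and the boost factor `k` is a hypothetical parameter:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def boost_detail(img, k=2.5):
    """Base/detail decomposition with detail boosting.

    In practice the base layer would come from the sparse norm filter;
    a Gaussian blur is used here only as a self-contained stand-in.
    """
    base = gaussian_filter(img, sigma=4.0)        # B: cartoon-like base layer
    detail = img - base                           # D = I - B
    return np.clip(base + k * detail, 0.0, 1.0)   # boost the detail layer by k
```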

Figure 3: (a) The original flower image. (b) Weights assigned to different gradients under different norms.
Figure 4: Smoothing/sharpening using various radius/norm settings. Left half of each image: smoothing result. Right half: sharpening result by adding the detail layer.

4.2 Outlier Tolerant Filtering

Standard edge preserving filters [21, 28] are very effective for reducing Gaussian-like noise, but in the presence of extreme noise none of them is as robust as the classic median filter: the weighting can be misled by the noise itself. In comparison, the sparse norm filter offers a whole class of filters that can perform similarly to the median filter.

We take an example image from [11]. We avoid the outliers by first using brute-force search to approximate the global solution of (6) at a few discrete values [31]; this intermediate result has a quantized look (Fig. 5, row 1, columns 2-3). We then calculate the diffusivity at this approximate solution (equivalently, use it as the guidance image) and apply the one-pass weighted average filtering (7) to output a smoothed image (Fig. 5, row 2).

Figure 5: Left: original image. Middle and right: results with two parameter settings. First row: approximate solution of (6) using brute-force search. Second row: followed by a guided diffusion.

4.3 HDR Compression

HDR tone mapping is a popular application which can be achieved by compressing the base layer $B$ while keeping the detail layer $D$. In the comparison in Fig. 6, the weighted least squares (WLS) method [5], Fattal02 [6] and Durand02 [3] show visible halos near strong edges. For the sparse norm filter, we set the radius to 1/6 of the image height and conduct one pass of non-local diffusion to extract the base layer. We observe that, under the same settings, the WLS method appears to be trapped in a local minimum because the cost function is non-convex. Although Drago03 [2], Pattanaik00 [19], Mantiuk06 [17] and Reinhard05 [24] try to reduce the halo, they fail to make some details visible in their results.
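For illustration, a compact sketch of the base-compression scheme in the log-luminance domain; `gaussian_filter` is only a stand-in for the edge-preserving SNF pass described above (the paper uses a radius of about 1/6 of the image height), and the compression factor is a hypothetical parameter:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def tonemap(hdr_lum, compression=0.4, sigma=50.0, eps=1e-6):
    """Compress the base layer of the log-luminance, keep the detail layer.

    hdr_lum : 2-D array of linear HDR luminance values.
    In the paper the base layer comes from one pass of the sparse norm
    filter with a large radius; gaussian_filter is only a stand-in here.
    """
    log_lum = np.log(hdr_lum + eps)
    base = gaussian_filter(log_lum, sigma=sigma)   # base layer B
    detail = log_lum - base                        # detail layer D
    out = compression * base + detail              # compress B, keep D
    out -= out.max()                               # anchor the brightest value at 1
    return np.exp(out)                             # color ratios are reapplied afterwards
```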

Figure 6: HDR tone compression comparison. (a) Original. (b) Farbman08. (c) Durand02. (d) Drago03. (e) Fattal02. (f) Pattanaik00. (g) Mantiuk06. (h) Reinhard05. (i) SNF.

4.4 Non-blind Deconvolution

Ringing artifacts are common in deconvolution when the kernel estimate is inaccurate or when frequency nulls occur. The ringing can be significantly reduced by placing a sparse norm prior on the image gradients [13, 12]. Similarly, we place a non-local sparse norm on the gradients,

$$\min_{u} \|k * u - v\|_2^2 + \lambda \sum_i \sum_{j \in N(i)} |u_i - u_j|^p, \qquad (9)$$

where $k$ is the blur kernel and $*$ denotes convolution, and use an alternating minimization technique [29] to deconvolve the blurry image (Fig. 7). We compare our result with standard Tikhonov regularization, which uses an ℓ2 penalty on the gradients, and observe that SNF produces crisper results with fewer ringing artifacts.
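For reference, a compact sketch of the alternating (variable-splitting) minimization, simplified to local gradients and an ℓ1 shrinkage step instead of the non-local fractional norm of (9); all function names and parameter values are ours:

```python
import numpy as np

def psf_to_otf(kernel, shape):
    """Embed a small PSF in a zero image and FFT it (circular convolution)."""
    pad = np.zeros(shape)
    kh, kw = kernel.shape
    pad[:kh, :kw] = kernel
    pad = np.roll(pad, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.fft.fft2(pad)

def deconv_sparse(v, kernel, lam=2e-3, beta=1e-2, n_iter=20):
    """Half-quadratic splitting for ||k*u - v||^2 + lam * ||grad u||_1."""
    K = psf_to_otf(kernel, v.shape)
    Dx = psf_to_otf(np.array([[1.0, -1.0]]), v.shape)
    Dy = psf_to_otf(np.array([[1.0], [-1.0]]), v.shape)
    V = np.fft.fft2(v)
    denom_data = np.abs(K) ** 2
    denom_grad = np.abs(Dx) ** 2 + np.abs(Dy) ** 2
    u = v.copy()
    for _ in range(n_iter):
        # z-step: soft-threshold the gradients (shrinkage for the l1 norm).
        gx = np.real(np.fft.ifft2(Dx * np.fft.fft2(u)))
        gy = np.real(np.fft.ifft2(Dy * np.fft.fft2(u)))
        zx = np.sign(gx) * np.maximum(np.abs(gx) - lam / beta, 0.0)
        zy = np.sign(gy) * np.maximum(np.abs(gy) - lam / beta, 0.0)
        # u-step: quadratic problem, solved exactly in the Fourier domain.
        num = (np.conj(K) * V
               + beta * (np.conj(Dx) * np.fft.fft2(zx)
                         + np.conj(Dy) * np.fft.fft2(zy)))
        u = np.real(np.fft.ifft2(num / (denom_data + beta * denom_grad)))
        beta *= 2.0  # continuation: tighten the splitting over iterations
    return u
```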

Figure 7: Deconvolution comparison. (a) Input. (b) Estimated kernel. (c) Tikhonov regularization result. (d) Sparse norm deconvolution result.

4.5 Joint Filtering

The sparse norm filter can naturally incorporate a guidance (joint) image [22] to provide the filtering weights or diffusivity. Below we show a flash/no-flash denoising result, where the flash image serves as the joint image used to remove the noise in the no-flash image (Fig. 8).
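A minimal sketch of the joint variant, where the weights are computed from a guidance image `g` instead of the filtered image itself; the names are ours:

```python
import numpy as np

def joint_snf(v, g, radius=7, p=0.5, eps=1e-2):
    """Joint (guided) weighted-average sparse norm filter.

    v : image to be filtered (e.g., the noisy no-flash photo).
    g : guidance image providing the weights (e.g., the flash photo).
    """
    num = np.zeros_like(v)
    den = np.zeros_like(v)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            gj = np.roll(g, (dy, dx), axis=(0, 1))
            vj = np.roll(v, (dy, dx), axis=(0, 1))
            w = np.maximum(np.abs(g - gj), eps) ** (p - 2)  # weights from the guidance
            num += w * vj
            den += w
    return num / den
```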

Figure 8: (a) Noisy image. (b) Flash image. (c) Joint filtered image. (d) Recolored image.

4.6 Image Segmentation

The sparse norm is usually used to model the gradient profile of natural images in various computer vision models. In the following experiment (Fig. 9) we show it can also be used to accelerate the normalized cut [26] via the joint filtering technique. Since normalized cut finds the eigenvectors of a diffusion/affinity matrix, we replace the slow matrix multiplication in the eigensolver (whose cost is quadratic in the neighborhood radius) with our joint SNF, which takes constant time for any neighborhood size [32]. We use the original image as the guidance image, which easily provides a 10x-100x acceleration depending on the filtering radius. Moreover, the technique can be extended to explain and accelerate other normalized-cut related algorithms [9, 14, 16, 15, 10, 8].
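A sketch of the matrix-free idea under simplifying assumptions: the affinity matvec is a (naive, unaccelerated) joint filtering pass, and SciPy's Lanczos solver works on the symmetrically normalized operator without ever forming the affinity matrix; all names are ours:

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, eigsh

def ncut_eigenvectors(guide, radius=5, p=0.5, eps=1e-2, k=3):
    """Leading eigenvectors of D^{-1/2} W D^{-1/2} without forming W.

    The matvec x -> W x is a joint sparse-norm filtering pass, so the
    eigensolver never touches an explicit affinity matrix.
    """
    h, w = guide.shape
    n = h * w

    def filt(x_img):
        out = np.zeros_like(x_img)
        for dy in range(-radius, radius + 1):
            for dx in range(-radius, radius + 1):
                gj = np.roll(guide, (dy, dx), axis=(0, 1))
                wgt = np.maximum(np.abs(guide - gj), eps) ** (p - 2)
                out += wgt * np.roll(x_img, (dy, dx), axis=(0, 1))
        return out

    d = filt(np.ones_like(guide))          # degree d_i = sum_j w_ij
    d_isqrt = 1.0 / np.sqrt(d)

    def matvec(x):
        x_img = x.reshape(h, w)
        return (d_isqrt * filt(d_isqrt * x_img)).ravel()

    A = LinearOperator((n, n), matvec=matvec, dtype=np.float64)
    vals, vecs = eigsh(A, k=k, which='LM')  # largest eigenvectors of the normalized affinity
    return vals, vecs.reshape(h, w, k)
```

The second and later eigenvectors can then be thresholded or clustered to produce segments; in the paper, the box-filter acceleration of Section 3.2 would replace the naive offset loop above.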

Figure 9: Column 1: input image from Shi and Malik [26]. Columns 2-6: segmentation results with different settings.

4.7 Colorization

We demonstrate colorization [14] as another example of joint filtering. Colorization can be achieved by finding the stable distribution of an edge-preserving filter [32]. The guiding weight of this filter is calculated from the gray-scale image, and similar nearby pixels are assigned higher weights, which the sparse norm enforces naturally. To ensure that pixels with similar gray-scale intensities are assigned similar colors, we use this guiding weight (diffusivity) to spread the color cues obtained from the input color strokes. We use a straightforward gradient-descent algorithm to update the diffusion system; with fewer than 10 iterations we obtain high quality results (Fig. 10(d)). The same algorithm can also be used to re-color the flash image, as shown in Fig. 8(d).
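A minimal sketch of stroke-based colorization by guided diffusion; for simplicity it iterates a joint weighted average with stroke pixels clamped, rather than the paper's explicit gradient-descent update, and all names are ours:

```python
import numpy as np

def colorize(gray, strokes, mask, radius=7, p=0.5, eps=1e-2, n_iter=10):
    """Spread stroke colors over a gray-scale image via guided diffusion.

    gray    : 2-D gray-scale guidance image.
    strokes : (H, W, C) chrominance values, meaningful only where mask is True.
    mask    : boolean (H, W) array marking pixels covered by user strokes.
    """
    u = np.where(mask[..., None], strokes, 0.0)
    for _ in range(n_iter):
        num = np.zeros_like(u)
        den = np.zeros(gray.shape)
        for dy in range(-radius, radius + 1):
            for dx in range(-radius, radius + 1):
                gj = np.roll(gray, (dy, dx), axis=(0, 1))
                w = np.maximum(np.abs(gray - gj), eps) ** (p - 2)
                num += w[..., None] * np.roll(u, (dy, dx), axis=(0, 1))
                den += w
        u = num / den[..., None]
        u[mask] = strokes[mask]   # keep the user-provided colors fixed
    return u
```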

Figure 10: Colorization. (a) Input gray-scale image. (b) Input color strokes. (c) Result of Levin et al. [14]. (d) Our result, with the filtering radius set as a fraction of the image height.

4.8 Seamless Photo Editing

This acceleration technique enabled by the sparse norm filter can also be extended to the non-sparse (ℓ2) norm. Seamless editing is a popular feature in image processing. Due to inconsistent colors between the source and the target, simple drag-and-drop editing is known to create artificial boundaries. The Poisson equation is widely used to seamlessly fill a target region with a source region: guided interpolation is conducted by solving the Poisson equation $\Delta f = \operatorname{div}\mathbf{v}$ in the fill-in area $\Omega$, subject to the Dirichlet boundary condition [20], where $\mathbf{v}$ is the guidance gradient field taken from the source. Extending this equation with non-local gradients yields a high-dimensional Poisson equation and hints at a new system:

$$\sum_{j \in N(i)} (f_i - f_j) = \sum_{j \in N(i)} (g_i - g_j), \qquad i \in \Omega, \qquad (10)$$

where $g$ is the source image and the values of $f$ outside $\Omega$ are fixed to the target.

We solve the above system with the gradient-descent algorithm as well. Since the diffusion is non-local, the algorithm converges within 10 iterations, and only the neighborhood sums of $f$ need to be updated in each iteration, which can be calculated using the box filter. We compare our algorithm with the original Poisson solver: in our environment, the backslash operation in Matlab solves the Poisson equation in 3 seconds per mega-pixel, excluding the time required to construct the sparse linear system. As reported above, the box filter takes only 0.04 seconds per mega-pixel in Matlab, or 0.01 seconds in C++. The results are comparable in quality (Fig. 11).
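A minimal sketch of solving (10) with a Jacobi-style fixed point (one box filter per sweep) instead of explicit gradient descent; from (10), $f_i = \operatorname{mean}_j f_j + (g_i - \operatorname{mean}_j g_j)$, with target values held fixed outside the fill-in region. The names are ours:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def seamless_clone(target, source, mask, radius=10, n_iter=10):
    """Non-local Poisson blending, Eq. (10), via a box-filter fixed point.

    target, source : 2-D float arrays (applied per channel for color images).
    mask           : boolean array, True inside the fill-in region Omega.
    """
    size = 2 * radius + 1
    f = target.copy()
    # Source term g_i - mean_j g_j, fixed during the iterations.
    g_term = source - uniform_filter(source, size=size, mode='reflect')
    for _ in range(n_iter):
        f_new = uniform_filter(f, size=size, mode='reflect') + g_term
        f = np.where(mask, f_new, target)   # Dirichlet boundary: keep the target outside Omega
    return f
```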

Figure 11: Seamless editing. (a) Source image. (b) Target image. (c) Drag-and-drop result. (d) 1st iteration of our algorithm. (e) 3rd iteration of our algorithm. (f) Our algorithm output. (g) Poisson editing [20] output.

5 Conclusion

In this work we present a simple but fundamental filter that builds connections with various classic smoothing techniques. The sparse norm filter can be regarded as a non-local extension of optimization-based smoothing methods that admits a one-pass approximate solution via filtering. Through a variety of applications in image processing and computer vision, we demonstrate that the sparse norm filter gives new insights into popular applications and provides high quality accelerations.

References

  • [1] M. Black and G. Sapiro (1998) Robust anisotropic diffusion. IEEE Transactions on Image Processing 7 (3), pp. 421–432. Cited by: §1, §2, §3.1, §4.1.
  • [2] F. Drago, K. Myszkowski, T. Annen, and N. Chiba (2003) Adaptive logarithmic mapping for displaying high contrast scenes. Computer Graphics Forum 22 (3), pp. 419–426. Cited by: §4.3.
  • [3] F. Durand and J. Dorsey (2002) Fast bilateral filtering for the display of high-dynamic-range images. ACM Transactions on Graphics 21 (3), pp. 257–266. Cited by: §4.3.
  • [4] M. Elad (2002) On the origin of the bilateral filter and ways to improve it. IEEE Transactions on Image Processing 11 (10), pp. 1141–1151. Cited by: §3.3.
  • [5] Z. Farbman, R. Fattal, D. Lischinski, and R. Szeliski (2008) Edge-preserving decompositions for multi-scale tone and detail manipulation. ACM Transactions on Graphics 27 (3), pp. 1.1–1.10. Cited by: §1, §2, §3.2, §3.3, §4.1, §4.3.
  • [6] R. Fattal, D. Lischinski, and M. Werman (2002) Gradient domain high dynamic range compression. ACM Transactions on Graphics 21 (3), pp. 249–256. Cited by: §4.3.
  • [7] E. S. L. Gastal and M. M. Oliveira (2012) Adaptive manifolds for real-time high-dimensional filtering. ACM Transactions on Graphics 31 (4), pp. 1–13. Cited by: §1, §3.2.
  • [8] K. He, J. Sun, and X. Tang (2010) Fast matting using large kernel matting laplacian matrices. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 2165–2172. Cited by: §4.6.
  • [9] K. He, J. Sun, and X. Tang (2010) Guided image filtering. In Proceedings of European Conference on Computer Vision, pp. 1–14. Cited by: §1, §3.2, §4.1, §4.6.
  • [10] K. He, J. Sun, and X. Tang (2011) Single image haze removal using dark channel prior. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 2341–2353. Cited by: §4.6.
  • [11] M. Kass and J. Solomon (2010) Smoothed local histogram filters. ACM Transactions on Graphics 29 (4), pp. 1–10. Cited by: §3.3, §4.2.
  • [12] D. Krishnan and R. Fergus (2009) Fast image deconvolution using hyper-laplacian priors. In Advances in Neural Information Processing, Cited by: §3.3, §4.4.
  • [13] A. Levin, R. Fergus, F. Durand, and W. T. Freeman (2007) Image and depth from a conventional camera with a coded aperture. ACM Transactions on Graphics 26 (10), pp. 1141–1151. Cited by: §3.3, §4.4.
  • [14] A. Levin, D. Lischinski, and Y. Weiss (2004) Colorization using optimization. ACM Transactions on Graphics 23 (3), pp. 689. Cited by: §4.6, §4.7.
  • [15] A. Levin, D. Lischinski, and Y. Weiss (2008) A closed-form solution to natural image matting. IEEE Transactions on Pattern Analysis and Machine Intelligence 30 (2), pp. 228–242. Cited by: §4.6.
  • [16] A. Levin, A. Rav-Acha, and D. Lischinski (2008) Spectral matting. IEEE Transactions on Pattern Analysis and Machine Intelligence 30 (10), pp. 1699–1712. Cited by: §4.6.
  • [17] R. Mantiuk, K. Myszkowski, and H.-P. Seidel (2006) A perceptual framework for contrast processing of high dynamic range images. ACM Transactions on Applied Perception 3 (3), pp. 87–94. Cited by: §4.3.
  • [18] S. Paris and F. Durand (2007) A fast approximation of the bilateral filter using a signal processing approach. International Journal of Computer Vision 81 (1), pp. 24–52. Cited by: §1.
  • [19] S.N. Pattanaik, J. Tumblin, H. Yee, and D.P. Greenberg (2000) Time-dependent visual adaptation for realistic image display. In Proceedings of ACM SIGGRAPH, pp. 47–54. Cited by: §4.3.
  • [20] P. Pérez, M. Gangnet, and A. Blake (2003) Poisson image editing. ACM Transactions on Graphics 22 (3), pp. 313–318. Cited by: §4.8.
  • [21] P. Perona and J. Malik (1990) Scale-space and edge detection using anisotropic diffusion. IEEE Transactions on Pattern Analysis and Machine Intelligence 12 (7), pp. 629–639. Cited by: §1, §2, §4.2.
  • [22] G. Petschnigg, R. Szeliski, M. Agrawala, M. Cohen, H. Hoppe, and K. Toyama (2004) Digital photography with flash and no-flash image pairs. ACM Transactions on Graphics 23 (3), pp. 664. Cited by: §4.5.
  • [23] F. Porikli (2008) Constant time o(1) bilateral filtering. In Proceedings of International Conference on Computer Vision and Pattern Recognition, pp. 1–8. Cited by: §1, §2.
  • [24] E. Reinhard and K. Devlin (2005) Dynamic range reduction inspired by photoreceptor physiology. IEEE Transactions on Visualization and Computer Graphics 11 (1), pp. 13–24. Cited by: §4.3.
  • [25] L. Rudin, S. Osher, and E. Fatemi (1992) Nonlinear total variation based noise removal algorithms. Physica D: Nonlinear Phenomena 60 (1–4), pp. 259–268. Cited by: §1, §2, §3.3.
  • [26] J. Shi and J. Malik (2000) Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence 22 (8), pp. 888–905. Cited by: §4.6.
  • [27] A. N. Tikhonov, A. V. Goncharsky, V. V. Stepanov, and A. G. Yagola (1995) Numerical methods for the solution of ill-posed problems. Springer. Cited by: §2.
  • [28] C. Tomasi and R. Manduchi (1998) Bilateral filtering for gray and color images. In Proceedings of International Conference on Computer Vision, pp. 839–846. Cited by: §1, §2, §4.2.
  • [29] Y. Wang, J. Yang, W. Yin, and Y. Zhang (2008) A new alternating minimization algorithm for total variation image reconstruction. SIAM Journal on Imaging Sciences 1 (3), pp. 248. Cited by: §1, §2, §4.4.
  • [30] L. Xu, C. Lu, Y. Xu, and J. Jia (2011) Image smoothing via L0 gradient minimization. ACM Transactions on Graphics 30 (6), pp. 1–12. Cited by: §1, §2, §3.2, §3.3, §4.1.
  • [31] Q. Yang, K. Tan, and N. Ahuja (2009) Real-time o(1) bilateral filtering. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 557–564. Cited by: §1, §3.1, §3.2, §4.2.
  • [32] C. Ye, Y. Lin, M. Song, C. Chen, and D. W. Jacobs (2012) Spectral graph cut from a filtering point of view. arXiv preprint. Cited by: §4.6, §4.7.