Image filtering plays a fundamental role in image processing, computer graphics and computer vision, and has been widely used to reduce noise and extract useful image structures. In particular, edge-preserving smoothing operations have been studied for decades and have been proven to be critical for a wide variety of applications including blurring, sharpening, stylization and edge detection.
In general, existing edge-preserving filtering techniques can be classified into the following two groups: weighted average filtering and optimization-based filtering.
Well-known weighted average filtering techniques include anisotropic diffusion [21, 1] and bilateral filtering. Anisotropic diffusion uses the gradients of each pixel to guide a diffusion process and avoids blurring across edges. The bilateral filter can be regarded as a non-local diffusion process that uses pixel intensities within a neighborhood to guide the diffusion. Both approaches can be implemented using explicit weighted averaging. Accelerating weighted average filtering has been an active research topic in recent years [18, 23, 31, 9, 7].
Optimization-based filtering formulates edge-preserving filtering as an optimization problem that consists of a fidelity term and a penalty term [25, 5, 30]. Edge preservation is enforced by introducing a sparse norm penalty on the gradients. The cost function is therefore usually non-quadratic, and solving the system is more time-consuming than weighted average filtering. Nevertheless, this framework often produces high-quality results.
In this paper, we present a novel type of edge-preserving filter, called the sparse norm filter (SNF), derived from a sparse optimization problem. For each pixel, the filtering output minimizes its difference from its neighboring pixels, where the difference is penalized by a sparse norm. Although SNF is closely related to optimization-based filters and produces results of comparable quality, it is conceptually and computationally simpler.
SNF naturally preserves edges through the use of the sparse norm and is capable of producing halo-free filtering effects, a desirable property that current weighted average filtering techniques lack. We demonstrate many of the other favorable properties of this simple and versatile approach to filtering via a wide variety of applications. Fig. Sparse Norm Filtering shows some applications of our filtering technique. Fig. Sparse Norm Filtering(a) demonstrates our smoothing and sharpening results that approximate the $\ell_0$ energy. Note that the filtering result preserves edges and does not introduce halos. Fig. Sparse Norm Filtering(b) shows our $\ell_1$ norm filtering used to remove salt-and-pepper noise. Fig. Sparse Norm Filtering(c) shows a new way of seamless editing enabled by $\ell_2$ norm filtering. More detailed discussions on applications will be presented in the Applications section.
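One simple and classic way to smooth an image is to minimize the difference of each pixel with nearby ones, which can be formulated as
$$\min_{I_i^{new}} \sum_{j \in N_i} (I_i^{new} - I_j)^2, \qquad (1)$$
where $I$ is the input image, $I_i^{new}$ is the output at pixel $i$, and $N_i$ is the neighborhood of pixel $i$.
The solution of this optimization can be found by averaging the nearby pixels and is known as the box filter when we consider a square neighborhood:
$$I_i^{new} = \frac{1}{|N_i|} \sum_{j \in N_i} I_j. \qquad (2)$$
The box filter can be calculated in linear time with the integral image technique. However, it does not preserve the salient structures or edges in an image.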
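As a minimal sketch of the integral image idea (not the paper's Matlab implementation), the following pure-Python code builds a summed-area table once and then reads every window sum with four lookups, so the cost per pixel does not depend on the radius. The function names and the clipped-border handling are assumptions made for illustration.

```python
def integral_image(img):
    """Summed-area table with an extra zero first row and column."""
    h, w = len(img), len(img[0])
    sat = [[0.0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0.0
        for x in range(w):
            row_sum += img[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def box_filter(img, radius):
    """Mean over a (2*radius+1)^2 window, clipped at the image border."""
    h, w = len(img), len(img[0])
    sat = integral_image(img)
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        y0, y1 = max(0, y - radius), min(h, y + radius + 1)
        for x in range(w):
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            # Four table lookups give the window sum for any radius.
            total = sat[y1][x1] - sat[y0][x1] - sat[y1][x0] + sat[y0][x0]
            out[y][x] = total / ((y1 - y0) * (x1 - x0))
    return out
```

Dividing by the clipped window area keeps border pixels as true local means; zero padding would instead darken the borders.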
Modern filtering techniques solve this problem by taking the weighted average of nearby pixels [21, 1, 28]. In the anisotropic diffusion framework, the neighborhood consists of the adjacent pixels, and the system has to be iterated tens of times to produce a good smoothing result. Most recent filtering techniques consider a larger neighborhood consisting of tens or hundreds of neighboring pixels, and the filtering is completed in one or a few rounds. Edges are preserved by constructing the weight matrix with the criterion that similar nearby pixels shall be given higher weights. As an example, the bilateral filter uses the intensity to measure similarity and assigns weights by
$$w_{ij} = \exp\left(-\frac{(I_i - I_j)^2}{2\sigma^2}\right), \qquad (3)$$
where $I_i$ and $I_j$ are the intensities of pixels $i$ and $j$, and $\sigma$ sets the intensity scale.
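Edge-preserving image smoothing can also be achieved by solving the following optimization problem
$$\min_{I^{new}} \|I^{new} - I\|_2^2 + \lambda \|\nabla I^{new}\|_p^p. \qquad (4)$$
The penalty term $\|\nabla I^{new}\|_p^p$ controls the amount of smoothness of the output, and the fidelity term $\|I^{new} - I\|_2^2$ controls the similarity with the input $I$. When $p = 2$, the optimization problem is the well-known Tikhonov regularization. The explicit solution can be found by
$$I^{new} = (\mathrm{Id} + \lambda \nabla^\top \nabla)^{-1} I. \qquad (5)$$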
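To make the quadratic (Tikhonov) case concrete, here is a minimal 1D sketch that solves the linear system $(\mathrm{Id} + \lambda D^\top D)\,x = I$, where $D$ is a forward-difference operator. The 1D setting, the name `tikhonov_smooth_1d`, and the choice of a tridiagonal (Thomas) solver are assumptions for illustration, not the paper's implementation.

```python
def tikhonov_smooth_1d(signal, lam):
    """Solve (Id + lam * D^T D) x = signal for a 1D forward difference D."""
    n = len(signal)
    if n == 1:
        return [float(signal[0])]
    # D^T D is tridiagonal: diagonal 1 at both ends, 2 inside, off-diagonal -1.
    diag = [1.0 + lam * (1 if i in (0, n - 1) else 2) for i in range(n)]
    off = -lam
    # Thomas algorithm: forward elimination.
    c = [0.0] * n  # modified upper diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = off / diag[0]
    d[0] = signal[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - off * c[i - 1]
        c[i] = off / denom
        d[i] = (signal[i] - off * d[i - 1]) / denom
    # Back substitution.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x
```

Applied to a step signal, the solution keeps the total intensity but spreads the jump over several samples, which is the edge blurring that motivates sparser penalties.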
Since sparse norms have better tolerance for outliers than the $\ell_2$ norm, the optimization was later extended to total variation regularization with the $\ell_1$ norm and to even sparser versions for edge-preserving purposes. Solving these non-quadratic optimizations is more time-consuming. Thus, variable splitting is usually exploited to cast the original large optimization problem into several small sub-problems, which are then minimized alternately.
3 The Sparse Norm Filter
We propose SNF by generalizing (1), allowing the original $\ell_2$ norm to be a fractional norm. To preserve strong edges, we need to smooth the image while tolerating outlier pixels by assigning lower weights to them. This type of adaptive weighting has been well explored in robust statistics, and we achieve it by exploiting sparse norms. SNF is then defined by
$$\min_{I_i^{new}} \sum_{j \in N_i} |I_i^{new} - I_j|^p, \qquad 0 < p \le 2, \qquad (6)$$
where $I_j$ are the input intensities in the neighborhood $N_i$ of pixel $i$ and $I_i^{new}$ is the filtered output.
Minimizing this non-quadratic cost function when $p \neq 2$ is difficult. In particular, when $p < 1$, the cost function is non-convex and conventional gradient descent-based algorithms are easily trapped in local minima.
In this paper, we consider two approximation strategies. The first strategy iteratively exploits the weighted least squares technique
$$\sum_{j \in N_i} |I_i^{new} - I_j|^p \approx \sum_{j \in N_i} |I_i - I_j|^{p-2} (I_i^{new} - I_j)^2. \qquad (7)$$
By taking the derivative, we find that the solution can be approximated by weighted average filtering, $I_i^{new} = \sum_{j \in N_i} w_{ij} I_j / \sum_{j \in N_i} w_{ij}$, with $w_{ij} = |I_i - I_j|^{p-2}$. This solution can be understood as one iteration of the anisotropic diffusion process, with the diffusivity calculated from the current pixel intensities. This way of weighting naturally enforces fidelity with the input image. As in anisotropic diffusion, we can iteratively update the diffusivity once we update the image with this weighted average filtering result. In practice, as with the bilateral filter, one iteration is usually sufficient because the diffusion is non-local. It is noteworthy that when $I_i = I_j$ and $p < 2$, the weight goes to infinity. In practice, we can avoid this by setting a threshold and raising smaller pixel differences to the threshold. We can also modify the optimization by weighting pixels according to distance using a Gaussian-like weight. However, we observe that treating all neighboring pixels equally is good enough in practice.
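One pass of the weighted-average strategy can be sketched as follows for a 1D signal. The sketch assumes the weight takes the form $|I_i - I_j|^{p-2}$ (the reweighted least squares reading of the sparse norm) and that intensities lie roughly in $[0, 1]$; the name `snf_weighted_average_1d` and the default floor `eps` are hypothetical choices, with `eps` playing the role of the threshold that keeps the weight finite.

```python
def snf_weighted_average_1d(signal, radius, p, eps=1e-2):
    """One pass of sparse-norm weighted averaging over a 1D window."""
    n = len(signal)
    out = []
    for i in range(n):
        num = den = 0.0
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            # Differences below eps are raised to eps so the weight stays finite.
            diff = max(abs(signal[i] - signal[j]), eps)
            w = diff ** (p - 2.0)  # assumed weight |I_i - I_j|^(p-2)
            num += w * signal[j]
            den += w
        out.append(num / den)
    return out
```

With `p = 2` all weights equal one and the function reduces to a box filter; with a small `p`, pixels on the far side of an edge receive weights several orders of magnitude lower, so a step edge stays sharp after filtering.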
Another strategy quantizes the solution into a set of discrete values. For each of these discrete values $v_k$, we calculate the cost $\sum_{j \in N_i} |v_k - I_j|^p$ for each pixel $i$, which can be done efficiently by using the box filter. We compare the energy at each of these discrete values and select the minimum. A similar technique is used to approximate the median filter. In this strategy, only discrete solutions at certain quantization levels are allowed because the approximation is based on brute force search in the solution space. In practice, this strategy is preferable when images are contaminated by outliers, e.g., salt-and-pepper noise, because the first strategy would need a large number of iterations (if it succeeds at all) to reach a suitable solution. For example, if the center pixel is noisy and we conduct one iteration of filtering, we will assign high weights to similar pixels that are potentially noisy as well. Thus, the obtained solution can be far from suitable.
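The brute force strategy can be sketched in 1D as follows: for every candidate level, the per-pixel cost is a windowed sum of $|v - I_j|^p$, which a prefix sum (the 1D analogue of the box filter) evaluates for all pixels at once. The function names and the way candidate levels are passed in are assumptions for illustration, and `p` is assumed to be positive.

```python
def box_sum_1d(values, radius):
    """Windowed sums via a prefix sum (1D analogue of the box filter)."""
    n = len(values)
    prefix = [0.0]
    for v in values:
        prefix.append(prefix[-1] + v)
    return [prefix[min(n, i + radius + 1)] - prefix[max(0, i - radius)]
            for i in range(n)]


def snf_brute_force_1d(signal, radius, p, levels):
    """Pick, per pixel, the candidate level with the lowest sparse-norm cost."""
    n = len(signal)
    best_cost = [float("inf")] * n
    best_val = [None] * n
    for v in levels:
        # Cost image for this level, summed over each pixel's window.
        cost = box_sum_1d([abs(v - s) ** p for s in signal], radius)
        for i in range(n):
            if cost[i] < best_cost[i]:
                best_cost[i], best_val[i] = cost[i], v
    return best_val
```

For example, `snf_brute_force_1d([0, 0, 1, 0, 0, 0], 1, 1.0, [0.0, 0.5, 1.0])` assigns the level `0.0` to every pixel, removing the isolated outlier in the same way a median filter would.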
Both strategies are valuable. The first strategy makes the results look natural to the eye and its effect is similar to the bilateral filter, while the second strategy can filter out outliers and its effect is similar to the median filter. In all experiments except outlier-tolerant filtering, we choose the first strategy.
The sparse norm filter benefits from off-the-shelf acceleration methods [31, 7] and can be calculated in time that is linear in both the number of quantization bins and the number of pixels. For a grayscale image, the brute force solution can be calculated with one box filter per bin if we quantize the intensities into bins. For the weighted average solution, we can similarly quantize the center pixel intensities (in the weight term) into bins. The weighted sum (numerator) and the sum of weights (denominator) can then each be calculated using one box filter per bin. In comparison, an excellent state-of-the-art filtering technique uses 7 box filters. In our Matlab implementation, the box filter takes 0.04 seconds per mega-pixel. The weighted average implementation of the sparse norm filter takes between 0.5 seconds and 1 second per mega-pixel, depending on the parameter setting. Experiments are carried out on an Intel i7 3610QM CPU with 8 GB of memory. The pixel-level operations will experience significant speedups in C++ implementations. For example, our direct single-thread implementation of the box filter in C++ took 0.01 seconds per mega-pixel. Filtering-based methods are faster than optimization-based methods [5, 30], as the latter take 2-4 seconds per mega-pixel in the same environment.
3.3 Connections to Related Work
Optimization-based filters [25, 5, 30, 4] have been widely used in image enhancement tasks, e.g., denoising, edge-preserving smoothing and deconvolution [13, 12], and share the form of (4), or in the pixel level notation
The norm in the fidelity term is usually an $\ell_2$ norm in existing works. SNF simplifies (8) by integrating the fidelity term and the sparse norm penalty. By setting the regularization weight to 1, changing the norm in the first term to the same sparse norm as the penalty, and defining the neighborhood to contain the current pixel, we can absorb the first term of (8) into the second term and reduce (8) to the form of (6). Thus we establish the connection with optimization-based filtering.
It is noteworthy that in most optimization-based filters, the neighborhood is small and only contains the adjacent pixels. By contrast, SNF extends the concept of neighborhood in a non-local way to potentially include more pixels. We consider the difference of a pixel with all the pixels in the neighborhood, not only its horizontal and vertical neighbors. SNF has advantages over optimization-based filters: a one-pass approximation exists, and it is less likely to be trapped in poor local minima thanks to the non-local diffusion.
In addition, SNF has a close relationship with several well-known filters. By setting the norm exponent $p = 2$, SNF reduces to the averaging filter, or the box filter if we consider square neighborhoods. By setting $p = 1$, SNF is equivalent to the median filter. This can be proved by taking the derivative of the original cost function. By letting $p \to 0$, the sparse norm filter becomes the dominant mode filter.
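These special cases can be checked numerically on a single neighborhood. The sketch below minimizes the cost $\sum_j |x - v_j|^p$ over a fine grid and compares the minimizer with the mean, the median and the most frequent value; the helper name `snf_minimizer`, the grid resolution and the sample values are assumptions chosen for illustration.

```python
from statistics import mean, median


def snf_minimizer(values, p, steps=1000):
    """Grid search for the x that minimizes sum_j |x - v_j|^p."""
    lo, hi = min(values), max(values)
    grid = [lo + (hi - lo) * k / steps for k in range(steps + 1)]
    return min(grid, key=lambda x: sum(abs(x - v) ** p for v in values))


# One neighborhood whose mean, median and mode all differ.
neighborhood = [0.0, 0.0, 0.0, 0.5, 0.6, 0.9, 1.0]
at_p2 = snf_minimizer(neighborhood, 2.0)   # close to the mean
at_p1 = snf_minimizer(neighborhood, 1.0)   # close to the median
at_p01 = snf_minimizer(neighborhood, 0.1)  # close to the dominant mode
```

On this neighborhood the three minimizers land near the mean (about 0.43), the median (0.5) and the repeated value 0.0, respectively, up to the grid spacing.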
4.1 Halo Free Edge Preserving Filtering and Detail Manipulation
Explicit filtering techniques are known to create faint light rims along strong edges, known as halo artifacts. This unrealistic effect has been widely discussed in [9, 5, 30]. This section shows that SNF can produce halo-free results. Fig 1 compares representative edge-preserving smoothing techniques. Although all the methods can produce high-quality results, we find some subtle differences. Optimization-based smoothing algorithms [5, 30] are more capable of producing halo-free looks, but the obtained results can occasionally be unexpected if the optimization is non-convex. In Fig. 1(c) the edges look overly smoothed; the $\ell_0$-smoothing preserves edges perfectly but also retains speckles. Traditional weighted average filtering techniques produce smoother looks, but tend to produce halos near strong edges. These halos also lead to unnatural transitions in sharpening.
By using SNF with a norm exponent $p < 2$, similar pixels are assigned larger weights than dissimilar pixels, so the filter is edge-preserving. When $p$ approaches zero, the sparse norm approximates the $\ell_0$ energy, and the filter result exhibits no visible halo effects (Fig. 2), since pixels with different intensities are assigned much lower weights than pixels with similar intensities (Fig. 3(b)). The idea is also similar to the edge-stopping diffusion in the anisotropic diffusion framework.
We use SNF to decompose the image into a base layer and a detail layer. Here the base layer is the cartoon-like filtering result of the sparse norm filter. Detail enhancement can be achieved by boosting the detail layer. We demonstrate the results on a flower photo (Fig. 3(a)) by trying different combinations of the filtering radius and the norm (Fig. 4).
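The detail manipulation step can be sketched as follows, assuming the usual additive decomposition in which the detail layer is the input minus the base layer. The function name `boost_details` and the `gain` parameter are hypothetical, and the base layer is taken as given (for instance, the output of any edge-preserving filter).

```python
def boost_details(signal, base, gain):
    """Recombine a base layer with its detail layer scaled by `gain`."""
    # Assumed additive model: detail = input - base.
    detail = [s - b for s, b in zip(signal, base)]
    return [b + gain * d for b, d in zip(base, detail)]
```

A gain of 1 reproduces the input, a gain above 1 sharpens, and a gain below 1 moves the result toward the smoothed base layer.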
4.2 Outlier Tolerant Filtering
Standard edge-preserving filters [21, 28] are very effective for Gaussian-like noise reduction. In the presence of extreme noise, none of them are as robust as the classic median filter. The culprit is that the weighting can be misled by noise. In comparison, the sparse norm filter is a whole class of filters that can perform similarly to the median filter.
We take an example image from the literature. We avoid the outliers by first using brute force search to approximate the global solution of (6) at a few discrete values. This intermediate result has a quantized look (Fig 5, row 1, columns 2 and 3). We then calculate the diffusivity at this approximate solution, i.e., we use it as the guidance image, and apply the one-pass weighted average filtering (7) to output a smoothed image (Fig 5, row 2).
4.3 HDR Compression
HDR tone mapping is a popular application, which can be achieved by compressing the base layer while keeping the detail layer. In the comparison in Fig 6, we can see that the results of weighted least squares (WLS), Fattal02 and Durand02 have visible halos near strong edges. For the sparse norm filter, we fix the norm exponent and set the radius to 1/6 of the image height to conduct one pass of non-local diffusion to extract the base layer. We observe that, under the same norm exponent, the WLS method seems to be trapped in a local minimum because the cost function is non-convex. Although Drago03, Pattanaik00, Mantiuk06 and Reinhard05 try to reduce the halos, they fail to make some details visible in the results.
4.4 Non-blind Deconvolution
Ringing artifacts are common in deconvolution when the kernel estimation is not accurate or when frequency nulls occur. The ringing artifacts can be significantly reduced by putting a sparse norm prior on the gradient term [13, 12]. Similarly, we put a non-local sparse norm on the gradients and use an alternating minimization technique to deconvolve the blurry image (Fig 7). We compare our result with standard Tikhonov regularization, which uses an $\ell_2$ penalty on the gradient term. We notice that SNF can produce crisper results with fewer ringing effects.
4.5 Joint Filtering
The sparse norm filter can naturally incorporate a guidance or joint image to provide the filtering weight or diffusivity. Below we show the result of flash/no-flash denoising, taking the image captured with flash as the joint image to remove the noise in the no-flash image (Fig 8).
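A joint variant can be sketched in 1D by computing the weights from a guidance signal and averaging the target signal with them. As before, the weight form $|G_i - G_j|^{p-2}$, the floor `eps` and the name `joint_snf_1d` are assumptions for illustration rather than the paper's exact formulation.

```python
def joint_snf_1d(target, guide, radius, p, eps=1e-2):
    """Average `target` with sparse-norm weights computed from `guide`."""
    n = len(target)
    out = []
    for i in range(n):
        num = den = 0.0
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            # Weights depend only on the guidance signal (floored at eps).
            w = max(abs(guide[i] - guide[j]), eps) ** (p - 2.0)
            num += w * target[j]
            den += w
        out.append(num / den)
    return out
```

When the guide is a clean step and the target a noisy version of it, the noise on each side of the step is averaged out while the step itself is kept, which mirrors the flash/no-flash use case.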
4.6 Image Segmentation
The sparse norm is usually used to model the gradient profile of natural images in various computer vision models. In the following experiment (Fig 9), we show that it can be used to accelerate the normalized cut using the joint filtering techniques. Since normalized cut finds the eigenvectors of a diffusion/affinity matrix, we replace the slow matrix multiplication in the eigensolver (which is quadratic in the neighborhood radius) with our joint SNF, which takes constant time for any neighborhood size. We use the original image as the guidance image, which easily provides 10x-100x acceleration depending on the filtering radius. Moreover, we can extend the technique to explain and accelerate other normalized cut related algorithms [9, 14, 16, 15, 10, 8].
We demonstrate an application in colorization as an example of joint filtering. Colorization can also be achieved by finding the stable distribution of an edge-preserving filter. The guiding weight of this filter is calculated from the grayscale image, and similar nearby pixels are assigned higher weights, which is naturally enforced using the sparse norm. To ensure that pixels with similar grayscale intensities are assigned similar colors, we use this guiding weight/diffusivity to spread the color cues obtained from the input color strokes. We use a straightforward gradient descent algorithm to update the diffusion system. With fewer than 10 iterations, we can obtain high-quality results (Fig 10(d)). This algorithm can also be used to re-color the flash image, as shown in Fig 8(d).
4.8 Seamless Photo Editing
The acceleration technique enabled by the sparse norm filter can be extended to the non-sparse norm. Seamless editing is a popular feature in image processing. Due to inconsistent color between the source and target, simple drag-and-drop editing is known to create artificial boundaries. The Poisson equation is widely used to seamlessly fill in a target region using a source region. In this framework, guided interpolation is conducted by solving the Poisson equation in the fill-in area, subject to a Dirichlet boundary condition on its border. Extending this equation by taking non-local gradients yields a high-dimensional Poisson equation, which hints at a new system.
We also solve the above system with the gradient descent algorithm. Since the diffusion is non-local, the algorithm converges within 10 iterations. Only a single quantity needs to be updated in each iteration, and it can be calculated using the box filter. We compare our algorithm with the original Poisson solver. In our environment, we use the backslash operation in Matlab to solve the Poisson equation, which takes 3 seconds per mega-pixel, excluding the time required to construct the sparse linear system. As reported above, the box filter takes only 0.04 seconds per mega-pixel in Matlab, or 0.01 seconds in C++. The results are comparable in quality (Fig 11).
In this work, we present a simple but fundamental filter that builds connections with various classic smoothing techniques. The sparse norm filter can be regarded as a non-local extension of the optimization-based smoothing methods, which allows a one-pass approximate solution via filtering. Through a variety of applications in image processing and computer vision, we demonstrate that the sparse norm filter gives new insights into popular applications and provides high-quality accelerations.
-  (1998) Robust anisotropic diffusion. IEEE Transactions on Image Processing 7 (3), pp. 421–432. Cited by: §1, §2, §3.1, §4.1.
-  (2003) Adaptive logarithmic mapping for displaying high contrast scenes. Computer Graphics Forum 22 (3), pp. 419–426. Cited by: §4.3.
-  (2002) Fast bilateral filtering for the display of high-dynamic-range images. ACM Transactions on Graphics 21 (3), pp. 257–266. Cited by: §4.3.
-  (2002) On the origin of the bilateral filter and ways to improve it. IEEE Transactions on Image Processing 11 (10), pp. 1141–1151. Cited by: §3.3.
-  (2008) Edge-preserving decompositions for multi-scale tone and detail manipulation. ACM Transactions on Graphics 27 (3), pp. 1.1–1.10. Cited by: §1, §2, §3.2, §3.3, §4.1, §4.3.
-  (2002) Gradient domain high dynamic range compression. ACM Transactions on Graphics 21 (3), pp. 249–256. Cited by: §4.3.
-  (2012) Adaptive manifolds for real-time high-dimensional filtering. ACM Transactions on Graphics 31 (4), pp. 1–13. Cited by: §1, §3.2.
- Fast matting using large kernel matting laplacian matrices. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 2165–2172. Cited by: §4.6.
-  (2010) Guided image filtering. In Proceedings of European Conference on Computer Vision, pp. 1–14. Cited by: §1, §3.2, §4.1, §4.6.
-  (2011) Single image haze removal using dark channel prior. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 2341–2353. Cited by: §4.6.
-  (2007) Smoothed local histogram filters. ACM Transactions on Graphics 29 (4), pp. 1–10. Cited by: §3.3, §4.2.
-  (2009) Fast image deconvolution using hyper-laplacian priors. In Advances in Neural Information Processing, Cited by: §3.3, §4.4.
-  (2007) Image and depth from a conventional camera with a coded aperture. ACM Transactions on Graphics 26 (10), pp. 1141–1151. Cited by: §3.3, §4.4.
-  (2004) Colorization using optimization. ACM Transactions on Graphics 23 (3), pp. 689. Cited by: §4.6, §4.7.
-  (2008) A closed-form solution to natural image matting. IEEE Transactions on Pattern Analysis and Machine Intelligence 30 (2), pp. 228–242. Cited by: §4.6.
-  (2008) Spectral matting. IEEE Transactions on Pattern Analysis and Machine Intelligence 30 (10), pp. 1699–1712. Cited by: §4.6.
-  (2002) A perceptual framework for contrast processing of high dynamic range images. ACM Transactions on Applied Perception 3 (3), pp. 87–94. Cited by: §4.3.
-  (2007) A fast approximation of the bilateral filter using a signal processing approach. International Journal of Computer Vision 81 (1), pp. 24–52. Cited by: §1.
-  (2000) Time-dependent visual adaptation for realistic image display. In Proceedings of ACM SIGGRAPH, pp. 47–54. Cited by: §4.3.
-  (2004) Poisson image editing. ACM Transactions on Graphics 22 (3), pp. 313. Cited by: §4.8.
-  (1990) Scale-space and edge detection using anisotropic diffusion. IEEE Transactions on Pattern Analysis and Machine Intelligence 12 (7), pp. 629–639. Cited by: §1, §2, §4.2.
-  (2004) Digital photography with flash and no-flash image pairs. ACM Transactions on Graphics 23 (3), pp. 664. Cited by: §4.5.
-  (2008) Constant time o(1) bilateral filtering. In Proceedings of International Conference on Computer Vision and Pattern Recognition, pp. 1–8. Cited by: §1, §2.
-  (2005) Dynamic range reduction inspired by photoreceptor physiology. IEEE Transactions on Visualization and Computer Graphics 11 (1), pp. 13–24. Cited by: §4.3.
-  (1992) Nonlinear total variation based noise removal algorithms. Physica D: Nonlinear Phenomena 60 (1–4), pp. 259–268. Cited by: §1, §2, §3.3.
-  (2004) Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence 22 (8), pp. 888–905. Cited by: §4.6.
-  (1995) Numerical methods for the solution of ill-posed problems. Springer. Cited by: §2.
-  (1998) Bilateral filtering for gray and color images. In Proceedings of International Conference on Computer Vision, pp. 839–846. Cited by: §1, §2, §4.2.
-  (2008) A new alternating minimization algorithm for total variation image reconstruction. SIAM Journal on Imaging Sciences 1 (3), pp. 248. Cited by: §1, §2, §4.4.
-  (2011) Image smoothing via l 0 gradient minimization. Image Rochester NY 30 (6), pp. 1–12. Cited by: §1, §2, §3.2, §3.3, §4.1.
-  (2009) Real-time o(1) bilateral filtering. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 557–564. Cited by: §1, §3.1, §3.2, §4.2.
-  (2012) Spectral graph cut from a filtering point of view. arXiv preprint. Cited by: §4.6, §4.7.