Gaussian and box filters typically work well in applications where the amount of smoothing required is small. For example, they are quite effective in removing small dosages of noise from natural images. However, when the noise floor is large, and one is required to average more pixels to suppress the noise, these filters begin to over-smooth sharp image features such as edges and corners. The over-smoothing can, however, be alleviated using some form of data-driven (non-linear) diffusion, where the quantum of smoothing is controlled using the image features. A classical example in this regard is the famous PDE-based diffusion of Perona and Malik . The bilateral filter was proposed by Tomasi and Maduchi 
as a filtering-based alternative to the Perona-Malik diffusion. The bilateral filter has turned out to be a versatile tool that has found widespread applications in image processing, computer graphics, computer vision, and computational photography. More recently, the bilateral filter has received renewed attention in the context of image denoising [5, 6].
In this paper, we consider a standard form of the bilateral filter where a Gaussian kernel is used for range filtering, and a box or Gaussian kernel is used for spatial filtering . In this setting, the bilateral filtering of an image , where is some finite rectangular domain of , is given by
The spatial filter is a Gaussian:
or a box:
The domain of the spatial kernel is a square neighbourhood, , where for the Gaussian filter. We refer the interested reader to [3, 4] for a detailed exposition on the working of the filter. We note that the bilateral filter has a straightforward extension to video and volume data. Another natural extension is the cross (or joint) bilateral filter . While we will limit our discussion to the standard bilateral filter, the main ideas in this paper can also be applied to the above-mentioned extensions.
I-a Fast Bilateral Filtering
It is clear that a direct computation of (1) requires operations per pixel. In fact, the computation is slow for practical settings of . To address this issue, researchers have come up with several fast algorithms  - 
. Most of these are based on some form of approximation, and provide various levels of compromise between speed and quality of approximation. One of the early algorithms for fast bilateral filtering involved the quantization of the image intensities, where the final output was obtained via the interpolation of the output of a set of linear filters. It was later shown that this approximation can be used to obtain a constant-time implementation which further improves its speed . In a different direction, it was observed in  that the bilateral filter can be conceived as a linear filter acting in three-dimensions, where the three-dimensions are obtained by augmenting the image intensity to the spatial dimensions. This observation was used to derive a fast filtering in three-dimensions, which was then sampled to obtained the final output. We refer the interested reader to  for a survey of fast algorithms for bilateral filtering.
The algorithms in [10, 11, 12] are particularly relevant to the present work. Here the authors proceed by approximating (2) using polynomial and trigonometric functions, and demonstrate how the bilateral filter can be decomposed into a series of spatial filterings as result. As is well-known, since spatial box and Gaussian filters can be implemented in constant-time using separability and recursion , the overall approximation can therefore be computed in constant-time.
I-B Present Contribution
We propose a fast algorithm for computing (1) which was motivated by the line of work in [12, 14].
In particular, similar to these papers, we present a novel approximation of (2) that allows us to decompose the bilateral filter into a series of spatial convolutions.
The fundamental difference between the above papers and the present approach is that, instead of approximating (2) and then translating the approximation in range space, we directly approximate the translated Gaussians appearing in (1). In particular, the computational advantages obtained using the proposed approximation are the following:
(1) For a fixed approximation order (to be defined shortly), the proposed approximation requires half the number of spatial filterings required by the approximations in [8, 10, 12].
(2) The proposed approximation does not involve the transcendental functions and which are used in [12, 14]. It only involves polynomials (and just a single Gaussian), and hence can be efficiently implemented on hardware . Moreover, the rounding error is small when working with polynomials.
As will be demonstrated shortly, the proposed algorithm is generally faster and more accurate than Yang’s algorithm , which is currently considered to be the state-of-the-art [10, 16]. In particular, we perform an error analysis whereby we compare the output obtained using the proposed algorithm with that of the exact bilateral filter. Due to the particular nature of the proposed approximation, our analysis is much more simple than that carried out for Yang’s algorithm in . Nevertheless, compared to Yang’s algorithm, we are able to establish a smaller bound on the number of spatial filterings required to achieve a given filtering accuracy. The latter is defined in terms of the error between the outputs of the bilateral filter and the fast algorithm (this will be made precise in Section III). To best of our knowledge, with the exception of , this is the only fast algorithm that comes with a provable guarantee on the quality of approximation. At this point, we note that the term “accurate” is used in the paper not just to signify that the output of the fast algorithm is visibly close to that of the target bilateral filter. It also has a precise technical meaning, namely, that we can control the approximation order to make the error between the outputs of the bilateral filter and the fast algorithm arbitrarily small.
The rest of the paper is organized as follows. We present the proposed kernel approximation and the error analysis in Section II. In Section III, we develop a fast constant-time algorithm arising from the Gaussian-polynomial approximation. We then analyze the quality of approximation that can be achieved using our algorithm. This gives us a simple rule for tuning the approximation order for a given user-defined accuracy. We present exhaustive numerical results in Section IV, and demonstrate the superior performance of the proposed algorithm over some of the existing algorithms.
Ii Gaussian-Polynomial Approximation
The present idea is to consider the translated kernel that appears in (1), where and . We can write
For a fixed translation , this is a function of . Notice that the first term is simply a scaling factor, while the second term is a Gaussian centered at the origin. In fact, the second term essentially contributes to the bell shape of the translated Gaussian. The third term is a monotonic exponential, which is increasing or decreasing depending on the sign of ; this term helps in translating the Gaussian to .
We assume (without loss of generality, as will be explained at the start of Section III) that the dynamic range of the image is . That is, the arguments and in (5) take values in . This means that the product appearing in (5) takes values in . Consider the Taylor expansion of the exponential term about the origin:
By dropping the higher-order terms, we obtain the following approximation of (5):
Being the product of a bivariate Gaussian and a polynomial, we will henceforth refer to (7) as a Gaussian-polynomial, where is its approximation order. By construction, we have the pointwise convergence
We would like to note that the above idea of splitting the kernel and approximating a part of its using Taylor polynomials was employed in  in the context of the fast Gauss transform. To the best of our knowledge, this idea has not been exploited for fast bilateral filtering along the lines of the present work.
In Figure 1, we study the approximations corresponding to different . The fundamental difference between the Taylor approximation in  and the Gaussian-polynomial approximation (8) is that instead of approximating the entire Gaussian, we approximate one of its component, namely the exponential function in (5). The intuition behind this is that the Taylor polynomial blows up as one moves away from the origin. This makes it difficult to approximate the tail part of a Gaussian using such polynomials. On the other hand, the exponential in (5) is monotonic, and hence can be closely approximated using polynomials. This point is explained with an example in Figure 2. In particular, notice in Figure (b)b that the Gaussian-polynomial approximation is quite precise over the range of interest, and is comparable to the raised-cosine approximation of same order .
Ii-a Quantitative Error Analysis
Before explaining how we can use Gaussian-polynomials to derive a fast bilateral filter in Section III, we study the kernel error incurred by approximating (2) using Gaussian-polynomials. We will see in Section III that a bound on the kernel error can in turn be used to bound the filtering accuracy of the fast algorithm. Note that (8) tells us that Gaussian-polynomial can be used to approximate the range kernel with arbitrary accuracy. However, in practice, we will be required to use a Gaussian-polynomial of some fixed order . A relevant question is the size of error incurred for a given ? A related question is that, given some error margin , how do we fix the smallest such that the corresponding error is within ?
To begin with, we define the error function
The mathematical problem is one of bounding (II-A) for fixed and . In this work, we consider the error given by
This is also referred to as the worst-case or uniform error. We note that one can measure the error using other means, e.g., using the metric. The reason why we choose the metric is that our ultimate goal is to quantify the accuracy of the final filtering arising from the approximation, and a bound on (10) is sufficient for this purpose. Moreover, computing the error is relatively simple.
Using the inequality , we can bound the first term in (II-A) by . Therefore, we have
After differentiation, we get
To get an idea of the tightness of the bound in (13), we compare the mesh plots of (II-A) and (12) in Figure 3 when and . While there is a gap between the error and the corresponding bound at certain values of , the supremum of the latter (which occurs at one of the boundaries as predicted above) is nevertheless of the same order of magnitude as the supremum of the former.
Ii-B Relation between and Kernel Error
is said to follow a Poisson distribution with parameterif
We can thus interpret the quantity on the right in (13) as the probability . In this context, the leading question is the following: given , find the smallest such that . The advantage of expressing the problem in this form is that it brings to our disposal various tools for bounding the tail probability. For example, assuming that , we have the Chebyshev bound :
On the other hand, the Chernoff bound  when is given by
Numerical experiments suggest that for and for a range of values of (to be reported shortly), the empirically computed is always larger than . Under this assumption, we have the following estimate of the smallest using (14):
where is the smallest integer greater than or equal to .
As is well-known, the Chernoff bound (15) is typically tighter than the Chebyshev bound. However, finding the smallest such that
is somewhat more involved.
Let be the inverse of the map on . Then the smallest integer greater than for which (17) holds is
where and .
The details are provided in Appendix VI-A. While can be computed using the Matlab script lambertw(0,t), we note that can be approximated using a series expansion . In particular, the first four terms are
However, we observed that (19) provides inexact estimates when is large, that is, when is small. An extremely large number of terms of the series are required to get a precise estimate. To address this problem, we propose to use Newton iterations for finding the positive root of (see Appendix VI-A for notations), where the initialization is done using (18) and (19). Namely, starting with , we run the following iterations for :
In practice, we noticed that about - iterations are sufficient to produce a good solution. In Table III, we illustrate the improvement obtained after performing the Newton iterations. The complete scheme for computing the order for a given accuracy is summarized in Algorithm 1.
Note that for , we use a fixed order of . This is because the condition in (14) and (15) is violated in this regime. Moreover, we have noticed that a small order suffices when is large. In Figure 4, we compare the estimated order obtained using the following methods: Chebyshev (16), Chernoff (18) along with (19), and Chernoff followed by Newton iterations (20). We also compare the corresponding errors (computed using exhaustive search) given by (10). Notice that the estimates are close to that obtained using exhaustive search when ; however, when , the Chebyshev bound is quite loose.
Iii Fast Bilateral Filtering
We now explain how Gaussian-polynomials can be used to derive a fast algorithm for implementing (1). As a first step, we center the intensity range around the origin. This is in keeping with the Taylor expansion in (7) which is performed around the origin. A simple means of doing so is to set , assuming the dynamic range to be , and to consider the centred image given by
The crucial observation is that that the shift operation in (21) commutes with the non-linear bilateral filtering.
In other words, we can first centre the intensity range, apply the bilateral filter, and finally add back the centre to the output. Henceforth, we will assume that the range of the input image is . For an -bit grayscale image, .
Iii-a Fast Algorithm
The underlying mechanism of the proposed fast algorithm is related to the fast algorithms in [12, 14]. The subtle difference is that instead of directly approximating (2), we approximate its translates in (1). In particular, we fix some order , and approximate the range kernel in (1) using (7). This gives us the following Gaussian-Polynomial Approximation (GPA) of (1):
Next, for , we define the images
Notice that we have effectively transferred the non-linearity of the bilateral filter to the intermediate images in (24), which are obtained from the input image using simple pointwise transforms. The computational advantage that we get from the above manipulation is that the spatial filtering in (25) can be computed using operations per pixel when is a box or a Gaussian . The overall cost of computing (23) is therefore per pixel with respect to the filter size . This is a substantial reduction from the complexity of the direct implementation of (1).
The complete algorithm for computing (23) is summarized in Algorithm 2, which we will continue to refer as GPA. Note that we efficiently implement steps (24) to (28) by avoiding redundant computations. In particular, we recursively compute the images in (24) and the factorials in (27) and (28). Notice that steps 2-2, 2-2, 2-2, and 2 are cheap pointwise operations. The main computation in Algorithm 2 is the spatial filtering in step 2, and the initial filtering in step 2. That is, the overall cost is dominated by the cost of computing spatial filtering. In this regard, we note that for the same degree , the number of spatial filterings required in [12, 14] is , and that in  is . Moreover, we note that the proposed algorithm involves the evaluation of a transcendental function just once, namely in step 2. In contrast, the algorithm in  requires evaluations of the Gaussian over the whole image. Thus, the present algorithm has smaller rounding errors, and is better suited for hardware implementations  compared to the above mentioned algorithms. Yet another key advantage with the Algorithm 2 is that we need just six images (excluding the input and output images) for the complete pipeline. As against this, the algorithm in  requires the computation and storage of principal images, which are interpolated to get the final output.
Iii-B Filtering Accuracy
It is clear that the kernel error, and hence the overall quality of approximation, is controlled by the order . In this regard, we need a rule to fix in Algorithm 2. As before, we will consider the worst-case error given by
By bounding (29), we can control the pixelwise difference between the exact and the approximate bilateral filter. In particular, we have the following result which formally establishes the intuitive fact that the filtering accuracy is essentially within a certain factor of the kernel error given by (10). The details of the derivation are provided in Appendix VI-C.
Suppose that the spatial filter is non-negative and normalized, i.e., for all , and
We note that the spatial filters (3) and (4) are non-negative, and that appears in both the numerator and denominator of (1) and (23). Therefore, we can assume (30) without any loss of generality. In fact, (30) is automatically true for the box filter. We also recall that the range of the image is assumed to be centered; the intensity values are in the interval , where for most grayscale images.
Iii-C Relation between Accuracy and
Note that by combining (31) with the bound in (15), we get a direct control on the filtering accuracy in terms of the approximation order. In particular, suppose that we want (29) to be within . A sufficient condition for this is that
To summarize, we have the following guarantee that follows from (31).
Iv Experiments and Discussion
We implemented the proposed GPA algorithm using Matlab on an Intel GHz Linux system with GB memory. The Matlab implementation has been shared here . The set of grayscale images used for the experiments are shown in Figure 5. We compared the proposed algorithm with the following fast algorithms: Yang , Paris , Weiss , and the Shiftable Bilateral Filter (SBF) . We used the Matlab implementation of these algorithms to make the comparison fair; moreover, we used the parameter settings suggested in the respective papers. For determining the order in  for a given accuracy parameter , we have used (34).
Experiment 1 The output of the proposed GPA algorithm on a couple of images are shown in Figures 6 and 7. We also provided the output obtained using exact bilateral filtering. We performed the comparison using the box and the Gaussian kernels for the spatial filter. Notice that the speedup obtained is significant. Moreover, the filtered images are visually identical and numerically very close, in terms of the and mean-squared errors. We have used the following definition of mean-squared error (MSE):
where denotes the number of pixels in the image.
To get a better understanding of how varies with , we used the following approximation (see Appendix VI-D):
An important point to note in (33) is the logarithmic dependence on . In fact, the factor can be traced back to the tail bound in (15), which, in turn, follows from the particular splitting in (7). The implication of the logarithmic dependence is that we can force to be quite small without blowing up .
The above estimate was recently derived in . In particular, notice that the dependence on is similar to that in (33). However, the dependence on is much more strong in (34) compared to (33), since when . Moreover, the leading constant in (34) is much larger than the constant in the first term in (33). As an example, when and , we have for Yang’s algorithm (this is the estimate reported in ). On the other hand, the corresponding estimate for our algorithm is (assuming that and that we use a box filter of size ). The difference becomes dramatic for smaller values of . For example, when and , the estimate from (33) is , while that from (34) is . Further comparisons are provided in Table II. Notice that the order for Yang’s approximation explodes when (sub-pixel accuracy). It is also seen from the table that (33) provides a close approximation of (18) for the setting under consideration.
Experiment 2 A graphic comparison of the algorithms for various settings of the spatial and range kernels is presented in Figures 8 and 9. As before, we performed the comparison for both the box and Gaussian spatial filters. It is evident from these results that the proposed method is competitive with existing methods in terms of the speed-accuracy tradeoff.
Experiment 3 We next compared the proposed algorithm with existing fast algorithms on the images shown in Figure 5. A summary of the comparisons (in terms of maximum pixelwise error and MSE) is provided in Figure 10.
|GPA (Gaussian)||Yang (Gaussian)||GPA (Box)||Yang (Box)|
|GPA (Gaussian)||Yang (Gaussian)||GPA (Box)||Yang (Box)|
Experiment 4 Finally, we performed a detailed comparison of the proposed algorithm with Yang’s algorithm, which is widely considered to be the state-of-the-art algorithm. In the first comparison, we fixed an image and the parameters of the bilateral filter. The order was then varied and the corresponding error and run times were noted. The results are presented in Table III. Notice that the run time of GPA is consistently smaller than that of Yang’s algorithm for both the box and Gaussian kernels. Indeed, as remarked earlier, for a fixed order , Yang’s algorithm  requires spatial filterings, while GPA requires only spatial filterings. Thus, the runtime of GPA is about half of that of Yang’s algorithm. Moreover, beyond a certain , GPA provides much better filtering accuracy. We performed a similar experiment by varying , the results of which are reported in Table IV. Notice that the run time of Yang’s algorithm becomes prohibitively large when is small.
We presented a novel fast algorithm for approximating the bilateral filter. The algorithm was shown to be both fast and accurate in practice using extensive experiments. The space and time complexity of the proposed algorithm is smaller than the state-of-the-art algorithm of Yang , and, moreover, was shown to provide much better accuracy. We also performed an error analysis of the approximation scheme, and presented a rule for setting the approximation order that can guarantee the filtering accuracy to be within a desired margin. Before concluding, we note that the proposed algorithm can be used to perform cross bilateral filtering, and can also be extended for the filtering of video and volume data.
Vi-a Derivation of (18)
Taking the logarithm of (17), we can restate the problem as one of finding the smallest integer such that
where and .
Notice that and . Hence, is strictly convex over with a minimum at . Since when , we conclude that there exists some for which . The smallest integer solution of (17) is precisely . To find , we solve the equations and . Note that we can write as
which is of the form , where . The inverse of the mapping is a well-studied function called the Lambert W-function . In particular, the inverse (which is generally multivalued) in this case is given by
Vi-B Derivation of (26)
In terms of (24), we can write as
Vi-C Derivation of (31)
Similarly, we write (23) as , where
We can then write as
We uniformly upper-bound (resp. lower-bound) the numerator (resp. denominator) in (VI-C). In particular, note that
This follows from the fact that in (1) can be expressed as a convex combination of . On the other hand, is
Therefore, using (30), we get
To uniformly lower-bound , we note that for ,
where we have used the non-negativity of the range and spatial kernels. Using the inverse triangle inequality along with (40), we have for ,
Vi-D Derivation of (33)
The authors thank Dr. Alessandro Foi and the anonymous reviewers for their useful comments and suggestions.
-  K. N. Chaudhury, “Fast and accurate bilateral filtering using Gauss-polynomial decomposition,” Proc. IEEE International Conference on Image Processing, pp. 2005 - 2009, 2015.
-  P. Perona and J. Malik, “Scale-space and edge detection using anisotropic diffusion,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 12, no. 7, pp. 629-639, 1990.
-  C. Tomasi and R. Manduchi, “Bilateral filtering for gray and color images,” Proc. IEEE International Conference on Computer Vision, pp. 839-846, 1998.
-  S. Paris, P. Kornprobst, J. Tumblin, and F. Durand, Bilateral Filtering: Theory and Applications, Now Publishers Inc., 2009.
-  C. Knaus and M. Zwicker, “Progressive image denoising,” IEEE Transactions on Image Processing, vol. 23, no.7, pp. 3114-3125, 2014.
-  K. N. Chaudhury and K. Rithwik, “Image denoising using optimally weighted bilateral filters: A SURE and fast approach,” Proc. IEEE International Conference on Image Processing, pp. 108-112, 2015.
-  F. Durand and J. Dorsey. “Fast bilateral filtering for the display of high-dynamic-range images,” ACM Transactions on Graphics, vol. 21, no. 3, pp. 257-266, 2002.
Q. Yang, K. H. Tan, and N. Ahuja, “Real-time bilateral filtering,”
Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 557-564, 2009.
-  S. Paris and F. Durand, “A fast approximation of the bilateral filter using a signal processing approach,” Proc. European Conference on Computer Vision, pp. 568-580, 2006.
-  K. Sugimoto and S. I. Kamata, “Compressive bilateral filtering,” IEEE Transactions on Image Processing, vol. 24, no. 11, pp. 3357-3369, 2015.
-  F. Porikli, “Constant time bilateral filtering,” Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 1-8, 2008.
-  K. N. Chaudhury, D. Sage, and M. Unser, “Fast bilateral filtering using trigonometric range kernels,” IEEE Transactions on Image Processing, vol. 20, no. 12, pp. 3376-3382, 2011.
-  R. Deriche, “Recursively implementing the Gaussian and its derivatives, Research Report, INRIA-00074778, 1993.
-  K. N. Chaudhury, “Acceleration of the shiftable algorithm for bilateral filtering and nonlocal means,” IEEE Transactions on Image Processing, vol. 22, no. 4, pp. 1291-1300, 2013.
-  J. M. Muller, Elementary Functions: Algorithms and Implementation, Birkhauser Boston, 2006.
-  S. An, F. Boussaid, M. Bennamoun, and F. Sohel, “Quantitative error analysis of bilateral filtering,” IEEE Signal Processing Letters, vol. 22, no. 2, pp. 202-206, 2015.
-  C. Yang, R. Duraiswami, and N. A. Gumerov, “Improved fast gauss transform,” Technical Report CS-TR-4495, UMIACS, Univ. of Maryland, College Park, 2003.
-  M. Mitzenmacher and E. Upfal. Probability and Computing: Randomized Algorithms and Probabilistic Analysis, Cambridge University Press, 2005.
-  R. M. Corless, G. H. Gonnet, D. E. Hare, D. J. Jeffrey, and D. E. Knuth, “On the Lambert W function,” Advances in Computational Mathematics, vol. 5, no. 1, pp. 329-359, 1996.
-  K. N. Chaudhury and S. Dabhade, Fast and Accurate Bilateral Filtering (www.mathworks.com/matlabcentral/fileexchange/56158), MATLAB Central File Exchange, retrieved March 25, 2016.
-  http://www.imageprocessingplace.com/root_files_V3/image_databases.htm.
-  B. Weiss, “Fast median and bilateral filtering,” Proc. ACM Siggraph, vol. 25, pp. 519-526, 2006.