1 Introduction
Edge-aware optimization is a widely used tool in computer vision. It is applied to a large variety of tasks, including semantic segmentation [3], stereo [4], colorization [5], and optical flow [6]. This has been motivated by the intuition that similar-looking pixels should have similar properties. For this reason, a wide variety of edge-aware filtering algorithms have been developed, including the bilateral filter [7], anisotropic diffusion [8], and edge-avoiding wavelets [9], all of which identify similar-looking pixels. However, using such filters in optimization frameworks typically leads to slow algorithms, and while high-level groupings like superpixels can be used to compensate for this sluggishness [10], the color-space clusterings of such approaches are not guaranteed to respect the semantics of the underlying domain, which often leads to processing artifacts.
We propose a general optimization framework that operates directly in the pixel space while maintaining distances in the combined color and pixel space with an edge-aware regularizer. The framework can be applied to a variety of optimization problems, as we demonstrate in Fig. 1 and Sec. 3. Our method achieves competitive performance in applications like stereo optimization (Sec. 3.3), synthetic defocus rendering (Sec. 3.4), and depth super-resolution (Sec. 3.5). Its speed advantage becomes more pronounced with increasing image resolution, as well as with a growing number of image channels. At the same time, our approach is independent of blur kernel sizes, which is not the case for existing bilateral solvers. This becomes crucial for applications where the data is of high resolution and high dimensionality, for example satellite imagery, where a single image typically has more than 100 million pixels and as many as 16 spectral bands.
The remainder of the paper is organized as follows: Sec. 2 describes the traditional approaches the computer vision community has developed for edge-aware filtering and optimization. Sec. 3 derives our domain transform solver (DTS) optimization framework and highlights its similarities to, as well as its differences from, previous work. We also describe how our framework can be adapted for various vision tasks. In Sec. 4, we provide a quantitative evaluation as well as a validation of the timing performance. Finally, we conclude in Sec. 5 and provide some future directions for expanding on our approach.
2 Related Work
Next, we briefly review the most relevant prior work related to edge-based filtering and optimization, namely: implementations of bilateral filters, optimizations leveraging superpixel segmentation, machine learning for edge-aware filtering, the domain transform and its filtering applications, and bilateral solvers.
Bilateral filters
The bilateral filter was introduced by Tomasi and Manduchi [7] and was one of the first edge-aware blurring techniques. Its major bottleneck is that it is costly to compute, especially for large blur windows. Since its invention, multiple approaches have been introduced to speed up the bilateral filter [13], [14], [15], [16], [17]. Durand and Dorsey approximate the bilateral filter by a piecewise linear approximation [18]. Pham and van Vliet proposed to approximate the bilateral filter using two 1D bilateral kernels [19]. Paris and Durand proposed to treat the image as a 5D function of color and pixel space and then apply 1D blur kernels in this high-dimensional space [20]. These approximations to the bilateral filter, or the use of 1D kernels in higher-dimensional spaces, enable decoupling the 2D adaptive bilateral kernel into 1D kernels, reducing the computational cost significantly. When used as a post-processing step, the bilateral filter removes noise in homogeneous regions but is sensitive to artifacts such as salt-and-pepper noise [21]. Our method emphasizes edge-aware concepts in the same spirit as bilateral filters, but our formulation is fundamentally different in that it provides a generalized framework for domain transform optimization.
Superpixels
To combat issues of computational complexity during bilateral optimizations, several approaches leverage superpixels, i.e., they group pixels together based on appearance and location. Superpixel extraction algorithms like SLIC [22] are often used in optimization problems for two major reasons: 1) they reduce the number of variables in the optimization, and 2) they adhere to color and (implicitly) object boundaries. In one application, Bódis-Szomorú et al. [23] use sparse Structure-from-Motion (SfM) data, image gradients, and superpixels for surface reconstruction to ensure that the edges of triangles are aligned to the edges in the image. Lu et al. [10] use SLIC superpixels to enforce spatially consistent depths in a PatchMatch-based matching framework [24] to estimate stereo. Using superpixels inherently assumes local consistency and perfect segmentation, which often does not hold in practice. For example, in stereo algorithms, superpixels may cover regions with similar color but drastically different depths. In such regions, the pixels in a superpixel are incorrectly grouped because they have different depths. Although algorithm parameters can be tuned, the trade-off between coherence and conciseness is a limiting factor in the utility of superpixel approaches.
Machine learning for edge-awareness
Yang et al. [25] used support vector machines (SVM) to mimic a bilateral filter by using the exponential of spatial and color distances as feature vectors to represent each pixel. Traditionally, conditional random fields (CRFs) are used for enforcing pairwise pixel smoothness via the Potts potential. For example, Krähenbühl and Koltun [3] proposed to use the permutohedral lattice data structure [14], which is typically used in fast bilateral implementations, to accelerate inference in a fully connected CRF by using Gaussian distances in space and color. With the explosion of compute capacity and convolutional models in the vision community, there are also deep-learning methods that attempt to achieve edge-aware filtering. Chen et al. [26] presented DeepLab to perform semantic segmentation; there, they use the fully connected CRF from Krähenbühl and Koltun [3] on top of their convolutional neural network (CNN) to improve the localization of object boundaries, which typically suffers in a CNN setting due to multiple max-poolings and the use of low-resolution images. Xu et al. [27] presented a framework to learn edge-aware operators from data to mimic various traditional hand-crafted filters like the bilateral, weighted median, and weighted least squares filters [28]. However, machine learning approaches require large amounts of training data specific to a task, as well as significant compute power, while our approach works without any task-dependent training and runs efficiently on a single GPU.
The domain transform
Gastal and Oliveira [1] introduced the domain transform, a novel and efficient method for edge-aware filtering that is akin to bilateral filters. The domain transform is defined as a 1D isometric transformation of a multivalued 1D function such that the distances in the range and domain are preserved. (See Sec. 3.2 for more details.) When applied to a 1D image with multiple color channels, the transformation maps the distances in color and pixel space into a 1D distance in the transformed space. Measuring the scalar distance in the transformed space is equivalent to measuring the vector distance in [R,G,B,X] space. This has the benefit of dimensionality reduction, leading to a fast edge-aware filtering technique that respects edges in color while blurring nearby pixels. To apply the domain transform to a 2D image, the authors apply two passes, once in the X direction and once in Y. Applying the domain transform to an image results in a filtering effect, while in our case we optimize according to an objective function. Chen et al. [2] proposed to perform edge-aware semantic segmentation using deep learning and use the domain transform filter in the end-to-end training of their deep-learning framework. They also alter the definition of what is considered an ‘edge’ by learning an edge prediction network, and they then use the learned edge map in a domain transform. Their application of the domain transform is in the form of a filter, and hence is similar to one iteration of our method. We use the domain transform iteratively inside an optimization framework because it provides an efficient way to compute the local edge-aware mean.
Bilateral solvers
More recently, Barron et al. [29] suggested viewing a color image as a function of the 5D space [Y,U,V,X,Y], which they call the ‘bilateral space’, to estimate stereo for rendering defocus blur. They proposed to transform the stereo optimization problem by expressing the problem variables in the bilateral space and optimizing in this new space. We will refer to this method as BLStereo. Barron and Poole’s [30] Bilateral Solver (BLSolver), on the other hand, solves a linear optimization problem in the bilateral space, which is different from BLStereo. In this setting, they require a target map to enhance, as well as a confidence map for the target map. The linearization of the problem allows them to converge to the solution faster. (See Sec. 3.1 for more details.) Both of these approaches quantize the 5D space into a grid, where the grid size is governed by the blur kernel sizes. This reduces the number of optimization variables and hence the complexity, leading to fast runtimes.
Our work is closely related to Barron et al. [29] and Barron and Poole [30] in that we target the same goal of developing general solvers that are edge-aware and fast. The gridding strategy of these previous methods scales well with larger blur windows. However, using larger blur windows is not a scalable option as image resolution increases, especially in high-resolution imagery where it is important to maintain fine details, such as multi-camera capture for virtual reality and satellite imagery. In contrast, our method does not require large blur kernels to be efficient. Our method operates on the pixels themselves, and hence inherently has a large number of optimization variables. Despite this, our approach is inherently parallelizable, making it easy to implement on GPUs. Our method does not depend on the blur kernel size, and hence scales well to higher image resolutions with both large and small blur kernels.
In the following sections, we present our general optimization framework and demonstrate its performance on a variety of problems. We present quantitative evaluations to show the competitive accuracy of our method, as well as the significant speedup that it provides.
3 Approach
Edge-aware filtering techniques like the bilateral filter smooth similar-looking regions of the image while preserving crisp edges. This is especially useful for smoothing depth maps, where we want to preserve sharp discontinuities in depth by not filtering across depth edges while smoothing out planar regions. Although edge-aware filtering has been used for stereo as a post-processing step [12], as well as during optimization [4], this approach increases the computational complexity of the algorithm substantially due to the data-dependent smoothing kernel. We consider an algorithm a filtering technique when the filter operates on the input image to produce an output image. On the other hand, we consider an algorithm a solver when it uses one or more input images and optimizes towards a goal defined by a cost/loss function.
3.1 Optimization framework
First, we introduce our efficient domain transform solver (DTS), which leverages an efficient way of expressing distances using isometry. The DTS solves the following optimization problem:
$$\min_{x} \; \lambda \sum_i \left(x_i - \mu_i\right)^2 + \sum_i c_i \left(x_i - t_i\right)^2 + \rho \sum_i f(x_i) \qquad (1)$$
Here, the $x_i$ are the values we want to estimate, e.g., disparity and color, at the $i$-th pixel of an image. The initial target estimate $t_i$ with a confidence $c_i$ are also given for the $i$-th pixel. This optimization objective has an edge-aware regularizer, weighted by $\lambda$, which forces the $x_i$ to be similar to the mean of the neighborhood $\mathcal{N}_i$, computed in an edge-aware sense. Hence, the neighborhood’s size changes for each $i$ according to the image content. Intuitively, by forcing $x_i$ to be similar to the edge-aware mean $\mu_i$, we emulate the bilateral filter’s properties so that $x_i$ is similar to the other $x_j$, which contribute to the mean only when pixels $i$ and $j$ are similar in color and close in pixel distance. This edge-aware mean is $\mu_i = \sum_{j \in \mathcal{N}_i} W_{ij} x_j / \sum_{j \in \mathcal{N}_i} W_{ij}$, where $W_{ij}$ takes into account the pixel color similarity as well as the pixel distance between pixels $i$ and $j$; see Sec. 3.2 for a derivation of $W_{ij}$. We compute $\mu_i$ using the domain transform, which enables us to evaluate our pairwise regularizer faster than traditional approaches; see Sec. 3.2 for more details. $f$ is an application-dependent term with a weighting factor of $\rho$. For example, for stereo, $f$ could be the photometric matching cost for the left-right image pair.
In all applications, our method aims to solve Eq. (1). The minimum necessarily has a zero derivative. Hence, we next characterize this minimum in order to leverage it later in our proposed approach. For simplicity, we first investigate a simplified version of Eq. (1) that does not contain the problem-specific term $f$. This simplified version can be written as follows:
$$\min_{x} \; \lambda \sum_i \left(x_i - \mu_i\right)^2 + \sum_i c_i \left(x_i - t_i\right)^2 \qquad (2)$$
Taking the gradient of Eq. (2) with respect to $x_i$, with the edge-aware mean $\mu_i$ held fixed, and setting it to zero provides
$$2\lambda \left(x_i - \mu_i\right) + 2 c_i \left(x_i - t_i\right) = 0 \qquad (3)$$
Hence, at the minimum of Eq. (2) we have
$$x_i = \frac{\lambda \mu_i + c_i t_i}{\lambda + c_i} \qquad (4)$$
Now, we highlight the relation to the optimization function of the BLSolver. Its optimization objective, written with estimates $x_i$, targets $t_i$, confidences $c_i$, and pairwise weights $W_{ij}$, is
$$\min_{x} \; \lambda \sum_{i,j} W_{ij} \left(x_i - x_j\right)^2 + \sum_i c_i \left(x_i - t_i\right)^2 \qquad (5)$$
Inspecting the derivative at the minimum, as we did for Eq. (2), requires us to compute the gradient of Eq. (5) with respect to $x_i$ and set it to zero, which gives
$$x_i = \frac{2\lambda \sum_j W_{ij} x_j + c_i t_i}{2\lambda \sum_j W_{ij} + c_i} \qquad (6)$$
The extra factor of 2 with $\lambda$ is due to the fact that we have to consider the terms in which the roles of $i$ and $j$ are exchanged. The solutions in Eq. (4) and Eq. (6) look very similar. The major difference becomes visible after dividing the numerator and denominator of Eq. (6) by $\sum_j W_{ij}$: the contribution of the confidence scores is weighted by $1/\sum_j W_{ij}$, and hence it is edge-aware. We also weigh the confidence during gradient descent updates by $1/\sum_j W_{ij}$ to mimic its effect in Eq. (6), which gives less weight to the target $t_i$ when we have a large support via the similarities expressed by $W_{ij}$. We compute the confidence scores $c_i$ by estimating the variance of the $t_i$ in an edge-aware sense using the domain transform, as suggested in [30]:
(7)
In this formulation, the domain transform is treated as a local estimate of the mean in an edge-aware sense, while a scale parameter converts the variance into confidence scores.
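A minimal sketch of a variance-based confidence of this kind, in plain Python on a single scanline. The exponential mapping from variance to confidence, the parameter name `sigma_c`, and the `edge_mean` callable (any function that returns a local, ideally edge-aware, mean per pixel) are assumptions made for illustration; the text only states that a local variance is estimated and then scaled.

```python
import math

def confidence_from_variance(target, edge_mean, sigma_c):
    """Map the local variance of `target` to a confidence in (0, 1].

    `edge_mean` is a hypothetical callable: list of floats -> list of local means.
    The variance is estimated as E[t^2] - E[t]^2 using that local mean.
    """
    mean_t = edge_mean(target)
    mean_t2 = edge_mean([t * t for t in target])
    confidence = []
    for m, m2 in zip(mean_t, mean_t2):
        variance = max(m2 - m * m, 0.0)  # clamp small negative round-off
        # Assumed mapping: high local variance -> low confidence.
        confidence.append(math.exp(-variance / (2.0 * sigma_c ** 2)))
    return confidence
```

Pixels whose neighborhood agrees on the target value receive a confidence of 1, while pixels near outliers or disparity jumps are trusted less.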
In summary, Eq. (2) and Eq. (5) have the same optimal solution, but the solution of Eq. (2) can be computed significantly faster by leveraging parallel computations. The reason is that we replace the pairwise term in Eq. (5) by the local edge-aware mean, which we can compute in an efficient manner (see Sec. 3.2 for details), and we weigh the contribution of the targets $t_i$ by adapting the input confidence according to the local support $\sum_j W_{ij}$.
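The iteration described in this section can be sketched as follows, in plain Python on a single scanline. This is an illustration under stated assumptions, not the authors' GPU implementation: the names `dts_solve` and `edge_mean` are hypothetical, the edge-aware mean is held fixed within each step, and the gradient is rescaled by its per-pixel curvature `2 * (lam + c_i)` so that a step size of 0.99 stays stable. Every pixel is updated independently of the others within a step, which is what makes the scheme easy to parallelize.

```python
def dts_solve(target, confidence, edge_mean, lam, step=0.99, iters=300):
    """Gradient descent on lam * (x_i - mu_i)^2 + c_i * (x_i - t_i)^2.

    `edge_mean` is a hypothetical callable returning the local mean mu for the
    current estimate; `lam` must be positive.
    """
    x = list(target)
    for _ in range(iters):
        mu = edge_mean(x)  # recomputed once per iteration, then held fixed
        updated = []
        for xi, mi, ti, ci in zip(x, mu, target, confidence):
            grad = 2.0 * lam * (xi - mi) + 2.0 * ci * (xi - ti)
            # Assumed normalization: divide by the curvature 2 * (lam + ci),
            # so step = 1 would jump to (lam * mi + ci * ti) / (lam + ci).
            updated.append(xi - step * grad / (2.0 * (lam + ci)))
        x = updated
    return x
```

A pixel with zero confidence is driven entirely by its neighborhood mean, while a pixel with high confidence stays close to its target.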
3.2 Domain transform
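Gastal and Oliveira [1] define an isometric transformation, which they call the domain transform (DT), for a 1D multivalued function $I: \Omega \to \mathbb{R}^c$, $\Omega \subset \mathbb{R}$, by treating $(p, I(p))$ as a curve $C$ in $\mathbb{R}^{c+1}$. The domain transform $t: \mathbb{R}^{c+1} \to \mathbb{R}$ is such that it preserves distances between two points on the curve under a given norm. Unlike Gastal and Oliveira [1], we use the $\ell_2$ norm to define the distances, and hence we derive the domain transform here, which satisfies the constraint $|t(p_i, I(p_i)) - t(p_j, I(p_j))| = \|(p_i, I(p_i)) - (p_j, I(p_j))\|_2$ for the nearest neighbors $p_i$ and $p_j$. This derivation closely follows Sec. 4 of Gastal and Oliveira [1]. Using the shorthand notation $ct(p) = t(p, I(p))$ and assuming a small shift $h$ in $p$, we can set the distance in pixels and color equal to the distance of the transform; dividing by $h^2$ and letting $h \to 0$ then gives:
$$\left(ct(p+h) - ct(p)\right)^2 = h^2 + \sum_{k=1}^{c} \left(I_k(p+h) - I_k(p)\right)^2 \;\;\Rightarrow\;\; ct'(p)^2 = 1 + \sum_{k=1}^{c} I_k'(p)^2 \qquad (8)$$
Taking the square root and constraining $ct$ to be monotonic to avoid negative roots, followed by integrating both sides, we obtain
$$ct(u) = \int_0^u \sqrt{1 + \sum_{k=1}^{c} I_k'(p)^2} \; dp \qquad (9)$$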
Using this definition of the domain transform of the 4D space, with the curve $C$ defined by RGB color and X denoting the domain, we can express the edge-aware mean $\mu_i$ as follows:
(10)
where $\delta$ represents Dirac’s delta function. To see the relation to the simple domain transform blurring [1], note that setting the confidence scores $c_i$ to zero in Eq. (1) leads to the same solution as domain transform filtering. Similarly, setting $c_i$ to zero and $W_{ij}$ to Gaussian weights in color and space leads to bilateral filtering. Note that the above derivation is isometric because the function $I$ is multivalued but has a 1-dimensional domain $\Omega$. When the domain is extended to 2D, the exact isometry no longer holds, and Gastal and Oliveira [1] use alternating passes by separately considering the image as a function of X and then Y.
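A minimal sketch of the two ingredients of this section on a single scanline, in plain Python: the domain transform under the L2 norm, computed as a cumulative arc length of the curve formed by pixel position and color, and an edge-aware mean taken as a box average in the transformed domain. The function names, the unit pixel spacing, the absence of any color scaling, and the `radius` parameter are assumptions made for illustration; they are not taken from the authors' implementation.

```python
import math
from bisect import bisect_left, bisect_right

def domain_transform_l2(signal):
    """Cumulative L2 arc length along one scanline.

    `signal` is a list of per-pixel channel tuples, e.g. [(r, g, b), ...].
    Each step adds sqrt(1 + squared color difference), so the result is
    strictly increasing.
    """
    ct = [0.0]
    for prev, cur in zip(signal, signal[1:]):
        color_sq = sum((a - b) ** 2 for a, b in zip(cur, prev))
        ct.append(ct[-1] + math.sqrt(1.0 + color_sq))  # unit pixel spacing
    return ct

def edge_aware_mean(values, ct, radius):
    """Mean of `values` over all pixels within `radius` in the transformed domain.

    Returns (means, support), where support[i] is the number of pixels that
    contributed to means[i].
    """
    prefix = [0.0]
    for v in values:
        prefix.append(prefix[-1] + v)
    means, support = [], []
    for c in ct:
        lo = bisect_left(ct, c - radius)   # first index inside the window
        hi = bisect_right(ct, c + radius)  # one past the last index inside
        support.append(hi - lo)
        means.append((prefix[hi] - prefix[lo]) / (hi - lo))
    return means, support
```

On a scanline with a strong color step, pixels on either side of the step end up far apart in the transformed domain, so a box of moderate radius averages within each side only. Because the transform is monotonic, the window for each pixel is found by binary search and the sums come from a prefix array.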
3.3 Stereo optimization
Stereo estimation is a well-studied problem [31, 32] in which the task is to estimate a matching correspondence between pixels in the left image and pixels in the right image. This matching correspondence defines the disparity of the pixels and in turn the depth, and when computed for each pixel it provides us with a disparity map. Typically, a dense search is done along the rows of a rectified pair by comparing pixel color similarity, known as the photometric matching cost. In the following, we refine a disparity map. We obtain the disparity map from MC-CNN [12], which acts as our target (Fig. 2(c)), for which we calculate a confidence score (shown in Fig. 2(d)) using Eq. (7). We use the left color image to define $W_{ij}$ and compute the edge-aware mean $\mu_i$, and we optimize the disparities $x_i$ to obtain a disparity map that is smooth in homogeneous regions but has crisp edges (Fig. 2(e)). Similar to our proposed solver, Barron and Poole [30] show that the BLSolver works well for a wide variety of optimization problems, including stereo. When they apply the BLSolver to the stereo problem, they achieve faster convergence compared to BLStereo because they neglect the physical implication of changing the disparity. In other words, if an optimizer changes the estimate of disparity at a point in the left image, this is reflected in a change in the color of its matching pixel in the right image. Here, we present a method for solving for the disparity in an edge-aware sense while having a photometric penalty for the left-right matching. Our loss for stereo optimization is as follows:
(11) 
where $I_l$ is the left image and $I_r$ is the right image of the stereo pair. For robust optimization, we use a Charbonnier loss on the target term, which has been shown to be effective for optical flow [33]. We use Zbontar and LeCun’s MC-CNN [12] as the target for our stereo optimization.
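The two stereo-specific ingredients mentioned above, a Charbonnier penalty and a photometric left-right cost, can be sketched in plain Python on one rectified scanline. The value of `eps`, the squared-difference form of the photometric cost, the linear sampling, and the convention that left column `u` matches right column `u - d` are assumptions made for illustration; rows are assumed to have at least two pixels.

```python
import math

def charbonnier(residual, eps=1e-3):
    """Smooth, robust approximation of |residual|: sqrt(residual^2 + eps^2)."""
    return math.sqrt(residual * residual + eps * eps)

def sample_linear(row, u):
    """Linearly interpolate a scanline at fractional column u (clamped to the row)."""
    u = min(max(u, 0.0), len(row) - 1.0)
    i = min(int(u), len(row) - 2)
    w = u - i
    return (1.0 - w) * row[i] + w * row[i + 1]

def photometric_cost(left_row, right_row, disparity):
    """Sum of squared differences between left pixels and their right matches."""
    cost = 0.0
    for u, d in enumerate(disparity):
        # Assumed convention: left column u corresponds to right column u - d.
        cost += (left_row[u] - sample_linear(right_row, u - d)) ** 2
    return cost
```

Changing a disparity moves the sampled position in the right image, so this term reacts to the physical effect of a disparity change, which the target term alone does not capture.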
Next, we will detail the application of our method to the problem of rendering defocus from depth, which is another application heavily relying on accurate depth edges.
3.4 Synthetic defocus from depth
Interest in creating synthetic defocus from depth is growing, with phones like the Google Pixel 2 and the OnePlus 5T providing a portrait mode where the shallow depth-of-field effect is mimicked through the estimation of depth. BLStereo’s synthetic defocus method is used as part of the Lens Blur feature on Google’s phones [29]. We use our stereo optimization from Sec. 3.3 to estimate depth maps, which retain sharp discontinuities at color edges. Fig. 5 shows the original color image and the defocus renderings produced using our estimated depth maps and the ground-truth depth maps for scenes in the Middlebury dataset [11]. As our stereo optimization is edge-aware, the defocus rendering maintains high quality even at the edges. Notice that in the insets of Fig. 5, the MC-CNN rendering has jarring artifacts, especially at the edges, while the rendering using our estimated depth map is smoother. In the Jadeplant scene shown in Fig. 5, the background is in focus in one rendering, while in another rendering of the same scene the blue block in the front is kept in focus. In the Playroom scene illustrated in Fig. 5, the front chairs are chosen to be in focus. To render the synthetic defocus, we used the algorithm described in Sec. 6 of the supplementary material of Barron et al. [29]. These renderings show that our results are qualitatively better than those of MC-CNN, and the most noticeable improvements arise because we optimize in an edge-aware sense.
3.5 Depth super-resolution
The availability of cheap commodity depth sensors like the Microsoft Kinect, Asus Xtion, and Intel RealSense has spurred many avenues of research, including depth super-resolution. Depth super-resolution is important for sensors like these because the color camera is often of high resolution, but the depth camera/projector has low resolution, which leads to crude depth maps [34]. Ferstl et al. [35] adapted the Middlebury dataset for the depth super-resolution task to create a benchmark, on which we evaluate our method. For this task, we use simple bicubic interpolation to upsample the low-resolution depth map and use this map as the target in our optimization; we use the high-resolution color image to compute the domain-transform-based edge-aware mean and obtain our optimized result (Fig. 6(d)). We follow Barron and Poole [30] by setting the confidence scores using a Gaussian bump model to represent the contribution of each pixel to the nearby upsampled pixels. We do not use additional application-dependent penalties in Eq. (1) for this task.
4 Experiments
We now present a quantitative evaluation of our framework as well as its timing performance.
Stereo Optimization
For the quantitative evaluation of our method, we use the Middlebury dataset [11]. Barron and Poole [30] used MC-CNN [12] as their initialization, and for a fair comparison we also use it as our target disparity map. Table 1 shows our results, where we present the mean absolute error (MAE), root mean square error (RMSE), time per megapixel (time/MP), and time normalized by the number of disparity hypotheses (time/GD), for non-occluded regions and for all pixels. All of these values were determined by the Middlebury evaluation website, and all of our times include the time to compute the MC-CNN target disparity maps. For BLSolver and our method, the timing columns list the additional time spent refining the MC-CNN output, with the total in parentheses. On the training set, DTS lowers the MAE of MC-CNN from 3.81 px to 3.02 px in non-occluded regions and from 11.8 px to 9.12 px over all pixels, at an overhead of only 5.9 s/MP. Our accuracy is similar to that of Barron and Poole [30], especially on non-occluded pixels, while the refinement takes 5.9 s/MP instead of 42.7 s/MP. The results for the test set show that our method achieves comparable results in non-occluded regions, while having significant computational savings. RGB colors were normalized to a range of [0,1], and the solver parameters were chosen via a grid search on the Middlebury training data. We ran gradient descent for 3000 iterations in this experiment with a step size of 0.99 times the gradient. Fig. 8 shows zoomed regions from the Jadeplant and Pipes scenes to highlight that we improve the target disparity maps from MC-CNN [12] to estimate sharp depth edges.
Set | Algorithm | MAE (px) no occ | MAE (px) all | RMSE (px) no occ | RMSE (px) all | time/MP (s) | time/GD (s)
Training | MC-CNN [12] | 3.81 | 11.8 | 18.0 | 36.6 | 83.3 | 259
Training | MC-CNN + BLSolver [30] | 2.60 | 6.66 | 10.2 | 20.9 | 42.7 (126) | 153 (412)
Training | MC-CNN + DTS (ours) | 3.02 | 9.12 | 10.8 | 27.4 | 5.9 (89.2) | 19 (278)
Testing | MC-CNN [12] | 3.82 | 17.9 | 21.3 | 55.0 | 112 | 254
Testing | MC-CNN + BLSolver [30] | 2.67 | 8.19 | 15.0 | 29.9 | 28 (140) | 91 (345)
Testing | MC-CNN + DTS (ours) | 3.78 | 14.6 | 17.6 | 43.4 | 10 (122) | 23 (277)
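For reference, the two error measures reported here can be sketched in plain Python as below. This is an unweighted version with a hypothetical `valid` mask for selecting, for example, non-occluded pixels; the Middlebury evaluation website applies its own weighting, so this sketch is not expected to reproduce the table values exactly.

```python
import math

def mae_rmse(estimate, ground_truth, valid=None):
    """Mean absolute error and root-mean-square error over the valid pixels."""
    if valid is None:
        valid = [True] * len(estimate)
    errors = [e - g for e, g, v in zip(estimate, ground_truth, valid) if v]
    if not errors:
        raise ValueError("no valid pixels to evaluate")
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return mae, rmse
```

Because the RMSE squares each error, a few large disparity errors (typical in occluded regions) raise it much more than they raise the MAE, which is why the two columns can rank methods differently.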
Depth super-resolution
We use the dataset introduced by Ferstl et al. [35] to evaluate our method for depth super-resolution. This dataset consists of three scenes (Art, Books, and Moebius) with added noise at 2x, 4x, 8x, and 16x levels of upsampling. The solver parameters depend on the upsampling factor; RGB colors are normalized to a range of [0,1], and we run 10 iterations of gradient descent with a step size of 0.99. In Table 2, we present the RMS and mean geometric errors for each scene. Data for the bicubic and BLSolver results were produced using the data and code provided by Barron et al. [36]. We also used the same code to evaluate our method. Our method and the BLSolver used the bicubic upsampling as the target image. Our method is roughly 10x faster than the BLSolver of Barron and Poole [30] (0.0215 s versus 0.234 s) while achieving comparable performance on most images, especially images with higher upsampling factors. Our time is the average over all images and includes the 0.007 seconds required for bicubic upsampling.
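Method | Art 2x | Art 4x | Art 8x | Art 16x | Books 2x | Books 4x | Books 8x | Books 16x | Moebius 2x | Moebius 4x | Moebius 8x | Moebius 16x | Avg. (px) | Time (s)
Bicubic | 5.32 | 6.07 | 7.27 | 9.59 | 5.00 | 5.15 | 5.45 | 5.97 | 5.34 | 5.51 | 5.68 | 6.11 | 5.94 | 0.007
BLSolver [30] | 3.02 | 3.91 | 5.14 | 7.47 | 1.41 | 1.86 | 2.42 | 3.34 | 1.39 | 1.82 | 2.40 | 3.26 | 2.75 | 0.234
DTS (Ours) | 4.58 | 5.11 | 5.81 | 7.69 | 1.94 | 2.34 | 2.85 | 3.74 | 1.97 | 2.34 | 2.89 | 3.89 | 3.43 | 0.0215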
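The target and confidence used for depth super-resolution can be illustrated in 1D with plain Python. This is a sketch under assumptions: it uses the Keys cubic convolution kernel (the common 1D building block of bicubic interpolation) with a = -0.5, pixel-centered coordinates, and one possible reading of the Gaussian bump model in which confidence decays with the distance, in high-resolution pixels, to the nearest low-resolution sample. The function names and `sigma` are hypothetical.

```python
import math

def keys_cubic(s, a=-0.5):
    """Keys cubic convolution kernel, evaluated at offset s."""
    s = abs(s)
    if s <= 1.0:
        return (a + 2.0) * s ** 3 - (a + 3.0) * s ** 2 + 1.0
    if s < 2.0:
        return a * s ** 3 - 5.0 * a * s ** 2 + 8.0 * a * s - 4.0 * a
    return 0.0

def upsample_with_confidence(low, factor, sigma):
    """Return (target, confidence) on a grid `factor` times denser than `low`."""
    target, confidence = [], []
    for u in range(len(low) * factor):
        pos = (u + 0.5) / factor - 0.5           # position in low-res coordinates
        base = math.floor(pos)
        value = 0.0
        for k in range(base - 1, base + 3):      # four nearest low-res samples
            clamped = min(max(k, 0), len(low) - 1)
            value += keys_cubic(pos - k) * low[clamped]
        target.append(value)
        nearest = min(max(round(pos), 0), len(low) - 1)
        dist = abs(pos - nearest) * factor       # distance in high-res pixels
        # Assumed bump model: trust the target most near a measured sample.
        confidence.append(math.exp(-dist ** 2 / (2.0 * sigma ** 2)))
    return target, confidence
```

The returned pair can be passed to a solver as the target and its confidence; pixels far from any measured depth sample are then dominated by the edge-aware mean computed from the high-resolution color image.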
Scale
Now we present how our method scales with increasing image resolution and increasing blur kernel sizes. Our method scales linearly with the number of pixels in the image. Fig. 9(a) shows the dependence of time in seconds on the number of pixels in the image. We use the training images from the Middlebury dataset and only show the time consumed by DTS for the stereo task at 3000 iterations.
Timing vs blur kernels
The time taken by our method remains mostly constant across vastly different blur kernel sizes. Fig. 9(b) and Fig. 9(c) each show the average time taken when we vary one of the two blur kernel sizes while keeping the other constant. All the times are at 3000 iterations. There is only a weak dependence of time on the blur kernel sizes; since the range of kernel sizes is large, the small time changes are negligible in practice.
Iterations
The number of iterations in the gradient descent scheme affects the accuracy. Here, we study this effect for the training images of the Middlebury dataset. Fig. 10(a) and (b) show the MAE and RMSE, which we calculated over all pixels, including occlusions, in all the training images of the dataset. The 3000-iteration result is the same as the one shown in Table 1, but the numbers differ because the Middlebury evaluation internally uses a weighting scheme, which is not used here. Both MAE and RMSE decrease quickly up to approximately 300 iterations, and after that the gains are smaller. The run time is linear in the number of iterations; see Fig. 10(c). This allows us to easily trade off resolution, quality, and run time depending on the application.
5 Conclusion
We have presented a novel edge-aware solver that achieves scalable performance across a variety of tasks. Our method is faster by an order of magnitude compared to the state of the art while performing at comparable accuracy. The approach is highly parallelizable and scales well with respect to image resolution, and unlike existing methods, it is independent of the blur kernel size. A future step is to extend our approach to a multi-GPU setting, as well as to use advanced optimization methods like conjugate gradient descent to obtain faster convergence.
References
[1] Gastal, E.S., Oliveira, M.M.: Domain transform for edge-aware image and video processing. In: ACM Transactions on Graphics (TOG). Volume 30., ACM (2011) 69
[2] Chen, L.C., Barron, J.T., Papandreou, G., Murphy, K., Yuille, A.L.: Semantic image segmentation with task-specific edge detection using CNNs and a discriminatively trained domain transform. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. (2016) 4545–4554
[3] Krähenbühl, P., Koltun, V.: Efficient inference in fully connected CRFs with Gaussian edge potentials. In: Advances in Neural Information Processing Systems. (2011) 109–117
[4] Bleyer, M., Rhemann, C., Rother, C.: PatchMatch stereo - stereo matching with slanted support windows. In: BMVC. Volume 11. (2011) 1–11
[5] Levin, A., Lischinski, D., Weiss, Y.: Colorization using optimization. In: ACM Transactions on Graphics (TOG). Volume 23., ACM (2004) 689–694
[6] Revaud, J., Weinzaepfel, P., Harchaoui, Z., Schmid, C.: EpicFlow: Edge-preserving interpolation of correspondences for optical flow. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. (2015) 1164–1172
[7] Tomasi, C., Manduchi, R.: Bilateral filtering for gray and color images. In: Computer Vision, 1998. Sixth International Conference on, IEEE (1998) 839–846
[8] Perona, P., Malik, J.: Scale-space and edge detection using anisotropic diffusion. IEEE Transactions on Pattern Analysis and Machine Intelligence 12(7) (1990) 629–639
[9] Fattal, R.: Edge-avoiding wavelets and their applications. ACM Transactions on Graphics (TOG) 28(3) (2009) 22
[10] Lu, J., Yang, H., Min, D., Do, M.N.: Patch match filter: Efficient edge-aware filtering meets randomized search for fast correspondence field estimation. In: Computer Vision and Pattern Recognition (CVPR), 2013 IEEE Conference on, IEEE (2013) 1854–1861
[11] Scharstein, D., Hirschmüller, H., Kitajima, Y., Krathwohl, G., Nešić, N., Wang, X., Westling, P.: High-resolution stereo datasets with subpixel-accurate ground truth. In: German Conference on Pattern Recognition, Springer (2014) 31–42
[12] Zbontar, J., LeCun, Y.: Stereo matching by training a convolutional neural network to compare image patches. Journal of Machine Learning Research 17(132) (2016) 2
[13] Weiss, B.: Fast median and bilateral filtering. In: ACM Transactions on Graphics (TOG). Volume 25., ACM (2006) 519–526
[14] Adams, A., Baek, J., Davis, M.A.: Fast high-dimensional filtering using the permutohedral lattice. In: Computer Graphics Forum. Volume 29., Wiley Online Library (2010) 753–762
[15] Chen, J., Paris, S., Durand, F.: Real-time edge-aware image processing with the bilateral grid. In: ACM Transactions on Graphics (TOG). Volume 26., ACM (2007) 103
[16] Yang, Q., Ahuja, N., Tan, K.H.: Constant time median and bilateral filtering. International Journal of Computer Vision 112(3) (2015) 307–318
[17] Elad, M.: On the origin of the bilateral filter and ways to improve it. IEEE Transactions on Image Processing 11(10) (2002) 1141–1151
[18] Durand, F., Dorsey, J.: Fast bilateral filtering for the display of high-dynamic-range images. In: ACM Transactions on Graphics (TOG). Volume 21., ACM (2002) 257–266
[19] Pham, T.Q., Van Vliet, L.J.: Separable bilateral filtering for fast video preprocessing. In: Multimedia and Expo, 2005. ICME 2005. IEEE International Conference on, IEEE (2005) 4–pp
[20] Paris, S., Durand, F.: A fast approximation of the bilateral filter using a signal processing approach. In: European Conference on Computer Vision, Springer (2006) 568–580
[21] Zhang, M., Gunturk, B.K.: Multiresolution bilateral filtering for image denoising. IEEE Transactions on Image Processing 17(12) (2008) 2324–2333
[22] Achanta, R., Shaji, A., Smith, K., Lucchi, A., Fua, P., Süsstrunk, S.: SLIC superpixels compared to state-of-the-art superpixel methods. IEEE Transactions on Pattern Analysis and Machine Intelligence 34(11) (2012) 2274–2282
[23] Bódis-Szomorú, A., Riemenschneider, H., Van Gool, L.: Superpixel meshes for fast edge-preserving surface reconstruction. In: Proceedings CVPR 2015. (2015) 2011–2020
[24] Barnes, C., Shechtman, E., Finkelstein, A., Goldman, D.B.: PatchMatch: A randomized correspondence algorithm for structural image editing. ACM Transactions on Graphics (TOG) 28(3) (2009) 24
[25] Yang, Q., Wang, S., Ahuja, N.: SVM for edge-preserving filtering. In: Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on, IEEE (2010) 1775–1782
[26] Chen, L.C., Papandreou, G., Kokkinos, I., Murphy, K., Yuille, A.L.: DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. arXiv preprint arXiv:1606.00915 (2016)
[27] Xu, L., Ren, J., Yan, Q., Liao, R., Jia, J.: Deep edge-aware filters. In: International Conference on Machine Learning. (2015) 1669–1678
[28] Farbman, Z., Fattal, R., Lischinski, D., Szeliski, R.: Edge-preserving decompositions for multi-scale tone and detail manipulation. In: ACM Transactions on Graphics (TOG). Volume 27., ACM (2008) 67
[29] Barron, J.T., Adams, A., Shih, Y., Hernández, C.: Fast bilateral-space stereo for synthetic defocus. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. (2015) 4466–4474
[30] Barron, J.T., Poole, B.: The fast bilateral solver. In: European Conference on Computer Vision, Springer (2016) 617–632
[31] Scharstein, D., Szeliski, R.: A taxonomy and evaluation of dense two-frame stereo correspondence algorithms. International Journal of Computer Vision 47(1-3) (2002) 7–42
[32] Hirschmuller, H., Scharstein, D.: Evaluation of cost functions for stereo matching. In: Computer Vision and Pattern Recognition, 2007. CVPR’07. IEEE Conference on, IEEE (2007) 1–8
[33] Sun, D., Roth, S., Black, M.J.: Secrets of optical flow estimation and their principles. In: Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on, IEEE (2010) 2432–2439
[34] Khoshelham, K., Elberink, S.O.: Accuracy and resolution of Kinect depth data for indoor mapping applications. Sensors 12(2) (2012) 1437–1454
[35] Ferstl, D., Reinbacher, C., Ranftl, R., Rüther, M., Bischof, H.: Image guided depth upsampling using anisotropic total generalized variation. In: Computer Vision (ICCV), 2013 IEEE International Conference on, IEEE (2013) 993–1000
[36] Barron, J.: https://jonbarron.info/. https://drive.google.com/file/d/0B4nuwEMaEsnmaDI3bm5VeDRxams/view?usp=sharing (2008) Online; accessed 8 March 2018.