A Generalized Framework for Edge-preserving and Structure-preserving Image Smoothing

07/23/2019 ∙ by Wei Liu, et al.

Image smoothing is a fundamental procedure in applications of both computer vision and graphics. The required smoothing properties can be different or even contradictive among different tasks. Nevertheless, the inherent smoothing nature of one smoothing operator is usually fixed and thus cannot meet the various requirements of different applications. In this paper, a non-convex non-smooth optimization framework is proposed to achieve diverse smoothing natures, where even contradictive smoothing behaviors can be obtained. To this end, we first introduce the truncated Huber penalty function, which has seldom been used in image smoothing. A robust framework is then proposed. When combined with the strong flexibility of the truncated Huber penalty function, our framework is capable of a range of applications and can outperform state-of-the-art approaches in several tasks. In addition, an efficient numerical solution is provided, and its convergence is theoretically guaranteed even though the optimization framework is non-convex and non-smooth. The effectiveness and superior performance of our approach are validated through comprehensive experimental results in a range of applications.


1 Introduction

The key challenge of many tasks in both computer vision and graphics can be attributed to image smoothing. At the same time, the required smoothing properties can vary dramatically for different tasks. In this paper, depending on the required smoothing properties, we roughly classify a large number of applications into four groups.

Applications in the first group require the smoothing operator to smooth out small details while preserving strong edges, and the amplitudes of these strong edges can be reduced but the edges should be neither blurred nor sharpened. Representatives in this group are image detail enhancement and HDR tone mapping [9, 10, 19]. Blurring edges can result in halos while sharpening edges will lead to gradient reversals [9].


Figure 1: Our method is capable of (a) image detail enhancement, (b) clip-art compression artifacts removal, (c) guided depth map upsampling and (d) image texture removal. These applications are representatives of edge-preserving and structure-preserving image smoothing and require contradictive smoothing properties.

The second group includes tasks like clip-art compression artifacts removal [29, 40, 38], image abstraction and pencil sketch production [40]. In contrast to the ones in the first group, these tasks require smoothing out small details while sharpening strong edges. This is because edges can be blurred in the compressed clip-art image, and they need to be sharpened when the image is recovered (see Fig. 1(b) for example). Sharper edges can produce better visual quality in image abstraction and pencil sketches. At the same time, the amplitudes of strong edges are not allowed to be reduced in these tasks.

Guided image filtering, such as guided depth map upsampling [32, 11, 28] and flash/no-flash filtering [23, 33], is categorized into the third group. The structure inconsistency between the guidance image and the target image, which can cause blurred edges and texture copy artifacts in the smoothed image [18, 28], should be properly handled by a specially designed smoothing operator. These tasks also need to sharpen edges in the smoothed image, because the low-quality capture of depth and the noise in the no-flash images can lead to blurred edges (see Fig. 1(c) for example).

Tasks in the fourth group require smoothing the image in a scale-aware manner, e.g., image texture removal [41, 45, 7]. These tasks require smoothing out small structures even when they contain strong edges, while large structures should be properly preserved even when their edges are weak (see Fig. 1(d) for example). This is totally different from the above three groups, which all aim at preserving strong edges.

To be more explicit, we categorize the smoothing procedures in the first to the third groups as edge-preserving image smoothing since they try to preserve salient edges, while the smoothing processes in the fourth group are classified as structure-preserving image smoothing because they aim at preserving salient structures.

A diversity of edge-preserving and structure-preserving smoothing operators have been proposed for various tasks. Generally, each of them is designed to meet the requirements of certain applications, and thus its inherent smoothing nature is usually fixed. Therefore, there is seldom a smoothing operator that can meet all the smoothing requirements of the above four groups, which are quite different or even contradictive. For example, the ℓ0 norm smoothing [40] can sharpen strong edges and is suitable for clip-art compression artifacts removal; however, it will lead to gradient reversals in image detail enhancement and HDR tone mapping. The weighted least squares (WLS) smoothing [9] performs well in image detail enhancement and HDR tone mapping, but it is not capable of sharpening edges or of structure-preserving smoothing.

In contrast to most of the smoothing operators in the literature, a new smoothing operator, which is based on a non-convex non-smooth optimization framework, is proposed in this paper. It can achieve different and even contradictive smoothing behaviors and is able to handle the applications in the four groups mentioned above. The main contributions of this paper are as follows:

  • We introduce the truncated Huber penalty function which has seldom been used in image smoothing. By varying the parameters, it shows strong flexibility.

  • A robust non-convex non-smooth optimization framework is proposed. When combined with the strong flexibility of the truncated Huber penalty function, our model can achieve various and even contradictive smoothing behaviors. We show that it is able to handle the tasks in the four groups mentioned above. This has seldom been achieved by previous smoothing operators in the literature.

  • An efficient numerical solution to the proposed optimization framework is provided. Its convergence is theoretically guaranteed.

  • Our method is able to outperform specially designed approaches in many tasks, achieving state-of-the-art performance.

2 Related Work

Numerous smoothing operators have been proposed in recent decades. In terms of edge-preserving smoothing, the bilateral filter (BLF) [37] is an early work that has been used in various tasks such as image detail enhancement [10], HDR tone mapping [8], etc. However, it is prone to produce results with gradient reversals and halos [9]. Its alternatives [13, 12] also share a similar problem. The guided filter (GF) [19] can produce results free of gradient reversals, but halos still exist. The WLS smoothing [9] solves a global optimization problem and performs well in handling these artifacts. The ℓ0 norm smoothing [40] is able to eliminate low-amplitude structures while sharpening strong edges, which can be applied to the tasks in the second group. To handle the structure inconsistency problem, Shen et al. [36] proposed to perform mutual-structure joint filtering. They also explored the relation between the guidance image and the target image via optimizing a scale map [35]; however, additional processing was adopted for structure inconsistency handling. Ham et al. [18] proposed to handle the structure inconsistency by combining a static guidance weight with a Welsch's penalty [20] regularized smoothness term, which led to a static/dynamic (SD) filter. Gu et al. [16] presented a weighted analysis representation model for guided depth map enhancement. They also proposed to smooth images by layer decomposition, where different sparse representation models were adopted for different layers [15].

In terms of structure-preserving smoothing, Zhang et al. [45] proposed to smooth structures of different scales with a rolling guidance filter (RGF). Cho et al. [7] modified the original BLF with local patch-based analysis of texture features and obtained a bilateral texture filter (BTF) for image texture removal. Karacan et al. [22] proposed to smooth image textures by making use of region covariances that capture local structure and textural information. Xu et al. [41] adopted the relative total variation (RTV) as a prior to regularize the texture smoothing procedure. Chan and Esedoglu [5] proved that the TV-ℓ1 model [5, 30] can smooth images in a scale-aware manner, which makes it ideal for structure-preserving smoothing such as image texture removal [1, 4].

Most of the approaches mentioned above are limited to a few applications because their inherent smoothing natures are usually fixed. In contrast, the method proposed in this paper has strong flexibility in achieving various smoothing behaviors, which enables a wider range of applications than most of them support. Moreover, our method shows better performance than these methods in several applications for which they were specially designed.

3 Our Approach

3.1 Truncated Huber Penalty Function

We first introduce the truncated Huber penalty function which is defined as:

h_T(x) = { h(x),      |x| < b
         { b - a/2,   otherwise        (1)

where a and b (b > a > 0) are constants. h(x) is the Huber penalty function [21] defined as:

h(x) = { x^2 / (2a),   |x| <= a
       { |x| - a/2,    otherwise        (2)

h(x) and h_T(x) are plotted in Fig. 2(a) with a set to a sufficiently small value. h(x) is an edge-preserving penalty function, but it cannot sharpen edges when adopted to regularize the smoothing procedure. In contrast, h_T(x) can sharpen edges because, due to the truncation, it does not penalize image edges. The Welsch's penalty function [20], which was adopted in the recently proposed SD filter [18], is also plotted in the figure. This penalty function is known to be capable of sharpening edges, which is also because it seldom penalizes strong image edges. The Welsch's penalty function is close to the ℓ2 norm when the input is small, while h_T(x) can be close to the ℓ1 norm when a is set sufficiently small, which indicates that h_T(x) can better preserve weak edges than the Welsch's penalty function.
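The two penalties can be sketched in a few lines of Python. The piecewise forms below (including the truncation plateau b - a/2) follow the reconstruction of Eq. (1) and Eq. (2) given here, so treat the exact constants as an assumption:

```python
def huber(x, a):
    """Huber penalty h(x): quadratic for |x| <= a, linear beyond."""
    return x * x / (2 * a) if abs(x) <= a else abs(x) - a / 2

def truncated_huber(x, a, b):
    """Truncated Huber h_T(x): follows h(x) up to |x| = b, then stays
    flat at h(b) = b - a/2, so edges larger than b get no extra penalty."""
    return huber(x, a) if abs(x) < b else b - a / 2
```

Note that with b >= 1 and intensities in [0, 1], the truncation branch is never taken and h_T reduces to h, which is exactly the "switch" behavior discussed next.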

Figure 2: Plots of (a) different penalty functions and (b) the truncated Huber penalty function with different parameter settings.

With different parameter settings, h_T(x) shows strong flexibility in yielding different penalty behaviors. Assume the input intensity values are within [0, 1]; then the amplitude of any edge falls in (0, 1]. We first set a to a sufficiently small value. Then if we set b >= 1, h_T(x) is actually the same as h(x) because the second condition in Eq. (1) can never be met. Because a is sufficiently small, h(x) is close to the ℓ1 norm in this case, and thus h_T(x) is an edge-preserving penalty function that does not sharpen edges. Conversely, when we set b < 1, the truncation in h_T(x) is activated. This leads h_T(x) to penalize weak edges without penalizing strong edges, and thus the strong edges are sharpened. In short, b acts as a switch deciding whether h_T(x) sharpens edges or not. Similarly, by setting a >= 1 and varying b, h_T(x) can be easily switched between the ℓ2 norm and the truncated ℓ2 norm. Note that the truncated ℓ2 norm is also able to sharpen edges [42]. In contrast, the Welsch's penalty function does not enjoy this kind of flexibility. Different cases of h_T(x) are illustrated in Fig. 2(b).

3.2 Model

Given an input image f and a guidance image g, the smoothed output image u is obtained by minimizing the following objective function:

E(u) = Σ_i { h_T(u_i - f_i) + λ Σ_{j ∈ N(i)} ω_{i,j} h_T(u_i - u_j) }        (3)

where h_T(·) is defined in Eq. (1); u_i and f_i denote the values of the output and input images at pixel i; N(i) is the square patch centered at pixel i; λ is a parameter that controls the overall smoothing strength. To be clear, we adopt {a_d, b_d} and {a_s, b_s} to denote the parameters of h_T(·) in the data term and the smoothness term, respectively. The guidance weight ω_{i,j} is defined as:

ω_{i,j} = 1 / (|g_i - g_j|^σ + ε)        (4)

where σ determines the sensitivity to the edges in g, which can be the input image itself, i.e., g = f; |·| represents the absolute value; ε is a small constant that prevents division by zero.

The adoption of h_T(·) makes our model in Eq. (3) enjoy strong flexibility. As will be shown in Sec. 3.4, with different parameter settings, our model is able to achieve different smoothing behaviors, and it is thus capable of various tasks that require either edge-preserving or structure-preserving smoothing.
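To make the objective concrete, the sketch below evaluates a 1D version of Eq. (3). The vectorized penalty, the adjacent-pixel (two-neighbour) stencil, and the precomputed guidance-weight array are simplifying assumptions, not the paper's exact 2D formulation:

```python
import numpy as np

def h_T(x, a, b):
    """Vectorised truncated Huber penalty (a sketch of Eq. (1))."""
    h = np.where(np.abs(x) <= a, x * x / (2 * a), np.abs(x) - a / 2)
    return np.where(np.abs(x) < b, h, b - a / 2)

def energy(u, f, omega, lam, a_d, b_d, a_s, b_s):
    """Objective of Eq. (3) on a 1D signal: truncated-Huber data term plus a
    guidance-weighted truncated-Huber smoothness term over adjacent pixels."""
    u, f = np.asarray(u, float), np.asarray(f, float)
    data = h_T(u - f, a_d, b_d).sum()
    smooth = (omega * h_T(np.diff(u), a_s, b_s)).sum()
    return data + lam * smooth
```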

3.3 Numerical Solution

Our model in Eq. (3) is not only non-convex but also non-smooth, which arises from the adopted h_T(·). Commonly used approaches [24, 31, 39, 46] for solving non-convex optimization problems are not applicable. To tackle this problem, we first rewrite h_T(x) in an equivalent form. By introducing a binary auxiliary variable v ∈ {0, 1}, we have:

h_T(x) = min_{v ∈ {0,1}} { v h(x) + (1 - v)(b - a/2) }        (5)

where b - a/2 is the truncation constant of Eq. (1). The minimum of the right-hand side of Eq. (5) is obtained under the condition:

v* = { 1,   |x| < b
     { 0,   otherwise        (6)

The detailed proof of Eq. (5) and Eq. (6) is provided in our supplementary file. These two equations also theoretically validate our analysis in Sec. 3.1 and Fig. 2(b): we have |x| <= 1 if the intensity values are in [0, 1]. Then if b >= 1, based on Eq. (5) and Eq. (6), we will always have v* = 1, which means h_T(x) degrades to h(x).
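The equivalence claimed by Eq. (5) and Eq. (6) can be checked numerically. The min-over-binary-v form below is an assumed reconstruction (with truncation cost b - a/2, valid for b > a), not taken verbatim from the supplementary file:

```python
def huber(x, a):
    return x * x / (2 * a) if abs(x) <= a else abs(x) - a / 2

def h_T_direct(x, a, b):
    """Truncated Huber evaluated directly from its piecewise definition."""
    return huber(x, a) if abs(x) < b else b - a / 2

def h_T_via_min(x, a, b):
    """h_T(x) as a minimisation over the binary auxiliary variable v:
    v = 1 keeps the Huber penalty, v = 0 pays the flat truncation cost."""
    return min(v * huber(x, a) + (1 - v) * (b - a / 2) for v in (0, 1))
```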

A new energy function is defined as:

E_1(u, v) = Σ_i { v_i h(u_i - f_i) + (1 - v_i)(b_d - a_d/2)
          + λ Σ_{j ∈ N(i)} ω_{i,j} [ v_{i,j} h(u_i - u_j) + (1 - v_{i,j})(b_s - a_s/2) ] }        (7)

Based on Eq. (5) and Eq. (6), we then have:

E(u) = min_v E_1(u, v)        (8)

Given Eq. (6) as the optimum condition of Eq. (8) with respect to v, optimizing E_1 with respect to u only involves the Huber penalty function h(·). The problem can thus be optimized through the half-quadratic (HQ) optimization technique [14, 31]. More specifically, there exist a variable μ and a function ψ(μ) such that:

h(x) = min_μ { (1/2) μ x^2 + ψ(μ) }        (9)

where the minimum is attained under the condition:

μ* = { 1/a,     |x| <= a
     { 1/|x|,   otherwise        (10)

The detailed proof of Eq. (9) and Eq. (10) is provided in our supplementary file. Then we can further define a new energy function:

E_2(u, v, μ) = Σ_i { v_i [ (1/2) μ_i (u_i - f_i)^2 + ψ(μ_i) ] + (1 - v_i)(b_d - a_d/2)
             + λ Σ_{j ∈ N(i)} ω_{i,j} ( v_{i,j} [ (1/2) μ_{i,j} (u_i - u_j)^2 + ψ(μ_{i,j}) ] + (1 - v_{i,j})(b_s - a_s/2) ) }        (11)

Based on Eq. (9) and Eq. (10), we then have:

E_1(u, v) = min_μ E_2(u, v, μ)        (12)

Given Eq. (10) as the optimum condition of μ for Eq. (12), optimizing E_2 with respect to u only involves the ℓ2 norm penalty, which has a closed-form solution. However, since the optimum conditions in Eq. (6) and Eq. (10) both involve u, the final solution can only be obtained in an iterative manner. Assuming we have obtained u^k, then v^{k+1} and μ^{k+1} can be updated through Eq. (6) and Eq. (10) with u^k. Finally, u^{k+1} is obtained with:

u^{k+1} = argmin_u E_2(u, v^{k+1}, μ^{k+1})        (13)

Eq. (13) has a closed-form solution:

u^{k+1} = (Λ + λ(D - A))^{-1} Λ f        (14)

where A is an affinity matrix with A_{i,j} = ω_{i,j} v^{k+1}_{i,j} μ^{k+1}_{i,j}; D is a diagonal matrix with D_{i,i} = Σ_{j ∈ N(i)} A_{i,j}; Λ is a diagonal matrix with Λ_{i,i} = v^{k+1}_i μ^{k+1}_i; u and f are the vector forms of the smoothed output and the input image, respectively.
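The HQ identity of Eq. (9) and Eq. (10) can be sanity-checked numerically. The dual function ψ(μ) = 1/(2μ) - a/2 restricted to μ ∈ (0, 1/a] used below is one standard choice for the Huber penalty; the paper's supplementary file may use a different but equivalent form, so treat it as an assumption:

```python
import numpy as np

def huber(x, a):
    return x * x / (2 * a) if abs(x) <= a else abs(x) - a / 2

def huber_via_hq(x, a, n_grid=20001):
    """Recover h(x) as min over mu of mu*x^2/2 + psi(mu), with the assumed
    dual psi(mu) = 1/(2*mu) - a/2 restricted to mu in (0, 1/a]."""
    mus = np.linspace(1e-3, 1.0 / a, n_grid)
    return (0.5 * mus * x * x + 1.0 / (2.0 * mus) - a / 2.0).min()

def hq_weight(x, a):
    """Closed-form minimiser of Eq. (10): mu* = 1/max(a, |x|)."""
    return 1.0 / max(a, abs(x))
```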

The above optimization procedure monotonically decreases the value of E(u) in each step, so its convergence is theoretically guaranteed. Given u^k obtained in the kth iteration, together with v^{k+1} and μ^{k+1} updated from it, then for any v and μ, we have:

E(u^k) = E_1(u^k, v^{k+1}) <= E_1(u^k, v)        (15)
E_1(u^k, v^{k+1}) = E_2(u^k, v^{k+1}, μ^{k+1}) <= E_2(u^k, v^{k+1}, μ)        (16)

Given that v^{k+1} has been updated through Eq. (6), Eq. (15) is based on Eq. (8) and Eq. (5). After μ^{k+1} has been updated through Eq. (10), Eq. (16) is based on Eq. (12) and Eq. (9). We now have:

E_1(u^{k+1}, v^{k+1}) <= E_2(u^{k+1}, v^{k+1}, μ^{k+1}) <= E_2(u^k, v^{k+1}, μ^{k+1})        (17)

the first and second inequalities follow from Eq. (16) and Eq. (13), respectively. We finally have:

E(u^{k+1}) <= E_1(u^{k+1}, v^{k+1}) <= E(u^k)        (18)

where the first and second inequalities follow from Eq. (15) and Eq. (17), respectively. Since the value of E(u) is bounded from below, Eq. (18) indicates that the convergence of our iterative scheme is theoretically guaranteed.

The above optimization procedure is iteratively performed N_iter times to get the final smoothed output u*. N_iter can vary for different applications, and some tasks do not need to iterate the above procedure until convergence. These will be detailed in Sec. 3.4, whose settings are able to produce promising results in each application. Our optimization procedure is summarized in Algorithm 1.

0:  Input: image f, guidance image g, iteration number N_iter, parameters λ, {a_d, b_d}, {a_s, b_s}; initialize u^1 = f
1:  for k = 1 to N_iter do
2:     With u^k, update v^{k+1} according to Eq. (6)
3:     With u^k, update μ^{k+1} according to Eq. (10)
4:     With v^{k+1} and μ^{k+1}, solve for u^{k+1} according to Eq. (13) (or Eq. (14))
5:  end for
6:  Output: smoothed image u* = u^{N_iter+1}
Algorithm 1 Image Smoothing via Non-convex Non-smooth Optimization
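Algorithm 1 can be sketched on a 1D signal. Everything below is an illustrative reduction: a two-neighbour stencil, a hypothetical guidance weight ω_{i,j} = 1/(|g_i - g_j| + ε), and a dense linear solve standing in for an efficient sparse solver:

```python
import numpy as np

def hq_weight(x, a, b):
    """Combined half-quadratic weight for one truncated Huber term:
    the truncation switch v of Eq. (6) times the Huber weight mu of Eq. (10)."""
    v = (np.abs(x) < b).astype(float)      # v = 0 once |x| passes the truncation point b
    mu = 1.0 / np.maximum(a, np.abs(x))    # mu = 1/a in the quadratic region, else 1/|x|
    return v * mu

def smooth_1d(f, g=None, lam=5.0, a_d=1.0, b_d=1.0, a_s=1e-4, b_s=0.5,
              n_iter=10, eps=1e-2):
    """Sketch of Algorithm 1 on a 1D signal: refresh the HQ weights from the
    current estimate, then solve the resulting weighted least-squares system."""
    f = np.asarray(f, float)
    g = f if g is None else np.asarray(g, float)
    n = len(f)
    omega = 1.0 / (np.abs(np.diff(g)) + eps)   # hypothetical guidance weight
    u = f.copy()
    idx = np.arange(n - 1)
    for _ in range(n_iter):
        wd = hq_weight(u - f, a_d, b_d)                      # data-term weights
        ws = lam * omega * hq_weight(np.diff(u), a_s, b_s)   # smoothness weights
        M = np.diag(wd)                                      # system matrix of the WLS solve
        M[idx, idx] += ws
        M[idx + 1, idx + 1] += ws
        M[idx, idx + 1] -= ws
        M[idx + 1, idx] -= ws
        u = np.linalg.solve(M, wd * f)
    return u
```

A practical implementation would use a sparse solver (e.g. scipy.sparse.linalg.spsolve) and the full 2D neighbourhood.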

3.4 Property Analysis

With different parameter settings, the strong flexibility of h_T(x) makes our model able to achieve various smoothing behaviors. First, we show that some classical approaches can be viewed as special cases of our model. For example, by setting {a_d >= 1, b_d >= 1, a_s -> 0, b_s >= 1}, our model is an approximation of the TV model [34], which is a representative edge-preserving smoothing operator. If we instead set a_s >= 1 with the other parameters the same as above, then the first iteration of Algorithm 1 is the WLS smoothing [9], which performs well in handling gradient reversals and halos in image detail enhancement and HDR tone mapping. With parameters {a_d -> 0, b_d >= 1, a_s -> 0, b_s >= 1}, our model can yield smoothing behavior very close to that of the TV-ℓ1 model [1, 4], which is classical for structure-preserving smoothing.

Figure 3: 1D signal with structures of different scales and amplitudes. Smoothing results of (a) TV-ℓ1 smoothing [4], (c) WLS [9] and (e) the SD filter [18]; our results are shown in (b), (d) and (f).
Figure 4: Image detail enhancement results of different approaches. (a) Input image. Result of (b) WLS [9] and (c) our method. The upper parts of each close-up in (b) and (c) correspond to the patches in the smoothed image.
Figure 5: Clip-art compression artifacts removal results of different approaches. (a) Input image. (b) Our result. Close-ups of (c) input image and results of (d) SD filter [18], (e) our method with the structure-preserving parameter setting, (f) our method with the edge-preserving and structure-preserving parameter setting.
Figure 6: Texture smoothing results of different approaches. (a) Input image. Results of (b) TV-ℓ1 smoothing [4] and (c) our method.

For different kinds of applications, our model can produce better results than the special cases mentioned above. For convenience, we first start with the tasks in the fourth group, which require structure-preserving smoothing. For these tasks, the parameters are set as {a_d -> 0, b_d >= 1, a_s -> 0, b_s >= 1}. This parameter setting has two advantages: first, it enables our model to have a structure-preserving property similar to that of the TV-ℓ1 model; second, the guidance weight ω_{i,j} can make our model obtain sharper edges in the results than the TV-ℓ1 model does. We illustrate this with 1D smoothing results in Fig. 3(a) and (b). Fig. 6(b) and (c) further show a comparison of image texture removal results. As shown in the figure, both the TV-ℓ1 model and our model can properly remove the small textures; however, edges in our result are much sharper than those in the result of the TV-ℓ1 model. The typical value of λ depends on the texture size, and a larger λ leads to larger structures being removed; σ is usually smaller than 1. The iteration number N_iter is set to a small value for this task.

When dealing with image detail enhancement and HDR tone mapping in the first group, one way is to set the parameters so that our model performs WLS smoothing. Instead, we can further make use of the structure-preserving property of our model to produce better results, by setting {a_d -> 0, b_d >= 1, a_s >= 1, b_s >= 1} with a large λ. This kind of parameter setting is based on the following observation in our experiments: when we adopt a_d -> 0 and set λ to a large value, the amplitudes of different structures will decrease at different rates, i.e., the amplitudes of small structures can have a larger decrease than those of large ones, as illustrated in Fig. 3(d). At the same time, edges are neither blurred nor sharpened. This kind of smoothing behavior is desirable for image detail enhancement and HDR tone mapping. As a comparison, Fig. 3(c) shows the smoothing result of the WLS smoothing. As can be observed from the figures, our method can better preserve the edges (see the bottom of the 1D signals in Fig. 3(c) and (d)). Fig. 4(b) and (c) further show a comparison of image detail enhancement results. We fix the other parameters and vary λ to control the smoothing strength. λ for the tasks in the first group is usually much larger than that for the ones in the fourth group; for example, the result in Fig. 4(c) is generated with a large λ.
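The base/detail decomposition used for detail enhancement can be sketched generically. The `smoother` callable stands in for any edge-preserving operator (e.g. Algorithm 1 of the paper), and the `boost` factor is an illustrative assumption:

```python
import numpy as np

def enhance_details(f, smoother, boost=3.0):
    """Base/detail decomposition for detail enhancement:
    the smoothed image is the base layer, the residual is the detail layer,
    and the detail layer is amplified before recombination."""
    f = np.asarray(f, float)
    base = smoother(f)        # edge-preserving smoothing (base layer)
    detail = f - base         # residual detail layer
    return base + boost * detail
```

With boost = 1 the decomposition is exact and the input is reproduced unchanged.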

To sharpen edges as required by the tasks in the second and the third groups, we can set b_s < 1 in the smoothness term; in addition, we further set b_d < 1 in the data term. The truncation b_d < 1 in the data term helps our model to be robust against outliers in the input image, for example, the noise in the no-flash image and the low-quality depth map. The truncation b_s < 1 in the smoothness term enables our model to be an edge-preserving one. By further setting a_d -> 0, our model can also enjoy the structure-preserving property. With both edge-preserving and structure-preserving smoothing natures, our model has the ability to preserve large structures with weak edges and small structures with strong edges at the same time, which is challenging but of practical importance. Fig. 5(a) illustrates this kind of case with an example of clip-art compression artifacts removal: both the thin black circle around the "wheel" and the gray part in the center of the "wheel" should be preserved. The challenge lies in two facts. On the one hand, if we perform edge-preserving smoothing, the gray part will be removed because the corresponding edge is weak. Fig. 5(d) shows the result of the SD filter [18]. The SD filter can properly preserve the thin black circle and sharpen the edges thanks to the adopted Welsch's penalty function; however, it fails to preserve the weak edge between the black part and the gray part. On the other hand, if we adopt structure-preserving smoothing, then the thin black circle will be smoothed out due to its small structure size. Fig. 5(e) shows the corresponding result of our method with the structure-preserving parameter setting described above. In contrast, our method with the edge-preserving and structure-preserving parameter setting can preserve both of these two parts and sharpen the edges, as shown in Fig. 5(f). Fig. 3(e) and (f) also show a comparison between the SD filter and our method with 1D smoothing results. We fix b_s < 1 for the tasks in both the second and the third groups, and empirically set the remaining parameters depending on the applied task and the input noise level.

The structure inconsistency issue in the third group can also be easily handled by our model. Note that μ in Eq. (11) is computed with the smoothed image u in each iteration, as formulated in Eq. (10); it thus reflects the inherent nature of the smoothed image. The guidance weight ω can provide additional structural information from the guidance image g. This means that μ and ω can complement each other. In fact, the equivalent guidance weight of Eq. (11) in each iteration is ω_{i,j} v_{i,j} μ_{i,j}, which reflects the properties of both the smoothed image and the guidance image. In this way, our model can properly handle the structure inconsistency problem and avoid blurred edges and texture copy artifacts. Similar ideas were also adopted in [18, 28].

4 Applications and Experimental Results

Our method is applied to various tasks in the first to the fourth groups to validate its effectiveness. Comparisons with the state-of-the-art approaches in each application are also presented. Due to limited space, we only show experimental results of four applications.

Our experiments are performed on a PC with an Intel Core i5 3.4GHz CPU (one thread used) and 8GB memory. The running time of Algorithm 1 in MATLAB depends on the image size and the iteration number N_iter. Note that, as described in Sec. 3.4, N_iter is smaller than 3 in most cases except for guided depth map upsampling.

HDR tone mapping is a representative task in the first group. It requires decomposing the input image into a base layer and a detail layer through edge-preserving smoothing. The widely adopted tone mapping framework was proposed by Durand and Dorsey [8], where the decomposition is applied to the log-luminance channel of the input HDR image. The challenge of this task is that if the edges are sharpened by the smoothing procedure, gradient reversals will result, while halos will occur if the edges are blurred. Fig. 7 shows the tone mapping results using different edge-preserving smoothing operators. The result of GF [19] contains clear halos around the picture frames and the light fixture, as shown in Fig. 7(a). This is due to its local smoothing nature, where strong smoothing can also blur salient edges [9, 19]. The ℓ0 norm smoothing [40] can properly eliminate halos, but there are gradient reversals in its result, as illustrated in Fig. 7(b). This is because ℓ0 smoothing is prone to sharpening salient edges. The WLS [9] and SG-WLS [27] smoothing perform well in handling gradient reversals and halos in most cases. However, there are slight halos in their results, as illustrated in the left close-up in Fig. 7(c) and (d). These artifacts are properly eliminated in our results.
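The Durand-Dorsey pipeline described above can be sketched on a luminance array. The moving-average stand-in smoother and the target-contrast compression factor are illustrative assumptions; the paper would plug in the proposed edge-preserving smoother instead:

```python
import numpy as np

def tone_map(lum, smoother, target_contrast=5.0, eps=1e-6):
    """Durand-Dorsey style HDR tone mapping on the luminance channel:
    compress the smoothed base layer in log space while keeping the
    detail layer untouched, then renormalise."""
    log_l = np.log10(np.asarray(lum, float) + eps)
    base = smoother(log_l)                 # base layer in log-luminance
    detail = log_l - base                  # detail layer, left uncompressed
    scale = np.log10(target_contrast) / (base.max() - base.min())
    out = 10.0 ** (base * scale + detail)
    return out / out.max()
```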

Clip-art compression artifacts removal. Clip-art images are piecewise constant with sharp edges. When they are compressed in JPEG format at low quality, there will be edge-related artifacts, and the edges are usually blurred, as shown in Fig. 8(a). Therefore, when removing the compression artifacts, the edges should also be sharpened in the restored image. We thus classify this task into the second group. The approach proposed by Wang et al. [38] can hardly handle heavy compression artifacts; their result is shown in Fig. 8(b). The ℓ0 norm smoothing [40] can eliminate most compression artifacts and properly sharpen salient edges, but it fails to preserve weak edges, as shown in Fig. 8(c). The region fusion approach [29] is able to produce results with sharpened edges; however, it also enhances the blocky artifacts along strong edges, as highlighted in Fig. 8(d). Our result is illustrated in Fig. 8(e) with edges sharpened and compression artifacts removed.

Figure 7: HDR tone mapping results of different approaches. Result of (a) GF [19], (b) ℓ0 norm smoothing [40], (c) WLS [9], (d) SG-WLS [27] and (e) our method.
Figure 8: Clip-art compression artifacts removal results of different methods. (a) Input compressed image. Result of (b) the approach proposed by Wang et al. [38], (c) ℓ0 norm smoothing [40], (d) the region fusion approach [29] and (e) our method.
Figure 9: Guided depth map upsampling results of simulated ToF data. (a) Guidance color image. (b) Ground-truth depth map. Result of (c) the approach proposed by Gu et al. [16], (d) SGF [44], (e) SD filter [18] and (f) our method.
Figure 10: Guided depth upsampling results of real ToF data. (a) Guidance intensity image. (b) Ground-truth depth map. Result of (c) the approach proposed by Gu et al. [16], (d) TGV [11], (e) SD filter [18] and (f) our method.
Art Book Dolls Laundry Moebius Reindeer (four columns per dataset: 2x, 4x, 8x and 16x upsampling)
TGV[11] 0.8 1.21 2.01 4.59 0.61 0.88 1.21 2.19 0.66 0.95 1.38 2.88 0.61 0.87 1.36 3.06 0.57 0.77 1.23 2.74 0.61 0.85 1.3 3.41
AR[43] 1.17 1.7 2.93 5.32 0.98 1.22 1.74 2.89 0.97 1.21 1.71 2.74 1 1.31 1.97 3.43 0.95 1.2 1.79 2.82 1.07 1.3 2.03 3.34
SG-WLS[27] 1.26 1.9 3.07 - 0.82 1.12 1.73 - 0.87 1.11 1.81 - 0.86 1.17 2 - 0.82 1.08 1.79 - 0.9 1.32 2.01 -
FGI[26] 0.9 1.37 2.46 4.89 0.66 0.85 1.23 1.96 0.74 0.95 1.41 2.13 0.71 0.99 1.59 2.67 0.67 0.82 1.2 1.87 0.75 0.94 1.55 2.73
SGF[44] 1.42 1.85 3.06 5.55 0.84 1.11 1.76 3.03 0.87 1.2 1.88 3.26 0.74 1.1 1.96 3.63 0.81 1.13 1.84 3.16 0.93 1.25 2.03 3.67
SD Filter[18] 1.16 1.64 2.88 5.52 0.86 1.1 1.57 2.68 1.04 1.27 1.73 2.76 0.96 1.25 1.94 3.54 0.93 1.14 1.68 2.75 1.05 1.31 1.99 3.43
FBS[3] 1.93 2.39 3.29 5.05 1.42 1.55 1.76 2.48 1.33 1.45 1.69 2.26 1.32 1.49 1.77 2.67 1.16 1.29 1.61 2.44 1.63 1.76 2.01 2.69
muGIF[17] 1.00 1.26 2.00 3.46 0.73 0.89 1.35 2.15 0.85 1.04 1.50 2.45 0.64 0.87 1.36 2.57 0.67 0.85 1.35 2.25 0.78 0.94 1.39 2.52
Park et al.[32] 1.66 2.47 3.44 5.55 1.19 1.47 2.06 3.1 1.19 1.56 2.15 3.04 1.34 1.73 2.41 3.85 1.2 1.5 2.13 2.95 1.26 1.65 2.46 3.66
Shen et al.[36] 1.79 2.21 3.2 5.04 1.34 1.69 2.25 3.13 1.37 1.58 2.05 2.85 1.49 1.74 2.34 3.5 1.34 1.56 2.09 2.99 1.29 1.55 2.19 3.33
Gu et al.[16] 0.61 1.46 2.98 5.09 0.52 0.95 1.87 2.98 0.63 1.02 1.89 2.92 0.58 1.14 2.21 3.58 0.53 0.96 1.89 2.99 0.52 1.07 2.17 3.59
Li et al.[25] - 3.77 4.49 6.29 - 3.21 3.28 3.79 - 3.19 3.28 3.79 - 3.34 3.61 4.45 - 3.23 3.35 3.92 - 3.39 3.65 4.54
Ours 0.69 1.07 1.65 2.96 0.55 0.81 1.22 1.78 0.62 0.9 1.27 1.84 0.61 0.89 1.28 2.12 0.51 0.75 1.12 1.71 0.56 0.87 1.27 2.08
Table 1: Quantitative comparison on the noisy simulated ToF data. Results are evaluated in MAE. The best results are in bold and in red color. The second best results are underlined and in cyan color.
Bicubic GF[19] SD Filter[18] SG-WLS[27] Shen et al.[36] Park et al.[32] TGV[11] AR[43] Gu et al.[16] SGF[44] FGI[26] FBS[3] Li et al.[25] Ours
Books 16.23mm 15.55mm 13.47mm 14.71mm 15.47mm 14.31mm 12.8mm 14.37mm 13.87mm 13.57mm 14.21mm 15.93mm 14.33mm 12.49mm
Devil 17.78mm 16.1mm 15.99mm 16.24mm 16.18mm 15.36mm 14.97mm 15.41mm 15.36mm 15.74mm 16.43mm 17.21mm 15.09mm 14.51mm
Shark 16.66mm 17.1mm 16.18mm 16.51mm 17.33mm 15.88mm 15.53mm 16.27mm 15.88mm 16.21mm 16.37mm 16.33mm 15.82mm 15.02mm
Table 2: Quantitative comparison on real ToF dataset. The errors are calculated as MAE to the measured ground-truth in mm. The best results are in bold and in red color. The second best results are underlined and in cyan color.
Figure 11: Image texture removal results of different methods. (a) Input image. Result of (b) JCAS [15], (c) RTV [41], (d) FCN based approach [6], (e) muGIF [17] and (f) our method.

Guided depth map upsampling belongs to the guided image filtering in the third group. Depth maps captured by modern depth cameras (e.g., a ToF depth camera) are usually of low resolution and contain heavy noise. To boost the resolution and quality, one way is to upsample the depth map with the guidance of a high-resolution RGB image that captures the same scene. The RGB image is usually of high quality and can provide additional structural information to restore and sharpen the depth edges. The challenge of this task is the structure inconsistency between the depth map and the RGB guidance image, which can cause blurred depth edges and texture copy artifacts in the upsampled depth map. We first test our method on the simulated dataset provided in [43]. The simulated dataset contains six depth maps and four upsampling factors for each depth map, as listed in Tab. 1. Fig. 9 shows the visual comparison between our result and the results of the recent state-of-the-art approaches. Our method shows better performance in preserving sharp depth edges and avoiding texture copy artifacts. Tab. 1 also shows the quantitative evaluation on the results of different methods. Following the measurement used in [17, 26, 27, 43], the evaluation is measured in terms of mean absolute error (MAE). We also show RMSE measurement results in our supplemental material. As Tab. 1 shows, our method can achieve the best or the second best performance among all the compared approaches.
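The MAE metric used for Tab. 1 is straightforward; a minimal sketch:

```python
import numpy as np

def mae(pred, gt):
    """Mean absolute error between an estimated depth map and ground truth."""
    pred, gt = np.asarray(pred, float), np.asarray(gt, float)
    return np.abs(pred - gt).mean()
```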

We further validate our method on the real data introduced by Ferstl et al. [11]. The real dataset contains three low-resolution depth maps captured by a ToF depth camera and the corresponding highly accurate ground-truth depth maps captured with structured light. The visual comparison in Fig. 10 and the quantitative comparison in Tab. 2 show that our method can outperform the compared methods and achieve state-of-the-art performance.

Image texture removal belongs to the tasks in the fourth group. It aims at extracting salient meaningful structures while removing small complex texture patterns. Many meaningful structures can be formed by or appear over textured surfaces in natural images. Extracting these structures is challenging but of great practical importance, and can benefit a number of applications such as image vectorization, edge simplification and detection, content-aware image resizing [2], etc. The challenge of this task is that it requires structure-preserving smoothing rather than the edge-preserving smoothing in the above tasks. Fig. 11(a) shows a classical example of image texture removal: the small textures with strong edges should be smoothed out while the salient structures with weak edges should be preserved. Fig. 11(b)-(e) show the results of the recent state-of-the-art approaches. The joint convolutional analysis and synthesis sparse (JCAS) model [15] can well remove the textures, but the resulting edges are also blurred. The RTV method [41], muGIF [17] and the FCN based approach [6] cannot completely remove the textures; in addition, the weak edges of the salient structures have also been smoothed out in their results. Our method can both preserve the weak edges of the salient structures and remove the small textures.

5 Conclusion

We propose a non-convex, non-smooth optimization framework for edge-preserving and structure-preserving image smoothing. We first introduce the truncated Huber penalty function, which shows strong flexibility. A robust framework is then presented. Combined with the flexibility of the truncated Huber penalty function, our framework can achieve different and even contradictive smoothing behaviors using different parameter settings. This differs from most previous approaches, whose inherent smoothing natures are usually fixed. We further propose an efficient numerical solution to our model and theoretically prove its convergence. Comprehensive experimental results in a number of applications demonstrate the effectiveness of our method.

References

  • [1] J.-F. Aujol, G. Gilboa, T. Chan, and S. Osher. Structure-texture image decomposition: modeling, algorithms, and parameter selection. International Journal of Computer Vision (IJCV), 67(1):111–136, 2006.
  • [2] S. Avidan and A. Shamir. Seam carving for content-aware image resizing. In ACM Transactions on Graphics (TOG), volume 26, page 10. ACM, 2007.
  • [3] J. T. Barron and B. Poole. The fast bilateral solver. In European Conference on Computer Vision (ECCV), pages 617–632. Springer, 2016.
  • [4] A. Buades, T. M. Le, J.-M. Morel, L. A. Vese, et al. Fast cartoon+ texture image filters. IEEE Transactions on Image Processing (TIP), 19(8):1978–1986, 2010.
  • [5] T. F. Chan and S. Esedoglu. Aspects of total variation regularized l1 function approximation. SIAM Journal on Applied Mathematics, 65(5):1817–1837, 2005.
  • [6] Q. Chen, J. Xu, and V. Koltun. Fast image processing with fully-convolutional networks. In IEEE International Conference on Computer Vision (ICCV), volume 9, pages 2516–2525, 2017.
  • [7] H. Cho, H. Lee, H. Kang, and S. Lee. Bilateral texture filtering. ACM Transactions on Graphics (TOG), 33(4):128, 2014.
  • [8] F. Durand and J. Dorsey. Fast bilateral filtering for the display of high-dynamic-range images. In ACM Transactions on Graphics (TOG), volume 21, pages 257–266. ACM, 2002.
  • [9] Z. Farbman, R. Fattal, D. Lischinski, and R. Szeliski. Edge-preserving decompositions for multi-scale tone and detail manipulation. In ACM Transactions on Graphics (TOG), volume 27, page 67. ACM, 2008.
  • [10] R. Fattal, M. Agrawala, and S. Rusinkiewicz. Multiscale shape and detail enhancement from multi-light image collections. In ACM Transactions on Graphics (TOG), volume 26, page 51. ACM, 2007.
  • [11] D. Ferstl, C. Reinbacher, R. Ranftl, M. Rüther, and H. Bischof. Image guided depth upsampling using anisotropic total generalized variation. In IEEE International Conference on Computer Vision (ICCV), pages 993–1000, 2013.
  • [12] E. S. Gastal and M. M. Oliveira. Domain transform for edge-aware image and video processing. In ACM Transactions on Graphics (TOG), volume 30, page 69. ACM, 2011.
  • [13] E. S. Gastal and M. M. Oliveira. Adaptive manifolds for real-time high-dimensional filtering. ACM Transactions on Graphics (TOG), 31(4):33, 2012.
  • [14] D. Geman and C. Yang. Nonlinear image recovery with half-quadratic regularization. IEEE Transactions on Image Processing, 4(7):932–946, 1995.
  • [15] S. Gu, D. Meng, W. Zuo, and L. Zhang. Joint convolutional analysis and synthesis sparse representation for single image layer separation. In IEEE International Conference on Computer Vision (ICCV), pages 1717–1725. IEEE, 2017.
  • [16] S. Gu, W. Zuo, S. Guo, Y. Chen, C. Chen, and L. Zhang. Learning dynamic guidance for depth image enhancement. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
  • [17] X. Guo, Y. Li, J. Ma, and H. Ling. Mutually guided image filtering. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2018.
  • [18] B. Ham, M. Cho, and J. Ponce. Robust image filtering using joint static and dynamic guidance. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 4823–4831, 2015.
  • [19] K. He, J. Sun, and X. Tang. Guided image filtering. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 35(6):1397–1409, 2013.
  • [20] P. W. Holland and R. E. Welsch. Robust regression using iteratively reweighted least-squares. Communications in Statistics-theory and Methods, 6(9):813–827, 1977.
  • [21] P. J. Huber et al. Robust estimation of a location parameter. The Annals of Mathematical Statistics, 35(1):73–101, 1964.
  • [22] L. Karacan, E. Erdem, and A. Erdem. Structure-preserving image smoothing via region covariances. ACM Transactions on Graphics (TOG), 32(6):176, 2013.
  • [23] J. Kopf, M. F. Cohen, D. Lischinski, and M. Uyttendaele. Joint bilateral upsampling. In ACM Transactions on Graphics (ToG), volume 26, page 96. ACM, 2007.
  • [24] G. R. Lanckriet and B. K. Sriperumbudur. On the convergence of the concave-convex procedure. In Advances in Neural Information Processing Systems, (NIPS), pages 1759–1767, 2009.
  • [25] Y. Li, J.-B. Huang, N. Ahuja, and M.-H. Yang. Deep joint image filtering. In European Conference on Computer Vision (ECCV), pages 154–169. Springer, 2016.
  • [26] Y. Li, D. Min, M. N. Do, and J. Lu. Fast guided global interpolation for depth and motion. In European Conference on Computer Vision (ECCV), pages 717–733. Springer, 2016.
  • [27] W. Liu, X. Chen, C. Shen, Z. Liu, and J. Yang. Semi-global weighted least squares in image filtering. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), pages 5861–5869, 2017.
  • [28] W. Liu, X. Chen, J. Yang, and Q. Wu. Robust color guided depth map restoration. IEEE Transactions on Image Processing (TIP), 26(1):315–327, 2017.
  • [29] R. M. Nguyen and M. S. Brown. Fast and effective l0 gradient minimization by region fusion. In IEEE International Conference on Computer Vision (ICCV), pages 208–216, 2015.
  • [30] M. Nikolova. A variational approach to remove outliers and impulse noise. Journal of Mathematical Imaging and Vision, 20(1-2):99–120, 2004.
  • [31] M. Nikolova and M. K. Ng. Analysis of half-quadratic minimization methods for signal and image recovery. SIAM Journal on Scientific computing, 27(3):937–966, 2005.
  • [32] J. Park, H. Kim, Y.-W. Tai, M. S. Brown, and I. Kweon. High quality depth map upsampling for 3d-tof cameras. In IEEE International Conference on Computer Vision (ICCV), pages 1623–1630. IEEE, 2011.
  • [33] G. Petschnigg, R. Szeliski, M. Agrawala, M. Cohen, H. Hoppe, and K. Toyama. Digital photography with flash and no-flash image pairs. ACM Transactions on Graphics (TOG), 23(3):664–672, 2004.
  • [34] L. I. Rudin, S. Osher, and E. Fatemi. Nonlinear total variation based noise removal algorithms. Physica D: nonlinear phenomena, 60(1-4):259–268, 1992.
  • [35] X. Shen, Q. Yan, L. Xu, J. Jia, et al. Multispectral joint image restoration via optimizing a scale map. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), (1):1–1, 2015.
  • [36] X. Shen, C. Zhou, L. Xu, and J. Jia. Mutual-structure for joint filtering. In IEEE International Conference on Computer Vision (ICCV), pages 3406–3414, 2015.
  • [37] C. Tomasi and R. Manduchi. Bilateral filtering for gray and color images. In IEEE International Conference on Computer Vision (ICCV), pages 839–846. IEEE, 1998.
  • [38] G. Wang, T.-T. Wong, and P.-A. Heng. Deringing cartoons by image analogies. ACM Transactions on Graphics (TOG), 25(4):1360–1379, 2006.
  • [39] Y. Wang, J. Yang, W. Yin, and Y. Zhang. A new alternating minimization algorithm for total variation image reconstruction. SIAM Journal on Imaging Sciences, 1(3):248–272, 2008.
  • [40] L. Xu, C. Lu, Y. Xu, and J. Jia. Image smoothing via l0 gradient minimization. In ACM Transactions on Graphics (TOG), volume 30, page 174. ACM, 2011.
  • [41] L. Xu, Q. Yan, Y. Xia, and J. Jia. Structure extraction from texture via relative total variation. ACM Transactions on Graphics (TOG), 31(6):139, 2012.
  • [42] L. Xu, S. Zheng, and J. Jia. Unnatural l0 sparse representation for natural image deblurring. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 1107–1114, 2013.
  • [43] J. Yang, X. Ye, K. Li, C. Hou, and Y. Wang. Color-guided depth recovery from RGB-D data using an adaptive autoregressive model. IEEE Transactions on Image Processing (TIP), 23(8):3443–3458, 2014.
  • [44] F. Zhang, L. Dai, S. Xiang, and X. Zhang. Segment graph based image filtering: fast structure-preserving smoothing. In IEEE International Conference on Computer Vision (ICCV), pages 361–369, 2015.
  • [45] Q. Zhang, X. Shen, L. Xu, and J. Jia. Rolling guidance filter. In European Conference on Computer Vision (ECCV), pages 815–830. Springer, 2014.
  • [46] Z. Zhang, J. T. Kwok, and D.-Y. Yeung. Surrogate maximization/minimization algorithms for AdaBoost and the logistic regression model. In Proceedings of the International Conference on Machine Learning (ICML), page 117. ACM, 2004.