1 Introduction
Image restoration is the process of reconstructing a clean image from a degraded observation. The observed data is assumed to be related to the ideal image through a forward imaging model that accounts for noise, blurring, and sampling. However, modeling the observed data alone is insufficient for effective restoration, and thus an a priori constraint on the solution is commonly used. To this end, image restoration is usually formulated as an energy minimization problem with an explicit regularization function (or regularizer). Recent work on joint restoration leverages a guidance signal, captured from different devices, as an additional cue to regularize the restoration process. These approaches have been successfully applied to various applications including joint upsampling [11], cross-field noise reduction [32], dehazing [31], and intrinsic image decomposition [8].
Regularization-based image restoration involves the minimization of non-convex and non-smooth energy functionals to yield high-quality restored results. Solving such functionals typically requires a large number of iterations, and thus an efficient optimization is preferable, especially in applications where runtime is crucial. One of the most popular optimization methods is the alternating minimization (AM) algorithm [34], which introduces auxiliary variables. The energy functional is decomposed into a series of subproblems that are relatively simple to optimize, and the minimum with respect to each of the variables is then computed. For image restoration, the AM algorithm has been widely adopted with various regularization functions, e.g., total variation [34], the ℓ0 norm [36], and the ℓp norm (hyper-Laplacian) [16]. It is worth noting that these functions are all handcrafted models. The hyper-Laplacian of image gradients [16] reflects the statistical property of natural images relatively well, but the restoration quality of gradient-based regularization methods using handcrafted models is far from that of state-of-the-art approaches [30, 9]. In general, it is nontrivial to design an optimal regularization function for a specific image restoration problem.
Over the past few years, several attempts have been made to overcome the limitation of handcrafted regularizers by learning the image restoration model from large-scale training data [30, 9, 39]. In this work, we propose a novel method for image restoration that effectively uses a data-driven approach in the energy minimization framework, called deeply aggregated alternating minimization (DeepAM). Contrary to existing data-driven approaches that directly produce the restoration results from convolutional neural networks (CNNs), we design the CNNs to implicitly learn the regularizer of the AM algorithm. Since the CNNs are fully integrated into the AM procedure, the whole network can be learned simultaneously in an end-to-end manner. We show that our simple model learned from the deep aggregation achieves better results than the recent data-driven approaches [30, 9, 17] as well as the state-of-the-art nonlocal-based methods [10, 12].
Our main contributions can be summarized as follows:

We design the CNNs to learn the regularizer of the AM algorithm, and train the whole network in an end-to-end manner.

We introduce the aggregated (or multivariate) mapping in the AM algorithm, which leads to a better restoration model than the conventional pointwise proximal mapping.

We extend the proposed method to joint restoration tasks. It has broad applicability to a variety of restoration problems, including image denoising, RGB/NIR restoration, and depth super-resolution.
2 Related Work
Regularization-based image restoration
Here, we provide a brief review of regularization-based image restoration. The total variation (TV) [34] has been widely used in several restoration problems thanks to its convexity and edge-preserving capability. Other regularization functions, such as total generalized variation (TGV) [4] and the hyper-Laplacian norm [16], have also been employed to penalize an image that does not exhibit desired properties. Beyond these handcrafted models, several approaches have attempted to learn the regularization model from training data [30, 9]. Schmidt et al. [30] proposed a cascade of shrinkage fields (CSF) using learned Gaussian RBF kernels. In [9], a nonlinear diffusion-reaction process was modeled using parameterized linear filters and regularization functions. Joint restoration methods using a guidance image captured under different configurations have also been studied [3, 11, 31, 17]. In [3], an RGB image captured in dim light was restored using flash and non-flash pairs of the same scene. In [11, 15], RGB images were used to assist the regularization process of a low-resolution depth map. Shen et al. [31] proposed to use dark-flashed NIR images for the restoration of noisy RGB images. Li et al. [17] used the CNNs to selectively transfer salient structures that are consistent in both guidance and target images.
Use of energy minimization models in deep network
CNNs lack a mechanism to impose regularity constraints on adjacent similar pixels, often resulting in poor boundary localization and spurious regions. To deal with these issues, the integration of energy minimization models into CNNs has received great attention [24, 38, 25, 26]. Ranftl et al. [24] defined the unary and pairwise terms of Markov random fields (MRFs) using the outputs of CNNs, and trained the network parameters using bilevel optimization. Similarly, the mean field approximation for fully connected conditional random fields (CRFs) was modeled as a recurrent neural network (RNN) [38]. A nonlocal Huber regularization was combined with CNNs for high-quality depth restoration [25]. Riegler et al. [26] integrated anisotropic TGV on top of deep networks. They also formulated a bilevel optimization problem and trained the network in an end-to-end manner by unrolling the TGV minimization. Note that the bilevel optimization problem is solvable only when the energy minimization model is convex and twice differentiable [24]. The aforementioned methods try to integrate handcrafted regularization models on top of the CNNs. In contrast, we design the CNNs to parameterize the regularization process in the AM algorithm.

3 Background and Motivation
Regularization-based image reconstruction is a powerful framework for solving a variety of inverse problems in computational imaging. The method typically involves formulating a data term for the degraded observation and a regularization term for the image to be reconstructed. An output image is then computed by minimizing an objective function that balances these two terms. Given an observed image f and a balancing parameter λ, we solve the corresponding optimization problem (for super-resolution, f is treated as the bilinearly upsampled image from the low-resolution input):
\min_u \; \tfrac{\lambda}{2}\|u - f\|_2^2 + \Phi(\nabla u) \qquad (1)
\nabla u = (\nabla_x u, \nabla_y u) denotes the image gradient, where \nabla_x (or \nabla_y) is a discrete implementation of the horizontal (or vertical) derivative of the image. \Phi is a regularization function that enforces the output image to meet desired statistical properties. The unconstrained optimization problem of (1) can be solved using numerous standard algorithms. In this paper, we focus on the additive form of the alternating minimization (AM) method [34], which has proven effective for a variety of problems in the form of (1).
3.1 Alternating Minimization
The idea of the AM method is to decouple the data and regularization terms by introducing a new variable v, and to reformulate (1) as the following penalized optimization problem:
\min_{u,v} \; \tfrac{\lambda}{2}\|u - f\|_2^2 + \Phi(v) + \tfrac{\beta}{2}\|v - \nabla u\|_2^2 \qquad (3)
where \beta is the penalty parameter. The AM algorithm consists of repeatedly performing the following steps until convergence:

v^{k+1} = \arg\min_v \; \Phi(v) + \tfrac{\beta}{2}\|v - \nabla u^k\|_2^2, \qquad u^{k+1} = \arg\min_u \; \tfrac{\lambda}{2}\|u - f\|_2^2 + \tfrac{\beta}{2}\|v^{k+1} - \nabla u\|_2^2 \qquad (4)
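As a concrete illustration of the two alternating steps, here is a minimal 1D sketch with an ℓ1 regularizer, where the v-step reduces to soft thresholding and the u-step is a quadratic problem solved here by plain gradient descent. All parameter values, the toy signal, and the function names are illustrative choices, not taken from the paper.

```python
import numpy as np

def soft_threshold(x, t):
    # proximal mapping of the l1 norm (the v-step for Phi = |.|)
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def am_denoise(f, lam=1.0, beta=10.0, outer=50, inner=50, step=0.02):
    n = f.size
    D = np.eye(n, k=1) - np.eye(n)      # forward-difference gradient operator
    D[-1] = 0.0                          # Neumann boundary condition
    u = f.copy()
    for _ in range(outer):
        v = soft_threshold(D @ u, 1.0 / beta)            # v-step
        for _ in range(inner):                            # u-step (quadratic)
            r = lam * (u - f) + beta * D.T @ (D @ u - v)  # gradient of (4)
            u = u - step * r
    return u

f = np.array([0.0, 0.0, 0.0, 5.0, 5.0, 5.0]) + 0.1       # step edge
u = am_denoise(f)
```

The restored signal stays close to the input while the sharp edge survives the smoothing, which is the qualitative behavior the AM/TV combination is known for.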
3.2 Motivation
Minimizing the first step in (4) varies depending on the choice of the regularization function \Phi and \beta. This step can be regarded as the proximal mapping [22] of \nabla u associated with \Phi. When \Phi is the \ell_1 or \ell_0 norm, it amounts to the soft or hard thresholding operator, respectively (see Fig. 1 and [22] for various examples of this relation). Such mapping operators may not unveil the full potential of the optimization method of (4), since \Phi and \beta are chosen manually. Furthermore, the mapping operator is applied to each pixel individually, disregarding spatial correlation with neighboring pixels.
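For reference, the two pointwise operators mentioned above can be written in a few lines; the threshold values below are arbitrary, and the ℓ0 threshold sqrt(2t) follows the standard derivation of its proximal mapping.

```python
import numpy as np

def soft_threshold(x, t):
    # prox of t*|x|: shrink every entry toward zero by t
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def hard_threshold(x, t):
    # prox of the t-weighted l0 penalty: keep x only where |x| > sqrt(2t)
    return np.where(np.abs(x) > np.sqrt(2.0 * t), x, 0.0)

x = np.array([-2.0, -0.5, 0.0, 0.3, 1.5])
s = soft_threshold(x, 0.4)   # small entries snap to 0, large ones shrink
h = hard_threshold(x, 0.4)   # small entries snap to 0, large ones kept
```

Both operators act on each entry independently, which is exactly the pixelwise behavior the text argues against.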
Building upon this observation, we propose a new approach in which the regularization function \Phi and the penalty parameter \beta are learned from a large-scale training dataset. Different from the pointwise proximal mapping based on a handcrafted regularizer, the proposed method learns and aggregates the mapping of \nabla u through CNNs.
4 Proposed Method
In this section, we first introduce the DeepAM for single-image restoration, and then extend it to joint restoration tasks. In the following, subscripts denote the location of a pixel (in a vector form).
4.1 Deeply Aggregated AM
We begin with some intuition about why our learned and aggregated mapping is crucial to the AM algorithm. The first step in (4) maps gradients \nabla u of small magnitude to zero, since they are assumed to be caused by noise rather than the original signal. Traditionally, this mapping step has been applied in a pointwise manner, whether learned or not. Assuming a separable regularizer, Schmidt et al. [30] modeled the pointwise mapping function with Gaussian RBF kernels and learned their mixture coefficients (when \Phi(v) = \sum_i \phi(v_i), the first step in (4) is separable with respect to each v_i, and can thus be modeled by a pointwise operation). In contrast, we do not presume any property of \Phi. We instead train the multivariate mapping process (\nabla u \rightarrow v) associated with \Phi and \beta by making use of the CNNs. Figure 2 shows denoising examples of TV [34], CSF [30], and ours. Our method outperforms the other methods, which use pointwise mappings based on a handcrafted model (Fig. 2(b)) or a learned model (Fig. 2(c)) (see the insets).
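To make the pointwise-vs-aggregated distinction concrete, here is a toy contrast between the two kinds of mapping. The 3-tap average below is a hand-made stand-in for the learned CNN aggregation, purely for illustration; the signal and threshold are arbitrary.

```python
import numpy as np

def pointwise_map(g, t):
    # each output depends only on a single input value
    return np.sign(g) * np.maximum(np.abs(g) - t, 0.0)

def aggregated_map(g, t):
    # each output depends on a neighborhood of inputs (a crude
    # hand-made aggregation; the paper learns this step with CNNs)
    smoothed = np.convolve(g, np.ones(3) / 3.0, mode="same")
    return np.sign(smoothed) * np.maximum(np.abs(smoothed) - t, 0.0)

g = np.array([0.0, 0.9, 1.1, 1.0, 0.0, 0.2])   # noisy gradient profile
p = pointwise_map(g, 0.5)
a = aggregated_map(g, 0.5)
```

The aggregated variant responds to the local context of each gradient rather than to isolated values, which is the property the learned multivariate mapping exploits.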
We reformulate the original AM iterations in (4) with the following steps (the gradient operator \nabla is absorbed into the CNNs):
v^{k+1} = \mathcal{D}(u^k; \mathbf{W}) \qquad (5)
u^{k+1} = \arg\min_u \; \tfrac{\lambda}{2}\|u - f\|_2^2 + \tfrac{\beta}{2}\|v^{k+1} - \nabla u\|_2^2 \qquad (6)
where \mathcal{D} denotes a convolutional network parameterized by the weights \mathbf{W}. Note that \Phi is completely absorbed into the CNNs and fused with the balancing parameter (which will also be learned). v^{k+1} is estimated by deeply aggregating \nabla u^k through the CNNs. This formulation allows us to turn the optimization procedure in (1) into a cascaded neural network architecture, which can be learned by the standard backpropagation algorithm [20].
The solution of (6) satisfies the following linear system:
(\lambda \mathbf{I} + \nabla^{\top} \mathbf{B} \nabla)\, u^{k+1} = \lambda f + \nabla^{\top} \mathbf{B}\, v^{k+1}, \quad \mathbf{B} = \mathrm{diag}(\beta) \qquad (7)
where the Laplacian matrix \mathbf{L} = \nabla^{\top}\mathbf{B}\nabla. It can be seen that (7) plays a role of naturally imposing spatial and appearance consistency on the intermediate output image using a kernel matrix [38]. The linear system of (7) becomes part of the deep neural network (see Fig. 3). When \beta is a constant, the block Toeplitz matrix \lambda\mathbf{I} + \mathbf{L} is diagonalizable with the fast Fourier transform (FFT). However, in our framework, the direct application of the FFT is not feasible, since \beta is spatially varying for adaptive regularization. Fortunately, the matrix is still sparse and positive semi-definite, as the simple gradient operator is used. We adopt the preconditioned conjugate gradient (PCG) method to solve the linear system of (7). The incomplete Cholesky factorization [1] is used for computing the preconditioner.

Very recently, Chan et al. [7] replaced the proximal mapping in (4) with an off-the-shelf image denoising algorithm \mathcal{F}, e.g., nonlocal means [5], as follows:
v^{k+1} = \mathcal{F}(u^k) \qquad (8)
Although this is conceptually similar to our aggregation approach (aggregation over neighboring pixels is commonly used in state-of-the-art denoising methods), the operator \mathcal{F} in [7] still relies on a handcrafted model. Figure 3 shows the proposed learning model for image restoration tasks. The DeepAM, consisting of the deep aggregation network, parameter network, guidance network (which will be detailed in the next section), and reconstruction layer, is iterated K times, followed by the loss layer.
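To make the reconstruction step concrete, the sketch below forms a 1D analogue of the linear system (7) with a spatially varying penalty and solves it with a preconditioned conjugate gradient. Note that the paper uses an incomplete-Cholesky preconditioner, while this sketch uses a simple Jacobi (diagonal) preconditioner to stay short; all sizes and values are illustrative.

```python
import numpy as np

def jacobi_pcg(A, b, tol=1e-10, max_iter=100):
    # Conjugate gradient for SPD A with a Jacobi (diagonal) preconditioner.
    m_inv = 1.0 / np.diag(A)
    x = np.zeros_like(b)
    r = b - A @ x
    z = m_inv * r
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol:
            break
        z = m_inv * r
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x

# 1D analogue of (7): (lam*I + D^T B D) u = lam*f + D^T B v
n = 16
D = np.eye(n, k=1) - np.eye(n)             # forward-difference operator
D[-1] = 0.0
lam = 1.0
beta = np.linspace(0.5, 2.0, n)            # spatially varying penalty
A = lam * np.eye(n) + D.T @ np.diag(beta) @ D   # sparse, positive definite
rng = np.random.default_rng(0)
f, v = rng.normal(size=n), rng.normal(size=n)
b = lam * f + D.T @ (beta * v)
u = jacobi_pcg(A, b)
```

Because \beta varies per pixel, the system cannot be diagonalized by the FFT, but its sparsity and positive definiteness make iterative solvers like the one above effective.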
Figure 4 shows the denoising result of our method. Here, it is trained with three passes of DeepAM. The input image is corrupted by Gaussian noise. We can see that, as the iteration proceeds, high-quality restoration results are produced. The trained networks in the first and second iterations remove the noise, but the intermediate results are over-smoothed (Figs. 4(a) and (b)). The high-frequency information is then recovered in the last network (Fig. 4(c)). To analyze this behavior, let us look back at the soft-thresholding operator of [34]. The conventional AM method sets \beta to a small constant and increases it during the iterations. When \beta is small, the range of v is shrunk, penalizing large gradient magnitudes. The high-frequency details of an image are recovered as \beta increases. Interestingly, the DeepAM shows very similar behavior (Figs. 4(d)-(f)), but outperforms the existing methods thanks to the aggregated mapping through the CNNs, as will be validated in the experiments.

4.2 Extension to Joint Restoration
In this section, we extend the proposed method to joint restoration tasks. The basic idea of joint restoration is to provide structural guidance, assuming structural correlation between different kinds of feature maps, e.g., depth/RGB and NIR/RGB. Such a constraint has been imposed on the conventional mapping operator by considering structures of both input and guidance images [15]. Similarly, one can modify the deeply aggregated mapping of (5) as follows:
v^{k+1} = \mathcal{D}(u^k \oplus g; \mathbf{W}) \qquad (9)
where g is a guidance image and \oplus denotes a concatenation operator. However, we find such early concatenation to be less effective, since the guidance image mixes heterogeneous data. This coincides with observations in the literature on multispectral pedestrian detection [18]. Instead, we adopt a halfway concatenation similar to [18, 17]. Another subnetwork is introduced to extract an effective representation of the guidance image, which is then combined with the intermediate features of the deep aggregation network (see Fig. 3).
4.3 Learning Deeply Aggregated AM
In this section, we explain the network architecture and the training method using the standard backpropagation algorithm. Our code will be made publicly available.
Network architecture
One iteration of the proposed DeepAM consists of four major parts: the deep aggregation network, the parameter network, the guidance network (for joint restoration), and the reconstruction layer, as shown in Fig. 3. The deep aggregation network consists of 10 convolutional layers, and each hidden layer of the network has 64 feature maps. Since v contains both positive and negative values, the rectified linear unit (ReLU) is not used for the last layer. The input distributions of all convolutional layers are normalized to the standard Gaussian distribution [21]. The deep aggregation network has 2 output channels, for the horizontal and vertical gradients. We also extract the spatially varying \beta by exploiting features from the eighth convolutional layer of the deep aggregation network; the ReLU is used to ensure that \beta is positive.

For joint image restoration, the guidance network consists of 3 convolutional layers. It takes the guidance image as input, and extracts a feature map which is then concatenated with the third convolutional layer of the deep aggregation network. There are no parameters to be learned in the reconstruction layer.
Training
The DeepAM is learned via the standard backpropagation algorithm [20]. We do not require the complicated bilevel formulation of [24, 26]. Given training image pairs, we learn the network parameters by minimizing the loss function:
Table 1: PSNR (dB) on the 12 test images [10].

Method | C.Man | House | Pepp. | Starf. | Fly | Airpl. | Parrot | Lena | Barb. | Boat | Man | Couple
BM3D [10] | 29.47 | 32.99 | 30.29 | 28.57 | 29.32 | 28.49 | 28.97 | 32.03 | 30.73 | 29.88 | 29.59 | 29.70
CSF [30] | 29.51 | 32.41 | 30.32 | 28.87 | 29.69 | 28.80 | 28.91 | 31.87 | 28.99 | 29.75 | 29.68 | 29.50
EPLL [39] | 29.21 | 32.14 | 30.12 | 28.48 | 29.35 | 28.66 | 28.96 | 31.58 | 28.53 | 29.64 | 29.57 | 29.46
MLP [6] | 29.36 | 32.53 | 30.20 | 28.88 | 29.73 | 28.84 | 29.11 | 32.07 | 29.17 | 29.86 | 29.79 | 29.68
TRD [9] | 29.71 | 32.62 | 30.57 | 29.05 | 29.97 | 28.95 | 29.22 | 32.02 | 29.39 | 29.91 | 29.83 | 29.71
WNNM [12] | 29.63 | 33.39 | 30.55 | 29.09 | 29.98 | 28.81 | 29.13 | 32.24 | 31.28 | 29.98 | 29.74 | 29.80
DeepAM (ours) | 29.97 | 33.35 | 30.89 | 29.43 | 30.27 | 29.03 | 29.41 | 32.52 | 29.52 | 30.23 | 30.07 | 30.15
\mathcal{L}(\mathbf{W}) = \sum_i \| u^{(i)}_{gt} - \hat{u}^{(i)} \|_1 \qquad (10)
where u_{gt} and \hat{u} denote the ground-truth image and the output of the last reconstruction layer in (7), respectively. It is known that the \ell_1 loss in deep networks reduces splotchy artifacts and outperforms the \ell_2 loss for pixel-level prediction tasks [37]. We use stochastic gradient descent (SGD) to minimize the loss function of (10). The derivative for the backpropagation is obtained as follows:

\partial \mathcal{L} / \partial \hat{u} = \mathrm{sign}(\hat{u} - u_{gt}) \qquad (11)
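A tiny numeric illustration of this loss and its backpropagated derivative: the gradient with respect to the network output is just the elementwise sign of the error (zero where the error is zero, by numpy's sign convention). The two arrays are illustrative, not data from the paper.

```python
import numpy as np

u_hat = np.array([1.0, 2.0, 3.0])   # network output (illustrative)
u_gt = np.array([1.5, 2.0, 2.0])    # ground truth  (illustrative)

loss = np.abs(u_hat - u_gt).sum()   # l1 loss, eq. (10)
grad = np.sign(u_hat - u_gt)        # dL/du_hat,   eq. (11)
```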
To learn the parameters in the network, we need the derivatives of the loss with respect to v and \beta. By the chain rule of differentiation, \partial\mathcal{L}/\partial v can be derived from (7):

\mathbf{A}^{\top} z = \partial\mathcal{L}/\partial\hat{u}, \quad \partial\mathcal{L}/\partial v = \mathbf{B}\,\nabla z, \quad \text{with } \mathbf{A} = \lambda\mathbf{I} + \nabla^{\top}\mathbf{B}\nabla \qquad (12)

\partial\mathcal{L}/\partial v is obtained by solving the linear system of (12). Similarly, for \beta we have:

\partial\mathcal{L}/\partial\beta = (\nabla z) \circ (v - \nabla\hat{u}) \qquad (13)

where "\circ" is an element-wise multiplication. Since the loss \mathcal{L} is a scalar value, \partial\mathcal{L}/\partial v and \partial\mathcal{L}/\partial\beta are 2N \times 1 and N \times 1 vectors, respectively, where N is the total number of pixels. More details about the derivations of (12) and (13) are available in the supplementary material. The system matrix \mathbf{A} is shared in (12) and (13); thus its incomplete factorization is performed only once.
Figure 5 shows the convergence of the PCG method for solving the linear system of (12). We find that a few PCG iterations are enough for the backpropagation; the average residual on 20 images is 1.3 after 10 iterations. The table in Fig. 5 compares the runtime of the PCG iterations and MATLAB's backslash (on a 256×256 image). The PCG with 10 iterations is about 5 times faster than the direct linear system solver.

5 Experiments
We jointly train our DeepAM for 20 epochs. From here on, we call the method trained through a cascade of K DeepAM iterations DeepAM-K. The MatConvNet library [2] (with a 12GB NVIDIA Titan GPU) is used for network construction and training. The networks are initialized randomly using Gaussian distributions. The momentum and weight-decay parameters are set to 0.9 and 0.0005, respectively. We do not perform any pre-training (or fine-tuning). The proposed method is applied to single-image denoising, depth super-resolution, and RGB/NIR restoration. The results for the comparison with other methods are obtained from source codes provided by the authors. Additional results and analyses are available in the supplementary material.

Table 2: Average PSNR / SSIM on 68 images from the BSD68 dataset [27].

σ | BM3D [10] | MLP [6] | CSF [30] | TRD [9] | DeepAM-1 | DeepAM-2 | DeepAM-3
15 | 31.12 / 0.872 | – | 31.24 / 0.873 | 31.42 / 0.882 | 31.40 / 0.882 | 31.65 / 0.885 | 31.68 / 0.886
25 | 28.61 / 0.801 | 28.84 / 0.812 | 28.73 / 0.803 | 28.91 / 0.815 | 28.95 / 0.816 | 29.18 / 0.824 | 29.21 / 0.825
50 | 25.65 / 0.686 | 26.00 / 0.708 | – | 25.96 / 0.701 | 25.94 / 0.701 | 26.20 / 0.714 | 26.24 / 0.716
5.1 Single Image Denoising
We learned the DeepAM from a set of patches sampled from the BSD300 [19] dataset. Here, K was set to 3, as the performance of the DeepAM converges after 3 iterations (refer to Table 2). The noise levels were set to σ = 15, 25, and 50. We compared against a variety of recent state-of-the-art techniques, including BM3D [10], WNNM [12], CSF [30], TRD [9], EPLL [39], and MLP [6]. The first two methods are based on nonlocal regularization and the others are learning-based approaches.
Table 1 shows the peak signal-to-noise ratio (PSNR) on the 12 test images [10]. The best results for each image are highlighted in bold. The DeepAM yields the highest PSNR results on most images. Our deep aggregation used in the mapping step outperforms the pointwise mapping of the CSF [30] by 0.3~0.5dB. Learning-based methods tend to perform better than handcrafted models. We observed, however, that the methods based on nonlocal regularization (BM3D [10] and WNNM [12]) usually work better on images dominated by repetitive textures, e.g., 'House' and 'Barbara'. The nonlocal self-similarity is a powerful prior on regular and repetitive textures, but it may lead to inferior results on irregular regions.
Table 3: Depth super-resolution results in BMP (tolerance 3): NYU v2 [33] / Middlebury [29].

Method | ×4 | ×8 | ×16
NMRF [23] | 1.41 / 4.56 | 4.21 / 7.59 | 16.25 / 13.22
TGV [11] | 1.58 / 5.72 | 5.42 / 8.82 | 17.89 / 13.47
SD filter [13] | 1.27 / 2.41 | 3.56 / 5.97 | 15.43 / 12.18
DJF [17] | 0.68 / 3.75 | 1.92 / 6.37 | 5.82 / 12.63
DeepAM (ours) | 0.57 / 3.14 | 1.58 / 5.78 | 4.63 / 10.45
Figure 6 shows denoising results on one image from the BSD68 dataset [27]. The DeepAM visually outperforms state-of-the-art methods. Table 2 summarizes an objective evaluation, measuring the average PSNR and structural similarity index (SSIM) [35] on 68 images from the BSD68 dataset [27]. As expected, our method achieves a significant improvement over the nonlocal-based methods as well as the recent data-driven approaches. Due to the space limit, some methods were omitted from the table; the full performance comparison is available in the supplementary material.
5.2 Depth Superresolution
Modern depth sensors, e.g., MS Kinect, provide dense depth measurements of dynamic scenes, but typically have a low resolution. A common approach to tackle this problem is to exploit a high-resolution (HR) RGB image as guidance. We applied our DeepAM to this task, and evaluated it on the NYU v2 dataset [33] and the Middlebury dataset [29]. The NYU v2 dataset [33] consists of 1449 RGB-D image pairs of indoor scenes, among which 1000 image pairs were used for training and 449 for testing. Depth values are normalized to the range [0, 255]. To train the network, we randomly collected RGB-D patch pairs from the training set. A low-resolution (LR) depth image was synthesized by nearest-neighbor downsampling at three scale factors. The network takes the LR depth image, which is bilinearly interpolated onto the desired HR grid, and the HR RGB image as inputs.
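The LR synthesis and re-upsampling described above can be sketched in a few lines: nearest-neighbor downsampling by a factor s, then bilinear interpolation back to the HR grid. The factor and array sizes below are illustrative.

```python
import numpy as np

def nn_downsample(depth, s):
    # nearest-neighbor downsampling: keep every s-th sample
    return depth[::s, ::s]

def bilinear_upsample(lr, s):
    # bilinear interpolation of an LR map back onto the s-times-larger grid
    h, w = lr.shape
    ys = np.linspace(0, h - 1, h * s)
    xs = np.linspace(0, w - 1, w * s)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None]
    wx = (xs - x0)[None, :]
    top = lr[np.ix_(y0, x0)] * (1 - wx) + lr[np.ix_(y0, x1)] * wx
    bot = lr[np.ix_(y1, x0)] * (1 - wx) + lr[np.ix_(y1, x1)] * wx
    return top * (1 - wy) + bot * wy

hr = np.arange(64, dtype=float).reshape(8, 8)   # toy HR depth map
lr = nn_downsample(hr, 2)                        # (4, 4)
up = bilinear_upsample(lr, 2)                    # back to (8, 8)
```

The bilinearly upsampled map is what the network receives as its degraded input, alongside the HR RGB guidance.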
Figure 7 shows the super-resolution results of NMRF [23], TGV [11], deep joint image filtering (DJF) [17], and the DeepAM. The TGV model [11] uses an anisotropic diffusion tensor that depends solely on the RGB image. The major drawback of this approach is that the RGB-depth coherence assumption is violated in textured surfaces. Thus, the restored depth image may contain gradients similar to the color image, which causes texture-copying artifacts (Fig. 7(d)). Although the NMRF [23] combines several weighting schemes computed from the RGB image, segmentation, and the initially interpolated depth, the texture-copying artifacts are still observed (Fig. 7(c)). The NMRF [23] preserves depth discontinuities well, but shows poor results on smooth surfaces. The DJF [17] avoids the texture-copying artifacts thanks to faithful CNN responses extracted from both the color image and the depth map (Fig. 7(e)). However, this method lacks the regularization constraint that encourages spatial and appearance consistency on the output, and thus it over-smooths the results and does not protect thin structures. Our DeepAM preserves sharp depth discontinuities without notable artifacts, as shown in Fig. 7(f). The quantitative evaluations on the NYU v2 dataset [33] and the Middlebury dataset [29] are summarized in Table 3. The accuracy is measured by the bad matching percentage (BMP) [29] with tolerance 3.

5.3 RGB/NIR Restoration
The RGB/NIR restoration aims to enhance a noisy RGB image taken under low illumination using a spatially aligned NIR image. The challenge in applying our model to RGB/NIR restoration is the lack of ground-truth data for training. To construct a large training set, we used the indoor IVRL dataset consisting of 400 RGB/NIR pairs [28] that were recorded under daylight illumination (this dataset [28] was originally introduced for semantic segmentation). Specifically, we generated noisy RGB images by adding synthetic Gaussian noise, and used 300 image pairs for training.
In Table 4, we perform an objective evaluation using 5 test images from [14]. The DeepAM gives better quantitative results than other state-of-the-art methods [31, 10, 13]. Figure 8 compares the RGB/NIR restoration results of Cross-field [31], DJF [17], and our DeepAM on a real-world example. The input RGB/NIR pair was taken from the project website of [31]. This experiment shows that the proposed method can be applied to real-world data, although it was trained on a synthetic dataset. It was reported in [14] that a restoration algorithm designed (or trained) to work under a daylight condition can also be used for both daylight and night conditions.
6 Conclusion
We have explored a general framework called the DeepAM, which can be used in various image restoration applications. Contrary to existing data-driven approaches that directly produce the restoration result from the CNNs, the DeepAM uses the CNNs to learn the regularizer of the AM algorithm. Our formulation fully integrates the CNNs with an energy minimization model, making it possible to learn the whole network in an end-to-end manner. Experiments demonstrate that the deep aggregation in the mapping step is the critical factor of the proposed learning model. As future work, we will further investigate an adversarial loss for pixel-level prediction tasks.
References
 [1] http://faculty.cse.tamu.edu/davis/suitesparse.html/.
 [2] http://www.vlfeat.org/matconvnet/.
 [3] A. Agrawal, R. Raskar, S. Nayar, and Y. Li. Removing photography artifacts using gradient projection and flashexposure sampling. ACM Trans. Graph., 24(3), 2005.
 [4] K. Bredies, K. Kunisch, and T. Pock. Total generalized variation. SIAM J. Imag. Sci., 3(3), 2010.
 [5] A. Buades, B. Coll, and J. Morel. A nonlocal algorithm for image denoising. CVPR, 2005.
 [6] H. C. Burger, C. J. Schuler, and S. Harmeling. Image denoising: Can plain neural networks compete with BM3D? CVPR, 2012.
 [7] S. Chan, X. Wang, and O. Elgendy. Plugandplay admm for image restoration: fixed point convergence and applications. arXiv, 2016.
 [8] Q. Chen and V. Koltun. A simple model for intrinsic image decomposition with depth cues. ICCV, 2013.
 [9] Y. Chen, W. Yu, and T. Pock. On learning optimized reaction diffusion processes for effective image restoration. CVPR, 2015.
 [10] K. Dabov, A. Foi, V. Katkovnik, and K. Egiazarian. Image denoising by sparse 3-D transform-domain collaborative filtering. IEEE Trans. Image Process., 16(8), 2007.
 [11] D. Ferstl, C. Reinbacher, R. Ranftl, M. Ruther, and H. Bischof. Image guided depth upsampling using anisotropic total generalized variation. ICCV, 2013.
 [12] S. Gu, L. Zhang, W. Zuo, and X. Feng. Weighted nuclear norm minimization with application to image denoising. CVPR, 2014.
 [13] B. Ham, M. Cho, and J. Ponce. Robust image filtering using joint static and dynamic guidance. CVPR, 2015.
 [14] H. Honda, L. Van Gool, and R. Timofte. Make my day – high-fidelity color denoising with near-infrared. CVPRW, 2015.
 [15] Y. Kim, B. Ham, C. Oh, and K. Sohn. Structure selective depth superresolution for rgbd cameras. IEEE Trans. Image Process., 25(11), 2016.
 [16] D. Krishnan and R. Fergus. Fast image deconvolution using hyper-Laplacian priors. NIPS, 2009.
 [17] Y. Li, J. Huang, N. Ahuja, and M. Yang. Deep joint image filtering. ECCV, 2016.
 [18] J. Liu, S. Zhang, S. Wang, and D. Metaxas. Multispectral deep neural networks for pedestrian detection. BMVC, 2016.
 [19] D. Martin, C. Fowlkes, D. Tal, and J. Malik. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. ICCV, 2001.
 [20] M. Mozer. A focused backpropagation algorithm for temporal pattern recognition. Complex Systems, 3(4), 1989.
 [21] H. Noh, S. Hong, and B. Han. Learning deconvolution network for semantic segmentation. ICCV, 2015.
 [22] N. Parikh and S. Boyd. Proximal algorithms. Found. and Trends in optimization, 2014.
 [23] J. Park, H. Kim, Y. W. Tai, M. S. Brown, and I. Kweon. High quality depth map upsampling for 3dtof cameras. ICCV, 2011.
 [24] R. Ranftl and T. Pock. A deep variational model for image segmentation. GCPR, 2014.
 [25] G. Riegler, D. Ferstl, M. Rüther, and H. Bischof. A deep primal-dual network for guided depth super-resolution. BMVC, 2016.
 [26] G. Riegler, M. Rüther, and H. Bischof. ATGV-Net: Accurate depth super-resolution. ECCV, 2016.
 [27] S. Roth and M. J. Black. Fields of experts. IJCV, 82(2), 2009.
 [28] N. Salamati, D. Larlus, G. Csurka, and S. Susstrunk. Incorporating nearinfrared information into semantic image segmentation. arXiv, 2014.
 [29] D. Scharstein and R. Szeliski. A taxonomy and evaluation of dense two-frame stereo correspondence algorithms. IJCV, 47(1), 2002.
 [30] U. Schmidt and S. Roth. Shrinkage fields for effective image restoration. CVPR, 2014.
 [31] X. Shen, Q. Yan, L. Xu, L. Ma, and J. Jia. Multispectral joint image restoration via optimizing a scale map. IEEE Trans. Pattern Anal. Mach. Intell., 1(1), 2015.
 [32] X. Shen, C. Zhou, L. Xu, and J. Jia. Mutualstructure for joint filtering. ICCV, 2015.
 [33] N. Silberman, D. Hoiem, P. Kohli, and R. Fergus. Indoor segmentation and support inference from rgbd images. ECCV, 2012.
 [34] Y. Wang, J. Yang, W. Yin, and Y. Zhang. A new alternating minimization algorithm for total variation image reconstruction. SIAM J. Imag. Sci., 1(3), 2008.
 [35] Z. Wang, A. C. Bovik, H. Rahim, and E. P. Simoncelli. Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Process., 13(4), 2004.
 [36] L. Xu, C. Lu, Y. Xu, and J. Jia. Image smoothing via L0 gradient minimization. ACM Trans. Graph., 30(6), 2011.
 [37] H. Zhao, O. Gallo, I. Frosio, and J. Kautz. Loss functions for neural networks for image processing. arXiv, 2015.
 [38] S. Zheng, S. Jayasumana, B. Paredes, V. Vineet, Z. Su, D. Du, C. Huang, and P. Torr. Conditional random fields as recurrent neural networks. ICCV, 2015.
 [39] D. Zoran and Y. Weiss. From learning models of natural image patches to whole image restoration. ICCV, 2011.