1 Introduction
Single-image non-blind deconvolution aims to recover a sharp latent image given a blurred image and the blur kernel. The community has made active research efforts on this classical problem over the last decade. Assuming the camera motion is spatially invariant, a blurred image $y$ can be modeled as the convolution of a blur kernel $k$ with a latent image $x$:
$$y = k \otimes x + n, \qquad (1)$$
where $n$ is additive noise and $\otimes$ is the convolution operator. In non-blind deconvolution, we solve for $x$ given $y$ and $k$, which is an ill-posed problem since the noise $n$ is unknown.
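As a quick illustration, the blur model in (1) can be simulated in a few lines (a minimal NumPy/SciPy sketch; the impulse image, box kernel, and noise level are toy placeholders, not the paper's data):

```python
import numpy as np
from scipy.signal import fftconvolve

def blur(x, k, noise_std=0.01, rng=None):
    """Simulate y = k (x) x + n with a spatially invariant kernel k."""
    rng = np.random.default_rng(0) if rng is None else rng
    y = fftconvolve(x, k, mode="same")                    # k convolved with x
    return y + noise_std * rng.standard_normal(y.shape)   # additive noise n

x = np.zeros((32, 32)); x[16, 16] = 1.0   # toy latent image (an impulse)
k = np.ones((5, 5)) / 25.0                # normalized box blur kernel
y = blur(x, k)                            # observed blurry, noisy image
```

With `noise_std = 0`, `y` is exactly the convolution of `k` and `x`; the networks trained later in the paper assume Gaussian noise levels of 1%, 3%, and 5%.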
Conventional approaches, such as Richardson-Lucy deconvolution [17] and the Wiener filter [30], suffer from serious ringing artifacts and are less effective at dealing with large motion and outliers. Several methods focus on developing effective image priors for image restoration, including hyper-Laplacian priors [12, 13], non-local means [2], fields of experts [21, 24, 23, 20], patch-based priors [35, 27] and shrinkage fields [22]. However, these image priors are heavily based on the empirical statistics of natural images and typically lead to highly non-convex optimization problems, so most of the aforementioned methods incur high computational costs to obtain state-of-the-art deblurring results. Recently, deep neural networks have been applied to image restoration [25, 32]. However, these methods need to retrain the network for different blur kernels, which is not practical in real-world scenarios.
Different from existing methods, we propose an FCNN for iterative non-blind deconvolution that automatically learns an effective image prior and does not need to be retrained for different blur kernels. The proposed method decomposes non-blind deconvolution into two steps: image denoising and image deconvolution. In the denoising step, we train an FCNN to remove noise and outliers in the gradient domain; the learned image gradients are treated as image priors to guide the deconvolution. In the deconvolution step, we concatenate a deconvolution module at the end of the FCNN to remove the blur from the input images. We cascade the FCNN into a multi-stage architecture to deconvolve blurred images iteratively. The proposed FCNN adaptively learns effective image priors that preserve image details and structures. To effectively suppress ringing artifacts and noise in smooth regions, we optimize the FCNN with a robust $L_1$ loss function instead of the commonly used $L_2$ loss. In addition, we optimize the hyper-parameters in the deconvolution modules. Extensive evaluation on benchmark datasets demonstrates that the proposed method performs favorably against state-of-the-art algorithms in terms of both quality and speed.
2 Related Work
Non-blind deconvolution has been studied extensively and numerous algorithms have been proposed. In this section, we discuss the most relevant algorithms and put this work in proper context.
Since non-blind deblurring is an ill-posed problem, it requires assumptions or prior knowledge to constrain the solution space. Early approaches, e.g., Wiener deconvolution [30], assume that the value of every pixel follows a Gaussian distribution. However, this assumption does not hold for natural images, as the distribution of real-world image gradients is heavy-tailed. To develop an image prior that fits this heavy-tailed distribution, the hyper-Laplacian prior was proposed [13]. As solving image restoration with the hyper-Laplacian prior is time-consuming, Krishnan and Fergus [12] propose an efficient algorithm based on a half-quadratic splitting method. To learn good priors for image restoration, Roth and Black [20] learn a group of fields of experts (FoEs) to fit the heavy-tailed distribution of natural images. The FoE framework is further extended by [23, 22]. However, FoE-based methods usually lead to complex optimization problems that are time-consuming to solve.
The Gaussian mixture model (GMM) has also been used to fit the heavy-tailed distribution of natural image gradients. Fergus et al. [7] use a mixture of Gaussians to learn an image gradient prior via variational Bayesian inference. Zoran and Weiss [35] analyze image priors in image restoration and propose a patch-based prior built on a GMM; this work is further extended by Sun et al. [27]. Although good results have been achieved, solving these models requires heavy computation. Recently, deep learning has been applied to low-level image processing such as denoising [6, 3, 31, 8], super-resolution [5, 29, 18, 9, 19, 10, 34] and edge-preserving filtering [33, 14, 15]. For non-blind deblurring, Schuler et al. [25] develop a multi-layer perceptron (MLP) approach to remove the noise and artifacts produced by the deconvolution process. Xu et al. [32] use a deep CNN to restore images with outliers; this method uses singular value decomposition (SVD) to reduce the number of parameters in the network, but it needs to fine-tune the network for every kernel because it uses the SVD of the pseudo-inverse kernel as the network initialization. Different from existing CNN-based methods, we develop an effective FCNN for iterative non-blind deconvolution. We cascade the FCNN into a multi-stage architecture that deconvolves blurred images iteratively and preserves the details of the restored images. Moreover, our method does not need to retrain the model for each blur kernel and is much faster than existing image restoration methods.
3 Proposed Algorithm
In this section, we present our algorithm for learning an effective image prior for non-blind image deconvolution. We first review half-quadratic optimization in image restoration and then introduce our method.
3.1 Motivation
The half-quadratic splitting framework has been widely used in non-blind deblurring methods [12, 35, 23, 22]. In this section, we first review this framework in image restoration and then motivate our method. The conventional model of image restoration is
$$\min_{x} \; \|x \otimes k - y\|_2^2 + \lambda \rho(\nabla x), \qquad (2)$$
where $\nabla = (\nabla_h, \nabla_v)$ contains the horizontal and vertical gradient operators and $\rho(\cdot)$ is the regularization on the image gradient of $x$. With the half-quadratic splitting method, model (2) can be reformulated as
$$\min_{x, z} \; \|x \otimes k - y\|_2^2 + \beta \|\nabla x - z\|_2^2 + \lambda \rho(z), \qquad (3)$$
where $z = (z_h, z_v)$ is an auxiliary variable and $\beta$ is a weight. The half-quadratic optimization of (3) alternately solves
$$\min_{z} \; \beta \|\nabla x - z\|_2^2 + \lambda \rho(z) \qquad (4)$$
and
$$\min_{x} \; \|x \otimes k - y\|_2^2 + \beta \|\nabla x - z\|_2^2. \qquad (5)$$
We note that (4) is actually a denoising problem, while (5) is a deconvolution problem with respect to $x$. Once the solution $z$ of (4) is obtained, the clear image can be solved for efficiently by the fast Fourier transform (FFT):
$$x = \mathcal{F}^{-1}\!\left(\frac{\overline{\mathcal{F}(k)}\,\mathcal{F}(y) + \beta\big(\overline{\mathcal{F}(\nabla_h)}\,\mathcal{F}(z_h) + \overline{\mathcal{F}(\nabla_v)}\,\mathcal{F}(z_v)\big)}{\overline{\mathcal{F}(k)}\,\mathcal{F}(k) + \beta\big(\overline{\mathcal{F}(\nabla_h)}\,\mathcal{F}(\nabla_h) + \overline{\mathcal{F}(\nabla_v)}\,\mathcal{F}(\nabla_v)\big)}\right), \qquad (6)$$
where $\mathcal{F}(\cdot)$ and $\mathcal{F}^{-1}(\cdot)$ denote the Fourier transform and its inverse, respectively, $\overline{\mathcal{F}(\cdot)}$ is the complex conjugate operator, and $\beta$ is the hyper-parameter in the deconvolution.
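The closed-form update (6) can be sketched as follows (a minimal NumPy illustration assuming circular boundary conditions and forward-difference gradient filters; the helper `psf2otf` and all variable names are our own choices, not the paper's implementation):

```python
import numpy as np

def psf2otf(k, shape):
    """Zero-pad a kernel, circularly shift its center to (0, 0), then FFT."""
    pad = np.zeros(shape)
    kh, kw = k.shape
    pad[:kh, :kw] = k
    pad = np.roll(pad, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.fft.fft2(pad)

def deconv_step(y, k, zh, zv, beta):
    """Closed-form x-update of (6): quadratic data term + gradient prior."""
    Fk = psf2otf(k, y.shape)
    Fdh = psf2otf(np.array([[1.0, -1.0]]), y.shape)    # horizontal difference
    Fdv = psf2otf(np.array([[1.0], [-1.0]]), y.shape)  # vertical difference
    num = (np.conj(Fk) * np.fft.fft2(y)
           + beta * (np.conj(Fdh) * np.fft.fft2(zh)
                     + np.conj(Fdv) * np.fft.fft2(zv)))
    den = (np.conj(Fk) * Fk
           + beta * (np.conj(Fdh) * Fdh + np.conj(Fdv) * Fdv))
    return np.real(np.fft.ifft2(num / den))
```

All operations are pointwise in the Fourier domain, which is why this update costs only a few FFTs per iteration.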
3.2 Network Architecture
The proposed network architecture for non-blind deconvolution is shown in Figure 1. The input of our network is a blurry image together with the corresponding blur kernel. The network first applies the deconvolution operation to the blurry image via a deconvolution module and then performs convolutions on the vertical and horizontal gradients to generate results with less noise. The denoised image gradients are treated as image priors to guide the image deconvolution in the next iteration.
Denoising by FCNN.
We note that although the output of the deconvolution (6) is sharp, it usually contains noise and significant ringing artifacts (see Figure 2(k)). To address this, we develop an FCNN and apply it to the vertical and horizontal gradients to remove noise and ringing artifacts. Applying the FCNN to the vertical and horizontal gradients separately would usually lead to different network weights. Similar to [33], we transpose the vertical gradient so that the vertical and horizontal gradients can share weights during training. Table 1 shows the details of the proposed network. We add a rectified linear unit (ReLU) as the activation function after every convolution layer except the last one.
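To make this concrete, the following toy single-channel sketch mirrors the two ideas above: a ReLU after every layer except the last, and the transpose trick that lets both gradient orientations share one set of weights (the single-filter-per-layer simplification and all names are ours; the actual network in Table 1 uses multi-channel filters):

```python
import numpy as np

def conv2d(x, w):
    """'Same' 2-D cross-correlation (CNN-style convolution), one filter."""
    kh, kw = w.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros_like(x)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * w)
    return out

def fcnn_forward(grad_map, weights):
    """Stack of convolutions with a ReLU after every layer except the last."""
    h = grad_map
    for w in weights[:-1]:
        h = np.maximum(conv2d(h, w), 0.0)   # ReLU activation
    return conv2d(h, weights[-1])           # last layer: no activation

def denoise_both(gh, gv, weights):
    """Share weights: transpose the vertical gradient, run, transpose back."""
    return fcnn_forward(gh, weights), fcnn_forward(gv.T, weights).T
```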
Deconvolution module.
The deconvolution module is used to restore sharp images and is defined by (5). In the proposed network, it is applied to the denoised gradient outputs of the FCNN to guide the image restoration.
Figure 2: (a) Clean image; (b) blurry image; (c) intensity-domain output; (d) gradient-domain output; (e) local region of (a); (f)-(i) intensity domain: initial result and results after the 1st-3rd iterations; (j) local region of (b); (k)-(n) gradient domain: initial result and results after the 1st-3rd iterations.
3.3 Loss Function for FCNN Training
Since it is very difficult to train the whole network in an end-to-end manner, we train the FCNN weights stage by stage. That is, we first train the network weights and then fix them when performing the deconvolution; after each deconvolution module, the network weights are trained again. This training procedure minimizes the loss function:
$$L(w) = \frac{1}{N} \sum_{i=1}^{N} \left\| f(\nabla \hat{x}_i; w) - \nabla x_i^{gt} \right\|_1, \qquad (7)$$
where $f(\cdot)$ is the denoising mapping learned by the FCNN, $w$ denotes the FCNN weights, $\hat{x}_i$ is the deconvolution output for the $i$-th training sample, $N$ is the number of training samples in each batch, $\|\cdot\|_1$ is the $L_1$ norm, and $x_i^{gt}$ is the ground-truth image.
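The difference between this robust $L_1$ loss and the common squared loss can be sketched numerically (a toy illustration, not the training code; a single large outlier, like a ringing artifact, inflates the squared loss far more):

```python
import numpy as np

def l1_loss(pred, target):
    """Robust L1 reconstruction error averaged over all elements, as in (7)."""
    return np.mean(np.abs(pred - target))

def l2_loss(pred, target):
    """Commonly used squared (L2) reconstruction error, for comparison."""
    return np.mean((pred - target) ** 2)

clean = np.zeros(100)
noisy = np.zeros(100); noisy[0] = 10.0   # one large outlier
```

Here the single outlier contributes linearly to the $L_1$ loss but quadratically to the $L_2$ loss, which is why $L_2$ training is dominated by such pixels.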
3.4 Hyperparameters Training
In order to obtain optimal hyper-parameters for the deconvolution module (5), we train them in an end-to-end manner with fixed FCNN weights. Hyper-parameter training minimizes the loss function
$$L(\beta) = \frac{1}{N} \sum_{i=1}^{N} \left\| \tilde{x}_i - x_i^{gt} \right\|_1, \qquad (8)$$
where $\tilde{x}_i$ is the output of the final deconvolution module.
As the forward propagation of the deconvolution module is defined by (6), the gradients needed for backward propagation are obtained by differentiating (6) with respect to the denoised gradients $z$ and the hyper-parameter $\beta$; since (6) is a pointwise operation in the Fourier domain, these gradients can likewise be computed efficiently with FFTs.
4 Analysis and Discussion
In this section, we analyze the effect of the FCNN for iterative deconvolution, explain why the gradient domain is used, and validate the loss function used in the proposed network.
4.1 Effect of FCNN for iterative deconvolution
In the proposed method, we iteratively solve the deconvolution and denoising sub-problems; that is, the FCNN parameters are updated at each iteration. In this manner, high-quality results can be obtained.
Figure 3 shows an example that demonstrates the effectiveness of the FCNN for iterative deconvolution. As shown in Figure 3(a), the result generated by the one-iteration network contains artifacts and has a lower PSNR value. In contrast, these artifacts are reduced by the iterative FCNN, which accordingly leads to a much clearer image with a higher PSNR value (Figure 3(b)). Note that the one-iteration network is different from the first-iteration output of the three-iteration network (e.g., Figure 2(l)): for the one-iteration network, we optimize the FCNN weights and deconvolution hyper-parameters for a single iteration. The same applies to the one-iteration network in the experiments section.
Figure 3: (a) One-iteration network, PSNR: 29.26 dB; (b) three-iteration network, PSNR: 29.85 dB.
4.2 Gradient Domain vs. Intensity Domain
The denoising part is mainly used to remove noise and ringing artifacts while keeping textures. We note that the image gradient is able to model the details and structures of images. Thus, we train the network in the gradient domain instead of the intensity domain. We train two three-iteration networks whose restored results are computed in the intensity domain and the gradient domain, respectively. As shown in Figure 2, the results reconstructed in the intensity domain (second row) contain more noise and artifacts than those from the gradient domain (third row).
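For reference, the gradient maps used here can be computed with simple forward differences (a sketch; the paper does not specify its exact discretization, so the circular forward difference below is an assumption):

```python
import numpy as np

def image_gradients(x):
    """Forward-difference horizontal and vertical gradients (circular)."""
    gh = np.roll(x, -1, axis=1) - x   # horizontal gradient
    gv = np.roll(x, -1, axis=0) - x   # vertical gradient
    return gh, gv
```

Smooth regions map to near-zero gradients, so noise and ringing show up as isolated large responses that the denoising FCNN can target directly.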
Figure 5: Denoised gradients produced by networks trained with the L2 loss and the L1 loss, compared with the ground-truth vertical and horizontal gradients.
4.3 Effect of the Loss Function
Most existing CNN-based low-level vision methods use the $L_2$-norm-based reconstruction error as the loss function [5]. However, the $L_2$ norm is not robust to outliers and often yields results containing noise and ringing artifacts. To overcome this limitation, we use an $L_1$-norm-based reconstruction error as the loss function, i.e., (7).
To validate the effect of the proposed loss function, we train the proposed network with the $L_1$-based and the $L_2$-based reconstruction errors using the same settings for the first iteration. As shown in Figure 4, training with the $L_1$-based reconstruction error converges better than with the $L_2$-based error. Figure 5 shows that the $L_1$-based reconstruction error removes noise and artifacts more effectively than the $L_2$-based error.
5 Experimental Results
We evaluate the proposed algorithm against state-of-the-art methods on benchmark datasets for non-blind deconvolution. The MATLAB source code and datasets will be made publicly available. More results can be found in the supplementary material.
5.1 Training
Parameter settings.
To train the network, we optimize the hyper-parameters and the FCNN weights iteratively. Specifically, the FCNN weights are updated iteration by iteration with fixed hyper-parameters, and the hyper-parameters are then trained in an end-to-end manner with fixed FCNN weights. We implement the proposed algorithm using the MatConvNet [28] package. We use Xavier initialization for the FCNN weights of each layer. The hyper-parameters are randomly initialized, with the constraint that each later iteration has a smaller hyper-parameter than the former one. Stochastic gradient descent (SGD) is used to train the network. The learning rate for FCNN training is 0.01. For hyper-parameter training, we set the learning rate to 10 in the last deconvolution module and 10,000 in the other modules. The momentum for both FCNN and hyper-parameter training is set to 0.95. Since the hyper-parameters easily get stuck in local minima, we train the network with several hyper-parameter initializations and select the best one.
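A single SGD-with-momentum update as described above can be sketched as follows (the scalar test function and all names are ours; only the learning rate 0.01 and momentum 0.95 come from the text):

```python
def sgd_momentum_step(w, grad, velocity, lr=0.01, momentum=0.95):
    """One SGD-with-momentum update: accumulate velocity, then step."""
    velocity = momentum * velocity - lr * grad
    return w + velocity, velocity
```

With momentum as high as 0.95, updates average gradients over many past steps, which smooths the noisy mini-batch gradients at the cost of some oscillation.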
Training dataset.
In order to generate enough blurred images for training, we use the BSD500 dataset [1] and randomly crop image patches from it as the clear images. The blur kernels are generated according to [4], with sizes ranging from 11 to 31 pixels. Some examples of our randomly generated kernels are shown in Figure 6. After obtaining these kernels, we convolve the clear patches with them and add Gaussian noise to obtain the blurry patches. We train three networks with 1%, 3% and 5% noise, respectively.
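The data-generation pipeline described above can be sketched as follows (the patch, kernel, and helper name are stand-ins; actual training uses BSD500 crops and kernels generated as in [4]):

```python
import numpy as np
from scipy.signal import fftconvolve

def make_training_pair(clear, kernel, noise_level, rng):
    """Blur a clear patch and add Gaussian noise (noise_level in [0, 1])."""
    blurred = fftconvolve(clear, kernel, mode="same")
    blurred += noise_level * rng.standard_normal(blurred.shape)
    return blurred, clear

rng = np.random.default_rng(0)
patch = rng.random((64, 64))            # stand-in for a BSD500 crop
kernel = np.ones((11, 11)) / 121.0      # stand-in for a generated kernel
pairs = {lvl: make_training_pair(patch, kernel, lvl, rng)
         for lvl in (0.01, 0.03, 0.05)} # one network per noise level
```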
Test dataset.
For the test dataset, we use the 80 ground-truth clear images from the dataset of Sun et al. [26] and the eight blur kernels from the dataset of Levin et al. [13], giving 640 blurred images in total. We evaluate all methods on blurred images with Gaussian noise levels of 1%, 3% and 5%. In addition to the ground-truth kernels from the dataset of Levin et al. [13], we also use blur kernels estimated by the state-of-the-art blind deblurring method [16] to examine the effectiveness of the proposed method.
5.2 Comparisons with the State of the Art
We compare the proposed iterative FCNN with other non-blind deblurring algorithms, including HL [12], EPLL [35], MLP [25] and CSF [22]. We also evaluate the proposed algorithm with one iteration and with three iterations. For fairness, we use the publicly available implementations of these methods and tune their parameters to generate the best possible results.
We first quantitatively evaluate our method on the dataset with 1% Gaussian noise, using PSNR and SSIM as the metrics. As shown in Table 2, our method outperforms HL [12], MLP [25] and CSF [22] in terms of PSNR and SSIM. Although EPLL performs slightly better than the proposed method, it is not efficient, as it needs to solve a complex optimization problem. Furthermore, EPLL tends to smooth details, as shown in Figure 7(c), while our method generates results with much clearer textures (Figure 7(f)). We further note that the PSNR and SSIM values of the proposed iterative FCNN are higher than those of the one-iteration network, which demonstrates the effectiveness of the iterative scheme. In addition, we use the blur kernels estimated by Pan et al. [16] to evaluate the proposed method. The PSNR and SSIM values in Table 2 show that the proposed method still performs well and can be combined with [16] to improve the quality of the restored results.
We further evaluate our method on images with 3% and 5% Gaussian noise. Tables 3 and 4 show the results of the different methods. Our method achieves better performance than HL [12], MLP [25] and CSF [22] when the noise level is high. In addition to higher PSNR and SSIM, our method also generates much clearer images with finer textures, as shown in Figure 7.
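For completeness, the PSNR metric used throughout these tables can be written in a few lines (a standard definition; SSIM is more involved and omitted here):

```python
import numpy as np

def psnr(x, ref, peak=1.0):
    """Peak signal-to-noise ratio in dB for images in [0, peak]."""
    mse = np.mean((x - ref) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)
```

A uniform error of 0.1 on a unit-range image gives 20 dB, which puts the 24-33 dB range in the tables in perspective.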
Running time.
The proposed method is more efficient than other state-of-the-art methods. Table 5 summarizes the average run time of representative methods at different image resolutions. HL and EPLL run on an Intel Core i7 CPU, while MLP, CSF and our method run on an NVIDIA K40 GPU.
Figure 7: Visual comparisons. (a) Ground truth; (b) HL [12]; (c) EPLL [35]; (d) MLP [25]; (e) CSF [22]; (f) three-iteration network. PSNR/SSIM of (b)-(f): 1% noise: 30.45/0.84, 32.05/0.88, 31.20/0.85, 30.89/0.85, 32.06/0.88; 3% noise: 27.58/0.79, 29.34/0.84, 25.57/0.51, 28.66/0.75, 29.85/0.84; 5% noise: 24.16/0.71, 26.04/0.77, 23.91/0.48, 26.09/0.67, 27.30/0.69.
Table 2: Average PSNR/SSIM with 1% Gaussian noise, using ground-truth blur kernels and kernels estimated by Pan et al. [16].

method | ground truth | Pan [16]
HL [12] | 31.57/0.87 | 29.94/0.84
EPLL [35] | 33.00/0.89 | 30.61/0.87
MLP [25] | 31.82/0.86 | 28.76/0.80
CSF [22] | 31.93/0.87 | 30.22/0.86
1 iteration | 32.50/0.89 | 30.38/0.86
3 iterations | 32.82/0.90 | 30.39/0.87
Table 3: Average PSNR/SSIM with 3% Gaussian noise, using ground-truth blur kernels and kernels estimated by Pan et al. [16].

method | ground truth | Pan [16]
HL [12] | 27.42/0.73 | 26.91/0.72
EPLL [35] | 28.71/0.78 | 27.75/0.77
MLP [25] | 26.26/0.60 | 25.04/0.57
CSF [22] | 28.43/0.78 | 27.11/0.74
1 iteration | 28.71/0.77 | 27.34/0.75
3 iterations | 29.12/0.79 | 27.58/0.77
Table 4: Average PSNR/SSIM with 5% Gaussian noise, using ground-truth blur kernels and kernels estimated by Pan et al. [16].

method | ground truth | Pan [16]
HL [12] | 25.85/0.67 | 25.48/0.66
EPLL [35] | 27.00/0.71 | 26.24/0.71
MLP [25] | 24.62/0.51 | 22.32/0.45
CSF [22] | 26.92/0.67 | 24.86/0.65
1 iteration | 27.25/0.72 | 25.49/0.69
3 iterations | 27.66/0.74 | 25.77/0.72
Table 5: Average run time (seconds) at different image resolutions.

image size | HL [12] | EPLL [35] | MLP [25] | CSF [22] | ours
— | 0.31 | 209.58 | 0.80 | 0.08 | 0.02
— | 0.71 | 953.52 | 2.98 | 0.09 | 0.03
— | 2.11 | N/A | 6.73 | 0.33 | 0.06
5.3 Experiments on Real Blurry Images
We also test our three-iteration network on a real blurry image from [11]. We add 3% Gaussian noise to the original blurred image and use the network trained at this noise level for this experiment. We use [16] to estimate the blur kernel of the blurry image. Figure 8 shows that our three-iteration network performs comparably to EPLL, while CSF cannot remove all the noise, especially in flat regions, and the result of HL is too blurry.
6 Conclusions
We propose an efficient non-blind deconvolution algorithm based on a fully convolutional neural network (FCNN). The proposed method consists of a deconvolution part and a denoising part, where the denoising part is carried out by an FCNN whose learned features help the deconvolution. To remove noise and ringing artifacts while preserving image details, we cascade the FCNN for iterative deconvolution. Furthermore, we propose a hyper-parameter learning algorithm to improve the performance of image restoration. Our method performs favorably against state-of-the-art methods on both synthetic and real-world images. Importantly, it is much more efficient because the FCNN-based denoising runs in parallel on the GPU.
References
[1] P. Arbelaez, M. Maire, C. Fowlkes, and J. Malik. Contour detection and hierarchical image segmentation. TPAMI, 33(5):898–916, 2011.
[2] A. Buades, B. Coll, and J. Morel. A non-local algorithm for image denoising. In CVPR, pages 60–65, 2005.
[3] H. C. Burger, C. J. Schuler, and S. Harmeling. Image denoising: Can plain neural networks compete with BM3D? In CVPR, 2012.
[4] A. Chakrabarti. A neural approach to blind motion deblurring. In ECCV, 2016.
[5] C. Dong, C. C. Loy, K. He, and X. Tang. Learning a deep convolutional network for image super-resolution. In ECCV, 2014.
[6] D. Eigen, D. Krishnan, and R. Fergus. Restoring an image taken through a window covered with dirt or rain. In ICCV, 2013.
[7] R. Fergus, B. Singh, A. Hertzmann, S. T. Roweis, and W. T. Freeman. Removing camera shake from a single photograph. ACM TOG (Proc. SIGGRAPH), 25(3):787–794, 2006.
[8] V. Jain and S. Seung. Natural image denoising with convolutional networks. In NIPS, 2009.
[9] J. Kim, J. K. Lee, and K. M. Lee. Accurate image super-resolution using very deep convolutional networks. In CVPR, 2016.
[10] J. Kim, J. K. Lee, and K. M. Lee. Deeply-recursive convolutional network for image super-resolution. In CVPR, 2016.
[11] R. Köhler, M. Hirsch, B. Mohler, B. Schölkopf, and S. Harmeling. Recording and playback of camera shake: Benchmarking blind deconvolution with a real-world database. In ECCV, 2012.
[12] D. Krishnan and R. Fergus. Fast image deconvolution using hyper-Laplacian priors. In NIPS, 2009.
[13] A. Levin, Y. Weiss, F. Durand, and W. T. Freeman. Understanding and evaluating blind deconvolution algorithms. In CVPR, 2009.
[14] Y. Li, J.-B. Huang, N. Ahuja, and M.-H. Yang. Deep joint image filtering. In ECCV, 2016.
[15] S. Liu, J. Pan, and M.-H. Yang. Learning recursive filters for low-level vision via a hybrid neural network. In ECCV, 2016.
[16] J. Pan, Z. Lin, Z. Su, and M.-H. Yang. Robust kernel estimation with outliers handling for image deblurring. In CVPR, 2016.
[17] W. H. Richardson. Bayesian-based iterative method of image restoration. JOSA, 62(1):55–59, 1972.
[18] G. Riegler, D. Ferstl, M. Rüther, and H. Bischof. A deep primal-dual network for guided depth super-resolution. In BMVC, 2016.
[19] G. Riegler, M. Rüther, and H. Bischof. ATGV-Net: Accurate depth super-resolution. In ECCV, 2016.
[20] S. Roth and M. J. Black. Fields of experts: A framework for learning image priors. In CVPR, pages 860–867, 2005.
[21] U. Schmidt, J. Jancsary, S. Nowozin, S. Roth, and C. Rother. Cascades of regression tree fields for image restoration. TPAMI, 38(4):677–689, 2016.
[22] U. Schmidt and S. Roth. Shrinkage fields for effective image restoration. In CVPR, 2014.
[23] U. Schmidt, C. Rother, S. Nowozin, J. Jancsary, and S. Roth. Discriminative non-blind deblurring. In CVPR, 2013.
[24] U. Schmidt, K. Schelten, and S. Roth. Bayesian deblurring with integrated noise estimation. In CVPR, pages 2625–2632, 2011.

[25] C. J. Schuler, H. C. Burger, S. Harmeling, and B. Schölkopf. A machine learning approach for non-blind image deconvolution. In CVPR, 2013.
[26] L. Sun, S. Cho, J. Wang, and J. Hays. Edge-based blur kernel estimation using patch priors. In ICCP, 2013.
[27] L. Sun, S. Cho, J. Wang, and J. Hays. Good image priors for non-blind deconvolution: generic vs. specific. In ECCV, pages 231–246, 2014.
[28] A. Vedaldi and K. Lenc. MatConvNet: Convolutional neural networks for MATLAB. In ACM MM, 2015.
[29] Z. Wang, D. Liu, J. Yang, W. Han, and T. Huang. Deep networks for image super-resolution with sparse prior. In ICCV, 2015.

[30] N. Wiener. Extrapolation, Interpolation, and Smoothing of Stationary Time Series, volume 2. MIT Press, Cambridge, 1949.
[31] J. Xie, L. Xu, and E. Chen. Image denoising and inpainting with deep neural networks. In NIPS, 2012.
[32] L. Xu, J. S. Ren, C. Liu, and J. Jia. Deep convolutional neural network for image deconvolution. In NIPS, 2014.
[33] L. Xu, J. S. Ren, Q. Yan, R. Liao, and J. Jia. Deep edge-aware filters. In ICML, 2015.
[34] X. Yu and F. Porikli. Ultra-resolving face images by discriminative generative networks. In ECCV, 2016.
[35] D. Zoran and Y. Weiss. From learning models of natural image patches to whole image restoration. In ICCV, 2011.