Blur is a common artifact in images taken by hand-held cameras. It is mostly caused by object motion, camera shake, or defocus. A blurry image is often modeled as the convolution of a sharp image with a blur kernel, and the goal of deblurring is to restore the latent sharp image from the blurry one. Single image deblurring, however, is a highly ill-posed problem, since a single blurry image contains insufficient information to recover a unique sharp image.
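The convolutional blur model described above can be sketched in a few lines. This is a minimal NumPy illustration with toy values, not the paper's code; it applies a simple box kernel (for such symmetric kernels, correlation and convolution coincide):

```python
import numpy as np

def blur(sharp, kernel):
    """Synthesize a blurry image as the 2-D convolution of a sharp
    image with a blur kernel, using edge padding at the borders."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(sharp, ((ph, ph), (pw, pw)), mode="edge")
    out = np.zeros_like(sharp, dtype=float)
    # Accumulate shifted, weighted copies of the image (sliding window).
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + sharp.shape[0], j:j + sharp.shape[1]]
    return out

# A normalized 3x3 box kernel leaves a constant image unchanged.
sharp = np.full((5, 5), 0.8)
kernel = np.ones((3, 3)) / 9.0
blurry = blur(sharp, kernel)
```

Real motion-blur kernels are of course non-uniform and, in dynamic scenes, spatially varying, which is precisely what makes the inverse problem ill-posed.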
In the past few years, assorted constraints and regularization schemes have been proposed to exclude implausible solutions. Priors, such as the total variation prior , the sparse image prior , the heavy-tailed gradient prior  and the dark channel prior , are combined with norm-based image regularization terms to suppress ringing artifacts and improve quality. Zhen et al.  take advantage of inertial sensor data to gain extra information and estimate spatially varying blur kernels. However, since real-world blur is more complicated than the model, the estimated blur kernel is inaccurate, which causes ringing artifacts. Furthermore, these methods, based on iterative optimization techniques, are computationally intensive.
Recently, Convolutional Neural Networks (CNN) and related deep learning techniques have drawn great attention in computer vision and image processing, and their applications to image deblurring demonstrate promising results. Sun et al.  and Schuler et al.  use a CNN to estimate spatially-varying blur kernels and obtain the latent image through a traditional pipeline. Chakrabarti  trained a neural network to predict the complex Fourier coefficients of the motion kernel. More recently, kernel-free end-to-end deblurring methods were proposed by Nah et al.  and Kupyn et al. . Nah et al.  adopted a multi-scale network to mimic conventional coarse-to-fine optimization methods, and proposed a new realistic blurry image dataset with ground truth sharp images. The work of Kupyn et al.  trains the popular Generative Adversarial Network (GAN) on the same dataset with fewer parameters, gains higher PSNR values than Nah et al.  on the GOPRO dataset, and beats the others on the Köhler dataset  in terms of SSIM. Although  performs well on metric scores, visually its deblurred results suffer from grid artifacts, as illustrated in Fig. 1.
To address this artifact, we utilize the dark channel prior. The dark channel is defined as the minimal intensity among the three color channels of the pixels in a local area. It was first proposed by He et al.  for the dehazing problem, based on the statistic that haze-free outdoor images have a smaller dark channel than hazy images. Pan et al.  applied the dark channel prior to image deblurring. They theoretically and empirically proved that, compared with blurred images, the dark channel of a sharp image is sparser, and their results demonstrate that the dark channel prior contributes to suppressing ringing and other artifacts. In order to enforce this sparsity, they utilize an $\ell_0$-norm regularization term to count the nonzero elements of the dark channel maps. Unfortunately, the $\ell_0$ norm is not differentiable, which makes it hard to use in the back propagation of neural networks. Instead of the $\ell_0$ norm, we adopt the $\ell_1$ norm to directly compute the difference between the dark channel maps of the ground truth sharp images and the deblurred images.
In this paper, we present a GAN based image deblurring network that uses the dark channel difference as a loss function. The proposed technique is not just a straightforward application of GAN; it focuses on combining traditional knowledge with deep learning to make the network achieve better performance. Compared to the previous GAN-based deblurring network, the proposed network has fewer layers and weights. This leads to shorter training and testing times and, more importantly, achieves favorable results. In addition, the original GOPRO training dataset consists of artificially created blurry images without noise, which usually differ from real blurry images. To improve the quality of our trained network on more realistic blurry images and increase its robustness, we add random Gaussian noise with variance in a limited range to the training image patches. Comparison experiments show that our network outperforms Kupyn et al.  on both the GOPRO test dataset and real noisy blurry images.
2 Related Work
2.1 Conditional Generative Adversarial Networks
GAN was first proposed by Goodfellow et al.  to train a generative network in an adversarial process. It consists of two networks: a generator $G$ and a discriminator $D$. The generator produces a fake sample $G(z)$ from input noise $z$, while the discriminator estimates the probability that a sample comes from the training data rather than from the generator. These two networks are simultaneously trained until the discriminator cannot tell whether a sample is real or fake. This process can be summarized as a two-player min-max game with the following objective:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))] \quad (1)$$

where $p_{data}(x)$ denotes the distribution over the training data and $p_z(z)$ is the distribution of the input noise $z$. GAN has been applied to different image restoration problems like super-resolution  and texture transfer .
Mirza et al.  extend GAN into a conditional model, called Conditional Generative Adversarial Nets (CGAN), so that GAN can make use of auxiliary information $y$ to direct both the generator and the discriminator:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x \mid y)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z \mid y) \mid y))] \quad (2)$$

Isola et al.  adopt the CGAN architecture to achieve general image-to-image translation. In , the generator receives not just random noise $z$ but also a similar image $y$ as input, where the input and output images share part of their features. They can be, for example, a hazy and a clear image of the same scene, or buildings of different colors with the same structure. Based on the network architecture of , Kupyn et al.  utilize the Wasserstein loss  and a perceptual loss  to train a CGAN for the deblurring problem.
2.2 Dark Channel Prior
For an image $I$, the dark channel at pixel $x$ is defined by He et al.  as

$$D(I)(x) = \min_{y \in N(x)} \Big( \min_{c \in \{r, g, b\}} I^c(y) \Big) \quad (3)$$

where $x$ and $y$ are pixel locations, $N(x)$ denotes the image patch centered at $x$, and $I^c$ is the $c$-th color channel. As shown in eq. (3), the dark channel describes the minimum intensity within an image patch. He et al.  observe that the dark channel map of a haze-free image tends to be zero. Pan et al.  use the less restrictive assumption that the dark channel map of a sharp image is sparse rather than exactly zero. Inspired by this, they adopt an $\ell_0$-regularization term to enforce the sparsity of the dark channel during deblurring, where the $\ell_0$ norm counts the non-zero elements in a dark channel map.
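The per-patch minimum of eq. (3) can be illustrated with a small NumPy sketch. The patch size and test image below are arbitrary illustrative choices, not values from the paper:

```python
import numpy as np

def dark_channel(img, patch=3):
    """Dark channel of an RGB image of shape (H, W, 3): for each pixel,
    the minimum intensity over the three channels within a local patch."""
    per_pixel_min = img.min(axis=2)            # min over color channels
    p = patch // 2
    padded = np.pad(per_pixel_min, p, mode="edge")
    h, w = per_pixel_min.shape
    out = np.full((h, w), np.inf)
    # Sliding minimum over the patch via shifted views of the padded map.
    for i in range(patch):
        for j in range(patch):
            out = np.minimum(out, padded[i:i + h, j:j + w])
    return out

img = np.random.rand(8, 8, 3)
dc = dark_channel(img, patch=3)
```

By construction the result at each pixel is no larger than that pixel's own channel-wise minimum, which is why sharp natural images, with their dark local structures, yield sparse dark channel maps.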
3 Proposed Method
3.1 Network Architecture
The proposed network aims at obtaining a generator $G$ that restores a sharp image from an input blurry image. This generator is trained jointly with a discriminator $D$ using pairs of blurry images and ground truth sharp images. The structure is shown in Fig. 2. Except for the first layers of the discriminator and the generator, each block in both networks consists of a convolutional layer, a batch normalization step  and a LeakyReLU activation function  with a fixed leaking rate. The first layers are not normalized.
Generator The proposed generator adopts an encoder-decoder framework to achieve image-to-image mapping. Similar to , the encoder consists of a sequence of convolutional layers with fixed stride and kernel size, and the decoder has a chain of transposed-convolutional layers with the same stride and kernel size. The encoder represents the input image as a bottleneck vector, and the decoder recovers an image of the same size as the input from this bottleneck vector. A skip architecture is applied by concatenating the same-sized encoder layer after each layer of the decoder. These skip connections refine the details in the output image by combining deep, coarse, semantic information with shallow, fine, appearance information . Dropout is also included in the decoder to avoid over-fitting.
Discriminator The proposed discriminator contains a series of convolutional layers with fixed stride and kernel size. The output of the discriminator is a scalar, which is passed through a sigmoid function.
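The down-and-up-sampling arithmetic of such an encoder-decoder can be checked with a short sketch. The kernel size 4, stride 2, padding 1, eight encoder layers, and the 256x256 input are illustrative assumptions (the paper elides its exact values), chosen because they are common in pix2pix-style networks:

```python
def conv_out(size, kernel=4, stride=2, pad=1):
    """Spatial size after one strided convolution:
    floor((size + 2*pad - kernel) / stride) + 1."""
    return (size + 2 * pad - kernel) // stride + 1

size = 256
sizes = [size]
for _ in range(8):          # eight stride-2 encoder layers (assumed)
    size = conv_out(size)
    sizes.append(size)
# Each layer halves the spatial size: 256 -> 128 -> ... -> 1x1 bottleneck.
```

The decoder's transposed convolutions with the same stride and kernel size invert this sequence, and the skip connections pair each decoder layer with the encoder layer of matching spatial size.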
3.2 Loss Functions
According to eq. (2), we train the discriminator and the generator alternately. The loss function of the discriminator is the same as the adversarial loss:

$$\mathcal{L}_D = -\mathbb{E}_{x \sim p_{data}(x)}[\log D(x \mid y)] - \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z \mid y) \mid y))] \quad (4)$$

In the deblurring setting, $y$ and $x$ denote the blurry and sharp image, respectively. The generator loss is defined as a combination of the adversarial loss, the content loss and the dark channel loss:

$$\mathcal{L}_G = \mathcal{L}_{adv} + \lambda_1 \mathcal{L}_{content} + \lambda_2 \mathcal{L}_{dc} \quad (5)$$

where the weights $\lambda_1$ and $\lambda_2$ are fixed in our experiments.
Content loss We adopt a traditional content loss to steer the output of the generator toward the ground truth. Although both the $\ell_1$ and $\ell_2$ norms are commonly used, the $\ell_1$ norm is chosen since it produces less blurry results .
Dark channel loss In order to suppress ringing and grid artifacts, the dark channel prior is specifically chosen. Pan et al.  exploit the $\ell_0$ norm to count the non-zero elements in the dark channel map of an image. Since the $\ell_0$ norm is not differentiable, the $\ell_1$ norm is utilized instead, which computes the $\ell_1$ distance between the dark channel maps of the ground truth and the deblurred image.
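The dark channel loss can be sketched as follows. This is a NumPy-only illustration of the quantity being penalized; in actual training it would be computed on batches inside a differentiable framework, and the patch size here is an arbitrary choice:

```python
import numpy as np

def dark_channel(img, patch=3):
    """Per-pixel min over color channels, then min over a local patch."""
    m = img.min(axis=2)
    p = patch // 2
    padded = np.pad(m, p, mode="edge")
    h, w = m.shape
    out = np.full((h, w), np.inf)
    for i in range(patch):
        for j in range(patch):
            out = np.minimum(out, padded[i:i + h, j:j + w])
    return out

def dark_channel_loss(restored, sharp, patch=3):
    """Mean L1 distance between the dark channel maps of the restored
    image and the ground-truth sharp image."""
    return np.abs(dark_channel(restored, patch) - dark_channel(sharp, patch)).mean()

sharp = np.random.rand(8, 8, 3)
loss_same = dark_channel_loss(sharp, sharp)   # identical images -> 0
```

Unlike counting non-zero elements with the $\ell_0$ norm, this mean absolute difference varies smoothly with the restored image, which is what makes it usable in back propagation.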
Unlike , we discard the perceptual loss . Kupyn et al.  employ the difference of one feature map of VGG-19  between the ground truth and restored images as a perceptual loss. GAN is known for its ability to preserve the perceptual features of an image, so adding an extra perceptual loss appears redundant. Our experiments show that the perceptual loss does not improve the results; on the contrary, it leads to worse performance.
4 Experiments

4.1 Datasets

The GOPRO dataset  is utilized for training and testing our network. It contains 2103 pairs of blurry and ground truth images in the train set, and 1111 pairs in the test set. The resolution of the images is 720p. Each blurry image is generated by averaging a sequence (7-15) of consecutive sharp frames; the sharp frame in the middle of the sequence is regarded as the ground truth. The GOPRO dataset is regarded as a benchmark by many deblurring algorithms, such as  and . Although the GOPRO dataset is widely used, it only contains noise-free images. For natural images, however, noise always accompanies blur. To test our model on more realistic images, we add Gaussian noise to the original GOPRO_Large dataset and create a new GOPRO-noise dataset with 1111 image pairs. The synthetic dataset in  is also adopted for training. Following the combined version of DeblurGAN in , we use both the GOPRO train dataset and the synthetic dataset to train our network.
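The noise augmentation can be sketched in a few lines. The variance range below (`sigma_max`) is an illustrative assumption, since the paper only states that the variance is drawn from a limited range:

```python
import numpy as np

rng = np.random.default_rng(0)

def add_gaussian_noise(img, sigma_max=0.02):
    """Add zero-mean Gaussian noise with a randomly chosen standard
    deviation (drawn from a limited range) and clip back to [0, 1]."""
    sigma = rng.uniform(0.0, sigma_max)
    noisy = img + rng.normal(0.0, sigma, size=img.shape)
    return np.clip(noisy, 0.0, 1.0)

clean = np.full((16, 16, 3), 0.5)
noisy = add_gaussian_noise(clean)
```

Randomizing the noise level per patch, rather than fixing it, exposes the network to a spectrum of degradations and improves robustness on real noisy blurry images.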
4.2 Training Process
The proposed network is trained on an NVIDIA GeForce GTX 1080 Ti GPU and tested on a Mac Pro with a 2.7 GHz Intel Core i5 CPU. Similar to , each input training pair is randomly cropped to a fixed patch size. For each iteration of the optimization, 1 step is performed on the discriminator $D$, followed by 2 steps on the generator $G$, to prevent the discriminator loss from dropping to zero. The model is trained for 15 epochs within 2 days, compared with 200 epochs over 6 days in . Furthermore, despite the well-known instability of GAN training, our method converges to similar results for each and every training run, which demonstrates the robustness of our GAN architecture.
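The alternating 1:2 update schedule can be sketched as below. The `d_step` and `g_step` functions are hypothetical stand-ins for one optimization step on the discriminator and generator; a real implementation would update network weights there:

```python
# Hypothetical stand-ins for one optimization step on D and on G.
def d_step():
    return "D"

def g_step():
    return "G"

def train(iterations, d_steps=1, g_steps=2):
    """Alternating schedule: per iteration, one discriminator step
    followed by two generator steps, to keep the discriminator loss
    from collapsing to zero."""
    log = []
    for _ in range(iterations):
        for _ in range(d_steps):
            log.append(d_step())
        for _ in range(g_steps):
            log.append(g_step())
    return log

schedule = train(iterations=3)
# schedule -> ['D', 'G', 'G', 'D', 'G', 'G', 'D', 'G', 'G']
```

Giving the generator more steps per iteration keeps the two players roughly balanced: an overly strong discriminator drives its own loss to zero and stops providing useful gradients to the generator.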
4.3 Result and Comparison
Our test results are mainly compared with the state-of-the-art GAN based deblurring network DeblurGAN . DeblurGAN outperforms the deep learning networks of  and  on the GOPRO dataset. Since the authors posted the code online (https://github.com/KupynOrest/DeblurGAN), we compare our network with DeblurGAN by directly adopting the released network and its latest trained weights. We test our model on the GOPRO and GOPRO-noise test datasets.
Fig. 3 illustrates the deblurred results of  and of our model. The blurry image in the first row is taken from the GOPRO-noise dataset, while the one in the second row is a real natural image with motion blur taken by a camera. According to the local patches, although  can remove the blur, its results suffer from grid artifacts, while our model with the dark channel loss achieves sharper images without grid artifacts. Furthermore, for the motion-blurred image (second row), the sharp parts of the input image remain unchanged in our deblurred result, whereas extra grid artifacts are added to the result of .
The quantitative performance of the proposed network on the two datasets, GOPRO and GOPRO-noise, is shown in Tab. 1. In our experiment, the coefficient of the dark channel loss is fixed. The results are compared with the same network without the dark channel loss, the same network with an extra perceptual loss, as well as DeblurGAN . All test images are downsampled by a factor of two. The perceptual loss follows the definition in . The proposed model performs best among the comparisons on both the noise-free and the noisy dataset. DeblurGAN performs less well owing to its grid artifacts. The perceptual loss leads to a worse result: since GAN is already good at preserving perceptual features, the perceptual loss brings no extra constraints for the network. The comparison with the dc = 0 variant demonstrates that the dark channel loss contributes to a better result.
5 Conclusion

To address the deblurring problem with a CGAN based architecture and to tackle the grid artifacts of GAN based deblurring methods, this paper incorporates a dark channel prior. The dark channel prior is enforced through the $\ell_1$ norm rather than the $\ell_0$ norm in order to make it friendlier for network training. To validate the deblurring results on more natural images, a noise-augmented dataset is proposed. The proposed network demonstrates strong deblurring performance on both synthetic and real blurry images.
-  Nicolas Dey, Laure Blanc-Feraud, C. Zimmer, Z. Kam, J-C. Olivo-Marin, and J. Zerubia. A deconvolution method for confocal microscopy with total variation regularization. In Biomedical Imaging: Nano to Macro, 2004. IEEE International Symposium on, pg. 1223-1226, (2004).
-  Anat Levin, Rob Fergus, Fredo Durand, and William T. Freeman. Image and depth from a conventional camera with a coded aperture. In ACM Transactions on Graphics (TOG), volume 26, pg. 70, (2007).
-  Qi Shan, Jiaya Jia, and Aseem Agarwala. High-quality motion deblurring from a single image. In ACM Trans. Graph., volume 27, pg. 73, (2008).
-  Jinshan Pan, Deqing Sun, Hanspeter Pfister, and Ming-Hsuan Yang. Blind image deblurring using dark channel prior. In CVPR, pg. 1628-1636, (2016).
-  Ruiwen Zhen and Robert L. Stevenson. Multi-image motion deblurring aided by inertial sensors. Journal of Electronic Imaging, 25(1):013027, (2016).
-  Christian J. Schuler, Michael Hirsch, Stefan Harmeling, and Bernhard Schölkopf. Learning to deblur. IEEE Trans. Pattern Anal. Mach. Intell., 38(7):1439-1451, (2016).
-  Jian Sun, Wenfei Cao, Zongben Xu, and Jean Ponce. Learning a convolutional neural network for non-uniform motion blur removal. In CVPR, pg. 769-777, (2015).
-  Seungjun Nah, Tae Hyun Kim, and Kyoung Mu Lee. Deep multi-scale convolutional neural network for dynamic scene deblurring. In CVPR, pg. 3883-3891, (2017).
-  Orest Kupyn, Volodymyr Budzan, Mykola Mykhailych, Dmytro Mishkin, and Jiri Matas. DeblurGAN: Blind motion deblurring using conditional adversarial networks. In CVPR, (2018).
-  Rolf Köhler, Michael Hirsch, Betty Mohler, Bernhard Schölkopf, and Stefan Harmeling. Recording and playback of camera shake: Benchmarking blind deconvolution with a real-world database. In ECCV, pg. 27-40, (2012).
-  Kaiming He, Jian Sun, and Xiaoou Tang. Single image haze removal using dark channel prior. In CVPR, pg. 1956-1963, (2009).
-  Tae Hyun Kim, Seungjun Nah, and Kyoung Mu Lee. Dynamic scene deblurring using a locally adaptive linear blur model. arXiv preprint arXiv:1603.04265, (2016).
-  Ayan Chakrabarti. A neural approach to blind motion deblurring. In ECCV, pg. 221-235. Springer, (2016).
-  Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial nets. In NIPS, pg. 2672-2680, (2014).
-  Mehdi Mirza and Simon Osindero. Conditional generative adversarial nets. CoRR, abs/1411.1784, (2014).
-  Christian Ledig, Lucas Theis, Ferenc Huszar, Jose Caballero, Andrew Cunningham, Alejandro Acosta, Andrew Aitken, Alykhan Tejani, Johannes Totz, Zehan Wang, and Wenzhe Shi. Photo-realistic single image super-resolution using a generative adversarial network. arXiv preprint arXiv:1609.04802, (2016).
-  Chuan Li and Michael Wand. Precomputed real-time texture synthesis with markovian generative adversarial networks. In ECCV, (2016).
-  Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, and Alexei A. Efros. Image-to-image translation with conditional adversarial networks. CoRR, abs/1611.07004, (2016).
-  Martin Arjovsky, Soumith Chintala, and Léon Bottou. Wasserstein GAN. arXiv preprint arXiv:1701.07875, (2017).
-  Justin Johnson, Alexandre Alahi, and Li Fei-Fei. Perceptual losses for real-time style transfer and super-resolution. In ECCV, (2016).
-  Sergey Ioffe and Christian Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In ICML, pg. 448-456, (2015).
-  E. Shelhamer, J. Long, and T. Darrell. Fully convolutional networks for semantic segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(4):640-651, (2017).
-  Bing Xu, Naiyan Wang, Tianqi Chen, and Mu Li. Empirical evaluation of rectified activations in convolutional network. arXiv preprint arXiv:1505.00853, (2015).
-  Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv e-prints, (2014).
-  Martín Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, Manjunath Kudlur, Josh Levenberg, Rajat Monga, Sherry Moore, Derek Gordon Murray, Benoit Steiner, Paul A. Tucker, Vijay Vasudevan, Pete Warden, Martin Wicke, Yuan Yu, and Xiaoqiang Zhang. TensorFlow: A system for large-scale machine learning. CoRR, abs/1605.08695, (2016).