Image restoration is a long-studied and challenging problem that aims to restore a degraded image to its original form. One way to model the process of image degradation is translation-invariant convolution

y = k ∗ x + n,  (1)

where x is the original image, k is the convolution kernel, n is the additive noise, y is the degraded image, and C denotes the number of channels in the images (C = 1 for greyscale images and C = 3 for color images). Image deconvolution is the process of recovering the original image x from the observed degraded image y, i.e. the inverse process of convolutional image degradation. This work focuses on image deconvolution in two different settings: kernel-known and kernel-unknown (a.k.a. blind deconvolution).
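The degradation model of Equation 1 can be sketched concretely in NumPy; the function name `degrade`, the single-channel restriction, and the reflexive (symmetric) padding are our illustrative choices, not taken from the paper:

```python
import numpy as np

def degrade(x, k, sigma, rng=None):
    """Toy single-channel version of y = k * x + n: translation-invariant
    convolution with kernel k plus i.i.d. Gaussian noise of strength sigma.
    Reflexive (symmetric) padding keeps the output the same size as x."""
    rng = np.random.default_rng(0) if rng is None else rng
    H, W = x.shape
    kh, kw = k.shape
    kf = k[::-1, ::-1]  # flip the kernel: convolution, not correlation
    xp = np.pad(x, ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="symmetric")
    y = np.empty_like(x)
    for i in range(H):
        for j in range(W):
            y[i, j] = np.sum(xp[i:i + kh, j:j + kw] * kf)
    return y + sigma * rng.standard_normal((H, W))
```

A mean-filter kernel applied to a constant image, for instance, returns the same constant image (plus noise).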
Kernel-known: The preliminary stage of image deconvolution mainly considers the case where the convolution kernel is given, i.e. recovering x in Equation 1 with k known. This problem is ill-posed: simply applying the inverse of the convolution operation to the degraded image y with kernel k, i.e. k⁻¹ ∗ y, gives an inverted noise term k⁻¹ ∗ n, which dominates the solution.
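To see why the inverted noise term dominates, a 1-D toy sketch (our construction; circular convolution via the FFT, rather than the paper's boundary handling) applies a naive inverse filter to a noisy blurred signal:

```python
import numpy as np

# Naive inverse filtering divides by the kernel's spectrum, which amplifies
# noise wherever that spectrum is small. 1-D circular-convolution sketch.
rng = np.random.default_rng(0)
n = 64
x = np.sin(np.linspace(0.0, 4.0 * np.pi, n))   # toy "original image"
k = np.zeros(n)
k[:5] = 1.0 / 5.0                               # box blur kernel
K = np.fft.fft(k)                               # kernel spectrum
noise = 0.01 * rng.standard_normal(n)
y_clean = np.real(np.fft.ifft(np.fft.fft(x) * K))
y = y_clean + noise                             # degraded observation
x_hat = np.real(np.fft.ifft(np.fft.fft(y) / K))  # naive inverse filter
# x_hat = x + F^{-1}(F(noise) / K): the inverted noise term dominates
err = np.linalg.norm(x_hat - x)
```

Since the kernel sums to one, its spectrum has magnitude at most one everywhere, so the reconstruction error of the naive inverse is always at least as large as the noise itself, and typically far larger near the kernel's spectral zeros.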
Blind deconvolution: In reality, detailed kernel information is rarely available, and the deconvolution problem is formulated in a blind setting. More precisely, blind deconvolution recovers x without knowing k. This task is much more challenging than its non-blind counterpart, because less information is observed and the domains of the variables are larger.
In image deconvolution, prior information on unknown images and kernels (in blind settings) can significantly improve the deconvolved results. A traditional representation for such prior information is handcrafted regularizers in image energy minimization, e.g. total variation (TV) regularization for image sharpness and ℓ1 regularization for kernel sparsity [38, 43]. However, prior representations like the above-mentioned regularizers have limited expressiveness. Therefore, this work aims to find better prior representations of images and kernels to improve deconvolution performance.
Deep neural architectures have a strong capability to accommodate and express information because of their intricate and flexible structure. Compared to image prior representations with limited structures (e.g. regularizers), neural nets with such powerful expressiveness seem more capable of capturing higher-level priors of natural images and degradation kernels. Deep image prior (DIP) is a neural-based image prior representation which has achieved good performance on various image restoration problems. The main idea of DIP is to substitute the image variable in an energy function with the output of a deep convolutional neural net (ConvNet) fed with random noise, so that the image prior is captured by the hyperparameters of the ConvNet, and the output image is determined by the parameters of the ConvNet. One point to emphasize is that priors expressed by both handcrafted regularizers and DIP are embodied in their own formulations or structures, so neither requires large datasets for training. In the existing applications of DIP (incl. denoising, inpainting, etc.), the degradation processes are considered known. In this paper, we are the first to show that deep priors perform well in image deconvolution. Furthermore, we show that ConvNets can be utilized as a source of prior knowledge not only for natural images but also for degradation kernels (named deep kernel prior, DKP), bridging the gap between traditional methods and deep neural nets. Through experiments we demonstrate that our deep image and kernel priors (DIKP) yield a significant improvement over traditional learning-free regularization-based priors in image deconvolution. (We do not show any results from supervised deep network techniques, because our method is unsupervised and our objective is to show that our deep priors outperform handcrafted priors in image deconvolution.)
2 Related work
The earliest traditional methods of image deconvolution include the Richardson-Lucy (RL) method and Wiener filtering. Due to their simplicity and efficiency, these two methods are still widely used today, but they may be subject to ringing artifacts. To address this, many refinements based on handcrafted regularization priors have been proposed: the TV regularizer was adopted as a prior in kernel-known deconvolution, and a progressive multi-scale optimization method based on the RL method was proposed, with edge-preserving regularization as the image prior. For degradation kernels, early methods only dealt with their simple parametric forms. Later, natural image statistics were used to estimate kernels [11, 26]. After that, [38, 43] adopted the ℓ1 regularizer as a kernel prior in blind deconvolution. However, the handcrafted priors mentioned above have relatively simple structures, so their expressiveness is rather limited.
This work is inspired by traditional image deconvolution methods with handcrafted priors [36, 43], but tries to use deep image priors instead. It uses ConvNets to express the prior information of both natural images and degradation kernels, putting kernel-known and blind deconvolution under the same model. Besides, its ConvNet-based image prior representation links two sets of popular deconvolution methods: learning-based approaches with ConvNets [46, 49, 28] and learning-free approaches with handcrafted priors.
3 Data set and evaluation metrics
As discussed in section 1, capturing image priors by either regularization or deep neural net structures is learning-free. Therefore, the data set explored in this work is used only for testing. Experiments and performance evaluation are conducted on a data set of standard test images, shown in Figure 2. These images, along with the preprocessing and evaluation described below, are in line with standard practice and widely used in denoising, TV deblurring, etc., which guarantees the reliability of our results.
3.1 Observed data generation and kernels
To preprocess the image data and obtain degraded observations, we use the degradation model of Equation 1 to transfer each original standard test image to an observed image, as illustrated by the diagram in Figure 1. The noise matrix is i.i.d. Gaussian in each entry, and the noise strength (i.e. standard deviation) is fixed to reduce experimental variables. To explore different kinds of degradation models, three common kernels for different kinds of degradation, the Gaussian kernel, defocus and motion blur, are used to generate the data set.
Gaussian: The kernel for degradation caused by atmospheric turbulence can be described as a two-dimensional Gaussian function [19, 33], and the entries of the unscaled kernel are given by the formula

k[i, j] = exp(−((i − c₁)² + (j − c₂)²) / (2σ²)),

where (c₁, c₂) is the center of the kernel and σ determines its width (i.e. the standard deviation of the Gaussian). In this work, the kernel size and σ are fixed.
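A sketch of such a kernel generator (the normalization to unit sum and the parameter names are our additions; the formula above leaves the kernel unscaled):

```python
import numpy as np

def gaussian_kernel(size, sigma):
    """Square 2-D Gaussian kernel with entries exp(-((i-c)^2+(j-c)^2)/(2 sigma^2)),
    centered in the grid and normalized to sum to one."""
    c = (size - 1) / 2.0                    # kernel center (c1 = c2 = c)
    i, j = np.mgrid[0:size, 0:size]
    k = np.exp(-((i - c) ** 2 + (j - c) ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()
```

The result is symmetric and peaks at the center, with σ controlling how quickly the weights decay.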
Defocus: Out-of-focus blur is another issue in optical imaging. Knowledge of the physical process that causes out-of-focus blur provides an explicit formulation of the kernel

k[i, j] = 1 / (πr²) if (i − c₁)² + (j − c₂)² ≤ r², and 0 otherwise,

where r denotes the radius of the kernel, which is fixed in this work.
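The defocus kernel is thus a uniform disk ("pillbox"); a minimal sketch with the radius as a free parameter:

```python
import numpy as np

def defocus_kernel(radius):
    """Pillbox kernel: constant inside a disk of the given radius, zero outside,
    normalized to sum to one."""
    size = 2 * radius + 1
    c = radius                              # center of the grid
    i, j = np.mgrid[0:size, 0:size]
    inside = ((i - c) ** 2 + (j - c) ** 2) <= radius ** 2
    k = inside.astype(float)
    return k / k.sum()
```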
Motion blur: This happens when the recorded scene changes during a single exposure, e.g. when photographed objects move at high speed or the lens shakes while a picture is taken. In the noiseless case, the convolution process of motion blur with amplitude d and shifting angle θ corresponds to a kernel whose support is a uniformly weighted line segment of length d at angle θ, as Figure 3 shows. In this work, the blur amplitude d and shifting angle θ are fixed.
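A discretized line-segment kernel can be sketched as follows; the rounding-to-pixels scheme is one possible choice of ours, not the paper's exact construction:

```python
import numpy as np

def motion_kernel(amplitude, angle_deg):
    """Line-segment kernel for linear motion blur: 'amplitude' unit weights are
    placed along a centered line at 'angle_deg', then normalized to sum to one."""
    size = amplitude + (amplitude + 1) % 2   # odd size so the segment is centered
    k = np.zeros((size, size))
    c = size // 2
    t = np.arange(amplitude) - (amplitude - 1) / 2.0
    rows = np.rint(c - t * np.sin(np.deg2rad(angle_deg))).astype(int)
    cols = np.rint(c + t * np.cos(np.deg2rad(angle_deg))).astype(int)
    np.add.at(k, (rows, cols), 1.0)          # accumulate duplicates correctly
    return k / k.sum()
```

For angle 0 this yields a horizontal line through the kernel center; oblique angles are rasterized by rounding each sample to the nearest pixel.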
3.2 Evaluation metrics
We use the Mean Square Error (MSE) between the degraded image variable and the observation,

E(x; y) = ‖k ∗ x − y‖²,

to measure the energy function and to track parameter iterations in the first experiment (see subsection 5.2). Using this metric, minimizing the energy means finding the image that, when degraded, is the same as the observation y. To evaluate deconvolution quality, we use the Peak Signal-to-Noise Ratio (PSNR),

PSNR = 10 log₁₀(MAX² / MSE),

where MAX is the maximum possible pixel value of the image, e.g. MAX = 1 if images are in double-precision floating-point data type with values in [0, 1], or MAX = 2ᵇ − 1 if in b-bit integer data type. In this work, we use the double-precision floating-point data type, i.e. MAX = 1.
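Both metrics are a few lines each; `max_val = 1.0` matches the double-precision [0, 1] convention used here:

```python
import numpy as np

def mse(a, b):
    """Mean square error between two images of equal shape."""
    return float(np.mean((a - b) ** 2))

def psnr(x_hat, x, max_val=1.0):
    """PSNR = 10 log10(MAX^2 / MSE); max_val = 1.0 for double images in [0, 1]."""
    return 10.0 * np.log10(max_val ** 2 / mse(x_hat, x))
```

For example, a uniform error of 0.1 on a [0, 1] image gives an MSE of 0.01 and hence a PSNR of 20 dB.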
In subsection 5.3, we compare the gradient distributions of output images and standard test images. To measure the similarity between a gradient frequency distribution P and the distribution Q given by the standard test images, we use the Kullback-Leibler (KL) divergence

D_KL(P ‖ Q) = Σ_{b∈B} P(b) log(P(b) / Q(b)),

where b denotes a bin corresponding to a range of gradient values, and B is the whole bin set covering all possible gradient values. By definition, the similarity between two distributions and their KL divergence are negatively correlated.
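A histogram-based KL divergence can be sketched as follows; the `eps` guard for empty bins is a practical addition of ours, not part of the definition:

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """D_KL(P || Q) = sum_b P(b) log(P(b)/Q(b)) over a shared set of bins.
    eps avoids division by zero and log(0) for empty bins."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()         # renormalize after smoothing
    return float(np.sum(p * np.log(p / q)))
```

Note that the divergence is zero only for identical distributions and is not symmetric in its arguments.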
4 Methods

Image restoration is commonly cast as an energy minimization problem

min_x E(x; y) + R(x),  (2)

where E(x; y) indicates the energy term associated with the data, and R(x) is the prior term. A general explanation of the energy term is the numerical difference between the given image data and the image variable processed by the given degradation. For image deconvolution, the degradation operator is convolution, therefore the energy is designed as E(x; y) = ‖k ∗ x − y‖². The energy term can also be designed for other tasks in image restoration, such as inpainting, super-resolution and image denoising. The methods adopted in this work are all based on this deconvolution energy model and its variants.
4.1 Baseline models with regularization prior
The gradient magnitude of a two-dimensional function f(u, v) is defined and formulated as

|∇f| = √((∂f/∂u)² + (∂f/∂v)²),

the discrete formulation of which for an image x is given by the matrix

|∇x| = √((D_h x)² + (D_v x)²),

where the square and the square root calculations are entry-wise, and D is the discrete partial derivative operator (see [16, Chap. 7] and [3, Sec. 2] for its formal definition and its specific usage in this paper, respectively).

In image processing, discrete gradient magnitudes are proven to be a strong prior for natural images [38, 16]. The sum of such magnitudes over a single image is a regularization representation of the image prior, i.e. the total variation norm

‖x‖_TV = Σ_{i,j} |∇x|[i, j].
It is also known that the ℓ1 norm is capable of expressing the sparsity of matrices, defined as

‖k‖₁ = Σ_{i,j} |k[i, j]|.

In most instances, degradation convolution kernels are sparse. Thus ℓ1 sparsity regularization is a strong prior for convolution kernels in blind settings.
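Both priors are cheap to compute; a minimal sketch using forward differences with zeros at the border (one common discretization of the operator D):

```python
import numpy as np

def tv_norm(x):
    """Isotropic total variation: sum of discrete gradient magnitudes
    sqrt((D_h x)^2 + (D_v x)^2), with forward differences and zero padding."""
    dh = np.zeros_like(x)
    dv = np.zeros_like(x)
    dh[:, :-1] = x[:, 1:] - x[:, :-1]   # horizontal differences
    dv[:-1, :] = x[1:, :] - x[:-1, :]   # vertical differences
    return float(np.sum(np.sqrt(dh ** 2 + dv ** 2)))

def l1_norm(k):
    """Entry-wise l1 norm, used as a sparsity prior on kernels."""
    return float(np.sum(np.abs(k)))
```

A constant image has zero total variation, while each unit step between neighboring pixels contributes one unit to it.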
The baseline models in this work are energy minimizations with TV and ℓ1 regularization priors, detailed for the two main settings as follows.

Kernel-known: The baseline model with known k is formulated as the following energy minimization with a TV regularization prior

min_x ‖k ∗ x − y‖² + λ‖x‖_TV,

where λ is the TV regularization parameter. To solve the TV regularization system efficiently, we adopt a fast gradient-based algorithm named MFISTA, which has shown remarkable time-efficiency and convergence properties in TV regularization.
Blind deconvolution: The baseline system in the blind setting introduces an additional sparsity prior compared to the non-blind baseline above, and is formulated as

min_{x,k} ‖k ∗ x − y‖² + λ‖x‖_TV + μ‖k‖₁,

where μ is the ℓ1 regularization parameter. This TV-ℓ1 double-prior system can be solved using the TNIP-MFISTA algorithm. To optimize both the image and the kernel, this algorithm alternates fix-update iterations between MFISTA and an ℓ1 regularization algorithm named the Truncated Newton Interior Point method (TNIP).
4.2 Deconvolution with DIKP
DIKP aim to capture the priors of images and kernels through the structures of generative deep neural nets. Taking the image variable x as an example, it re-parameterises the image as the neural net output f_θ(z), defined via the surjection

f : Z × Θ → X,

where Z is the support of the input noise probability density function, Θ denotes the weight space determined by the network structure, and X is the solution space of images, containing the prior information. The neural net maps the random noise network input z and the network weights θ to the output image f_θ(z). Ideally, by adjusting the network structure to its optimum, the solution space only contains images matching the desired prior information.

From a mechanical perspective, the desired prior is expressed by the network structure, and the weights explore solutions on the prior. The random input z is a high-dimensional Gaussian. The main reason to take random noise as the network input is to increase robustness and overcome degeneracy issues. On the other hand, high-dimensional Gaussian vectors are essentially concentrated uniformly in a sphere. Therefore the input space can be approximated as a single point, and the surjection can be rewritten with the input space eliminated,

f : Θ → X,

which maps only a selection of network parameters θ to an output image f_θ. In the rest of the report, f_θ denotes the output image of the deep image prior with weights θ.
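The concentration claim is easy to check numerically: the relative spread of the norms of Gaussian samples shrinks as the dimension grows, so in high dimensions the inputs all lie close to one sphere.

```python
import numpy as np

# High-dimensional Gaussian vectors concentrate near a sphere of radius
# sqrt(d): the coefficient of variation of their norms shrinks with d.
rng = np.random.default_rng(0)
norms_low = np.linalg.norm(rng.standard_normal((1000, 4)), axis=1)
norms_high = np.linalg.norm(rng.standard_normal((1000, 4096)), axis=1)
spread_low = norms_low.std() / norms_low.mean()
spread_high = norms_high.std() / norms_high.mean()
```

For dimension 4096 the relative spread is on the order of one percent, which is what justifies treating the input space as approximately a single point.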
4.2.1 Energy functions of DIKP deconvolution
Traditional energy minimization (Equation 2) for image deconvolution explores the whole image space as the domain. By re-parameterising the image term as the neural net output f_θ, the solution space contains the prior information expressed by the structure of f, instead of the prior term R(x). Thereby, with a deep image prior, the general energy model of Equation 2 turns into

min_θ E(f_θ; y).  (5)

By optimizing the network weights θ on an ideal structure, an image is optimized conditioned on the desired prior.
Kernel-known: The image deconvolution objective with deep image prior is derived directly from Equation 5, by applying the deconvolution energy function

min_θ ‖k ∗ f_θ − y‖²,  (6)

where k is the observed kernel. The minimizer θ* is obtained with the Adam optimizer from random initialization.
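For intuition about the energy being minimized, the following toy sketch performs plain gradient descent on E(x) = ‖k ∗ x − y‖² directly in the Fourier domain (our simplification: the paper instead runs Adam on ConvNet weights, with reflexive rather than circular boundaries):

```python
import numpy as np

def deconv_gd(y, K, steps=500, lr=1.0):
    """Minimize E(x) = ||k * x - y||^2 by gradient descent, with circular
    convolution diagonalized by the 2-D FFT (K is the kernel's spectrum).
    The Fourier-domain gradient of the energy is conj(K) (K X - Y)."""
    Y = np.fft.fft2(y)
    X = Y.copy()                            # initialize with the observation
    for _ in range(steps):
        X = X - lr * np.conj(K) * (K * X - Y)
    return np.real(np.fft.ifft2(X))
```

With a well-conditioned kernel and no noise, this converges to the original image; with near-zero kernel frequencies or added noise it degenerates toward the unstable inverse filter, which is exactly why a prior term, or a DIKP re-parameterisation, is needed.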
Blind deconvolution: In blind settings, the convolution kernel is assumed unobservable. The kernel is therefore parameterised by another deep neural net g_φ whose structure contains prior information about degradation kernels. After parameterising the kernel matrix in Equation 6, the blind deconvolution objective with deep image and kernel priors is formulated as the system

min_{θ,φ} ‖g_φ ∗ f_θ − y‖²,

where f and g have different ConvNet structures, since the prior information of natural images and of degradation kernels is clearly different. To obtain the minimizers θ* and φ*, we use Adam to update the two variables simultaneously.
5 Experiments

To explore to what extent deep priors can capture prior knowledge of natural images in deconvolution models, we compare the energy convergence during DIKP deconvolution optimization between natural images and noise images, and we compare the gradient distributions among standard test images and images from both the baseline model and DIKP. This part of the experiments evaluates DIKP's expressiveness on natural images, and is therefore conducted only in the kernel-known setting, i.e. with DKP deactivated. The second part of our experiments aims to find out whether the proposed DIKP deconvolution models improve image deconvolution performance in both the kernel-known and blind settings, compared with the baselines. In our results, a PSNR comparison provides quantitative analysis of deconvolution performance, and qualitative analysis is based on the presented images.
5.1 Experiment Setup
Convolution: Convolution processes in this paper, including data generation and energy calculations, are subject to reflexive boundary conditions. Specifically, for color images, all channels share the same kernel.
Baseline: In the kernel-known setting, the TV regularization parameter λ is set within a reasonable range for image deconvolution. In the blind setting, the regularization parameters λ and μ follow the setting that achieved the best results in earlier experiments.
ConvNet architecture as DIKP: Following the architecture suggested for the super-resolution setting, we use the hourglass architecture (shown in Figure 5) as the main body of DIKP. For images, the network applies a Sigmoid to the output; for kernels (in the blind setting), a smaller network applies a Softmax to the output. We put Sigmoid and Softmax on the ConvNet outputs for images and kernels respectively, because image pixels range over [0, 1] and kernel entries sum to 1. The upsample stride for kernel generation is kept small to prevent degeneration due to the kernels' small size. It is worth mentioning that we apply add-noise regularization to the neural network, i.e. we disturb the noise input with additive Gaussian noise at the beginning of each iteration. This technique aims to increase the model's robustness to perturbation. Although this regularization has a negative impact on the optimization process, we find that the network can still converge the energy with a sufficient number of iterations and improve deconvolution performance.
5.2 Bias in convergence
Even though the complex structure of the neural network in a DIKP model allows the solution space to have a variety of features of natural images, it is still possible for the DIKP model to express interference information other than natural images, e.g. noise. Therefore, we introduce noise into our experiments, running our DIKP kernel-known model on natural images (incl. greyscale and color images) and on noise respectively. By comparing the convergence of the energy functions on the two during optimization, we can determine whether our model blocks such interference information in its solution space.
In our control experiment, we use Gaussian white noise and uniform noise, generated from a Gaussian and a uniform distribution respectively. Figure 6 shows the optimization curves of energy values with respect to iterations in DIKP kernel-known deconvolution, where each plot corresponds to one degradation kernel. Except for the Gaussian kernel, the energy convergence shows obvious differences between natural images and noise in DIKP deconvolution with the defocus and motion blur kernels. More specifically, we observe that the curves for noise lie clearly above those for natural images, and sudden leaps take place in the energy values for noise in both plots. We speculate that the cause of this observation is that the ConvNet structures in DIKP are unstable to parameter fluctuations when generating noise, which also explains how DIKP deconvolution blocks noise information. For the Gaussian kernel, although Figure 6 shows no marked difference between noise and natural images, in Figure 7 we can still observe that the energy value for the uniform noise converges more slowly than that for natural images in early iterations, which also indicates that the DIKP model blocks uniform noise in Gaussian-degraded deconvolution.
The DIKP deconvolution control experiments with noise indeed show a bias toward natural images in terms of energy function convergence, which means that in most cases, DIKP are capable of blocking interference and irrelevant information in image deconvolution.
5.3 Image gradient distributions
Previous image statistics studies [44, 34] have shown that natural image gradients follow heavy-tailed distributions, which provide a natural prior for natural images. Starting from this, we evaluate the gradient distributions of our model-generated images against a "standard" distribution which can be assumed to be the natural prior.
With the notation of subsection 4.1, the gradients of an image x can be defined as the matrices D_h x (horizontal) and D_v x (vertical), where each element is a gradient value. In this experiment, we calculate the image gradient value distributions over three image sets: the standard test images, images from the baseline model, and images from the DIKP model. The probability distribution estimated from frequencies for each set is denoted by P_std, P_base and P_DIKP respectively, where P_std is assumed to be the "standard" distribution. Therefore, between the distributions of the model-generated image sets, the one with greater similarity to the "standard" distribution is more in line with the natural prior.
Since image gradient values are continuous due to their double-precision floating-point data type, we split the range of gradient values into disjoint bins and count the number of gradient values that fall into each bin as the frequency. Figure 8 plots the log-probability distribution for each image set. Since the plot is in log scale, we can infer that all three distributions have the heavy-tailed property, and their log-probability curves are similar in shape. The close-up of the peak of the distribution, where the gradient values lie around zero, shows a decreasing order of baseline, DIKP, standard in terms of log-probability. This shows that the density of the baseline and DIKP models near zero gradients is larger than that of the standard images, and moreover that the DIKP model performs closer to the standard than the baseline in this range. However, the close-up between the peak and the tail gives an order of standard, baseline, DIKP, the exact opposite of the peak-range results. These results are as expected, because the TV regularizer in the baseline tends to reduce image gradient values due to the property of the TV norm, thereby giving high frequency where gradients are close to zero and low frequency outside the peak range, which also illustrates DIKP's better performance on larger gradient values.
Overall, the KL divergence between the gradient distributions of DIKP-generated images and the standard test images is smaller than that of the baseline, indicating that DIKP have a greater similarity to the "standard" in terms of gradient distribution. The result is foreseeable: although the baseline is closer to the standard than DIKP in the middle range, DIKP are closer to the standard in the peak, which carries much higher frequency.
5.4 Performance on deconvolution
We run our baselines and DIKP models on the degraded images (every degradation kernel applied to every standard test image) in both the kernel-known and blind settings. We then compute the PSNR between the generated results and the original standard test images, and visualize some of the results for quantitative and qualitative comparison respectively.
Table 1 and Table 2 show PSNR comparisons between the baseline and deep priors for kernel-known and blind deconvolution respectively. Overall, our DIKP deconvolution models consistently outperform the baseline models in terms of average PSNR across the different degradation kernels. In the kernel-known setting, DIKP even give a larger PSNR value on every single degraded image. In particular, with the motion blur kernel, the baseline gives unexpectedly bad results, as shown by the PSNR values marked in red in Table 1. We suspect this is because the TV regularizer overfits the gradient prior in motion deblurring, so that the non-edge regions of the image tend toward the same pixel value (see Figure 9). When the kernel is Gaussian or defocus, the PSNR performance improves as we expect. In the blind setting, DIKP improve the PSNR performance significantly beyond the baseline. However, the baseline gives higher PSNR values than the deep image prior for a few images and kernel types, such as Lena degraded by Gaussian or defocus. A possible reason is that the gradient values in Lena are relatively small, so that TV regularization gives better results on this specific image.
Figure 10 visualizes the comparison between images restored from Gaussian-degraded Lena and defocused house.c in the kernel-known setting. From the pictures and their close-ups, we see that DIKP perform better in detail recovery. For example, the hair restored by the baseline in Figure 10 has only a clear outline, while the details shown by DIKP in Figure 10 are more abundant, as are the trees in Figure 10 compared with Figure 10. One possible explanation is that the TV regularizer over-optimizes the sharpness of images, resulting in good performance only on outlines but not in detail.
Beyond the two kernels above, DIKP achieve especially remarkable results in motion blur deconvolution. Figure 9 visualizes the comparison between images restored from motion-blurred C.man in both settings. As mentioned previously, the kernel-known baseline gives an unsatisfactory result (Figure 9), where only the basic outline of the cameraman can be observed and all other details are lost, while kernel-known DIKP restore the image almost perfectly, as shown in Figure 9. For blind motion deblurring on C.man, the result given by the baseline (Figure 9) still shows motion blur, and the shape of its estimated kernel is completely different from the motion blur kernel, while DIKP remove the motion blur effectively and the shape of their estimated kernel is much closer to the motion blur kernel (see Figure 9), which also verifies ConvNets' expressiveness on degradation kernels.
6 Conclusion

We investigate deep ConvNets' expressiveness on the prior information of natural images and degradation kernels in DIKP image deconvolution, and present its performance in both kernel-known and blind settings. More importantly, we propose DIKP-based energy minimization pipelines for image deconvolution in the two settings, and achieve performance far beyond our baselines [2, 43]. Our motivation is to adopt DIKP, with their more complex structures, to express image prior information within the framework of traditional learning-free optimization methods, and at the same time to improve the deconvolution performance of such methods. Through the first two experiments, we show that the ConvNet structures of DIKP capture strong prior information on natural images in terms of convergence bias and gradient distributions. In the final experiment, we show the significant improvement of DIKP models over the baselines in terms of both PSNR values and visual quality, especially for motion-blurred images. However, we verify DIKP's expressiveness on degradation kernels only with an adjusted hourglass structure, and it is hard to associate kernel features with deep neural structures intuitively. Therefore, future work on this topic should focus on the structures of the DIKP networks that generate kernels, trying other hyperparameters for the hourglass architecture or other ConvNet structures, e.g. texture nets. Besides, the formulation of the energy functions may be adjusted with gradient terms to become more suitable for this task.
Acknowledgements. We thank Yusheng Tian for helpful changes and Prof. Steve Renals for organizing this project.
- (2014) A study on the importance of image processing and its applications. IJRET: International Journal of Research in Engineering and Technology 3.
- (2009) Fast gradient-based algorithms for constrained total variation image denoising and deblurring problems. IEEE Transactions on Image Processing 18 (11), pp. 2419–2434.
- (2004) An algorithm for total variation minimization and applications. Journal of Mathematical Imaging and Vision 20 (1-2), pp. 89–97.
- Recent developments in total variation image restoration. Mathematical Models of Computer Vision 17 (2).
- (1998) Total variation blind deconvolution. IEEE Transactions on Image Processing 7 (3), pp. 370–375.
- (2000) High-order total variation-based image restoration. SIAM Journal on Scientific Computing 22 (2), pp. 503–516.
- (2007) Video denoising by sparse 3D transform-domain collaborative filtering. In 2007 15th European Signal Processing Conference, pp. 145–149.
- (2006) Richardson–Lucy algorithm with total variation regularization for 3D confocal microscope deconvolution. Microscopy Research and Technique 69 (4), pp. 260–266.
- (1986) A note on the gradient of a multi-image. Computer Vision, Graphics, and Image Processing 33 (1), pp. 116–125.
- (1996) Recovery of blocky images from noisy and blurred data. SIAM Journal on Applied Mathematics 56 (4), pp. 1181–1198.
- (2006) Removing camera shake from a single photograph. In ACM Transactions on Graphics (TOG), Vol. 25, pp. 787–794.
- (2010) A convergent overlapping domain decomposition method for total variation minimization. Numerische Mathematik 116 (4), pp. 645–685.
- (2001) The elements of statistical learning. Vol. 1, Springer Series in Statistics, New York, NY, USA.
- (1974) Super-resolution through error energy reduction. Optica Acta: International Journal of Optics 21 (9), pp. 709–720.
- (1977) Digital image processing. Reading, Mass., Addison-Wesley Publishing Co., Inc.
- (2006) Deblurring images: matrices, spectra, and filtering. Vol. 3, SIAM.
- (1987) Deblurring Gaussian blur. Computer Vision, Graphics, and Image Processing 38 (1), pp. 66–80.
- (2008) Scope of validity of PSNR in image/video quality assessment. Electronics Letters 44 (13), pp. 800–801.
- (1989) Fundamentals of digital image processing. Englewood Cliffs, NJ: Prentice Hall.
- (2006) High dimensional statistical inference and random matrices. arXiv preprint math/0611589.
- (2009) DCT-based local motion blur detection. In International Conference on Instrumentation, Communication, Information Technology, and Biomedical Engineering 2009, pp. 1–6.
- (2007) An efficient method for compressed sensing. In Image Processing, 2007. ICIP 2007. IEEE International Conference on, Vol. 3, pp. III-117.
- (2014) Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980.
- (1997) Information theory and statistics. Courier Corporation.
- (1996) Blind image deconvolution. IEEE Signal Processing Magazine 13 (3), pp. 43–64.
- (2007) Blind motion deblurring using image statistics. In Advances in Neural Information Processing Systems, pp. 841–848.
- (2009) Classification via group sparsity promoting regularization.
- (2016) Image denoising using very deep fully convolutional encoder-decoder networks with symmetric skip connections. CoRR abs/1603.09056.
- (2007) Adding noise to improve noise robustness in speech recognition. In Eighth Annual Conference of the International Speech Communication Association.
- (2001) Digital signal processing: principles, algorithms and applications. Pearson Education India.
- (1992) Blur identification by the method of generalized cross-validation. IEEE Transactions on Image Processing 1 (3), pp. 301–311.
- (1972) Bayesian-based iterative method of image restoration. JOSA 62 (1), pp. 55–59.
- (2018) Imaging through turbulence. CRC Press.
- Fields of experts: a framework for learning image priors. In IEEE Conf. on Computer Vision and Pattern Recognition, Vol. 2, pp. 860–867.
- (1988) Real analysis. Vol. 32, Macmillan, New York.
- (1992) Nonlinear total variation based noise removal algorithms. Physica D: Nonlinear Phenomena 60 (1-4), pp. 259–268.
- (1990) Survey of recent developments in digital image restoration. Optical Engineering 29 (5), pp. 393–405.
- (2008) High-quality motion deblurring from a single image. In ACM Transactions on Graphics (TOG), Vol. 27, pp. 73.
- (2003) Euler's elastica and curvature-based inpainting. SIAM Journal on Applied Mathematics 63 (2), pp. 564–592.
- (2013) Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199.
- (2016) Texture networks: feed-forward synthesis of textures and stylized images. In ICML, Vol. 1, pp. 4.
- (2018) Deep image prior. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 9446–9454.
- (2017) An iterative method for image deblurring based on total variation and compressed sensing. Bachelor's Thesis, School of Mathematical Sciences, Fudan University, 220 Handan Rd., Yangpu District, Shanghai, China.
- (2007) What makes a good model of natural images? In CVPR.
- Extrapolation, interpolation and smoothing of stationary time series, with engineering applications. MIT Press.
- Deep convolutional neural network for image deconvolution. In Advances in Neural Information Processing Systems, pp. 1790–1798.
- (1997) Identification of blur parameters from motion blurred images. Graphical Models and Image Processing 59 (5), pp. 310–320.
- (2008) Progressive inter-scale and intra-scale non-blind image deconvolution. In ACM Transactions on Graphics (TOG), Vol. 27, pp. 74.
- (2017) Learning deep CNN denoiser prior for image restoration. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3929–3938.