1 Introduction
Image inverse problems center on the recovery of an unknown image from a given corrupted measurement. Such a problem is ill-posed because a specific corrupted image can correspond to a range of possible high-quality images. These problems have been explored extensively over the past several decades, and deep convolutional neural networks (ConvNets) currently set the state of the art [11] in tasks such as denoising [2] and single-image super-resolution [4]. The commonly suggested and very effective path to the inverse problem is as follows: given many example pairs of an original image and its corrupted version, one can learn a deep network that maps the degraded image to its source [7], for example, [14, 16, 5, 9, 6, 8, 17, 13, 10, 3]. Ulyanov et al. [11] proposed a new strategy, namely Deep Image Prior (DIP), for the single-image inverse problem, where common strategies are no longer feasible because only one corrupted image (without the original image) is available for model training. Mataev et al. [7] further improved the performance of DIP by adding an extra regularization (Regularization by Denoising).
Although Ulyanov et al. [11] and Mataev et al. [7] showed that DIP and its variations are very effective at handling various inverse problems, a stopping method is needed before DIPs can be applied to real-world problems where human supervision is not available. Currently, DIPs stop when humans judge their outputs to be good enough or when they reach the maximum number of iterations [11, 7]. The stopping method should output a measurement that indicates how well DIP has reconstructed the image of interest, so that the training algorithm can stop itself when the measurement reaches its maximum.
In this work, we propose a stopping method, namely Orthogonal Stopping Criterion (OSC), which adds a pseudo noise to the corrupted image and measures the pseudo-noise component in the recovered image of each iteration based on the orthogonality between signal and noise. The growth-rate derivative of the measurement reaches its maximum when DIP starts focusing on reconstructing the pseudo noise, which means the training should be stopped, because DIP resists "bad" solutions and descends much more quickly towards naturally-looking images [11]. We use DIP as the baseline (https://github.com/DmitryUlyanov/deep-image-prior) and demonstrate the performance of OSC on several problems such as denoising, super-resolution, and inpainting.
2 Methodology
The inverse tasks such as denoising, super-resolution, and inpainting can be expressed as the energy minimization problem of equation (1), where $E(x; x_0)$ is a task-dependent data term, $x_0$ is the noisy/low-resolution/occluded image, $x$ is the reconstructed image, and $R(x)$ is a regularizer [11].
$$x^* = \arg\min_x E(x; x_0) + R(x) \qquad (1)$$
In this work, we handle the inverse tasks by equation (2) where is the pseudo noise and all that we get during the minimization are represented as set . The minimization is stopped according to equation (3), where measures the growth-rate derivative of the pseudo-noise component in . The pseudo-noise component is highly correlated with , because is orthogonal to all components in including ground truth and corruptions. Since the reconstruction of needs many more iterations than the naturally-looking image in , the naturally-looking image of interest will be reconstructed by DIP before the growth-rate derivative of the pseudo-noise component reaches its maximum, as long as is harder to reconstruct than the naturally-looking image and easier than (or as easy as) other corruptions.
(2) 
(3) 
Given a series of reconstructed images , we get the pseudo-noise component by equation (4), where is the number of elements in and indicates the image reconstructed in the th iteration.
(4) 
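As a rough illustration of the measurement described above, the sketch below projects each reconstructed image onto the pseudo noise by averaging their element-wise product. The exact form of equation (4) is not reproduced here: the mean-of-products projection, the flat-list image representation, and the names `pseudo_noise_component` and `component_curve` are assumptions made for this example only.

```python
def pseudo_noise_component(image, noise):
    """Mean element-wise product of a reconstruction and the pseudo noise.

    Assumption: this normalized inner product stands in for equation (4).
    Both arguments are flat lists of pixel values of equal length.
    """
    if len(image) != len(noise):
        raise ValueError("image and noise must have the same number of elements")
    return sum(p * n for p, n in zip(image, noise)) / len(noise)


def component_curve(reconstructions, noise):
    """Pseudo-noise component of every reconstruction, one value per iteration."""
    return [pseudo_noise_component(x, noise) for x in reconstructions]


# Toy check with a hypothetical constant "image" and an alternating noise
# pattern that is exactly zero-mean, hence orthogonal to the constant signal.
num_pixels = 4096
signal = [0.5] * num_pixels
noise = [0.04 if i % 2 == 0 else -0.04 for i in range(num_pixels)]

# Reconstructions that have absorbed a growing fraction of the pseudo noise.
fractions = [0.0, 0.25, 0.5, 1.0]
recons = [[s + f * n for s, n in zip(signal, noise)] for f in fractions]
curve = component_curve(recons, noise)
```

In this toy setting the curve is zero while the reconstruction contains only signal and grows in proportion to the fraction of pseudo noise that has been fitted, which is the behavior the stopping rule relies on.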
We get the index of the best image by equation (5), where finds the curvature of the curve. Figure 1 shows the results of the F16 denoising experiment, including the curve, the Peak Signal to Noise Ratio (PSNR) curve of DIP, the PSNR curve of OSC, and the curvature curves. All curves are normalized by their own minimum and maximum, except the PSNR curve of OSC, which uses the minimum and maximum of the DIP PSNR curve. The curvature curve records the growth-rate derivative of the curve. To get the curvature curve, for a specific , we find 3 points on the curve to define the new coordinate system shown as a dashed line, which are , , where , , , , , , , , . defines the length of curve used for the curvature calculation. is the averaging window size. After mapping the curve between and to the dashed-line coordinate system, we fit a parabola to the curve and use the coefficient of the quadratic term as the curvature at index , which approximates the growth-rate derivative of the pseudo-noise component. As Figure 1 shows, although the PSNR at the curvature maximum is not the highest PSNR over the whole OSC run, it is close: the ratio of the curvature-maximum PSNR to the maximum PSNR exceeds 95% in most of our experiments. Figure 1 also shows that the pseudo noise lowers the maximum PSNR, but only insignificantly. The OSC method is listed in Algorithm 1.
(5) 
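The parabola-fitting step can be sketched as follows. This is a simplified illustration, not the procedure of Algorithm 1: it smooths the curve with a moving average, fits a least-squares parabola in a sliding window using the original axes (the mapping to the dashed-line coordinate system is omitted), and stops at the index where the quadratic coefficient is largest. The function names and the parameters `half_length` and `window` are hypothetical.

```python
import math


def moving_average(values, window):
    """Centered moving average with edge replication; `window` must be odd."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    padded = [values[0]] * half + list(values) + [values[-1]] * half
    return [sum(padded[i:i + window]) / window for i in range(len(values))]


def parabola_coefficient(segment):
    """Quadratic coefficient of the least-squares parabola through `segment`.

    The abscissa is centered (t = -h, ..., h), so the linear term decouples
    from the normal equations and the coefficient has a closed form.
    """
    n = len(segment)
    if n < 3 or n % 2 == 0:
        raise ValueError("segment length must be odd and at least 3")
    half = n // 2
    ts = range(-half, half + 1)
    s2 = sum(t ** 2 for t in ts)
    s4 = sum(t ** 4 for t in ts)
    sy = sum(segment)
    s2y = sum(t ** 2 * y for t, y in zip(ts, segment))
    return (s2y - s2 * sy / n) / (s4 - s2 ** 2 / n)


def curvature_curve(curve, half_length, window=1):
    """Local quadratic coefficient at each index; None where the window does not fit."""
    smoothed = moving_average(curve, window)
    result = [None] * len(smoothed)
    for i in range(half_length, len(smoothed) - half_length):
        result[i] = parabola_coefficient(smoothed[i - half_length:i + half_length + 1])
    return result


def stopping_index(curve, half_length, window=1):
    """Index with the largest local quadratic coefficient."""
    curv = curvature_curve(curve, half_length, window)
    valid = [i for i, c in enumerate(curv) if c is not None]
    return max(valid, key=lambda i: curv[i])


# Synthetic pseudo-noise curve: flat at first, then rising, with its knee at 50.
toy_curve = [math.log1p(math.exp((t - 50) / 2.0)) for t in range(101)]
stop = stopping_index(toy_curve, half_length=5, window=5)
```

On this synthetic curve the selected index falls at the knee, where the curve turns from flat to rising.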
3 Experiments
We tested OSC for denoising, super-resolution, and inpainting using the same configuration as [11, 12] (https://github.com/DmitryUlyanov/deep-image-prior). In the following experiments, , , is a zero-mean Gaussian pseudo noise with a standard deviation of 1/25 by default. All OSC experiments are the same as DIP's except for the use of the pseudo noise. DIP experiments are stopped at the suggested iteration or when the PSNR reaches its maximum. OSC experiments are stopped when the curvature reaches its maximum.
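A minimal sketch of generating the default pseudo noise is given below. It assumes pixel values scaled to [0, 1], so that a standard deviation of 1/25 equals 0.04; the helper names, the flat-list image layout, and the fixed seed are illustrative choices rather than part of the original setup.

```python
import random
import statistics


def gaussian_pseudo_noise(num_elements, std=1.0 / 25.0, seed=None):
    """Zero-mean Gaussian pseudo noise as a flat list of floats."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, std) for _ in range(num_elements)]


def add_pseudo_noise(image, noise):
    """Element-wise sum of a flat image and the pseudo noise (no clipping)."""
    if len(image) != len(noise):
        raise ValueError("image and noise must have the same number of elements")
    return [p + n for p, n in zip(image, noise)]


# Hypothetical 3 x 64 x 64 corrupted image, flattened and filled with mid-gray.
corrupted = [0.5] * (3 * 64 * 64)
noise = gaussian_pseudo_noise(len(corrupted), seed=0)
network_target = add_pseudo_noise(corrupted, noise)

noise_mean = statistics.mean(noise)
noise_std = statistics.pstdev(noise)
```

Keeping the noise array around after it is added matters, because the same array is needed later to measure how much of it each reconstruction has absorbed.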
3.1 Denoising and generic reconstruction
For denoising, we train OSC to minimize equation (2) using $E(x; x_0) = \|x - x_0\|^2$, where $x$ is the reconstructed image and $x_0$ is a noisy observation. The pseudo-noise component is calculated by equation (4).
Figure 2 shows the restoration of a JPEG-compressed image, where we repeat the experiment using DIP and OSC. Figure 2 (b) is the image at the suggested stop iteration, and (c) is obtained by OSC. The image that OSC selects automatically, without human supervision, is better than the DIP result.
Figure 3 shows the denoising results of DIP and OSC, where (c) is selected by PSNR, (d) is the result at the suggested iteration, which is chosen by human inspection, and (e) is the result of OSC. The result of OSC is close to the PSNR-maximum image and better than the result at the suggested iteration.
We ran the denoising experiments on the Kate, Snail, and F16 images; the results are shown in Table 1. The JPEG-corrupted Snail image is used as the ground truth. DIP (PSNR) gives the maximum PSNR, and DIP (Iteration) gives the PSNR at 3000 iterations, which is the default value in the DIP code. The PSNR values of the images selected by OSC are listed in the OSC row, followed by the maximum PSNR obtained during the OSC iterations (Max PSNR). Accuracy is the ratio of the OSC row to the Max PSNR row. As Table 1 shows, the OSC results are comparable to those of DIP, which needs human supervision. The Max PSNR is close to DIP (PSNR), which means that the addition of pseudo noise has little influence on the reconstruction of the noisy image.
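Table 1: Denoising results (PSNR in dB; Accuracy is OSC divided by Max PSNR).

| Image | Kate | Snail | F16 |
|---|---|---|---|
| DIP (PSNR) | 31.39 | 27.30 | 30.82 |
| DIP (Iteration) | 31.27 | 26.7 | 29.29 |
| OSC | 30.73 | 26.44 | 29.80 |
| Max PSNR | 31.19 | 27.42 | 30.33 |
| Accuracy | 98.53% | 96.43% | 98.25% |
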
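For reference, the two quantities reported in the table can be computed as in the sketch below. PSNR follows its standard definition with a peak value of 1.0 (assuming images scaled to [0, 1]), and accuracy is the ratio of the OSC-selected PSNR to the maximum PSNR of the OSC run, as described in the text. The function names and the flat-list image layout are illustrative.

```python
import math


def psnr(reference, estimate, peak=1.0):
    """Peak Signal to Noise Ratio in dB between two flat images."""
    if len(reference) != len(estimate):
        raise ValueError("images must have the same number of elements")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse)


def accuracy(selected_psnr, max_psnr):
    """PSNR of the automatically selected image as a percentage of the best PSNR."""
    return 100.0 * selected_psnr / max_psnr


# (OSC, Max PSNR) pairs taken from the denoising table above.
denoising = {"Kate": (30.73, 31.19), "Snail": (26.44, 27.42), "F16": (29.80, 30.33)}
accuracies = {name: round(accuracy(osc, best), 2) for name, (osc, best) in denoising.items()}

# A uniform error of 0.1 on a [0, 1] image gives an MSE of 0.01, i.e. about 20 dB.
example_psnr = psnr([0.0] * 4, [0.1] * 4)
```

The rounded values in `accuracies` reproduce the Accuracy row of the table.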
3.2 Super-resolution
For super-resolution, we train OSC to minimize equation (2) using $E(x; x_0) = \|d(x) - x_0\|^2$, where $x$ is the reconstructed image, $x_0$ is a downsampled observation, and $d(\cdot)$ is the Lanczos downsampling method used by [12]. We tested DIP and OSC on Set5 [1] and Set14 [15] with downscaling factors of 4 and/or 8.
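The data term for super-resolution can be illustrated with the sketch below, which assumes the squared-error form $\|d(x) - x_0\|^2$ from [11]. To stay dependency-free it replaces the Lanczos operator with simple box averaging, so it shows the structure of the term rather than the exact operator used in the experiments; `box_downsample` and `sr_data_term` are hypothetical names.

```python
def box_downsample(image, factor):
    """Average non-overlapping factor x factor blocks of a 2-D image (list of rows).

    Stand-in for the Lanczos operator, chosen only to avoid external libraries.
    """
    height, width = len(image), len(image[0])
    if height % factor or width % factor:
        raise ValueError("image size must be divisible by factor")
    out = []
    for r in range(0, height, factor):
        row = []
        for c in range(0, width, factor):
            block = [image[r + dr][c + dc] for dr in range(factor) for dc in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def sr_data_term(x, x0, factor):
    """Squared L2 distance between the downsampled reconstruction and the observation."""
    dx = box_downsample(x, factor)
    return sum((a - b) ** 2 for ra, rb in zip(dx, x0) for a, b in zip(ra, rb))


# A high-resolution candidate that is consistent with its own downsampled
# observation has zero energy; a mismatched one does not.
x = [[(r + c) / 14.0 for c in range(8)] for r in range(8)]
x0 = box_downsample(x, 4)
energy_consistent = sr_data_term(x, x0, 4)
energy_mismatch = sr_data_term([[1.0] * 8 for _ in range(8)], [[0.0, 0.0], [0.0, 0.0]], 4)
```

Many different high-resolution images yield zero energy for the same observation, which is why the prior and the stopping point decide which one is returned.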
Figure 4 shows examples of 4x image super-resolution, where (c) is stopped at the PSNR-maximum point, (d) is stopped at the suggested 2000 iterations, and (e) is generated by OSC. The OSC results are very close to the PSNR-maximum version of DIP, which means that OSC finds a near-optimal solution for the super-resolution inverse problem automatically.
Table 2 shows the 4x super-resolution results on Set5: the maximum PSNRs of DIP are in the first row, the PSNRs of DIP at 2000 iterations are in the second row, and the OSC results are in the third row, followed by the maximum PSNRs of OSC and the accuracy. Similarly, the results of 4x and 8x super-resolution on Set14 are shown in Table 3 and Table 4. DIP was stopped at 8000 iterations in the 8x super-resolution experiment. Tables 2, 3, and 4 indicate that OSC is good at finding a near-optimal stopping iteration for super-resolution problems: the accuracy is higher than 95% for all test images except Flowers at 4x (93.00%) and Face at 8x (94.30%).
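Table 2: 4x super-resolution results on Set5 (PSNR in dB).

| Image | Baby | Bird | Butterfly | Head | Woman |
|---|---|---|---|---|---|
| DIP (PSNR) | 30.66 | 30.33 | 24.93 | 28.90 | 27.50 |
| DIP (Iteration) | 29.78 | 29.63 | 24.69 | 28.42 | 26.93 |
| OSC | 30.43 | 29.47 | 24.41 | 28.05 | 26.14 |
| Max PSNR | 30.75 | 29.83 | 24.67 | 28.71 | 27.32 |
| Accuracy | 98.96% | 98.79% | 98.95% | 97.70% | 95.68% |
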
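Table 3: 4x super-resolution results on Set14 (PSNR in dB).

| Image | Baboon | Barbara | Bridge | Coastguard | Comic | Face | Flowers | Foreman | Lenna | Man | Monarch | Pepper | Ppt3 | Zebra |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| DIP (PSNR) | 20.45 | 23.95 | 23.25 | 24.56 | 21.00 | 29.00 | 25.04 | 28.29 | 29.56 | 25.19 | 29.48 | 28.50 | 22.99 | 24.49 |
| DIP (Iteration) | 20.27 | 23.78 | 23.13 | 24.39 | 20.86 | 28.38 | 24.48 | 27.83 | 29.03 | 24.81 | 28.74 | 27.87 | 22.67 | 24.07 |
| OSC | 20.36 | 22.75 | 23.22 | 24.31 | 20.37 | 28.74 | 23.11 | 27.46 | 28.32 | 24.32 | 29.12 | 28.31 | 22.88 | 24.09 |
| Max PSNR | 20.37 | 23.89 | 23.25 | 24.34 | 20.98 | 28.82 | 24.85 | 27.65 | 29.23 | 25.12 | 29.28 | 28.40 | 23.23 | 24.28 |
| Accuracy | 99.95% | 95.23% | 99.87% | 99.88% | 97.09% | 99.72% | 93.00% | 99.31% | 96.89% | 96.82% | 99.45% | 99.68% | 98.49% | 99.22% |
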
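Table 4: 8x super-resolution results on Set14 (PSNR in dB).

| Image | Baboon | Barbara | Bridge | Coastguard | Comic | Face | Flowers | Foreman | Lenna | Man | Monarch | Pepper | Ppt3 | Zebra |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| DIP (PSNR) | 19.38 | 22.35 | 21.13 | 22.60 | 18.42 | 27.29 | 21.36 | 24.08 | 26.68 | 22.55 | 23.96 | 25.96 | 18.78 | 19.62 |
| DIP (Iteration) | 19.36 | 22.33 | 21.10 | 22.58 | 18.37 | 27.10 | 21.34 | 23.89 | 26.59 | 22.52 | 23.92 | 25.86 | 18.67 | 19.58 |
| OSC | 19.05 | 21.46 | 20.95 | 22.38 | 18.03 | 25.63 | 20.99 | 23.08 | 25.67 | 21.41 | 23.72 | 24.98 | 18.69 | 18.85 |
| Max PSNR | 19.36 | 22.34 | 21.07 | 22.66 | 18.39 | 27.18 | 21.39 | 23.80 | 26.57 | 22.47 | 23.92 | 25.93 | 18.76 | 19.62 |
| Accuracy | 98.40% | 96.06% | 99.43% | 98.76% | 98.04% | 94.30% | 98.13% | 96.97% | 96.61% | 95.28% | 99.16% | 96.34% | 99.63% | 96.08% |
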
3.3 Inpainting
For inpainting, we train OSC to minimize equation (2) using where is the reconstructed image, is a corrupted observation, is the Hadamard product, is a binary mask of the missing pixels in , is the height and is the width of the image. is the pseudo noise generated by equation (6), where , , are the channel, row, and column indices respectively, , , is the random uniform distribution. The pseudo-noise component is calculated by equation (7), where , is the number of elements in the image.
(6) 
(7) 
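For orientation, the sketch below shows the masked squared-error data term that the DIP baseline [11] uses for inpainting, $\|(x - x_0) \odot m\|^2$. It is not the OSC objective: the pseudo-noise term of equation (6) is left out, and the convention that the mask is 1 at observed pixels and 0 at missing ones is an assumption of this example, as is the function name.

```python
def inpainting_data_term(x, x0, mask):
    """Squared error between x and x0, counted only where mask is 1.

    Assumption: mask[i] == 1 marks an observed pixel and 0 a missing one.
    All three arguments are flat lists of equal length.
    """
    if not (len(x) == len(x0) == len(mask)):
        raise ValueError("x, x0 and mask must have the same number of elements")
    return sum(((a - b) * m) ** 2 for a, b, m in zip(x, x0, mask))


# Four-pixel toy image whose third pixel is missing (its stored value is a placeholder).
x0 = [0.2, 0.4, 0.0, 0.8]
mask = [1, 1, 0, 1]

x_good = [0.2, 0.4, 0.6, 0.8]   # matches every observed pixel, free at the hole
x_bad = [0.2, 0.5, 0.6, 0.8]    # off by 0.1 at one observed pixel

energy_good = inpainting_data_term(x_good, x0, mask)
energy_bad = inpainting_data_term(x_bad, x0, mask)
```

Because missing pixels contribute nothing to the energy, whatever fills them comes from the network prior alone, which is where a stopping rule has the most influence.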
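Figures 5 and 6 show the results of regional recovery. Figure 7 shows the results of two inpainting approaches. Table 5 lists the PSNRs of the experiments. The OSC results are close to the DIP maximum PSNR for Library, Vase, and Barbara, and within 95.50% of the maximum PSNR of the OSC run for Kate, where the OSC run itself peaks lower than DIP (35.33 versus 40.19). This indicates that the OSC method can find a near-optimal stopping iteration automatically.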
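Table 5: Inpainting results (PSNR in dB).

| Image | Kate | Library | Vase | Barbara |
|---|---|---|---|---|
| DIP (PSNR) | 40.19 | 19.22 | 29.14 | 31.91 |
| DIP (Iteration) | 39.14 | 19.08 | 27.76 | 30.90 |
| OSC | 33.74 | 18.64 | 28.67 | 30.97 |
| Max PSNR | 35.33 | 18.72 | 28.71 | 31.07 |
| Accuracy | 95.50% | 99.57% | 99.86% | 99.68% |
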
4 Conclusion
In this work, we have developed the Orthogonal Stopping Criterion (OSC), which endows Deep Image Prior (DIP) with automatic stopping. An automatic stopping mechanism is essential to DIP in real-world applications because the Peak Signal to Noise Ratio (PSNR) and human supervision are both unavailable or hard to obtain. By adding pseudo noise to the corrupted image, OSC automatically finds a near-optimal result that is very close to the one with maximum PSNR in our experiments. Additionally, the experiments verify that the pseudo noise has little influence on the maximum PSNR in most cases. The ratio of the OSC PSNR to the maximum is higher than 95% in 38 out of 40 experiments, and in many of them it is higher than 99%. Although the results of DIP are comparable to those of OSC, they are selected based on PSNR or human inspection. In all, we believe that OSC is an indispensable part of DIP-based single-image inverse systems.
References
[1] (2012) Low-complexity single-image super-resolution based on nonnegative neighbor embedding.
[2] (2018) Nonlocality-reinforced convolutional neural networks for image denoising. IEEE Signal Processing Letters 25 (8), pp. 1216–1220.
[3] (2019) Dynamic scene deblurring with parameter selective sharing and nested skip connections. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3848–3856.
[4] (2018) Fast and accurate single image super-resolution via information distillation network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 723–731.
[5] (2018) Universal denoising networks: a novel CNN architecture for image denoising. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3204–3213.
[6] (2018) Non-local recurrent network for image restoration. In Advances in Neural Information Processing Systems, pp. 1673–1682.
[7] (2019) DeepRED: deep image prior powered by RED. In Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 0–0.
[8] (2018) SRFeat: single image super-resolution with feature discrimination. In Proceedings of the European Conference on Computer Vision (ECCV), pp. 439–455.
[9] (2018) Neural nearest neighbors networks. In Advances in Neural Information Processing Systems, pp. 1087–1098.
[10] (2018) Scale-recurrent network for deep image deblurring. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 8174–8182.
[11] (2018) Deep image prior. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 9446–9454.
[12] (2018) Deep image prior. Submitted to IJCV, pp. 1–22.
[13] (2018) A fully progressive approach to single-image super-resolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 864–873.
[14] (2019) Deep learning for single image super-resolution: a brief review. IEEE Transactions on Multimedia.
[15] (2010) On single image scale-up using sparse-representations. In International Conference on Curves and Surfaces, pp. 711–730.
[16] (2018) FFDNet: toward a fast and flexible solution for CNN-based image denoising. IEEE Transactions on Image Processing 27 (9), pp. 4608–4622.
[17] (2018) Image super-resolution using very deep residual channel attention networks. In Proceedings of the European Conference on Computer Vision (ECCV), pp. 286–301.