Single image super-resolution (SISR) aims to reconstruct a plausible high-resolution (HR) image from its low-resolution (LR) counterpart. As a fundamental vision task, it has been widely applied in video enhancement, medical imaging, and surveillance imaging. Mathematically, the HR image $x$ and LR image $y$ are related by a degradation model

$y = (x \otimes k)\downarrow_s + n,$

where $\otimes$ represents two-dimensional convolution of $x$ with blur kernel $k$, $\downarrow_s$ denotes the $s$-fold downsampler, and $n$ is usually assumed to be additive white Gaussian noise (AWGN) usr . The goal of SISR is to restore the corresponding HR image of the given LR image, which is a classical ill-posed inverse problem.
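As a concrete illustration, the degradation model above can be sketched in a few lines of NumPy (a hedged toy: the edge padding and the identity-kernel check below are illustrative choices, not the paper's implementation):

```python
import numpy as np

def degrade(x, k, s, noise_sigma=0.0, rng=None):
    """Apply the SISR degradation model: y = (x conv k), s-fold downsampled, plus AWGN.

    x: HR image (H, W); k: blur kernel with odd side lengths; s: integer scale factor.
    """
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)), mode="edge")    # pad so the blur keeps HR size
    H, W = x.shape
    blurred = np.zeros_like(x, dtype=float)
    for i in range(kh):                                  # direct 2-D convolution
        for j in range(kw):                              # (kernel index flipped)
            blurred += k[kh - 1 - i, kw - 1 - j] * xp[i:i + H, j:j + W]
    y = blurred[::s, ::s]                                # s-fold downsampling
    if noise_sigma > 0:
        rng = rng or np.random.default_rng(0)
        y = y + rng.normal(0.0, noise_sigma, y.shape)    # additive white Gaussian noise
    return y

# sanity check: with an identity kernel and no noise, degradation is plain downsampling
x = np.arange(16.0).reshape(4, 4)
k = np.zeros((3, 3)); k[1, 1] = 1.0
y = degrade(x, k, s=2)
```

With a real blur kernel in place of the identity, `degrade` produces exactly the LR observations that the blind SR problem must invert.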
Recently, SR has been continuously advanced by various deep learning-based methods rcan ; casg . Although these methods have exhibited promising performance, they share a common limitation: they are too "general" and not image-specific. Firstly, these methods rely heavily on external information. They are exhaustively trained via LR-HR image pairs synthesized with predefined blur kernels, ignoring the real degradations of test images (i.e. non-blind SR). When the degradations of test images differ from the predefined ones, they may suffer a significant performance drop. Secondly, their model weights are fixed during testing. Since they are trained offline, test images with various degradations will be super-resolved by the same set of weights. However, different test images are usually corrupted by different degradations. If the model performs well on certain degradations, it is likely to perform badly on others. Thus, training a single model for a wide range of degradations may lead to sub-optimal results. For example, as shown in Figure 1, ESRGAN esrgan and RCAN rcan are trained via bicubically synthesized LR-HR pairs. They have excellent performance on bicubically downscaled images but incur adaptation problems when dealing with images degraded by different kernels. Therefore, these methods may only perform well in very limited cases: when the blur kernels of test images are similar to each other and are all included in the predefined kernels. Unfortunately, such cases are rare in real applications.
To address these issues, a straightforward idea is to customize a model for each test image. Some "zero-shot" methods zssr ; kernel_gan have tried to get rid of datasets synthesized by predefined kernels. They exploit the similarity of recurring patches across multiple scales of the LR image, and train models via the test image and its downscaled version. Although these methods may be suitable for regions where such recurrences are salient, the limited training samples, without any external HR information, largely restrict their performance. Instead, we propose an online super-resolution (ONSR) method, which not only involves the test LR image in model optimization, as the "zero-shot" methods do, but also leverages the benefits of external learning-based methods. Specifically, we design two branches, namely an internal branch (IB) and an external branch (EB). IB utilizes the inherent information of the test LR image and learns its specific degradation. With the aid of the learned degradation, EB can utilize external HR images to render general priors and train a specific SR model. Without relying on predefined kernels, ONSR can still make full use of external HR images and customize a specific model for each test LR image.
In summary, our main contributions are as follows:
To handle the various and unknown blur kernels in blind SR, we propose an online super-resolution (ONSR) method. It customizes a specific model for each test LR image and thus achieves more robust performance across different cases.
We design two branches, namely an internal branch (IB) and an external branch (EB). They work together to better incorporate the general priors from external images and the specific degradation of the test image.
Extensive experiments on both synthesized and real-world images show that ONSR can generate more visually favorable SR results and achieve state-of-the-art performance on blind SR.
2 Related Works
2.1 Non-Blind Super-Resolution
Most learning-based SR approaches focus on non-blind SISR, in which case the blur kernel and noise level are known beforehand. These methods are optimized in an externally supervised manner via LR-HR pairs synthesized with predefined blur kernels simusr . With the flourishing of deep learning, convolutional neural networks (CNNs) were successfully adopted for single image super-resolution srcnn . After the proposal of residual learning resnet , which simplifies the optimization of deep CNNs, SR networks tended to become even deeper, and their representation capability was significantly improved. Attention mechanisms rcan and feature aggregation hdrn have also been adopted to further boost performance. Besides, some non-blind methods srmd ; usr simultaneously use the predefined blur kernel and synthetic LR-HR data to advance SR performance. However, these methods only work well for certain degradations. The results may deteriorate dramatically when there exists a domain gap between the training samples and the real test image. Instead, our method focuses on blind SR, in which case the degradation from HR to LR images is unavailable.
2.2 Blind Super-Resolution
Blind SR assumes that the degradations of test images are unknown and various, which is more applicable to real images. This problem is much more challenging, as it is difficult for a single model to generalize to different degradations. In wang2017ensemble and wang2020blind , the final results are ensembled from models that are capable of handling different cases, so the ensembled results are more robust to different degradations. But since there are an infinite number of possible degradations, we cannot train a model for each of them. Other methods try to utilize the internal prior of the test image itself. In dpn , the model is finetuned via similar pairs searched from the test image. In nonpara ; kernel_gan and zssr , the blur kernel is first estimated by maximizing the similarity of recurring patches across multiple scales of the LR image, and then used to synthesize LR-HR training samples. However, the internal patches of the test image are limited, which heavily restricts the performance of these methods. Differently, our ONSR still estimates the blur kernel from the test image, but it optimizes the SR network via external HR images. In this way, ONSR can simultaneously take the benefits of the internal and external priors in the LR and HR images.
2.3 Offline & Online Training in Super-Resolution
Most deep learning-based SR methods are offline optimized. These models are trained via a large number of synthesized paired LR-HR samples hdrn ; ikc , and the model weights are fixed during testing. Thus, their model weights are completely determined by external data, without considering the inherent information of the test image. LR images that may be degraded by various kernels are super-resolved by the same set of model weights, and the domain gap between training and testing data may impair the performance. Contrary to offline training, online training gets the test LR image involved in model optimization. For example, ZSSR zssr is an online trained SR method, optimized by the test LR image and its downscaled version. Therefore, it can customize the network weights for each test LR image, and can have more robust performance over different images. However, the training samples of most online trained models are limited to only one test image, which heavily restricts their performance. Instead, our ONSR utilizes external HR images during the online training phase. In this way, it can better incorporate the general priors of the external data and the inherent information of the test LR image.
As discussed above, previous non-blind SR methods are usually offline trained (as shown in Figure 2(a)) simusr , which means LR images with various degradations are super-resolved with the same set of weights, regardless of the specific degradation of the test image. To address this problem, a straightforward idea is to adopt an online training algorithm, i.e. adjust the model weights for each test LR image with its particular degradation. A similar idea, namely "zero-shot" learning, is used in ZSSR. As shown in Figure 2(b), ZSSR is trained with the test LR image and its downscaled version. However, this pipeline has two inherent drawbacks: 1) with a limited number of training samples, it only allows relatively simple network architectures in order to avoid overfitting, thus adversely affecting the representation capability of deep learning; 2) no HR images are involved, so it is difficult for the model to learn general priors of HR images, which are also essential for SR reconstruction Ulyanov2018DeepIP .
The drawbacks of ZSSR motivate us to think: a better online updating algorithm should be able to utilize both the test LR image and external HR images. The former provides inherent information about the degradation process, and the latter enables the model to exploit better general priors. Therefore, a "general" SR model can be adjusted to process the test LR image according to its "specific" degradation, which we call: from "general" to "specific".
According to the MAP (maximum a posteriori) framework ren2020neural , blind super-resolution can be formulated as:

$\min_{x,k} \|y - (x \otimes k)\downarrow_s\|^2 + \lambda\Phi(x) + \mu\Omega(k),$

where $\|y - (x \otimes k)\downarrow_s\|^2$ is the fidelity term, $\Phi(x)$ and $\Omega(k)$ model the priors of the sharp image and the blur kernel, and $\lambda$ and $\mu$ are trade-off regularization parameters. Although many delicate handcrafted priors, such as the sparsity of the dark channel darkchannel , $L_0$-regularized intensity LO , and the recurrence of internal patches recurrence , have been suggested for $\Phi(x)$ and $\Omega(k)$, these heuristic priors cannot cover the more concrete and essential characteristics of different LR images. To circumvent this issue, we design two modules, i.e. the reconstruction module $G$ and the degradation estimation module $D$, which can capture the priors of $x$ and $k$ in a learnable manner. We substitute $x$ by $G(y; \theta_G)$ and write the degradation process $(x \otimes k)\downarrow_s$ as $D(x; \theta_D)$; then the problem becomes:

$\min_{\theta_G, \theta_D} \|y - D(G(y; \theta_G); \theta_D)\|^2.$
The prior terms are removed because they can also be captured implicitly by the generative networks $G$ and $D$ Ulyanov2018DeepIP .
This problem involves the optimization of two neural networks, i.e. $G$ and $D$. Thus, we can adopt an alternating optimization strategy:

$\theta_D^{t+1} = \arg\min_{\theta_D} \|y - D(G(y; \theta_G^{t}); \theta_D)\|^2,$
$\theta_G^{t+1} = \arg\min_{\theta_G} \|y - D(G(y; \theta_G); \theta_D^{t+1})\|^2.$
In the first step, we fix $G$ and optimize $D$, while in the second step we fix $D$ and optimize $G$.
So far, only the given LR image $y$ is involved in this optimization. However, as we have discussed in Sec 3.1, this single training sample may not be enough for $G$ to be sufficiently optimized, because there are usually too many learnable parameters in $G$. Thus, we introduce external HR images $x_{ext}$ into the optimization of $G$: in that step, we degrade $x_{ext}$ by $D$ to a fake LR image $\tilde{y} = D(x_{ext}; \theta_D)$, and the pair $(\tilde{y}, x_{ext})$ is used to optimize $G$. The alternating optimization process then becomes:

$\theta_D^{t+1} = \arg\min_{\theta_D} \|y - D(G(y; \theta_G^{t}); \theta_D)\|^2,$
$\theta_G^{t+1} = \arg\min_{\theta_G} \|x_{ext} - G(D(x_{ext}; \theta_D^{t+1}); \theta_G)\|^2,$
in which $G$ is optimized via the external dataset, while $D$ is optimized via the given LR image only. At this point, we have derived the proposed method from the perspective of alternating optimization, which may help better understand ONSR.
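To make the alternation concrete, the following scalar toy runs the two updates in turn (an illustration only: single scalars g and d stand in for the networks $G$ and $D$, and the GAN term of the full method is omitted):

```python
import numpy as np

# Scalar toy of the alternating scheme (hedged: g, d are stand-ins for G, D).
rng = np.random.default_rng(0)
a_true, x_star = 0.5, 2.0          # hypothetical degradation and HR value
y_test = a_true * x_star           # the one observed test LR sample
x_ext = rng.normal(size=64)        # external "HR" samples

g, d, lr = 0.2, 0.3, 0.05          # start away from the trivial fixed point
for _ in range(500):
    # step 1: fix G, update D on the internal pair (G(y), y): loss (d*g*y - y)^2
    grad_d = 2 * (d * g * y_test - y_test) * (g * y_test)
    d -= lr * grad_d
    # step 2: fix D, update G on external pairs (D(x), x): loss mean (g*d*x - x)^2
    grad_g = np.mean(2 * (g * d * x_ext - x_ext) * (d * x_ext))
    g -= lr * grad_g
# at convergence both consistency losses vanish: D(G(y)) ~= y and G(D(x)) ~= x
```

Note that in this scalar toy the two updates only enforce the composite consistency $g \cdot d \to 1$; in the full method, the adversarial loss and the structure of natural images are what pin down the specific degradation.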
3.3 Online Super-Resolution
As illustrated in Figure 3, our online SR (ONSR) method consists of two branches, i.e. the internal branch (IB) and the external branch (EB). Both branches share two modules, i.e. the reconstruction module $G$ and the degradation estimation module $D$. $G$ aims to map the given LR image from the LR domain to the HR domain, i.e. to reconstruct an SR image, while $D$ aims to estimate the specific degradation of the test LR image.
In IB, only the given LR image is involved. As shown in Figure 3, the inputs of IB are patches randomly selected from the test LR image. An input LR patch is first super-resolved by $G$ to an SR patch, which is then degraded by $D$ to a fake LR patch. To guarantee that the fake LR patch can be translated back to the original LR domain, it is supervised by the original LR patch via an L1 loss. The paired SR and LR patches help $D$ learn the specific degradation of the test image. The optimization details are further explained in Section 3.4.
In EB, only external HR images are involved. The inputs of EB are patches randomly selected from different external HR images. Conversely to IB, an external HR patch is first degraded by $D$ to a fake LR patch. As the weights of $D$ are shared between IB and EB, the external patches are degraded by the learned degradation. Thus, the paired HR and fake LR patches help $G$ learn to super-resolve LR images with the specific degradation.
According to the above analysis, the loss functions of IB and EB can be formulated as:

$\mathcal{L}_{IB} = \|D(G(y)) - y\|_1, \qquad \mathcal{L}_{EB} = \|G(D(x_{ext})) - x_{ext}\|_1.$
Since the information in the single test LR image is limited, we further adopt an adversarial learning strategy to help $D$ better learn the specific degradation. As shown in Figure 3, we introduce a discriminator $Dis$, which discriminates the distribution characteristics of the LR image. It forces $D$ to generate fake LR patches that are more similar to the real ones, so that more accurate degradations can be learned by $D$. We use the original GAN formulation as follows,

$\mathcal{L}_{adv} = \mathbb{E}[\log Dis(y)] + \mathbb{E}[\log(1 - Dis(D(G(y))))].$
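The GAN objective referenced above can be sketched on scalar discriminator logits (a hedged illustration: the non-saturating generator form used here is a common implementation choice, not necessarily the paper's exact variant, and scalar logits stand in for the VGG-style discriminator):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gan_losses(logit_real, logit_fake):
    """GAN objective on scalar discriminator logits.

    Returns (discriminator loss, adversarial loss driving the fake LR
    patches toward the real LR distribution)."""
    d_loss = -np.log(sigmoid(logit_real)) - np.log(1.0 - sigmoid(logit_fake))
    adv_loss = -np.log(sigmoid(logit_fake))    # non-saturating generator form
    return d_loss, adv_loss

# an undecided discriminator (logit 0 -> probability 0.5) gives loss 2*ln(2)
d_loss, adv_loss = gan_losses(0.0, 0.0)
```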
3.4 Separate Optimization
Generally, most SR networks are optimized by the weighted sum of all objectives, and all modules in the network are treated indiscriminately. Unlike this commonly used joint optimization, we propose a separate optimization strategy. Specifically, $D$ is optimized by the objectives that are directly related to the test LR image, while $G$ is optimized by the objectives that are related to external HR images. The losses for these two modules are as follows,

$\mathcal{L}_{D} = \mathcal{L}_{IB} + \lambda \mathcal{L}_{adv}, \qquad \mathcal{L}_{G} = \mathcal{L}_{EB},$

where $\lambda$ controls the relative importance of the two losses. We will investigate the influence of $\lambda$ in Section 4.4.5.
We adopt this separate optimization strategy for two reasons. Firstly, as analyzed in Section 3.2, $G$ and $D$ are alternately optimized in ONSR, and separate optimization may make these modules easier to converge usr . Secondly, $D$ aims to learn the specific degradation of the test image, while $G$ needs to learn general priors from external HR images. Thus, it is more targeted for them to be optimized separately. We experimentally verify the superiority of separate optimization in Sec 4.4.4. The overall algorithm is shown in Algorithm 1.
3.5 Network Instantiation
Most existing SR architectures can be used as $G$ and integrated into ONSR. In this paper, we mainly use the Residual-in-Residual Dense Block (RRDB) proposed in ESRGAN esrgan . RRDB combines a multi-level residual network with dense connections, is easy to train, and has promising performance on SR. $G$ consists of 23 RRDBs and an upsampling module. It is initialized with the pre-trained network parameters: the pretrained model renders additional priors of the external data, and also provides a comparatively reasonable initial point to accelerate optimization.
For $D$, since blurring and downsampling are linear transforms, we design $D$ as a deep linear network. Theoretically, a single convolutional layer should be able to represent all possible blur-and-downsample degradations in Eq. 1. However, according to Arora2018OnTO , linear networks have infinitely many equivalent global minima, which makes gradient-based optimization faster for deeper linear networks than for shallower ones. Thus, we employ three convolutional layers with no activations, followed by a bicubic downsampling layer, in $D$. Similarly, to obtain a reasonable initial point, kernel_gan is supervised by bicubically downsampled data at the beginning; our bicubic downsampling layer serves the same purpose but is simpler and more elegant. Besides, to accelerate the convergence of $D$, we initialize all convolutional layers with isotropic Gaussian kernels with a standard deviation of 1, as shown in Figure 4. Considering that images with a larger downsampling factor are usually more severely degraded, we set larger sizes for the three convolutional layers at scale factor ×4 than at scale factor ×2.
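The claim that stacked activation-free convolutions collapse into a single effective blur kernel can be checked numerically (a NumPy sketch with illustrative kernel sizes; the actual layer sizes and the bicubic downsampling layer of $D$ are omitted):

```python
import numpy as np

def conv2d_full(a, b):
    # full 2-D convolution (NumPy only), enough for composing small kernels
    H, W = a.shape[0] + b.shape[0] - 1, a.shape[1] + b.shape[1] - 1
    out = np.zeros((H, W))
    for i in range(b.shape[0]):
        for j in range(b.shape[1]):
            out[i:i + a.shape[0], j:j + a.shape[1]] += b[i, j] * a
    return out

def gauss(n, sigma=1.0):
    # normalized isotropic Gaussian kernel, as used for initialization
    ax = np.arange(n) - (n - 1) / 2
    k = np.exp(-(ax[:, None] ** 2 + ax[None, :] ** 2) / (2 * sigma ** 2))
    return k / k.sum()

# three stacked linear conv layers collapse into one effective blur kernel,
# whose support grows with each layer (7 + 5 + 3 -> 13)
k1, k2, k3 = gauss(7), gauss(5), gauss(3)
k_eff = conv2d_full(conv2d_full(k1, k2), k3)   # 13x13 effective kernel
```

The composed kernel stays normalized and symmetric, which is exactly why the deep linear parameterization changes the optimization dynamics without changing the family of representable degradations.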
$Dis$ is a VGG-style network simonyan2014very that performs the discrimination on LR patches.
4.1 Experimental Setup
Datasets. We use 800 HR images from the training set of DIV2K div2k as the external HR dataset and evaluate the SR performance on DIV2KRK kernel_gan . LR images in DIV2KRK are generated by blurring and subsampling each image from the validation set (100 images) of DIV2K with randomly generated kernels. These kernels are isotropic or anisotropic Gaussian kernels with random lengths independently drawn for each axis, rotated by a random angle. To deviate from a regular Gaussian kernel, uniform multiplicative noise is further applied to each pixel value of the kernel.
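A DIV2KRK-style random kernel can be sketched as follows (hedged: the length, angle, and noise ranges below are illustrative assumptions; the exact ranges are those of kernel_gan):

```python
import numpy as np

def random_aniso_kernel(size=11, rng=None):
    """Random anisotropic Gaussian kernel with rotation and multiplicative noise.

    The ranges (0.6-5.0 for per-axis variance, +/-25% multiplicative noise)
    are illustrative assumptions, not the dataset's exact parameters."""
    rng = rng or np.random.default_rng(0)
    lam = rng.uniform(0.6, 5.0, size=2)          # per-axis variances
    theta = rng.uniform(0.0, np.pi)              # random rotation angle
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])
    cov = R @ np.diag(lam) @ R.T                 # rotated covariance matrix
    ax = np.arange(size) - (size - 1) / 2
    xx, yy = np.meshgrid(ax, ax)
    pts = np.stack([xx, yy], axis=-1)
    inv = np.linalg.inv(cov)
    k = np.exp(-0.5 * np.einsum("...i,ij,...j->...", pts, inv, pts))
    k *= rng.uniform(0.75, 1.25, k.shape)        # multiplicative noise
    return k / k.sum()

k = random_aniso_kernel()
```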
Evaluation Metrics. To quantitatively compare different methods, we use PSNR, SSIM wang2004image , Perceptual Index (PI) blau2018the , and Learned Perceptual Image Patch Similarity (LPIPS) zhang2018the . Lower PI and LPIPS indicate higher perceptual quality.
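Of these metrics, PSNR is simple enough to state exactly (SSIM, PI, and LPIPS require their respective reference implementations):

```python
import numpy as np

def psnr(ref, pred, data_range=255.0):
    """Peak signal-to-noise ratio in dB; higher means less distortion."""
    mse = np.mean((ref.astype(float) - pred.astype(float)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(data_range ** 2 / mse)

ref = np.zeros((4, 4))
worst = np.full((4, 4), 255.0)   # maximally wrong prediction on a 0..255 range
```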
Training Details. We randomly sample 10 patches from the LR image and 10 patches from different HR images for each input minibatch, with the HR patch size scaled by the scaling factor $s$. The ADAM optimizer is used for optimization, with separate learning rates for the discriminator $Dis$ and for $G$ and $D$ at scale factors ×2 and ×4. We set the online updating step to 500 for each image, and the LR image is tested every 10 steps. To accelerate the optimization, we initialize ONSR with the bicubically pretrained model of RRDB, which is publicly available.
|Type2: Blind SR||PSNR ×2||SSIM ×2||PI ×2||LPIPS ×2||PSNR ×4||SSIM ×4||PI ×4||LPIPS ×4|
|Cornillere et al. cornillere2019blind||29.42||0.8459||4.8343||0.1957||-||-||-||-|
|Ji et al. Ji2020RealWorldSV||-||-||-||-||25.41||0.6890||8.2348||0.5219|
4.2 Super-Resolution on Synthetic Data
We compare ONSR with other state-of-the-art (SotA) methods on the synthetic dataset DIV2KRK. We present two types of algorithms for analysis: 1) Type1 includes ESRGAN esrgan , RRDB esrgan , RCAN rcan and ZSSR zssr , which are non-blind SotA SR methods trained on bicubically downsampled images; 2) Type2 includes blind SR methods: KernelGAN+ZSSR kernel_gan , dSRVAE dsrvae , Ji et al. Ji2020RealWorldSV and Cornillere et al. cornillere2019blind .
Quantitative Results. In Table 1, SotA non-blind SR methods show remarkable performance under the bicubic downsampling setting, while suffering a severe performance drop on DIV2KRK due to the domain gap. RCAN is even worse than naive bicubic interpolation. ESRGAN and RRDB share the same architecture as $G$ in ONSR, but ONSR outperforms them by a large margin of about 2.1 dB and 2 dB for scales ×2 and ×4 respectively. This improvement may be attributed to online updating. Although the Type2, i.e. blind, SR methods achieve significantly better quantitative results than the non-blind ones, they still cannot generalize well to different degradations. KernelGAN+ZSSR improves over previous methods, but its performance is still inferior to ONSR by a large margin.
Qualitative Results. In Figures 5 and 6, we present visual comparisons of these methods at scales ×2 and ×4 respectively. SotA non-blind SR methods tend to produce blurry edges and undesirable artifacts, such as the window contours in image 085. Similarly, blind SR methods also tend to generate over-smooth patterns, while the results of our method are clearer and more visually natural.
4.3 Super-Resolution on Real-World Data
Besides the above experiments on synthetic test images, we also conduct experiments on real images, which are more challenging due to the complicated and unknown degradations in real-world scenarios. Since there are no ground-truth HR images, we only provide visual comparisons. As shown in Figure 7, the letter "X" restored by RRDB, ESRGAN and ZSSR is blurry or has unpleasant artifacts. For RCAN, there is even a color shift relative to the original frame. The result of IKC is better, but the image super-resolved by our ONSR has sharper edges and higher contrast, and is more visually natural. We also apply these methods to YouTube raw video frames. As shown in Figure 8, the SR frames generated by most methods are seriously blurred or contain numerous mosaics, while ONSR produces visually promising images with clearer edges and fewer artifacts.
4.4 Ablation Study
4.4.1 Study on the initialization of $G$
In this section, we experimentally investigate the influence of the initialization of $G$. We initialize $G$ with three different methods: 1) no pre-trained model; 2) the bicubically pretrained model (i.e. RRDB); 3) the pretrained model (i.e. RRDB-G) as in ikc . In ikc , the SR module of IKC is pre-trained with image pairs synthesized with isotropic Gaussian blur kernels of different widths. In the same manner, we pre-train another RRDB-G model to initialize the SR module of our method (denoted as ONSR-G). From Figure 9, we can see that: 1) the SR results of $G$ initialized by a pre-trained model are more visually reasonable, indicating that the pretrained model provides a better initial point and guides $G$ to more significant performance; 2) a more powerful pretrained SR module better initializes $G$ and accelerates convergence, thus achieving better performance.
As shown in Table 1 and Table 2, RRDB-G performs better than the bicubically pre-trained RRDB and achieves performance comparable to the strong blind SR baseline IKC. Based on the pretrained RRDB-G, ONSR-G also outperforms IKC in both PSNR and SSIM for scale factors ×2 and ×4. Moreover, our ONSR-G can further improve the performance of RRDB-G with the online updating scheme. Thus, it is necessary to involve the test LR image in the model optimization. The online updating scheme can effectively exploit the inherent information of the test LR image and combine it with the external priors to adjust the "general" SR model to better deal with "specific" degradations. We also provide visual comparisons in Figure 10.
4.4.2 Study on different $G$
In this subsection, we experimentally show that online updating works well for different choices of $G$. We replace the architecture of $G$ with different existing SR models, using two SotA supervised SR models, RDN rdn and RCAN rcan , as $G$ respectively.
As shown in Table 3, with only the bicubically pretrained models, neither RDN nor RCAN can adapt to LR images with different degradations. However, our online updating scheme can further adjust these models (denoted as ON-RDN and ON-RCAN) to the specific degradations of test images, so their performance is greatly improved. These experiments also suggest that the effectiveness of online updating is robust to different architectures of $G$.
4.4.3 Study on different modules
To explain the roles the different modules (i.e. IB, EB and $Dis$) play in ONSR, we design four other methods, termed IBSR, EBSR, IB-EBSR and IB-EB-GSR (as shown in Figure 11), and compare their performance on DIV2KRK.
IBSR. IBSR only has an internal branch to exploit the internal properties of the test LR image for degradation estimation and SR reconstruction, which is optimized online.
EBSR. Contrary to IBSR, EBSR only has an external branch to capture general priors of external HR images, which is optimized offline. After offline training, the fixed modules are used to super-resolve test LR images.
IB-EBSR. IB-EBSR has both the internal branch and the external branch, but no GAN module.
IB-EB-GSR. IB-EB-GSR applies discriminators to both the fake LR patches and the SR results, to explore the underlying distribution characteristics of the test LR and external HR images.
The quantitative comparisons on DIV2KRK are shown in Table 4. As one can see, IB-EBSR outperforms both IBSR and EBSR by a large margin, indicating that both IB and EB are important for SR performance. The performance of IB-EBSR is further improved when the discriminator $Dis$ is introduced, suggesting that adversarial training helps $D$ to be better optimized. However, when an additional discriminator is applied to the SR results in IB-EB-GSR, the performance is inferior to ONSR. In IB-EB-GSR, the initial SR results of $G$ are likely to have unpleasant artifacts or distortions; besides, the external HR images cannot provide direct pixelwise supervision to the SR results of the test image. Therefore, applying a discriminator to the SR results may hinder the optimization of IB-EB-GSR.
4.4.4 Study on separate optimization
In this section, we experimentally compare Separate Optimization with Joint Optimization. In separate optimization, $D$ and $G$ are alternately optimized via the test LR image and external HR images respectively, while in joint optimization both modules are optimized together. As shown in Table 5, Separate Optimization surpasses Joint Optimization in all metrics for scale factors ×2 and ×4.
We also compare the convergence of the two optimization strategies by plotting PSNR and SSIM at regular training intervals. As shown in Figure 12, the results of Separate Optimization are consistently higher and grow faster than those of Joint Optimization. This indicates that Separate Optimization not only helps the network converge faster, but also helps it converge to a better point. This property allows us to trade off SR effectiveness against efficiency by setting different numbers of training iterations.
4.4.5 Study on $\lambda$
As mentioned above, the weight $\lambda$ for the GAN loss needs to be tuned so that the degradation of the test LR image can be better estimated and the SR image better restored. Table 6 shows that an intermediate value of $\lambda$ best helps optimize the network. Also, as shown in Figure 13, when $\lambda$ grows too large, or when $\lambda = 0$, i.e. no adversarial training, the SR results become either more blurred or contain more artifacts.
4.5 Non-Blind Setting
To investigate the upper bound of ONSR, we also compare it with other methods (in Table 7) under the non-blind setting, i.e. the blur kernel is known and participates in the network optimization. For ONSR, we substitute $D$ with the ground-truth degradation.
Datasets. Following usr , the performance is evaluated on BSD68 Martin2001ADO . 12 representative and diverse blur kernels are used to synthesize the corresponding test LR images, including 4 isotropic Gaussian kernels with different widths, 4 anisotropic Gaussian kernels from srmd , and 4 motion blur kernels from Boracchi2012ModelingTP ; Levin2009UnderstandingAE .
Quantitative Results. As reported in Table 7, ONSR outperforms all other methods on the 12 blur kernels by a large margin, which indicates the robustness of ONSR. Besides, when GT blur kernels are provided, our online updating scheme can efficiently adjust the model to different degradations, without training on large-scale paired samples.
4.6 Speed Comparison
4.6.1 Speed on image-specific problem
In DIV2KRK, the degradation of each image is different and unknown, which is the image-specific problem. Online blind SR methods are more suitable for this case. Thus, we compare the runtime of ONSR with a typical SotA online SR method, KernelGAN+ZSSR kernel_gan , using its official code to measure the average running time on DIV2KRK. For ONSR, we set the training steps to 100 for each image, and the LR image is tested every 10 steps. The average running time is evaluated on the same machine with an NVIDIA 2080Ti GPU. As shown in Table 8, the PSNR of ONSR is higher than that of KernelGAN+ZSSR, while ONSR is nearly 4 times faster.
4.6.2 Speed on degradation-specific problem
|Method (PSNR / runtime in s)||Set5||Set14||BSD100||Urban100|
|IKC||31.67 / 3.423||28.31 / 4.984||27.37 / 3.147||25.33 / 18.276|
|ONSR-G||31.75 / 0.471||28.34 / 0.847||27.48 / 0.467||25.97 / 2.489|
We call the problem where multiple images share the same degradation the degradation-specific problem. ikc proposed a test kernel set for the degradation-specific problem, namely Gaussian8. It consists of eight selected isotropic Gaussian blur kernels, with kernel widths in the range [1.80, 3.20]. We synthesize test LR images by degrading HR images from the common benchmark datasets (i.e. Set5 set5 , Set14 set14 , BSD100 bsd100 , Urban100 urban100 ) with Gaussian8, so each dataset contains eight degradations.
In this case, we randomly select a subset of the LR images to online update the model for each degradation. Then the optimal model weights are fixed to process the remaining images with the corresponding degradation. As shown in Table 9, ONSR can be significantly accelerated in this setting: it outperforms IKC on all datasets, while being nearly 7 times faster.
5 Conclusion and Future Work
In this paper, we argue that most current SR methods are not image-specific. To address this limitation, we propose an online super-resolution (ONSR) method, which customizes a specific model for each test image. In detail, we design two branches, namely an internal branch (IB) and an external branch (EB). IB learns the specific degradation of the test image, and EB learns to super-resolve images that are degraded by the learned degradation. IB involves only the test LR image, while EB uses external HR images. In this way, ONSR leverages both the inherent information of the test LR image and the general priors of external HR images. Extensive experiments on both synthetic and real-world images prove the superiority of ONSR on the blind SR problem. These results indicate that customizing a model for each test image is more practical in real applications than training a general model for all LR images. Moreover, the speed of ONSR may be further improved by designing more lightweight modules for faster inference or by elaborating the training strategy to accelerate convergence. Faster speed would make it more practical for processing large amounts of test images, such as low-resolution videos, which is the focus of our future work.
- (2) Y. Zhang, K. Li, K. Li, L. Wang, B. Zhong, Y. Fu, Image super-resolution using very deep residual channel attention networks, in: Proceedings of the European conference on computer vision (ECCV), 2018, pp. 286–301.
- (3) Y. Yang, Y. Qi, Image super-resolution via channel attention and spatial graph convolutional network, Pattern Recognition 112 (2021) 107798.
- (4) X. Wang, K. Yu, S. Wu, J. Gu, Y. Liu, C. Dong, Y. Qiao, C. Change Loy, Esrgan: Enhanced super-resolution generative adversarial networks, in: Proceedings of the European Conference on Computer Vision (ECCV) Workshops, 2018, pp. 0–0.
- (5) A. Shocher, N. Cohen, M. Irani, “zero-shot” super-resolution using deep internal learning, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 3118–3126.
- (6) S. Bell-Kligler, A. Shocher, M. Irani, Blind super-resolution kernel estimation using an internal-gan, in: NeurIPS, 2019.
- (7) N. Ahn, J. Yoo, K.-A. Sohn, Simusr: A simple but strong baseline for unsupervised image super-resolution, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2020, pp. 474–475.
- (8) C. Dong, C. C. Loy, K. He, X. Tang, Learning a deep convolutional network for image super-resolution, in: European conference on computer vision, Springer, 2014, pp. 184–199.
- (9) K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, in: Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 770–778.
- (10) K. Jiang, Z. Wang, P. Yi, J. Jiang, Hierarchical dense recursive network for image super-resolution, Pattern Recognition 107 (2020) 107475.
- (11) K. Zhang, W. Zuo, L. Zhang, Learning a single convolutional super-resolution network for multiple degradations, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 3262–3271.
- (12) L. Wang, Z. Huang, Y. Gong, C. Pan, Ensemble based deep networks for image super-resolution, Pattern recognition 68 (2017) 191–198.
- (13) Y. Wang, L. Wang, H. Wang, P. Li, H. Lu, Blind single image super-resolution with a mixture of deep networks, Pattern Recognition 102 (2020) 107169.
- (14) Y. Liang, R. Timofte, J. Wang, S. Zhou, Y. Gong, N. Zheng, Single-image super-resolution-when model adaptation matters, Pattern Recognition (2021) 107931.
- (15) T. Michaeli, M. Irani, Nonparametric blind super-resolution, in: Proceedings of the IEEE International Conference on Computer Vision, 2013, pp. 945–952.
- (16) J. Gu, H. Lu, W. Zuo, C. Dong, Blind super-resolution with iterative kernel correction, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 1604–1613.
- (17) D. Ulyanov, A. Vedaldi, V. Lempitsky, Deep image prior, in: Proceedings of the IEEE conference on computer vision and pattern recognition, 2018, pp. 9446–9454.
- (18) D. Ren, K. Zhang, Q. Wang, Q. Hu, W. Zuo, Neural blind deconvolution using deep priors, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 3341–3350.
- (19) J. Pan, D. Sun, H. Pfister, M.-H. Yang, Blind image deblurring using dark channel prior, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 1628–1636.
- (20) J. Pan, Z. Hu, Z. Su, M.-H. Yang, Deblurring text images via l0-regularized intensity and gradient prior, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014, pp. 2901–2908.
- (21) T. Michaeli, M. Irani, Blind deblurring using internal patch recurrence, in: European conference on computer vision, Springer, 2014, pp. 783–798.
- (22) S. Arora, N. Cohen, E. Hazan, On the optimization of deep networks: Implicit acceleration by overparameterization, in: International Conference on Machine Learning, PMLR, 2018, pp. 244–253.
- (23) K. Simonyan, A. Zisserman, Very deep convolutional networks for large-scale image recognition, CoRR abs/1409.1556.
- (24) E. Agustsson, R. Timofte, Ntire 2017 challenge on single image super-resolution: Dataset and study, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2017, pp. 126–135.
- (25) Z. Wang, A. C. Bovik, H. R. Sheikh, E. P. Simoncelli, Image quality assessment: from error visibility to structural similarity, IEEE transactions on image processing 13 (4) (2004) 600–612.
- (26) Y. Blau, R. Mechrez, R. Timofte, T. Michaeli, L. Zelnik-Manor, The 2018 pirm challenge on perceptual image super-resolution, in: Proceedings of the European Conference on Computer Vision (ECCV) Workshops, 2018.
- (27) R. Zhang, P. Isola, A. A. Efros, E. Shechtman, O. Wang, The unreasonable effectiveness of deep features as a perceptual metric, in: Proceedings of the IEEE conference on computer vision and pattern recognition, 2018, pp. 586–595.
- (28) V. Cornillere, A. Djelouah, W. Yifan, O. Sorkine-Hornung, C. Schroers, Blind image super-resolution with spatially variant degradations, ACM Transactions on Graphics (TOG) 38 (6) (2019) 1–13.
- (29) Z.-S. Liu, W.-C. Siu, L.-W. Wang, C.-T. Li, M.-P. Cani, Unsupervised real image super-resolution via generative variational autoencoder, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2020, pp. 442–443.
- (30) X. Ji, Y. Cao, Y. Tai, C. Wang, J. Li, F. Huang, Real-world super-resolution via kernel estimation and noise injection, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2020, pp. 466–467.
- (31) Y. Zhang, Y. Tian, Y. Kong, B. Zhong, Y. Fu, Residual dense network for image super-resolution, in: Proceedings of the IEEE conference on computer vision and pattern recognition, 2018, pp. 2472–2481.
- (32) B. Lim, S. Son, H. Kim, S. Nah, K. Mu Lee, Enhanced deep residual networks for single image super-resolution, in: Proceedings of the IEEE conference on computer vision and pattern recognition workshops, 2017, pp. 136–144.
- (33) K. Zhang, W. Zuo, S. Gu, L. Zhang, Learning deep cnn denoiser prior for image restoration, in: Proceedings of the IEEE conference on computer vision and pattern recognition, 2017, pp. 3929–3938.
- (34) D. Martin, C. Fowlkes, D. Tal, J. Malik, A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics, in: Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001, Vol. 2, IEEE, 2001, pp. 416–423.
- (35) G. Boracchi, A. Foi, Modeling the performance of image restoration from motion blur, IEEE Transactions on Image Processing 21 (8) (2012) 3502–3517.
- (36) A. Levin, Y. Weiss, F. Durand, W. T. Freeman, Understanding and evaluating blind deconvolution algorithms, in: 2009 IEEE Conference on Computer Vision and Pattern Recognition, IEEE, 2009, pp. 1964–1971.
- (37) M. Bevilacqua, A. Roumy, C. Guillemot, M. L. Alberi-Morel, Low-complexity single-image super-resolution based on nonnegative neighbor embedding, in: British Machine Vision Conference (BMVC), 2012.
- (38) R. Zeyde, M. Elad, M. Protter, On single image scale-up using sparse-representations, in: International conference on curves and surfaces, Springer, 2010, pp. 711–730.
- (39) D. Martin, C. Fowlkes, D. Tal, J. Malik, A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics, in: Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001, Vol. 2, IEEE, 2001, pp. 416–423.
- (40) J.-B. Huang, A. Singh, N. Ahuja, Single image super-resolution from transformed self-exemplars, in: Proceedings of the IEEE conference on computer vision and pattern recognition, 2015, pp. 5197–5206.