CISRDCNN: Super-resolution of compressed images using deep convolutional neural networks

09/19/2017 ∙ by Honggang Chen, et al. ∙ Sichuan University

In recent years, much research has been conducted on image super-resolution (SR). To the best of our knowledge, however, few SR methods have addressed compressed images. The SR of compressed images is a challenging task due to the complicated compression artifacts, although many images suffer from them in practice. The intuitive solution for this difficult task is to decouple it into two sequential but independent subproblems, i.e., compression artifacts reduction (CAR) and SR. Nevertheless, some useful details may be removed in the CAR stage, which is contrary to the goal of SR and makes the SR stage more challenging. In this paper, an end-to-end trainable deep convolutional neural network is designed to perform SR on compressed images (CISRDCNN), which reduces compression artifacts and improves image resolution jointly. Experiments on compressed images produced by JPEG (we take JPEG as an example in this paper) demonstrate that the proposed CISRDCNN yields state-of-the-art SR performance on commonly used test images and image sets. The results of CISRDCNN on real low quality web images are also very impressive, with obvious quality enhancement. Further, we explore the application of the proposed SR method in low bit-rate image coding, leading to better rate-distortion performance than JPEG.


1 Introduction

Single image super-resolution (SISR) refers to estimating a high-resolution (HR) image from a single low-resolution (LR) observation, and it is of great significance to many image processing and analysis systems. However, the SISR problem is very challenging due to its ill-posed nature: an LR image corresponds to a set of HR images, most of which are not the desired one. In general, the reconstructed HR image should be visually pleasant and as close to the real one as possible.

The SISR problem has been widely researched over the past 20 years and plenty of algorithms have been proposed. Roughly speaking, interpolation-based [1, 2, 3, 4, 5, 6, 7, 8, 9], reconstruction-based [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20], and learning-based [21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43] algorithms are the three main classes of SISR methods. Generally, interpolation-based super-resolution (SR) approaches estimate the unknown HR pixels from their neighborhoods (the known LR pixels) according to local structure properties. Reconstruction-based methods integrate the observation model of the LR image and prior knowledge of the HR image into an energy function, so that the SR task can be converted into an optimization problem. The prior knowledge, which greatly affects SR performance, is the research focus of this kind of method. Commonly used priors include gradient [10, 11, 12], sparsity [13, 14, 15], and nonlocal self-similarity [14, 15, 16, 17, 18, 19, 20] priors. Many reconstruction-based SR methods combine two or more priors to exploit their complementary properties. Learning-based methods usually adopt a pre-trained mapping between LR and HR images to guide the SR process. According to their core techniques, they can be further divided into five subclasses: neighbor embedding-based [21, 22, 23], example-based [24, 25, 26, 27], sparse coding-based [28, 29, 30, 31, 32], regression-based [33, 34, 35, 36], and deep learning-based [37, 38, 39, 40, 41, 42, 43] methods. With fast execution speed and outstanding restoration quality, deep learning-based methods show great potential for the SR problem. Meanwhile, some researchers have attempted to combine different kinds of SR methods to integrate their merits [44, 45].

In some practical applications, such as mobile communication and the internet, limited storage capacity and transmission bandwidth mean that images and videos are generally downsampled and compressed to reduce data volume. In these cases, the observations usually suffer from both downsampling and compression degradations, which makes the SR problem more difficult. Although much research has been done on the SISR problem and plenty of effective SISR methods have been proposed over the past few decades, few methods have addressed compressed images [46, 47, 48, 49, 50, 51]. Roughly, there are two kinds of frameworks for compressed image SR. Some researchers converted this task into an optimization problem via compression process modeling and prior knowledge regularization. In SRCDFOE [46], the compression distortion is treated as spatially correlated Gaussian noise, and Markov random field and total variation priors are used to regularize the estimated HR images. To realize decompression and SR simultaneously, DCSRMOTV [47] incorporates a multi-order total variation model into the JPEG image acquisition model. For this type of method, the main difficulty is how to accurately model the compression process. In addition, it is difficult to balance compression artifacts reduction (CAR) and detail preservation. Another commonly used strategy is to decompose this task into two subproblems (i.e., CAR and SR) and address them in a cascading framework. For example, Xiong et al. [48] combined adaptive regularization and learning-based SR to reduce compression noise and compensate details, respectively. Kang et al. [49] proposed a sparse coding-based SR method for compressed images, in which the patches with and without compression artifacts are processed differently. Using a denoised training dataset, Lee et al. [50] presented a dual-learning-based algorithm for compression noise reduction and SR. More recently, Zhao et al. [51] constructed a three-step framework for compressed image SR, composed of BM3D filtering-based compression noise reduction, local encoding-based patch classification, and mapping-based reconstruction. However, for most of these algorithms, compression noise reduction and upsampling are treated as two independent stages. Consequently, the resultant images of existing methods tend to either still contain compression noise or be over-smoothed. On the whole, research on the SR of compressed images is lacking and there is still much room for performance improvement.

Figure 1: Illustration of JPEG compressed image SR on test image Zebra (SR factor: 2, QF: 10). (a) Original image. (b) JPEG compressed LR image. (c) Result of Bicubic on (b). (d) Result of CISRDCNN on (b). Obviously, our result (d) is more visually pleasant than (b) and (c). Please zoom in to view details and make comparisons.

The core issue of compressed image SR is how to reduce compression noise and preserve details as much as possible while enhancing image resolution. On the one hand, it is hard to remove compression artifacts from super-resolved images without a CAR or denoising stage; worse, the compression noise in the LR image may be significantly magnified in the HR result. On the other hand, the CAR and SR operations should not be separated, as part of the details removed in the CAR stage are useful for SR. On the basis of the above insights, an end-to-end trainable deep convolutional neural network (CNN) is designed to perform SR on compressed images, which we name CISRDCNN. The CISRDCNN takes the compressed LR image as input and outputs the resultant HR image directly, without any preprocessing or postprocessing stage. Fig. 1 gives an example of the result of CISRDCNN; our result is much more visually pleasant than the LR input and the result of Bicubic interpolation. The framework of the proposed CISRDCNN is illustrated in Fig. 2, and our contributions in this work are mainly in the following aspects:

  • We propose a deep CNN-based SR framework for compressed images, which reduces compression artifacts and enhances image resolution simultaneously.

  • To preserve the functions of the different modules in CISRDCNN and achieve joint optimization of CAR and SR, a special strategy is used to train the proposed network: individual training followed by joint optimization.

  • Extensive experiments show that the proposed CISRDCNN achieves outstanding SR performance in simulation experiments as well as in tests on real low quality web images.

  • We explore the application of the proposed CISRDCNN in low bit-rate image coding, and the experimental results demonstrate that it improves the rate-distortion performance of JPEG over a wide range of coding bit-rates.

  • The proposed method can be easily extended to other compression standards, such as JPEG 2000, H.264, HEVC, etc. In addition, this work provides some insights on the SR of low quality LR images (e.g., noisy and blurry ones), and we hope it will draw more attention to this kind of problem.

The rest of this paper is organized as follows. Section 2 briefly reviews related works. Section 3 presents the proposed CISRDCNN. Extensive experiments are shown in Section 4. Finally, Section 5 concludes this paper.

Figure 2: The flowchart of CISRDCNN. Top: the architecture of CISRDCNN. Bottom: the illustration of reconstruction process.

2 Related work

2.1 Problem formulation of compressed image SR

Let $x$ and $y$ denote the HR and LR images, respectively. Then the conventional SR problem can be formulated as

$$ y = Dx + n, \qquad (1) $$

where $D$ represents the composite operator of blurring and downsampling, and $n$ denotes the additive noise. The aim of SR is to obtain a high quality estimate of $x$ from $y$.

In this work, we address the problem of compressed image SR. Therefore, the compression process must be taken into account. Let $C(\cdot)$ be the composite operator of compression and decompression; Eq. (1) then becomes

$$ y_c = C(Dx), \qquad (2) $$

where $y_c$ is the compressed LR image. Note that we neglect the additive noise $n$ in this work, as our focus is compression artifacts. Correspondingly, estimating $x$ from the compressed LR observation $y_c$ is our goal. For convenience and clarity of representation, we still write $y = Dx$, so we have $y_c = C(y)$. Hence, $y$ suffers only from blurring and downsampling, while $y_c$ suffers from blurring, downsampling, and compression degradations. The compression process causes information loss, especially at high compression ratios. Intuitively, it is much harder to recover $x$ from $y_c$ than from $y$. Therefore, we use the intermediate observation $y$ to assist the reconstruction process in the proposed CISRDCNN. Note that the intermediate observation $y$ exists only in the training phase; the only input in the testing phase is $y_c$.

Many compression methods have been proposed for still images; nevertheless, JPEG remains one of the most widely used standards. Hence, in this work, we take JPEG as an example to test the performance of the proposed CISRDCNN.
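To make the degradation in Eq. (2) concrete, the following Python sketch (ours, not part of the original paper) simulates the compressed LR observation $y_c$ with Pillow. The scale and quality values mirror the settings used later in Section 4; note that Pillow's bicubic kernel differs slightly from Matlab's imresize, so this is only an approximation of the paper's degradation model.

```python
import io
from PIL import Image

def degrade(hr_path, scale=2, quality=10):
    """Simulate Eq. (2): bicubic downsampling followed by JPEG coding.

    Returns the compressed LR observation y_c as a PIL image.
    """
    hr = Image.open(hr_path).convert("L")           # luminance only, as in Section 4
    lr_size = (hr.width // scale, hr.height // scale)
    y = hr.resize(lr_size, Image.BICUBIC)           # intermediate observation y = Dx
    buf = io.BytesIO()
    y.save(buf, format="JPEG", quality=quality)     # C(.): JPEG encode/decode round trip
    buf.seek(0)
    return Image.open(buf)

# Example (hypothetical file name): y_c = degrade("zebra.png", scale=2, quality=10)
```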

2.2 Deep neural networks for image SR

Deep neural networks have been widely used to address image restoration problems, including SR [37, 38, 39, 40, 41, 42, 43, 52, 53], denoising [52, 53], CAR [52, 53, 54, 55], deblurring [56], dehazing [57], etc. In this section, we review some relevant deep neural network-based SR methods.

In [37], Dong et al. proposed a CNN-based SR framework (SRCNN) composed of three convolutional layers, which realize patch extraction, non-linear mapping, and reconstruction, respectively. The SRCNN has drawn wide attention for its excellent performance and simple network architecture. Later on, Dong et al. [40] presented an accelerated version of SRCNN, named FSRCNN. The FSRCNN incorporates the upsampling operation into the network and has an hourglass-shaped structure, thus achieving remarkable restoration quality and fast execution speed. By contrast with more recent models, the SRCNN and FSRCNN are relatively shallow. Kim et al. [41] designed a 20-layer CNN (VDSR), which produces a great performance enhancement over SRCNN. For more stable training and better performance, residual learning and gradient clipping are used in VDSR. More recently, Zhang et al. [53] proposed a similar network (DnCNN), which combines further advances in deep learning, including residual learning [58], batch normalization [59], and the Rectified Linear Unit (ReLU) [60]. The DnCNN shows great effectiveness on several general image restoration problems, including denoising, SR, and CAR. Overall, deep neural network-based SR methods achieve compelling performance, and most of them are efficient in the testing phase.

Nevertheless, to the best of our knowledge, very little research has been done on deep neural network-based SR methods for compressed images. The aim of this work is to propose an effective SR method for compressed images using the considerable advances in deep CNNs. We design a deep network that realizes CAR and SR jointly.

2.3 Advances on deep neural networks

In recent years, many advances have been made in deep learning. In the following, we introduce some representative achievements related to this work, i.e., residual learning [58], batch normalization [59], and ReLU [60].

2.3.1 Residual learning

In [58], He et al. first proposed the residual learning strategy to address the performance degradation problem caused by increasing network depth. The main assumption of residual learning is that learning a residual mapping is much easier than learning the original mapping. With the residual learning framework, deeper networks can be designed and trained well, thus achieving better performance. Actually, a similar idea has also been used in many learning-based SR methods, in which the residual image between the ground truth and an initialization (generally, the interpolated image) is predicted, e.g., ScSR [29], ANR [33], and A+ [34].

2.3.2 Batch normalization

In order to ease internal covariate shift, Ioffe et al. [59] presented batch normalization. Internal covariate shift refers to the change in the distribution of each layer's input, which slows down the training process and makes it harder. To address this problem, Ioffe et al. proposed to normalize layer inputs. More specifically, in each layer, a normalization step and a scale-and-shift step are inserted before the nonlinearity. To realize batch normalization, two extra parameters are added per activation, and these parameters are learned during network training. Batch normalization brings many benefits, such as strong robustness to initialization, fewer training steps, and better performance.
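As a rough illustration (ours, not from [59]), the per-activation transform described above can be sketched in a few lines of NumPy; `gamma` and `beta` are the two learnable parameters mentioned in the text.

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Batch normalization over a mini-batch (in the spirit of [59]).

    x: array of shape (batch, features); gamma/beta: the two learnable
    scale/shift parameters added per activation.
    """
    mu = x.mean(axis=0)                     # mini-batch mean
    var = x.var(axis=0)                     # mini-batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)   # normalization step
    return gamma * x_hat + beta             # scale-and-shift step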

2.3.3 ReLU

ReLU is a commonly used activation function in deep neural networks, which outputs zero for non-positive inputs and retains positive inputs [60]. The definition of ReLU is

$$ f(x) = \max(0, x). \qquad (3) $$

The ReLU alleviates the gradient vanishing problem to some extent, thus making the training of deep neural networks easier.

For a fast and stable training procedure and excellent restoration quality, the proposed CISRDCNN integrates residual learning, batch normalization, and ReLU. Unlike DnCNN [53], which employs a single residual unit, CISRDCNN uses two residual units because the input and output of CISRDCNN differ in resolution. More details about CISRDCNN are given in Section 3.
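For illustration, a residual Conv-BN-ReLU stack of the kind just described might be sketched in PyTorch as follows; the depth and width below are placeholders, not the paper's settings.

```python
import torch.nn as nn

class ResidualCNN(nn.Module):
    """Stack of Conv-BN-ReLU layers with an identity (residual) connection,
    in the spirit of DnCNN [53]. Depth and width here are illustrative."""
    def __init__(self, depth=5, width=64):
        super().__init__()
        layers = [nn.Conv2d(1, width, 3, padding=1), nn.ReLU(inplace=True)]
        for _ in range(depth - 2):
            layers += [nn.Conv2d(width, width, 3, padding=1),
                       nn.BatchNorm2d(width),
                       nn.ReLU(inplace=True)]
        layers += [nn.Conv2d(width, 1, 3, padding=1)]  # predicts the residual
        self.body = nn.Sequential(*layers)

    def forward(self, x):
        return x + self.body(x)  # identity connection adds the input back
```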

3 The proposed CISRDCNN

In this section, we present the CISRDCNN in detail. As illustrated in Fig. 2, the proposed CISRDCNN consists of three modules: a deblocking module (DBCNN), an upsampling module (USCNN), and a quality enhancement module (QECNN). Firstly, the DBCNN removes compression artifacts in the input and generates a better input for USCNN. Secondly, the USCNN magnifies its input to the expected resolution, so no extra interpolation procedure is needed. Finally, the QECNN further improves the quality of the upsampled image. Although the three modules have their respective functions, they are not independent, and the whole network is end-to-end trainable. Overall, the differences between CISRDCNN and relevant CNN-based SR methods (e.g., SRCNN [37], FSRCNN [40], VDSR [41], DnCNN [53]) are mainly in the following aspects:

  • The CISRDCNN is for compressed images, and thus compression noise is taken into consideration carefully. Nevertheless, most of the learning-based SR methods do not apply to noisy images.

  • The CISRDCNN is composed of three functional modules, however, it is end-to-end trainable. In this way, the functions of the three modules can be preserved to some extent; meanwhile, the whole network can be optimized to produce minimum prediction error.

  • The CISRDCNN is trained in a particular way. It firstly trains the three functional modules separately to achieve their respective goals, then optimizes the whole network jointly with the fine-tuning strategy.

In CISRDCNN, these specific designs and improvements for compressed images enable a more accurate estimation of the ground truth. In the following, more details about the architecture and training strategy of CISRDCNN are presented.

3.1 Network architecture

For convenience of presentation, the depths of DBCNN, USCNN, and QECNN are denoted as $d_1$, $d_2$, and $d_3$, respectively.

3.1.1 DBCNN

The DBCNN is composed of two types of layers. The first $d_1 - 1$ convolutional layers are each followed by batch normalization and ReLU, as in [53]. The last layer (the $d_1$-th layer) generates the restored image with a single filter. Since the input and output of DBCNN are very similar, learning the residual image is more suitable. Hence, we adopt the residual learning strategy in this module: an identity connection passes the input of DBCNN to its output. Note that the batch normalization and ReLU operations are omitted from Fig. 2 for brevity.

3.1.2 USCNN

The first $d_2 - 1$ convolutional layers of USCNN share the same configuration, each followed by batch normalization and ReLU. The last layer is a deconvolutional layer, which performs the upsampling operation and produces the upsampled image.

3.1.3 QECNN

The architecture of QECNN is similar to DBCNN. Therefore, we do not introduce the QECNN in detail to avoid redundancy.
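Since the layer counts and filter sizes are elided in this version of the text, the following PyTorch sketch only mirrors the module structure of Fig. 2; all depths, widths, and kernel sizes below are our assumptions, and the deconvolution geometry is valid for a scale factor of 2.

```python
import torch.nn as nn

def conv_block(in_c, out_c, bn=True):
    # Conv + (BN) + ReLU: the repeating unit in all three modules
    layers = [nn.Conv2d(in_c, out_c, 3, padding=1)]
    if bn:
        layers.append(nn.BatchNorm2d(out_c))
    layers.append(nn.ReLU(inplace=True))
    return layers

class CISRDCNNSketch(nn.Module):
    """Structural sketch of DBCNN -> USCNN -> QECNN (Fig. 2). Depths d1/d2/d3
    and the width are placeholders; the paper's exact values are not given here."""
    def __init__(self, d1=8, d2=5, d3=8, width=64, scale=2):
        super().__init__()
        # DBCNN: residual deblocking in LR space
        db = conv_block(1, width, bn=False)
        for _ in range(d1 - 2):
            db += conv_block(width, width)
        db += [nn.Conv2d(width, 1, 3, padding=1)]
        self.dbcnn = nn.Sequential(*db)
        # USCNN: feature extraction + deconvolutional upsampling
        us = conv_block(1, width, bn=False)
        for _ in range(d2 - 2):
            us += conv_block(width, width)
        us += [nn.ConvTranspose2d(width, 1, scale * 2,
                                  stride=scale, padding=scale // 2)]
        self.uscnn = nn.Sequential(*us)
        # QECNN: residual quality enhancement in HR space (same form as DBCNN)
        qe = conv_block(1, width, bn=False)
        for _ in range(d3 - 2):
            qe += conv_block(width, width)
        qe += [nn.Conv2d(width, 1, 3, padding=1)]
        self.qecnn = nn.Sequential(*qe)

    def forward(self, y_c):
        y_hat = y_c + self.dbcnn(y_c)    # deblocked LR (residual learning)
        x_up = self.uscnn(y_hat)         # upsampled HR estimate
        return x_up + self.qecnn(x_up)   # enhanced HR output (residual learning)
```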

Figure 3: The flowchart of CISRDCNN training procedure.

3.2 Network training strategy

Let $\{(x_i, y_i, y_{c,i})\}_{i=1}^{N}$ be the training image triplets. As introduced in Section 2.1, $x_i$ denotes an HR sample, $y_i$ denotes the corresponding LR sample that suffers only from blurring and downsampling, and $y_{c,i}$ represents the compressed version of $y_i$.

As shown in Fig. 3, the training of CISRDCNN is composed of four main steps. Firstly, the set $\{(y_{c,i}, y_i)\}$ is used to train the deblocking network DBCNN. As we adopt the residual learning strategy, the goal is to learn a residual mapping $F_{DB}$ that predicts the residual image $r_i = y_i - y_{c,i}$. Consequently, the loss function of DBCNN is defined as

$$ L_{DB}(\Theta_{DB}) = \frac{1}{2N} \sum_{i=1}^{N} \left\| F_{DB}(y_{c,i}; \Theta_{DB}) - r_i \right\|_F^2, \qquad (4) $$

where $\Theta_{DB}$ denotes the trainable parameter set of DBCNN and $r_i = y_i - y_{c,i}$.

Secondly, we train the upsampling network USCNN. Once the training of DBCNN is finished, we can get the estimate of $y_i$ (denoted as $\hat{y}_i$) from its compressed observation $y_{c,i}$. The training set for USCNN is $\{(\hat{y}_i, x_i)\}$. That is, USCNN aims to learn a function $F_{US}$ that maps $\hat{y}_i$ to $x_i$. Formally, the loss function of USCNN is defined as

$$ L_{US}(\Theta_{US}) = \frac{1}{2N} \sum_{i=1}^{N} \left\| F_{US}(\hat{y}_i; \Theta_{US}) - x_i \right\|_F^2, \qquad (5) $$

where $\Theta_{US}$ denotes the trainable parameter set of USCNN.

Thirdly, we train the quality enhancement network QECNN. Similarly, the HR version of $y_{c,i}$ is estimated using the learned DBCNN and USCNN, and the estimate is denoted as $\hat{x}_i$. Correspondingly, the training set for QECNN is $\{(\hat{x}_i, x_i)\}$. In QECNN, we also adopt residual learning, and thus the goal is to learn a residual mapping $F_{QE}$ that predicts the residual image $e_i = x_i - \hat{x}_i$. Hence, the loss function of QECNN is defined as

$$ L_{QE}(\Theta_{QE}) = \frac{1}{2N} \sum_{i=1}^{N} \left\| F_{QE}(\hat{x}_i; \Theta_{QE}) - e_i \right\|_F^2, \qquad (6) $$

where $\Theta_{QE}$ denotes the trainable parameter set of QECNN and $e_i = x_i - \hat{x}_i$.

Finally, the CISRDCNN is optimized in an end-to-end manner. The learned parameters of DBCNN, USCNN, and QECNN are first used to initialize CISRDCNN, and then the training sample set $\{(y_{c,i}, x_i)\}$ is used to optimize the whole network with the fine-tuning strategy. The loss function for the end-to-end optimization procedure is defined as

$$ L(\Theta) = \frac{1}{2N} \sum_{i=1}^{N} \left\| F(y_{c,i}; \Theta) - x_i \right\|_F^2, \qquad (7) $$

where $\Theta$ denotes the trainable parameter set of the whole CISRDCNN.

In CISRDCNN, the three modules have specific functions, i.e., deblocking, upsampling, and quality enhancement. With the above training strategy, the goal of each module can be achieved, while the final joint optimization minimizes the overall prediction error. Moreover, training a deep network from scratch is hard; initializing the deep network with learned parameters yields a stable training procedure and fast convergence. Note that for different compression quality factors (QFs), the networks can also be trained from an already-learned model with the fine-tuning strategy, rather than from scratch.
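A condensed sketch of this four-step schedule is given below, assuming the `CISRDCNNSketch` module from Section 3.1 and mean-squared-error losses corresponding to Eqs. (4)-(7); the optimizer choice and learning rate are placeholders, and each batch from `loader` is assumed to hold the triplet (x, y, y_c).

```python
import torch
import torch.nn.functional as F

def train_cisrdcnn(model, loader, epochs_per_stage=1, lr=1e-4):
    def run(params, step_fn):
        opt = torch.optim.Adam(params, lr=lr)
        for _ in range(epochs_per_stage):
            for x, y, y_c in loader:
                opt.zero_grad()
                step_fn(x, y, y_c).backward()
                opt.step()

    # Step 1, Eq. (4): DBCNN predicts the residual y - y_c
    run(model.dbcnn.parameters(),
        lambda x, y, y_c: F.mse_loss(model.dbcnn(y_c), y - y_c))
    # Step 2, Eq. (5): USCNN maps the (frozen) deblocked LR estimate to x
    run(model.uscnn.parameters(),
        lambda x, y, y_c: F.mse_loss(
            model.uscnn((y_c + model.dbcnn(y_c)).detach()), x))
    # Step 3, Eq. (6): QECNN predicts the HR residual x - x_hat
    def qe_loss(x, y, y_c):
        x_hat = model.uscnn(y_c + model.dbcnn(y_c)).detach()
        return F.mse_loss(model.qecnn(x_hat), x - x_hat)
    run(model.qecnn.parameters(), qe_loss)
    # Step 4, Eq. (7): end-to-end fine-tuning of the whole network
    run(model.parameters(),
        lambda x, y, y_c: F.mse_loss(model(y_c), x))
```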

Figure 4: Test images (Set10), from left to right and top to bottom: Butterfly, House, Parrot, Woman, Circuit, Leaves, Foreman, Zebra, Peppers, Ppt3.
Test Images Butterfly House Parrot Woman Circuit Leaves Foreman Zebra Peppers Ppt3 Average
Quality Factor = 10
Bicubic 22.691 27.392 26.234 26.284 23.373 21.455 27.759 25.654 28.489 23.258 25.259
A+ [34] 23.182 27.915 26.642 27.046 23.832 21.905 28.801 26.375 29.333 23.159 25.819
FSRCNN [40] 23.908 28.676 27.214 27.522 24.636 22.974 29.302 26.784 29.809 24.577 26.540
VDSR [41] 24.193 29.439 26.502 27.873 24.444 22.676 29.735 26.838 30.144 24.749 26.659
CONCOLOR-VDSR [61, 41] 23.454 28.292 26.918 27.323 24.397 22.336 29.555 26.801 29.795 24.202 26.307
ARCNN-VDSR [54, 41] 24.115 28.388 27.153 27.526 24.586 23.222 29.602 26.668 29.846 24.748 26.585
SRCDFOE [46] 23.174 28.093 26.653 26.876 23.896 21.840 28.374 26.271 29.351 23.722 25.825
LJSRDB [49] 22.667 27.591 26.329 26.336 23.393 21.451 28.126 25.765 28.798 23.198 25.365
Proposed CISRDCNN 24.534 30.034 27.545 28.177 25.131 23.761 30.271 27.335 30.381 25.188 27.236
Quality Factor = 20
Bicubic 23.888 29.122 27.637 27.923 25.000 23.074 29.740 27.243 30.452 24.511 26.859
A+ [34] 24.570 29.888 27.971 28.796 25.708 23.894 31.072 28.192 31.411 24.891 27.639
FSRCNN [40] 25.374 30.528 28.589 29.219 26.433 24.907 31.642 28.479 31.705 26.203 28.308
VDSR [41] 25.491 31.292 28.164 29.450 26.167 24.844 31.865 28.366 31.966 26.387 28.399
CONCOLOR-VDSR [61, 41] 24.628 29.344 28.093 28.804 25.779 23.969 31.683 28.109 31.415 25.941 27.777
ARCNN-VDSR [54, 41] 25.429 29.536 28.531 28.990 26.235 24.951 31.647 28.241 31.652 26.470 28.168
SRCDFOE [46] 24.423 29.809 27.976 28.509 25.515 23.588 30.343 27.819 31.141 25.194 27.432
LJSRDB [49] 23.858 28.991 27.615 27.832 24.686 22.777 29.922 27.148 30.427 24.455 26.771
Proposed CISRDCNN 25.931 31.652 28.983 30.026 26.947 26.102 32.246 28.951 32.194 27.179 29.021
Quality Factor = 30
Bicubic 24.508 29.660 28.510 28.807 25.738 23.962 30.724 28.132 31.355 25.130 27.653
A+ [34] 25.289 30.431 29.066 29.742 26.568 24.980 32.194 29.162 32.342 25.838 28.561
FSRCNN [40] 26.089 31.103 29.554 30.198 27.310 26.148 32.610 29.326 32.594 27.141 29.207
VDSR [41] 26.304 31.878 29.354 30.500 27.293 26.145 32.991 29.465 32.860 27.752 29.454
CONCOLOR-VDSR [61, 41] 25.316 29.992 29.085 29.630 26.455 24.972 32.405 28.975 32.252 27.091 28.617
ARCNN-VDSR [54, 41] 26.257 29.816 29.518 30.221 27.048 26.080 32.586 29.376 32.663 27.637 29.120
SRCDFOE [46] 25.108 30.351 28.926 29.413 26.248 24.601 31.275 28.792 31.982 26.050 28.275
LJSRDB [49] 24.530 29.321 28.548 28.752 24.853 23.286 30.621 27.987 31.238 24.843 27.398
Proposed CISRDCNN 26.646 32.214 29.890 30.918 27.847 27.305 33.257 29.922 33.074 28.481 29.955
Table 1: PSNR (dB) scores of different methods on Set10 (SR factor: 2, QF: 10/20/30).
Test Images Butterfly House Parrot Woman Circuit Leaves Foreman Zebra Peppers Ppt3 Average
Quality Factor = 10
Bicubic 0.6790 0.7499 0.7742 0.7684 0.6956 0.6680 0.7771 0.7996 0.7550 0.7876 0.7454
A+ [34] 0.7180 0.7870 0.8054 0.8098 0.7317 0.7172 0.8224 0.8342 0.7914 0.8231 0.7840
FSRCNN [40] 0.7437 0.8084 0.8250 0.8319 0.7694 0.7953 0.8450 0.8513 0.8081 0.8721 0.8150
VDSR [41] 0.7575 0.8232 0.8279 0.8406 0.7741 0.7976 0.8551 0.8562 0.8150 0.8831 0.8230
CONCOLOR-VDSR [61, 41] 0.7422 0.8057 0.8255 0.8317 0.7705 0.7549 0.8486 0.8556 0.8118 0.8697 0.8116
ARCNN-VDSR [54, 41] 0.7494 0.8050 0.8278 0.8317 0.7678 0.8023 0.8477 0.8466 0.8089 0.8720 0.8159
SRCDFOE [46] 0.7213 0.7936 0.8102 0.8113 0.7404 0.7164 0.8191 0.8369 0.7971 0.8451 0.7891
LJSRDB [49] 0.6888 0.7706 0.7921 0.7827 0.7029 0.6728 0.8013 0.8204 0.7781 0.8039 0.7614
Proposed CISRDCNN 0.7699 0.8310 0.8410 0.8494 0.7937 0.8326 0.8640 0.8655 0.8222 0.8958 0.8365
Quality Factor = 20
Bicubic 0.7369 0.7999 0.8211 0.8178 0.7596 0.7404 0.8273 0.8479 0.8029 0.8299 0.7984
A+ [34] 0.7704 0.8263 0.8414 0.8500 0.7996 0.8059 0.8652 0.8718 0.8283 0.8748 0.8334
FSRCNN [40] 0.7916 0.8410 0.8547 0.8666 0.8234 0.8570 0.8818 0.8819 0.8380 0.9056 0.8542
VDSR [41] 0.8014 0.8478 0.8576 0.8735 0.8288 0.8676 0.8868 0.8856 0.8431 0.9202 0.8612
CONCOLOR-VDSR [61, 41] 0.7835 0.8301 0.8535 0.8634 0.8159 0.8166 0.8863 0.8815 0.8369 0.9036 0.8471
ARCNN-VDSR [54, 41] 0.7902 0.8370 0.8548 0.8616 0.8186 0.8550 0.8795 0.8762 0.8365 0.9046 0.8514
SRCDFOE [46] 0.7694 0.8317 0.8427 0.8488 0.7980 0.7939 0.8582 0.8701 0.8280 0.8820 0.8323
LJSRDB [49] 0.7444 0.8164 0.8321 0.8265 0.7638 0.7409 0.8446 0.8562 0.8141 0.8433 0.8082
Proposed CISRDCNN 0.8135 0.8533 0.8678 0.8822 0.8429 0.8949 0.8937 0.8938 0.8476 0.9356 0.8725
Quality Factor = 30
Bicubic 0.7656 0.8200 0.8440 0.8440 0.7889 0.7792 0.8527 0.8698 0.8225 0.8539 0.8241
A+ [34] 0.7966 0.8416 0.8638 0.8715 0.8263 0.8441 0.8847 0.8896 0.8431 0.8973 0.8559
FSRCNN [40] 0.8135 0.8513 0.8749 0.8846 0.8468 0.8845 0.8953 0.8966 0.8501 0.9232 0.8721
VDSR [41] 0.8243 0.8574 0.8798 0.8929 0.8544 0.8965 0.9021 0.9023 0.8556 0.9404 0.8806
CONCOLOR-VDSR [61, 41] 0.8047 0.8507 0.8734 0.8823 0.8359 0.8505 0.8979 0.8960 0.8490 0.9257 0.8666
ARCNN-VDSR [54, 41] 0.8165 0.8509 0.8748 0.8853 0.8412 0.8840 0.8948 0.8961 0.8504 0.9249 0.8719
SRCDFOE [46] 0.7946 0.8428 0.8625 0.8688 0.8216 0.8318 0.8756 0.8879 0.8411 0.9034 0.8530
LJSRDB [49] 0.7720 0.8269 0.8543 0.8538 0.7842 0.7745 0.8664 0.8738 0.8304 0.8673 0.8304
Proposed CISRDCNN 0.8327 0.8633 0.8853 0.8999 0.8638 0.9158 0.9081 0.9082 0.8588 0.9515 0.8887
Table 2: SSIM scores of different methods on Set10 (SR factor: 2, QF: 10/20/30).
Test Images Butterfly House Parrot Woman Circuit Leaves Foreman Zebra Peppers Ppt3 Average
Quality Factor = 10
Bicubic 1.404 0.867 0.858 0.983 1.671 1.828 0.847 1.133 0.898 1.260 1.175
A+ [34] 1.684 0.996 1.052 1.208 1.932 2.162 1.060 1.400 1.092 1.363 1.395
FSRCNN [40] 1.887 1.138 1.164 1.329 2.178 2.554 1.170 1.546 1.227 1.660 1.585
VDSR [41] 2.048 1.262 1.206 1.427 2.222 2.565 1.268 1.601 1.312 1.729 1.664
CONCOLOR-VDSR [61, 41] 2.115 1.268 1.298 1.401 2.222 2.555 1.345 1.626 1.310 1.716 1.686
ARCNN-VDSR [54, 41] 1.982 1.183 1.211 1.347 2.183 2.676 1.229 1.540 1.247 1.696 1.629
SRCDFOE [46] 1.688 1.007 1.040 1.161 1.878 2.034 0.996 1.361 1.104 1.413 1.368
LJSRDB [49] 1.478 0.903 0.922 1.024 1.699 1.888 0.920 1.253 0.968 1.276 1.233
Proposed CISRDCNN 2.268 1.413 1.395 1.552 2.464 3.070 1.430 1.763 1.435 1.901 1.869
Quality Factor = 20
Bicubic 2.030 1.248 1.284 1.437 2.374 2.572 1.298 1.640 1.355 1.719 1.696
A+ [34] 2.336 1.451 1.502 1.697 2.746 3.080 1.560 1.926 1.595 1.930 1.982
FSRCNN [40] 2.544 1.556 1.568 1.813 2.985 3.342 1.618 2.066 1.679 2.171 2.134
VDSR [41] 2.706 1.650 1.655 1.937 3.014 3.562 1.733 2.145 1.786 2.323 2.251
CONCOLOR-VDSR [61, 41] 2.710 1.627 1.683 1.873 2.889 3.320 1.831 2.085 1.725 2.279 2.202
ARCNN-VDSR [54, 41] 2.562 1.533 1.564 1.770 2.907 3.340 1.620 2.020 1.653 2.206 2.118
SRCDFOE [46] 2.283 1.401 1.439 1.603 2.591 2.785 1.418 1.836 1.531 1.916 1.880
LJSRDB [49] 2.189 1.285 1.400 1.475 2.352 2.574 1.376 1.751 1.408 1.742 1.755
Proposed CISRDCNN 2.903 1.767 1.804 2.093 3.276 4.047 1.855 2.283 1.884 2.529 2.444
Quality Factor = 30
Bicubic 2.450 1.506 1.617 1.787 2.824 3.058 1.592 1.997 1.664 2.036 2.053
A+ [34] 2.751 1.728 1.853 2.046 3.196 3.609 1.864 2.276 1.912 2.296 2.353
FSRCNN [40] 2.942 1.805 1.915 2.165 3.452 3.877 1.894 2.402 1.985 2.542 2.498
VDSR [41] 3.163 1.934 2.039 2.316 3.583 4.102 2.040 2.532 2.103 2.791 2.660
CONCOLOR-VDSR [61, 41] 3.096 1.905 2.023 2.201 3.297 3.810 2.042 2.400 2.003 2.686 2.546
ARCNN-VDSR [54, 41] 3.006 1.815 1.917 2.171 3.363 3.885 1.893 2.400 1.993 2.628 2.507
SRCDFOE [46] 2.683 1.655 1.754 1.944 3.038 3.289 1.687 2.174 1.814 2.277 2.232
LJSRDB [49] 2.599 1.484 1.727 1.819 2.664 2.958 1.633 2.043 1.666 1.996 2.059
Proposed CISRDCNN 3.310 2.032 2.155 2.434 3.751 4.559 2.130 2.609 2.189 2.915 2.808
Table 3: IFC scores of different methods on Set10 (SR factor: 2, QF: 10/20/30).

4 Experimental results

In this section, the experimental settings are introduced first, and then extensive results are presented to verify the effectiveness of CISRDCNN, including tests on real low quality web images. In addition, we take low bit-rate coding as an example to show an application of the proposed CISRDCNN.

4.1 Experimental settings

Main parameters of CISRDCNN: in our implementation, the depths $d_1$, $d_2$, and $d_3$ of DBCNN, USCNN, and QECNN are set to fixed values for all experiments.

Training data: following [41], the 291-image set consisting of 200 images from BSDS500 (available: http://www.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/segbench) and 91 images from Yang et al. [28] is used to train CISRDCNN. To increase the number of samples and improve SR performance, we also adopt data augmentation techniques. To generate LR observations, the HR images are first downsampled using Matlab's imresize function (kernel: bicubic, downsampling factor: 2), and then the downsampled images are compressed using JPEG.

Test images: Fig. 4 shows the ten test images (named Set10) used in our experiment, which are widely used to evaluate SR methods in literature. For color images, only the luminance components are processed.

Test datasets: four datasets commonly used for the SR problem are adopted to test the performance of different methods: Set5 [22], Set14 [29], B100 (the 100 test images of BSDS500), and Urban100 [26].

Degradation model: for the simulation experiments, the original HR image is first downsampled using Matlab's imresize function (kernel: bicubic, downsampling factor: 2), and then the downsampled image is compressed using JPEG with different QFs.

Comparison baselines: the comparison baselines include Bicubic, A+ [34], FSRCNN [40], VDSR [41], CONCOLOR-VDSR [61, 41], ARCNN-VDSR [54, 41] (ARCNN [54] and CONCOLOR [61] are typical and effective compression artifacts reduction methods), SRCDFOE [46], and LJSRDB [49]. For A+ [34], FSRCNN [40], and VDSR [41], we retrained their models according to our experimental settings. CONCOLOR-VDSR [61, 41] and ARCNN-VDSR [54, 41] are cascading methods that combine state-of-the-art deblocking and SR methods. SRCDFOE [46] and LJSRDB [49] are two SR algorithms designed for JPEG compressed images.

Performance evaluation: the resultant images of different methods are evaluated objectively and subjectively. For the simulation experiments, PSNR, SSIM [62], and IFC [63] are adopted for objective evaluation. For the SR of real-world compressed images, we use the no-reference quality metric for SR proposed in [64] to evaluate results objectively.
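For reference, PSNR on the luminance channel can be computed as in the snippet below (this is ours, not from the paper; SSIM [62] and IFC [63] require their authors' respective implementations).

```python
import numpy as np

def psnr(reference, estimate, peak=255.0):
    """PSNR (dB) between two luminance images stored as uint8/float arrays."""
    mse = np.mean((reference.astype(np.float64) -
                   estimate.astype(np.float64)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)
```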

4.2 Super-resolution results on synthetic LR images

4.2.1 Objective evaluation

Due to limited space, we only present the objective scores of different methods at QF = 10/20/30 in this subsection. It can be seen from the results reported in Tables 1, 2, and 3 that CISRDCNN consistently produces the highest PSNR/SSIM/IFC values. Overall, VDSR [41] generates the second-best results. FSRCNN [40] and ARCNN-VDSR [54, 41] achieve similar performance, and both are slightly inferior to VDSR [41]. A+ [34], CONCOLOR-VDSR [61, 41], and SRCDFOE [46] are superior to Bicubic, but the gains are somewhat limited. Compared with Bicubic, LJSRDB [49] produces worse results in some cases. For A+ [34], the severe compression noise in LR images causes significant performance degradation, as the method is sensitive to noise. SRCDFOE [46] and LJSRDB [49] are unified frameworks for JPEG compressed images; however, they do not handle compression noise well. By contrast, the proposed CISRDCNN produces much more obvious improvements. For example, at QF = 10, CISRDCNN achieves average PSNR/SSIM/IFC gains of 1.977 dB/0.0911/0.694 over Bicubic and 0.577 dB/0.0135/0.205 over VDSR [41]. Note that VDSR [41] is one of the state-of-the-art SR methods. Compared with the SR methods for JPEG compressed images, i.e., SRCDFOE [46] and LJSRDB [49], the average PSNR/SSIM/IFC gains are up to 1.411 dB/0.0474/0.501 and 1.871 dB/0.0751/0.636, respectively. Similar results are observed at QF = 20 and QF = 30. In sum, CISRDCNN achieves state-of-the-art performance.

Figure 5: Super-resolution results of Butterfly generated by different methods (SR factor: 2, QF: 10). (a) Original image (PSNR (dB), SSIM, IFC). (b) Bicubic (22.691, 0.6790, 1.404). (c) A+ [34] (23.182, 0.7180, 1.684). (d) FSRCNN [40] (23.908, 0.7437, 1.887). (e) VDSR [41] (24.193, 0.7575, 2.048). (f) CONCOLOR-VDSR [61, 41] (23.454, 0.7422, 2.115). (g) ARCNN-VDSR [54, 41] (24.115, 0.7494, 1.982). (h) SRCDFOE [46] (23.174, 0.7213, 1.688). (i) LJSRDB [49] (22.667, 0.6888, 1.478). (j) Proposed CISRDCNN (24.534, 0.7699, 2.268). Please zoom in to view details and make comparisons.
Figure 6: Super-resolution results of Ppt3 generated by different methods (SR factor: 2, QF: 20). (a) Original image (PSNR (dB), SSIM, IFC). (b) Bicubic (24.511, 0.8299, 1.719). (c) A+ [34] (24.891, 0.8748, 1.930). (d) FSRCNN [40] (26.203, 0.9056, 2.171). (e) VDSR [41] (26.387, 0.9202, 2.323). (f) CONCOLOR-VDSR [61, 41] (25.941, 0.9036, 2.279). (g) ARCNN-VDSR [54, 41] (26.470, 0.9046, 2.206). (h) SRCDFOE [46] (25.194, 0.8820, 1.916). (i) LJSRDB [49] (24.455, 0.8433, 1.742). (j) Proposed CISRDCNN (27.179, 0.9356, 2.529). Please zoom in to view details and make comparisons.
Figure 7: Super-resolution results of House generated by different methods (SR factor: 2, QF: 30). (a) Original image (PSNR (dB), SSIM, IFC). (b) Bicubic (29.660, 0.8200, 1.506). (c) A+ [34] (30.431, 0.8416, 1.728). (d) FSRCNN [40] (31.103, 0.8513, 1.805). (e) VDSR [41] (31.878, 0.8574, 1.934). (f) CONCOLOR-VDSR [61, 41] (29.992, 0.8507, 1.905). (g) ARCNN-VDSR [54, 41] (29.816, 0.8509, 1.815). (h) SRCDFOE [46] (30.351, 0.8428, 1.655). (i) LJSRDB [49] (29.321, 0.8269, 1.484). (j) Proposed CISRDCNN (32.214, 0.8633, 2.032). Please zoom in to view details and make comparisons.

4.2.2 Subjective evaluation

Part of the resultant images are presented for visual comparison. To show the performance of all methods comprehensively, we illustrate results at different QFs: Fig. 5 shows the results of Butterfly at QF = 10, Fig. 6 shows the results of Ppt3 at QF = 20, and Fig. 7 shows the results of House at QF = 30. For better viewing and comparison, two local regions are highlighted in each figure. The results of Bicubic, A+ [34], FSRCNN [40], SRCDFOE [46], and LJSRDB [49] contain obvious artifacts, especially at low QFs. VDSR [41], CONCOLOR-VDSR [61, 41], and ARCNN-VDSR [54, 41] can remove most of the compression artifacts; nevertheless, their results are somewhat blurred. Comparatively, the results of CISRDCNN are more visually pleasant, with fewer artifacts and clearer structures, for instance, the text in image Ppt3 (Fig. 6) and the eave in image House (Fig. 7). In sum, benefitting from the strong capability of deep CNNs and the specific design for compressed images, CISRDCNN realizes joint optimization of the compression noise reduction and SR processes, leading to state-of-the-art performance.

The results in this subsection provide some insights for further research on compressed image SR. The comparison set in this experiment is composed of different kinds of methods: conventional SR methods (A+ [34], FSRCNN [40], VDSR [41]), cascading SR methods (CONCOLOR-VDSR [61, 41], ARCNN-VDSR [54, 41]), unified SR frameworks (SRCDFOE [46], LJSRDB [49]), and a jointly optimized SR method (CISRDCNN). From their performance, we draw the following conclusion: the CAR stage is necessary, but it should not be independent of the SR stage. The CAR stage helps reduce compression artifacts; however, it is hard to control the degree of artifact reduction. Therefore, joint optimization of CAR and SR is significant. These insights may also apply to the SR of noisy and blurred images, which we will study in future work.

4.3 Robustness to quality factors

In this subsection, the robustness of CISRDCNN to compression QFs is tested. For this experiment, a series of CNN models are trained at different QFs. Fig. 8 presents the average PSNR gains of CISRDCNN over Bicubic on Set10 at different QFs. It can be observed that CISRDCNN achieves obvious PSNR gains over a wide range of QFs, even at low compression ratios. Hence, CISRDCNN is robust to QFs and applies to compressed images of different quality.

Figure 8: The average PSNR gain (dB) of the proposed CISRDCNN over Bicubic at different QFs on Set10.

4.4 Experimental results on image datasets

In order to evaluate the stability and robustness of CISRDCNN on different kinds of images, we conduct experiments on four standard image sets: Set5 [22], Set14 [29], B100, and Urban100 [26]. The images in B100 and Urban100 are cropped to generate smaller test images. Due to limited space, we only take QF = 10 as an example in this experiment, with Bicubic, VDSR [41], ARCNN-VDSR [54, 41], and SRCDFOE [46] as baselines. The average PSNR/SSIM/IFC results are reported in Table 4. It can be observed that CISRDCNN consistently outperforms all of the compared baselines.

We further plot the distributions of the PSNR/SSIM/IFC gains of CISRDCNN over the baselines in Fig. 9. One can easily see that CISRDCNN outperforms the competitors on most of the test images in the four commonly used image sets. The results shown in this subsection demonstrate the robustness and stability of CISRDCNN.

DataSets Set5 Set14 B100 Urban100
PSNR (dB)
Bicubic 26.602 25.037 24.285 22.999
VDSR [41] 27.812 25.901 24.870 23.932
ARCNN-VDSR [54, 41] 27.827 25.856 24.899 23.929
SRCDFOE [46] 27.243 25.451 24.599 23.340
Proposed CISRDCNN 28.154 26.127 25.019 24.369
SSIM
Bicubic 0.7239 0.6433 0.5863 0.6224
VDSR [41] 0.7931 0.6853 0.6179 0.6859
ARCNN-VDSR [54, 41] 0.7878 0.6803 0.6153 0.6774
SRCDFOE [46] 0.7661 0.6663 0.6025 0.6512
Proposed CISRDCNN 0.8039 0.6926 0.6238 0.7043
IFC
Bicubic 1.036 0.989 0.817 1.142
VDSR [41] 1.421 1.263 0.988 1.542
ARCNN-VDSR [54, 41] 1.398 1.240 0.983 1.510
SRCDFOE [46] 1.207 1.113 0.900 1.285
Proposed CISRDCNN 1.546 1.354 1.040 1.751
Table 4: Comparisons of average PSNR (dB)/SSIM/IFC scores on the four datasets (SR factor: 2, QF: 10).
Figure 9: Distributions of PSNR (dB)/SSIM/IFC gains of CISRDCNN over the compared methods, on all images in Set5 [22], Set14 [29], B100, and Urban100 [26] (SR factor: 2, QF: 10). (a) Distribution of PSNR (dB) gain. (b) Distribution of SSIM gain. (c) Distribution of IFC gain. The statistical steps for the PSNR (dB)/SSIM/IFC gains are 0.1/0.0025/0.025, respectively.

4.5 Empirical study on computational time

In this subsection, we compare the running time and PSNR of different methods (QF = 10). This experiment is conducted on a desktop computer (Win7, Intel Core i5 CPU 3.3 GHz, 12 GB memory, Matlab 2014a 64-bit). (The LJSRDB [49] code only runs on 32-bit Matlab and was therefore run on another computer, so its running time is not presented in Fig. 10.) The running time and PSNR of each method are averaged over the ten test images in Fig. 4. As depicted in Fig. 10, the proposed CISRDCNN achieves state-of-the-art performance with acceptable computational time. (Note that we use the Matlab test code of FSRCNN, available at http://mmlab.ie.cuhk.edu.hk/projects/FSRCNN.html, which is much slower than the implementation used in [40].) In addition, the execution of CISRDCNN can be greatly accelerated with a powerful GPU.

Figure 10: Average PSNR (dB)/SSIM/IFC scores (SR factor: 2, QF: 10) vs. running time (s). (a) PSNR (dB) vs. Running Time (s). (b) SSIM vs. Running Time (s). (c) IFC vs. Running Time (s).
Figure 11: Super-resolution results on the real low quality web images Rose, Child, and Anime (SR factor: 2). The first column: real low quality web images. The second column: the results of Bicubic. The third column: the results of the proposed CISRDCNN. Please zoom in to view details and make comparisons.

4.6 Super-resolution on real low quality web images

We further test the effectiveness of CISRDCNN on real low quality web images, which usually suffer from downsampling and compression due to limited bandwidth and storage capacity. The test images used in this experiment were downloaded from the internet (available: http://image.baidu.com). As presented in Fig. 11, CISRDCNN achieves obvious perceptual quality enhancement over the original images and the Bicubic interpolation results, with fewer artifacts and clearer structures.

Further, the no-reference image quality index for SR proposed in [64] is used to quantitatively compare these resultant images; the scores are listed in Table 5. CISRDCNN produces higher scores than Bicubic on all three test images, which also indicates that the resultant images of CISRDCNN are of higher quality. The results in this subsection verify that the proposed CISRDCNN is also applicable to real-world compressed images.

Test Images Rose Child Anime
Bicubic 2.927 3.650 3.044
CISRDCNN 4.941 3.902 6.050
Table 5: No-reference image quality assessment of the SR results on low quality web images, using the evaluation metric for SR proposed in [64] (SR factor: 2).
Figure 12: The flowchart of CISRDCNN-based low bit-rate coding method.
Figure 13: Rate-distortion curves of JPEG and the proposed CISRDCNN-LBRC. (a) Butterfly. (b) Woman. (c) Circuit. (d) Leaves. (e) Foreman. (f) Peppers.
Figure 14: Perceptual quality comparison of JPEG and the proposed CISRDCNN-LBRC on Woman. (a) Original image (PSNR (dB)). (b) JPEG at 0.205 bpp (27.551). (c) CISRDCNN-LBRC at 0.205 bpp (31.359). (d) JPEG at 0.388 bpp (32.377). (e) CISRDCNN-LBRC at 0.383 bpp (33.710). Please zoom in to view details and make comparisons.
Figure 15: Perceptual quality comparison of JPEG and the proposed CISRDCNN-LBRC on Circuit. (a) Original image (PSNR (dB)). (b) JPEG at 0.289 bpp (24.849). (c) CISRDCNN-LBRC at 0.286 bpp (28.147). (d) JPEG at 0.579 bpp (28.728). (e) CISRDCNN-LBRC at 0.561 bpp (30.089). Please zoom in to view details and make comparisons.
Figure 16: Perceptual quality comparison of JPEG and the proposed CISRDCNN-LBRC on Leaves. (a) Original image (PSNR (dB)). (b) JPEG at 0.364 bpp (22.947). (c) CISRDCNN-LBRC at 0.355 bpp (27.305). (d) JPEG at 0.840 bpp (28.840). (e) CISRDCNN-LBRC at 0.821 bpp (31.105). Please zoom in to view details and make comparisons.
Figure 17: Perceptual quality comparison of JPEG and the proposed CISRDCNN-LBRC on Foreman. (a) Original image (PSNR (dB)). (b) JPEG at 0.203 bpp (28.415). (c) CISRDCNN-LBRC at 0.200 bpp (33.550). (d) JPEG at 0.554 bpp (35.922). (e) CISRDCNN-LBRC at 0.552 bpp (36.654). Please zoom in to view details and make comparisons.

4.7 Application in low bit-rate image coding

At low bit-rates, existing compression methods (e.g., JPEG and JPEG 2000) always produce visually unpleasant compression artifacts. In this subsection, we take JPEG as an example to show how the proposed CISRDCNN can be used to construct a low bit-rate coding framework (CISRDCNN-LBRC) that enhances the rate-distortion performance of JPEG. The starting point is to reduce the data volume while preserving the main structures of the original image by placing a downsampling operator before the JPEG encoder. Correspondingly, the CISRDCNN module is placed behind the JPEG decoder to perform upsampling. As shown in Fig. 12, the presented CISRDCNN-LBRC consists of four parts: downsampling, JPEG encoder, JPEG decoder, and CISRDCNN.
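A minimal sketch of the two ends of this pipeline, assuming Pillow for the JPEG coding and a trained `sr_model` wrapper around CISRDCNN (both our assumptions, not details from the paper), is given below.

```python
import io
from PIL import Image

def lbrc_encode(hr_img, scale=2, quality=30):
    """Encoder side of Fig. 12: downsampling, then standard JPEG encoding."""
    lr = hr_img.resize((hr_img.width // scale, hr_img.height // scale),
                       Image.BICUBIC)
    buf = io.BytesIO()
    lr.save(buf, format="JPEG", quality=quality)
    return buf.getvalue()                      # the transmitted bitstream

def lbrc_decode(bitstream, sr_model):
    """Decoder side: standard JPEG decoding, then CISRDCNN upsampling."""
    y_c = Image.open(io.BytesIO(bitstream))    # JPEG decoder output
    return sr_model(y_c)                       # CISRDCNN restores full resolution
```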

The test images Butterfly, Woman, Circuit, Leaves, Foreman, and Peppers are selected as examples to test the effectiveness of CISRDCNN-LBRC. Note that we use the luminance components of the six test images in this subsection. For a fair comparison, JPEG is used as the baseline in this experiment. The rate-distortion curves of JPEG and CISRDCNN-LBRC are presented in Fig. 13. It can be seen that the rate-distortion performance of CISRDCNN-LBRC is obviously superior to that of JPEG at low bit-rates. From another point of view, CISRDCNN-LBRC can save a considerable number of coding bits.

To compare the perceptual quality of the decoded images, we show the results of CISRDCNN-LBRC and JPEG at different bit-rates. Due to limited space, only the results of Woman, Circuit, Leaves, and Foreman are presented, in Fig. 14 to Fig. 17. We can observe that CISRDCNN-LBRC generates fewer artifacts and better preserves the main structures, for instance, the fingers in image Woman (Fig. 14) and the collar in image Foreman (Fig. 17). Overall, CISRDCNN-LBRC performs better than JPEG at low bit-rates in terms of both objective and subjective evaluation.

5 Conclusion

In this paper, we propose an SR algorithm for compressed images. Unlike existing SR methods for compressed images, we treat this task as two related subproblems, i.e., CAR and SR, and design a deep network that jointly optimizes the two. We take the compression standard JPEG as an example to test the effectiveness of the proposed CISRDCNN, and experiments on both synthetic images and real low quality web images show that it produces state-of-the-art SR results. Moreover, we show an application of the proposed SR method in low bit-rate image coding, where it improves the rate-distortion performance of JPEG. Intuitively, the proposed SR method and the low bit-rate coding framework can be easily extended to other image and video compression standards, e.g., JPEG 2000, H.264, and HEVC. In addition, this work provides some insights on the SR of low quality LR images (e.g., noisy and blurry ones), and we hope it will draw more attention to this kind of problem.

However, due to the high complexity of the training process and the lack of high performance computing devices, the parameters of the proposed framework are not fully optimized, such as the number of layers and filters and the size of the kernels. In the future, we will study the settings of the main parameters, which may lead to better performance and lower complexity.

Acknowledgment

Funding: this work was supported by the National Natural Science Foundation of China [grant number 61471248], the National Postdoctoral Program for Innovative Talents of China [grant number BX201700163], and the Post-Doctoral Research and Development Foundation of Sichuan University [grant number 2017SCU12003].

The authors would like to thank the authors of [34, 40, 41, 49, 54, 61, 62, 63, 64] for providing their codes.

References

  • [1] X. Li, M. T. Orchard, New edge-directed interpolation, IEEE Trans. Image Process. 10 (10) (2001) 1521-1527.
  • [2] X. Zhang, X. Wu, Image interpolation by adaptive 2-d autoregressive modeling and soft-decision estimation, IEEE Trans. Image Process. 17 (6) (2008) 887-896.

  • [3] Z. Wei, K. K. Ma, Contrast-guided image interpolation, IEEE Trans. Image Process. 22 (11) (2013) 4271-4285.
  • [4] W. Dong, L. Zhang, G. Shi, X. Li, Sparse representation based image interpolation with nonlocal autoregressive modeling, IEEE Trans. Image Process. 22 (4) (2013) 1382-1394.
  • [5] Y. Romano, M. Protter, M. Elad, Single image interpolation via adaptive nonlocal sparsity-based modeling, IEEE Trans. Image Process. 23 (7) (2014) 3085-3098.
  • [6] F. Cao, M. Cai, Y. Tan, Image interpolation via low-rank matrix completion and recovery, IEEE Trans. Circuits Syst. Video Technol. 25 (8) (2015) 1261-1270.
  • [7] J. J. Huang, W. C. Siu, T. R. Liu, Fast image interpolation via random forest, IEEE Trans. Image Process. 24 (10) (2015) 3232-3245.

  • [8] W. Yang, J. Liu, M. Li, Z. Guo, Isophote-constrained autoregressive model with adaptive window extension for image interpolation, IEEE Trans. Circuits Syst. Video Technol. 2016. DOI: 10.1109/TCSVT.2016.2638864.
  • [9] S. Zhu, B. Zeng, L. Zeng, M. Gabbouj, Image interpolation based on non-local geometric similarities and directional gradients, IEEE Trans. Multimedia. 18 (9) (2016) 1707-1719.
  • [10] J. Sun, J. Sun, Z. Xu, H. Y. Shum, Image super-resolution using gradient profile prior, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, 2008, pp. 1-8.

  • [11] L. Wang, H. Wu, C. Pan, Fast image upsampling via the displacement field, IEEE Trans. Image Process. 23 (12) (2014) 5123-5135.
  • [12] Q. Yan, Y. Xu, X. Yang, T. Q. Nguyen, Single image super-resolution based on gradient profile sharpness, IEEE Trans. Image Process. 25 (5) (2016) 2168-2183.
  • [13] W. Dong, L. Zhang, G. Shi, X. Wu, Image deblurring and super-resolution by adaptive sparse domain selection and adaptive regularization, IEEE Trans. Image Process. 20 (7) (2011) 1838-1857.
  • [14] W. Dong, L. Zhang, G. Shi, X. Li, Nonlocally centralized sparse representation for image restoration, IEEE Trans. Image Process. 22 (4) (2013) 1620-1630.
  • [15] S. Mandal, A. Bhavsar, A. K. Sao, Noise adaptive super-resolution from single image via non-local mean and sparse representation, Signal Process. 132 (2017) 134-149.
  • [16] K. Zhang, X. Gao, D. Tao, X. Li, Single image super-resolution with non-local means and steering kernel regression, IEEE Trans. Image Process. 21 (11) (2012) 4544-4556.
  • [17] H. Chen, X. He, Q. Teng, R. Chao, Single image super resolution using local smoothness and nonlocal self-similarity priors, Signal Process., Image Commun. 43 (2016) 68-81.
  • [18] C. Ren, X. He, Q. Teng, Y. Wu, T. Q. Nguyen, Single image super-resolution using local geometric duality and non-local similarity, IEEE Trans. Image Process. 25 (5) (2016) 2168-2183.
  • [19] C. Ren, X. He, T. Q. Nguyen, Single image super-resolution via adaptive high-dimensional non-local total variation and adaptive geometric feature, IEEE Trans. Image Process. 26 (1) (2017) 90-106.
  • [20] W. Gong, Y. Tang, X. Chen, Q. Yi, W. Li, Combining edge difference with nonlocal self-similarity constraints for single image super-resolution, Neurocomputing. 249 (2017) 157-170.
  • [21] H. Chang, D. Y. Yeung, Y. Xiong, Super-resolution through neighbor embedding, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, 2004, pp. 275-282.
  • [22] M. Bevilacqua, A. Roumy, C. Guillemot, M. L. Alberi-Morel, Low complexity single-image super-resolution based on nonnegative neighbor embedding, in: Proceedings of the British Machine Vision Conference (BMVC), 2012, pp. 135.1-135.10.
  • [23] R. He, Z. Zhang, Locally affine patch mapping and global refinement for image super-resolution, Pattern Recognit. 44 (2011) 2210-2219.
  • [24] Z. Xiong, D. Xu, X. Sun, F. Wu, Example-based super-resolution with soft information and decision, IEEE Trans. Multimedia. 15 (6) (2013) 1458-1465.
  • [25] M. C. Yang, Y. C. Wang, A self-learning approach to single image super-resolution, IEEE Trans. Multimedia. 15 (3) (2013) 498-508.
  • [26] J. B. Huang, A. Singh, N. Ahuja, Single image super-resolution from transformed self-exemplars, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, 2015, pp. 5197-5206.
  • [27] T. Li, X. He, Q. Teng, X. Wu, Rotation expanded dictionary-based single image super-resolution, Neurocomputing. 216 (2016) 1-17.
  • [28] J. Yang, J. Wright, T. S. Huang, Y. Ma, Image super-resolution via sparse representation, IEEE Trans. Image Process. 19 (11) (2010) 2861-2873.
  • [29] R. Zeyde, M. Elad, M. Protter, On single image scale-up using sparse-representations, in: Proceedings of the International Conference on Curves & Surfaces, 2010, pp. 711-730.
  • [30] S. Wang, L. Zhang, Y. Liang, Q. Pan, Semi-coupled dictionary learning with applications to image super-resolution and photo-sketch synthesis, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, 2012, pp. 2216-2223.
  • [31] L. Shang, S. Liu, Y. Zhou, and Z. Sun, Modified sparse representation based image super-resolution reconstruction method, Neurocomputing. 228 (2017) 37-52.
  • [32] S. Gu, W. Zuo, Q. Xie, D. Meng, X. Feng, L. Zhang, Convolutional sparse coding for image super-resolution, in: Proceedings of the International Conference on Computer Vision (ICCV), IEEE, 2015, pp. 1823-1831.
  • [33] R. Timofte, V. De Smet, L. Van Gool, Anchored neighborhood regression for fast example-based super-resolution, in: Proceedings of the International Conference on Computer Vision (ICCV), IEEE, 2013, pp. 1920-1927.
  • [34] R. Timofte, V. De Smet, L. Van Gool, A+: Adjusted anchored neighborhood regression for fast super-resolution, in: Proceedings of the Asian Conference on Computer Vision (ACCV), 2014, pp. 111-126.
  • [35] Y. Zhang, Y. Zhang, J. Zhang, Q. Dai, CCR: Clustering and collaborative representation for fast single image super-resolution, IEEE Trans. Multimedia. 18 (3) (2016) 405-417.
  • [36] W. Yang, Y. Tian, F. Zhou, Q. Liao, H. Chen, C. Zheng, Consistent coding scheme for single-image super-resolution via independent dictionaries, IEEE Trans. Multimedia. 18 (3) (2016) 313-325.
  • [37] C. Dong, C. C. Loy, K. He, X. Tang, Image super-resolution using deep convolutional networks, IEEE Trans. Pattern Anal. Mach. Intell. 38 (2) (2016) 295-307.
  • [38] Y. Liang, J. Wang, S. Zhou, Y. Gong, N. Zheng, Incorporating image priors with deep convolutional neural networks for image super-resolution, Neurocomputing. 194 (2016) 340-347.
  • [39] D. Liu, Z. Wang, B. Wen, J. Yang, W. Han, T. S. Huang, Robust single image super-resolution via deep networks with sparse prior, IEEE Trans. Image Process. 25 (7) (2016) 3194-3207.
  • [40] C. Dong, C. C. Loy, X. Tang, Accelerating the super-resolution convolutional neural network, in: Proceedings of the European Conference on Computer Vision (ECCV), 2016, pp. 391-407.
  • [41] J. Kim, J. K. Lee, K. M. Lee, Accurate image super-resolution using very deep convolutional networks, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, 2016, pp. 1646-1654.
  • [42] Y. Li, J. Hu, X. Zhao, W. Xie, and J. Li, Hyperspectral image super-resolution using deep convolutional neural network, Neurocomputing. 266 (2017) 29-41.
  • [43] L. Wang, Z. Huang, Y. Gong, C. Pan, Ensemble based deep networks for image super-resolution, Pattern Recognit. 68 (2017) 191-198.
  • [44] J. Liu, W. Yang, X. Zhang, Z. Guo, Retrieval compensated group structured sparsity for image super-resolution, IEEE Trans. Multimedia. 19 (2) (2017) 302-316.
  • [45] H. Chen, X. He, L. Qing, Q. Teng, Single image super-resolution via adaptive transform-based nonlocal self-similarity modeling and learning-based gradient regularization, IEEE Trans. Multimedia. 19 (8) (2017) 1702-1717.
  • [46] J. Xiao, C. Wang, X. Hu, Single image super-resolution in compressed domain based on field of expert prior, in: Proceedings of the International Congress on Image and Signal Processing (CISP), IEEE, 2012, pp. 607-611.
  • [47] S. Ono, I. Yamada, Optimized JPEG image decompression with super-resolution interpolation using multi-order total variation, in: Proceedings of the International Conference on Image Processing (ICIP), IEEE, 2013, pp. 474-478.
  • [48] Z. Xiong, X. Sun, F. Wu, Robust web image/video super-resolution, IEEE Trans. Image Process. 19 (8) (2010) 2017-2028.
  • [49] L. W. Kang, C. C. Hsu, B. Zhuang, C. W. Lin, C. H. Yeh, Learning-based joint super-resolution and deblocking for a highly compressed image, IEEE Trans. Multimedia. 17 (7) (2015) 921-934.
  • [50] O. Lee, J. W. Lee, D. Y. Lee, J. O. Lee, Joint super-resolution and compression artifact reduction based on dual-learning, in: Proceedings of the International Conference on Visual Communications and Image Processing (VCIP), IEEE, 2016, pp. 1-4.
  • [51] Y. Zhao, W. Jia, L. Li, L. Cao, X. Liu, Filtered mapping based method for compressed web image super-resolution, IEEE Access. 5 (2017) 12682-12695.
  • [52] H. Zhao, O. Gallo, I. Frosio, J. Kautz, Loss functions for image restoration with neural networks, IEEE Trans. Comput. Imag. 3 (1) (2017) 47-57.
  • [53] K. Zhang, W. Zuo, Y. Chen, D. Meng, L. Zhang, Beyond a gaussian denoiser: residual learning of deep CNN for image denoising, IEEE Trans. Image Process. 26 (7) (2017) 3142-3155.
  • [54] C. Dong, Y. Deng, C. C. Loy, X. Tang, Compression artifacts reduction by a deep convolutional network, in: Proceedings of the International Conference on Computer Vision (ICCV), IEEE, 2015, pp. 576-584.
  • [55] J. Guo, H. Chao, Building dual-domain representations for compression artifacts reduction, in: Proceedings of the European Conference on Computer Vision (ECCV), 2016, pp. 628-644.
  • [56] J. Sun, W. Cao, Z. Xu, J. Ponce, Learning a convolutional neural network for non-uniform motion blur removal, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, 2015, pp. 769-777.
  • [57] B. Cai, X. Xu, K. Jia, C. Qing, D. Tao, DehazeNet: An end-to-end system for single image haze removal, IEEE Trans. Image Process. 25 (11) (2016) 5187-5198.
  • [58] K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, 2016, pp. 770-778.
  • [59] S. Ioffe, C. Szegedy, Batch normalization: Accelerating deep network training by reducing internal covariate shift, in: Proceedings of the International Conference on Machine Learning (ICML), 2015, pp. 448-456.

  • [60] A. Krizhevsky, I. Sutskever, G. E. Hinton, Imagenet classification with deep convolutional neural networks, in: Proceedings of the Neural Information Processing Systems Conference (NIPS), 2012, pp. 1097-1105.

  • [61] J. Zhang, R. Xiong, C. Zhao, Y. Zhang, S. Ma, W. Gao, CONCOLOR: Constrained non-convex low-rank model for image deblocking, IEEE Trans. Image Process. 25 (3) (2016) 1246-1259.
  • [62] Z. Wang, A. C. Bovik, H. R. Sheikh, E. P. Simoncelli, Image quality assessment: from error visibility to structural similarity, IEEE Trans. Image Process. 13 (4) (2004) 600-612.
  • [63] H. R. Sheikh, A. C. Bovik, G. de Veciana, An information fidelity criterion for image quality assessment using natural scene statistics, IEEE Trans. Image Process. 14 (12) (2005) 2117-2128.
  • [64] C. Ma, C. Y. Yang, X. Yang, M. H. Yang, Learning a no-reference quality metric for single-image super-resolution, Comput. Vis. Image Understand. 158 (2017) 1-16.