1 Introduction
Due to constraints in device and sensor technology, an image may not be captured in the required high resolution Park2003. Moreover, for highspeed transmission, the captured image is sometimes downsampled to lower the bandwidth required wu2009low. The transmitted low resolution image is then required to be processed at the receiver to generate the image in higher resolution. Hence, approaches are needed to construct an image of the desired (higher) resolution from an image at lower resolution. Image superresolution systems are designed for this purpose Park2003. Superresolution has a critical role in applications where it is necessary to perform indepth analysis of local regions for detection and recognition purposes yue2016image.
Most superresolution strategies involve interpolation as a pivotal step to increase the resolution of an image. But interpolation is prone to artifacts such as blurring and ringing effects. Blurring artifact is due to attenuation at high frequencies which correspond to edges. The ringing artifact is due to the oscillation of signal at sharp edges because of the truncation of high frequency components Gonzalez2002.
Here, we discuss different interpolation techniques starting from the standard and wellknown ones to recently proposed techniques in different categories more or less in a chronological order. The standard and wellknown interpolation techniques are bilinear, bicubic and Lanczos, which introduce artifacts, especially at the edges Keys1981; Wolberg1990. Due to computational simplicity and fast performance siu2012review, these techniques are used by the standard photo editing software like Photoshop^{®}^{1}^{1}1https://helpx.adobe.com/in/photoshop/using/resizingimage.html. Moreover, in many recently developed super resolution techniques, the input image is initially interpolated kim2016accurate; kim2016deeply or the chrominance channels are interpolated Dong2013; yang2017deep using one of the basic techniques. To handle the artifacts in the interpolation, different edge directed interpolations are proposed in the literature Jensen1995; Li2001. The technique proposed in Jensen1995 interpolates based on an edge fitting operator to maintain visual integrity. The technique proposed in Allebach1996 interpolates based on the direction of edges, and consists of data correction and edge directed rendering operations. The new edge directed interpolation (NEDI) proposed in Li2001 preserves low resolution covariance values at the higher resolutions based on resolution invariant property Li2001 of edge orientation. Such a property ensures interpolation along the edge direction avoiding blurring. NEDI is one of the widely referred techniques in the domain of interpolation.
Discrete wavelet transform (DWT) mallat1999wavelet decomposes an input image into four frequency subbands having low frequency component, and horizontal, vertical and diagonal edge components, separately. So, DWT based interpolation techniques in Carey1999a; nguyen2000wavelet; Demirel2011; Tian2011 can handle the edges efficiently. Regularity preserving image interpolation proposed in Carey1999aestimates the regularity of edges by measuring the decay of wavelet transform coefficients across scales. The algorithm also preserves the underlying regularity by extrapolating new subbands to be used in the generation of the high resolution image. The DWTbased technique mentioned in Demirel2011 decomposes an input image at finer to coarser levels and the subbands generated at each level are interpolated for estimation. The estimated high frequency subbands using DWT are then modified by using high frequency subbands obtained through the stationary wavelet transform (SWT) nason1995stationary. Then, inverse DWT (IDWT) is applied to get a sharper high resolution image. Prediction of the wavelet coefficients and parameter optimization by ant colony optimization is proposed in Tian2011 for a DWTbased interpolation. In Tian2011, a threecomponent exponential mixture (TCEM) model is used to analyze magnitude and sign information of wavelet coefficients. Ant colony optimization (ACO) dorigo2010ant technique is used for optimal parameter estimation of the TCEM model.
Sparse representation model (SRM) for image superresolution is being widely explored Zhang2008; Mallat2010; Yang2010; Dong2011; Dong2013. The SRM based interpolation approach of Mallat2010, sparse mixing estimator (SME), interpolates by estimating a sparse image and by mixing directional interpolators over oriented blocks in a wavelet frame. SME computes mixing coefficients by minimizing norm which is weighted by signal regularity in each block. On the other hand, the sparse basis setbased technique of Zhang2008; Dong2011 learns a map from a relevant training set. Thus the missing information is retrieved or generated from training data. However, in Ebrahimi2007, the authors suggest extraction of information from the input image because it provides limited but more relevant information than that from a universal database Freedman2011. A selfexemplar SRM (without external training) is proposed in the NARMSRMNL approach of Dong2013
, where nonlocal autoregressive modeling is used. The earlier autoregressive model has been used
Zhang2008 to exploit the local similarity for interpolation, whereas, NARMSRMNL exploits both the local and nonlocal similarity in an adaptive manner and interpolates a pixel as a linear combination of its nonlocal neighbors. Although the algorithm promises improved performance, it suffers from a huge computational burden due to patch clustering and PCA subdictionary learning.Selfsimilarity based superresolution (SR) algorithms generate higher resolution images by exploiting statistical prior. These algorithms utilise the internal statistics of LRHR patch pair Freedman2011; Singh2014; Huang2015. The technique proposed in Freedman2011 searches extremely localized regions for the example patches. However, realistic reproduction of finedetailed cluttered regions is not achieved, and those regions appear somewhat faceted. Instead of searching for example patches from the input image, the technique proposed in Singh2014 searches for patches in different subbands of the input image. Recently, in Huang2015, the authors proposed a geometric patch transformation model which includes affine and perspective transformations during interpolation by patch matching. However, patch matching and dictionary learning based algorithms are always computationally expensive.
Recently, few end to end deep neural networks have been proposed for superresolution. SRCNN
dong2014learning; dong2016imageuses the convolutional neural network (CNN)
krizhevsky2012imagenet for image superresolution. The technique is later extended to DRCN kim2016deeply by increasing the number of layers of the network and incorporating recursive learning. The techniques proposed in kim2016accurate; yang2017deep use residual learning in deep and very deep networks. EDSR lim2017enhanced is an enhanced version of deep residual networks in singlescale architecture. The performance of EDSR is shown to be superior compared to other learning based superresolution approaches. The superresolution technique SRGAN ledig2017photo uses the generative adversarial network (GAN) goodfellow2014generative to generate perceptually high quality photorealistic natural output images. As stated in Singh2014; cruz2018single, such methods are well suited to perform on images of types similar to that in its training set.Restoration of edges (which is high frequency components) is crucial for the improvement of image visual quality shao2008edge; vishnukumar2014edge. Recently, authors in Acharya2017a proposed the socalled composite predictive technique to efficiently restore the high frequency and very high frequency components of the image. The technique extracts these components from the input low resolution image by exploiting the local statistics of an image. Later, in Acharya2017, the authors proposed a technique where fuzzy local adaptive Laplacian postprocessing is used to enhance high frequency details in upsampled images for interpolation.
The above discussion shows that image structure information at the desired high resolution image is related to that in the low resolution image. Such a relation is considered in many recent interpolation approaches as structural similarity deng2014structural. A higher resolution image can be generated by modeling the relation between its structure information and that in its low resolution image (given input). Such a modeling may not assume structural similarity. The relationship in structure information must be exploited, as it is the only relevant information about the high resolution image available from the input image for constructing the high frequency details.
In this paper, we propose a technique that performs interpolation by infusing high frequency signal components computed leveraging ‘process similarity’. For this, we consider the decomposition of an image at a higher resolution into a lower resolution image and high frequency details. Process similarity refers to the similarity between such a decomposition process of the image at one resolution and that at another resolution. This process similarity allows us to model the relation between structure information of the input low resolution image and that of the desired high resolution image. The modeling can be based on the available relation between structure information of the input image and its low resolution approximation obtained by the decomposition (See Figure 2). To accomplish this, discrete wavelet transform (DWT) and stationary wavelet transform (SWT) are used. The high frequency components and lower resolution approximations obtained through the said transforms are used to get the structural relation between the input image and its low resolution approximation. The structural relation is represented by model parameters obtained through particle swarm optimization (PSO). The parameters are then used to fuse the high frequency components that are subsequently employed to estimate the input image from its low resolution approximation. The PSO works to minimize the error between the actual input image and its estimate. Once the model parameters are obtained, they are then used to generate the desired high resolution image from the input image, just like the estimation process of input image from its low resolution approximation. Note that, the structural relation modeling in our approach should not be considered as a modeling under the wellknown concept of selfsimilarity yang2010exploiting in super resolution. Presence of a structural relation between an image at two different resolutions does not necessarily mean that their structures are similar, that is, structural relation does not imply structural selfsimilarity.
As mentioned earlier, the concept of process similarity allows modeling of structural relation based on the input image and its low resolution approximation. Therefore, the structural information used for interpolation is obtained from the input image itself, which is more likely to carry relevant information required to generate the image at higher resolution compared to a database of images similar to the input image. Moreover, the high frequency components generated for the infusion are directional sensitive (vertical, horizontal and diagonal). To incorporate location consistency and value exactness (See Section 2), our approach considers both DWT and SWT in the modeling. These are the reasons why our technique when compared to the existing, produces results having lower artifacts, and improves both the subjective and objective resemblance between the output high resolution and the input low resolution images. Finally, the structural relation is represented by model parameters obtained through computationally inexpensive particle swarm optimization. With respect to the algorithms compared, our approach is less expensive in terms of memory and speed requiring only multiscale decompositions up to two levels, multiscale reconstruction and optimization using PSO. We consider comparisons of our approach based on visual inspection as well as PSNR, SSIM, Wang2004 and FSIM Zhang2011 measures with the widely accepted edge directed NEDI Li2001, nonlocal autoregressive modeling based NARM Dong2013, recent selfexemplar based SRTSE Huang2015, the latest fuzzy predictive composite scheme (FPCSLAL) Acharya2017
, and deep learning based EDSR
lim2017enhanced and SRGAN ledig2017photo. A preliminary version of this work has been published in dhara2017, where a few other existing approaches were considered for result analysis. They, in general, perform inferior to the existing approaches considered here for comparison. As discrete and stationary wavelet transforms, and particle swarm optimization are the core methods employed in our approach, a brief summary of them is provided in A and B.The overall contributions of the paper are as follows :

We propose acrossscale process similarity based on discrete wavelet transform (DWT) and stationary wavelet transform (SWT).

We model the structural relation between the input image and its low resolution approximation by exploiting the complementary nature of the DWT and SWT, and using particle swarm optimization (PSO).

Our approach is the fastest (in terms of CPU time) while producing satisfactory results.
Rest of the paper is organized as follows. The motivation of our proposal is explained in Section 2, our proposed technique is elaborated in Section 3 and the comparative results of interpolation for superresolution using various techniques including the proposed approach are given in Section 4. Finally, Section 5 concludes the paper.
2 Motivation
A superresolution system comprises of different submodules, namely interpolation, deblurring, artifact removal, denoising, etc. In the system, the pivotal step is interpolation which estimates values for increasing the resolution of the input image Park2003. Interpolation is an illposed problem as there is no unique solution for the estimation Dong2013. Different techniques in literature gather and utilize information relevant to the content of the high resolution image to be generated in different ways.
Symbol  Description 
Image  
Scaling function  
Wavelet function  
k times the resolution of an image X  
Scaling factor  
M, N  M is the row size and N is the column size of an image 
or  Approximation component after wavelet decomposition 
or  Vertical details after wavelet decomposition 
or  Horizontal details after wavelet decomposition 
or  Diagonal details after wavelet decomposition 
Optimal weights as model parameters 
To perform interpolation, in this paper, we consider human nature regarding anticipation in perception Bubic2010. Humans expect and thus predict a certain entity by projecting the knowledge about a contextually similar entity from their past experience. In our interpolation approach, we consider the ‘past experience’ to be the process that estimates the low resolution input from its lower resolution approximation. The approach performs the ‘projection of past experience’ by using the aforesaid process to generate the unknown higher resolution image from the low resolution input. Owing to the said human nature, it would not be inappropriate to assume that a human would do interpolation in a manner similar to which we consider to design our algorithm.
Now, according to analysis and synthesis of signals using wavelets Gonzalez2002; Daubechies1990 (A) an image of size at a higher scale (say ) can be generated through inverse wavelet transform (defined in equation (1)) when we have its approximation (LL), vertical details (LH), horizontal details (HL), and diagonal details (HH) in scale.
(1) 
where is the scaling function and , and are wavelet functions in horizontal, vertical and diagonal direction, respectively, of scale. The approximation (LL) and the detail coefficients in horizontal (HL), vertical (LH) and diagonal (HH) directions are generated by wavelet decomposition of the signal as shown in (1) using the scaling and wavelet functions.
(2) 
(3) 
If the aforesaid wavelet based analysis and synthesis is considered for interpolation, then, the input low resolution image can be decomposed into its immediate lower resolution approximation (LL) using (2) and high frequency details using (3). Now, one can assume that only the lower approximation is available for synthesizing back the input image, and not the high frequency details. In such a case, one can try to estimate the unavailable high frequency details from the input image’s lower approximation. The estimated high frequency details can then be used in (1) to synthesize an image, which will be an estimate of the input image. The estimation of the unavailable high frequency details should obviously be done to minimize the error between the actual input image and its estimate. Such an estimation process can be considered analogous to modeling the ‘past experience’. Thus the ‘projection of past experience’ will be analogous to using the process to estimate the input image from its lower approximation for generating (or estimating) the desired high resolution image from the input low resolution image. That is, the input image should be considered as the lower resolution approximation (LL) of the desired high resolution image and the derived estimation model of unavailable high frequency details should be used to compute the high frequency details required to generate the desired high resolution image. With the high frequency details estimated, (1) can be used for the aforesaid generation, achieving interpolation.
The approach described above is what we employ in our technique and refer to it as the process similarity based interpolation. In the term, process signifies the ‘past experience’ (available knowledge) and similarity allows the ‘projection of past experience’.
Note that, the use of the process similarity by us is possible due to the resemblance in wavelet decomposition processes at different scales Carey1999a. The fundamental idea behind our proposal is that the higher resolution image is generated from a lower resolution image in a similar way irrespective of the scale. This can be depicted from the illustration shown in Figure 1. Let, (size: MN) denotes the resolution of the input image . So, the output image having the resolution (size: 2M 2N) is assumed to be generated from the image of resolution in a similar way to the generation of the image having resolution from the image of resolution (size: M/2N/2). It is obvious that the input image at resolution and the image at resolution are readily available to optimally model the estimation () of image at from the image at . Here, we consider input image () of resolution as the ideal one to be estimated from the image of resolution to capture and minimize the loss of information during interpolation. Once the modeling is performed, it can be applied recursively to generate an image of any higher resolution (size: MN, =, ) from a given input image of resolution . Note that, by definition, resolution and size of an image may not correspond to each other. As assuming resolution and size to be the same does not affect our analysis, we use them interchangeably for simplicity.
As mentioned earlier, we consider both the discrete wavelet transform (DWT) and the stationary wavelet transform (SWT) in modeling the process similarity. SWT is considered because of its scaleinvariance property nason1995stationary. This allows us to have exact correspondence between spatial locations in an image and its SWT decomposed approximation (LL). Scaleinvariance property does not hold for DWT decomposition, as it involves downsampling. On the other hand, an application of DWT on an image yields an approximation (LL) with faithful value representation. This is because the basis functions used for DWT decomposition are not only similar across scales Carey1999a but also work at the same translations across scales Gonzalez2002; Daubechies1990. This is evident from the wellknown recursion expression relating DWT basis functions at two consecutive scales:
(4) 
where is scaling coefficient, and we see that a translation of the basis function is related to a translation of the downsampled basis function , signifying equal amount of translation. Such a correspondence does not hold for SWT decomposition.
Note that, although the above discussion of SWT and DWT considers the decomposed approximation (LL), the same holds for the decomposed detail coefficients (HL, LH, and HH) as well. Thus, for SWT decomposition, location correspondence between an image and its decomposed coefficients is accurate. However, the values in the decomposed coefficients might not faithfully represent that in the image. So, we can consider that the location correspondence is our wanted component and the inexact value as the unwanted one in SWT usage. For DWT decomposition, it is the other way around, and hence, we can consider that the location inconsistency is our unwanted component and the value faithfulness as the wanted one.
3 Proposed process similarity based interpolation
Our image interpolation approach uses process similarity (See Section 2) to model the structural relation between scales in the image under consideration. The model is then used to generate the image at the required higher resolution. As elaborated in Section 1, acrossscale subband selfsimilarity as in Singh2014; piao2007image must not be confused with acrossscale process similarity, as the latter gives structural relation whereas the former assumes subband similarity. Our proposed algorithm is shown in Figure 2. It consists of two modules: modeling and generation. In the modeling stage, the input low resolution image is used to estimate a model capturing the structural relation between scales in the form of certain optimal weights . In the generation stage, these optimal weights are used to generate the higher resolution image from the input low resolution image. The modules of the proposed technique are elaborated below.
3.1 Module 1: Modeling
3.1.1 Part 1: High frequency components from discrete (DWT) and shiftinvariant (SWT) wavelet decomposition
As shown in Figure 2, in the proposed approach, DWT decomposes the signal to half () the input resolution . Four subbands, approximation image LL, vertical details LH, horizontal details HL, and diagonal details HH are obtained with the latter three being the high frequency subbands. Further, the approximation LL is decomposed by both SWT and DWT which will generate subbands at resolutions and , respectively. The corresponding high frequency subbands at these two different resolutions are weighted by parameters and then fused using acrossscale addition to generate three components (LH, HL, and HH), which are used in parameter optimization. For acrossscale addition, upsampling followed by Gaussian filtering (zerophase filter) of the lower resolution () components is performed before elementwise addition at the higher resolution ().
The parameter values determined from the optimization (See Section 3.1.2) are tuned to model the structural relation due to process similarity. That is, we consider components at resolutions and which are used to estimate the image at , similar to how we will use components at and to estimate the desired image at resolution . Recursive implementation of this operation can generate an image at a desired high resolution with scaling factor =, .
The model parameters (weights ) are optimized to generate the LH, HL, and HH components at resolution such that error between the generated image at and the input image is minimized. Obviously, the image at is generated using the optimal LH, HL, and HH along with the LL at through IDWT.
3.1.2 Part 2: Parameter optimization
As mentioned in Section 2, our acrossscale process similarity based high resolution image generation process uses value faithfulness provided by DWT and location invariance provided by SWT. We use DWT and SWT to get three high frequency subbands that are obtained through acrossscale addition.
During acrossscale addition, the three components each from DWT and SWT are weighted by model parameters which would represent the structural relation between scales. Among the model parameters, the weights correspond to the three components from DWT and the weights correspond to those from SWT. To represent the structural relation between scales appropriately, these model parameters need to be optimized (). To do so, as mentioned in Section 3.1.1, we first decompose the input image into lower resolution () components and then use the components to optimally get an estimate of the input image. The optimization minimizes the error between the actual input image at resolution and its estimate. Particle swarm optimization (PSO) is a standard and widely used evolutionary optimization algorithm for various applications zhang2015comprehensive. PSO is very efficient in reaching an optimal solution utilizing limited memory, and faster computation kirchmaier2013swarm with ‘primitive mathematical operations’ chen2010saliency. Moreover, the optimization algorithm does not require any prior information about the solution. Global and local explorations of the particles ensure effective and faster optimization. The simple yet robust algorithm can optimize the solution through parallel processing. Therefore, we use PSO (as discussed in B) whose details are given in Section 4.1.
Note that the presence of acrossscale process similarity allows these optimal weight parameters to represent the structural relation. A lower optimal weight value corresponding to a component from DWT suggests that the error contributed by location correspondence inaccuracy in it is dominating the correctness contributed by its value faithfulness. A higher optimal weight value would signify the opposite. On the other hand, a lower optimal weight value corresponding to a component from SWT suggests that the error contributed by value inexactness in it is dominating the correctness contributed by its accurate location correspondence. Again, a higher optimal weight value would signify the opposite.
Therefore, the optimal weights essentially represent the suitability of each of the six high frequency bands that are used to generate the image with resolution from the image with resolution . Due to process similarity, as elaborated under the generation stage, we use these weights to generate an unavailable higher resolution image from the input image at resolution .
3.2 Module 2: Generation
In the generation stage, the model parameters which represent the structural relation, will be used on the input image to generate the image at the desired higher resolution. The approach employs the optimal weights determined in the modeling stage (See Section 3.1). The generation block of our algorithm is shown in Figure 2 within the dashed (  ) box. As mentioned earlier, our high resolution image generation process uses DWT and SWT.
Initially, during the generation, the input image of resolution is decomposed by DWT and SWT operation. We have four subbands LL, LH, HL, and HH of resolution and each due to DWT and SWT, respectively. Due to the presence of process similarity, we now use the optimal weights determined in the modeling stage to get the high frequency components to be used in the generation. During fusion by acrossscale addition of corresponding high frequency components from DWT and SWT, the six subbands are multiplied with the corresponding optimal weights. The fusion generates high frequency components LH, HL, and HH at resolution . Then IDWT is applied, where the available input image acts as the approximation LL and the three estimated components LH, HL, and HH as the required high frequency details.
Further, to generate an image of a desired resolution , where =, ), one would require to apply the image generation stage of the proposed interpolation approach times recursively.
4 Experimental Results and Discussions
Experiments are performed to compare the proposed method with six existing methods and the results are reported here. The performance of our approach is compared to that of edge directed NEDI Li2001, nonlocal autoregressive modeling based NARM Dong2013, selfexemplar based SRTSE Huang2015, fuzzy predictive composite scheme (FPCSLAL)Acharya2017, and deep learning based EDSR lim2017enhanced, SRGAN ledig2017photo. EDSR is a recent technique which performs well for a variety of images and represents the stateoftheart. The other five approaches are popular in their respective categories, with a couple of them (SRGAN, FPCSLAL) being recently reported ones. The comparisons here are made both quantitatively and qualitatively in a multifaceted analysis, where the computation time is also considered. Note that, the codes of NARM, SRTSE, FPCSLAL, EDSR, and SRGAN were provided by the corresponding authors. We have used the trained models corresponding to the particular decimations for both EDSR and SRGAN as exactly provided by the corresponding authors.
4.1 Implementation details:
Our algorithm is based on wavelet decomposition, and hence it is required to choose a wavelet function (/scaling function). As our approach analyzes the representation of an image in its approximation and detail high frequency components to perform interpolation, we choose two different wavelet functions for the experiments, which differ substantially in representing local regularity of a signal. We consider Daubechies wavelet with support 4 (db2) and discrete approximation of Meyer filter (support is 102), where the latter excels in approximating signals with low local regularity/smoothness. The proposed method using Daubechies wavelet is referred here as ‘proposed (Daub)’ and that using discrete Meyer wavelet as ‘proposed (Meyer)’. Note that, the proposed (Daub) and proposed (Meyer) follow exactly the same approach (See Figure 2) except for the wavelet used in decomposition and reconstruction.
Our algorithm also uses PSO where we need to provide initialization and stopping criteria. We empirically observed that stopping with the maximum number of iterations as 15 after initializing the population with 15 randomly generated particles in 6dimensional space (, [0 1]) works efficiently to produce at least nearoptimal solution. Our PSO is designed for early termination. It stops if there is no change in the fitness value of the best solution in consecutive runs after the number of iterations is greater than 7. The minimum number of iterations is considered as 7 to allow sufficient exploration of the solution space . PSNR is considered as the fitness/objective function whose maximization results in minimization of error between the desired image and the generated image . This optimization function is as follows
(5) 
The source code of our approach can be downloaded from here^{2}^{2}2https://drive.google.com/open?id=1jHw9r6gvT7AC8EnmozBV4z4U8AvFnAOe.
4.2 Computation speed
The increasing developments in display technology for electronic gadgets and content delivery through data streaming demands faster interpolation for immediate display of content without compromising much on quality. Such reallife applications require an algorithm to get high resolution, good quality output as fast as possible. Therefore, as far as interpolation is concerned in today’s world, the computation time of an algorithm is still an important factor.

Technique (speed in seconds  
NEDI  NARM  SRTSE  FPCSLAL 



0.593  13.81  7.57  0.38  0.38 0.022 


2.68  64.16  34.57  4.23  0.54 0.014 


11.56  290.37  187.72  152.65  0.93 0.093 




Technique (speed in seconds)  
NEDI  SRTSE  FPCSLAL 



3.22  22.00  4.36  1.23 0.108 


13.70  113.18  148.65  1.78 0.036 


54.65  534.77  3072  5.79 0.201 



So, in this paper, we consider the computation time as an essential parameter to judge the efficiency of an algorithm. We use an Intel^{®}i5(3.30 GHz) machine with 16 GB RAM and use MATLAB^{®}(version 2016a) for this purpose. Our target is to perform fast interpolation which generates a high resolution image exploiting process similarity at low resolution without introducing any visual artifact. To calculate the computation speed, we have considered the 16, images from USC SIPI Miscellaneous gray scale image database^{3}^{3}3http://sipi.usc.edu/database/, and downsampled them to generate images of , and sizes. We consider gray scale images of above mentioned three sizes as we perform an analysis of computation speed variations with the change in 2D matrix size (image size). The average computation time (CPU time) of all the algorithms over 16 images of a particular size are presented in Table 2 and 3
for scaling factors 2 and 4, respectively. As the proposed approach is based on heuristic optimization, the computation time may slightly vary due to early termination and random initialization. So, for our algorithm, we have considered 5 runs for each image to present the average computation time (CPU time) (over 16
5 runs) along with the standard deviation, which is negligible. For NARM, we have shown the computation time (CPU time) only for scaling factor 2 as the code
^{4}^{4}4http://www4.comp.polyu.edu.hk/ cslzhang/NARM.htm provided does not have an implementation of factor 4. Deep learning based algorithms such as EDSR, and SRGAN require a considerable number of low resolution (LR)high resolution (HR) image pairs for learning, which (CPU time) consumes considerable amount of time in graphical processing units (GPUs) cruz2018single. Moreover, the implementations of EDSR and SRGAN by their authors (trained networks) also require GPUs to generate the output. As a fair comparison of computation times between approaches implemented in CPU and GPU is not possible, we do not include the same.From Table 2, it is evident that FPCSLAL and proposed (Daub) algorithm have the least computation time (CPU time) of 0.38 seconds for an input image of yielding an output of size . If the input size increases, our proposed (Daub) becomes substantially faster compared to the other algorithms compared here. Our proposed (Meyer) algorithm is slightly slower than the proposed (Daub) algorithm, but much faster when compared with other interpolation techniques presented in the table. Table 3, where the scaling factor is 4, shows that for an input image of size to , our proposed algorithms are much faster than all the existing algorithms compared here. So, our proposed approach is most suited among all the algorithms here when time is an important factor of application.
It is to be noted that the faster speed achieved by our algorithms is not by utilizing large memory space. This is because memory is occupied only by the multiscale decompositions up to two levels, and the input and output images. In addition, a little memory is required Poli2007 for the PSO particles and parameters.
4.3 Subjective Evaluation:
Computation time (CPU time) comparison above shows that our proposed algorithms are considerably faster than the other algorithms and the difference increases in favor of our algorithms with the increase of the input image size and scaling factor.
With the above observation, it becomes important to analyze and compare the interpolation performance. Before we go into quantitative evaluation, where decimation/downsampling of images from databases would be required, we present qualitative evaluation here. For qualitative analysis, we consider images from databases as input images for interpolation, and generate twice and four times larger images by the various interpolation approaches. For this experiment, we consider USC SIPI Miscellaneous image database (24 gray images and 16 color images). While the entire set of results is shared online^{1}^{1}1https://drive.google.com/file/d/1XENfISl1tSlOOJsWAR5FxIHnOSl0s/view?usp=sharing/, we consider a few examples here to summarize the generic observations on interpolation by the various approaches.
In Figure 3, we show the subjective comparison on a gray scale image considering the scaling factor to be 4. The cropped portion of the whole image is shown here for better visualization^{2}^{2}2Visualization is appropriate when looked at 100% on a computer screen. The observations made later may not be visible in print.. We see that NEDI’s output is smoother than all the others, which results in blurring and distortions especially at regions having changing gray values. SRTSE, FPCSLAL, and EDSR produce much sharper outputs than NEDI but are prone to ringing artifacts at regions of drastic change and appearance of the nonexistent pattern in homogeneous regions. SRGAN produces artifacts which look like fixed pattern noise. Our approach produces negligible amount of the artifacts mentioned above, but are prone to some amount of jaggies at the boundary regions. As can be seen, the overall impression from the results is that the proposed algorithms perform better than the others. In Figure 4, we show another such (scaling factor 4) subjective comparison now on a color image. We again see here the artifacts mentioned above in the results by NEDI, SRTSE, FPCSLAL, EDSR, SRGAN and our proposed algorithms. Further, we can see that the artifacts in the results of the proposed algorithms is less pronounced compared to that in the others.
In Figures 3 and 4, although both the proposed algorithms do better than the others and also contain some amount of jaggies artifact, there are differences in their results. We have observed in our experiments that the jaggies introduced by the proposed algorithms are mainly due to aliasing, but with different characteristics. As can be seen, in Figure 3, the effect of aliasing by the proposed (Meyer) algorithm is concentrated at the midfrequencies (like texture regions). On the other hand, aliasing by the proposed (Daub) algorithm is spread across all frequencies in small amount and is pronounced at high frequencies (sharp edges, see Figures 2(g) and 3(g)).
Further, in Figures 5 and 6, we show subjective comparison considering scaling factor 2 on gray and color images, respectively. Some of the observations similar to Figures 3 and 4 such as NEDI’s smoother output and appearance of the nonexistent pattern in outputs of SRTSE, FPCSLAL, SRGAN, and EDSR are evident upon close observation. NARM, like NEDI, also produces a smooth output. However, upon quick observation it is hard to find a substantial difference between all the results. In such a situation, it is obvious that one should prefer an algorithm with the least computation time (CPU time), which is, our proposed (Daub) algorithm.
Now, let us consider a situation where it is possible for a user to choose an algorithm for an image. The following may be done based on the subjective evaluation of interpolation performance. In such a scenario, the proposed (Daub) algorithm should be chosen for images having a good amount of contents with changing pixel values (mid frequency) like textures. Whereas, the proposed (Meyer) algorithm should be employed if the changes in an image are mostly drastic (high frequency) and it has a good amount of homogeneous regions (low frequency). However, if for any input image, it is not possible for a user to choose an algorithm among proposed (Daub) and proposed (Meyer) for generating the output, then proposed (Daub) algorithm should automatically be chosen because of

The lower computation time (CPU time).

The fact that natural images contain very less high frequency contents than low and mid frequency ones.
4.4 Quantitative Evaluation:
Unlike subjective evaluation, to evaluate the methods quantitatively, we need a known target/reference based on which the generated high resolution image can be assessed. Therefore, we generate the low resolution input images considering the actual images in databases as the desired high resolution images.
As stated in Dong2013, low resolution images to be used as the inputs are usually obtained by decimation (smoothing followed by downsampling) of those images that are considered as the desired high resolution images. As stated in Demirel2011; Tian2011, wavelet approximation filter is used for the smoothing, as said in ledig2017photo, Gaussian filter is used for smoothing, as stated in Li2001; Mallat2010 and Dong2011 only downsampling without smoothing is considered. The smoothing in the decimation can be achieved using different filters with different properties. However, an interpolation algorithm might be particularly suited to a kind of smoothing used to produce the low resolution input for quantitative evaluation. This could happen when the interpolation uses operators/filters consistent with a particular kind of decimation.
Therefore, we choose to treat the kind of smoothing used in the decimation as a parameter. To evaluate an algorithm’s quantitative performance corresponding to an image, we calculate the PSNR, SSIM and FSIM values for five types of inputs obtained through the following operations

Bicubic approximation(Bicubic)

Wavelet filter approximation (Daubechies)

Wavelet filter approximation (DMeyer)

Gaussian filtering followed by downsampling (Gaussian)

Downsampling without any smoothing (Subsampling)
For quantitative comparisons, experiments are carried out using standard USC SIPI Miscellaneous image database (24 gray images and 16 color images), BSD image database (100 color images), SET 5 (5 color images) and SET 14 (14 color and gray scale images) Huang2015. These images offer different types of challenges for an interpolation algorithm to deal with. As PSO based optimization in our algorithm can yield slightly different results in different runs, like in Section 4.2, we consider the average performance over 5 runs of our algorithm on each image. These average performances are obtained for all the images in a database, and the average over all these images are reported. The standard deviation quantifying variation in average performance over all images in a database during the 5 runs is found to be in the order of dB (PSNR), and hence it is not reported individually. For all the different cases, the quantitative results are listed in tables with the best performing result in bold.

Technique  
NEDI  NARM  SRTSE  FPCSLAL  EDSR  SRGAN 



Bicubic  27.60  27.78  31.24  30.41  32.07  30.83  29.27  28.09  
Daubechies  28.25  29.14  29.15  29.31  29.22  24.14  30.24  29.39  
DMeyer  28.95  30.00  27.05  27.35  26.96  26.31  29.09  30.68  
Gaussian  28.55  29.12  27.94  27.91  27.99  27.45  29.11  29.93  
Subsampling  28.06  28.96  24.70  25.30  24.40  28.40  27.46  28.02  
Average  28.28  29.00  28.02  28.06  28.13  27.43  29.03  29.22  


Technique  
NEDI  NARM  SRTSE  FPCSLAL  EDSR  SRGAN 



Bicubic  0.7972  0.8013  0.8867  0.8714  0.8995  0.8859  0.8427  0.8216  
Daubechies  0.8280  0.8433  0.8543  0.7607  0.8619  0.7076  0.8786  0.8587  
DMeyer  0.8337  0.8540  0.8066  0.8166  0.8108  0.7834  0.8526  0.8765  
Gaussian  0.8204  0.8271  0.8238  0.8208  0.8293  0.8133  0.8494  0.8578  
Subsampling  0.8340  0.8495  0.7330  0.7581  0.7316  0.8407  0.8244  0.8218  
Average  0.8223  0.8350  0.8208  0.8055  0.8266  0.8062  0.8495  0.8472  


Technique  
NEDI  NARM  SRTSE  FPCSLAL  EDSR  SRGAN 



Bicubic  0.9342  0.9365  0.9694  0.9618  0.9747  0.973  0.9479  0.9381  
Daubechies  0.9451  0.9520  0.9699  0.9658  0.9711  0.9075  0.9724  0.9503  
DMeyer  0.9527  0.9616  0.9286  0.9313  0.9296  0.9267  0.9558  0.9616  
Gaussian  0.9437  0.9500  0.9352  0.9337  0.9365  0.9338  0.9576  0.9578  
Subsampling  0.9426  0.9491  0.9093  0.9103  0.9092  0.9649  0.9335  0.9361  
Average  0.9437  0.9499  0.9424  0.9405  0.9442  0.9412  0.9534  0.9487  

Tables 4, 5 and 6 show comparative results considering images from USC SIPI Miscellaneous gray scale image database in terms of PSNR, SSIM, and FSIM, respectively, for the scaling factor of 2. Table 4 shows that among all the algorithms compared on the basis of PSNR, EDSR performs the best for input generated by bicubic approximation, proposed (Daub) algorithm for input generated by Daubechies wavelet approximation, the proposed (Meyer) algorithm for input generated by Meyer wavelet approximation and Gaussian filtering followed by downsampling, and NARM for input generated by only downsampling.
But on an average over all types of inputs, the performance of our proposed approach is superior. PSNR value of proposed (Meyer) and proposed (Daub) algorithms are respectively 0.22 dB and 0.03 dB higher than the next best, NARM. Tables 5 and 6 respectively considering SSIM and FSIM also show that different algorithms perform the best for different input types, with the proposed approach being the best for Daubechies and Meyer wavelet approximations, and Gaussian filtering followed by downsampling. On an average over all types of inputs, our proposed (Daub) algorithm yields slightly better results.

Technique  
NEDI  NARM  SRTSE  FPCSLAL  EDSR  SRGAN 



Bicubic  30.60  31.13  36.48  34.92  37.82  35.59  33.18  31.12  
Daubechies  31.23  33.25  33.85  33.75  33.95  32.69  34.90  33.22  
DMeyer  32.49  35.12  30.09  30.26  29.75  29.11  32.59  35.47  
Gaussian  32.11  33.74  31.24  31.00  31.18  30.64  32.67  34.37  
Subsampling  32.00  34.56  28.34  28.76  27.94  27.42  31.64  33.11  
Average  31.66  33.56  32.00  31.74  32.13  31.09  33.00  33.46  


Technique  
NEDI  NARM  SRTSE  FPCSLAL  EDSR  SRGAN 



Bicubic  27.63  27.84  31.12  30.55  32.17  31.22  29.53  28.16  
Daubechies  28.55  29.37  29.12  29.18  29.07  28.53  30.10  29.31  
DMeyer  29.22  30.29  26.95  27.24  26.7  26.46  28.82  30.70  
Gaussian  28.74  29.31  27.92  27.91  27.96  27.74  28.85  29.82  
Subsampling  28.78  29.39  24.59  25.16  23.99  24.01  27.51  28.00  
Average  28.54  29.24  27.94  28.00  27.98  27.59  28.97  29.20  


Technique  
NEDI  NARM  SRTSE  FPCSLAL  EDSR  SRGAN 



Bicubic  30.33  30.49  36.46  35.19  37.97  35.54  33.48  30.65  
Daubechies  28.95  32.73  33.41  33.13  33.13  31.71  34.35  32.72  
DMeyer  32.8  34.93  29.63  29.75  29.37  28.78  31.98  35.44  
Gaussian  32.15  33.27  30.65  30.56  30.69  30.35  31.93  33.97  
Subsampling  32.58  34.79  28.05  28.35  27.44  26.95  31.00  33.07  
Average  31.36  33.24  31.64  31.40  31.72  30.66  32.55  33.17  


Technique  
NEDI  NARM  SRTSE  FPCSLAL  EDSR  SRGAN 



Bicubic 
27.59  27.94  32.29  31.28  33.30  31.90  29.94  28.03  
Daubechies  28.43  29.81  28.47  29.89  28.06  28.91  30.76  29.71  
DMeyer  29.42  31.18  26.99  27.29  26.63  26.31  26.24  31.52  
Gaussian  28.95  30.00  28.01  27.96  28.03  27.74  29.09  30.42  
Subsampling  28.94  30.51  25.05  25.56  24.19  24.14  27.96  29.09  
Average  28.67  29.89  28.16  28.40  28.04  27.80  28.80  29.75  


Technique  
NEDI  NARM  SRTSE  FPCSLAL  EDSR  SRGAN 



Bicubic  0.8803  0.8847  0.9359  0.9274  0.9431  0.9247  0.9124  0.8927  
Daubechies  0.9001  0.9119  0.9195  0.9220  0.9257  0.8988  0.9278  0.9157  
DMeyer  0.9066  0.9192  0.8817  0.8872  0.8817  0.8521  0.9082  0.9273  
Gaussian  0.8975  0.9045  0.8954  0.8929  0.8986  0.8818  0.9079  0.9185  
Subsampling  0.9062  0.9188  0.8351  0.8505  0.8327  0.7912  0.8931  0.8955  
Average  0.8981  0.9078  0.8935  0.8960  0.8964  0.8697  0.9099  0.9100  


Technique  
NEDI  NARM  SRTSE  FPCSLAL  EDSR  SRGAN 



Bicubic  0.7873  0.7938  0.8856  0.8740  0.899  0.8843  0.8509  0.8264  
Daubechies  0.8217  0.8421  0.8556  0.8574  0.8634  0.8429  0.8716  0.8544  
DMeyer  0.8331  0.8562  0.8064  0.8135  0.8058  0.7838  0.8449  0.8784  
Gaussian  0.8112  0.8218  0.8172  0.8136  0.8224  0.8109  0.8351  0.8518  
Subsampling  0.8353  0.8511  0.7323  0.7558  0.7235  0.7063  0.8257  0.8141  
Average  0.8177  0.8330  0.8194  0.8228  0.8228  0.8057  0.8457  0.8457  


Technique  
NEDI  NARM  SRTSE  FPCSLAL  EDSR  SRGAN 



Bicubic 
0.8935  0.8956  0.9532  0.944  0.9594  0.9398  0.9306  0.9041  
Daubechies  0.8869  0.9248  0.9333  0.93  0.9343  0.9045  0.9388  0.9283  
DMeyer  0.9212  0.9346  0.8908  0.893  0.8887  0.8548  0.9179  0.9423  
Gaussian  0.9115  0.9184  0.9073  0.9043  0.9097  0.8916  0.9173  0.9332  
Subsampling  0.9224  0.9386  0.8479  0.8591  0.8376  0.8005  0.9049  0.9148  
Average  0.9071  0.9224  0.9065  0.9061  0.9059  0.8782  0.9219  0.9245  


Technique  
NEDI  NARM  SRTSE  FPCSLAL  EDSR  SRGAN 



Bicubic  0.8135  0.8229  0.9042  0.8928  0.9153  0.8955  0.8722  0.8359  
Daubechies  0.8459  0.8671  0.8533  0.8780  0.8442  0.8552  0.8896  0.8741  
DMeyer  0.8572  0.8806  0.8257  0.8324  0.8218  0.7950  0.7231  0.8950  
Gaussian  0.8393  0.8527  0.8397  0.8361  0.844  0.8274  0.8560  0.8744  
Subsampling  0.8573  0.8783  0.7586  0.7810  0.7431  0.7162  0.8431  0.8497  
Average  0.8426  0.8603  0.8363  0.8441  0.8337  0.8179  0.8368  0.8658  


Technique  
NEDI  NARM  SRTSE  FPCSLAL  EDSR  SRGAN 



Bicubic  0.9407  0.9422  0.9814  0.9727  0.9866  0.9807  0.9631  0.9466  
Daubechies  0.9570  0.9627  0.9719  0.9706  0.9729  0.9656  0.9765  0.9634  
DMeyer  0.9658  0.9731  0.9398  0.9418  0.9399  0.9329  0.9596  0.9753  
Gaussian  0.9578  0.9622  0.9487  0.9463  0.9499  0.9461  0.9608  0.9704  
Subsampling  0.9640  0.9704  0.9225  0.9266  0.9234  0.9149  0.9516  0.9587  
Average  0.9570  0.9621  0.9528  0.9516  0.9546  0.9481  0.9517  0.9629  


Technique  
NEDI  NARM  SRTSE  FPCSLAL  EDSR  SRGAN 



Bicubic  0.8682  0.8731  0.9400  0.9283  0.9503  0.9441  0.9114  0.9063  
Daubechies  0.8966  0.9080  0.9188  0.9192  0.9224  0.9148  0.9288  0.9169  
DMeyer  0.9051  0.9189  0.8820  0.8888  0.8815  0.8745  0.9092  0.9289  
Gaussian  0.8815  0.8890  0.8959  0.8925  0.8967  0.8973  0.9033  0.9143  
Subsampling  0.9161  0.9264  0.8490  0.8634  0.8458  0.8407  0.9040  0.8885  
Average  0.8935  0.9031  0.8971  0.8984  0.8993  0.8943  0.9113  0.9110  


Technique  
NEDI  NARM  SRTSE  FPCSLAL  EDSR  SRGAN 



Bicubic  0.9257  0.9262  0.9748  0.9604  0.9822  0.9739  0.9499  0.9307  
Daubechies  0.9344  0.9503  0.9554  0.9488  0.9572  0.9432  0.9599  0.9483  
DMeyer  0.9498  0.9586  0.9173  0.9166  0.9164  0.9029  0.9408  0.9574  
Gaussian  0.9407  0.9454  0.9370  0.9320  0.9381  0.9347  0.9438  0.9548  
Subsampling  0.9577  0.9677  0.8979  0.9013  0.8954  0.8827  0.9390  0.9422  
Average  0.9417  0.9496  0.9456  0.9318  0.9379  0.9275  0.9467  0.9467  


Technique  
NEDI  NARM  SRTSE  FPCSLAL  EDSR  SRGAN 



Bicubic  0.9174  0.9212  0.973  0.9637  0.9783  0.9727  0.9501  0.9286  
Daubechies  0.9406  0.9489  0.9562  0.9602  0.9325  0.9530  0.9684  0.9523  
DMeyer  0.9520  0.9628  0.9224  0.9243  0.9209  0.9168  0.8756  0.9684  
Gaussian  0.9382  0.9454  0.9302  0.9278  0.9312  0.9295  0.9454  0.9578  
Subsampling  0.9489  0.9580  0.8990  0.9067  0.8953  0.8926  0.9354  0.9459  
Average  0.9394  0.9472  0.9362  0.9366  0.9316  0.9329  0.9350  0.9506  

The computationally intensive algorithms proposed in Yang2010; Dong2013, generate high resolution color images, by applying their proposed interpolation technique on the luminance component and a simple interpolation technique like the cubic keys1981cubic method to generate chromatic component^{5}^{5}5This is evident from the codes provided as well. As our approach is computationally fast, we leverage this advantage and apply our interpolation to all the channels. The benefits of our approach are thus reflected in all the three channels.
Tables 7  18 show comparative results in terms of PSNR, SSIM, and FSIM considering scaling factor of 2 for the images from the USC SIPI Miscellaneous color and BSD, SET 5, and SET 14 databases. Here, we see the technique performing the best for different kinds of inputs is similar, that is, EDSR performs the best for input generated by bicubic approximation, proposed (Daub) algorithm for input generated by Daubechies wavelet approximation, proposed (Meyer) algorithm for input generated by Meyer wavelet approximation (except FSIM in SET 5) and NARM for input generated by only downsampling. For Gaussian, the best performing algorithm is our proposed (Meyer). On an average over all types of inputs, the performance of our algorithm is comparable with the best result (that of NARM) in terms of PSNR. However, our algorithms’ average performance is the best in terms of SSIM for all databases and also in terms of FSIM except for SET 5 database.
Tables 19  33 show the comparative results of PSNR, SSIM and FSIM similar to that of Tables 4  18, except that the scaling factor is 4 instead of 2. Here, we find that different techniques perform the best for different kinds of inputs. On an average over all types of inputs, the performance of our algorithm is the best for all the databases except for USC SIPI Miscellaneous gray database in terms of PSNR. But in terms of SSIM and FSIM, our techniques’ performance is comparable with the best which is one of the algorithms among SRTSR, EDSR, and SRGAN.
The above quantitative analysis shows that on an average performance over all types of inputs considered here, our proposed algorithms in terms of PSNR, SSIM and FSIM perform either the best or near to the best considering different databases containing color and gray scale images, and different scaling factors. Among the proposed algorithms, we do not find enough evidence here to choose one over the other. But one can choose the proposed (Daub) algorithm based on the observations made during qualitative analysis.
So, through the computation time (CPU time), qualitative and quantitative analyses, we find that the proposed approach not only consistently produces the best or nearbest results but also does so in the least time (CPU time). Moreover, the lower computation time (CPU time) becomes more significant with the increase in the input image size and scaling factor.
Note that, our proposed approach suffers from a few drawbacks due to the use of heuristic optimization, PSO. Due to the stochastic behavior of PSO, our approach can yield slightly different results in different runs. Although the performance for quantitative evaluation shows that there can be a very slight variation from the average values, the algorithms may not produce the best result possible by them in a particular run. Moreover, the computation time may slightly vary in each run due to the early termination. In addition, as mentioned already, although our approach produces negligible amount of artifacts than the other algorithms, it is prone to appearance of some amount of jaggies at the boundary regions.
4.5 Auxiliary analysis
We provide a couple of additional analysis of the proposed approach. The first analysis is regarding the fact that the proposed approach is designed to generate images considering any scaling factor of the form , . When one desires a scaling factor , which cannot be expressed as , we suggest that a value of is considered such that . Then, the interpolated image of larger size can be decimated by using any simple and fast symmetric spline spath1995two approximation.
We show the results obtained by performing such an interpolation of scaling factor 3 using the proposed (Daub) algorithm in Figure 7. As we see for both the gray and color images, no additional artifacts have been introduced in the images of Figures 6(b) and 6(d) due to the decimation using bicubic approximation applied on Figures 6(a) and 6(c), respectively.
The second analysis here is regarding further improvement in computation time (CPU time) of the proposed approach. In this analysis, we show that by compromising a little in the interpolation performance of the proposed approach, results can be obtained much faster. Consider Table 34, which shows computation times for proposed (Daub) algorithm similar to those discussed in Tables 2 and 3. However, the computation times in Table 34 have been obtained by altering the proposed (Daub) algorithm’s PSO based optimization, where instead of zero change in fitness value during iteration, a change dB is checked for termination. As can be seen, comparing the computation times in Table 34, and Tables 2 and 3, there is about average improvement in the case where a scaling factor is 2 and about average improvement when a scaling factor is 4. With the alteration in the proposed (Daub) algorithm, it was observed that the average interpolation performance over all images in the USC SIPI Miscellaneous gray scale database was reduced by only around dB (in terms of PSNR) when input images were obtained by Daubechies wavelet filter approximation.
The final analysis provided here is on performance efficiency. Considering the USC SIPI Miscellaneous gray image database, Table 35 lists the average performance for the scaling factor 2 of different approaches along with the average time (CPU time) taken to achieve them. Doing so, it sheds some light on the rate of quantitative performance with respect to computational expenses. As can be seen, the proposed approach is the most efficient one, achieving the best average performance in the least average time.
4.6 Domainspecific Interpolation
In this section, we aim to analyse the performance and applicability of our proposed approach for domain specific interpolation. For this, we consider interpolation of human face and text images.
Human face images are images where a wide range of frequencies exist just like many other natural images. As we have shown earlier that our proposed algorithm performs well for natural images (five databases), we expect it to do well in face images as well. For the experiment on face images, we consider the standard gray scale face images from Jaffe^{6}^{6}6http://www.kasrl.org/jaffe.html (213 images) database and the three color face images from SET5 database.
To quantitatively evaluate the performance, we calculate the PSNR, SSIM and FSIM values for five types of inputs generated through the five approaches mentioned in Section 4.4. We compare results of our approach with the two best performing existing algorithms found in our earlier quantitative and subjective evaluations, which are EDSR and SRGAN. We present the average PSNR, SSIM and FSIM values over the five types of inputs for scaling factors of 2 and 4 in Tables 3639. The results show that our proposed approach performs the best in terms of PSNR and is as good as or better than the two best performing algorithms EDSR and SRGAN in terms of SSIM and FSIM. For subjective evaluation, we have shown the performance on cropped versions of a gray scale image taken from the Jaffe database and a color image taken from the SET5 database. Figures 8 and 9 show the subjective evaluation of the two images for scaling factors of 2 and 4, respectively. It shows that our proposed approach generates outputs as good as or better than the other two.
On the other hand, text images are images which mainly have very low and very high frequencies alone unlike most natural images. As pointed out in walha2016resolution, text images have sudden discontinuity, regularity in fine pattern and continuity of same intensity values along a particular direction. As our proposed approach is not targeted to work specifically on such special images, we do not expect it to perform well in text images. For an analysis, we have taken cropped versions of handwritten and machine printed images from standard IAM^{7}^{7}7http://www.fki.inf.unibe.ch/databases/iamhandwritingdatabase/downloadtheiamhandwritingdatabase database to perform subjective comparison. Figures 10 and 11 show the interpolated output for scaling factors of 2 and 4, respectively. Interestingly, we find qualitatively that the performance of our algorithm is comparable to that of the two best performing existing algorithms EDSR and SRGAN even in the case of text images.

Technique  
NEDI  SRTSE  FPCSLAL  EDSR  SRGAN 



Bicubic  26.11  27.11  26.54  27.92  27.22  25.92  26.08  
Daubechies  26.39  26.24  26.19  26.40  26.09  26.47  26.29  
DMeyer  26.30  25.76  26.03  25.79  25.54  26.15  26.70  
Gaussian  25.92  26.29  25.92  26.54  26.26  25.89  26.13  
Subsampling  25.28  21.85  23.05  21.40  21.76  23.69  23.81  
Average  26.00  25.45  25.54  25.61  25.37  25.62  25.80  


Technique  
NEDI  SRTSE  FPCSLAL  EDSR  SRGAN 



Bicubic  0.6532  0.7291  0.7082  0.7565  0.7352  0.6880  0.6927  
Daubechies  0.6677  0.7213  0.7084  0.7439  0.7199  0.7201  0.6956  
DMeyer  0.6633  0.7030  0.6967  0.7158  0.6929  0.6860  0.7103  
Gaussian  0.6414  0.7101  0.6867  0.7231  0.711  0.6709  0.6858  
Subsampling  0.6624  0.5810  0.6194  0.5872  0.5749  0.6485  0.6320  
Average  0.6596  0.6889  0.6839  0.7053  0.6868  0.6827  0.6833  


Technique  
NEDI  SRTSE  FPCSLAL  EDSR  SRGAN 



Bicubic  0.8553  0.9186  0.8955  0.9303  0.9257  0.8791  0.8783  
Daubechies  0.8686  0.9019  0.8928  0.9110  0.9078  0.8985  0.8848  
DMeyer  0.8695  0.8998  0.8937  0.9044  0.8961  0.8940  0.8938  
Gaussian  0.8418  0.9048  0.8847  0.9125  0.9056  0.8768  0.8750  
Subsampling  0.8673  0.8208  0.8383  0.8214  0.8223  0.8544  0.8420  
Average  0.8605  0.8892  0.8810  0.8959  0.8915  0.8801  0.8748  


Technique  
NEDI  SRTSE  FPCSLAL  EDSR  SRGAN 



Bicubic  27.9  30.96  30.00  32.37  30.99  28.78  29  
Daubechies  28.08  29.40  29.19  29.36  28.91  29.46  29.58  
DMeyer  28.03  28.67  28.90  28.33  28.00  29.26  30.04  
Gaussian  27.61  29.43  28.82  29.94  29.31  29.00  29.14  
Subsampling  27.07  25.21  26.3  24.29  24.54  27.31  27.42  
Average  27.74  28.73  28.64  28.86  28.35  28.76  29.04  


Technique  
NEDI  SRTSE  FPCSLAL  EDSR  SRGAN 



Bicubic  25.34  26.85  26.39  27.58  27.02  25.84  25.91  
Daubechies  25.58  26.02  26.03  26.06  25.92  26.18  26.21  
DMeyer  25.57  25.69  25.90  25.51  25.42  26.06  26.49  
Gaussian  24.69  26.09  25.82  26.44  26.12  25.90  25.95  
Subsampling  24.51  21.69  22.91  20.83  21.42  23.73  23.69  
Average  25.14  25.27  25.41  25.28  25.18  25.54  25.65  


Technique  
NEDI  SRTSE  FPCSLAL  EDSR  SRGAN 



Bicubic  27.09  30.21  29.26  31.98  30.57  27.96  28.21  
Daubechies  27.40  28.6  28.46  28.7  28.32  28.79  29.06  
DMeyer  27.26  27.77  28.12  27.51  27.24  28.66  29.48  
Gaussian  26.73  28.51  28.00  29.16  28.58  28.17  28.31  
Subsampling  26.64  24.34  25.62  23.12  23.63  26.76  26.94  
Average  27.02  27.89  27.89  28.10  27.67  28.07  28.40  


Technique  
NEDI  SRTSE  FPCSLAL  EDSR  SRGAN 



Bicubic  23.47  27.39  26.51  28.33  27.49  25.65  25.83  
Daubechies  25.34  26.16  26.05  25.90  25.82  26.26  26.30  
DMeyer  25.20  25.47  25.81  25.19  24.99  26.16  26.62  
Gaussian  24.82  26.27  25.70  26.58  26.15  25.83  25.88  
Subsampling  24.21  19.93  23.03  20.54  21.12  24.01  23.97  
Average  24.61  25.04  25.42  25.31  25.11  25.58  25.72  


Technique  
NEDI  SRTSE  FPCSLAL  EDSR  SRGAN 



Bicubic  0.7895  0.8493  0.8289  0.8697  0.8399  0.8104  0.8127  
Daubechies  0.8003  0.8358  0.8229  0.8497  0.8174  0.8232  0.8180  
DMeyer  0.7950  0.8162  0.8098  0.8207  0.7823  0.8064  0.8201  
Gaussian  0.7825  0.8287  0.8118  0.8416  0.8199  0.8096  0.8123  
Subsampling  0.7964  0.7392  0.7598  0.7323  0.6937  0.7857  0.7718  
Average  0.7927  0.8138  0.8067  0.8228  0.7906  0.8070  0.8070  


Technique  
NEDI  SRTSE  FPCSLAL  EDSR  SRGAN 



Bicubic  0.6400  0.7121  0.6923  0.7362  0.7124  0.6738  0.6735  
Daubechies  0.6531  0.7008  0.69132  0.7204  0.6965  0.6914  0.6853  
DMeyer  0.6495  0.6878  0.6827  0.6959  0.6684  0.6782  0.6941  
Gaussian  0.6145  0.6838  0.6686  0.7000  0.6829  0.6677  0.6677  
Subsampling  0.6495  0.5816  0.6105  0.5735  0.5615  0.6444  0.6266  
Average  0.6413  0.6732  0.6690  0.6852  0.6643  0.6711  0.6694  


Technique  
NEDI  SRTSE  FPCSLAL  EDSR  SRGAN 



Bicubic  0.7792  0.8593  0.8272  0.8048  0.8598  0.8029  0.8088  
Daubechies  0.7917  0.8388  0.8167  0.8566  0.8229  0.8173  0.8195  
DMeyer  0.7823  0.8147  0.8012  0.8173  0.7790  0.8029  0.8220  
Gaussian  0.7658  0.8297  0.8074  0.8524  0.8291  0.8033  0.8075  
Subsampling  0.7910  0.7260  0.747  0.7044  0.6803  0.7848  0.7694  
Average  0.7820  0.8137  0.7999  0.8071  0.7942  0.8022  0.8054  


Technique  
NEDI  SRTSE  FPCSLAL  EDSR  SRGAN 



Bicubic  0.6260  0.7518  0.7243  0.7785  0.7496  0.7009  0.7025  
Daubechies  0.6878  0.7353  0.7207  0.7506  0.7231  0.7218  0.7135  
DMeyer  0.6826  0.7159  0.7083  0.7208  0.6866  0.7072  0.7205  
Gaussian  0.6609  0.7169  0.6997  0.7379  0.7176  0.6979  0.6970  
Subsampling  0.6845  0.5321  0.6398  0.5928  0.5743  0.6776  0.6539  
Average  0.6684  0.6904  0.6986  0.7161  0.6902  0.7011  0.6975  


Technique  
NEDI  SRTSE  FPCSLAL  EDSR  SRGAN 



Bicubic  0.8751  0.9301  0.9092  0.9465  0.9364  0.8948  0.8961  
Daubechies  0.8874  0.9171  0.9051  0.9257  0.9181  0.9054  0.9027  
DMeyer  0.8865  0.8995  0.8958  0.9039  0.8924  0.8951  0.9067  
Gaussian  0.8701  0.9117  0.8971  0.9215  0.9161  0.8948  0.8968  
Subsampling  0.8888  0.8500  0.8639  0.8466  0.8358  0.8816  0.8754  
Average  0.8816  0.9017  0.8942  0.9088  0.8998  0.8943  0.8955  


Technique  
NEDI  SRTSE  FPCSLAL  EDSR  SRGAN 



Bicubic  0.7435  0.8246  0.8038  0.8378  0.8361  0.7874  0.7870  
Daubechies  0.7611  0.8286  0.8165  0.836  0.8351  0.8063  0.8052  
DMeyer  0.7643  0.8204  0.8128  0.8243  0.8205  0.8024  0.8107  
Gaussian  0.7200  0.7985  0.7795  0.8048  0.8088  0.7733  0.7721  
Subsampling  0.7932  0.7764  0.8031  0.7588  0.7677  0.8158  0.8080  
Average  0.7564  0.8093  0.8031  0.8123  0.8137  0.7970  0.7966  


Technique  
NEDI  SRTSE  FPCSLAL  EDSR  SRGAN 



Bicubic  0.838  0.9004  0.7963  0.8682  0.9203  0.8582  0.8607  
Daubechies  0.8502  0.8870  0.8700  0.9012  0.8897  0.8685  0.8700  
DMeyer  0.8455  0.8713  0.8605  0.8798  0.8658  0.8599  0.8722  
Gaussian  0.8317  0.8817  0.8613  0.9038  0.8967  0.8576  0.8595  
Subsampling  0.8660  0.8236  0.8406  0.8067  0.8049  0.8641  0.8538  
Average  0.8463  0.8728  0.8457  0.8719  0.8755  0.8617  0.8632  


Technique  
NEDI  SRTSE  FPCSLAL  EDSR  SRGAN 



Bicubic  0.8010  0.9005  0.8745  0.9172  0.9097  0.8622  0.8605  
Daubechies  0.8465  0.8874  0.8743  0.8917  0.8879  0.8753  0.8716  
DMeyer  0.8482  0.8712  0.8665  0.8725  0.8646  0.8706  0.8771  
Gaussian  0.8190  0.8595  0.8554  0.8833  0.8809  0.8546  0.8530  
Subsampling  0.8539  0.7665  0.8291  0.7907  0.7915  0.8539  0.8414  
Average  0.8337  0.8570  0.8600  0.8711  0.8669  0.8633  0.8607  


Input Image Size (Time in Seconds)  







Computational Efficiency  Technique  
NEDI  NARM  SRTSE  FPCSLAL  EDSR  SRGAN 



Avg. Time(sec)  14.36  403  210  526  GPU  GPU  1.29  3.76  
Avg. PSNR  28.28  29  28.02  28.06  28.12862  27.42518  29.03  29.22  
Avg. SSIM  0.8223  0.835  0.8208  0.8055  0.826608  0.806168  0.8495  0.8472  
Avg. FSIM  0.9437  0.9499  0.9424  0.9405  0.944217  0.941171  0.9534  0.9487 
5 Conclusion
The paper proposes an approach that performs interpolation exploiting acrossscale ‘process similarity’ in wavelet decomposition. The first stage of our proposed approach is the modeling stage, where a model capturing the structural relation between scales is estimated. The model is in the form of optimal weights obtained through particle swarm optimization (PSO). The weights are used to fuse the high frequency components that are subsequently employed to estimate the input image from its low resolution approximation. The second stage is the generation stage, where these optimal weights modeling the structural relation between scales are used to generate the higher resolution image from the input image. In this stage, the approach considers the input image as the low resolution approximation of the output high resolution image to be generated.
We find through qualitative analysis that our process similarity based interpolation approach produces results with fewer artifacts. Quantitative analysis shows that on an average, our proposed approach, in terms of PSNR, SSIM and FSIM perform either the best or nearbest considering different databases containing color and gray scale images, and different scaling factors. The major advantage of our proposed approach is that it is very fast in terms of CPU computation time, much faster than all the algorithms compared.
The proposed approach has been implemented by performing both DWT and SWT based on the findings that they complement each other where the former provides value faithfulness and the latter provides accurate location correspondence. The implementation of our approach has been performed considering two different wavelet functions (Daubechies wavelet with support 4 and discrete approximation of Meyer filter with support 102) yielding two proposed algorithms. Subjective and quantitative comparisons between these algorithms have shown one of them to be marginally more preferable.
The modeling performed in this paper employing process similarity is based on wavelet decomposition. In future, process similarity can be explored as an independent concept beyond the use of wavelet decomposition.
Measure  Technique  
EDSR  SRGAN 



PSNR  30.37  28.93  30.75  30.93  
SSIM  0.8352  0.7833  0.8417  0.8392  
FSIM  0.9358  0.9152  0.9384  0.9376 
Measure  Technique  
EDSR  SRGAN 



PSNR  33.00  31.59  33.64  34.11  
SSIM  0.9367  0.8688  0.9139  0.9162  
FSIM  0.9579  0.9402  0.9593  0.9602 
Measure  Technique  
EDSR  SRGAN 



PSNR  27.41  27.10  27.59  27.85  
SSIM  0.7516  0.7234  0.7508  0.7506  
FSIM  0.8780  0.8780  0.8730  0.8740 
Measure  Technique  
EDSR  SRGAN 



PSNR  28.91  28.88  29.64  29.94  
SSIM  0.8447  0.7904  0.8101  0.8134  
FSIM  0.8998  0.8981  0.8960  0.8972 
6 Acknowledgment
Authors would like to thank Mr. Debanjan Sengupta and Ms. Anusha Vupputuri, Ph.D. students at IIT Kharagpur, for their help in completing this article. The authors would also like to thank the anonymous reviewers, as their contribution to bring the article to its best version is enormous.
References
References
Appendix A Discrete and Stationary Wavelet Transforms
Wavelet transform mallat1999wavelet is a powerful multiresolution analysis (MRA) mallat1989theory, designed to obtain the components of input signal at every location in different scales /resolutions. The discrete wavelet transform (DWT) mallat1989theory works on discrete signals to produce discrete coefficients quantifying the said components. A discrete signal can be represented as follows
(6) 
With unique real valued expansion coefficients , a acts as a basis function over which the signal is represented. In other words, given all the s and the corresponding set of basis functions, the signal can be generated.
In the field of waveletbased analysis and synthesis, the basis functions are of two types, known as scaling function and wavelet function. The scaling functions defined as s satisfy certain properties. In fact, , where , respectively represent scale and translation, and .
One property satisfied by scaling functions, which is the heart of discrete wavelet transform, is
(7) 
where and s are known as scaling function coefficients. Note that the above expression relates s at consecutive scales. Another function called the wavelet function () is also defined, which is related to the scaling function as follows
(8) 
(9) 
where s are known as wavelet function coefficients. Note that, the above expression relates and at consecutive scales with being at the finer scale. In such condition, any function can be decomposed to approximation coefficients (low frequency component) with scaling function and detail coefficients (high frequency component) with wavelet function. They are collectively referred to as the wavelet coefficients.
Now let us consider a discrete signal having M samples such that . So, is the highest (finest) possible scale and is the lowest (coarsest) possible scale. As the input signal is at the scale, it is decomposed into approximation and detail coefficients at scale as follows
(10) 
(11) 
where and denote the approximation and detail coefficients, respectively. The decomposed coefficients are combined in the following way for exact reconstruction.
(12) 
To generate the approximation and detail coefficients, the following convolutions and downsampling can be performed employing scaling and wavelet function coefficients.
(13) 
(14) 
Just as the scaling and wavelet function coefficients are used for the analysis above, they can be used for synthesis as well to reconstruct the signal. The diagram 12 depicts the one level decomposition (analysis) followed by reconstruction (synthesis) procedure.
The procedure applied on can be further applied on the approximation coefficients at scales till , if required, to obtain their approximation and detail coefficients. This suggests that is considered the approximation coefficient at the finest scale, that is, .
As the presence of downsampling and upsampling results in loss of translation invariance, in stationary wavelet transform (SWT) nason1995stationary, exactly the same procedure of decomposition is followed, but the downsampling and upsampling are not performed.
The above discussion considers signals of single dimension (e.g. time series). To handle 2dimensional signals like images, scaling and wavelet functions are required for both rowwise and columnwise operations. This leads us to
(15) 
where and are the two dimensional scaling and wavelet functions, respectively, which are assumed to be separable functions. Note that the wavelet functions are directionally sensitive as well. The wavelet decomposition and reconstruction for a 2dimensional signal is defined in (1), (2), (3) which are the 2dimensional extension of 1dimensional representations defined in (11),(12),(14) respectively.
Appendix B Particle Swarm Optimization
Particle swarm optimization (PSO) eberhart1995new acquires its inspiration from movement of the flock of birds searching for food and is very good at approximating solutions using relatively less sensitive and limited number of parameters compared to other such heuristic algorithms in a computationally inexpensive manner eberhart1995new; eberhart1998comparison; lee2006application; omran2004particle. Particle’s initial position and velocity are initialized either based on the prior knowledge of the problem or randomly. Movement of a particle in search space is evaluated based on two criteria. The first criterion is, personal experience, which is represented by the best value achieved by every individual particle and is termed as personal best (). The second criterion is, neighboring particle (including own) experience, which is represented by best value among all personal bests obtained so far in the population, and is known as global best (). In each iteration, position and corresponding velocity of each particle are updated based on its previous velocity, and .
The generic procedure of particle swam optimization (PSO) is shown in Algorithm 1. In a population of N particles, the corresponding particle position with respect to a Ddimensional search is defined as and velocity of the particle is defined as . The positions of particles in each dimension are initialized randomly in the range [ ] and the corresponding velocities in the range [ ]. A fitness function that optimizes desired criteria is employed. In every iteration, PSO computes the fitness of all solutions in the population, and updates and their . The updates take place only when the current and have higher fitness values respectively than the previous. Then they are used along velocity values to stochastically update the population (or position) to be employed in the next iteration. As mentioned by the authors eberhart1995new, the implementation of PSO requires only primitive mathematical operators making it computationally inexpensive in terms of memory and speed. But the technique may get trapped in local minima for a nonconvex problem especially with limited particles and iterations.
It is straightforward that in our use of PSO, the weights form 6dimensional search space from where the of Algorithm 1 takes values, and the fitness function is PSNR. In our algorithm, we have set the values of , as 2 and as 1.
Comments
There are no comments yet.