1 Introduction
In recent years we see an interesting trend, in which many image restoration algorithms choose to operate on local image patches rather than processing the image as a whole. These techniques impose statistical prior knowledge on the patches of the processed image. Surprisingly, in many cases these methods lead to stateoftheart results. For example, in the Gaussian denoising case, the algorithm presented in [3] performs denoising using a statistical model based on sparse representation of the image patches, training a dictionary using the KSVD algorithm. The BM3D algorithm reported in [4] exploits interrelations between patches by grouping them into 3D groups, and applying collaborative filtering on them that is based on sparse representation as well. The work reported in [5, 6, 7] extends PCA and dictionary learning, both in the context of patches for handling severe Poisson image denoising. The scheme reported in [8] adopts the sparse representation model to handle the single image superresolution problem. The papers [9, 10] both propose a GMM modeling of image patches, and demonstrate the effectiveness of this to variety of inverse problems. The NSCR method by Dong et. al. [11] uses sparse representation for solving both the superresolution and the deblurring problems. The IDDBM3D method in [12] employs BM3D frames for image deblurring. All these papers and many others rely on operating on patches in order to complete the restoration task at hand. Many works use sophisticated priors when operating locally, and most often they resort to a simple averaging when combining the restored patches.
The work reported in [1] takes another step toward exploiting interrelations between image patches in the image. The work reported in [1] proposes to construct a 1D smoothed signal by applying a permutation on the pixels of the corrupted image. The permutation is obtained by ordering image patches to form ”the shortest possible path”, approximating the solution of the traveling salesman problem (TSP). Given the sorted image, the clean image is recovered by applying a 1D filter on the ordered signal. This method is simple and yet it leads to highquality results. On the down side, it is limited to the Gaussian denoising and inpainting problems. The work reported in [2] takes an extra step in exploring the patchordering concept. This work constructs a sophisticated and very powerful regularizer by combining the patchpermutation idea with a wavelet pyramid [13, 14]. The obtained regularizer is used for solving general inverse problems in imaging. This method leads to highquality results in series of tests, however it is quite involved.
In this paper we propose to compose the whole image from local patches using a prior that exploits interrelation between them. We harness the patchpermutation idea, merging it with the classical Maximum A’posteriori probability (MAP) estimator, by proposing a new, simple, yet powerful regularization for inverse imaging problems. We formulate the inverse problem as a weighted sum of two penalty terms: (i) a regular negative loglikelihood, and (ii) a novel regularization expression that forces smoothness in a robust way, by basing it on reordered list of the image pixels, obtained according to the similarity of image patches.
For constructing the permutationbased regulatization, we follow the core idea presented in [1]. We rely on the assumption that in an ideal image, close similarity between patches indicates a proximity between their center pixels. We therefore build a 1D (piecewise) smoothed signal by applying a patchbased permutation on the restored (unknown) image pixels. The permutation is obtained by extracting all possible patches with overlaps from the currently recovered image, and ordering them to form the (approximated) shortest possible path. The resulting ordering induces the permutation.
The proposed regularization forces smoothness on the obtained 1D signal via a Laplacian, penalized by the robust norm. Since the ordering is associated with all the pixels withing the patches, the smoothness term is accumulated over all these pixels. We also deploy weights that take into account the actual distances between the consecutive patches in the ordering.
The proposed scheme is demonstrated on several different problems: (i) White additive Gaussian image denoising, (ii) severe Poisson image denoising, (iii) image deblurring, and (iv) single image superresolution. We initialize our algorithm with an output of a recent method that handles each of these problems. The reconstructed image is then obtained by minimizing the penalty function described above using the LBFGS method [15, 16]. Our extensive experiments indicate that applying the proposed scheme leads to stateoftheart results.
We should note that the proposed scheme bares some similarity to recent work offering regularization of inverse problems by utilizing the similarity between patches formed as a graph [17, 18, 19, 20, 21, 22, 23, 24, 25]. These works propose various formats of using this graph’s Laplacian as a sparsifying operator. Our approach could be considered as a special case of such a Laplacian regularization, which forces the graph to be a simple and continuous chained ordering of the image pixels. As such, the regularization we obtain is simpler and easier to manage (since we keep only one forward and one backward neighbors per each pixel). In addition, our approach also provides a stronger stabilizing effect for highly illposed problems since it ties all the pixels to each other.
The paper is organized as follows. In Section 2 we discuss the principles behind the permutation construction, and how it becomes useful as a regularizer. Section 3 describes the proposed algorithm, along with the numerical scheme used. Section 4 presents experiment results and compares the new method with other leading schemes. Section 6 concludes this paper and raises directions for a future work.
2 Constructing the permutation
A common assumption in image processing is that clean images are usually (piecewise) smooth, i.e. the difference between any two neighboring pixels tends to be small. Most image processing algorithms, be it for restoration, segmentation, compression, and more, rely on this model to some extent. The problem with this assumption, however, is that violation from this behavior is due to image edges and texture, and both are central in forming and defining the visual content of an image, and as such, they cannot be sacrificed as simple outliers.
Adopting a totally different perspective towards handling of an image, a convenient and often used technique for developing an image processing algorithm is to convert the 2D image into a 1D array, treat the image as a vector, and then convert the resulting 1D array back to a 2D array. There are several popular scan methods that convert an image to a vector, the prominent of which are raster scan, zigzag scan, Hilbert Peano and other spacefilling curves.
The question we pose now is this: how can we combine the two approaches mentioned above? Namely, given a corrupted image , how should we construct a 2Dto1D conversion that produces the smoothest possible vector when applied to a clean image, . Such a conversion would be extremely helpful for image restoration tasks, because in the 1D domain its processing is expected to be very simple and effective. We should emphasize that the sought conversion method must be robust, i.e. be able to produce a meaningful ordering even when operated on a corrupted data.
A word on notations: The 2Dto1D conversion we are seeking is denoted by , and represented by a permutation matrix . I.e., applying to columnstacked image , , produces a vector with the reordering .
In this work we build on the ideas presented in [1], while giving it a new and novel interpretation. In order to propose such a smoothing conversion, we shall assume that each pixel of the clean image can be represented by the corresponding surrounding patch taken from the corrupted image . In other words, the assumption is that if two patches of the corrupted (or clean) image are close in terms of some distance function, then their corresponding central pixels in the clean image are expected to be similar as well. We will refer hereafter to this as the assumption. Throughout this work we shall use the Euclidean distance for assessing proximity between patches. We extract all possible patches of size with overlaps from the image , where . Our method refers to the patches as points in , defining a graph where the patches are its vertices and distances between them are the edges.
Recall that our aim is to find an ordering (or , its permutation matrix equivalent) such that the resulting permuted clean image is the smoothest. This can be defined as the following minimization task:
(1) 
where is a 1D difference operator (simple derivative). Interestingly, it is tempting to propose an ordering of the image patches in based on a simple sort of their center pixels, as it will lead to the ideal (smoothest) permutation. However, since we assume that is not available and instead we have , this option is impossible. Thus, the above objective is replaced by an alternative that relies on the assumption made above:
(2) 
This means that the graph vertices are ordered to form the shortest possible path that visits each vertex exactly once. Namely, we formulate the quest of as a classic Traveling Salesman Problem (TSP) [26].
A natural way to proceed is to use a known approximation algorithm that is known to perform well for the TSP. The problem with this approach is that the permutation induced from a toogood TSP solver is in fact creating and magnifying artifacts. An example for this behavior and the artifacts induced are shown in Figure 1, where we use the LinKernighan algorithm for approximating the TSP solution [27, 28], and then apply our recovery algorithm. We should note that the two orderings (our approach, as described in Algorithm LABEL:alg:reordering and LinKernighan) lead to an average total variation measure, , of and , respectively, when assessed on the original (true) image. This implies that the better ordering leads to % improvement in terms of the smoothness obtained, but this does not translate to better outcome in our algorithm.
The problem with a toogood TSP solver is that it adjusts the ordering to artifacts in the initial image, thereby creating correlation between the regularizer and the artifacts, which magnifies these artifacts.
The output of our scheme with LinKernighan heuristics or Algorithm
LABEL:alg:reordering for solving the TSP problem. The scheme is applied for the Gaussian denoising task with , and initialized with the BM3D result.In order to overcome this artifact magnification problem we use a randomized approach for the TSP solution. More specifically, we use the randomized version of NN (Nearest Neighbor) heuristics presented in [1] as a TSP solver. This algorithm starts from an arbitrary patch and continues by finding the closest two neighbors to the currently held vertex with the restriction that they have not been assigned yet, choosing one of them at random. At the th stage of the algorithm we have accumulated already vertices. Given the last of them, , we choose either or , its two closest neighbors, with probabilities and , where is chosen such as . The nearest neighbor search is performed from within the set of unvisited patches, and it is limited to a window of size around . If there is only one unvisited patch in this region, the heuristics chooses it. When no unvisited patches remain, the first and second nearest neighbor search is performed among all unvisited image patches. The square restriction is designed to reduce computation complexity; however it is important also for assuring that relevant patches are matched. The randomized NN heuristics is summarized in Algorithm LABEL:alg:reordering. An example of a reordered image is presented in Figure 2.
A drawback of the described reordering is it’s greediness nature. Indeed, as can be seen in Figure 2, the last part of the reordered clean image is not smooth. This is due to the fact that in the last stages of Algorithm LABEL:alg:reordering very few unvisited patches remain for the nearest neighbor search. While this may seem troubling, in section 5 we shall explain why this phenomenon has little effect on the final restored results.
3 The reconstruction algorithm
3.1 Constructing the Regularization Term
Our regularizer is constructed as a sum of several smoothness terms. A smoothness term is obtained by applying the matrices , and to the columnstacked image , , and penalizing the result by the robust norm:
(3) 
In this expression is the permutation matrix that represents the ordering obtained by the TSP solver. is a simple 1D Laplacian, and is a weighting diagonal matrix of the form , where
(4) 
The weights take into account the distances between the consecutive patches in the ordering. Intuitively we expect that the centers of the closer patches will be more similar. Therefore the weights are chosen to be inversely proportional to the norm of the 1DLaplacian of the corresponding patches. In other words, the coefficients are calculated using the following formula:
(5) 
where , and are the ordered patches. The boundary cases (, and ) are handled by setting , and . We note that if would have been defined as in Equation (5), but assuming patches of size pixels (i.e., only the center pixel is the actual patch), then our regularization simplifies to become the norm. The reason is that in this case hold simply the absolute values of the 1D Laplacian over the ordered pixels. The penalty term itself computes the very same Laplacian and divides by these values. Thus, all those ratios are ’1’es for nonzero values and ’0’ elsewhere, thus obtaining an norm. Thus, the term in Equation (3) seeks to sparsify the second derivative of the permuted image in a robust way.
The problem with the coefficients is that the distances between the patches which contain edges or texture are usually relatively high, and therefore these weights are relatively low. The coefficients that appear in Equation (4) are designed to overcome this problem, by magnifying the weights for edge/texture patches,
(6) 
where is a parameter to be set. In order to identify patches that contain edges or texture, our algorithm calculates the magnitude of image gradient using the central difference method. Namely, , the magnitude of the gradient at location (, ), is calculated using:
where and are the horizontal and vertical gradients at location (, ). If we denote by value of the pixel at location (, ), then the and are given by:
A patch is identified as active if the sum of its gradient magnitudes is above a threshold , i.e:
(7) 
Finally, the values of are clipped by in order to reduce the condition number of the operator MLP, thereby increasing the rate of convergence of the optimization problem.
3.2 Subimage Accumulation
In the smoothness term in Equation (3) the ordering is associated only with the central pixels of the patches. However, since the permutation originates from full patch ordering, it makes sense to associate the ordering with all the pixels withing the patches, as shown in Figure 3. Therefore, we propose to construct the regularizer as a sum of the smoothness terms for all pixels within the patches:
(8) 
The operator associates an (i,j)shifted subimage of with the ordering . The notion of subimages and their role here is depicted in Figure 4
. First, the image is padded using mirror reflection of itself with
pixels on all sides. Then extracts an subimage starting from the (i,j)th location in the padded image. For example: in Figure 2(a) we are referring to the location , while in Figure 2(b) it is .Actually, accumulating the smoothness terms over patch pixels increases the number of orderings from 1 to , and introduces an implicit spatial prior. This way the similarity is forced between the whole patches rather than only between their central pixels. An example of the output of our reconstruction scheme without subimage accumulation is shown in Figure 5. The resulting image seems very rugged due to the lack of a spatial smoothing prior.
The work reported in [1] suggests to further increase the number of the orderings by applying the TSP solver 10 times (each time starting it from a random patch ), this way creating 10 different permutation matrices. We explored this option by accumulating the regularization term in Equation (8) over a group of such permutations. For the regularization we employ here, our finding suggests that this approach has a negligible effect on the results.




3.3 Building the Objective Function
In order to build the objective function, we denote by the column stacked version of the corrupted image, by the desired image, and by the regularization term in Equation (8). Then, the inverse problem is formulated as a weighted sum of the negative loglikelihood term , and the regularizer . In order to keep values within the pixel boundaries we add to the objective function two penalties: and . Therefore, the reconstruction algorithm is generally formulated as the following minimization problem:
(9) 
where is defined as:
(10) 
Each component of is greater than zero when , and grows with the difference . The parameter controls the strength with which we enforce this penalty.
In order to use a derivative based optimization method for solving the obtained minimization problem, the energy function in Equation (9) is smoothed by replacing the absolute values with , when
(11) 
In other words, is replaced with its smooth version , and with , where:
(12) 
and
(13) 
The function is smooth and convex (in ), since
In all our simulations we solve the unconstrained smoothed optimization problem using LBFGS method [16] implemented in minFunc [15]. The minimization task runs approximately 200300 iterations. For images it takes around 12 minutes, and for images 510 minutes. Our simulations ran on Intel i7 core with 16GB RAM. The code of Algorithm 1 and parts of the code of the minFunc are implemented in C. The rest of the code is implemented in Matlab.
4 Experimental Results
4.1 Gaussian Denoising
For the Gaussian denoising task, the smooth unconstrained optimization problem is formulated as:
(14) 
We run denoising experiments for noise levels . Our scheme is initialized with the output of the BM3D algorithm [4], i.e. the ordering is calculated using the BM3D output. The simulation parameters are summarized in tables 2 and 1. In table 3 we bring quantitative results of these experiments. For each test we compare the PSNR achieved by our scheme with the one referring to the initial images and show the improvement. Note that we do not bring SSIM measure of quality in this table, simply because the conclusions these values lead to are the same as the ones drawn from the PSNR. Examples of qualitative results are shown in Figure 6. For and higher, we get an improvement in almost all experiments. For medium noise level, for example , our scheme does not succeed to improve the PSNR in most of the experiments, because the global patch ordering forces selfsimilarity on the image patches, and therefore it tends to eliminate distinctive details. In addition, our algorithm does not perform well on image areas that contain sensitive texture or high amount of edges. Examples of such areas are: striped pants in the Barbara image and friction ridges in the fingerprint image.
Along with the experiments described above we also performed the Gaussian denoising experiment (with ) while replacing the distance by the SSIM index. We calculated the SSIM measure of small patches using the algorithm presented in [29] with two differences: (i) we used Gaussian weighting function with
for computing the local statistics (mean, variance and covariance), instead of
Gaussian used in [29]; (ii) the local statistics of any pixel in the patch was calculated using only the pixels that belong to the patch. For example, for computing the local statistics of the topleft pixel of the patch, , we used of the Gaussian, i.e. pixels , where . In fact, full Gaussian window was used only for calculating the statistics of the central pixel of the patch. Unfortunately, this scheme led to performance deterioration when compared to the option. We believe that the reason for such behavior is this: SSIM ordering minimizes the distance between patches in terms of their mean, variance and covariance, while our regularizer penalizes for distance between pixels. Thus, in this case, the ordering and the regularizer are not consistent with each other. Clearly, this opens up an opportunity to redefine the regularizer to work in terms of the SSIM measure. We leave this as a future extension of our work.c  B  

1.5  20  3.5  1  7  121 
25  50  75  100  

/ PSNR  25 / 20.17  50 / 14.15  
BM3D  Our  Improvement  BM3D  Our  Improvement  
Cameraman  29.43  29.44  0.01  26.15  26.65  0.50 
House  32.88  33.05  0.17  29.64  30.21  0.56 
Peppers  30.23  30.38  0.15  26.70  27.09  0.39 
Montage  32.32  32.59  0.28  27.79  28.57  0.78 
Lena  32.06  31.96  0.10  29.01  29.13  0.12 
Barbara  30.68  30.39  0.30  27.23  27.15  0.08 
Boats  29.87  29.66  0.20  26.70  26.81  0.11 
Fprint  27.71  27.15  0.56  24.53  24.22  0.32 
Man  29.59  29.52  0.07  26.79  26.89  0.11 
Couple  29.69  29.68  0.01  26.45  26.61  0.16 
Hill  29.83  29.70  0.13  27.15  27.22  0.07 
Average  30.39  30.32  0.07  27.10  27.32  0.22 
/ PSNR  75 / 10.63  100 / 8.13  
BM3D  Our  Improvement  BM3D  Our  Improvement  
Cameraman  24.36  25.01  0.65  23.10  23.73  0.63 
House  27.48  28.18  0.69  25.90  26.66  0.77 
Peppers  24.71  25.13  0.42  23.27  23.68  0.41 
Montage  25.39  26.36  0.97  23.75  24.71  0.97 
Lena  27.19  27.41  0.21  25.85  26.14  0.29 
Barbara  25.15  25.20  0.05  23.65  23.74  0.09 
Boats  25.01  25.17  0.16  23.85  24.03  0.18 
Fprint  22.82  22.64  0.19  21.59  21.50  0.10 
Man  25.29  25.44  0.15  24.21  24.38  0.17 
Couple  24.71  24.86  0.14  23.54  23.65  0.11 
Hill  25.63  25.76  0.13  24.57  24.71  0.14 
Average  25.25  25.56  0.31  23.93  24.27  0.33 
4.2 Debluring
For the debluring task the smooth unconstrained optimization problem is the following:
(15) 
where is a blur matrix.
We repeat the experiments reported in [12]. The blur parameters for each experiment, PSF (point spread function) and of the noise, are summarized in table 4. The PSFs are normalized so that . Our scheme is initialized with the output of the IDDBM3D algorithm [12]. The simulation parameters are summarized in tables 5 and 1. In table 6 we bring quantitative results of the deblurring experiments. For each we compare the PSNR achieved by our scheme with one of the initial images and show the improvement. Examples of qualitative results are shown in Figure 7. Our algorithm improved the image quality in most experiments.
Scenario  PSF  

1  2  
2  8  
3  uniform  
4  49  
5  Gaussian with  4 
6  Gaussian with  64 
Scenario  1  2  3  4  5  6 

Test  Image  BSNR  Input  IDDBM3D  Our  Improvement  
PSNR  SSIM  PSNR  SSIM  PSNR  SSIM  PSNR  SSIM  
1  C.man  31.87  22.23  0.709  31.11  0.891  31.49  0.904  0.38  0.013 
House  29.16  25.61  0.767  35.54  0.889  36.02  0.896  0.48  0.007  
Lena  29.89  27.25  0.882  35.20  0.972  35.63  0.978  0.43  0.006  
Barbara  30.81  23.34  0.795  30.97  0.970  31.09  0.974  0.12  0.003  
Average  30.44  24.61  0.788  33.20  0.931  33.56  0.938  0.35  0.007  
2  C.man  25.85  22.16  0.668  29.31  0.862  29.75  0.873  0.44  0.011 
House  23.14  25.46  0.724  33.99  0.869  34.56  0.879  0.57  0.009  
Lena  23.87  27.04  0.872  33.64  0.957  34.12  0.966  0.49  0.008  
Barbara  24.79  23.25  0.789  27.20  0.926  27.29  0.930  0.08  0.004  
Average  24.43  24.47  0.763  31.04  0.904  31.43  0.912  0.39  0.008  
3  C.man  40.00  20.77  0.624  31.24  0.899  31.17  0.910  0.08  0.011 
House  40.00  24.11  0.697  36.98  0.918  37.57  0.928  0.60  0.009  
Lena  40.00  25.84  0.829  34.74  0.968  35.14  0.973  0.40  0.004  
Barbara  40.00  22.49  0.737  28.53  0.933  28.31  0.930  0.23  0.003  
Average  40.00  23.30  0.722  32.87  0.930  33.05  0.935  0.18  0.006  
4  C.man  18.53  24.63  0.609  28.63  0.858  29.21  0.873  0.58  0.014 
House  15.99  28.08  0.631  33.85  0.868  34.41  0.879  0.57  0.011  
Lena  16.47  28.81  0.903  33.76  0.957  34.29  0.967  0.52  0.010  
Barbara  17.35  24.22  0.849  26.09  0.908  26.17  0.913  0.07  0.005  
Average  17.17  26.44  0.748  30.58  0.898  31.02  0.908  0.44  0.010  
5  C.man  29.19  23.36  0.734  27.69  0.858  28.58  0.873  0.88  0.015 
House  26.61  27.82  0.794  33.55  0.874  34.17  0.883  0.62  0.009  
Lena  27.18  29.16  0.928  34.00  0.968  34.38  0.973  0.38  0.005  
Barbara  28.07  23.77  0.831  24.93  0.883  25.01  0.886  0.07  0.003  
Average  27.77  26.03  0.822  30.04  0.896  30.53  0.904  0.49  0.008  
6  C.man  17.76  29.83  0.703  34.69  0.932  34.89  0.939  0.21  0.007 
House  15.15  30.00  0.682  37.08  0.920  36.74  0.911  0.34  0.010  
Lena  15.52  30.02  0.911  36.34  0.972  36.32  0.972  0.02  0.000  
Barbara  16.59  29.78  0.939  35.22  0.979  35.21  0.980  0.01  0.000  
Average  16.36  29.91  0.809  35.83  0.951  35.79  0.951  0.04  0.001 
4.3 SuperResolution
For the single image SuperResolution (SR) task, our problem is:
(16) 
where blurs the high resolution image with a
Gaussian kernel with standard deviation 1.6, and
downsamples the image by a scaling factor 3 in each pixel. In addition, Gaussian noise with is added to the low resolution image.We repeat the experiments reported in [11], including both noiseless and noisy cases. Our scheme is initialized with output of NCSR algorithm [11]. The simulation parameters are summarized in tables 7 and 1. In table 8 are shown the quantitative results of the SR experiments. Qualitative results are shown in Figure 8. Our algorithm improves the image quality almost in all experiments.
Test  Noiseless  Noisy 

Noiseless  
Image  NCSR  Our  Improvement  
PSNR  SSIM  PSNR  SSIM  PSNR  SSIM  
Butterfly  28.11  0.916  29.43  0.932  1.32  0.017 
Flower  29.51  0.856  29.82  0.865  0.31  0.009 
Girl  33.36  0.827  33.36  0.824  0.00  0.003 
Parthenon  27.18  0.751  25.39  0.757  0.21  0.006 
Parrot  30.52  0.914  31.02  0.921  0.50  0.006 
Raccoon  29.28  0.771  29.41  0.766  0.13  0.004 
Bike  24.75  0.803  25.15  0.815  0.40  0.011 
Hat  31.25  0.870  31.58  0.877  0.33  0.007 
Plants  34.00  0.918  34.60  0.924  0.61  0.006 
Average  29.81  0.847  30.23  0.853  0.42  0.006 
Noisy  
Image  NCSR  Our  Improvement  
PSNR  SSIM  PSNR  SSIM  PSNR  SSIM  
Butterfly  26.87  0.888  28.00  0.903  1.13  0.15 
Flower  28.08  0.793  28.44  0.807  0.36  0.14 
Girl  32.02  0.764  32.10  0.767  0.07  0.004 
Parthenon  26.38  0.699  26.63  0.710  0.25  0.010 
Parrot  29.51  0.877  29.86  0.879  0.35  0.003 
Raccoon  28.02  0.681  28.14  0.689  0.12  0.008 
Bike  23.79  0.737  24.28  0.760  0.49  0.023 
Hat  29.96  0.824  30.37  0.830  0.41  0.006 
Plants  31.74  0.859  32.22  0.866  0.49  0.006 
Average  28.49  0.791  28.89  0.801  0.41  0.010 
4.4 Poisson denoising
The Poisson negative loglikelihood is given by the following formula:
(17) 
where
Since are not defined for negative values, we extrapolate for using the second order Taylor series:
Then we construct the extrapolated likelihood term, , as a sum of the functions:
(18) 
Since the functions penalize the negative values when , the corresponding th components of the function can be omitted. Therefore, the function is reduced to the function:
(19) 
The difference between the and is that the summarizes only over the components for which . Thus, the unconstrained optimization problem for the Poisson denoising is formulated by:
(20) 
where is calculated using the peak value: , and is the maximum pixel value of the clean image.
We run experiments for several noise levels – peak = 4, 2, 1, 0.5, 0.2 and 0.1. Our scheme is initialized with the output of the SPDA algorithm [7]. Since SPDA achieves better results for low peaks using binning, we also use binning in the simulations with peaks: 0.5, 0.2 and 0.1. The simulation parameters are summarized in tables 10 and 9. In table 11 we show quantitative results of the experiments. Examples of qualitative results are shown in Figure 9. For the high peak values (peaks 4 and 2 without binning, and peak 0.5 with binning) our algorithm improves the image quality almost in all experiments.
Bin  c  B  

no  5  1  9  201  
5  1  7  101 
Peak  4  2  1  0.5  0.2  0.1 
20  N/A  N/A  10  N/A  N/A  
2.5  1  1  2.5  1  1  
Peak  Method  Bridge  Camera  Flag  House  Peppers  Saturn  Swoosh  Average 
4  SPDA  20.55  21.87  26.75  26.02  22.02  31.03  32.67  25.85 
0.367  0.673  0.883  0.753  0.699  0.853  0.960  0.741  
Our  20.92  22.84  28.35  27.15  23.68  31.43  33.71  26.87  
0.398  0.664  0.905  0.783  0.726  0.905  0.969  0.764  
Impr.  0.37  0.97  1.60  1.13  1.66  0.40  1.03  1.02  
0.031  0.009  0.023  0.031  0.027  0.052  0.009  0.023  
2  SPDA  21.14  21.49  25.41  25.07  21.17  29.36  29.24  24.55 
0.352  0.651  0.864  0.716  0.662  0.820  0.893  0.708  
Our  20.22  21.93  26.30  25.63  21.77  29.41  30.15  25.06  
0.362  0.605  0.894  0.749  0.684  0.871  0.930  0.728  
Impr.  0.08  0.43  0.89  0.56  0.60  0.05  0.91  0.50  
0.010  0.045  0.029  0.033  0.022  0.050  0.038  0.020  
1  SPDA  19.21  20.17  22.69  22.62  19.94  27.02  26.41  22.58 
0.304  0.587  0.821  0.632  0.609  0.778  0.823  0.650  
Our  19.24  20.46  22.36  23.20  19.99  26.93  27.37  22.79  
0.315  0.537  0.819  0.677  0.621  0.826  0.880  0.668  
Impr.  0.03  0.28  0.33  0.57  0.04  0.09  0.96  0.21  
0.011  0.049  0.001  0.045  0.013  0.048  0.056  0.018  
0.5  SPDAbin  18.58  18.94  19.39  21.23  18.60  25.89  26.56  21.31 
0.273  0.548  0.691  0.627  0.543  0.756  0.910  0.621  
Our + bin  18.69  19.68  19.63  21.72  18.92  26.37  27.55  21.79  
0.285  0.558  0.703  0.656  0.571  0.835  0.933  0.649  
Impr.  0.10  0.74  0.24  0.49  0.32  0.48  0.98  0.48  
0.012  0.010  0.012  0.030  0.028  0.079  0.023  0.028  
0.2  SPDAbin  17.88  17.97  18.60  19.59  17.57  24.03  23.70  19.91 
0.253  0.478  0.670  0.521  0.492  0.691  0.787  0.556  
Our + bin  17.83  18.29  18.58  20.16  17.56  24.09  24.88  20.20  
0.250  0.500  0.673  0.607  0.523  0.813  0.893  0.609  
Impr.  0.05  0.31  0.02  0.58  0.01  0.05  1.17  0.29  
0.003  0.023  0.003  0.086  0.031  0.122  0.106  0.053  
0.1  SPDAbin  17.04  16.80  16.22  18.78  16.29  21.94  21.99  18.44 
0.222  0.493  0.578  0.567  0.477  0.656  0.827  0.546  
Our + bin  17.10  17.16  16.03  18.91  16.28  21.72  22.38  18.51  
0.227  0.462  0.592  0.590  0.491  0.787  0.874  0.575  
Impr.  0.06  0.36  0.19  0.13  0.01  0.11  0.38  0.07  
0.005  0.030  0.014  0.024  0.014  0.131  0.047  0.029 
5 Discussion
5.1 Relation to the work reported in [1, 2]
There is an interesting connection between the approach presented in this paper and the work reported in [1, 2], which introduced the idea of patchordering. This connection is easiest to explain in the context of Gaussian denoising. If in our objective function in Equation (9) we replace the robust statistics () regularization with a Euclidean norm (), remove the penalties and , remove the weighting matrix, and do not employ subimage accumulation, we get the following minimization problem
(21) 
which can be rewritten as
(22) 
since P is unitary. The solution of this problem is
(23) 
In other words,
(24) 
where is a circulant matrix that represents a convolution filter. Therefore, the image denoising task obtained by the objective function in (21) reduces to applying permutation on the noisy image , smoothing the permuted signal with a convolution filter , and obtaining the result by applying the inverse permutation . This solution is similar to the basic idea of the scheme in [1], as formulated in Equation (2) in part ii@.
Now, lets add subimage accumulation in order to get closer to the actual scheme used in both our work and [1]. The minimization problem in (21) transforms into
(25) 
If we pad the image by a circular repetition, the matrices are unitary and circulant. In this case the objective function in Equation (25) can be rewritten as the sum of contributions of all subimages,
(26) 
where the contribution of the subimage is
(27) 
The approach taken in this work is to optimize this penalty directly (if we would have chosen to use the as a regularizer). In contrast, one could apply a suboptimal minimization strategy that minimizes the contribution of each subimage independently and averages the obtained values of . In fact, this is closely related to the approach taken in [1]. The result of this suboptimal approach will be
(28) 
where minimizes contribution of the subimage
(29) 
Equation (28) can be rewritten as
(30) 
where is the convolution filter we have seen before. Therefore, the formula in (30) reduces to the following operations: (i) extract all possible subimages using the operators and apply permutation on each subimage, (ii) smooth the permuted signals with convolutional filter , (iii) apply inverse permutation on the results, (iv) plug each subimage into its original place and average the results. To conclude, the formula in Equation (30) is reminiscent of the denoising strategy presented in Equation (7) of part ii@ in [1]. We should stress, however, that the work in [1] is not a special case of the method presented in this paper, due to various additional enhancements used in [1], which are not exploited in this work. These include learning the linear filter, cyclespinning over the choice of the permutation, and more.
The work reported in [2], as well as the presented work here, constructs an objective function with a permutationbased regularization. However, the scheme in [2] uses multiscale decomposition within the regularization, which makes the algorithm quite involved and computationally expensive. As a consequence, our method is simpler than the one proposed in [2], while remaining very effective. In Table 12 we compare the performance of our scheme with the algorithms presented in [1, 2] for the Gaussian denoising problem. In Table 13 we bring a comparison with [2] for image debluring. Tables 12 and 13 show that our method outperforms the scheme in [1], and achieves results that are comparable to the ones reported in [2].
/ PSNR  25 / 20.17  50 / 14.15  
[1]  [2]  Ours  [1]  [2]  Ours  
Lena  31.80  32.26  31.96  28.96  29.30  29.13 
Barbara  30.47  30.90  30.39  27.35  27.78  27.15 
Boats  29.70  29.88  29.66  26.69  26.91  26.81 
Fprint  27.34  27.32  27.15  24.13  24.06  24.22 
House  32.54  32.37  33.05  29.64  29.56  30.21 
Peppers  30.01  30.33  30.38  26.75  26.93  27.09 
Average  30.31  30.51  30.43  27.25  27.42  27.44 
/ PSNR  75 / 10.63  100 / 8.13  
[1]  [2]  Ours  [1]  [2]  Ours  
Lena  27.22  27.50  27.41  26.01  26.36  26.14 
Barbara  25.42  25.82  25.20  24.07  24.46  23.74 
Boats  24.99  25.15  25.17  23.90  24.04  24.03 
Fprint  22.47  22.47  22.64  21.44  21.53  21.50 
House  27.79  27.37  28.18  26.30  25.98  26.66 
Peppers  24.72  24.98  25.13  23.21  23.56  23.68 
Average  25.44  25.55  25.62  24.16  24.32  24.29 
Image  Method  Test 1  Test 2  Test 3  Test 4  Tests 5  Test 6 

Lena  [2]  8.56  6.92  8.86  5.52  4.95  6.91 
ours  8.38  7.08  9.30  5.48  5.22  6.30  
Barbara  [2]  8.06  4.57  6.01  2.20  1.41  6.06 
ours  7.75  4.04  5.82  1.95  1.24  5.43  
House  [2]  10.44  8.79  13.11  6.38  5.95  7.56 
ours  10.41  9.10  13.46  6.33  6.35  6.74  
C.man  [2]  9.24  7.38  10.21  4.34  4.68  5.26 
ours  9.26  7.59  10.40  4.58  5.22  5.06  
Average  [2]  9.08  6.92  9.55  4.61  4.25  6.45 
ours  8.95  6.96  9.75  4.58  4.50  5.88 
5.2 Choosing the TSP solver
In section 2 we have shown artifact magnification effect when using a deterministic TSP solver. In this section we revisit this phenomena in order to better explain and demonstrate it. Recall that our algorithm starts with some reconstruction result, which is naturally not perfect and thus may contain artifacts. Our regularization relies on a permutation of patches from this image. Thus, a permutation obtained by a more exact TSP solver is in fact creating and magnifying artifacts, since it will assign patches with the same artifacts as neighbors in our ordering. If, on the other hand, one uses the more crude (and somewhat random) ordering that we propose, such similar patches will be separated in the ordering, thus leading to a smoothing out effect of some of these artifacts. In section 2 Figure 1 is showing an example of artifact magnification in the Gaussian denoising test. To complete this presentation, we bring here examples of this effect in other applications. In Figure 10 we compare between the outputs of the proposed method with LinKernighan heuristics and Algorithm LABEL:alg:reordering for the debluring task. In Figures 11 and 12 we do the same comparison for the superresolution and Poisson denoising tasks respectively. Our conclusion from these experiments is that a more exact TSP solver magnifies artifacts in all applications.
5.3 Patch Reordering for Better Sparsification
In order to better understand the role of the permutation in the proposed algorithm, we compare the tail distribution functions of the Laplacian result with our ordering versus a zigzag scan. The Laplacian result are calculated by computing the vector , which is the essence of the regularization term we propose. Using all images listed in Table 3
(referring to the Gaussian denoising test) we compute the cumulative distribution function,
by applying a cumulative sum on the histogram of the components of . For comparison we calculate this while replacing our with the one obtained by a horizontal zigzag scan ordering. A comparison of the tail distribution function, for zigzag scan and our permutation is shown in Figure 13. As can be seen, our ordering leads to better sparsification of the Laplacian result, when applied to clean images. More specifically, the probability for the zigzag scan exhibits a tendency to have more nonzeros, and heavier tail, i.e. bigger values.5.4 Poorly Oredered Pixels
As already mentioned in Section 2, due to greedy nature of the NN heuristics in Algorithm LABEL:alg:reordering, the last part of the reordered image is not as smooth as the rest of the ordering. Nevertheless, the reconstruction results of our algorithm, as shown in the previous section, seem to be of good quality, which may be puzzling. We therefore performed several experiments to better understand this behavior, and here we propose an explanation for this phenomenon.
In order to avoid side effects related to the initialization, we start our experiment by computing an oraclepermutation, i.e., applying Algorithm LABEL:alg:reordering on the original clean image (Lena)^{1}^{1}1We repeated the following line of experiments for several images and the conclusions remain the same.. First we plot a graph of the absolute value of the 1D gradient of the reordered image. As can be seen in Figure 13(a), approximately 15% of the ordering towards the end is not smooth, just as expected. The location of these pixels in the image is shown in Figure 14(b), and as can be seen, these pixels are mostly in edge and texture areas.
We now perform a Gaussian denoising experiment with (), and our interest is in seeing how the MSE of the output pixels depends on their location in the ordering. We divide the permuted image into 50 groups of pixels with 50% overlap between them, and plot a graph of the average MSE versus location in the ordering, where the xaxis corresponds to the center of each group and the yaxis is the MSE for the groups. This graph is shown in Figure 13(b). Indeed, it is clear that the MSE grows as we tend towards the end of the ordering, implying that the last pixels is illtreated. However, even for these seemly poorlyserved pixels, the obtained MSE is significantly lower than the noise level – in this test, the MSE of the last group is , where the noise level is . This suggests that even though the last ordered pixels seem to have poor choice of neighbors, their treatment is still rather effective. So, how come this happens?
The answer resides in the subimage accumulation proposed in our regularization scheme. If a pixel falls in the last part of the ordering, as shown above, it does not imply that this pixel may not be sufficiently close to its neighbors in the patchordering, because the poor neighbor assignments are true only in terms of the ordering applied to the central pixels of the patches. The very same poorlypositioned pixels are highly likely to be assigned with effective neighbors in other orderings, as each pixel participates in such optional permutations. In order to demonstrate this, we check how many pixels fall in the last 15% of all orderings. For the image Lena this count drops to 3.6% of the pixels, and the location of these pixels is shown in Figure 14(c). These pixels are characterized by the fact that they have no relevant neighbors in the image, regardless of the permutation strategy adopted. As such, they are expected to be illtreated in the restoration procedure.

5.5 Initialization Strategy
The prime drawback of the proposed method is that it relies on a good initialization. In order to demonstrate this we run a Gaussian denoising experiment on Lena with and without BM3D initialization. In this experiment we apply seven rounds of minimization of the penalty function, each followed by a construction of the permutation. The first iteration uses the noisy image itself for building the permutation, while the next iterations rely on the temporary output. We use for the first iteration, for the second, and for the rest. As for the remaining parameters, we used the values listed in table 1. Qualitative and quantitative results of this experiment are shown in Figure 16. Clearly, the result falls short compared to BM3D and other stateoftheart denoising methods. Thus, our algorithm relies heavily on a goodquality initialization, and further work is required for seeking alternatives to this initialization strategy.
6 Conclusion
In this paper we have extended the work in [1], which introduced the concept of patch ordering for handling image denoising and inpainting. Our work exploits the existing interrelation between image patches by building a MAP estimator with permutationbased smoothnesspromoting prior on the objective image. The presented scheme is applicable to diverse set of inverse problems in image processing, and in this work we demonstrate its effectiveness for Poisson image denoising, Gaussian image denoising, image deblurring and single image super resolution. In most these tests, our method improved the image quality comparing to the initial image, in some experiments showing an improvement that goes beyong 1dB.
Note that throughout the experiments reported above, we have set the patch ordering only once, based on the initialization image, and then restored a better outcome using our algorithm. In principle, one should iterate this process, by updating the ordering based on the improved image, and then minimizing the MAP functional again. Our initial tests suggest that this is not effective if we are using the same ordering algorithm as described in Section 2. In our future work we plan to investigate alternative ordering that would lead to an overall improvement. In a wider context, we are also seeking ways to depart from the greedy ordering method depicted in Algorithm LABEL:alg:reordering in various ways, such that the final outcome is further improved.