Random Walk Graph Laplacian based Smoothness Prior for Soft Decoding of JPEG Images

07/07/2016 · Xianming Liu et al. · Harbin Institute of Technology

Given the prevalence of JPEG compressed images, optimizing image reconstruction from the compressed format remains an important problem. Instead of simply reconstructing a pixel block from the centers of indexed DCT coefficient quantization bins (hard decoding), soft decoding reconstructs a block by selecting appropriate coefficient values within the indexed bins with the help of signal priors. The challenge thus lies in how to define suitable priors and apply them effectively. In this paper, we combine three image priors---Laplacian prior for DCT coefficients, sparsity prior and graph-signal smoothness prior for image patches---to construct an efficient JPEG soft decoding algorithm. Specifically, we first use the Laplacian prior to compute a minimum mean square error (MMSE) initial solution for each code block. Next, we show that while the sparsity prior can reduce block artifacts, limiting the size of the over-complete dictionary (to lower computation) would lead to poor recovery of high DCT frequencies. To alleviate this problem, we design a new graph-signal smoothness prior (the desired signal has mainly low graph frequencies) based on the left eigenvectors of the random walk graph Laplacian matrix (LERaG). Compared to previous graph-signal smoothness priors, LERaG has desirable image filtering properties with low computation overhead. We demonstrate how LERaG can facilitate recovery of high DCT frequencies of a piecewise smooth (PWS) signal via an interpretation of low graph frequency components as relaxed solutions to normalized cut in spectral clustering. Finally, we construct a soft decoding algorithm using the three signal priors with appropriate prior weights. Experimental results show that our proposal noticeably outperforms state-of-the-art soft decoding algorithms in both objective and subjective evaluations.


I Introduction

Millions of images are now captured and viewed daily on social networks and photo-sharing sites like Facebook and Flickr (it is estimated that 300 million photos are uploaded to Facebook each day). The explosive volume increase of these images is outpacing the cost decrease of storage devices, and thus lossy image compression is still indispensable in today's visual communication systems. The most prevalent compression format for images remains JPEG (Joint Photographic Experts Group, https://jpeg.org/): a lossy image compression standard whose first and most popular version was finalized more than two decades ago. JPEG is a block-based transform coding scheme, where an image is first divided into non-overlapping pixel blocks, transformed via the discrete cosine transform (DCT) to coefficients, then lossily quantized and entropy coded.

Broadly speaking, there are two approaches to decoding a JPEG image. Given encoded quantization bin indices of different DCT coefficients in a pixel block, hard decoding chooses the bin centers as reconstructed coefficients and performs inverse DCT to recover the block’s pixels. It is thus inevitable that when the quantization bin sizes increase (coarser quantization), the resulting reconstruction quality will worsen correspondingly.

Instead, one can take a soft decoding approach: each DCT coefficient is only constrained to be within the indexed quantization bin, and the reconstruction value is chosen with the help of signal priors. This is the approach taken in many previous works [1, 2, 3, 4] and is the approach taken in this paper. The challenge thus lies in identifying appropriate signal priors and incorporating them into an effective soft decoding algorithm.

Fig. 1: Conceptual diagram showing how three image priors complement each other in soft decoding of JPEG images.

In this paper, we combine three signal priors—Laplacian prior for DCT coefficients [5], sparsity prior and graph-signal smoothness prior for image patches [6, 7]—to build an efficient soft decoding algorithm for JPEG images. Specifically, for each code block in an image, we first use the Laplacian prior—modeling the probability distributions of DCT coefficients as Laplacian—to compute an initial signal estimate in closed form that minimizes the expected mean square error (MMSE). However, optimizing non-overlapping blocks individually can lead to block artifacts. Using a sparse signal model for a larger patch that straddles multiple code blocks, one can remove block artifacts using an over-complete dictionary of atoms [8] trained offline from a large dataset of natural image patches.

The complexity of computing a code vector is high when the dictionary is large [8]. Unfortunately, when one reduces the dictionary size, recovery of high DCT frequencies in image patches becomes poor. We explain this phenomenon by drawing an analogy to low-rate vector quantizers (VQ) [9] when the DCT frequencies are modeled as Laplacian distributions.

To satisfactorily recover high DCT frequencies, we design a new graph-signal smoothness prior, where the key assumption is that the desired signal (pixel patch) contains mainly low graph frequencies. Our graph-signal smoothness prior is constructed using left eigenvectors of the random walk graph Laplacian matrix (LERaG); compared to previous graph-based priors [10, 11, 12], LERaG has desirable image filtering properties with low computation overhead. We demonstrate how LERaG can facilitate recovery of high DCT frequencies of piecewise smooth (PWS) signals via an interpretation of low graph frequency components as relaxed solutions to normalized cut in spectral clustering [13]—the closer a target signal is to PWS, the more easily the pixels are divided into clusters, and the more likely LERaG can restore the signal. Finally, we construct a soft decoding algorithm using the three signal priors with appropriate prior weights. Experimental results show that our proposal noticeably outperforms state-of-the-art soft decoding algorithms in both objective and subjective evaluations.

The outline of the paper is as follows. We first review related works in Section II and formulate the soft decoding problem in Section III. We then discuss the three signal priors in order: the Laplacian prior, the sparsity prior and the graph-signal smoothness prior, in Section IV, Section V and Section VI, respectively. In Section VII, we combine the priors into a JPEG soft decoding algorithm. Finally, we present results and conclusions in Section VIII and Section IX, respectively.

II Related Work

II-A Soft Decoding of JPEG Images

There are two general approaches in the literature to reconstructing a compressed JPEG image: deblocking and dequantization. The purpose of deblocking is to remove the block artifacts in a JPEG image. By regarding compression noise as additive white Gaussian noise, deblocking works on the decoded JPEG image directly, and recovers it in a similar way as denoising via pre-defined prior models, such as local smoothness [14, 15], Gaussian processes [16], sparsity [17], etc. However, the non-linearity of the quantization operations in JPEG makes quantization noise signal-dependent, far from white and independent. Inaccurate modeling of compression artifacts limits the restoration performance of deblocking. Dequantization, also called soft decoding, treats JPEG image restoration as an ill-posed inverse problem: find the most probable transform coefficients in a code block subject to indexed quantization bin constraints, with the help of signal priors and optimization. Soft decoding utilizes all available information, such as indexed quantization bin boundaries and natural image priors, and hence has more potential to achieve better restoration performance. Soft decoding is also the approach taken in this paper. In the following, we review some popular and state-of-the-art methods in the literature.

Many dequantization methods exploit image priors in the pixel or transform domain to address the inverse problem, such as the classical projection onto convex sets (POCS) method [1], the total variation (TV) model [2], and non-local self-similarity in the DCT domain [18]. Following its success in other restoration applications, the sparsity prior has also shown promise in combating compression distortions [3, 19, 20]. For instance, Chang et al. [3] proposed to learn the dictionary from the input JPEG image, and to use total variation and the quantization constraint to further limit the solution space. However, a dictionary trained from the JPEG image itself also learns noise patterns as atoms; therefore, in some cases, this method enhances rather than suppresses compression artifacts. Very recently, Zhao et al. [20] proposed to combine a structural sparse representation prior with the quantization constraint. This scheme achieves state-of-the-art soft decoding performance. As shown by the theoretical analysis in Section V-C, however, the sparsity prior cannot satisfactorily preserve high-frequency information when the dictionary size is small.

II-B Graph Laplacian Regularizer

Leveraging recent advances in graph signal processing [7, 21], the graph Laplacian regularizer has shown its effectiveness as a new prior in a wide range of inverse problems, such as denoising [22, 11], super-resolution [23], bit-depth enhancement [24] and deblurring [25]. The definition of the graph Laplacian regularizer relies on the graph Laplacian matrix, which describes the underlying structure of the graph signal. Two definitions of the graph Laplacian matrix are typically used: the combinatorial Laplacian and the normalized Laplacian.

Most existing methods [22, 11, 23, 24] utilize the combinatorial graph Laplacian to define the smoothness prior; it is real, symmetric and positive semi-definite, and its spectrum carries a notion of frequency. However, its filtering strength depends on the degrees of the graph vertices. Several normalized graph Laplacian variants have been proposed so that their filtering strength is independent of the vertex degrees. One popular option is to normalize each weight symmetrically [7]. The symmetrically normalized graph Laplacian has similar properties as the combinatorial Laplacian, except that it lacks the DC component; therefore, it cannot handle constant signals well. [25] proposed a doubly stochastic graph Laplacian, which is symmetric and contains the DC component. However, it requires non-trivial computation to identify transformation matrices that make the rows and columns of the Laplacian matrix stochastic.

In our previous work [26], we used a graph-signal smoothness prior based on the combinatorial graph Laplacian for soft decoding of JPEG images. In this paper, we propose a new smoothness prior using left eigenvectors of the random walk graph Laplacian matrix to overcome the drawbacks of the combinatorial graph Laplacian. Further, we provide a thorough analysis from a spectral clustering perspective to explain why our proposed graph-signal smoothness prior can recover high DCT frequencies of piecewise smooth signals.

III Problem Formulation

III-A Quantization Bin Constraint

We begin with the problem setup. In the JPEG standard, each non-overlapping pixel block (called code block in the sequel) in an image is compressed independently via transform coding. Specifically, each code block in vector form $\mathbf{x} \in \mathbb{R}^{64}$ is transformed via DCT to 64 coefficients $\mathbf{y} = \mathbf{T}\mathbf{x}$, where $\mathbf{T}$ is the transform matrix. The $i$-th coefficient $y_i$ is quantized using quantization parameter (QP) $Q_i$—assigned a quantization bin (q-bin) index (q-index) $q_i$ as:

$$q_i = \mathrm{round}\left(\frac{y_i}{Q_i}\right) \qquad (1)$$

Because natural images tend to be smooth, most AC coefficients are close or equal to zero. JPEG also employs larger QPs for higher frequencies [27] due to the human visual system's lower sensitivity to fast-changing details, resulting in a compact signal representation in the DCT domain for most code blocks.

At the decoder, having received only q-index $q_i$, there exists an uncertainty when recovering $y_i$; in particular:

$$y_i \in \left[\left(q_i - \tfrac{1}{2}\right)Q_i,\ \left(q_i + \tfrac{1}{2}\right)Q_i\right) \qquad (2)$$

This quantization constraint defines the search space for $\mathbf{y}$ during restoration of the code block.
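To make (1) and (2) concrete, below is a minimal numerical sketch in Python; the random 8×8 block, the flat quantization table and the use of SciPy's orthonormal DCT as the transform $\mathbf{T}$ are all illustrative assumptions, not values prescribed by the standard.

```python
# Minimal sketch of JPEG-style quantization per (1)-(2); the block and
# quantization table below are toy values, not standard JPEG tables.
import numpy as np
from scipy.fftpack import dct

def dct2(block):
    """2-D orthonormal DCT of an 8x8 pixel block (the transform T)."""
    return dct(dct(block, axis=0, norm='ortho'), axis=1, norm='ortho')

rng = np.random.default_rng(0)
x = rng.integers(0, 256, size=(8, 8)).astype(float)  # code block
Q = np.full((8, 8), 20.0)                            # toy QP table

y = dct2(x)                            # coefficients y = T x
q = np.round(y / Q)                    # q-indices, eq. (1)
lo, hi = (q - 0.5) * Q, (q + 0.5) * Q  # q-bin boundaries, eq. (2)
assert np.all(np.abs(y - q * Q) <= Q / 2)  # each y_i lies in its q-bin
```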

III-B Soft Decoding of JPEG Images

A JPEG standard decoder [27] simply chooses the q-bin centers as the reconstructed coefficients and performs the inverse DCT to recover the block's pixels. Clearly, the restoration quality worsens as the sizes of the q-bins increase, particularly at high frequencies. This approach is called hard decoding. Instead, our approach is to only constrain each DCT coefficient to be within the indexed q-bin using (2), and to choose the "optimal" reconstruction value using pre-defined signal priors. This approach is called soft decoding.

It is known that block artifacts appear as unnatural high frequencies across block boundaries at low rates [1]. To reduce block artifacts, we enforce consistency among adjacent blocks as follows. As shown in Fig. 2, we define a larger patch $\mathbf{z} \in \mathbb{R}^n$ that encloses a smaller code block $\mathbf{x}$, where $n > 64$. Mathematically, $\mathbf{x} = \mathbf{B}\mathbf{z}$, where binary matrix $\mathbf{B} \in \{0,1\}^{64 \times n}$ extracts the pixels in $\mathbf{z}$ corresponding to the smaller code block $\mathbf{x}$. $\mathbf{z}$ is the basic processing unit and contains pixels from multiple code blocks; thus, when recovering patches, pixels across patch boundaries are enforced to be consistent by averaging.

Fig. 2: A patch being optimized encloses a smaller code block. Pixels across block boundaries are enforced to be consistent by averaging to remove discontinuities.

The ill-posed nature of soft decoding means that signal priors are needed to differentiate among good candidate solutions $\mathbf{z}$. We first denote by $\mathbf{q}$ the vector of q-indices corresponding to block $\mathbf{x}$ enclosed by $\mathbf{z}$ (i.e., $\mathbf{x} = \mathbf{B}\mathbf{z}$), and by $\mathbf{Q}$ the vector of corresponding QPs. Having defined $\mathbf{q}$, we pursue a maximum a posteriori (MAP) formulation and seek the signal $\mathbf{z}$ with the largest posterior probability $\Pr(\mathbf{z} \mid \mathbf{q})$. Mathematically, MAP is formulated as:

$$\max_{\mathbf{z}} \;\Pr(\mathbf{q} \mid \mathbf{z})\,\Pr(\mathbf{z}) \qquad (3)$$

where $\Pr(\mathbf{q} \mid \mathbf{z})$ denotes the likelihood of observing q-indices $\mathbf{q}$ given $\mathbf{z}$. It can be written as:

$$\Pr(\mathbf{q} \mid \mathbf{z}) = \begin{cases} 1 & \text{if } \mathrm{round}\!\left(\mathbf{T}\mathbf{B}\mathbf{z} \oslash \mathbf{Q}\right) = \mathbf{q} \\ 0 & \text{otherwise} \end{cases} \qquad (4)$$

where $\oslash$ here means element-by-element division. Thus, the MAP formulation (3) becomes:

$$\max_{\mathbf{z}\,:\,\mathrm{round}(\mathbf{T}\mathbf{B}\mathbf{z}\,\oslash\,\mathbf{Q}) = \mathbf{q}} \;\Pr(\mathbf{z}) \qquad (5)$$

IV The Laplacian Prior

Given that individual DCT coefficients of feasible solutions are constrained to be inside indexed q-bins (2), one natural choice of signal prior is the Laplacian prior, which is also defined for individual DCT coefficients. Specifically, the Laplacian prior [5] states that the probability density function of a DCT coefficient $y_i$ is:

$$f(y_i) = \frac{\mu_i}{2}\exp\left(-\mu_i\,|y_i|\right) \qquad (6)$$

where $\mu_i > 0$ is a parameter. [5] shows that the higher the frequency, the sharper the Laplacian distribution (larger $\mu_i$).

Using the Laplacian prior alone, given q-bin constraints we can compute a minimum mean square error (MMSE) solution for soft decoding. In particular, each optimal DCT coefficient $y_i^\ast$ of a code block in an MMSE sense can be computed independently as:

$$y_i^\ast = \arg\min_{\hat{y}_i} \int_{l_i}^{r_i} \left(y_i - \hat{y}_i\right)^2 f(y_i)\,\mathrm{d}y_i \qquad (7)$$

where $l_i = (q_i - \frac{1}{2})Q_i$ and $r_i = (q_i + \frac{1}{2})Q_i$ are the boundaries of the indexed q-bin. By taking the derivative of (7) with respect to $\hat{y}_i$ and setting it to zero, we obtain a closed-form solution, namely the conditional mean within the bin:

$$y_i^\ast = \frac{\int_{l_i}^{r_i} y_i\,f(y_i)\,\mathrm{d}y_i}{\int_{l_i}^{r_i} f(y_i)\,\mathrm{d}y_i} \qquad (8)$$

The recovered code block $\mathbf{x}^\ast$ can be obtained by performing the inverse DCT on the estimated coefficients $\mathbf{y}^\ast$. After performing MMSE estimation for all code blocks, we get a soft decoded image.
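A minimal sketch of the per-coefficient MMSE estimate (8) under the Laplacian prior (6) follows; the value of $\mu$ here is a toy choice rather than one fitted to data as in [5], and the integrals are evaluated numerically instead of in closed form.

```python
# Minimal sketch of the MMSE estimate (8) within an indexed q-bin.
import numpy as np
from scipy.integrate import quad

def mmse_coeff(q_idx, Q, mu):
    """Conditional mean of a Laplacian-distributed coefficient in its bin."""
    lo, hi = (q_idx - 0.5) * Q, (q_idx + 0.5) * Q
    f = lambda y: 0.5 * mu * np.exp(-mu * abs(y))   # Laplacian pdf, eq. (6)
    num, _ = quad(lambda y: y * f(y), lo, hi)
    den, _ = quad(f, lo, hi)
    return num / den                                # eq. (8)

# The estimate is pulled toward zero relative to the bin center, since
# the Laplacian density decays away from the origin.
print(mmse_coeff(q_idx=2, Q=20.0, mu=0.05))   # < bin center 40.0
```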

A clear advantage of Laplacian-based soft decoding is that the MMSE solution has an efficiently computed closed form, and by definition it has a smaller expected squared error than a MAP solution, which is just the most probable solution [28]. The closed form is possible because the Laplacian prior describes distributions of individual coefficients, which is also how the q-bin constraints are specified in (2). However, the MMSE solution (8) can only recover code blocks separately, and thus cannot handle block artifacts that occur across adjacent blocks. As such, we next propose to employ the sparsity prior to provide additional a priori signal information and optimize at the larger patch level $\mathbf{z}$.

V The Sparsity Prior

Given that it is difficult to use the Laplacian prior directly to recover a larger patch $\mathbf{z}$, we first formulate a MAP problem for a patch using the sparsity prior. Then, using again the Laplacian prior, we analyze K-SVD-based dictionary learning, showing that when the dictionary is small, the atoms tend to have low average DCT frequency.

V-A Sparse Signal Model

The sparsity prior [8] states that a signal $\mathbf{z} \in \mathbb{R}^n$ can be well approximated by a sparse linear combination of atoms from an appropriately chosen over-complete dictionary $\mathbf{\Phi} \in \mathbb{R}^{n \times M}$, where the dictionary size $M$ is much larger than the signal dimension, i.e., $M \gg n$. Mathematically we write:

$$\mathbf{z} = \mathbf{\Phi}\boldsymbol{\alpha} + \mathbf{e} \qquad (9)$$

where code vector $\boldsymbol{\alpha} \in \mathbb{R}^M$ contains the coefficients corresponding to atoms in $\mathbf{\Phi}$, and $\mathbf{e}$ is a small error term. $\boldsymbol{\alpha}$ is assumed to be sparse; i.e., the number of non-zero entries in $\boldsymbol{\alpha}$ is small. Dictionary $\mathbf{\Phi}$ can be learned offline from a large set of training patches using K-SVD [8].

Given signal $\mathbf{z}$, an optimal $\boldsymbol{\alpha}$ can be found via [8]:

$$\boldsymbol{\alpha}^\ast = \arg\min_{\boldsymbol{\alpha}} \|\mathbf{z} - \mathbf{\Phi}\boldsymbol{\alpha}\|_2^2 + \lambda\,\|\boldsymbol{\alpha}\|_0 \qquad (10)$$

where $\|\boldsymbol{\alpha}\|_0$ is the $\ell_0$-norm of $\boldsymbol{\alpha}$, and $\lambda$ is a parameter trading off the fidelity term with the sparsity prior. (10) is NP-hard, but the matching pursuit and basis pursuit algorithms have been shown effective in finding approximate solutions [8, 29, 30]. For instance, in [8], the orthogonal matching pursuit (OMP) algorithm is employed, which greedily identifies the atom with the highest correlation to the current signal residual at each iteration. Once the atom is selected, the signal is orthogonally projected onto the span of the selected atoms, the signal residual is recomputed, and the process repeats.
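For concreteness, a minimal OMP sketch follows; it assumes unit-norm atoms and a fixed sparsity budget, and in practice a tested library routine (e.g., scikit-learn's OrthogonalMatchingPursuit) would be preferable.

```python
# Minimal sketch of orthogonal matching pursuit (OMP) as described above.
import numpy as np

def omp(Phi, z, sparsity):
    """Greedy l0 approximation: fit z ~ Phi @ alpha with few atoms."""
    n, M = Phi.shape
    alpha = np.zeros(M)
    residual = z.copy()
    support = []
    for _ in range(sparsity):
        # Atom most correlated with the current residual.
        k = int(np.argmax(np.abs(Phi.T @ residual)))
        support.append(k)
        # Orthogonally project z onto the span of the selected atoms.
        coef, *_ = np.linalg.lstsq(Phi[:, support], z, rcond=None)
        residual = z - Phi[:, support] @ coef
    alpha[support] = coef
    return alpha
```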

We can write the sparsity prior as a probability function:

$$\Pr(\mathbf{z}) \propto \exp\left(-\min_{\boldsymbol{\alpha}} \left\{\|\mathbf{z} - \mathbf{\Phi}\boldsymbol{\alpha}\|_2^2 + \lambda\,\|\boldsymbol{\alpha}\|_0\right\}\right) \qquad (11)$$

V-B Sparsity-based Soft Decoding

Given the prior probability in (11), we can rewrite the optimization problem in (5) as follows:

$$\min_{\mathbf{z},\,\boldsymbol{\alpha}} \;\|\mathbf{z} - \mathbf{\Phi}\boldsymbol{\alpha}\|_2^2 + \lambda\,\|\boldsymbol{\alpha}\|_0, \qquad \text{s.t.}\;\; \mathrm{round}\!\left(\mathbf{T}\mathbf{B}\mathbf{z} \oslash \mathbf{Q}\right) = \mathbf{q} \qquad (12)$$

Compared to (10), there is an additional quantization constraint, which is specific to the soft decoding problem.

In (12), both signal $\mathbf{z}$ and code vector $\boldsymbol{\alpha}$ are unknown. We solve the problem alternately by fixing one variable and solving for the other, repeating until convergence:

  • Step 1–Initial Estimation: Initialize $\mathbf{z}^{(0)}$. For example, $\mathbf{z}^{(0)}$ can be initialized as the closed-form MMSE solution using the Laplacian prior described in Section IV.

  • Step 2–Sparse Decomposition: Given $\mathbf{z}^{(t)}$ of iteration $t$, compute the corresponding code vector $\boldsymbol{\alpha}^{(t)}$ by solving the following minimization problem:

    $$\boldsymbol{\alpha}^{(t)} = \arg\min_{\boldsymbol{\alpha}} \|\mathbf{z}^{(t)} - \mathbf{\Phi}\boldsymbol{\alpha}\|_2^2 + \lambda\,\|\boldsymbol{\alpha}\|_0 \qquad (13)$$

    using the OMP algorithm stated earlier. (In the unlikely case that multiple $\boldsymbol{\alpha}$'s yield the same minimum objective, we assume the algorithm deterministically returns the "first" solution, e.g., the one whose non-zero coefficients have the smallest atom indices, so that the solution to (13) is always unique.)

  • Step 3–Quantization Constraint: Given sparse code $\boldsymbol{\alpha}^{(t)}$, optimize $\mathbf{z}^{(t+1)}$ so that it satisfies the q-bin constraint:

    $$\mathbf{z}^{(t+1)} = \arg\min_{\mathbf{z}} \|\mathbf{z} - \mathbf{\Phi}\boldsymbol{\alpha}^{(t)}\|_2^2, \qquad \text{s.t.}\;\; \mathrm{round}\!\left(\mathbf{T}\mathbf{B}\mathbf{z} \oslash \mathbf{Q}\right) = \mathbf{q} \qquad (14)$$

    which can be solved via quadratic programming [31].

Steps 2 and 3 are repeated until $\mathbf{z}$ converges. In the above procedure, the computational burden mainly lies in the sparse code search step, where the complexity of OMP increases linearly with the number of dictionary atoms. A simple projection-based sketch of Step 3 follows below.
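Below is a minimal sketch of Step 3 under the reconstruction of (14) above, restricted to the enclosed code block and treating the q-bins as closed intervals for simplicity. Because the DCT is orthonormal (distance-preserving) and the constraint set is a per-coefficient box, the projection reduces to clipping the block's DCT coefficients to their bins; the general QP machinery of [31] is needed only for richer objectives.

```python
# Minimal sketch: project a pixel-domain block estimate onto the q-bin
# constraint (2) by clipping in the DCT domain (valid since the DCT is
# orthonormal and the constraint set is a per-coefficient box).
import numpy as np
from scipy.fftpack import dct, idct

def dct2(b):
    return dct(dct(b, axis=0, norm='ortho'), axis=1, norm='ortho')

def idct2(b):
    return idct(idct(b, axis=0, norm='ortho'), axis=1, norm='ortho')

def project_block(x_est, q, Q):
    """Nearest 8x8 block (in l2) whose DCT coefficients lie in their bins."""
    y = np.clip(dct2(x_est), (q - 0.5) * Q, (q + 0.5) * Q)
    return idct2(y)
```

For a patch $\mathbf{z}$ larger than the code block, only the pixels extracted by $\mathbf{B}$ are constrained; the remaining pixels simply keep the sparse estimate $\mathbf{\Phi}\boldsymbol{\alpha}^{(t)}$.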

We prove local convergence of our sparsity-based soft decoding algorithm via the following lemma.

Lemma 1.

The sparsity-based soft decoding algorithm converges to a local minimum.

Proof.

Steps 2 and 3, examining variables $\boldsymbol{\alpha}$ and $\mathbf{z}$ separately, both minimize the objective in (12), which is lower-bounded by 0. At Step 2, given that $\mathbf{z}^{(t)}$ was optimized assuming a fixed previous code $\boldsymbol{\alpha}^{(t-1)}$, the algorithm finds $\boldsymbol{\alpha}^{(t)}$. If $\boldsymbol{\alpha}^{(t)} \neq \boldsymbol{\alpha}^{(t-1)}$, then the objective must be smaller; otherwise, $\boldsymbol{\alpha}^{(t)}$ is not the objective-minimizing argument, a contradiction. The same is true for Step 3. Hence the objective is monotonically decreasing whenever the variables are updated to different values. Given that the objective is lower-bounded by 0, it cannot decrease indefinitely, and hence the algorithm converges to a local minimum. ∎

V-C Limitation of Small K-SVD Trained Dictionary

Fig. 3: Illustration of product VQ for two frequencies: a DC and an AC frequency. (a) and (b) show the probability distributions of the DC and AC coefficients, respectively. When the dictionary is small (low rate), (c) shows that quantization is coarser for large magnitudes of the AC coefficient than of the DC one, since the probability of large magnitudes at high frequencies is relatively low. When the dictionary is large (high rate), (d) shows that there are enough quantization bins that quantization of large magnitudes at high frequencies is sufficiently fine.

We now argue that when the size of a dictionary trained by K-SVD [8] is small (to reduce complexity during code vector search via OMP), the recovery of high DCT frequencies suffers. Given a set of $J$ training patches $\{\mathbf{z}_j\}_{j=1}^{J}$, each with $n$ pixels, a dictionary $\mathbf{\Phi}$ of $M$ atoms is trained using K-SVD via the following formulation:

$$\min_{\mathbf{\Phi},\,\{\boldsymbol{\alpha}_j\}} \sum_{j=1}^{J} \|\mathbf{z}_j - \mathbf{\Phi}\boldsymbol{\alpha}_j\|_2^2 + \lambda\,\|\boldsymbol{\alpha}_j\|_0 \qquad (15)$$

The variables are both $\mathbf{\Phi}$ and $\{\boldsymbol{\alpha}_j\}$, which are solved alternately by fixing one and solving for the other.

We write (15) in the DCT frequency domain using Parseval's theorem [32]:

$$\min_{\mathbf{\Phi},\,\{\boldsymbol{\alpha}_j\}} \sum_{j=1}^{J} \|\mathbf{T}_n\mathbf{z}_j - \mathbf{T}_n\mathbf{\Phi}\boldsymbol{\alpha}_j\|_2^2 + \lambda\,\|\boldsymbol{\alpha}_j\|_0 \qquad (16)$$

where $\mathbf{T}_n$ is the DCT transform for the $n$-pixel patch, and $\mathbf{T}_n\mathbf{z}_j$ is the vector of DCT coefficients for $\mathbf{z}_j$.

We rewrite the optimization in its constrained form:

$$\min_{\mathbf{\Phi},\,\{\boldsymbol{\alpha}_j\}} \sum_{j=1}^{J} \|\mathbf{T}_n\mathbf{z}_j - \mathbf{T}_n\mathbf{\Phi}\boldsymbol{\alpha}_j\|_2^2, \qquad \text{s.t.}\;\; \|\boldsymbol{\alpha}_j\|_0 \le s\;\;\forall j \qquad (17)$$

where $s$ is a pre-set sparsity limit for each $\boldsymbol{\alpha}_j$.

For intuition, we focus on the special case when $s = 1$, so that each patch is represented by a single atom. In this special case, the dictionary learning problem is analogous to the vector quantization (VQ) design problem [9]. Specifically, selecting atoms in dictionary $\mathbf{\Phi}$ is analogous to designing partitions $\{S_m\}_{m=1}^{M}$ so that their union is the space of feasible signals, i.e., $\bigcup_m S_m = \mathbb{R}^n$, and no two partitions overlap, i.e., $S_m \cap S_{m'} = \emptyset$ for $m \neq m'$. The centroid of each cell $S_m$ is atom $\boldsymbol{\phi}_m$.

When the number of training samples tends to infinity, the sum over training patches is replaced by an integration over each $S_m$ with probability $\Pr(\boldsymbol{\theta})$:

$$\min_{\{S_m\},\,\{\boldsymbol{\phi}_m\}} \sum_{m=1}^{M} \int_{\boldsymbol{\theta}\in S_m} \|\boldsymbol{\theta} - \mathbf{T}_n\boldsymbol{\phi}_m\|_2^2\;\Pr(\boldsymbol{\theta})\,\mathrm{d}\boldsymbol{\theta} \qquad (18)$$

where a signal $\boldsymbol{\theta}$ in partition $S_m$ (in the DCT frequency domain) is represented by atom $\boldsymbol{\phi}_m$ (in the pixel domain), thus incurring distortion $\|\boldsymbol{\theta} - \mathbf{T}_n\boldsymbol{\phi}_m\|_2^2$.

We have assumed in Section IV that $\Pr(\boldsymbol{\theta})$ is a product of Laplacian distributions for the individual DCT frequencies, where for high frequencies the distributions are more skewed and concentrated around zero. Thus, as the number $M$ of atoms in dictionary $\mathbf{\Phi}$ (hence the number of partitions $S_m$) decreases, in order to minimize the expected squared error in (18), quantization bins become relatively coarse for large magnitudes of high frequencies, because they are less probable in comparison [33]. See Fig. 3 for an example that shows coarse quantization bins for the AC frequency in a product VQ of two dimensions. This explains why, when the dictionary is small, recovery of large-magnitude high DCT frequencies is difficult.

Fig. 4: (a) The relationship between dictionary size and high-frequency information preservation, where the mean frequency is used to measure the high-frequency information contained in the atoms of the dictionary. (b) The relationship between dictionary size and restoration performance, using the test image Butterfly at QF = 10 as an example.
Fig. 5: Illustration of reconstructed images using dictionaries of different sizes. (a) Image1: the reconstructed image using a dictionary of size 1000. (b) Image2: the reconstructed image using a dictionary of size 7000. (c) The difference between the two reconstructed images. The quality improvement of Image2 over Image1 lies mainly in edge structures.

We can observe empirically that this is indeed the case. For a given K-SVD trained dictionary $\mathbf{\Phi}$, we compute the mean frequency (MF) of its atoms as follows. Let $f_i$ be a DCT frequency and $c_{m,i}$ be atom $\boldsymbol{\phi}_m$'s corresponding coefficient at that frequency. MF for $\mathbf{\Phi}$ is computed as the energy-weighted average frequency over all atoms:

$$\mathrm{MF}(\mathbf{\Phi}) = \frac{1}{M}\sum_{m=1}^{M} \frac{\sum_i f_i\,c_{m,i}^2}{\sum_i c_{m,i}^2} \qquad (19)$$

We computed MF for dictionaries of increasing size (the dimension of each atom equals the number of pixels in a patch). As illustrated in Fig. 4 (a), MF increases roughly linearly with dictionary size. Further, as illustrated in Fig. 4 (b), the performance of sparsity-based soft decoding also improves with dictionary size. As shown in Fig. 5, using a large dictionary can significantly improve the PSNR and SSIM values of the restored image, where the improvement over a smaller dictionary lies mainly in edge structures.
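The following sketch computes an MF-style measure for a dictionary; the energy weighting and the $u+v$ frequency index are assumptions made for illustration, since the exact definition in (19) is reconstructed here.

```python
# Minimal sketch of a mean-frequency (MF) measure per (19), under the
# assumed energy weighting and u+v frequency indexing.
import numpy as np
from scipy.fftpack import dct

def mean_frequency(Phi, patch_side):
    """Average energy-weighted DCT frequency over all dictionary atoms."""
    u, v = np.meshgrid(np.arange(patch_side), np.arange(patch_side),
                       indexing='ij')
    freq = (u + v).ravel()                       # assumed frequency index
    mf = []
    for atom in Phi.T:                           # atoms are columns of Phi
        block = atom.reshape(patch_side, patch_side)
        c = dct(dct(block, axis=0, norm='ortho'), axis=1, norm='ortho').ravel()
        mf.append((freq * c**2).sum() / (c**2).sum())
    return float(np.mean(mf))
```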

VI The Graph-Signal Smoothness Prior

Given that the sparsity prior cannot recover high DCT frequencies when QP is large and the dictionary is small, we introduce the graph-signal smoothness prior, which has been shown to restore piecewise smooth (PWS) signals well [10, 11, 12]. We first define a discrete signal on a graph, then define a conventional smoothness notion for graph-signals and interpret graph-signal smoothness in the graph frequency domain. Finally, we propose a new graph-signal smoothness prior based on the random walk graph Laplacian for soft decoding.

VI-A Signals on Graphs

We define a graph as $\mathcal{G} = (\mathcal{V}, \mathcal{E}, \mathbf{W})$. $\mathcal{V}$ and $\mathcal{E}$ are respectively the sets of vertices and edges in $\mathcal{G}$. Entry $w_{i,j}$ in the adjacency matrix $\mathbf{W}$ is the weight of the edge connecting vertices $i$ and $j$. We consider only undirected graphs and non-negative edge weights [7]; i.e., $w_{i,j} = w_{j,i}$ and $w_{i,j} \ge 0$.

A graph-signal $\mathbf{x} \in \mathbb{R}^N$ is a collection of discrete samples $x_i$ on the $N$ vertices of a graph $\mathcal{G}$. In our work, for each target pixel patch we construct a fully-connected graph, where each vertex (pixel) is connected to all other vertices in $\mathcal{G}$. The edge weight $w_{i,j}$ between two vertices $i$ and $j$ is conventionally computed using Gaussian kernels [7]:

$$w_{i,j} = \exp\left(-\frac{(x_i - x_j)^2}{\sigma_x^2}\right)\exp\left(-\frac{\|\mathbf{l}_i - \mathbf{l}_j\|_2^2}{\sigma_l^2}\right) \qquad (20)$$

where $x_i$ and $\mathbf{l}_i$ are the intensity and location of pixel $i$, and $\sigma_x$ and $\sigma_l$ are chosen parameters. This definition is similar to the domain and range filtering weights in the bilateral filter [34].
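A minimal sketch of the bilateral-style edge weights (20) for a fully-connected patch graph follows; the sigma values are illustrative, not the paper's tuned parameters.

```python
# Minimal sketch of the fully-connected patch graph with weights (20).
import numpy as np

def patch_adjacency(patch, sigma_x=25.0, sigma_l=3.0):
    """Fully-connected adjacency matrix W for a square pixel patch."""
    side = patch.shape[0]
    ii, jj = np.meshgrid(np.arange(side), np.arange(side), indexing='ij')
    loc = np.stack([ii.ravel(), jj.ravel()], axis=1).astype(float)
    val = patch.ravel().astype(float)
    d_int = (val[:, None] - val[None, :]) ** 2                   # intensity
    d_loc = ((loc[:, None, :] - loc[None, :, :]) ** 2).sum(-1)   # location
    W = np.exp(-d_int / sigma_x**2) * np.exp(-d_loc / sigma_l**2)
    np.fill_diagonal(W, 0.0)                                     # no self-loops
    return W
```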

VI-B Graph Laplacians and Signal Smoothness

A degree matrix $\mathbf{D}$ is a diagonal matrix where $d_{i,i} = \sum_j w_{i,j}$. A combinatorial graph Laplacian $\mathbf{L}$ is [7]:

$$\mathbf{L} = \mathbf{D} - \mathbf{W} \qquad (21)$$

There exist two normalized variants of $\mathbf{L}$: the normalized graph Laplacian $\mathbf{L}_n$ and the random walk graph Laplacian $\mathbf{L}_r$,

$$\mathbf{L}_n = \mathbf{D}^{-1/2}\,\mathbf{L}\,\mathbf{D}^{-1/2}, \qquad \mathbf{L}_r = \mathbf{D}^{-1}\,\mathbf{L} \qquad (22)$$

A normalized version means that its filtering strength does not depend on the degrees of the graph vertices.
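A minimal sketch of the three Laplacian variants in (21)-(22), continuing from patch_adjacency above (a fully-connected graph with positive weights guarantees positive degrees):

```python
# Minimal sketch of (21)-(22): combinatorial, normalized and random
# walk graph Laplacians from an adjacency matrix W.
import numpy as np

def laplacians(W):
    d = W.sum(axis=1)
    D = np.diag(d)
    L = D - W                                  # combinatorial, eq. (21)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L_n = D_inv_sqrt @ L @ D_inv_sqrt          # symmetrically normalized
    L_r = np.diag(1.0 / d) @ L                 # random walk, eq. (22)
    return L, L_n, L_r
```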

Given $\mathbf{L}$, one can describe the squared variations of the signal $\mathbf{x}$ with respect to $\mathcal{G}$ using the graph Laplacian regularizer [10, 11]:

$$\mathbf{x}^\top\mathbf{L}\,\mathbf{x} = \sum_{(i,j)\in\mathcal{E}} w_{i,j}\,(x_i - x_j)^2 \qquad (23)$$

A common graph-signal smoothness prior can then be defined as:

$$\Pr(\mathbf{x}) \propto \exp\left(-\frac{\mathbf{x}^\top\mathbf{L}\,\mathbf{x}}{\sigma^2}\right) \qquad (24)$$

which states that a probable signal $\mathbf{x}$ has small $\mathbf{x}^\top\mathbf{L}\mathbf{x}$. By (23), $\mathbf{x}^\top\mathbf{L}\mathbf{x}$ is small if $\mathbf{x}$ has similar values at each vertex pair connected by an edge, or if the edge weight $w_{i,j}$ is small. Thus $\mathbf{x}$ is smooth if its variations across connected vertices are consistent with the underlying graph (where edge weights reflect expected signal variations). For example, if a discontinuity is expected at pixel pair $(i,j)$, one can pre-set a small weight $w_{i,j}$, so that the smoothness prior will not over-smooth during optimization. Later, we will propose a different smoothness prior than the conventional (24), with more desirable filtering properties.

VI-C Frequency Interpretation of the Graph Smoothness Prior

Since $\mathbf{L}$ is a real, symmetric, positive semi-definite matrix, it can be decomposed into a set of orthogonal eigenvectors, denoted by $\{\boldsymbol{\psi}_k\}_{k=1}^{N}$, with real non-negative eigenvalues $0 = \lambda_1 \le \lambda_2 \le \cdots \le \lambda_N$. We define $\mathbf{\Psi}$ as the eigen-matrix with the $\boldsymbol{\psi}_k$'s as columns and $\mathbf{\Lambda}$ as the diagonal matrix with the $\lambda_k$'s on its diagonal. $\mathbf{L}$ can thus be written as:

$$\mathbf{L} = \mathbf{\Psi}\,\mathbf{\Lambda}\,\mathbf{\Psi}^\top \qquad (25)$$

We define the graph Fourier transform (GFT) matrix as $\mathbf{F} = \mathbf{\Psi}^\top$. A graph-signal $\mathbf{x}$ can be transformed to the graph frequency domain via:

$$\boldsymbol{\beta} = \mathbf{F}\,\mathbf{x} = \mathbf{\Psi}^\top\mathbf{x} \qquad (26)$$

With this definition, the graph Laplacian regularizer can be written as:

$$\mathbf{x}^\top\mathbf{L}\,\mathbf{x} = \mathbf{x}^\top\mathbf{\Psi}\,\mathbf{\Lambda}\,\mathbf{\Psi}^\top\mathbf{x} = \boldsymbol{\beta}^\top\mathbf{\Lambda}\,\boldsymbol{\beta} = \sum_{k} \lambda_k\,\beta_k^2 \qquad (27)$$

$\mathbf{x}^\top\mathbf{L}\mathbf{x}$ can thus be written as a sum of squared GFT coefficients $\beta_k$, each scaled by the eigenvalue $\lambda_k$, which is interpreted as a frequency in the graph transform domain.
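A quick numeric check of (26)-(27) on an assumed random small graph:

```python
# Minimal check: the regularizer x^T L x equals the eigenvalue-weighted
# sum of squared GFT coefficients, eq. (27).
import numpy as np

rng = np.random.default_rng(1)
W = rng.random((6, 6)); W = (W + W.T) / 2; np.fill_diagonal(W, 0)
L = np.diag(W.sum(1)) - W

lam, Psi = np.linalg.eigh(L)      # eigenpairs of the symmetric Laplacian
x = rng.standard_normal(6)
beta = Psi.T @ x                  # GFT coefficients, eq. (26)
assert np.isclose(x @ L @ x, (lam * beta**2).sum())   # eq. (27)
```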

By maximizing the graph-signal smoothness prior in (24) (minimizing $\mathbf{x}^\top\mathbf{L}\mathbf{x}$), signal $\mathbf{x}$ is smoothed with respect to the graph, i.e., high graph frequencies are suppressed while low graph frequencies are preserved. It has been shown [10, 11, 12] that PWS signals—with discontinuities inside the signal that translate to high DCT frequencies—can be well approximated as linear combinations of low graph frequencies for appropriately constructed graphs. We provide an explanation next, which leads naturally to the derivation of a new smoothness prior.

VI-D Building a Random Walk Graph Laplacian Prior

We first show that low graph frequencies of the normalized graph Laplacian $\mathbf{L}_n$ can be interpreted as relaxed solutions to spectral clustering, which are PWS in general. We then argue that to induce desirable filtering properties for $\mathbf{L}_n$, a similarity transformation is required, resulting in the random walk Laplacian $\mathbf{L}_r$. Finally, we show that a more appropriate smoothness prior than (24) can be defined using $\mathbf{L}_r$, with many desirable filtering properties.

VI-D1 Spectral Clustering

To understand low graph frequencies of $\mathbf{L}_n$ computed from a PWS signal, we take a spectral clustering perspective. Spectral clustering [35] is the problem of separating the vertices of a similarity graph into two subsets of roughly the same size via spectral graph analysis. Specifically, the authors of a landmark paper [13] proposed a minimization objective, normalized cut (Ncut), to divide $\mathcal{V}$ into subsets $A$ and $B$:

$$\mathrm{Ncut}(A,B) = \frac{\mathrm{cut}(A,B)}{\mathrm{assoc}(A,\mathcal{V})} + \frac{\mathrm{cut}(A,B)}{\mathrm{assoc}(B,\mathcal{V})} \qquad (28)$$

where $\mathrm{cut}(A,B)$ counts the weighted edges across the two subsets $A$ and $B$, and $\mathrm{assoc}(A,\mathcal{V})$ counts the degrees of vertices in $A$:

$$\mathrm{cut}(A,B) = \sum_{i\in A,\, j\in B} w_{i,j}, \qquad \mathrm{assoc}(A,\mathcal{V}) = \sum_{i\in A,\, j\in\mathcal{V}} w_{i,j} \qquad (29)$$

From (28), it is clear that a good division of $\mathcal{V}$ into similar clusters $A$ and $B$ will have small $\mathrm{cut}(A,B)$ and large $\mathrm{assoc}(A,\mathcal{V})$ and $\mathrm{assoc}(B,\mathcal{V})$. However, minimizing Ncut over all possible $A$ and $B$ that divide $\mathcal{V}$ is NP-hard.

Towards an approximation, the authors first rewrote the minimization of Ncut as:

$$\min_{\mathbf{v}} \frac{\mathbf{v}^\top\mathbf{L}\,\mathbf{v}}{\mathbf{v}^\top\mathbf{D}\,\mathbf{v}} \qquad (30)$$

where

$$v_i = \begin{cases} 1 & \text{if } i \in A \\ -b & \text{if } i \in B \end{cases}, \qquad b = \frac{\sum_{i\in A} d_{i,i}}{\sum_{i\in B} d_{i,i}} \qquad (31)$$

We see that $\mathbf{v}$ is piecewise constant (PWC).

The problem constraints (31) are then relaxed, resulting in:

$$\min_{\mathbf{v}\,:\,\mathbf{v}^\top\mathbf{D}\mathbf{1} = 0} \frac{\mathbf{v}^\top\mathbf{L}\,\mathbf{v}}{\mathbf{v}^\top\mathbf{D}\,\mathbf{v}} \qquad (32)$$

The idea is that if the solution happens to satisfy (31), then it also solves the original Ncut problem. Thus a solution to (32) can be interpreted as a relaxed / approximate solution to the Ncut problem in (28).

We now use this result from [13] to interpret low graph frequencies of $\mathbf{L}_n$. We first define $\mathbf{u} = \mathbf{D}^{1/2}\mathbf{v}$ and recall $\mathbf{L}_n = \mathbf{D}^{-1/2}\mathbf{L}\mathbf{D}^{-1/2}$. (32) can then be rewritten as:

$$\min_{\mathbf{u}\,\perp\,\mathbf{D}^{1/2}\mathbf{1}} \frac{\mathbf{u}^\top\mathbf{L}_n\,\mathbf{u}}{\mathbf{u}^\top\mathbf{u}} \qquad (33)$$

Note that the objective in (33) is the Rayleigh quotient with respect to matrix $\mathbf{L}_n$ [36]. It is easy to see that $\mathbf{u}_1 = \mathbf{D}^{1/2}\mathbf{1}$ minimizes this quotient: $\mathbf{L}_n\mathbf{D}^{1/2}\mathbf{1} = \mathbf{D}^{-1/2}\mathbf{L}\mathbf{1} = \mathbf{0}$. Hence $\mathbf{D}^{1/2}\mathbf{1}$ is the first eigenvector of $\mathbf{L}_n$, and the solution to (33) is the second eigenvector $\mathbf{u}_2$ of $\mathbf{L}_n$. We can thus conclude the following:

The second eigenvector $\mathbf{u}_2$ of $\mathbf{L}_n$ (mapped back as $\mathbf{v} = \mathbf{D}^{-1/2}\mathbf{u}_2$) is a relaxed solution to the Ncut problem; if the solution becomes exact, then $\mathbf{v}$ is PWC according to (31).

The eigen-basis of $\mathbf{L}_n$ thus seems suitable to compactly represent PWS signals.
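A minimal demonstration of this interpretation, using an assumed toy graph with two strongly intra-connected clusters:

```python
# Minimal demo: for a graph with two well-separated clusters, the
# second eigenvector of L_n (mapped back by D^{-1/2}) is nearly
# piecewise constant, matching the Ncut relaxation above.
import numpy as np

n = 8
W = np.full((n, n), 0.01)            # weak inter-cluster edges
W[:4, :4] = W[4:, 4:] = 1.0          # two strong 4-vertex clusters
np.fill_diagonal(W, 0)

d = W.sum(1)
L = np.diag(d) - W
L_n = np.diag(d**-0.5) @ L @ np.diag(d**-0.5)

lam, U = np.linalg.eigh(L_n)
v = U[:, 1] / np.sqrt(d)             # relaxed Ncut solution
print(np.round(v, 3))                # ~constant on {0..3} and on {4..7}
```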

VI-D2 Smoothness Prior using the Random Walk Graph Laplacian

However, because the first eigenvector of $\mathbf{L}_n$, $\mathbf{u}_1 = \mathbf{D}^{1/2}\mathbf{1}$, is not a constant vector (DC component) in general, the eigen-basis of $\mathbf{L}_n$ is not suitable for filtering of images, which tend to be smooth. We thus perform a similarity transformation (https://en.wikipedia.org/wiki/Matrix_similarity) on $\mathbf{L}_n$ to efficiently filter constant signals. Let $\mathbf{L}_n = \mathbf{U}\mathbf{\Lambda}\mathbf{U}^\top$, where $\mathbf{U}$ is a matrix with eigenvectors $\mathbf{u}_k$ of $\mathbf{L}_n$ as columns, and $\mathbf{\Lambda}$ is a diagonal matrix with corresponding eigenvalues $\lambda_k$ on its diagonal. We now define $\mathbf{L}_r = \mathbf{D}^{-1/2}\mathbf{L}_n\mathbf{D}^{1/2} = \mathbf{D}^{-1}\mathbf{L}$. We see that $\mathbf{L}_r$ has left eigenvectors in the rows of $\mathbf{U}^\top\mathbf{D}^{1/2}$:

$$\left(\mathbf{U}^\top\mathbf{D}^{1/2}\right)\mathbf{L}_r = \mathbf{U}^\top\mathbf{L}_n\,\mathbf{D}^{1/2} = \mathbf{\Lambda}\left(\mathbf{U}^\top\mathbf{D}^{1/2}\right) \qquad (34)$$

Note that because $\mathbf{L}_r$ and $\mathbf{L}_n$ are similar, they have the same eigenvalues $\lambda_k$. When a constant signal $\mathbf{x} = c\mathbf{1}$ is projected onto the left eigenvectors, $\mathbf{D}^{1/2}\mathbf{x}$ is in fact in $\mathrm{span}(\mathbf{u}_1)$, which is orthogonal to the other columns of $\mathbf{U}$. This means that the left eigen-basis coefficients of $\mathbf{x}$, defined as $\boldsymbol{\beta} = \mathbf{U}^\top\mathbf{D}^{1/2}\mathbf{x}$, are non-zero only in the first element when $\mathbf{x} = c\mathbf{1}$.

However, $\mathbf{L}_r$ is asymmetric, and so cannot be easily decomposed into a set of orthogonal eigenvectors with real eigenvalues. Thus there is no clear graph frequency interpretation of $\mathbf{x}^\top\mathbf{L}_r\mathbf{x}$ as we derived for $\mathbf{x}^\top\mathbf{L}\mathbf{x}$ in (27).

To achieve symmetry, suppose we use $\mathbf{L}_r^\top\mathbf{L}_r$ and define the following regularization term instead:

$$S(\mathbf{x}) = \mathbf{x}^\top\mathbf{L}_r^\top\mathbf{L}_r\,\mathbf{x} = \|\mathbf{L}_r\mathbf{x}\|_2^2 \qquad (35)$$

If we write $\tilde{\mathbf{x}} = \mathbf{D}^{1/2}\mathbf{x}$, then (35) becomes:

$$S(\mathbf{x}) = \tilde{\mathbf{x}}^\top\mathbf{L}_n\,\mathbf{D}^{-1}\,\mathbf{L}_n\,\tilde{\mathbf{x}} \qquad (36)$$

Since $\mathbf{D}^{-1}$ is a diagonal matrix, one can see that:

$$\frac{1}{d_{\max}}\,\|\mathbf{L}_n\tilde{\mathbf{x}}\|_2^2 \;\le\; \tilde{\mathbf{x}}^\top\mathbf{L}_n\mathbf{D}^{-1}\mathbf{L}_n\tilde{\mathbf{x}} \;\le\; \frac{1}{d_{\min}}\,\|\mathbf{L}_n\tilde{\mathbf{x}}\|_2^2 \qquad (37)$$

where $d_{\max}$ and $d_{\min}$ are the maximum and minimum vertex degrees in graph $\mathcal{G}$. So instead of minimizing $S(\mathbf{x})$, we can minimize its upper bound $\frac{1}{d_{\min}}\|\mathbf{L}_n\tilde{\mathbf{x}}\|_2^2$.

We can interpret $\|\mathbf{L}_n\tilde{\mathbf{x}}\|_2^2$ more naturally as follows. It can be written as:

$$\|\mathbf{L}_n\tilde{\mathbf{x}}\|_2^2 = \tilde{\mathbf{x}}^\top\mathbf{U}\,\mathbf{\Lambda}^2\,\mathbf{U}^\top\tilde{\mathbf{x}} = \sum_k \lambda_k^2\,\beta_k^2 \qquad (38)$$

Thus we now have a graph frequency interpretation of our regularizer: high graph frequencies of $\mathbf{x}$ (coefficients $\beta_k$ with large $\lambda_k$) are suppressed to restore a smooth signal $\mathbf{x}$. For convenience, we call our regularizer Left Eigenvectors of the Random walk Graph Laplacian (LERaG).

For efficient computation, LERaG can be computed simply as:

$$\mathrm{LERaG}(\mathbf{x}) = \|\mathbf{L}_n\mathbf{D}^{1/2}\mathbf{x}\|_2^2 = \mathbf{x}^\top\mathbf{L}_r^\top\,\mathbf{D}\,\mathbf{L}_r\,\mathbf{x} \qquad (39)$$

i.e., directly from $\mathbf{L}_r$ and $\mathbf{D}$ without any eigen-decomposition.

Compared to $\mathbf{x}^\top\mathbf{L}\mathbf{x}$ in (24), LERaG is based on the normalized variant $\mathbf{L}_n$, and thus its filtering strength is insensitive to the vertex degrees of particular graphs. Further, unlike $\mathbf{L}_n$ alone, LERaG is based on left eigenvectors of $\mathbf{L}_r$ that efficiently filter constant signals, and thus it is suitable for image filtering. Finally, compared to [25], which requires non-trivial computation to identify transformation matrices that make the rows and columns of the Laplacian stochastic, LERaG can be computed simply via (39). Thus our regularizer has desirable filtering properties with little computation overhead compared to previous graph-based regularizers in the literature.
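A minimal sketch of (39) under the reconstruction above, with the sanity check that constant (DC) signals incur zero penalty:

```python
# Minimal sketch of the LERaG regularizer, eq. (39).
import numpy as np

def lerag(W, x):
    """LERaG(x) = || L_n D^{1/2} x ||^2 = x^T L_r^T D L_r x."""
    d = W.sum(1)
    L_r = np.diag(1.0 / d) @ (np.diag(d) - W)     # random walk Laplacian
    v = L_r @ x
    return float(v @ (d * v))                      # = v^T D v

rng = np.random.default_rng(2)
W = rng.random((5, 5)); W = (W + W.T) / 2; np.fill_diagonal(W, 0)
print(lerag(W, np.ones(5)))    # ~0: constant (DC) signals pass through
```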

VI-D3 Analysis of Ideal Piecewise Constant Signals

We now show that LERaG computes to 0 for an ideal PWC signal $\mathbf{x}$ of length $N$ with two constant pieces of sizes $N_1$ and $N_2 = N - N_1$, where:

$$x_i = \begin{cases} c_1 & \text{if } 1 \le i \le N_1 \\ c_2 & \text{if } N_1 < i \le N \end{cases} \qquad (40)$$

where constants $c_1$ and $c_2$ are sufficiently different, i.e., $|c_1 - c_2| > C$ for some sufficiently large $C$.

Using (20) to compute edge weights, this PWC signal implies $w_{i,j} \approx 0$ if $i$ and $j$ belong to different constant pieces. Assume further that $\sigma_l$ is sufficiently large relative to the location differences $\|\mathbf{l}_i - \mathbf{l}_j\|_2$, so that the first exponential kernel in (20) alone determines $w_{i,j}$; within a piece, $w_{i,j} \approx 1$. The adjacency matrix and the graph Laplacian are then both block-diagonal:

$$\mathbf{W} = \begin{bmatrix} \mathbf{W}_1 & \mathbf{0} \\ \mathbf{0} & \mathbf{W}_2 \end{bmatrix}, \qquad \mathbf{L} = \begin{bmatrix} \mathbf{L}_1 & \mathbf{0} \\ \mathbf{0} & \mathbf{L}_2 \end{bmatrix} \qquad (41)$$

where $\mathbf{0}$ is a matrix of all zeros, and $\mathbf{W}_k$ and $\mathbf{L}_k$ are the respective adjacency and combinatorial graph Laplacian matrices for a full unweighted graph of $N_k$ vertices:

$$\mathbf{W}_k = \mathbf{1}\mathbf{1}^\top - \mathbf{I}_{N_k}, \qquad \mathbf{L}_k = N_k\,\mathbf{I}_{N_k} - \mathbf{1}\mathbf{1}^\top \qquad (42)$$

Clearly, the normalized $\mathbf{L}_n$ is also block-diagonal:

$$\mathbf{L}_n = \begin{bmatrix} \mathbf{L}_{n,1} & \mathbf{0} \\ \mathbf{0} & \mathbf{L}_{n,2} \end{bmatrix} \qquad (43)$$

where $\mathbf{L}_{n,k}$ is a normalized graph Laplacian for a full unweighted graph of $N_k$ vertices:

$$\mathbf{L}_{n,k} = \frac{1}{N_k - 1}\left(N_k\,\mathbf{I}_{N_k} - \mathbf{1}\mathbf{1}^\top\right) \qquad (44)$$

We already know that $\mathbf{u}_1 = \mathbf{D}^{1/2}\mathbf{1}$ is the first eigenvector, corresponding to eigenvalue $0$. Further, we see that a second (unnormalized) eigenvector corresponding to eigenvalue $0$ can be constructed as:

$$\mathbf{u}_2 = \mathbf{D}^{1/2}\begin{bmatrix} \mathbf{1}_{N_1} \\ -b\,\mathbf{1}_{N_2} \end{bmatrix}, \qquad b = \frac{\sum_{i \le N_1} d_{i,i}}{\sum_{i > N_1} d_{i,i}} \qquad (45)$$

It is easy to verify that $\mathbf{L}_n\mathbf{u}_2 = \mathbf{0}$ and $\mathbf{u}_1^\top\mathbf{u}_2 = 0$. In fact, we see that $\mathbf{D}^{-1/2}\mathbf{u}_2$ takes the form in (31); i.e., for this graph corresponding to an ideal PWC signal, the relaxed solution to Ncut is also the optimal one, which is PWC.

Analyzing the coefficients $\boldsymbol{\beta} = \mathbf{U}^\top\mathbf{D}^{1/2}\mathbf{x}$, we see that $\mathbf{D}^{1/2}\mathbf{x}$ can be expressed exactly as a linear combination of the first two eigenvectors of $\mathbf{L}_n$, i.e., $\mathbf{D}^{1/2}\mathbf{x} = \gamma_1\mathbf{u}_1 + \gamma_2\mathbf{u}_2$, where

$$\gamma_1 = \frac{b\,c_1 + c_2}{1 + b} \qquad (46)$$

and

$$\gamma_2 = \frac{c_1 - c_2}{1 + b} \qquad (47)$$

Because the first two eigenvectors $\mathbf{u}_1$ and $\mathbf{u}_2$ correspond to eigenvalue $0$ of $\mathbf{L}_n$, we can conclude:

Given an ideal two-piece PWC signal $\mathbf{x}$ defined in (40), $\mathbf{D}^{1/2}\mathbf{x}$ can be represented exactly using the first two eigenvectors of $\mathbf{L}_n$ corresponding to eigenvalue $0$, and hence LERaG evaluates to $0$.

In other words, $\mathbf{x}$ is also ideal low-pass (cut-off frequency at $0$) given the eigenvectors of $\mathbf{L}_n$, and so LERaG, which penalizes high frequencies according to (38), computes to $0$. Another interpretation is that an ideal two-piece PWC signal leads to two clusters of identical vertices, hence $\mathbf{D}^{-1/2}\mathbf{u}_2$ is an exact solution to Ncut, which is PWC (31). In this case, $\mathbf{D}^{-1/2}\mathbf{u}_1$ and $\mathbf{D}^{-1/2}\mathbf{u}_2$ are both PWC, so together they are sufficient to represent $\mathbf{x}$, also PWC.
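A minimal numeric check of this conclusion, using an assumed toy PWC signal and the intensity kernel of (20) alone (the location term is dropped, mirroring the large-$\sigma_l$ assumption):

```python
# Minimal check: LERaG ~ 0 for an ideal two-piece PWC signal.
import numpy as np

x = np.array([0., 0., 0., 100., 100.])           # ideal two-piece PWC
sigma_x = 10.0
W = np.exp(-(x[:, None] - x[None, :])**2 / sigma_x**2)  # intensity kernel
np.fill_diagonal(W, 0)

d = W.sum(1)
L_r = np.diag(1.0 / d) @ (np.diag(d) - W)
v = L_r @ x
print(v @ (d * v))    # ~0: both pieces constant, cross-weights ~exp(-100)
```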

VI-D4 Analysis of Ideal Piecewise Smooth Signals

We now generalize our analysis of PWC signals to PWS signals. We first define a PWS signal $\mathbf{x}$ with two pieces $A_1 = \{1,\ldots,N_1\}$ and $A_2 = \{N_1+1,\ldots,N\}$ as:

$$|x_i - x_j| \le \epsilon \;\;\forall i,j \text{ in the same piece}, \qquad |x_i - x_j| > C \;\;\forall i,j \text{ in different pieces} \qquad (48)$$

where $C \gg \epsilon$. In words, $\mathbf{x}$ contains two smooth pieces in $A_1$ and $A_2$: two samples in the same piece are similar to within $\epsilon$, and two samples from different pieces differ by more than $C$. Clearly, if $\epsilon = 0$, then $\mathbf{x}$ is PWC.

We now construct a complete graph for $\mathbf{x}$ using (20) to compute the edge weight $w_{i,j}$ between vertices $i$ and $j$. The resulting normalized $\mathbf{L}_n$ is still block-diagonal for sufficiently large $C$, but the diagonal blocks are no longer the $\mathbf{L}_{n,k}$ in (44). $\mathbf{u}_2$ in this case can be computed as:

$$\mathbf{u}_2 = \mathbf{D}^{1/2}\begin{bmatrix} \mathbf{1}_{N_1} \\ -b\,\mathbf{1}_{N_2} \end{bmatrix}, \qquad b = \frac{\sum_{i\in A_1} d_{i,i}}{\sum_{i\in A_2} d_{i,i}} \qquad (49)$$

One can verify that $\mathbf{L}_n\mathbf{u}_2 = \mathbf{0}$ and $\mathbf{u}_1^\top\mathbf{u}_2 = 0$. $\mathbf{u}_2$ is roughly PWS according to (49), since $d_{i,i}$ is similar for nodes within the same smooth piece. Thus $\mathbf{D}^{1/2}\mathbf{x}$, also roughly PWS, can be well approximated as $\gamma_1\mathbf{u}_1 + \gamma_2\mathbf{u}_2$, with little energy leakage to high frequencies; i.e., $\mathbf{x}$ is approximately low-pass given the eigenvectors of $\mathbf{L}_n$. That means LERaG computes to a small value for a PWS signal $\mathbf{x}$.

In the more general case when signal $\mathbf{x}$ is not PWS, $\mathbf{L}_n$ is not block-diagonal, and the second eigenvector $\mathbf{u}_2$ computed in (33) corresponds to an eigenvalue $\lambda_2 > 0$. This first non-zero eigenvalue is called the Fiedler number [37], and is a measure of how connected the graph is. In our signal restoration context, it means that the closer $\mathbf{x}$ is to PWS, the closer $\lambda_2$ is to $0$, and the more likely LERaG can recover the signal well.

Fig. 6: Signal reconstruction comparison using only the first two right eigenvectors of $\mathbf{L}_n$ versus the first two DCT basis functions.

For intuition, we illustrate a simple 1-D PWS signal composed of two smooth pieces. Fig. 6 (a) shows that the first two right eigenvectors of $\mathbf{L}_n$ can represent the original signal well. In contrast, as illustrated in Fig. 6 (b), reconstruction using only the first two DCT frequencies is poor.

VII Soft Decoding based on Prior Mixture

We now combine the priors discussed in previous sections to construct an efficient JPEG soft decoding algorithm.

VII-A The Objective Function Formulation

Combining the sparsity prior and LERaG, we arrive at the following optimization for soft decoding:

$$\min_{\mathbf{z},\,\boldsymbol{\alpha}} \;\|\mathbf{z} - \mathbf{\Phi}\boldsymbol{\alpha}\|_2^2 + \lambda\,\|\boldsymbol{\alpha}\|_0 + \tau\,\|\mathbf{L}_n\mathbf{D}^{1/2}\mathbf{z}\|_2^2, \qquad \text{s.t.}\;\; \mathrm{round}\!\left(\mathbf{T}\mathbf{B}\mathbf{z} \oslash \mathbf{Q}\right) = \mathbf{q} \qquad (50)$$

where both patch $\mathbf{z}$ and its sparse code $\boldsymbol{\alpha}$ are unknown.

For parameter $\lambda$ of the sparsity prior, we set it empirically. For parameter $\tau$ of the graph-signal smoothness prior, we set it adaptively as follows. As argued in Section V-C, if a patch contains large high DCT frequencies, then it will not be recovered well by the sparsity prior. We thus adaptively increase $\tau$ if the q-bin indices indicate the presence of high DCT frequencies in the target $\mathbf{z}$.

Since by design adjacent patches overlap, the pixel values in an overlapping region from two patches should be as similar as possible. To this end, upon obtaining all the optimal reconstructed patches, the soft-decoded image is obtained by averaging over pixels in overlapping regions.

VII-B Optimization

The optimization in (50) is non-convex. Instead, we employ an alternating procedure to optimize $\boldsymbol{\alpha}$ and $\mathbf{z}$ iteratively. First, for initialization, we compute the Laplacian-prior-based MMSE solution per code block for the entire image to obtain each initial patch $\mathbf{z}^{(0)}$. Then, the $t$-th ($t \ge 1$) iteration of the optimization procedure is as follows:

1) Fix $\mathbf{z}^{(t-1)}$ and estimate $\boldsymbol{\alpha}^{(t)}$:

The optimization problem becomes a standard sparse coding:

$$\boldsymbol{\alpha}^{(t)} = \arg\min_{\boldsymbol{\alpha}} \|\mathbf{z}^{(t-1)} - \mathbf{\Phi}\boldsymbol{\alpha}\|_2^2 + \lambda\,\|\boldsymbol{\alpha}\|_0 \qquad (51)$$

which can be efficiently solved by a well-developed greedy $\ell_0$-minimization algorithm such as OMP.

2) Fix $\boldsymbol{\alpha}^{(t)}$ and estimate $\mathbf{z}^{(t)}$:

The objective function becomes:

$$\mathbf{z}^{(t)} = \arg\min_{\mathbf{z}} \|\mathbf{z} - \mathbf{\Phi}\boldsymbol{\alpha}^{(t)}\|_2^2 + \tau\,\|\mathbf{L}_n\mathbf{D}^{1/2}\mathbf{z}\|_2^2, \qquad \text{s.t.}\;\; \mathrm{round}\!\left(\mathbf{T}\mathbf{B}\mathbf{z} \oslash \mathbf{Q}\right) = \mathbf{q} \qquad (52)$$

which can be solved efficiently using quadratic programming [31].

The output solution $\mathbf{z}^{(t)}$ is used as the initial estimate for the $(t+1)$-th iteration, and the graph (and hence $\mathbf{L}_n$ and $\mathbf{D}$) is updated according to $\mathbf{z}^{(t)}$. The algorithm terminates when both $\boldsymbol{\alpha}$ and $\mathbf{z}$ converge. Local convergence can be proved in a similar way as Lemma 1. A compact single-block sketch of this loop is given below.
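Tying the earlier sketches together, below is a minimal single-block version of the loop (the patch is taken equal to the 8×8 code block, so $\mathbf{B} = \mathbf{I}$); it reuses omp, patch_adjacency and project_block from the sketches above, and replaces the QP of (52) with a regularized linear solve followed by q-bin projection, a simplification of the paper's solver.

```python
# Minimal sketch of the alternating soft decoding loop on one block,
# assuming omp, patch_adjacency and project_block are in scope.
import numpy as np

def soft_decode_block(z0, Phi, q, Q, tau=0.05, sparsity=6, iters=5):
    z = z0.ravel().copy()
    for _ in range(iters):
        alpha = omp(Phi, z, sparsity)                     # eq. (51)
        W = patch_adjacency(z.reshape(8, 8))              # graph from current z
        d = W.sum(1)
        L_r = np.diag(1.0 / d) @ (np.diag(d) - W)
        R = L_r.T @ np.diag(d) @ L_r                      # LERaG quadratic form
        # Unconstrained minimizer of ||z - Phi a||^2 + tau z^T R z:
        z = np.linalg.solve(np.eye(z.size) + tau * R, Phi @ alpha)
        z = project_block(z.reshape(8, 8), q, Q).ravel()  # q-bin constraint (2)
    return z.reshape(8, 8)
```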

Images JPEG BM3D [38] KSVD [8] ANCE [18] DicTV [3] SSRQC [20] Ours
PSNR SSIM PSNR SSIM PSNR SSIM PSNR SSIM PSNR SSIM PSNR SSIM PSNR SSIM
Butterfly 22.65 0.7572 23.91 0.8266 24.55 0.8549 24.34 0.8532 23.54 0.8228 25.31 0.8764 25.82 0.8861
Leaves 22.49 0.7775 23.78 0.8408 24.39 0.8684 24.18 0.8551 23.27 0.8245 25.01 0.8861 25.57 0.8979
Hat 25.97 0.7117 26.79 0.7497 27.41 0.7802 27.09 0.7706 27.33 0.7707 27.27 0.7753 27.71 0.7946
Boat 25.23 0.7054 26.31 0.7547 26.85 0.7739 26.59 0.7637 26.31 0.7491 26.85 0.7745 27.17 0.7783
Bike 21.72 0.6531 22.60 0.7039 22.84 0.7130 22.74 0.6973 22.28 0.6952 22.96 0.7119 23.32 0.7291
House 27.76 0.7732 28.87 0.8020 29.53 0.8185 29.07 0.8131 29.59 0.8072 29.92 0.8226 30.25 0.8237
Flower 24.51 0.6866 25.49 0.7352 25.84 0.7471 25.54 0.7337 25.88 0.7316 25.69 0.7407 25.93 0.7521
Parrot 26.15 0.7851 27.40 0.8329 27.22 0.8465 27.81 0.8475 27.92 0.8382 28.21 0.8566 28.25 0.8572
Pepper512 27.17 0.7078 28.31 0.7573 29.32 0.7949 29.03 0.7891 28.81 0.7769 29.28 0.7948 29.81 0.7984
Fishboat512 25.56 0.6563 26.44 0.6921 26.87 0.7072 26.72 0.6994 26.35 0.6963 26.86 0.7042 27.04 0.7113
Lena512 27.32 0.7365 28.43 0.7788 29.13 0.8079 28.96 0.8024 28.51 0.7976 29.19 0.8072 29.29 0.8077
Airplane512 26.34 0.7843 27.11 0.8101 27.62 0.8262 27.40 0.8201 26.95 0.8112 28.90 0.8261 29.19 0.8482
Bike512 22.25 0.6231 23.12 0.6693 23.39 0.6781 23.24 0.6605 22.73 0.6559 24.60 0.6815 24.86 0.6992
Statue512 25.72 0.6629 33.61 0.9188 27.03 0.7371 26.83 0.7265 26.51 0.7268 28.03 0.7627 28.30 0.7685
Average 25.06 0.7157 26.58 0.7765 26.62 0.7827 26.39 0.7737 26.14 0.7645 27.01 0.7872 27.32 0.7941
TABLE I: Quality comparison with respect to PSNR (in dB) and SSIM at QF = 5
Images JPEG BM3D [38] KSVD [8] ANCE [18] DicTV [3] SSRQC [20] Ours
PSNR SSIM PSNR SSIM PSNR SSIM PSNR SSIM PSNR SSIM PSNR SSIM PSNR SSIM
Butterfly 29.97 0.9244 31.35 0.9555 31.57 0.9519 31.38 0.9548 31.22 0.9503 32.02 0.9619 32.87 0.9627
Leaves 30.67 0.9438 32.55 0.9749 33.04 0.9735 32.74 0.9728 32.45 0.9710 32.13 0.9741 34.42 0.9803
Hat 32.78 0.9022 33.89 0.9221 33.62 0.9149 33.69 0.9169 33.20 0.8988 34.10 0.9237 34.46 0.9268
Boat 33.42 0.9195 34.77 0.9406 34.28 0.9301 34.64 0.9362 26.08 0.7550 33.88 0.9306 34.98 0.9402
Bike 28.98 0.9131 29.96 0.9356 30.19 0.9323 30.31 0.9357 29.75 0.9154 30.35 0.9411 31.14 0.9439
House 35.07 0.8981 36.09 0.9013 36.05 0.9055 36.12 0.9048 35.17 0.8922 36.49 0.9072 36.55 0.9071
Flower 31.62 0.9112 32.81 0.9357 32.63 0.9271 32.67 0.9314 31.86 0.9084 33.02 0.9362 33.37 0.9371
Parrot 34.03 0.9291 34.92 0.9397 34.91 0.9371 35.02 0.9397 33.92 0.9227 35.11 0.9401 35.32 0.9401
Pepper512 34.21 0.8711 34.94 0.8767 34.89 0.8784 34.99 0.8803 34.24 0.8639 35.05 0.8795 35.19 0.8811
Fishboat512 32.76 0.8763 33.61 0.8868 33.36 0.8809 33.60 0.8861 32.53 0.8496 33.68 0.8859 33.73 0.8871
Lena512 35.12 0.9089 36.03 0.9171 35.82 0.9146 36.04 0.9177 34.85 0.8986 36.09 0.9187 36.11 0.9191
Airplane512 33.36 0.9253 34.38 0.9361 34.36 0.9341 34.53 0.9358 33.75 0.9134 35.81 0.9355 36.07 0.9439
Bike512 29.43 0.9069 30.47 0.9299 30.66 0.9258 30.71 0.9298 30.05 0.9043 32.26 0.9372 32.55 0.9387
Statue512 32.78 0.9067 33.61 0.9188 33.55 0.9149 33.55 0.9193 32.53 0.8806 34.88 0.9249 34.95 0.9273
Average 32.44 0.9097 33.52 0.9264 33.50 0.9229 33.57 0.9258 32.25 0.8945 33.91 0.9283 34.41 0.9311
TABLE II: Quality comparison with respect to PSNR (in dB) and SSIM at QF = 40

VIII Experimentation

Fig. 7: Fourteen test images, in the same order as Tables I and II.

We now present experimental results to demonstrate the superior performance of our soft decoding algorithm. We selected fourteen test images widely used in the literature, illustrated in Fig. 7. The first eight images are of smaller size; the remaining six, whose names carry the suffix 512 in Tables I and II, are of size 512×512.

VIII-A Competing Methods

Our proposed method is compared against: 1) BM3D [39]: a state-of-the-art image denoising method, included because the restoration of compressed images can be viewed as a denoising problem; in our tests, the extended version [38] is used, which achieves better performance than the original BM3D. 2) KSVD [8]: a well-known sparse coding framework; we test KSVD with a sufficiently large over-complete dictionary, whereas our scheme uses a much smaller one. We will demonstrate that our scheme achieves better reconstruction performance along with lower computational complexity compared with KSVD. 3) ANCE [18]: a state-of-the-art JPEG image restoration algorithm. 4) DicTV [3]: a recent sparsity-based compressed image restoration algorithm, which exploits both sparsity and TV priors. 5) SSRQC [20]: a state-of-the-art JPEG soft decoding method published very recently, which can be regarded as one of the best algorithms to date. The source code of each compared method was kindly provided by its authors.

The determination of parameters deserves clarification. Regarding patch size, larger patches result in more overlapping pixels among neighboring code blocks, which is beneficial in removing block artifacts; however, it also makes finding their sparse representations more difficult. Thus we set the patch size so that each patch overlaps 2 pixels horizontally and vertically with neighboring patches, achieving a good tradeoff. For the uncompressed training set used to obtain the over-complete dictionary, we randomly select five images from the Kodak Lossless True Color Image Suite (http://r0k.us/graphics/kodak/). The training set does not overlap the test set.

Our method can easily be extended to restore compressed color images. When compressing color images, JPEG first performs a YUV color transformation, and then compresses the resulting Y, U and V channels separately. As the image signal energy is largely packed into the luminance channel Y, we apply our full proposed scheme to Y. For the chrominance channels U and V, we only use the graph-signal smoothness prior, to speed up the restoration process. Due to space limitations, we only report test results on gray-level images in the following.

Fig. 8: QF-PSNR performance comparison for QFs ranging from 10 to 80. The proposed method achieves the best PSNR performance for all QFs.

VIII-B Objective Performance Comparison

Table I and Table II tabulate the objective evaluations (PSNR and SSIM) of the above algorithms for the fourteen test images. The test images are coded by a JPEG coder with a small QF = 5 and a medium QF = 40.

When QF = 5, the average PSNR gains are 0.74dB and 0.7dB, respectively, compared to the two well-known algorithms BM3D and KSVD. Note that KSVD uses a sufficiently large dictionary for reconstruction; while our method uses a much smaller dictionary, it achieves better results than KSVD, which demonstrates the effectiveness of the proposed graph-signal smoothness prior. Compared to the state-of-the-art ANCE algorithm, our method greatly improves reconstruction quality: the PSNR gain is up to 1.48dB (Butterfly) and the average PSNR gain is 0.93dB. Our method also outperforms state-of-the-art sparsity-based methods. Compared to DicTV, designed specifically to restore compressed images, the average gain of our method is 1.18dB. Compared to SSRQC, the newest sparsity-based soft decoding method, our method achieves an average gain of 0.31dB. When QF = 40, the proposed algorithm again performs better than the other methods for all test images; the average PSNR gains are 0.89dB, 0.91dB, 0.84dB, 2.16dB and 0.5dB, respectively.

We further show test results for a wide range of QFs (from 10 to 80), illustrated in Fig. 8. Due to space limitations, we report results for only six test images: four smaller images and two 512×512 images. We observe that our method consistently performs better than the other methods.

VIII-C Subjective Performance Comparison

We now demonstrate that our soft decoding algorithm also achieves better perceptual quality in the restored images. When QF is 5, the quantization noise is severe, which leads to very poor subjective quality of JPEG-compressed images. We therefore use QF = 5 as an example to evaluate the visual quality of the different schemes.

Fig. 9 and Fig. 10 illustrate the perceptual quality of the different methods. The test images Butterfly and Leaves have rich structural regions; we use them as examples to demonstrate the performance of the different methods in recovering such regions. The images reproduced by BM3D suffer from highly visible noise, especially blocking artifacts. KSVD achieves better results than BM3D, with fewer detectable blocking artifacts; however, KSVD cannot preserve edge structures well. In the results produced by DicTV, there are still strong blocking artifacts. This is because, in DicTV, the dictionary is learned from the JPEG-compressed image itself; when the quantization step is large, the structured noise is also learned as dictionary atoms, and the subsequent sparse-coding-based restoration thus enhances rather than suppresses the quantization noise. ANCE can suppress most blocking artifacts, but there are still noticeable artifacts along edges. SSRQC removes most block artifacts; however, the edge regions are blurred to some extent.

The images restored by our method are much cleaner, with structures and edge sharpness well preserved. Our proposed method primarily benefits from exploiting both the sparsity prior for recovering textures (low DCT frequencies) and the graph-signal smoothness prior for recovering structures (high DCT frequencies); the experimental results validate this point. Our proposed method can also completely remove DCT blocking artifacts in smooth areas, with results largely free of staircase and ringing artifacts along edges. Due to space limitations, we only show subjective comparisons for a low QF, where the superiority is most visible; our method also achieves better subjective quality for medium to high QFs.

Fig. 9: Comparison of tested methods in visual quality on Butterfly at QF = 5. The corresponding PSNR and SSIM values are also given as references.
Fig. 10: Comparison of tested methods in visual quality on Leaves at QF = 5. The corresponding PSNR and SSIM values are also given as references.

VIII-D Computational Complexity Comparison

We further report a computational complexity comparison. Here we show the average running time over eight test images when QF = 40. The compared methods are run on a typical laptop computer (Intel Core i7 CPU 2.6GHz, 8GB memory, Windows 10, Matlab R2014a). As depicted in Table III, the complexity of our method is lower than that of the state-of-the-art algorithms BM3D (BM3D-SAPCA) and ANCE. Compared with K-SVD, which uses a sufficiently large over-complete dictionary, the proposed scheme significantly reduces computational complexity.

TIME BM3D KSVD ANCE DicTV SSRQC Proposed
Average 373.35 209.71 307.43 39.53 70.32 143.73
TABLE III: Computational complexity comparison (average running time, in seconds) when QF = 40

VIII-E Comparison of Different Graph-Signal Smoothness Priors

To demonstrate the superiority of the proposed graph Laplacian regularizer LERaG, we compare it with the popular combinatorial graph Laplacian regularizer, the symmetrically normalized graph Laplacian regularizer, and the doubly stochastic one proposed in [25]. We test eight images at QF = 5. The reported results are PSNR values after the first iteration; that is, after solving (51), we use each of the different graph Laplacian regularizers in turn to formulate (52). For fairness of comparison, the regularization parameter is carefully selected for optimal performance in each case. The proposed regularizer outperforms the other three with respect to average PSNR. Compared with the other three graph Laplacian regularizers, our method improves PSNR by up to 0.18dB, 0.87dB and 0.42dB, respectively.

Images Combinatorial Normalized Doubly Stochastic LERaG
Butterfly 25.42 24.70 25.15 25.57
Leaves 24.99 24.54 24.84 25.17
Hat 27.53 27.42 27.43 27.56
Boat 26.99 26.94 26.98 26.99
Bike 23.12 23.01 23.09 23.17
House 29.87 29.83 29.86 29.89
Flower 25.84 25.78 25.82 25.87
Parrot 27.97 27.95 27.97 28.02
Average 26.46 26.27 26.39 26.53
TABLE IV: Performance comparison of different graph-signal smoothness priors with respect to PSNR (in dB) at QF = 5

IX Conclusion

In this paper, a novel soft decoding approach for the restoration of JPEG-compressed images is proposed. The main technical contribution is twofold. First, we propose a new graph-signal smoothness prior, based on left eigenvectors of the random walk graph Laplacian, that has desirable image filtering properties and can recover high DCT frequencies of piecewise smooth signals well. Second, we combine the Laplacian prior, the sparsity prior and our new graph-signal smoothness prior into an efficient soft decoding algorithm. Experimental results demonstrate that our method achieves better objective and subjective restoration quality than state-of-the-art soft decoding algorithms.

References

  • [1] A. Zakhor, “Iterative procedures for reduction of blocking effects in transform image coding,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 2, no. 1, pp. 91–95, Mar 1992.
  • [2] K. Bredies and M. Holler, “A total variation-based JPEG decompression model,” SIAM J. Img. Sci., vol. 5, no. 1, pp. 366–393, Mar. 2012.
  • [3] H. Chang, M. Ng, and T. Zeng, “Reducing artifacts in JPEG decompression via a learned dictionary,” IEEE Transactions on Signal Processing, vol. 62, no. 3, pp. 718–728, Feb 2014.
  • [4] X. Liu, X. Wu, J. Zhou, and D. Zhao, “Data-driven sparsity-based restoration of JPEG-compressed images in dual transform-pixel domain,” in 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2015, pp. 5171–5178.
  • [5] E. Lam and J. Goodman, “A mathematical analysis of the DCT coefficient distributions for images,” IEEE Transactions on Image Processing, vol. 9, no. 10, pp. 1661–1666, Oct 2000.
  • [6] M. Elad, M. Figueiredo, and Y. Ma, “On the role of sparse and redundant representations in image processing,” Proceedings of the IEEE, vol. 98, no. 6, pp. 972–982, June 2010.
  • [7] D. Shuman, S. Narang, P. Frossard, A. Ortega, and P. Vandergheynst, “The emerging field of signal processing on graphs: Extending high-dimensional data analysis to networks and other irregular domains,” IEEE Signal Processing Magazine, vol. 30, no. 3, pp. 83–98, May 2013.
  • [8] M. Aharon, M. Elad, and A. Bruckstein, “K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation,” IEEE Transactions on Signal Processing, vol. 54, no. 11, pp. 4311–4322, Nov 2006.
  • [9] A. Gersho and R. M. Gray, Vector Quantization and Signal Compression. Norwell, MA, USA: Kluwer Academic Publishers, 1991.
  • [10] J. Pang, G. Cheung, W. Hu, and O. Au, “Redefining self-similarity in natural images for denoising using graph signal gradient,” in APSIPA ASC, December 2014.
  • [11] J. Pang, G. Cheung, A. Ortega, and O. Au, “Optimal graph Laplacian regularization for natural image denoising,” in IEEE International Conference on Acoustics, Speech and Signal Processing, April 2015.
  • [12] W. Hu, G. Cheung, and M. Kazui, “Graph-based dequantization of block-compressed piecewise smooth images,” IEEE Signal Processing Letters, vol. 23, no. 2, pp. 242–246, February 2016.
  • [13] J. Shi and J. Malik, “Normalized cuts and image segmentation,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 22, no. 8, pp. 888–905, Aug. 2000.
  • [14] H. Reeve and J. Lim, “Reduction of blocking effect in image coding,” in IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), vol. 8, Apr 1983, pp. 1212–1215.
  • [15] D. Tschumperle and R. Deriche, “Vector-valued image regularization with PDEs: a common framework for different applications,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 4, pp. 506–517, April 2005.
  • [16] Y. Kwon, K. I. Kim, J. Tompkin, J. H. Kim, and C. Theobalt, “Efficient learning of image super-resolution and compression artifact removal with semi-local Gaussian processes,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 37, no. 9, pp. 1792–1805, Sept 2015.
  • [17] C. Jung, L. Jiao, H. Qi, and T. Sun, “Image deblocking via sparse representation,” Signal Processing: Image Communication, vol. 27, no. 6, pp. 663–677, 2012.
  • [18] X. Zhang, R. Xiong, X. Fan, S. Ma, and W. Gao, “Compression artifact reduction by overlapped-block transform coefficient estimation with block similarity,” IEEE Transactions on Image Processing, vol. 22, no. 12, pp. 4613–4626, Dec 2013.
  • [19] X. Liu, X. Wu, J. Zhou, and D. Zhao, “Data-driven soft decoding of compressed images in dual transform-pixel domain,” IEEE Transactions on Image Processing, vol. 25, no. 4, pp. 1649–1659, April 2016.
  • [20] C. Zhao, J. Zhang, S. Ma, X. Fan, Y. Zhang, and W. Gao, “Reducing image compression artifacts by structural sparse representation and quantization constraint prior,” IEEE Transactions on Circuits and Systems for Video Technology, vol. PP, no. 99, pp. 1–1, 2016.
  • [21] A. Sandryhaila and J. M. F. Moura, “Discrete signal processing on graphs: Frequency analysis,” IEEE Transactions on Signal Processing, vol. 62, no. 12, pp. 3042–3054, June 2014.
  • [22] W. Hu, X. Li, G. Cheung, and O. Au, “Depth map denoising using graph-based transform and group sparsity,” in IEEE International Workshop on Multimedia Signal Processing, Pula, Italy, October 2013.
  • [23] Y. Mao, G. Cheung, and Y. Ji, “Image interpolation for DIBR view synthesis using graph Fourier transform,” in 3DTV-Conference, July 2014, pp. 1–4.
  • [24] P. Wan, G. Cheung, D. Florencio, C. Zhang, and O. C. Au, “Image bit-depth enhancement via maximum a posteriori estimation of AC signal,” IEEE Transactions on Image Processing, vol. 25, no. 6, pp. 2896–2909, June 2016.
  • [25] A. Kheradmand and P. Milanfar, “A general framework for regularized, similarity-based image restoration,” IEEE Transactions on Image Processing, vol. 23, no. 12, pp. 5136–5151, Dec 2014.
  • [26] X. Liu, G. Cheung, X. Wu, and D. Zhao, “Inter-block consistent soft decoding of JPEG images with sparsity and graph-signal smoothness priors,” in 2015 IEEE International Conference on Image Processing (ICIP), Sept 2015, pp. 1628–1632.
  • [27] W. B. Pennebaker and J. L. Mitchell, JPEG Still Image Data Compression Standard, 1st ed. Norwell, MA, USA: Kluwer Academic Publishers, 1992.
  • [28] P. Wan, G. Cheung, D. Florencio, C. Zhang, and O. Au, “Image bit-depth enhancement via maximum-a-posteriori estimation of AC signal,” IEEE Transactions on Image Processing, vol. 25, no. 6, pp. 2896–2909, June 2016.
  • [29] S. G. Mallat and Z. Zhang, “Matching pursuits with time-frequency dictionaries,” IEEE Transactions on Signal Processing, vol. 41, no. 12, pp. 3397–3415, Dec 1993.
  • [30] S. S. Chen, D. L. Donoho, and M. A. Saunders, “Atomic decomposition by basis pursuit,” SIAM Rev., vol. 43, no. 1, pp. 129–159, Jan. 2001.
  • [31] J. Nocedal and S. J. Wright, Numerical Optimization, 2nd ed. New York, NY, USA: Springer, 2006.
  • [32] J. Rayleigh, “On the character of the complete radiation at a given temperature,” Philosophical Magazine, vol. 7, pp. 460–469, 1889.
  • [33] P. F. Swaszek, “A vector quantizer for the Laplace source,” IEEE Transactions on Information Theory, vol. 37, no. 5, pp. 1355–1365, Sep 1991.
  • [34] C. Tomasi and R. Manduchi, “Bilateral filtering for gray and color images,” in Sixth International Conference on Computer Vision, Jan 1998, pp. 839–846.
  • [35] U. von Luxburg, “A tutorial on spectral clustering,” Statistics and Computing, vol. 17, no. 4, pp. 395–416, 2007.
  • [36] R. A. Horn and C. R. Johnson, Eds., Matrix Analysis. New York, NY, USA: Cambridge University Press, 1986.
  • [37] D. A. Spielman and S. Teng, “Spectral partitioning works: Planar graphs and finite element meshes,” Linear Algebra and its Applications, vol. 421, no. 2–3, pp. 284–305, 2007.
  • [38] K. Dabov, A. Foi, V. Katkovnik, and K. Egiazarian, “BM3D image denoising with shape-adaptive principal component analysis,” in Proc. Workshop on Signal Processing with Adaptive Sparse Structured Representations (SPARS'09), 2009.
  • [39] K. Dabov, A. Foi, V. Katkovnik, and K. Egiazarian, “Image denoising by sparse 3-D transform-domain collaborative filtering,” IEEE Transactions on Image Processing, vol. 16, no. 8, pp. 2080–2095, Aug 2007.