Generalized Tree-Based Wavelet Transform

11/20/2010 ∙ by Idan Ram, et al. ∙ Technion 0

In this paper we propose a new wavelet transform applicable to functions defined on graphs, high dimensional data and networks. The proposed method generalizes the Haar-like transform proposed in [1], and it is defined via a hierarchical tree, which is assumed to capture the geometry and structure of the input data. It is applied to the data using a modified version of the common one-dimensional (1D) wavelet filtering and decimation scheme, which can employ different wavelet filters. In each level of this wavelet decomposition scheme, a permutation derived from the tree is applied to the approximation coefficients, before they are filtered. We propose a tree construction method that results in an efficient representation of the input function in the transform domain. We show that the proposed transform is more efficient than both the 1D and two-dimensional (2D) separable wavelet transforms in representing images. We also explore the application of the proposed transform to image denoising, and show that combined with a subimage averaging scheme, it achieves denoising results which are similar to those obtained with the K-SVD algorithm.

READ FULL TEXT VIEW PDF
POST COMMENT

Comments

There are no comments yet.

Authors

page 7

page 8

page 10

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

I Introduction

Most traditional signal processing methods are designed for data defined on regular Euclidean grids. Development of comparable methods capable of handling non-uniformly sampled signals, data defined on graphs or “point clouds”, is important and very much needed. Many signal processing problems involve inference of an unknown scalar target function defined on such data. For example, function denoising involves estimating such a scalar function from its noisy version. A different example is image inpainting which involves estimating missing pixels of a column

stacked version of an image from its known pixels. A major challenge in processing functions on topologically complicated data, is to find efficient methods to represent and learn them.

Many signal processing techniques are based on transform methods, which represent the input data in a new basis, before analyzing or processing it. One of the most successful types of transforms, which has been proven to be a very useful tool for signal and image processing, is wavelet analysis [2]. A major advantage of wavelet methods is their ability to simultaneously localize signal content in both space and frequency. This property allows them to compactly represent signals such as 1D

steps or images with edges, whose primary information content lies in localized singularities. Moreover, wavelet methods represent such signals much more compactly than either the original domain or transforms with global basis elements such as the Fourier transform.

We aim at extending the wavelet transform to irregular, non-Euclidean spaces, and thus obtain a transform that efficiently represent functions defined on such data.

Several extensions of the wavelet transform, operating on graphs and high dimensional data, have already been proposed. The wavelet transforms proposed in [3],[4] and [5] were applied to the data points themselves, rather than functions defined on the data. Other methods took different approaches to construct wavelets applied to functions on the data. Maggioni and Coifman [6] and Hammond et al. [7] proposed wavelets based on diffusion operators and the graph Laplacian [8],[9], respectively. Jansen et al. [10] proposed three methods which were based on a variation of the lifting scheme [11],[12]. Gavish et al. [1] assumed that the geometry and structure of the input data are captured in a hierarchical tree. Then, given such a tree, they built a data adaptive Haar-like orthonormal basis for the space of functions over the data set. This basis, can be seen as a generalization of the one proposed in [13] for binary trees. Our proposed method generalizes the algorithm in [1], and can also construct data adaptive orthonormal non-Haar wavelets, given a data driven-hierarchical tree.

We note that the wavelet transforms proposed in [14], and [15], which are defined on images, also share some similarities with our proposed algorithm. These methods employ pairing or reordering of wavelet coefficients in the decomposition schemes in order to adapt to their input images. In fact, the easy path wavelet transform proposed in [15], which only recently has come to our attention, employs a decomposition scheme which is very similar to ours. Constraining our algorithm to a regular data grid and using the same starting point and search neighborhood as the ones employed in [15], both algorithms essentially coincide. Nevertheless, our approach is more general, as it tackles the more abstract problem of devising a transform for point clouds or high-dimensional graph-data, whereas [15] concentrates on images.

In this paper we introduce a generalized tree-based wavelet transform (GTBWT), which is an extension of the transform introduced by Gavish et al [1]. We first show that the transform of Gavish et al., when derived from a full binary tree, can be applied to a function over the data set using a modified version of the common 1D Haar wavelet filtering and decimation scheme. In each level of this wavelet decomposition scheme, a permutation derived from the tree is applied to the approximation coefficients, before they are filtered. Then we show how this scheme can be extended to work with different wavelet filters, by modifying the way different levels of the tree are constructed.

The construction of each coarse level of the tree involves finding a path which passes through the set of data points in the finer level. The points order in this path defines the permutation applied to the approximation coefficients of the finer level in the wavelet decomposition scheme. We propose a path constructed by starting from a random point, and then continue from each point to its nearest neighbor according to some distance measure, visiting each point only once. The corresponding permutation increases the regularity of the permuted approximation coefficients signal, and therefore it is more efficiently (sparsely) represented using the wavelet transform.

Next we show that the proposed scheme is more efficient than both the common 1D and 2D separable wavelet transforms in representing images. Finally, we explore the application of the proposed transform to image denoising, and show that combined with a proposed subimage averaging scheme, it achieves denoising results similar to the ones obtained with the K-SVD algorithm [16].

The paper is organized as follows: In Section II we describe how the Haar-like basis introduced in [1] is derived from a full binary tree representation of the data. We also describe how such a tree may be constructed. In Section III we introduce the generalized tree-based wavelet transform. We also explore the efficiency with which this transform represents an image. In Section IV we explore the application of our proposed algorithm to image denoising. We also describe the subimage averaging scheme and present some experimental results. We summarize the paper in Section V.

Ii Tree-based Haar wavelets

Let be the dataset we wish to analyze, where the samples may be points in high dimension, or nodes in a weighted graph or network. Also, let be a scalar function defined on the dataset, and let be the space of all functions on the dataset. Here we use the following inner product with the space :

(1)

which is different from the one used by Gavish et al. [1], since it does not contain a normalizing factor before the sum.

Gavish et al. assume that the geometry and structure of the data are captured by one or several hierarchical trees. They do not insist on any specific construction method for these trees, but only that they will be balanced [1]. Given such a tree, they construct a multiscale wavelet-like orthonormal basis for the space . They start by showing that such a tree induces a multi-resolution analysis with an associated Haar-like wavelet.

Let denote the level in the tree, with being the root and being the lowest level, where each sample is a single leaf. Also, let denote the space of functions constant on all folders (subtrees) at level , and let denote a constant function on with the value 1. Then , and by construction

(2)

Now, let be the orthogonal complement of in . Then, the space of all functions can be decomposed as

(3)

Before describing how the multiscale orthonormal basis is constructed given a hierarchical tree, we first describe how such a tree can be constructed from the data. Here we focus on the case of complete full binary trees and the corresponding orthonormal bases. For the case of more general trees and Haar like bases, the reader may refer to [1]. Let denote the -th point at level of the tree, where , and let and

denote a set and a vector containing point

indices, respectively. Also, let be a distance measure in , and let be a distance matrix associated with the -th level of the tree, where . The distance function describes the first-order interaction between data-points, and therefore it should be chosen so as to capture some notion of similarity between them, which would be meaningful to the application at hand. A complete full binary tree can be constructed from the data according to Algorithm 1. An example for such a tree is shown in Fig. 1.

Task: Construct a complete full binary tree from the data . Parameters: We are given the points and the distance function . Initialization: Set as the tree leaves. Main Iteration: Perform the following steps for : Construct a distance matrix , where . Set . Group the points in level in pairs by repeating times: Choose a random point , , and update . Pair with the point , where . Update . Place in a vector the reordered point indices of . Construct the coarse level from the finer level by replacing each pair and with the mean point . Output: The tree node points and the vectors containing the points order in each tree level.
Algorithm 1: Complete full binary tree construction from the data .
Fig. 1: An illustration of a complete full binary tree.

In the case of a full binary tree, the Haar-like basis constructed from the tree is essentially the standard Haar basis which we denote . The adaptivity of the transform is manifested in the fact that this basis is used to represent a permuted version of the signal , which is more efficiently represented by the Haar basis than itself. The permutation is derived from the tree, and is dependent on the data . Let denote a vector of length , which contains the indices of the points , in the order determined by the lowest level of the fully constructed tree. For example, for the tree in Fig. 1, . Also let be the signal permuted according to the vector . The wavelet coefficients can be calculated by the inner products , or by applying the 1D wavelet filtering and decimation scheme with the Haar wavelet filters on . Similarly, the inverse transform is calculated by applying the inverse Haar transform on the wavelet coefficients, and reordering the produced vector so as to cancel the index ordering in . We hereafter term the scheme described above as tree-based Haar wavelet transform, and we show next that it can be extended to operate with general wavelet filters.

Iii Generalized tree-based wavelets

Iii-a Generalized tree construction and transform

The above-described building process of the tree can be presented a little differently. In every level of the tree, we first construct a distance matrix using the mean points and the distance function . Then we group in pairs the points , according to the weights in , as described in Algorithm 1. Next we place the pairs of column vectors one after the other in a matrix of size , and keep the indices of the points in their new order in a vector of length . For example in level of the tree in Fig. 1, and .

Now let and be the Haar wavelet decomposition filters, and let and be the Haar wavelet reconstruction filters. We notice that replacing each pair by its mean point can be done by filtering the rows of with the low pass filter , followed by decimation of the columns of the outcome by a factor of . For example, the points and in level of the tree in Fig. 1 are obtained by filtering the rows of described above with the filter , and keeping the first and third columns of the produced matrix. Effectively this means that the approximation coefficients corresponding to a single-level Haar decomposition are calculated for each row of the matrix .

Next let , and let and denote the approximation and detail coefficient vectors, respectively, received for at level , where . Also, let denote a linear operator that reorders a vector according to the indices in . Then, applying the Haar transform derived from the tree to can be carried out according to the decomposition algorithm in Algorithm 2. Fig. 2 describes two single-level decomposition steps carried out according to Algorithm 2. We note that here the adaptivity of the transform is related to the permutations applied to the coefficients in every level of the tree. In fact, without the operator applied in each level, the decomposition scheme of Algorithm 2 reduces to that of the common 1D orthogonal wavelet transform. Also, since permutation of points is a unitary transform, the described transform remains unitary.

Task: Apply levels of tree-based wavelet decomposition to the signal . Parameters: We are given the signal , the vectors , and the filters and . Initialization: Set . Main Iteration: Perform the following steps for : Construct the operator that reorders its input vector according to the indices in . Apply to and receive . Filter with and decimate the result by to receive . Filter with and decimate the result by to receive . Output: The approximation coefficients and detail coefficients corresponding to .
Algorithm 2: Tree-based wavelet decomposition algorithm.
Fig. 2: Tree-based wavelet decomposition scheme

Finally, let the linear operator reorder a vector so as to cancel the ordering done by . Then the inverse transform is carried out using the reconstruction algorithm in Algorithm 3. Fig. 3 describes two single-level reconstruction steps carried out according to Algorithm 3.

Task: Reconstruct the signal based on a multilevel tree-based wavelet decomposition. Parameters: We are given the approximation and detail coefficients and , the vectors , and the filters and . Main Iteration: Perform the following steps for : Interpolate by a factor of and filter the result with . Interpolate by a factor of and filter the result with . Sum the results of the two previous steps to receive . Construct the operator that reorders its input vector so as to cancel the index ordering in . Apply to and receive . Output: The reconstructed signal .
Algorithm 3: Tree-based wavelet reconstruction algorithm.
Fig. 3: Tree-based wavelet reconstruction scheme

We next wish to extend the scheme described above to work with general wavelet filters. This requires modifying the tree construction by replacing the Haar filters by different wavelet filters and changing the manner in which the points are ordered in each level of the tree. The ordering procedure used in Algorithm 1 fits only the case when Haar wavelets are used, and therefore needs to be replaced.

We note that when filters other than Haar are used in the tree construction scheme described above, each point in the coarse level is calculated as a weighted mean of more than two points from the finer level , where the coefficients in serve as the weights. Therefore, the resultant graph is no longer a tree but rather a rooted -partite graph, which is a graph that contains disjoint sets of vertices so that no two vertices within the same set are adjacent. As the wavelet scheme described above was originally designed with the Haar wavelet filters, and in order to avoid cumbersome distinction between trees and -partite graphs, with a small abuse of terminology we will hereafter refer to the latter also as trees. An example of such a ”generalized” tree, is shown in Fig. 4.

The wavelet decomposition and reconstruction schemes corresponding to each generalized tree are those described in Algorithms 2 and 3, with the necessary change of wavelet filters type and index vectors to the ones used in the construction of the graph. We next propose a method to order the points in each level of the generalized trees.

Fig. 4: An illustration of a ”generalized” tree.
Task: Construct of a ”generalized” tree from the data . Parameters: We are given the points and the weight function . Initialization: Set as the tree leaves. Main Iteration: Perform the following steps for : Construct a distance matrix , where . Choose a random point and set . Reorder the points so that they will form a smooth path by repeating times: Set and update . Set . Place in a vector the reordered point indices of . Order the points according to the indices in and place them in a matrix . Obtain the points by: Apply the filter to the matrix . Decimate the columns of the outcome by a factor of 2. Output: The tree node points and the vectors containing the points order in each tree level.
Algorithm 4: Construction of a ”generalized” tree from the data .

Iii-B Smoothing

We wish to order the points in each a level of a tree in a manner which results in an efficient representation of the input signal by the tree-based wavelets. More specifically, we want the transformed signal to contain a small number of large coefficients, i.e. to be sparse. The wavelet transform is known to produce a small number of large coefficients when it is applied to piecewise regular signals [2]. Thus, we would like the operator , applied to , to produce a signal which is as regular as possible. When the signal is known, the optimal solution would be to apply a simple sort operation on the corresponding coefficients , obtained in each level. However, since we are interested in the case where is not necessarily known (such as in the case where is noisy, or has missing values), we would try to find a suboptimal ordering operation in each level , using the feature points . We assume that under the distance measure , proximity between the two points and suggests proximity between the coefficients and . Thus, we would try to reorder the points so that they will form a smooth path, hoping that the corresponding reordered 1D signal will also be smooth.

The “smoothness” of a 1D signal of length can be measured using its total variation

(4)

By analogy, we measure the ”smoothness” of the path by the measure

(5)

We notice that smoothing the path comes down to finding the shortest path that passes through the set of points , visiting each point only once. This can be regarded as an instance of the traveling salesman problem [17], which can become very computationally exhaustive for large sets of points. A simple approximate solution is to start from a random point, and then continue from each point to its nearest neighbor, not visiting any point twice. As it turns out, ordering the points in this manner improves the performance of the resultant transform.

A generalized tree construction, which employs the proposed point ordering method is summarized in Algorithm 4. Fig. 4 shows an example of ”generalized” tree which may be obtained with Algorithm 4 using a filter of length 4 and disregarding boundary issues in the different levels. We term the filtering schemes described in Algorithms 2 and 3 combined with the tree construction described in Algorithm 4 generalized tree-based wavelet transform (GTBWT).

(a) (b)
(c) (d)
Fig. 5: -term approximation results (PSNR in dB) for the generalized tree-based, common 1D and 2D separable wavelet transforms, obtained with different wavelet filters: (a) Daubechies 1 (Haar). (b) Daubechies 4. (c) Daubechies 8. (d) Comparison between the generalized tree-based wavelet results obtained with the different filters.

An interesting question is whether the wavelet scheme described above represents the signal more efficiently than the common 1D and 2D separable wavelet transforms. Here, we measure efficiency by the -term approximation error, i.e. the error obtained when representing a signal with non-zero transform coefficients.

Before relating to this question, we first describe how the wavelet scheme described above can be applied to images. Let be a grayscale image of size and let be its column stacked representation, i.e. is a vector of length containing individual pixel intensities. Then we first need to extract the feature points from the image, which will be later used to construct the tree. Let be the -th sample in , then we choose the point associated with it as the patch around the location of in the image . We next construct several trees, each with a different wavelet filter, according to the scheme described above. We choose the weight function to be the squared Euclidean distance, i.e. the element in is

(6)

We use the transforms corresponding to these trees to obtain -term approximations of a column stacked version of the image shown in Fig. 7(a) (the center of the Lena image). The approximations, shown in Fig. 5, were carried out by keeping the highest coefficients (in absolute value) in the different transform domains. We compare these results between themselves, and to the -term approximations obtained with the common 1D and the 2D separable wavelet transforms (the latter applied to the original image) corresponding to the same wavelet filters. Fig. 5 shows -term approximation results (PSNR) obtained for the image in Fig. 7

(a) using different wavelet filters. It can be seen that the generalized tree-based wavelet transform outperforms the two other transforms for all the wavelet filters that have been used. It can also be seen that the PSNR gap in the first thousands of coefficients increases with number of vanishing moments of the wavelet filters. Further, Figure

5(d) shows that the generalized tree-based wavelet results obtained with the db4 and db8 filters are close, and better than the ones obtained with the Haar filter.

Before concluding this section, we are interested to see how the basis elements of the proposed wavelet transform look like. Each basis element is obtained by setting the corresponding transform coefficient to 1 and all the other coefficients to zero, and applying the inverse transform. Unlike the case of common 1D wavelet bases, the series of data derived permutations applied to the reconstructed coefficients in the different stages of the inverse transform, result in basis functions which adapt themselves to the input signal .

As a first example, we examine the basis elements obtained for the synthetic image of size , shown in Fig 6(a). The image contains in its middle a square rotated in an angle of 45 degrees. We visualize the basis elements as images by reshaping them to the size of the input image. Figs 6(b)-(n) show some of the basis elements corresponding to the synthetic image, and obtained with the Symmlet 8 filter. The figures show two scaling functions from level , and two wavelet basis functions from each level , all corresponding to the two largest coefficients in the same level. We note that the reason we have more than one scaling function and one wavelet basis element in coarsest level

is related to our implementation of the transform. We use symmetric padding in the signal boundaries before applying it the wavelet filters, which slightly increases the number of coefficients (and corresponding basis elements) obtained in each level of the tree. It can be seen how the scaling functions and wavelets in the low levels of the tree (

) represent the low frequency information in the image, i.e. smooth versions of the rectangle. It can also be seen that the wavelet functions represent finer edges as the level of the tree increases, and that they successfully manage to capture edges which are not aligned with the vertical and horizontal axes.

We next examine the wavelet basis elements corresponding to the image in Fig. 7(a), obtained with the Symmlet 8 filter. Figs. 7(b)-(p) show two scaling functions from level , and two wavelet basis functions from each level , all corresponding to the two largest coefficients in the same level. It can be seen how the basis functions adapt to the shapes in the images. Here again the scaling functions and wavelets in the low levels of the tree () represent the low frequency information in the image, and the wavelet functions represent finer edges in the image as the level of the tree increases. We next present the application of the generalized tree-based wavelet transform to image denoising.

(a) (b) (c) (d) (e) (f) (g)
(h) (i) (j) (k) (l) (m) (n)
Fig. 6: Generalized tree-based wavelet basis elements derived from a synthetic image: (a) the original image. (b) scaling functions (=1). (c) wavelets (=1). (d) wavelets (=2). (e) wavelets (=3). (f) wavelets (=4). (g) wavelets (=5). (h) wavelet (=6). (i) wavelet (=7). (j) wavelet (=8). (k) wavelets (=9). (l) wavelets (=10). (m) wavelets (=11). (n) wavelets (=12).
(a) (b) (c) (d) (e) (f)
(g) (h) (i) (j) (k) (l)
(m) (n) (o) (p)
Fig. 7: Generalized tree-based wavelet basis elements derived from an image: (a) the original image. (b) scaling functions (=1). (c) wavelets (=1). (d) wavelets (=2). (e) wavelets (=3). (f) wavelets (=4). (g) wavelets (=5). (h) wavelets (=6). (i) wavelets (=7). (j) wavelets (=8). (k) wavelets (=9). (l) wavelets (=10). (m) wavelets (=11). (n) wavelets (=12). (o) wavelets (=13). (p) wavelets (=14).

Iv Image Denoising using the Generalized tree-Based Wavelet Transform

Iv-a The Basics

Let be an image of size , and let be its noisy version:

(7)

is a matrix of size which denotes an additive white Gaussian noise independent of

with zero mean and variance

. Also, let and be the column stacked representations of and , respectively. Our goal is to reconstruct from using the generalized tree-based wavelet transform. To this end, we first extract the feature points from similarly to the way they were extracted from in the previous section. Let be the -th sample in , then here we choose the point associated with it as the patch around the location of in the image . We note that since different features are used for the clear and noisy images, the corresponding trees derived from these images will also be different. Nevertheless, it was shown in [18],[19],[20] that the distance between the noisy patches and is a good predictor for the similarity between the clear versions of their middle pixels and . This means that the tree construction is roughly robust to noise, and that the obtained tree will capture the geometry and structure of the clear image quite well. We will further discuss the noise robustness of the algorithm in Section IV-C.

We next construct the tree according to the scheme described in Algorithm 4, where the dissimilarity between the points and in level is again measured by the squared Euclidean distance between them, which can be found in the location in .

Similarly to Gavish et al. [1], we use an approach which resembles the “cycle spinning” method [21] in order to smooth the image denoising outcome. This means that we randomly construct different trees, utilize the transforms corresponding to each of them to denoise , and average the produced images. Each level of a random tree is constructed by choosing the first point at random, and then continue from each point to its nearest neighbor

with a probability

, or to its second nearest neighbor with a probability , where here we set .

The denoising itself is performed by applying the proposed wavelet transform (derived from the tree of ) on , using hard thersholding on the transform coefficients, and computing the inverse transform.

In order to assess the performance of the proposed denoising algorithm we first apply it, using different wavelet filters, to a noisy version of the image shown in Fig. 7(a). For each of the transforms we perform

experiments for different realizations of noise with standard deviation

and average the results – these averages are given in Table I. We note that the denoising thresholds were manually found to produce good denoising results, as the theoretical wavelet threshold led to poorer results, with PSNR values which were lower by about 0.6 dB. It can be seen that better results are obtained with the Symmlet wavelets, and that generally better results are obtained with wavelets with a high number of vanishing moments. It can also be seen that all the transforms require about coefficients to represent the image (for each of the 10 results produced by the different trees) which is about percents of the number of coefficients required in the original space.

We next apply the proposed scheme with the Symmlet 8 wavelet to noisy versions of the images Lena and Barbara, with noise standard deviation and PSNR of dB. We note that this time we use patches of size and perform only one experiment for each image, for one realization of noise. The clear, noisy and recovered images can be seen in Fig. 8. For comparison reasons, we also apply to the two images the denoising scheme of Elad et al. [16], which utilize the K-SVD algorithm [22]. We chose to compare our results to the ones obtained by this scheme, as it also employs an efficient representation of the image content, and is one of many state-of-the-art image denoising algorithms [23],[24], which are based on sparse and redundant representations [25]. However, we note that this scheme is based on efficient representations of the image patches, while our scheme is based on efficient representation of the whole image. The PSNR of the results obtained with the proposed scheme and the K-SVD algorithm are shown in Table II. It can be seen that the results obtained by our algorithm are inferior compared to the ones obtained with the K-SVD. We next try to improve the results produced by our proposed scheme by adding an averaging element into it, which is also a variation on the “cycle spinning” method.

Iv-B Subimage averaging

Let be an matrix, containing column stacked versions of all the patches inside the image. When we built a tree for the image, we assumed that each patch is associated only with its middle pixel. Therefore the tree was associated with the signal composed of the middle points in the patches, which is the middle row of , and the transform was applied to this signal. However, we can alternatively choose to associate all the patches with a pixel located in a different position, for example the top right pixel in each patch. Since the whole patches are used in the construction of the tree, effectively this means that the tree can be associated with any one of the signals located in the rows of . These signals are the column stacked versions of the subimages of size , whose top right pixel reside in the top right patch in the image. We next apply the generalized tree-based wavelet transform denoising scheme to each of these signals. We then plug each subimage into its original place in the image, and average the different values obtained for each pixel, similarly to the way an image is reconstructed from its patches in the K-SVD image denoising scheme [16]. We hereafter refer to scheme described above as Subimage Averaging (SA).

The subimage averaging scheme can also be viewed a little differently. The tree also defines a wavelet transform on the data points , where the approximation coefficient vectors in level are the points . We denote detail coefficients vectors in level as . These vectors can be calculated by applying the filter to the matrix and decimating the columns of the outcome by a factor of 2. A matrix containing and all the detail coefficient vectors can also be obtained by applying the generalized tree-based wavelet transform to each row of the matrix . Similarly, the inverse transform can be performed on by applying each row of this matrix the generalized tree–based wavelet inverse transform. Therefore the subimage averaging scheme can be viewed as applying the transform derived from the tree directly on the image pathes, performing hard thresholding on them, applying the inverse transform to the coefficient vectors, and reconstructing the image from the clean patches.

The results obtained with the generalized tree-based wavelet transform, combined with the subimage averaging procedure, for the noisy version of the image in Fig. 7(a) are shown in Table I. It can be seen that this procedure increases the PSNR of the results by about dB. We note that since we use patches, each of the 10 images averaged in the cycle spinning procedure is reconstructed from 81 subimages, and it can be seen that on average about 360 transform coefficients are required to represent each one of these subimages. Thus the number of transform coefficients required to represent each subimage is slightly higher than the one required to represent an image when the subimage averaging is not used, but is still much lower than the number required in the original space.

We also applied the proposed denoising scheme to the noisy Lena and Barbara images. The reconstructed images are shown in Figs. 8(d) and (h), and Table II shows the corresponding PSNR results. It can be seen that the results obtained with the proposed algorithm for the Lena image are now closer to the ones received by the K-SVD algorithm, and the results obtained for the Barbara image are better than the ones obtained with the K-SVD algorithm.

We note that only a subset of design parameters has been explored when we applied the proposed image denoising algorithm. These design parameters include the patch size, the wavelet filter type (we have not explored biorthogonal wavelet filters at all), and the number of trees averaged by the cycle spinning method. They also include the number of nearest neighbors to choose from in the construction process of each level of these trees, and the parameter used to determine the probabilities of choosing them. We believe that better results may be achieved with the adequate design parameters, and therefore more tests need to performed in order to search for the parameters which optimize the performance of the algorithm. We also note that the use of the proposed sub-image averaging scheme is not restricted to the generalized tree-based wavelet transform. This scheme may be used to improve the denoising results obtained with different methods which use distance matrices similar to the one employed here.

GTBWT with SA
wavelet type PSNR [dB] #coeffs PSNR [dB] #coeffs
db1 (Haar) 28.19 303.28 29.05 349.02
db4 28.5 307.62 29.37 341.25
db8 28.49 384.08 29.33 403.2
sym2 28.45 293.84 29.25 326.48
sym4 28.51 308.6 29.38 349.81
sym8 28.55 342.76 29.39 392.77
TABLE I: Denoising results of a noisy version of the image in Fig. 7(a) (, input PSNR=20.18 dB), obtained using the GTBWT, with and without subimage averaging (SA), and with different wavelet filters. Also shown is the average number of coefficients (#coeffs.) used by each scheme to represent an image.
(a) (b) (c) (d)
(e) (f) (g) (h)
Fig. 8: Denoising results of noisy versions of the images Barbara and Lena (, input PSNR=20.17 dB) obtained with GTBWT with a Symmlet 8 filter with subimage averaging (SA) and without it: (a) Original Lena. (b) Noisy Lena (20.17 dB). (c) Lena denoised using GTBWT (30.3 dB). (d) Lena denoised using GTBWT with sub image averaging (31.21 dB) (e) Original Barbara. (f) Noisy Barbara (20.17 dB). (g) Barbara denoised using GTBWT (28.94 dB). (h) Barbara denoised using GTBWT with sub image averaging (29.82 dB).
Image GTBWT GTBWT + SA K-SVD
Lena 30.3 31.21 31.32
Barbara 28.94 29.82 29.62
TABLE II: Denoising results (PSNR in dB) of noisy versions of the images Barbara and Lena (, input PSNR=20.17 dB) obtained with: 1) GTBWT with a Symmlet 8 filter with subimage averaging (SA) and without it. 2) with the K-SVD algorithm.

Iv-C Robustness to noise

We wish to explore the robustness of the generalized tree-based wavelet denoising scheme to the noise in the points . More specifically we wish to check how using cleaner image patches will affect the denoising results. Here we obtained ”clean” patches from a noisy image by applying our denoising scheme once, and using the patches from the reconstructed image in a second iteration of the denoising algorithm for defining the permutations. For comparison reasons we also used clean patches from the original clean image in our denoising scheme, and regarded the obtained results as oracle estimates.

We applied the GTBWT denoising schemes, obtained with patches from the clean and reconstructed images, to the noisy Barbara and Lena images. The results obtained with the Symmlet 8 filter and with subimage averaging are shown in Table III. It can be seen that large improvements of about dB for the Lena image and dB for the Barbara image are obtained with clean patches. Applying iterations of the denoising algorithm slightly improves the results of the Lena image by about dB. However, applying iterations of the denoising algorithm to the Barbara image degrades the obtained results by about dB. This degradation probably results from oversmoothing of the patches of the reconstructed image, which leads to a loss of details. A choice of a different threshold in the denoising algorithm applied in first iteration, or a different method to clean the patches altogether, may lead to improved denoising results.

Image 1 iter 2 iters clean
Lena 31.21 31.34 32.25
Barbara 29.62 29.71 30.61
TABLE III: Denoising results (PSNR in dB) of noisy versions of the images Barbara and Lena (, input PSNR=20.17 dB) obtained using the GTBWT with subimage averaging. The trees were constructed using patches obtained from the noisy (1 iter), original (clean) and reconstructed (2 iters) images.

V conclusion

We have proposed a new wavelet transform applicable to graphs and high dimensional data. This transform is an extension of the multiscale harmonic analysis approach proposed by Gavish et al. [1]. We have shown a relation between the transform suggested by Gavish et al. and 1D Haar wavelet filtering, and extended the scheme so it will use general wavelet filters. We demonstrated the ability of the generalized scheme to represent images more efficiently than the common 1D and separable 2D wavelet transforms. We have also shown that our proposed scheme can be used for image denoising, and that combined with a subimage averaging scheme it achieves denoising results which are close to the state-of-the-art.

In our future work plans, we intend to consider the following issues:

  1. Seek ways to improve the method that reorders the approximation coefficients in each level of the tree, replacing the proposed nearest neighbor method.

  2. Extend this work to redundant wavelets.

  3. Improve the image denoising results by using two iterations with different threshold settings, and by considering spatial proximity as well in the tree construction.

Acknowledgements

The authors thank Ron Rubinstein for the fruitful discussions and ideas which helped in developing the presented work. The authors also thank the anonymous reviewers for their helpful comments.

References

  • [1]

    M. Gavish, B. Nadler, and R. R. Coifman, “Multiscale wavelets on trees, graphs and high dimensional data: Theory and applications to semi supervised learning,” in

    Proceedings of the 27th International Conference on machine Learning

    , 2010.
  • [2] S. Mallat, A Wavelet Tour of Signal Processing, The Sparse Way.   Academic Press, 2009.
  • [3] F. Murtagh, “The Haar wavelet transform of a dendrogram,” Journal of Classification, vol. 24, no. 1, pp. 3–32, 2007.
  • [4] A. B. Lee, B. Nadler, and L. Wasserman, “Treelets -An adaptive multi-scale basis for sparse unordered data,” Annals, vol. 2, no. 2, pp. 435–471, 2008.
  • [5] G. Chen and M. Maggioni, “Multiscale geometric wavelets for the analysis of point clouds,” in Information Sciences and Systems (CISS), 2010 44th Annual Conference on.   IEEE, 2010, pp. 1–6.
  • [6] R. R. Coifman and M. Maggioni, “Diffusion wavelets,” Applied and Computational Harmonic Analysis, vol. 21, no. 1, pp. 53–94, 2006.
  • [7] D. K. Hammond, P. Vandergheynst, and R. Gribonval, “Wavelets on graphs via spectral graph theory,” Applied and Computational Harmonic Analysis, 2010.
  • [8] F. R. K. Chung, “Spectral graph theory,” vol. 92, CBMS Regional Conference Series in Mathematics.   AMS Bookstore, 1997.
  • [9]

    U. Von Luxburg, “A tutorial on spectral clustering,”

    Statistics and Computing, vol. 17, no. 4, pp. 395–416, 2007.
  • [10] M. Jansen, G. P. Nason, and B. W. Silverman, “Multiscale methods for data on graphs and irregular multidimensional situations,” Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol. 71, no. 1, pp. 97–125, 2009.
  • [11] W. Sweldens, “The lifting scheme: A custom-design construction of biorthogonal wavelets,” Applied and Computational Harmonic Analysis, vol. 3, no. 2, pp. 186–200, 1996.
  • [12] ——, “The lifting scheme: A construction of second generation wavelets,” SIAM Journal on Mathematical Analysis, vol. 29, no. 2, p. 511, 1998.
  • [13] K. Egiazarian and J. Astola, “Tree-structured Haar transforms,” Journal of Mathematical Imaging and Vision, vol. 16, no. 3, pp. 269–279, 2002.
  • [14] S. Mallat, “Geometrical grouplets,” Applied and Computational Harmonic Analysis, vol. 26, no. 2, pp. 161–180, 2009.
  • [15] G. Plonka, “The Easy Path Wavelet Transform: A New Adaptive Wavelet Transform for Sparse Representation of Two-Dimensional Data,” Multiscale Modeling & Simulation, vol. 7, p. 1474, 2009.
  • [16] M. Elad and M. Aharon, “Image denoising via sparse and redundant representations over learned dictionaries,” IEEE Transactions on Image Processing, vol. 15, no. 12, pp. 3736–3745, 2006.
  • [17] T. H. Cormen, Introduction to algorithms.   The MIT press, 2001.
  • [18] A. Buades, B. Coll, and J. M. Morel, “A review of image denoising algorithms, with a new one,” Multiscale Modeling and Simulation, vol. 4, no. 2, pp. 490–530, 2006.
  • [19] A. D. Szlam, M. Maggioni, and R. R. Coifman, “Regularization on graphs with function-adapted diffusion processes,” The Journal of Machine Learning Research, vol. 9, pp. 1711–1739, 2008.
  • [20] A. Singer, Y. Shkolnisky, and B. Nadler, “Diffusion interpretation of non-local neighborhood filters for signal denoising,” SIAM J. on Imag. Sci, vol. 2, no. 1, pp. 118–139, 2009.
  • [21] R. R. Coifman and D. L. Donoho, Wavelets and Statistics.   Springer-Verlag, 1995, ch. Translation-invariant de-noising, pp. 125–150.
  • [22] M. Aharon, M. Elad, and A. Bruckstein, “K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation,” IEEE Transactions on signal processing, vol. 54, no. 11, p. 4311, 2006.
  • [23] J. Mairal, M. Elad, G. Sapiro et al., “Sparse representation for color image restoration,” IEEE Transactions on Image Processing, vol. 17, no. 1, p. 53, 2008.
  • [24] J. Mairal, F. Bach, J. Ponce, G. Sapiro, and A. Zisserman, “Non-local sparse models for image restoration,” in Computer Vision, 2009 IEEE 12th International Conference on.   IEEE, 2010, pp. 2272–2279.
  • [25] A. M. Bruckstein, D. L. Donoho, and M. Elad, “From sparse solutions of systems of equations to sparse modeling of signals and images,” SIAM review, vol. 51, no. 1, pp. 34–81, 2009.