Signal processing problems may involve inference of an unknown scalar target function defined on a non-uniformly sampled high-dimensional grid, a graph or a network. A major challenge in processing functions on topologically complicated coordinates is to find efficient methods to represent and learn them. Let $\mathbf{X} = \{\mathbf{x}_1, \ldots, \mathbf{x}_N\}$ be the dataset such that $\mathbf{x}_i \in \mathbb{R}^n$ are points in high dimension, or feature points associated with the nodes of a weighted graph or network. Also, let $f: \mathbf{X} \to \mathbb{R}$ be a scalar function defined on the above coordinates, and let $\mathbf{f} = [f_1, \ldots, f_N]^T$, where $f_i = f(\mathbf{x}_i)$. A key assumption in this work is that under a distance measure $w(\cdot, \cdot)$ in $\mathbf{X}$, proximity between the two coordinates $\mathbf{x}_i$ and $\mathbf{x}_j$ implies proximity between their corresponding values $f_i$ and $f_j$. The goal in this work is to develop a redundant wavelet transform that can efficiently represent the high-dimensional function $f$. Efficiency here implies sparsity, i.e., representing $f$ accurately with as few wavelet coefficients as possible.
In our previous work [1], we introduced the generalized tree-based wavelet transform (GTBWT), which is a critically sampled (in fact, unitary) wavelet transform applicable to functions defined on an irregularly sampled grid of coordinates. We have shown that this transform requires fewer coefficients than both the one-dimensional (1D) and two-dimensional (2D) separable wavelet transforms to represent an image, and is useful for image denoising. The main limitation of the GTBWT is its sensitivity to translation. Indeed, in order to obtain a smooth denoising result in [1], we utilized a redundant representation obtained by applying several random variants of the GTBWT to the noisy image. This approach is effectively similar to applying a redundant transform to the image, but in a rather cumbersome and computationally intensive manner.
In this paper, we introduce a redundant tree-based wavelet transform (RTBWT), which extends the redundant wavelet transform [2]–[6] to scalar functions defined on high-dimensional data clouds, graphs and networks. This transform is obtained by modifying an implementation of the redundant wavelet transform, which was proposed by Shensa [5] and Beylkin [6], similarly to the way we modified the decomposition scheme of the orthonormal transform in [1]. This implementation employs a filter-bank decomposition scheme, similarly to the orthonormal discrete wavelet transform. However, in each level of this scheme none of the coefficients are discarded. In each decomposition level we add linear operators that reorder the approximation coefficients. These operators are data-dependent, and are obtained using tree-like structures constructed from the data points. Each reordering operator is derived by organizing the tree-node features in the corresponding level of the tree so as to shorten the path that passes through these points. The reordering operators increase the regularity of the permuted approximation coefficient signals, which makes their representation with the proposed wavelet transform more efficient (sparse).
We explore the use of the proposed transform for image denoising, and show that it outperforms the algorithm proposed by Elad and Aharon [7], [8], which is based on the K-SVD algorithm [9], [10], and achieves denoising results that are similar to those obtained with the BM3D algorithm [11]. We also show that the RTBWT and GTBWT achieve similar denoising results, while the former is computationally less demanding.
The paper is organized as follows. In Section II, we introduce the proposed redundant tree-based wavelet transform. In Section III, we explore the use of this transform for image denoising, and present experimental results that demonstrate its advantages. We summarize the paper in Section IV.
II Redundant Tree-Based Wavelet Transform
II-A Decomposition and Reconstruction Schemes
We wish to develop a redundant wavelet transform that efficiently (sparsely) represents its input signal, defined on an irregularly sampled grid of coordinates. To this end, we extend the redundant wavelet transform, similarly to the way we extended the orthonormal transform in [1]. We note that we construct our proposed transform by modifying an implementation of the redundant wavelet transform as proposed by Shensa [5] and Beylkin [6], and not the well-known algorithme à trous [2]–[4]. This implementation employs a filter-bank decomposition scheme, similarly to the orthonormal discrete wavelet transform. However, in each level of this scheme all the coefficients are retained, since the highpass bands do not contain decimators, and the decimation in the lowpass bands is replaced by a split into even and odd sequences, which are further decomposed in the next decomposition level.
Fig. 1 describes the decomposition scheme of our proposed redundant wavelet transform. We denote the coarsest decomposition level and the finest level . and denote the approximation and detail coefficients in level , respectively. We start with the finest decomposition level, , and apply the linear operator , which produces a permuted version of its input vector. denotes a linear operator that operates in the th band, out of bands, in the th decomposition level, and produces a permuted version of its input vector. These operators make the difference between our proposed wavelet decomposition scheme and the common redundant wavelet transform [5], [6]. As we explain later, these operators “smooth” the approximation coefficients in the different levels of the decomposition scheme. Next, we apply the wavelet decomposition filters and on , and obtain the vectors and , respectively. Let and denote decimators that keep the odd and even samples of their input, respectively. Then we employ these decimators to obtain the signals and . These two vectors are used as inputs for the next decomposition level.
We continue in a similar manner in the following decomposition levels. Let denote an approximation coefficients vector, which is found in the th band (out of bands), in the th decomposition level. This vector is obtained by starting from the th sample in , and keeping every th sample. Then in the th decomposition level we decompose each of the vectors . We first apply on each vector the linear operator and obtain a permuted version . We then filter with and and obtain the vectors and , respectively. Finally, we employ the decimators and to split each of the vectors into even and odd sequences, respectively, and obtain the set of vectors .
In a similar manner, Fig. 2 describes the reconstruction scheme of our redundant wavelet transform. and , where and denote the wavelet reconstruction filters, and the interpolators denoted by and place the samples of their input vector in the odd and even locations of their output vector, respectively. Finally, the linear operator reorders a vector so as to cancel the ordering done by , i.e. . We next describe how the linear operators are determined in each level of the transform.
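The following is a minimal sketch of a single decomposition level and its inverse, written under several assumptions that are not taken from the paper: circular boundary handling, Haar filters, 0-based indexing for the even/odd split, and a reconstruction that averages the lowpass and highpass adjoint outputs (valid for orthonormal filter pairs). The function names `decompose_level` and `reconstruct_level` are hypothetical, and the permutation is random here only to show that any reordering can be undone.

```python
import numpy as np

def circ_filter(x, taps):
    """Circular convolution: y[n] = sum_k taps[k] * x[n - k]."""
    return sum(t * np.roll(x, k) for k, t in enumerate(taps))

def circ_adjoint(y, taps):
    """Adjoint of circ_filter: z[n] = sum_k taps[k] * y[n + k]."""
    return sum(t * np.roll(y, -k) for k, t in enumerate(taps))

def decompose_level(a, perm, h, g):
    """Permute, filter, keep all detail samples, split the approximation."""
    a_p = a[perm]                 # reordering operator applied to the input
    c = circ_filter(a_p, h)       # lowpass output, not decimated
    d = circ_filter(a_p, g)       # highpass output, fully retained
    return c[0::2], c[1::2], d    # even/odd split replaces decimation

def reconstruct_level(c_even, c_odd, d, perm, h, g):
    """Interleave the two approximation bands, invert the filters, undo perm."""
    c = np.empty(d.shape[0])
    c[0::2] = c_even
    c[1::2] = c_odd
    # For orthonormal filters the two adjoint outputs sum to twice the input.
    a_p = 0.5 * (circ_adjoint(c, h) + circ_adjoint(d, g))
    a = np.empty_like(a_p)
    a[perm] = a_p                 # inverse permutation
    return a

# Haar filters (an assumption; the experiments in the paper use Symmlet 8).
h = np.array([1.0, 1.0]) / np.sqrt(2.0)
g = np.array([1.0, -1.0]) / np.sqrt(2.0)

rng = np.random.default_rng(0)
a = rng.standard_normal(8)
perm = rng.permutation(8)
c_even, c_odd, d = decompose_level(a, perm, h, g)
a_rec = reconstruct_level(c_even, c_odd, d, perm, h, g)
print(np.allclose(a, a_rec))
```

In a multi-level scheme, `c_even` and `c_odd` would each be passed to `decompose_level` again with their own permutations, which is what doubles the number of bands at every level.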
II-B Building the Operators
We wish to design the operators in a manner which results in an efficient (sparse) representation of the input signal by the proposed transform. The wavelet transform is known to produce a small number of large coefficients when it is applied to piecewise regular signals [2]. Thus, we would like the operator , applied to , to produce a signal which is as regular as possible. We start with the finest level, and try to find the permutation that the operator applies to . When the signal is known, the optimal solution would be to apply a simple sort operation. However, since we are interested in the case where is not necessarily known (such as in the case where is noisy, or has missing values), we try to find a suboptimal ordering operation, using the feature coordinates .
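We recall our assumption that the distance $w(\mathbf{x}_i, \mathbf{x}_j)$ predicts the proximity between the samples $f_i$ and $f_j$. Thus, we try to reorder the points $\mathbf{x}_i$ so that they form a smooth path, hoping that the corresponding reordered 1D signal $\mathbf{f}^p$ will also be smooth. The “smoothness” of the reordered signal can be measured using its total variation measure

$$\|\mathbf{f}^p\|_{TV} = \sum_{j=2}^{N} \left| f^p_j - f^p_{j-1} \right|.$$

Let $\{\mathbf{x}^p_j\}_{j=1}^{N}$ denote the points in their new order, so that $f^p_j = f(\mathbf{x}^p_j)$. Then by analogy, we measure the “smoothness” of the path through the points $\mathbf{x}^p_j$ by the measure

$$X^p_{TV} = \sum_{j=2}^{N} w(\mathbf{x}^p_j, \mathbf{x}^p_{j-1}).$$

We note that in the case that $w$ is the Euclidean distance and $f$ is Lipschitz continuous, i.e., there exists a real constant $K$ such that

$$|f(\mathbf{x}_i) - f(\mathbf{x}_j)| \le K \|\mathbf{x}_i - \mathbf{x}_j\|$$

for all $\mathbf{x}_i$ and $\mathbf{x}_j$ in $\mathbf{X}$, then

$$\|\mathbf{f}^p\|_{TV} \le K X^p_{TV},$$

which means that $K X^p_{TV}$ is an upper bound for $\|\mathbf{f}^p\|_{TV}$.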
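As a numerical illustration of this bound, the sketch below compares the total variation of a reordered signal with the Lipschitz constant times the Euclidean path length through the reordered points. The test function, its Lipschitz constant, the random point cloud and the helper names `total_variation` and `path_length` are all assumptions chosen for the example, not quantities from the paper.

```python
import numpy as np

def total_variation(f):
    """Sum of absolute differences between consecutive samples."""
    return float(np.sum(np.abs(np.diff(f))))

def path_length(X):
    """Sum of Euclidean distances between consecutive points (rows of X)."""
    return float(np.sum(np.linalg.norm(np.diff(X, axis=0), axis=1)))

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 3.0, size=(200, 2))   # hypothetical 2D point cloud

# Hypothetical smooth function; its gradient is (cos(x0), 0.5), so its norm
# is at most sqrt(1 + 0.25), which serves as a Lipschitz constant K.
f = np.sin(X[:, 0]) + 0.5 * X[:, 1]
K = np.sqrt(1.25)

perm = rng.permutation(X.shape[0])          # any ordering of the points
tv = total_variation(f[perm])
bound = K * path_length(X[perm])
print(tv <= bound)
```

Because the bound holds for every ordering, shortening the path through the points lowers a guaranteed ceiling on the total variation of the reordered signal.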
Minimizing the path measure comes down to finding the shortest path that passes through the set of points, visiting each point only once. This can be regarded as an instance of the traveling salesman problem [12], which can become computationally very expensive for large sets of points. We choose a simple approximate solution, which is to start from an arbitrary point (random or pre-defined), and continue from each point to its nearest neighbor, not visiting any point twice. The permutation applied by the operator is defined as the order in the found path.
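The greedy nearest-neighbor search described above can be sketched as follows. This is an illustrative brute-force version under the assumption of squared Euclidean distances and an unrestricted search over all unvisited points; the function name `nearest_neighbor_order` is hypothetical.

```python
import numpy as np

def nearest_neighbor_order(X, start=0):
    """Greedy approximate shortest path through the rows of X.

    Starts at index `start` and repeatedly moves to the closest point
    that has not been visited yet. Returns the visiting order.
    """
    n_points = X.shape[0]
    visited = np.zeros(n_points, dtype=bool)
    order = [start]
    visited[start] = True
    for _ in range(n_points - 1):
        # Squared Euclidean distances from the current point to all points.
        dist = np.sum((X - X[order[-1]]) ** 2, axis=1)
        dist[visited] = np.inf            # never revisit a point
        nxt = int(np.argmin(dist))
        order.append(nxt)
        visited[nxt] = True
    return np.array(order)

# Five points on a line; the greedy path sweeps left to right.
X = np.array([[0.0], [10.0], [1.0], [9.0], [2.0]])
order = nearest_neighbor_order(X, start=0)
print(order)
```

Each step scans all points, so this version computes on the order of the square of the number of points in distances; restricting the candidates to a local neighborhood reduces that cost.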
In order to employ the aforementioned method to find the operators and in the th decomposition level, we again require feature points in order to predict the proximity between the samples of . Since and are obtained from through filtering and subsampling, each approximation coefficient is in fact calculated as a weighted mean of coefficients from , where the coefficients in serve as the weights. Thus, we calculate the feature point , which corresponds to , by replacing each coefficient in this weighted mean by its corresponding feature point . We then employ the approximate shortest path search method described above to obtain the operators and using the feature points that correspond to the coefficients in and , respectively.
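A minimal sketch of this step, assuming circular boundary handling and 0-based even/odd indexing: the same lowpass filtering and even/odd split that produce the approximation coefficients are applied to the reordered feature points, one feature dimension at a time. The function name `coarse_feature_points` and the two-tap averaging filter in the example are assumptions made for illustration.

```python
import numpy as np

def coarse_feature_points(X_perm, h):
    """Feature points for the two approximation bands of the next level.

    X_perm : (N, n) array of feature points, already reordered.
    h      : lowpass filter taps used as the weights.
    Row m of the filtered array is sum_k h[k] * X_perm[m - k] (circular),
    mirroring how each approximation coefficient is computed.
    """
    filtered = sum(t * np.roll(X_perm, k, axis=0) for k, t in enumerate(h))
    return filtered[0::2], filtered[1::2]   # even band, odd band

# Four 2D feature points and a hypothetical two-tap averaging filter.
X_perm = np.array([[0.0, 0.0], [2.0, 2.0], [4.0, 4.0], [6.0, 6.0]])
h = np.array([0.5, 0.5])
feat_even, feat_odd = coarse_feature_points(X_perm, h)
print(feat_even)
print(feat_odd)
```

Each band's feature points would then be passed to the approximate shortest path search to obtain that band's reordering operator.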
We continue in a similar manner in the following decomposition levels. In level we first obtain the feature points as weighted means of feature points from the finer level . Then we use these feature points to obtain the operators , running the approximate shortest path searches. Similarly to the GTBWT decomposition scheme [1], the relation between the feature points in a full decomposition can be described using tree-like structures. Each such “generalized” tree contains all the feature points which have participated in the calculation of a single feature point from the coarsest decomposition level. Also, each feature point in the tree level is connected to all the points in level that were averaged in its construction. Fig. 3 shows an example of a “generalized” tree, which may be obtained for a dataset of length , using a filter of length and disregarding boundary issues in the different levels. As the construction of these tree-like structures plays an integral part in our proposed transform, we term it the redundant tree-based wavelet transform (RTBWT).
We also note that the computational complexities of both the RTBWT and the GTBWT are dominated by the number of distances that need to be calculated in their wavelet decomposition schemes. In [1] we employed the orthonormal transforms corresponding to several randomly constructed trees in order to apply a redundant transform. A full RTBWT decomposition, corresponding to a redundancy factor of , requires the calculation of distances. The method employed in [1] requires distance calculations in order to obtain a transform with a similar redundancy factor. Therefore, for large it requires about times more distance calculations than the RTBWT. We next demonstrate the application of our proposed transform to image denoising.
III Image Denoising Using RTBWT
Let $Y$ be an image of size $N_1 \times N_2$ where $N_1 N_2 = N$, and let $\tilde{Y}$ be its noisy version:

$$\tilde{Y} = Y + V.$$

$V$ denotes an additive white Gaussian noise independent of $Y$ with zero mean and variance $\sigma^2$. Also, let $\mathbf{y}$ and $\tilde{\mathbf{y}}$ be vectors containing the pixels of $Y$ and $\tilde{Y}$, respectively, arranged in the same lexicographical order. Our goal is to reconstruct $\mathbf{y}$ from $\tilde{\mathbf{y}}$ using the RTBWT. To this end, we first construct a padded image by padding $\tilde{Y}$ with mirror reflections of itself, and then extract the feature points from the padded image. Let $\tilde{y}_i$ be the $i$th sample in $\tilde{\mathbf{y}}$; then we choose the point associated with it as the patch around the location of $\tilde{y}_i$ in the padded image. Next, we obtain all the operators employed by the RTBWT using the scheme described above. We use the squared Euclidean distance to measure the dissimilarity between the feature points in each level, and restrict the nearest neighbor searches performed for each patch to a surrounding square neighborhood of patches. This restriction decreases the computational complexity of the transform, and our experiments showed that with a proper choice of the neighborhood size it also leads to improved denoising results.
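The patch-based feature extraction can be sketched as below. The details are assumptions made for illustration: an odd patch side centered on each pixel, NumPy's `symmetric` padding mode as the mirror reflection, and row-major pixel order as the lexicographical order. The function name `extract_patch_features` is hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def extract_patch_features(img, side):
    """Return one feature vector (a flattened patch) per pixel of img.

    img  : 2D array (the noisy image).
    side : odd patch side length; each patch is centered on its pixel.
    Output shape is (img.size, side * side), rows in row-major pixel order.
    """
    r = side // 2
    padded = np.pad(img, r, mode="symmetric")            # mirror reflections
    windows = sliding_window_view(padded, (side, side))  # (H, W, side, side)
    return windows.reshape(img.shape[0] * img.shape[1], side * side)

img = np.arange(20, dtype=float).reshape(4, 5)   # small stand-in image
features = extract_patch_features(img, side=3)
print(features.shape)
```

With this layout the middle entry of each feature vector is the pixel itself, so the feature matrix and the pixel vector share the same ordering.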
Now let be a matrix of size , containing column stacked versions of all the patches inside the image. In [1] we observed that improved denoising results are obtained by using in the denoising process all the signals located in the rows of . These signals are the column stacked versions of the images of size , whose top left pixel resides in the top left patch in the image . We apply a level decomposition to each of these subimages, and obtain coefficient matrices of size . We chose to perform a level decomposition, which corresponds to a redundancy factor of , over a full decomposition, as the former is less demanding in computation and memory, but produces similar denoising results. Next, we zero in each such matrix all the columns whose norm is smaller than a threshold $T$. Note that this differs from, and performs better than, the plain hard thresholding described in [1]. We then apply the inverse transform to each subimage, plug it into its original place in the image, and obtain the denoised image by averaging the different values obtained for each pixel.
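The column-wise thresholding step can be sketched as follows, assuming the Euclidean norm is the column norm in question; the function name `threshold_columns` and the toy matrix are hypothetical.

```python
import numpy as np

def threshold_columns(coeffs, T):
    """Zero every column of coeffs whose Euclidean norm is below T.

    coeffs : 2D coefficient matrix (one column per coefficient location).
    T      : threshold applied to the column norms.
    Returns a new matrix; the input is left unchanged.
    """
    norms = np.linalg.norm(coeffs, axis=0)
    out = coeffs.copy()
    out[:, norms < T] = 0.0
    return out

# First column has norm 5 and survives; second has norm ~0.14 and is zeroed.
C = np.array([[3.0, 0.1],
              [4.0, 0.1]])
C_thr = threshold_columns(C, T=1.0)
print(C_thr)
```

Unlike entry-wise hard thresholding, a column is kept or discarded as a whole, so small entries inside a significant column are preserved.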
In order to assess the performance of the proposed image denoising scheme, we apply it with the Symmlet 8 wavelet filter to noisy versions of the images Lena and Barbara, with noise standard deviations. The noisy and recovered images corresponding to can be seen in Fig. 4. For comparison, we also apply to the two images the algorithm proposed by Elad and Aharon [7], [8], which is based on the K-SVD algorithm [9], [10], the BM3D algorithm [11], and the GTBWT with the search neighborhood and thresholding method described above. The PSNR values of the results obtained with all four denoising schemes are shown in Table I. It can be seen that the results obtained with both the RTBWT and the GTBWT are better than the ones obtained with the K-SVD algorithm, and are close to the ones obtained with the BM3D algorithm. However, the RTBWT was about times faster than the GTBWT since it required far fewer distance calculations.
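For reference, the PSNR used to compare denoising results can be computed as in the sketch below. The peak value of 255 assumes 8-bit grayscale images, which is an assumption and not stated in the text; the function name `psnr` is illustrative.

```python
import numpy as np

def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    mse = np.mean((reference - estimate) ** 2)   # mean squared error
    return 10.0 * np.log10(peak ** 2 / mse)

# A constant error of 25.5 gray levels gives peak^2 / MSE = 100, i.e. 20 dB.
clean = np.zeros((8, 8))
noisy = np.full((8, 8), 25.5)
value = psnr(clean, noisy)
print(round(value, 6))
```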
IV Conclusion
We have proposed a new redundant wavelet transform applicable to scalar functions defined on graphs and high-dimensional data clouds. This transform is the redundant version of the GTBWT introduced in [1]. We have shown that our proposed scheme can be used for image denoising, where it achieves denoising results that are close to the state of the art. In future work, we intend to:
- Seek ways to improve the method that reorders the approximation coefficients in each level of the tree, replacing the proposed approximate shortest path search method.
- Modify the denoising algorithm so that it adaptively chooses the patch and search neighborhood sizes, and the denoising threshold.
- Improve the image denoising results by using two iterations with different threshold settings.
- [1] I. Ram, M. Elad, and I. Cohen, “Generalized tree-based wavelet transform,” IEEE Trans. Signal Processing, vol. 59, no. 9, pp. 4199–4209, 2011.
- [2] S. Mallat, A Wavelet Tour of Signal Processing, The Sparse Way. Academic Press, 2009.
- [3] J. Fowler, “The redundant discrete wavelet transform and additive noise,” IEEE Signal Processing Letters, vol. 12, no. 9, pp. 629–632, 2005.
- [4] M. Holschneider, R. Kronland-Martinet, J. Morlet, and P. Tchamitchian, “A real-time algorithm for signal analysis with the help of the wavelet transform,” in Wavelets. Time-Frequency Methods and Phase Space, vol. 1. Springer-Verlag, 1989, pp. 286–297.
- [5] M. Shensa, “The discrete wavelet transform: Wedding the à trous and Mallat algorithms,” IEEE Trans. Signal Processing, vol. 40, no. 10, pp. 2464–2482, 1992.
- [6] G. Beylkin, “On the representation of operators in bases of compactly supported wavelets,” SIAM J. Numer. Anal., vol. 29, no. 6, pp. 1716–1740, 1992.
- [7] M. Elad and M. Aharon, “Image denoising via learned dictionaries and sparse representation,” in
- [8] M. Elad and M. Aharon, “Image denoising via sparse and redundant representations over learned dictionaries,” IEEE Trans. Image Processing, vol. 15, no. 12, pp. 3736–3745, 2006.
- [9] M. Aharon, M. Elad, and A. Bruckstein, “On the uniqueness of overcomplete dictionaries, and a practical way to retrieve them,” Linear Algebra and its Applications, vol. 416, no. 1, pp. 48–67, 2006.
- [10] M. Aharon, M. Elad, and A. Bruckstein, “K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation,” IEEE Trans. Signal Processing, vol. 54, no. 11, p. 4311, 2006.
- [11] K. Dabov, A. Foi, V. Katkovnik, and K. Egiazarian, “Image denoising by sparse 3-D transform-domain collaborative filtering,” IEEE Trans. Image Processing, vol. 16, no. 8, pp. 2080–2095, 2007.
- [12] T. H. Cormen, Introduction to Algorithms. The MIT Press, 2001.