Error Metrics for Learning Reliable Manifolds from Streaming Data

11/13/2016
by Frank Schoeneman, et al.

Spectral dimensionality reduction is frequently used to identify low-dimensional structure in high-dimensional data. However, learning manifolds, especially from streaming data, is computationally and memory expensive. In this paper, we argue that a stable manifold can be learned using only a fraction of the stream, and the remaining stream can be mapped to the manifold in a significantly less costly manner. Identifying the transition point at which the manifold is stable is the key step. We present error metrics that allow us to identify the transition point for a given stream by quantitatively assessing the quality of a manifold learned using Isomap. We further propose an efficient mapping algorithm, called S-Isomap, that can be used to map new samples onto the stable manifold. We describe experiments on a variety of data sets showing that the proposed approach is computationally efficient without sacrificing accuracy.



1 Introduction

Progress in science and engineering depends more than ever on our ability to analyze huge amounts of sensor and simulation data [11]. The vast majority of this data, coming from, for example, high-performance high-fidelity numerical simulations, high-resolution scientific instruments (microscopes, DNA sequencers, etc.) or Internet of Things feeds, is a result of complex non-linear processes. While these non-linear processes can be characterized by low-dimensional sub-manifolds, the actual observable data they generate is high-dimensional.

High-dimensional data is inherently difficult to explore and analyze, owing to the “curse of dimensionality” and the “empty space phenomenon” that render many statistical and machine learning techniques (e.g. clustering, classification, model fitting) inadequate [3, 16]. In this context, non-linear spectral dimensionality reduction (NLSDR) has proved to be an indispensable tool [10]. However, standard NLSDR methods, e.g. Isomap [17] or Locally Linear Embedding [13], have been designed for off-line or batch processing, where the entire input data is available at the time of analysis. Consequently, they are computationally too expensive or impractical in cases where dimensionality reduction must be applied on a data stream.

In this paper, we address the above limitation of classic NLSDR methods and propose an alternative solution that builds on the systematic study of errors in manifold learning. We first describe a protocol to capture a “collective error” of the Isomap method. Then, we show how this error can be used to detect a transition point at which sufficient data has been accumulated to describe a high quality manifold, and computationally lightweight techniques can be used to efficiently map the remaining data in a stream. Our specific contributions are as follows:

  1. We formalize a notion of collective error in Isomap and describe different strategies to quantify it using Procrustes analysis. We perform a careful experimental study of the error behavior with respect to the available data, using synthetic as well as real-life benchmark data. We identify properties of the error that can be used to detect when the manifold becomes stable and robust to incorporating new points.

  2. We propose a new efficient algorithm to incorporate streamed data into a stable Isomap manifold. The complexity of the algorithm depends only on the size of the initial stable manifold and is independent of the stream size. Thus, the algorithm is suitable for high-volume and high-throughput stream processing.

In our approach, we focus on Isomap because of its broad adoption in scientific computing [15] and bio-medical research, including fMRI analysis [18], clustering of oncology data [12], genes and proteins expression profiling [9], modeling of spatio-temporal relationships in data  [6], and many others.

Isomap provides a simple and elegant way to estimate the intrinsic geometry of the data manifold based on a rough estimate of each data point’s neighbors on the manifold. Unfortunately, its high computational complexity means it is too expensive to use on any but relatively small data sets, and it is not suited for stream processing. Consequently, there is a significant demand for an Isomap variant capable of learning a robust manifold from high-throughput data streams.

The remainder of this paper is organized as follows. In Section 2, we provide preliminary information and a high-level overview of NLSDR methods. In Section 3, we introduce our approach to quantifying error in Isomap, and in Section 4 we report experimental results showing properties of our error measure. In Section 5, we introduce our new efficient algorithm for out-of-sample Isomap extension. We close the paper with a brief survey of related work in Section 6, and concluding remarks in Section 7.

2 Preliminaries

Figure 1: General non-linear spectral dimensionality reduction workflow.

In the most general terms, the non-linear dimensionality reduction problem can be posed as follows. Given a data matrix $X = [x_1, \ldots, x_n]$, such that each $x_i \in \mathbb{R}^D$, the task is to find a corresponding low-dimensional representation $y_i \in \mathbb{R}^d$ for each $x_i$, where $d \ll D$.

Non-linear spectral dimensionality reduction methods assume that the data lies along a non-linear low-dimensional manifold embedded in a high-dimensional space, and exploit the global (Isomap [17], Minimum Volume Embedding [19]) or local (Locally Linear Embedding [13], Laplacian Eigenmaps [1]) properties of the manifold to map each $x_i$ to its corresponding $y_i$.

Most NLSDR methods follow a similar series of data transformations, as depicted in Figure 1. In the first step, the matrix $X$ is used to construct a neighborhood graph, where each node of the graph is connected to its $k$ nearest neighbors (kNN) under some, typically Euclidean, metric. This process involves $O(n^2)$ pairwise distance computations. Next, the neighborhood graph is analyzed to construct a feature matrix. This matrix encodes properties of the input data that should be preserved during dimensionality reduction, i.e. it directly captures properties of the sought sub-manifold. For example, in the classic Isomap formulation [17], the feature matrix stores the shortest path between each pair of points (nodes) in the neighborhood graph, which approximates the actual geodesic distance between the points. The cost of the feature matrix construction varies between NLSDR methods, but generally falls between $O(n^2 \log n)$ and $O(n^3)$. To obtain the final $d$-dimensional representation of the input data, $Y$, the feature matrix is factorized and the first $d$ eigenvectors become the output $Y$. The cost of this step is usually $O(n^3)$.
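
To make this workflow concrete, below is a minimal sketch of the three-step pipeline in Python, using NumPy, SciPy, and scikit-learn. The function name and parameter defaults are ours, not the paper's, and the sketch assumes the kNN graph is connected.

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph
from scipy.sparse.csgraph import shortest_path

def isomap(X, k=10, d=2):
    """Classic Isomap: map n points in R^D (rows of X) to R^d."""
    # Step 1: k-nearest-neighbor graph, built from pairwise distances.
    knn = kneighbors_graph(X, n_neighbors=k, mode="distance")
    # Step 2: feature matrix = all-pairs shortest paths in the graph,
    # approximating geodesic distances on the manifold.
    G = shortest_path(knn, method="D", directed=False)
    # Step 3: classical MDS -- double-center the squared geodesics and
    # factorize; the top-d scaled eigenvectors are the embedding.
    n = G.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * H @ (G ** 2) @ H
    w, V = np.linalg.eigh(B)                 # O(n^3) eigendecomposition
    top = np.argsort(w)[::-1][:d]
    return V[:, top] * np.sqrt(np.maximum(w[top], 0.0))
```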

If directly and naively applied to streaming data, NLSDR methods have to recompute the entire manifold each time a new point is extracted from the stream. This quickly becomes computationally prohibitive, even though it guarantees the best possible quality of the learned manifold given the data. To alleviate the computational problem, landmark and general out-of-sample extension methods have been proposed (see Section 6). These techniques, however, either neglect the manifold approximation error or remain computationally too expensive for practical applications (for example, their complexity depends directly on the stream size). Here, we propose a different strategy. Initially, we aggregate incoming data and process it using the standard Isomap. At the same time, we trace the quality of the resulting manifold with acceptable computational overhead. When we detect that adding new points does not improve the quality of the discovered manifold, i.e. the manifold is stable, we drop the standard Isomap and proceed with a faster approximate method that is sufficient to maintain the quality of the manifold. Our approach requires two components that we describe in the following sections: a method to assess the accuracy of the learned manifold, and an efficient algorithm to assimilate the remaining points from a stream.

3 Proposed Approach to Isomap Error

To quantitatively assess the performance of Isomap in large-scale streaming applications we first need an error metric to determine the accuracy of the learned manifold.

Error in Isomap may arise for several reasons, including incorrect parameter selection and noisy input data. Parameter selection error may be introduced by a poor choice of the neighborhood size $k$, or by selecting a sub-optimal target dimensionality $d$. Error due to input data can be attributed to noise, missing or limited data entries, or a skewed, rather than uniform, sampling of the underlying manifold. Finally, there is the intrinsic error of Isomap itself, for instance due to simplifying assumptions about the manifold, limited numerical precision, etc.

While error in Isomap has been discussed in prior work (see Section 6), most of these studies have focused on one particular aspect of the error. We are interested in understanding the collective error associated with Isomap as a function of the size of the input data. Here we present metrics to measure this error.

3.1 Error Definition

We assume that there exists a function $f$ that maps each low-dimensional sample $y_i$ to its high-dimensional observation $x_i$. The goal of NLSDR is to learn the inverse mapping, $f^{-1}$, that can be used to map high-dimensional $x_i$ to low-dimensional $y_i$, i.e. $y_i = f^{-1}(x_i)$.

Let the approximate mapping learned by Isomap be denoted by the inverse function $\widehat{f}^{-1}$. Let $Y$ denote the data matrix containing the true low-dimensional mapping for the samples in $X$, i.e. $Y = f^{-1}(X)$, and let $\widehat{Y}$ denote the matrix with the approximate mapping, i.e. $\widehat{Y} = \widehat{f}^{-1}(X)$.

To measure the error between the true representation and the Isomap-induced approximate representation, we leverage Procrustes analysis, which is widely used for shape analysis [5]. The idea behind Procrustes analysis is to align two matrices, $Y$ and $\widehat{Y}$, by finding the optimal translation $t$, rotation $R$, and scaling $s$ that minimize the Frobenius norm between the two aligned matrices, that is:

$\epsilon_P(Y, \widehat{Y}) = \min_{R, t, s} \left\| Y - \left( s \widehat{Y} R + \mathbf{1} t^\top \right) \right\|_F$   (1)

The above optimization problem has a closed form solution obtained by performing Singular Value Decomposition of $Y^\top \widehat{Y}$ after centering both matrices [5].
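
The closed-form solution translates directly into a few lines of NumPy. The sketch below is our illustration of (1), not the authors' implementation; it aligns $\widehat{Y}$ onto $Y$ and returns the residual Frobenius norm (reflections are allowed, as is common in shape analysis).

```python
import numpy as np

def procrustes_error(Y, Y_hat):
    """Procrustes error (1): residual after optimal translation/rotation/scale."""
    # Optimal translation: center both configurations.
    A = Y - Y.mean(axis=0)
    B = Y_hat - Y_hat.mean(axis=0)
    # Closed-form rotation from the SVD of A^T B [5].
    U, S, Vt = np.linalg.svd(A.T @ B)
    R = Vt.T @ U.T
    # Optimal scaling.
    s = S.sum() / (B ** 2).sum()
    # Residual Frobenius norm after aligning B to A.
    return np.linalg.norm(A - s * B @ R, ord="fro")
```

With ground truth available, the direct error (2) below is then simply `procrustes_error(Y_true, Y_isomap)`.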

We use Procrustes analysis to measure the quality of the manifold approximated by Isomap in two ways: 1) a direct method for comparison when the low-dimensional ground truth $Y$ is available; 2) a reference-sample method that can be used when ground truth is absent, and whose asymptotic behavior converges to that of the first approach.

Direct Procrustes Error

The first approach requires the ground truth reference for the low-dimensional manifold. When both $Y$ and $\widehat{Y}$ are known, the error is measured using (1), that is:

$\epsilon_d = \epsilon_P(Y, \widehat{Y})$   (2)

The error is expected to decrease with the size $n$ of the input sample. Our hypothesis is that $\epsilon_d$ and $n$ have an inverse relationship, and that $\epsilon_d$ converges to a constant value as $n$ becomes larger. We provide an informal analysis to support this hypothesis.

For manifolds with known properties (e.g. volume, radius of curvature, etc.), the original Isomap paper [17] shows that the ratio between the approximate geodesic distance between a pair of samples, computed from the nearest neighbor graph, and the true geodesic distance on the manifold can be bounded within $1 \pm \lambda$ with probability at least $1 - \mu$, for arbitrarily small values of $\lambda$ and $\mu$, if the density of the samples is greater than $\alpha_{\min}$, which is a function of $\lambda$, $\mu$, the neighborhood scale, and the manifold properties ($\lambda$ is the bound on the ratio between the approximate and the true geodesic distances). Closer inspection reveals that:

(3)

We can relate $\lambda$ and the Procrustes error $\epsilon_P$. In particular, using results from eigenvalue perturbation theory [7], we note that the eigenvalues of the approximate geodesic matrix will be within a perturbation proportional to $\lambda$ of the eigenvalues obtained from the true geodesic matrix, which implies that $\epsilon_P$ shrinks with $\lambda$. Assuming that the data is sampled uniformly and using the result from (3), the following holds:

(4)

where $n_{\min}$ is the minimum number of points required to establish the error bound.

Reference Sample Error

In the absence of low-dimensional ground truth, we use a reference-sample method, which works in the following way. Given $X$, we select a reference set $R$ and two equal sized sample sets $S_1$ and $S_2$. Next, we create two data sets, $X_1$ and $X_2$, such that $X_i = R \cup S_i$ for $i \in \{1, 2\}$. An approximation $\widehat{Y}_i$ is learned for each $X_i$, and the error is computed as the Procrustes error for the two learned approximations of the reference set $R$:

$\epsilon_{rs} = \epsilon_P\big(\widehat{Y}_1^{(R)}, \widehat{Y}_2^{(R)}\big)$   (5)

As we increase the size of the sample sets $S_1$ and $S_2$, we expect the reference-sample error in (5) to behave similarly to the Procrustes error computed directly as in (2). In fact, we empirically show that both errors have similar asymptotic behavior on several data sets.
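
Given any embedding routine, the reference-sample error can be sketched as follows. Here `embed` stands for an Isomap routine such as the sketch in Section 2, SciPy's Procrustes disparity stands in for $\epsilon_P$, and the random split is our choice, not the paper's.

```python
import numpy as np
from scipy.spatial import procrustes

def reference_sample_error(X, n_ref, n_samp, embed, seed=0):
    """epsilon_rs from (5): embed R+S1 and R+S2, compare R's two embeddings."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    ref = idx[:n_ref]                                # reference set R
    s1 = idx[n_ref:n_ref + n_samp]                   # sample set S1
    s2 = idx[n_ref + n_samp:n_ref + 2 * n_samp]      # sample set S2, disjoint
    Y1 = embed(X[np.concatenate([ref, s1])])         # embedding of X1 = R u S1
    Y2 = embed(X[np.concatenate([ref, s2])])         # embedding of X2 = R u S2
    # Procrustes disparity between the two embeddings of R only.
    _, _, disparity = procrustes(Y1[:n_ref], Y2[:n_ref])
    return disparity
```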

Figure 2: Isomap run on data samples of various sizes from the Euler Isometric Swiss Roll. The learned manifold is compared with the ground truth data using the Procrustes error (2).

4 Experimental Results

We present several experiments on a variety of data sets to illustrate the behavior of error using the metrics proposed in Section 3. This allows for better understanding of the asymptotic behavior of error in Isomap, required by our strategy for data stream processing. In particular, our objective is to show that the error converges after a certain number of samples are observed.

4.1 Test Data

The Swiss Roll data set is typically used for evaluating manifold learning algorithms. Swiss Roll data is generated from 2-dimensional data, embedded into 3 dimensions using a non-linear function. Specifically, we start from points $Y = [\,y_1 \; y_2\,]$ sampled in the plane, where:

(6)

The common approach is to generate the 3-dimensional Swiss Roll from $Y$ through the use of nonlinear functions of the form $x_1 = y_1 \cos(y_1)$, $x_2 = y_1 \sin(y_1)$, and $x_3 = y_2$. This is problematic due to the fact that these functions are not isometric, i.e. distances between points in $Y$ are not preserved along the surface generated in $\mathbb{R}^3$. We propose a new data set to rectify this issue, the Euler Isometric Swiss Roll. The idea is to substitute the commonly used spiral with the equations for the Euler spiral:

$x_1(y_1) = \int_0^{y_1} \cos(s^2)\, ds$   (7)
$x_2(y_1) = \int_0^{y_1} \sin(s^2)\, ds$   (8)

The Euler spiral has the property that the curvature at any point is proportional to the arc length traveled from the origin. The curve is traced at unit speed with constant angular acceleration, thus ensuring that isometry is preserved.
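
A generator for this data set can be sketched with SciPy's Fresnel integrals. Note that `scipy.special.fresnel` uses the $\sin(\pi t^2/2)$ convention, so a change of variables rescales it to the integrals in (7)-(8); the parameter ranges below are illustrative choices of ours, not the paper's.

```python
import numpy as np
from scipy.special import fresnel

def euler_swiss_roll(n, y1_max=4.0, y2_max=10.0, seed=0):
    """Sample n points of the Euler Isometric Swiss Roll and its 2-D ground truth."""
    rng = np.random.default_rng(seed)
    y1 = rng.uniform(0.0, y1_max, n)       # arc-length parameter of the spiral
    y2 = rng.uniform(0.0, y2_max, n)       # height coordinate
    # int_0^t cos(s^2) ds = sqrt(pi/2) * C(t * sqrt(2/pi)); same for sin and S.
    c = np.sqrt(np.pi / 2.0)
    S, C = fresnel(y1 / c)
    X = np.column_stack([c * C, c * S, y2])   # 3-D observations
    Y = np.column_stack([y1, y2])             # 2-D ground truth
    return X, Y
```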

In order to investigate our approach to error analysis in the absence of ground truth, we consider two benchmark data sets: the MNIST handwritten digit database and the Corel Image data set. The MNIST database is composed of 70,000 normalized, 28×28 grayscale images of handwritten digits ‘0’ to ‘9’; each image is represented by a 784-dimensional vector. Each of the ten digits has roughly 7,000 samples. The Corel Image data set consists of 68,040 photo images from various categories, each represented using 57 features.

Figure 3: Isomap on two data samples of various sizes from the Euler Isometric Swiss Roll. First, the learned manifold is compared with the ground truth using the Procrustes error (2) to illustrate the behavior of the error with increasing data. Next, the shared points in both samples are compared using the reference-sample error (5). Comparing (2) and (5) highlights the similar asymptotic behavior of both approaches.

4.2 Results and Analysis

First we apply Isomap to samples of data from the Swiss Roll data set. This is to investigate the ability of Isomap to reconstruct the true low-dimensional structure under the conditions of limited data and increasing data. The obtained results are presented in Figure 2. Each experiment is performed 10 times and the mean error over all trials is reported.

Figure 2 shows that initially for smaller amounts of data the error as formulated in (2) is relatively high. As more data samples are used the error first decreases rapidly and then begins to converge at approximately 1,500 samples. This suggests that the approximation may be learned within some error tolerance using 1,500 points, and that additional data points contribute no significant information. It is at this point that we wish to use more computationally lightweight, approximate methods to process the remaining samples.

Next, we consider the reference-sample method described in Section 3. In applying our sampling method to the Swiss Roll data we accomplish two goals. As in the first experiment, we are able to investigate the quality of the Isomap approximation as data availability increases. In addition, because we have ground truth available, we are able to validate our observation that both errors asymptotically converge in the limit of increasing data.

In Figure 3, we present results for both approaches to error analysis. In the reference-sample method, we take a fixed reference set $R$. We start with sample sets $S_1$ and $S_2$ of 100 samples each and increase their size by 100 samples at each step. The error between $X_1$ and $X_2$ is computed as prescribed by (5). At the same time, the error for $X_1$ compared with the ground truth coordinates (6) is computed and plotted alongside the reference-sample error.

Having gained an understanding of the behavior of the errors on synthetic data with ground truth available, we turn our attention to real-world data. As examples we consider samples for single digits from the MNIST database and the Corel Image data set. For each data set, we determine its intrinsic dimensionality based on residual variance and then run the reference-sample method to obtain the error plots for increasing sample size, as shown in Figure 4.

(a) Digit ‘5’ from MNIST Database
(b) Digit ‘7’ from MNIST Database
(c) Digit ‘9’ from MNIST Database
(d) Corel Image Database
Figure 4: Reference-sample error from (5) for the benchmark data sets. The results illustrate the stabilization of the learned manifold with increasing data.

For each data set, we observe a behavior similar to that seen for both approaches when applied to the Swiss Roll. The error decreases rapidly with additional samples. As the size of the sample sets approaches around 2,000-2,500 samples, the error begins to stabilize as both approximations become more accurate. The conclusion we draw from this is that for samples of more than roughly 2,000 points, a reliable manifold can be learned using both $X_1$ and $X_2$, which results in convergence of the error.

5 Proposed Streaming Isomap Algorithm

To form a stable, well represented manifold, Isomap requires an adequate number of samples to learn the topology of the manifold; more samples capture the idiosyncrasies of the manifold surface better. However, our experimental results in Section 4 suggest that there exists a transition point beyond which the quality of the manifold does not change significantly, even upon the addition of more samples. This means that once the point of transition is reached, instead of rebuilding the entire manifold in a batch fashion, we can employ a computationally less expensive procedure to map the new samples arriving in the data stream. Based on this observation, we propose an out-of-sample extension method, called Streaming Isomap (S-Isomap), that can adequately predict the embedding of incoming samples in an efficient manner. The algorithm avoids recomputing the entire geodesic distance matrix and its eigen decomposition, both of which are $O(n^3)$ operations. While other out-of-sample techniques exist [2], we demonstrate in Section 5.2 that the proposed algorithm is significantly more efficient.

Input: batch $X_s$ of size $s$, geodesic distance matrix $G$, low-dimensional representation $Y$, neighborhood size $k$, new sample $x$
Output: low-dimensional representation $y$ of $x$
1:  $(nn, d) \leftarrow \mathrm{KNN}(x, X_s, k)$
2:  initialize $g \in \mathbb{R}^s$
3:  for $i = 1, \ldots, s$ do
4:      $g_i \leftarrow \min_{j \in nn} \left( d_j + G_{j,i} \right)$
5:  $\bar{g} \leftarrow \mathrm{mean}(g)$, $\mu \leftarrow \mathrm{mean}(G)$, $\bar{G} \leftarrow$ vector of row means of $G$
6:  $f \leftarrow \frac{1}{2} \left( \bar{G} - g + (\bar{g} - \mu)\,\mathbf{1} \right)$
7:  $A \leftarrow Y^\top Y$
8:  $b \leftarrow Y^\top f$
9:  $y \leftarrow A^{-1} b$
10: return $y$
Algorithm 1 Streaming Isomap (S-Isomap)

The algorithm assumes that a transition point has already been identified by monitoring the Isomap error using the methods discussed in Section 3. Let the matrix $X_s$ denote the batch of $s$ samples encountered in the stream before the point of transition. From our previous experimental results we can assume that $s$ is relatively small compared to the remaining size of the stream. Let $X_t$ represent the remaining part of the input data stream. Furthermore, let $G$ represent the geodesic distance matrix between the samples in $X_s$, and let $Y$ be the matrix containing the corresponding low-dimensional representations of the samples in $X_s$.

5.1 Algorithm

Our proposed method is shown in Algorithm 1. The key assumption is that, because the manifold learned from $X_s$ is stable, we do not have to recompute the entire geodesic distance matrix each time a new point $x$ is added. Instead, it is sufficient to find the $k$ nearest neighbors of $x$ in $X_s$ and use those to approximate the geodesic distances between $x$ and the remaining points in $X_s$. This step is realized in lines 1-4. Here, $nn$ stores the indexes of the $k$ nearest neighbors of $x$, $d$ represents the corresponding distances, and $g_i$ is the approximate geodesic distance between $x$ and the $i$-th point in $X_s$. Given the updated geodesic distances, we can obtain the low-dimensional coordinates of $x$ using transformations similar to [8]. Specifically, we match the inner products between $y$ and the points in $Y$ to the computed geodesics. If we denote the mean of the entries of $g$ by $\bar{g}$ and the mean of all entries of $G$ by $\mu$, then the desired inner products can be expressed by the vector $f$ as given in line 6. Here, we use $\mathbf{1}$ to represent a vector of ones, and $\bar{G}$ to represent the vector of row means of $G$. Then, by solving for the $y$ whose inner products with $Y$ equal $f$, we obtain the low-dimensional representation of $x$ (lines 7-9).
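
A compact NumPy rendering of Algorithm 1 might look as follows. We assume, as in classical MDS and [8], that the double-centering behind line 6 operates on squared geodesic distances; the function name and the squared-distance bookkeeping are our choices, and this is a sketch rather than the authors' implementation.

```python
import numpy as np

def s_isomap_map(x, X_s, G, Y, k):
    """Map one streamed sample x using the stable batch manifold.

    X_s: (s, D) batch samples; G: (s, s) batch geodesic distances;
    Y: (s, d) batch embedding; k: neighborhood size. Returns y in R^d.
    """
    # Lines 1-4: distances to the batch, k nearest neighbors, and the
    # approximate geodesic from x to every batch point through a neighbor.
    dist = np.linalg.norm(X_s - x, axis=1)
    nn = np.argsort(dist)[:k]
    g = np.min(dist[nn, None] + G[nn, :], axis=0)
    # Lines 5-6: inner products f between y and the rows of Y, via the
    # double-centering of *squared* distances used in classical MDS.
    g2, G2 = g ** 2, G ** 2
    f = 0.5 * (G2.mean(axis=1) - g2 + (g2.mean() - G2.mean()))
    # Lines 7-9: least-squares solve of Y y = f for y.
    y, *_ = np.linalg.lstsq(Y, f, rcond=None)
    return y
```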

5.2 Analysis

For a single point $x$, the proposed S-Isomap algorithm takes $O(s \cdot D)$ time to compute the $k$ nearest neighbors, $O(s \cdot k)$ time to compute $g$, and $O(s \cdot d^2)$ time for the least-squares estimation in solving for $y$, where $s$ is the size of the batch $X_s$. Thus, its runtime complexity is $O(s \cdot (D + k + d^2))$ per streaming point. If we consider the cost of learning the initial manifold, the proposed S-Isomap algorithm requires $O(s^3 + (n - s) \cdot s \cdot (D + k + d^2))$ time to process a stream of size $n$. This is significant because the cost of mapping a new sample depends only on the batch size $s$ and the parameters $k$ and $d$. Since these are very small compared to the entire stream size, the resulting computational savings are significant. Consequently, the algorithm is well suited for high-throughput streams. In contrast, the complexity of the incremental Isomap [8], which is the other method designed for streams, scales quadratically with the size of the stream: updating the geodesic distance matrix and the embedding when inserting the $n$-th sample requires on the order of $n^2$ operations, and even more when no initial batch is available.

The space complexity of our algorithm is dominated by the $O(s^2)$ term due to the geodesic distance matrix $G$, and the $O(s \cdot d)$ term due to the low-dimensional representation $Y$ of the batch samples. Thus, the space requirement does not grow with the size of the stream, which makes the algorithm appealing for handling high-volume streams.

5.3 Experimental Results

To validate the proposed S-Isomap algorithm, we perform a set of experiments to assess how accurately and efficiently it maps samples in a stream onto a manifold learned from a fixed-size batch.

Figure 5: Procrustes error as we progressively vary the size of the batch data from the Euler Isometric Swiss Roll. The Procrustes error using only the batch data (without our proposed S-Isomap algorithm) is included as a reference. The results illustrate that the error due to streaming points is low.
(a) Digit ‘5’ from MNIST Database
(b) Digit ‘7’ from MNIST Database
(c) Digit ‘9’ from MNIST Database
(d) Corel Image Database
Figure 6: Procrustes error as we progressively vary the size of the batch on the benchmark data sets. The results illustrate that the error due to streaming points is very low, and that its asymptotic behavior closely tracks that of the batch-only error.
Figure 7: Timing results for our method compared to Isomap (horizontal reference line on top). The results are in log scale and demonstrate the performance gain achieved from our proposed S-Isomap algorithm.
 
Figure 8: Timing results for S-Isomap compared to Isomap for fixed batch size of 2,000 points with streaming data. The results are in log scale and demonstrate the performance gain achieved from our proposed S-Isomap algorithm.

In the first set of experiments, we measure the error in the mapping obtained using Algorithm 1 and compare it with the error in the mapping obtained using the standard Isomap algorithm. Using a data set of size $n$, we learn a manifold using a batch of size $s$ and then map the remaining $n - s$ samples using S-Isomap. The entire mapping is compared to the ground truth using the Procrustes error. Figure 5 shows the results for the Isometric Swiss Roll data with $n = 10{,}000$. Similar results are shown for the other data sets in Figure 6; for those data sets, we use the mapping obtained by applying Isomap on the full data as the ground truth. The results show that while the error is high for smaller batches, it stabilizes to a low value after the point of transition and closely tracks the error for only the batch portion of the data set. This indicates that beyond the point of transition, the mapping obtained using S-Isomap is as accurate as that obtained from the standard Isomap.
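
As a usage illustration only, this experimental protocol can be assembled from the hypothetical helpers sketched earlier (`euler_swiss_roll`, `isomap`, `s_isomap_map`, `procrustes_error`); the split sizes follow the text.

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph
from scipy.sparse.csgraph import shortest_path

X, Y_true = euler_swiss_roll(n=10_000)           # data with 2-D ground truth
s = 2_000                                        # batch up to the transition
Y_batch = isomap(X[:s], k=10, d=2)               # standard Isomap on the batch
knn = kneighbors_graph(X[:s], n_neighbors=10, mode="distance")
G = shortest_path(knn, method="D", directed=False)   # batch geodesics
Y_stream = np.array([s_isomap_map(x, X[:s], G, Y_batch, k=10) for x in X[s:]])
Y_all = np.vstack([Y_batch, Y_stream])
print(procrustes_error(Y_true, Y_all))           # direct error (2)
```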

To assess the computational efficiency of the S-Isomap algorithm, we show the time taken by S-Isomap (including the time for Isomap on the batch and S-Isomap on the remaining stream) in Figure 7. The standard Isomap timing result on all 10,000 samples is shown as the horizontal baseline for reference. We note that S-Isomap is able to map the samples in the stream much more efficiently than Isomap on the entire data. Overall, Figures 5 and 7 show that the proposed method is able to map samples in a stream in a highly efficient manner without sacrificing the quality of the manifold.

To understand how S-Isomap behaves in a truly streaming mode, we conduct a second timing experiment. Here, we fix the size of the batch to the estimated switching point (2,000 for the Isometric Swiss Roll) and then progressively add samples to the stream, processing them with S-Isomap. We measure the cumulative time taken by S-Isomap to process the remainder of the stream, for different stream sizes. Figure 8 shows these times compared with the runtime of Isomap on the aggregated batch and stream data. As before, the cumulative time of running S-Isomap scales linearly with the size of the stream, which further confirms the efficiency of the method.

6 Related Work

Given the high computational complexity of Isomap, variants such as Landmark Isomap [4] and out-of-sample extension techniques [2] have been proposed as computationally viable alternatives. Both of these approaches use a smaller set of landmark points or approximations to avoid performing the costly $O(n^3)$ eigen decomposition of the geodesic distance matrix, where $n$ is the number of points in the entire data set. However, they still require computing the full geodesic distance matrix, whose cost grows with $n^2$ and the diameter of the embedded kNN graph. The Incremental Isomap algorithm [8] avoids both the eigen decomposition and a recreation of the geodesic distance matrix. However, it requires updates to the geodesic distance matrix that incur a significant cost, as discussed in Section 5.2. Consequently, the method is unsuitable for the streaming setting.

Errors in Isomap have been discussed in prior work [10], but those studies have typically focused on the selection of parameters. For example, Samko et al. [14] proposed measuring a simple manifold embedding error for a range of $k$ values to find the best choice of $k$. Similarly, an approach based on the $k$-edge-disjoint minimal spanning tree algorithm has been proposed to construct a neighborhood graph with connectivity guarantees [20]. In the same spirit, several strategies to assess the intrinsic manifold dimension are available [10]. However, to the best of our knowledge, there is no prior work on defining and understanding the behavior of the error that persists even with the selection of optimal parameters. We address this error by taking an abstract view of Isomap, providing a protocol for measuring the collective error, and characterizing its behavior. In doing so, we identify the point of transition at which we may switch from exact to lightweight methods.

7 Conclusions

The error in Isomap approaches zero as the density of the sampled data tends to infinity. However, in practical settings, we can expect that after a certain level of sampling the error does not change significantly. In other words, under the assumption of uniform sampling, the learned manifold becomes stable once the sample size reaches a certain threshold. In this paper, we have presented error metrics that can be used to empirically observe when the manifold becomes stable. In particular, the reference-sample metric is appealing because it can assess the manifold quality even when ground truth data is unavailable.

Equipped with the knowledge of the point of transition, we have presented a streaming algorithm, S-Isomap, that can be used to efficiently map new data samples to the stable manifold, instead of performing costly updates. The fact that the cost of mapping new samples in S-Isomap depends only on the size of the initial batch used to generate the stable Isomap, and is independent of the size of the stream itself, makes it a very powerful tool to process massive streams of data.

8 Acknowledgment

The authors wish to acknowledge the support provided by the Center for Computational Research at the University at Buffalo. This work has been supported in part by the NSF grants CNS-1409551 and IIS-1651475.

References

  • [1] M. Belkin and P. Niyogi. Laplacian eigenmaps and spectral techniques for embedding and clustering. In Neur. Inf. Proc. Sys., pages 585–591, 2001.
  • [2] Y. Bengio, J. Paiement, P. Vincent, O. Delalleau, N. Le Roux, and M. Ouimet. Out-of-sample extensions for LLE, Isomap, MDS, eigenmaps, and spectral clustering. In Neur. Inf. Proc. Sys., pages 177–184, 2004.
  • [3] R. Clarke, H.W. Ressom, A. Wang, J. Xuan, M.C. Liu, E.A. Gehan, and Y. Wang. The properties of high-dimensional data spaces: implications for exploring gene and protein expression data. Nat. Rev., 8(1):37–49, 2008.
  • [4] V. de Silva and J.B. Tenenbaum. Global versus local methods in nonlinear dimensionality reduction. In Neur. Inf. Proc. Sys., pages 705–712, 2002.
  • [5] I.L. Dryden and K.V. Mardia. Statistical Shape Analysis. John Wiley & Sons, 1998.
  • [6] O.C. Jenkins and M.J. Mataric. A spatio-temporal extension to isomap nonlinear dimension reduction. In Int. Conf. Mach. Lear., page 56. ACM, 2004.
  • [7] P. Lancaster and M. Tismenetsky. The Theory of Matrices: With Applications. Computer Science and Scientific Computing Series. Academic Press, 1985.
  • [8] M.H.C. Law and A.K. Jain. Incremental nonlinear dimensionality reduction by manifold learning. IEEE Tran. Patt. Anal. Mach. Intell., 28(3):377–391, 2006.
  • [9] G. Lee, C. Rodriguez, and A. Madabhushi. Investigating the efficacy of nonlinear dimensionality reduction schemes in classifying gene and protein expression studies. IEEE/ACM Trans. on Comp. Biology and Bioinformatics, 5(3):368–384, 2008.
  • [10] J.A. Lee and M. Verleysen. Nonlinear dimensionality reduction. Springer Science & Business Media, 2007.
  • [11] K. Morik, H. Durrant-Whyte, G. Hill, D. Muller, and T. Berger-Wolf. Data driven science: SIGKDD Panel. In KDD, pages 2329–2330, 2015.
  • [12] C. Orsenigo and C. Vercellis. A comparative study of nonlinear manifold learning methods for cancer microarray data classification. Expert Systems with Applications, 40(6):2189–2197, 2013.
  • [13] S.T. Roweis and L.L. Saul. Nonlinear dimensionality reduction by locally linear embedding. Science, 290(5500):2323–2326, 2000.
  • [14] O. Samko, A.D. Marshall, and P.L. Rosin. Selection of the optimal parameter value for the isomap algorithm. Pattern Recognition Letters, 27(9):968–979, 2006.
  • [15] S.K. Samudrala, J. Zola, S. Aluru, and B. Ganapathysubramanian. Parallel framework for dimensionality reduction of large-scale datasets. Scientific Programming, 2015:1, 2015.
  • [16] D.W. Scott and J.R. Thompson. Probability density estimation in higher dimensions. In Symposium on the Interface of Computing Science and Statistics, volume 528, pages 173–179, 1983.
  • [17] J.B. Tenenbaum, V. de Silva, and J.C. Langford. A global geometric framework for nonlinear dimensionality reduction. Science, 290(5500):2319–2323, 2000.
  • [18] B. Thirion and O. Faugeras. Nonlinear dimension reduction of fMRI data: the Laplacian embedding approach. In IEEE Int. Symp. Biomed. Imaging Macro to Nano, pages 372–375, 2004.
  • [19] K.Q. Weinberger, B. Packer, and L.K. Saul. Nonlinear dimensionality reduction by semidefinite programming and kernel matrix factorization. In International Conference on Artificial Intelligence and Statistics, 2005.
  • [20] L. Yang. Building k edge-disjoint spanning trees of minimum total length for isometric data embedding. IEEE Tran. Patt. Anal. Mach. Intell., 27(10):1680–1683, 2005.