Scalability and robustness of spectral embedding: landmark diffusion is all you need

01/03/2020
by Chao Shen, et al.

While spectral embedding is a widely applied dimension reduction technique in various fields, it remains challenging to make it scalable and robust enough to handle "big data". Motivated by the need to handle such data, we propose a novel spectral embedding algorithm, which we call Robust and Scalable Embedding via Landmark Diffusion (ROSELAND). In short, we measure the affinity between two points via a set of landmarks, which is composed of a small number of points, and "diffuse" on the dataset via the landmark set to achieve a spectral embedding. The embedding is not only scalable and robust, but also preserves the geometric properties under the manifold setup. Roseland can be viewed as a generalization of the commonly applied spectral embedding algorithm, the diffusion map (DM), in the sense that it shares various properties of the DM. In addition to providing a theoretical justification of Roseland under the manifold setup, including handling U-statistics-like quantities and providing a spectral convergence rate, we show various numerical simulations and compare Roseland with other existing algorithms.
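
The idea of measuring affinity through a small landmark set and then diffusing via that set can be illustrated with a minimal sketch. This is not the authors' reference implementation: the Gaussian kernel, the bandwidth `eps`, the diffusion time `t`, the choice of landmarks, and the function name `landmark_diffusion_embedding` are all assumptions made for illustration. The sketch builds an n-by-m point-to-landmark affinity matrix, normalizes it by the degrees of the induced n-by-n affinity (without ever forming that matrix), and reads the embedding off a thin SVD.

```python
import numpy as np

def landmark_diffusion_embedding(X, landmarks, eps, n_components=2, t=1):
    """Hypothetical landmark-based spectral embedding (illustrative only).

    X: (n, d) data points; landmarks: (m, d) landmark points, m << n.
    eps: assumed Gaussian kernel bandwidth; t: assumed diffusion time.
    """
    # Squared Euclidean distances between every point and every landmark (n x m).
    sq_dist = ((X[:, None, :] - landmarks[None, :, :]) ** 2).sum(axis=2)
    # Point-to-landmark affinities; the Gaussian kernel is an assumption.
    W = np.exp(-sq_dist / eps)
    # Row sums of the induced n x n affinity W W^T, computed in O(nm).
    d = W @ (W.T @ np.ones(X.shape[0]))
    # Symmetric-style normalization, then a thin SVD of the n x m matrix.
    W_norm = W / np.sqrt(d)[:, None]
    U, s, _ = np.linalg.svd(W_norm, full_matrices=False)
    # Undo the normalization on the left singular vectors.
    U_bar = U / np.sqrt(d)[:, None]
    # Skip the first (constant) direction; weight by squared singular values^t.
    idx = slice(1, n_components + 1)
    return U_bar[:, idx] * (s[idx] ** (2 * t))

rng = np.random.default_rng(0)
points = rng.normal(size=(200, 3))
embedding = landmark_diffusion_embedding(points, points[:20], eps=1.0)
print(embedding.shape)
```

Because only the n-by-m matrix is decomposed, the cost in this sketch grows linearly in the number of data points for a fixed landmark count, which is the kind of scalability the abstract refers to.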

research
06/16/2008

Local Procrustes for Manifold Embedding: A Measure of Embedding Quality and Embedding Algorithms

We present the Procrustes measure, a novel measure based on Procrustes r...
research
06/28/2017

Landmark Diffusion Maps (L-dMaps): Accelerated manifold learning out-of-sample extension

Diffusion maps are a nonlinear manifold learning technique based on harm...
research
03/06/2022

Diffusion Maps : Using the Semigroup Property for Parameter Tuning

Diffusion maps (DM) constitute a classic dimension reduction technique, ...
research
01/31/2019

Compressed Diffusion

Diffusion maps are a commonly used kernel-based method for manifold lear...
research
06/09/2020

Manifold structure in graph embeddings

Statistical analysis of a graph often starts with embedding, the process...
research
06/03/2020

Spectral convergence of diffusion maps: improved error bounds and an alternative normalisation

Diffusion maps is a manifold learning algorithm widely used for dimensio...
research
05/07/2014

Representative Selection for Big Data via Sparse Graph and Geodesic Grassmann Manifold Distance

This paper addresses the problem of identifying a very small subset of d...
