On landmark selection and sampling in high-dimensional data analysis

06/24/2009
by   Mohamed-Ali Belabbas, et al.

In recent years, the spectral analysis of appropriately defined kernel matrices has emerged as a principled way to extract the low-dimensional structure often prevalent in high-dimensional data. Here we provide an introduction to spectral methods for linear and nonlinear dimension reduction, emphasizing ways to overcome the computational limitations currently faced by practitioners with massive datasets. In particular, a data subsampling or landmark selection process is often employed to construct a kernel based on partial information, followed by an approximate spectral analysis termed the Nyström extension. We provide a quantitative framework to analyse this procedure, and use it to demonstrate algorithmic performance bounds on a range of practical approaches designed to optimize the landmark selection process. We compare the practical implications of these bounds by way of real-world examples drawn from the field of computer vision, in which low-dimensional manifold structure is shown to emerge from high-dimensional video data streams.
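
As a rough illustration of the landmark-plus-Nyström procedure described in the abstract, the following Python sketch computes kernel columns for a subset of landmark points and extends the eigenvectors of the small landmark kernel to all data points. It is a minimal sketch under stated assumptions, not the authors' implementation: the Gaussian (RBF) kernel, the `gamma` parameter, and the function names `rbf_kernel` and `nystrom_eig` are all hypothetical choices made for this example, and how the landmark indices are chosen (uniformly at random or otherwise) is left to the caller.

```python
import numpy as np


def rbf_kernel(X, Y, gamma=1.0):
    """Gaussian kernel matrix between rows of X and rows of Y (assumed kernel)."""
    sq_dists = (
        (X ** 2).sum(axis=1)[:, None]
        + (Y ** 2).sum(axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def nystrom_eig(X, landmarks, k, gamma=1.0):
    """Approximate the top-k eigenpairs of the full n x n kernel matrix
    using only the kernel columns that belong to the landmark points.

    Assumes the k leading eigenvalues of the landmark kernel are nonzero.
    """
    n = X.shape[0]
    m = len(landmarks)
    C = rbf_kernel(X, X[landmarks], gamma)  # n x m block of kernel columns
    W = C[landmarks]                        # m x m landmark kernel

    # Eigendecomposition of the small landmark kernel, largest first.
    vals, vecs = np.linalg.eigh(W)
    order = np.argsort(vals)[::-1][:k]
    vals, vecs = vals[order], vecs[:, order]

    # Nystrom extension: carry landmark eigenvectors over to all n points,
    # with the usual n/m rescaling of eigenvalues and eigenvectors.
    approx_vals = (n / m) * vals
    approx_vecs = np.sqrt(m / n) * (C @ vecs) / vals
    return approx_vals, approx_vecs
```

With `vals, vecs = nystrom_eig(X, landmarks, k)`, the product `(vecs * vals) @ vecs.T` gives a rank-k approximation of the full kernel matrix. When `k` equals the number of landmarks and the landmark kernel is invertible, this approximation reproduces the landmark columns of the kernel exactly; the quality on the remaining entries depends on which landmarks were selected, which is the question the paper's bounds address.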

research
01/15/2010

An Explicit Nonlinear Mapping for Manifold Learning

Manifold learning is a hot research topic in the field of computer scien...
research
06/17/2023

Linearly-scalable learning of smooth low-dimensional patterns with permutation-aided entropic dimension reduction

In many data science applications, the objective is to extract appropria...
research
02/28/2022

Learning Low-Dimensional Nonlinear Structures from High-Dimensional Noisy Data: An Integral Operator Approach

We propose a kernel-spectral embedding algorithm for learning low-dimens...
research
10/22/2018

Perturbation Bounds for Procrustes, Classical Scaling, and Trilateration, with Applications to Manifold Learning

One of the common tasks in unsupervised learning is dimensionality reduc...
research
12/22/2020

Unsupervised Functional Data Analysis via Nonlinear Dimension Reduction

In recent years, manifold methods have moved into focus as tools for dim...
research
02/01/2016

A Spectral Series Approach to High-Dimensional Nonparametric Regression

A key question in modern statistics is how to make fast and reliable inf...
research
10/01/2020

Ray-based classification framework for high-dimensional data

While classification of arbitrary structures in high dimensions may requ...
