Non-Asymptotic Analysis of Tangent Space Perturbation

11/20/2011
by Daniel N. Kaslovsky, et al.

Constructing an efficient parameterization of a large, noisy data set of points lying close to a smooth manifold in high dimension remains a fundamental problem. One approach is to recover a local parameterization from the local tangent plane. Principal component analysis (PCA) is often the tool of choice, as it returns an optimal basis in the case of noise-free samples from a linear subspace. To process noisy data samples from a nonlinear manifold, PCA must be applied locally, at a scale small enough that the manifold is approximately linear, yet large enough that structure may be discerned from noise. Using eigenspace perturbation theory and non-asymptotic random matrix theory, we study the stability of the subspace estimated by PCA as a function of scale, and bound (with high probability) the angle it forms with the true tangent space. By adaptively selecting the scale that minimizes this bound, our analysis reveals an appropriate scale for local tangent plane recovery. We also introduce a geometric uncertainty principle quantifying the limits of noise-curvature perturbation for stable recovery. To provide perturbation bounds that can be used in practice, we propose plug-in estimates that make it possible to apply the theoretical results directly to real data sets.
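
As a rough illustration of the setting described above, the following Python sketch applies PCA to the points within a given radius of a reference point and measures the largest principal angle between the estimated subspace and a known tangent plane. It is not the paper's bound or its scale-selection procedure. The synthetic paraboloid, the function names (`sample_noisy_paraboloid`, `local_pca_tangent`, `largest_principal_angle`), the noise level, the radii, and the choice to center on the neighborhood mean are all assumptions made for the example.

```python
import numpy as np


def sample_noisy_paraboloid(n, sigma, rng, ambient_dim=5):
    """Hypothetical test data: a 2-D paraboloid in R^ambient_dim plus Gaussian noise.

    The true tangent plane at the origin is span(e1, e2).
    """
    uv = rng.uniform(-1.0, 1.0, size=(n, 2))
    pts = np.zeros((n, ambient_dim))
    pts[:, :2] = uv
    pts[:, 2] = 0.5 * np.sum(uv**2, axis=1)  # curvature term
    return pts + sigma * rng.standard_normal(pts.shape)


def local_pca_tangent(points, center, radius, d):
    """Estimate a d-dimensional tangent basis from points within `radius` of `center`.

    Returns an (ambient_dim, d) matrix with orthonormal columns.
    """
    dist = np.linalg.norm(points - center, axis=1)
    nbrs = points[dist <= radius]
    if nbrs.shape[0] <= d:
        raise ValueError("too few neighbors at this scale")
    # Assumption: center on the neighborhood mean, as in standard PCA.
    centered = nbrs - nbrs.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:d].T


def largest_principal_angle(U, V):
    """Largest principal angle (radians) between the column spaces of U and V.

    Both U and V are assumed to have orthonormal columns.
    """
    cosines = np.linalg.svd(U.T @ V, compute_uv=False)
    return float(np.arccos(np.clip(cosines.min(), -1.0, 1.0)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    points = sample_noisy_paraboloid(n=4000, sigma=0.01, rng=rng)
    true_tangent = np.eye(5)[:, :2]  # span(e1, e2) at the origin
    for radius in (0.1, 0.3, 1.0):
        U = local_pca_tangent(points, np.zeros(5), radius, d=2)
        angle = largest_principal_angle(U, true_tangent)
        print(f"radius={radius:.1f}  angle={angle:.4f} rad")
```

Sweeping the radius in this way gives an empirical view of the angle as a function of scale. The analysis summarized above instead bounds this angle with high probability and selects the scale that minimizes the bound.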


