Principal Component Analysis in Space Forms

01/06/2023
by Puoya Tabaghi et al.

Principal component analysis (PCA) is a workhorse of modern data science. Practitioners typically perform PCA assuming the data conforms to Euclidean geometry. However, for specific data types, such as hierarchical data, other geometries may be more appropriate. We study PCA in space forms, i.e., Riemannian manifolds of constant curvature: positive (spherical), negative (hyperbolic), and zero (Euclidean). At any point on a Riemannian manifold, one can define a Riemannian affine subspace from a set of tangent vectors and use invertible maps to carry tangent vectors onto the manifold and back. Finding a low-dimensional Riemannian affine subspace for a set of points in a space form amounts to dimensionality reduction because, as we show, any such affine subspace is isometric to a space form of the same dimension and curvature. To find principal components, we seek the (Riemannian) affine subspace that best represents a set of manifold-valued data points, i.e., the one that minimizes the average cost of projecting the data points onto it. We propose specific cost functions that bring about two major benefits: (1) the affine subspace can be estimated by solving an eigenequation, similar to that of Euclidean PCA, and (2) optimal affine subspaces of different dimensions form a nested set. These properties are advances over existing methods, which are mostly iterative algorithms with slow convergence and weaker theoretical guarantees. For hyperbolic PCA in particular, the associated eigenequation operates in Lorentzian space, endowed with an indefinite inner product; we therefore establish a connection between Lorentzian and Euclidean eigenequations. We evaluate the proposed space form PCA on data sets simulated in spherical and hyperbolic spaces and show that it outperforms alternative methods in convergence speed or accuracy, and often both.
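
To make the geometric ingredients concrete, here is a minimal NumPy sketch of the tangent-space machinery the abstract describes, for the hyperboloid (Lorentz) model of hyperbolic space. Note this illustrates plain tangent-space PCA, a simpler baseline than the paper's space form PCA; the function names (lorentz_inner, log_map, tangent_pca) and the choice of a base point mu are assumptions made for the sketch, not the authors' code.

```python
import numpy as np

def lorentz_inner(x, y):
    """Indefinite Lorentzian inner product: <x, y>_L = -x_0 y_0 + sum_{i>0} x_i y_i."""
    return -x[0] * y[0] + np.dot(x[1:], y[1:])

def log_map(mu, x):
    """Logarithmic map at mu on the hyperboloid {x : <x, x>_L = -1, x_0 > 0}:
    lifts the manifold point x to a tangent vector at mu (curvature -1)."""
    alpha = max(-lorentz_inner(mu, x), 1.0 + 1e-12)  # cosh of the geodesic distance
    d = np.arccosh(alpha)                            # geodesic distance d(mu, x)
    u = x - alpha * mu                               # tangent direction at mu toward x
    return d * u / np.sinh(d)

def tangent_pca(points, mu, k):
    """Lift points to the tangent space at mu, then solve the familiar
    covariance eigenequation C v = lambda v for the top-k directions."""
    V = np.stack([log_map(mu, x) for x in points])  # one tangent vector per row
    C = V.T @ V / len(points)                       # sample covariance in the tangent space
    _, eigvecs = np.linalg.eigh(C)                  # eigh returns ascending eigenvalues
    return eigvecs[:, ::-1][:, :k]                  # top-k principal directions
```

For example, with mu = np.array([1.0, 0.0, 0.0]) (the apex of the 2D hyperboloid) and points lying on the hyperboloid, tangent_pca(points, mu, 1) returns the single direction of largest tangent-space variance.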

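The last technical claim, the link between Lorentzian and Euclidean eigenequations, can also be sketched. The paper derives its own eigenequation; the form C v = lambda J v below is an assumption chosen only to illustrate the general reduction: because the Lorentzian metric matrix J is an involution (J squared is the identity), an indefinite eigenproblem folds into a standard Euclidean one.

```python
import numpy as np

def lorentzian_eig(C, k):
    """Solve an indefinite eigenequation of the assumed form C v = lambda J v,
    where J = diag(-1, 1, ..., 1) is the Lorentzian metric matrix. Since J is
    its own inverse, this is equivalent to (J C) v = lambda v, a standard
    (generally non-symmetric) Euclidean eigenproblem."""
    n = C.shape[0]
    J = np.diag(np.concatenate(([-1.0], np.ones(n - 1))))
    eigvals, eigvecs = np.linalg.eig(J @ C)  # J C need not be symmetric
    order = np.argsort(eigvals.real)[::-1]   # rank by descending real part
    return eigvals[order[:k]], eigvecs[:, order[:k]]
```

The design point the abstract highlights is that such a closed-form eigendecomposition replaces the iterative optimization used by earlier manifold PCA methods, which is where the claimed gains in convergence speed come from.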