Principal Ellipsoid Analysis (PEA): Efficient non-linear dimension reduction & clustering

08/17/2020
by Debolina Paul, et al.

Even with the rise in popularity of over-parameterized models, simple dimensionality reduction and clustering methods, such as PCA and k-means, are still routinely used in a wide variety of settings. A primary reason is their combination of simplicity, interpretability, and computational efficiency. This article focuses on improving upon PCA and k-means by allowing non-linear relations in the data and more flexible cluster shapes, without sacrificing these key advantages. The key contribution is a new framework, Principal Ellipsoid Analysis (PEA), which defines a simple and computationally efficient alternative to PCA that fits the best elliptical approximation through the data. We provide theoretical guarantees for the proposed PEA algorithm, using Vapnik-Chervonenkis (VC) theory to show strong consistency and uniform concentration bounds. Toy experiments illustrate the performance of PEA and its ability to adapt to non-linear structure and complex cluster shapes. In a range of real-data clustering applications, PEA is shown to do as well as k-means on simple datasets, while dramatically improving performance in more complex settings.

research
04/22/2022

Compressibility: Power of PCA in Clustering Problems Beyond Dimensionality Reduction

In this paper we take a step towards understanding the impact of princip...
research
11/07/2012

Randomized Dimension Reduction on Massive Data

Scalability of statistical estimators is of increasing importance in mod...
research
02/22/2021

Non-linear, Sparse Dimensionality Reduction via Path Lasso Penalized Autoencoders

High-dimensional data sets are often analyzed and explored via the const...
research
11/23/2022

Kernel PCA for multivariate extremes

We propose kernel PCA as a method for analyzing the dependence structure...
research
08/26/2022

Tangent phylogenetic PCA

Phylogenetic PCA (p-PCA) is a version of PCA for observations that are l...
research
01/31/2016

Principal Polynomial Analysis

This paper presents a new framework for manifold learning based on a seq...
