Principal Ellipsoid Analysis (PEA): Efficient non-linear dimension reduction & clustering

by Debolina Paul, et al.

Even with the rise in popularity of over-parameterized models, simple dimensionality reduction and clustering methods, such as PCA and k-means, are still routinely used in a wide variety of settings. A primary reason is their combination of simplicity, interpretability and computational efficiency. The focus of this article is on improving upon PCA and k-means by allowing non-linear relations in the data and more flexible cluster shapes, without sacrificing these key advantages. The key contribution is a new framework for Principal Ellipsoid Analysis (PEA), defining a simple and computationally efficient alternative to PCA that fits the best elliptical approximation through the data. We provide theoretical guarantees on the proposed PEA algorithm using Vapnik-Chervonenkis (VC) theory to show strong consistency and uniform concentration bounds. Toy experiments illustrate the performance of PEA and its ability to adapt to non-linear structure and complex cluster shapes. In a rich variety of real data clustering applications, PEA is shown to do as well as k-means for simple datasets, while dramatically improving performance in more complex settings.



Compressibility: Power of PCA in Clustering Problems Beyond Dimensionality Reduction

In this paper we take a step towards understanding the impact of princip...

Randomized Dimension Reduction on Massive Data

Scalability of statistical estimators is of increasing importance in mod...

Non-linear, Sparse Dimensionality Reduction via Path Lasso Penalized Autoencoders

High-dimensional data sets are often analyzed and explored via the const...

Correlated-PCA: Principal Components' Analysis when Data and Noise are Correlated

Given a matrix of observed data, Principal Components Analysis (PCA) com...

Dataset Augmentation and Dimensionality Reduction of Pinna-Related Transfer Functions

Efficient modeling of the inter-individual variations of head-related tr...

Principal Polynomial Analysis

This paper presents a new framework for manifold learning based on a seq...