Robust Principal Component Analysis: A Median of Means Approach

02/05/2021
by   Debolina Paul, et al.

Principal Component Analysis (PCA) is a fundamental tool for data visualization, denoising, and dimensionality reduction, and it is widely used in Statistics, Machine Learning, Computer Vision, and related fields. However, PCA is well known to be sensitive to outliers, and in their presence it often fails to detect the true underlying low-dimensional structure of the dataset. Recent supervised learning methods that follow the Median of Means (MoM) philosophy have shown great success in dealing with outlying observations without much compromise to their large-sample theoretical properties. In this paper, we propose a PCA procedure based on the MoM principle. The proposed method, called Median of Means Principal Component Analysis (MoMPCA), is not only computationally appealing but also achieves optimal convergence rates under minimal assumptions. In particular, we study non-asymptotic error bounds for the obtained solution with the aid of Vapnik-Chervonenkis theory and Rademacher complexity, while making no assumptions at all about the outlying observations. The efficacy of the proposal is also demonstrated through simulations and real data applications.
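To make the MoM principle mentioned in the abstract concrete, here is a minimal sketch of the generic median-of-means estimator for a scalar mean. It is not the MoMPCA procedure, which the abstract does not spell out. The function name `median_of_means`, the `n_blocks` and `seed` parameters, and the toy data are all assumptions made for illustration.

```python
import random
import statistics


def median_of_means(values, n_blocks, seed=0):
    """Generic median-of-means estimate of the mean of a 1-D sample.

    Illustrative sketch only: splits the sample into n_blocks random
    blocks, averages each block, and returns the median of those averages.
    """
    if not 1 <= n_blocks <= len(values):
        raise ValueError("n_blocks must be between 1 and len(values)")
    shuffled = list(values)
    random.Random(seed).shuffle(shuffled)  # random partition into blocks
    # Deal points round-robin so block sizes differ by at most one.
    blocks = [shuffled[i::n_blocks] for i in range(n_blocks)]
    block_means = [statistics.fmean(block) for block in blocks]
    return statistics.median(block_means)


# Hypothetical data: 99 inliers close to 1.0 plus one gross outlier.
sample = [1.0 + 0.01 * (i % 5 - 2) for i in range(99)] + [1e6]
plain_mean = statistics.fmean(sample)        # dragged far away by the outlier
mom = median_of_means(sample, n_blocks=11)   # stays close to 1.0
```

The single outlier can contaminate at most one of the 11 block means, so the median of the block means is still taken over uncontaminated blocks. In general, the estimate is protected as long as fewer than half of the blocks contain outliers, which is why the number of blocks is usually chosen with the expected number of outliers in mind.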

