Maximum Margin Principal Components

05/17/2017
by Xianghui Luo, et al.

Principal Component Analysis (PCA) is a very successful dimensionality reduction technique, widely used in predictive modeling. A key factor in its widespread use in this domain is that projecting a dataset onto its first K principal components minimizes the sum of squared errors between the original data and the projected data over all possible rank K projections. Thus, PCA provides optimal low-rank representations of data for least-squares linear regression under standard modeling assumptions. On the other hand, when the loss function for a prediction problem is not the least-squares error, PCA is typically a heuristic choice of dimensionality reduction, in particular for classification problems under the zero-one loss. In this paper we target classification problems by proposing a straightforward alternative to PCA that aims to minimize the difference in margin distribution between the original and the projected data. Extensive experiments show that our simple approach typically outperforms PCA in terms of classification error on any given dataset, though the difference is not always statistically significant, and that, despite being a filter method, it is frequently competitive with Partial Least Squares (PLS) and Lasso on a wide range of datasets.
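
The least-squares optimality property that the abstract attributes to PCA can be checked numerically. The following is a minimal sketch of that property only, not of the margin-based method proposed in the paper. It assumes NumPy, synthetic Gaussian data, and orthogonal projections of mean-centered data; the function names `pca_projection_error` and `random_projection_error` are hypothetical helpers introduced for illustration.

```python
import numpy as np


def pca_projection_error(X, k):
    """Sum of squared errors after projecting centered X onto its top-k principal components."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:k].T  # (n_features, k) orthonormal basis of the top-k subspace
    return float(np.sum((Xc - Xc @ V @ V.T) ** 2))


def random_projection_error(X, k, rng):
    """Sum of squared errors after projecting centered X onto a random k-dimensional subspace."""
    Xc = X - X.mean(axis=0)
    # QR of a Gaussian matrix yields an orthonormal basis of a random subspace.
    Q, _ = np.linalg.qr(rng.standard_normal((X.shape[1], k)))
    return float(np.sum((Xc - Xc @ Q @ Q.T) ** 2))


# Synthetic data (an assumption for illustration): 200 samples, 10 correlated features.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10)) @ rng.standard_normal((10, 10))

k = 3
pca_err = pca_projection_error(X, k)
rand_errs = [random_projection_error(X, k, rng) for _ in range(20)]

print("PCA reconstruction error:", pca_err)
print("Best of 20 random rank-k projections:", min(rand_errs))
```

No random rank-K orthogonal projection should achieve a lower squared reconstruction error than the PCA projection. This says nothing about classification error, which is the gap the paper targets.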


