Maximally Correlated Principal Component Analysis

02/17/2017
by Soheil Feizi, et al.

In the era of big data, reducing data dimensionality is critical in many areas of science. The widely used Principal Component Analysis (PCA) addresses this problem by computing a low-dimensional embedding that maximally explains the variance of the data. However, PCA has two major weaknesses: it only considers linear correlations among variables (features), and it is not suitable for categorical data. We resolve these issues by proposing Maximally Correlated Principal Component Analysis (MCPCA). MCPCA computes transformations of the variables whose covariance matrix has the largest Ky Fan norm. The variable transformations are unknown, can be nonlinear, and are computed in an optimization. MCPCA can also be viewed as a multivariate extension of Maximal Correlation. For jointly Gaussian variables, we show that the covariance matrix corresponding to the identity (or the negative of the identity) transformations majorizes the covariance matrices of non-identity functions. Using this result, we characterize global MCPCA optimizers for nonlinear functions of jointly Gaussian variables under every rank constraint. For categorical variables, we characterize global MCPCA optimizers under the rank-one constraint via the leading eigenvector of a matrix computed from pairwise joint distributions. For a general rank constraint, we propose a block coordinate descent algorithm and show its convergence to stationary points of the MCPCA optimization. We compare MCPCA with PCA and other state-of-the-art dimensionality reduction methods, including Isomap, LLE, multilayer autoencoders (neural networks), kernel PCA, probabilistic PCA, and diffusion maps, on several synthetic and real datasets, and show that MCPCA consistently provides improved performance compared to these methods.
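The objective the abstract describes, the Ky Fan norm (sum of the top-q eigenvalues) of the covariance matrix of transformed variables, can be illustrated with a small sketch. This is a toy illustration of why nonlinear transformations can enlarge that norm, not the paper's algorithm; the synthetic data, the cube-root distortion, and all function names below are assumptions made for the example.

```python
import numpy as np

def ky_fan_norm(K, q):
    """Sum of the q largest eigenvalues of a symmetric matrix K."""
    eigvals = np.linalg.eigvalsh(K)  # returned in ascending order
    return float(np.sum(eigvals[-q:]))

rng = np.random.default_rng(0)
n, p, q = 20000, 3, 2

# Latent jointly Gaussian variables, observed through an elementwise
# nonlinear distortion (cube root), which attenuates linear correlations.
Z = rng.multivariate_normal(mean=np.zeros(p),
                            cov=[[1.0, 0.8, 0.6],
                                 [0.8, 1.0, 0.8],
                                 [0.6, 0.8, 1.0]],
                            size=n)
X = np.cbrt(Z)

def standardize(A):
    return (A - A.mean(axis=0)) / A.std(axis=0)

# Ky Fan q-norm of the covariance of the raw variables vs. of the
# cube-transformed variables (x -> x^3 inverts the distortion here).
cov_raw = np.cov(standardize(X), rowvar=False)
cov_tr = np.cov(standardize(X**3), rowvar=False)

print(ky_fan_norm(cov_raw, q))  # attenuated by the nonlinearity
print(ky_fan_norm(cov_tr, q))   # typically larger: structure restored
```

For standardized data and identity transformations this quantity is exactly the variance explained by the top q principal components, so searching over transformations (as MCPCA does) can only match or exceed plain PCA's objective.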

