FAST-PCA: A Fast and Exact Algorithm for Distributed Principal Component Analysis

08/27/2021
by Arpita Gang, et al.

Principal Component Analysis (PCA) is a fundamental data preprocessing tool in machine learning. While PCA is often regarded solely as a dimension reduction technique, its purpose is actually two-fold: dimension reduction and feature learning. Furthermore, the sheer dimensionality and sample size of modern datasets have rendered centralized PCA solutions unusable. In that vein, this paper reconsiders the problem of PCA when data samples are distributed across nodes in an arbitrarily connected network. While a few solutions for distributed PCA exist, they either overlook the feature learning part of the purpose, incur communication overhead that makes them inefficient, and/or lack exact convergence guarantees. To address these issues, this paper proposes a distributed PCA algorithm called FAST-PCA (Fast and exAct diSTributed PCA). The proposed algorithm is efficient in terms of communication and can be proved to converge linearly and exactly to the principal components, which yield dimension reduction as well as uncorrelated features. Our claims are further supported by experimental results.
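The distinction between dimension reduction and feature learning can be illustrated with a small numerical sketch. It is not the FAST-PCA algorithm: it uses a plain centralized eigendecomposition as a reference and only shows why converging to the principal components themselves, rather than to an arbitrary basis of the principal subspace, matters. The synthetic data, the three-node split, and all variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 3 nodes, each holding 200 samples in 5 dimensions.
# Samples are drawn zero-mean, so no explicit centering is applied here.
d, k, n_nodes, n_local = 5, 2, 3, 200
scales = np.array([5.0, 3.0, 1.0, 0.5, 0.1])  # distinct variances per direction
mixing, _ = np.linalg.qr(rng.standard_normal((d, d)))  # random orthogonal mixing
local_data = [(rng.standard_normal((n_local, d)) * scales) @ mixing.T
              for _ in range(n_nodes)]

# The global sample covariance is the sum of per-node scatter matrices,
# which is what makes a distributed formulation of PCA possible at all.
n_total = n_nodes * n_local
C = sum(Y.T @ Y for Y in local_data) / n_total
stacked = np.vstack(local_data)
C_central = stacked.T @ stacked / n_total

# Reference solution: top-k eigenvectors of C (eigh returns ascending order).
eigvals, eigvecs = np.linalg.eigh(C)
X = eigvecs[:, ::-1][:, :k]

# Another orthonormal basis of the SAME subspace, rotated by 45 degrees.
theta = np.pi / 4
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
X_rot = X @ R

def feature_cov(basis):
    """Second-moment matrix of the k projected features."""
    return basis.T @ C @ basis

def reconstruction_error(basis):
    """Mean squared error of projecting samples onto span(basis)."""
    return np.trace(C) - np.trace(basis.T @ C @ basis)

print("reconstruction error, eigenvectors:", reconstruction_error(X))
print("reconstruction error, rotated basis:", reconstruction_error(X_rot))
print("feature cross-correlation, eigenvectors:", feature_cov(X)[0, 1])
print("feature cross-correlation, rotated basis:", feature_cov(X_rot)[0, 1])
```

Both bases give the same reconstruction error, so they are equally good for dimension reduction, but only the eigenvector basis produces features whose cross-correlation vanishes. A method that recovers only the subspace therefore does not deliver the feature learning half of PCA.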

research
01/05/2021

A Linearly Convergent Algorithm for Distributed Principal Component Analysis

Principal Component Analysis (PCA) is the workhorse tool for dimensional...
research
05/06/2020

A Communication-Efficient Distributed Algorithm for Kernel Principal Component Analysis

Principal Component Analysis (PCA) is a fundamental technology in machin...
research
03/31/2021

Dimension reduction of open-high-low-close data in candlestick chart based on pseudo-PCA

The (open-high-low-close) OHLC data is the most common data form in the ...
research
01/31/2017

Representation of big data by dimension reduction

Suppose the data consist of a set S of points x_j, 1 ≤ j ≤ J, distribute...
research
06/12/2023

FADI: Fast Distributed Principal Component Analysis With High Accuracy for Large-Scale Federated Data

Principal component analysis (PCA) is one of the most popular methods fo...
research
10/16/2018

Fast Randomized PCA for Sparse Data

Principal component analysis (PCA) is widely used for dimension reductio...
research
04/23/2021

Positive Definite Kernels, Algorithms, Frames, and Approximations

The main purpose of our paper is a new approach to design of algorithms ...
