A Communication-Efficient Distributed Algorithm for Kernel Principal Component Analysis

05/06/2020
by Fan He, et al.

Principal Component Analysis (PCA) is a fundamental technique in machine learning. Many large, high-dimensional datasets are now acquired in a distributed manner, which precludes the use of centralized PCA because of its high communication cost and privacy risk. Many distributed PCA algorithms have therefore been proposed, but most of them focus on the linear case. To extract non-linear features efficiently, this brief proposes a communication-efficient distributed kernel PCA algorithm in which linear and RBF kernels are applied. The key is to estimate the global empirical kernel matrix from the eigenvectors of the local kernel matrices. The approximation error of the estimators is analyzed theoretically for both linear and RBF kernels. The analysis suggests that when the eigenvalues decay quickly, which is common for RBF kernels, the proposed algorithm gives high-quality results at low communication cost. Simulation experiments verify the theoretical analysis, and experiments on the GSE2187 dataset show the effectiveness of the proposed algorithm.
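
The idea of estimating a global kernel matrix from local eigenvectors can be illustrated with a minimal sketch for the linear kernel. This is not the paper's algorithm. It assumes that the features (columns) of the data are split across nodes, so that the global linear kernel matrix is the sum of the local ones. Each node sends only its top-k eigenpairs, and a central node sums the resulting low-rank approximations. The function names, the synthetic data, and the partitioning scheme are all assumptions made for illustration.

```python
import numpy as np

def local_summary(X_j, k):
    """Top-k eigenpairs of the local linear kernel matrix K_j = X_j X_j^T.

    X_j holds the n samples restricted to this node's features (assumed layout).
    """
    K_j = X_j @ X_j.T                       # n x n local kernel matrix
    vals, vecs = np.linalg.eigh(K_j)        # eigenvalues in ascending order
    idx = np.argsort(vals)[::-1][:k]        # indices of the k largest
    return vals[idx], vecs[:, idx]

def estimate_global_kernel(summaries):
    """Sum the rank-k reconstructions V diag(lambda) V^T sent by each node."""
    n = summaries[0][1].shape[0]
    K_hat = np.zeros((n, n))
    for vals, vecs in summaries:
        K_hat += (vecs * vals) @ vecs.T
    return K_hat

def relative_error(X_blocks, k):
    """Frobenius-norm error of the estimate relative to the exact kernel."""
    K = sum(X @ X.T for X in X_blocks)      # exact global linear kernel
    K_hat = estimate_global_kernel([local_summary(X, k) for X in X_blocks])
    return np.linalg.norm(K - K_hat) / np.linalg.norm(K)

# Synthetic data (hypothetical): 4 nodes, 20 features each, 50 samples.
# Column scales decay geometrically so local eigenvalues decay quickly.
rng = np.random.default_rng(0)
n, d_local, nodes = 50, 20, 4
scales = 0.5 ** np.arange(d_local)
X_blocks = [rng.standard_normal((n, d_local)) * scales for _ in range(nodes)]

for k in (1, 3, 5, d_local):
    print(k, relative_error(X_blocks, k))
```

In this sketch each node transmits k eigenvectors of length n plus k eigenvalues, rather than a full n x n matrix, which is where the communication saving would come from when a small k already captures most of the spectrum.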

research
08/27/2021

FAST-PCA: A Fast and Exact Algorithm for Distributed Principal Component Analysis

Principal Component Analysis (PCA) is a fundamental data preprocessing t...
research
09/05/2020

Communication-efficient distributed eigenspace estimation

Distributed computing is a standard way to scale up machine learning and...
research
03/29/2023

Improvement of variables interpretability in kernel PCA

Kernel methods have been proven to be a powerful tool for the integratio...
research
11/30/2021

HyperPCA: a Powerful Tool to Extract Elemental Maps from Noisy Data Obtained in LIBS Mapping of Materials

Laser-induced breakdown spectroscopy is a preferred technique for fast a...
research
02/16/2018

Inferring relevant features: from QFT to PCA

In many-body physics, renormalization techniques are used to extract asp...
research
04/23/2021

Positive Definite Kernels, Algorithms, Frames, and Approximations

The main purpose of our paper is a new approach to design of algorithms ...
research
04/29/2022

Distributed Learning for Principle Eigenspaces without Moment Constraints

Distributed Principal Component Analysis (PCA) has been studied to deal ...
