Federated PCA with Adaptive Rank Estimation

07/18/2019
by Andreas Grammenos, et al.

In many online machine learning and data science tasks, such as data summarisation and feature compression, d-dimensional vectors are distributed across a large number of clients in a decentralised network and collected in a streaming fashion. This is increasingly common in modern applications due to the sheer volume of data generated and the clients' constrained resources. In this setting, some clients are required to compute an update to a centralised target model independently using local data, while other clients aggregate these updates with a low-complexity merging algorithm. However, clients with limited storage might not be able to store all of the data samples if d is large, nor run procedures that require Ω(d^2) storage, such as Principal Component Analysis, Subspace Tracking, or general Feature Correlation. In this work, we present a novel federated algorithm for PCA that adaptively estimates the rank r of the dataset and computes its r leading principal components when only O(dr) memory is available. Because of this adaptability, r does not have to be supplied as a fixed hyper-parameter, which is beneficial when the underlying data distribution is not known in advance, as in a streaming setting. Numerical simulations show that, while using limited memory, our algorithm exhibits state-of-the-art performance that closely matches or outperforms traditional non-federated algorithms, and that, in the absence of communication latency, it exhibits attractive horizontal scalability.
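
To make the abstract's ingredients concrete, the following is a minimal sketch in Python with NumPy of one standard way to combine local subspace estimates in O(dr) memory. It is not the authors' algorithm, whose details are not given on this page. The function names `local_subspace`, `merge_subspaces`, and `estimate_rank`, and the energy-threshold rule used to pick r, are hypothetical and chosen only for illustration.

```python
import numpy as np


def local_subspace(Y, r):
    """Top-r left singular vectors and singular values of a d x n data block.

    Illustrative only: a memory-limited client would build this estimate
    incrementally rather than from a full SVD of its data.
    """
    U, s, _ = np.linalg.svd(Y, full_matrices=False)
    return U[:, :r], s[:r]


def merge_subspaces(U1, s1, U2, s2, r):
    """Merge two rank-r estimates while holding only a d x 2r matrix.

    Uses the identity (U1 S1)(U1 S1)^T + (U2 S2)(U2 S2)^T = Y1 Y1^T + Y2 Y2^T,
    which is exact when each local estimate captures its block fully.
    """
    stacked = np.hstack([U1 * s1, U2 * s2])  # scale columns, then concatenate
    U, s, _ = np.linalg.svd(stacked, full_matrices=False)
    return U[:, :r], s[:r]


def estimate_rank(s, energy=0.99):
    """Hypothetical rank rule: smallest r whose leading singular values
    capture at least `energy` of the total squared singular-value mass.
    Assumes `s` is sorted in descending order and is not all zeros.
    """
    cumulative = np.cumsum(s ** 2) / np.sum(s ** 2)
    return int(np.searchsorted(cumulative, energy) + 1)
```

In this sketch each client calls `local_subspace` on its own block, an aggregating client calls `merge_subspaces` on pairs of results, and `estimate_rank` is applied to the merged singular values so that r need not be fixed in advance. When the local estimates are truncated below the true rank, the merge is only an approximation.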

research
02/15/2018

History PCA: A New Algorithm for Streaming PCA

In this paper we propose a new algorithm for streaming principal compone...
research
04/27/2021

Pronto: Federated Task Scheduling

We present a federated, asynchronous, memory-limited algorithm for onlin...
research
06/28/2013

Memory Limited, Streaming PCA

We consider streaming, one-pass principal component analysis (PCA), in t...
research
01/11/2022

RFLBAT: A Robust Federated Learning Algorithm against Backdoor Attack

Federated learning (FL) is a distributed machine learning paradigm where...
research
03/31/2023

Scalable and Privacy-Preserving Federated Principal Component Analysis

Principal component analysis (PCA) is an essential algorithm for dimensi...
research
03/03/2022

Vertical Federated Principal Component Analysis and Its Kernel Extension on Feature-wise Distributed Data

Despite enormous research interest and rapid application of federated le...
research
09/05/2020

Communication-efficient distributed eigenspace estimation

Distributed computing is a standard way to scale up machine learning and...
