Scalable and Privacy-Preserving Federated Principal Component Analysis

03/31/2023
by David Froelicher, et al.

Principal component analysis (PCA) is an essential algorithm for dimensionality reduction in many data science domains. We address the problem of performing a federated PCA on private data distributed among multiple data providers while ensuring data confidentiality. Our solution, SF-PCA, is an end-to-end secure system that preserves the confidentiality of both the original data and all intermediate results in a passive-adversary model with up to all-but-one colluding parties. SF-PCA jointly leverages multiparty homomorphic encryption, interactive protocols, and edge computing to efficiently interleave computations on local cleartext data with operations on collectively encrypted data. SF-PCA obtains results as accurate as non-secure centralized solutions, independently of the data distribution among the parties. It scales linearly or better with the dataset dimensions and with the number of data providers. SF-PCA is more precise than existing approaches that approximate the solution by combining local analysis results, and between 3x and 250x faster than privacy-preserving alternatives based solely on secure multiparty computation or homomorphic encryption. Our work demonstrates the practical applicability of secure and federated PCA on private distributed datasets.
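
The claim that a federated PCA can match a centralized one regardless of how the data is split can be illustrated with a small cleartext sketch. This is not SF-PCA's protocol: it uses no encryption, shares intermediate statistics openly (which a secure system must not do), and swaps in plain power iteration on a pooled covariance matrix. All function and variable names are hypothetical. It only shows that per-party counts, column sums, and Gram matrices add up to exactly the statistics a centralized analysis would compute, even under a skewed, uneven split.

```python
import math
import random

def local_stats(rows):
    """Per-party sufficient statistics: row count, column sums, Gram matrix X^T X."""
    d = len(rows[0])
    col_sums = [sum(r[j] for r in rows) for j in range(d)]
    gram = [[sum(r[a] * r[b] for r in rows) for b in range(d)] for a in range(d)]
    return len(rows), col_sums, gram

def covariance_from_stats(stats):
    """Add up the per-party statistics and form the global (population) covariance."""
    d = len(stats[0][1])
    n = sum(s[0] for s in stats)
    mean = [sum(s[1][j] for s in stats) / n for j in range(d)]
    gram = [[sum(s[2][a][b] for s in stats) for b in range(d)] for a in range(d)]
    return [[gram[a][b] / n - mean[a] * mean[b] for b in range(d)] for a in range(d)]

def top_component(cov, iters=500):
    """Power iteration for the leading eigenvector of a symmetric matrix."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

# Synthetic data (an assumption for illustration): most variance lies along (1, 2, 0).
rng = random.Random(0)
rows = []
for _ in range(200):
    t = rng.gauss(0.0, 1.0)
    rows.append([t + 0.1 * rng.gauss(0, 1), 2 * t + 0.1 * rng.gauss(0, 1), 0.1 * rng.gauss(0, 1)])

# Deliberately skewed split: sort by the first feature, then cut into uneven parts.
rows.sort(key=lambda r: r[0])
parties = [rows[:10], rows[10:60], rows[60:]]

cov_central = covariance_from_stats([local_stats(rows)])
cov_federated = covariance_from_stats([local_stats(p) for p in parties])

pc_central = top_component(cov_central)
pc_federated = top_component(cov_federated)

# |cosine| between the two leading components; 1.0 means the same direction.
alignment = abs(sum(a * b for a, b in zip(pc_central, pc_federated)))
print(alignment)
```

Because the pooled statistics are sums, the split among parties affects the result only through floating-point rounding. Approaches that instead run PCA locally and merge the local components do not have this property, which is the accuracy gap the abstract refers to.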

