Near-Optimal Algorithms for Differentially-Private Principal Components

07/12/2012
by Kamalika Chaudhuri, et al.
Principal components analysis (PCA) is a standard tool for identifying good low-dimensional approximations to data in high dimension. Many data sets of interest contain private or sensitive information about individuals. Algorithms which operate on such data should be sensitive to the privacy risks in publishing their outputs. Differential privacy is a framework for developing tradeoffs between privacy and the utility of these outputs. In this paper we investigate the theory and empirical performance of differentially private approximations to PCA and propose a new method which explicitly optimizes the utility of the output. We show that the sample complexity of the proposed method differs from the existing procedure in the scaling with the data dimension, and that our method is nearly optimal in terms of this scaling. We furthermore illustrate our results, showing that on real data there is a large performance gap between the existing method and our method.
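To make the setting concrete, here is a minimal sketch of one generic way to privatize PCA: perturb the second-moment matrix with symmetric Gaussian noise, then take the top-k eigenvectors. The abstract does not describe either the existing procedure or the proposed method in detail, so this is not the authors' algorithm. The function names (`gaussian_sigma`, `noisy_top_k_subspace`, `captured_energy`), the assumption that every data row has Euclidean norm at most 1, and the use of the classic Gaussian-mechanism calibration (which gives an (epsilon, delta) guarantee for 0 < epsilon < 1 under add/remove-one-row neighbors) are all illustrative choices made here.

```python
import numpy as np


def gaussian_sigma(epsilon, delta, sensitivity=1.0):
    """Noise scale from the classic Gaussian mechanism (assumes 0 < epsilon < 1).

    Assumption: the L2 (Frobenius) sensitivity of X^T X is at most 1 when one
    row of norm <= 1 is added or removed.
    """
    return sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon


def noisy_top_k_subspace(X, k, sigma, rng):
    """Top-k eigenvectors of a noise-perturbed second-moment matrix.

    X     : (n, d) array, rows assumed to have Euclidean norm <= 1.
    sigma : standard deviation of the Gaussian noise on each entry.
    rng   : numpy.random.Generator used to draw the noise.
    """
    d = X.shape[1]
    A = X.T @ X  # unnormalized second-moment matrix

    # Draw noise for the upper triangle and mirror it so the
    # perturbation matrix is symmetric.
    noise = rng.normal(0.0, sigma, size=(d, d))
    E = np.triu(noise) + np.triu(noise, 1).T

    # eigh returns eigenvalues in ascending order, so reverse the columns.
    _, eigvecs = np.linalg.eigh(A + E)
    return eigvecs[:, ::-1][:, :k]


def captured_energy(X, V):
    """Utility of a subspace V: tr(V^T A V) with A = X^T X."""
    return float(np.trace(V.T @ (X.T @ X) @ V))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 10))
    # Scale rows so each has norm at most 1.
    X /= np.maximum(1.0, np.linalg.norm(X, axis=1, keepdims=True))

    sigma = gaussian_sigma(epsilon=0.5, delta=1e-5)
    V_private = noisy_top_k_subspace(X, k=3, sigma=sigma, rng=rng)
    V_exact = noisy_top_k_subspace(X, k=3, sigma=0.0, rng=rng)

    print("private energy:", captured_energy(X, V_private))
    print("exact energy:  ", captured_energy(X, V_exact))
```

Comparing the captured energy of the noisy subspace against the noise-free one is one simple way to see the privacy/utility tradeoff the abstract refers to. The amount of noise added grows with the number of matrix entries, which is one intuition for why the scaling with data dimension matters.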

research
05/26/2023

Differentially private low-dimensional representation of high-dimensional data

Differentially private synthetic data provide a powerful mechanism to en...
research
11/18/2015

Wishart Mechanism for Differentially Private Principal Components Analysis

We propose a new input perturbation mechanism for publishing a covarianc...
research
09/06/2019

Differentially Private Precision Matrix Estimation

In this paper, we study the problem of precision matrix estimation when ...
research
05/27/2022

DP-PCA: Statistically Optimal and Differentially Private PCA

We study the canonical statistical task of computing the principal compo...
research
04/26/2018

Distributed Differentially-Private Algorithms for Matrix and Tensor Factorization

In many signal processing and machine learning applications, datasets co...
research
07/01/2022

When Does Differentially Private Learning Not Suffer in High Dimensions?

Large pretrained models can be privately fine-tuned to achieve performan...
research
10/28/2019

Improved Differentially Private Decentralized Source Separation for fMRI Data

Blind source separation algorithms such as independent component analysi...
