M^2M: A general method to perform various data analysis tasks from a differentially private sketch

11/25/2022
by   Florimond Houssiau, et al.

Differential privacy is the standard privacy definition for performing analyses over sensitive data. Yet, its privacy budget bounds the number of tasks an analyst can perform with reasonable accuracy, which makes it challenging to deploy in practice. This can be alleviated by private sketching, where the dataset is compressed into a single noisy sketch vector which can be shared with the analysts and used to perform arbitrarily many analyses. However, the algorithms to perform specific tasks from sketches must be developed on a case-by-case basis, which is a major impediment to their use. In this paper, we introduce the generic moment-to-moment (M^2M) method to perform a wide range of data exploration tasks from a single private sketch. Among other things, this method can be used to estimate empirical moments of attributes, the covariance matrix, counting queries (including histograms), and regression models. Our method treats the sketching mechanism as a black-box operation, and can thus be applied to a wide variety of sketches from the literature, widening their ranges of applications without further engineering or privacy loss, and removing some of the technical barriers to the wider adoption of sketches for data exploration under differential privacy. We validate our method with data exploration tasks on artificial and real-world data, and show that it can be used to reliably estimate statistics and train classification models from private sketches.
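To make the idea of reusing a single noisy sketch concrete, here is a minimal illustrative sketch in Python. It is not the M^2M method from the paper, whose details are not given in this abstract. It assumes a simple histogram-style sketch (noisy bin counts with Laplace noise), a public value range, and add/remove-one-record adjacency. All function names and parameters are hypothetical. The point it shows is that the privacy budget is spent once when the sketch is built, and every later estimate (counts, moments) is post-processing of the same vector.

```python
import random

def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_sketch(values, lo, hi, n_bins, epsilon, rng):
    """Compress `values` into one noisy vector of bin counts (illustrative)."""
    width = (hi - lo) / n_bins
    counts = [0.0] * n_bins
    for v in values:
        v = min(max(v, lo), hi)                       # clamp to the public range
        idx = min(int((v - lo) / width), n_bins - 1)  # hi falls in the last bin
        counts[idx] += 1.0
    # Assumption: neighbouring datasets differ by adding/removing one record,
    # so the L1 sensitivity of the count vector is 1.
    return [c + laplace_noise(1.0 / epsilon, rng) for c in counts]

def bin_centers(lo, hi, n_bins):
    width = (hi - lo) / n_bins
    return [lo + (i + 0.5) * width for i in range(n_bins)]

def estimate_count(sketch, first_bin, last_bin):
    """Counting query over bins first_bin..last_bin (inclusive)."""
    return sum(sketch[first_bin:last_bin + 1])

def estimate_moment(sketch, centers, order):
    """Approximate empirical moment E[x^order] from the noisy counts."""
    total = max(sum(sketch), 1.0)  # guard against a non-positive noisy total
    return sum(c * x ** order for c, x in zip(sketch, centers)) / total

# Usage: the sketch is built once; all later queries reuse it.
rng = random.Random(0)
data = [rng.uniform(0.0, 1.0) for _ in range(5000)]
sketch = private_sketch(data, 0.0, 1.0, 20, epsilon=1.0, rng=rng)
centers = bin_centers(0.0, 1.0, 20)
mean_est = estimate_moment(sketch, centers, 1)
second_moment_est = estimate_moment(sketch, centers, 2)
lower_half_est = estimate_count(sketch, 0, 9)
```

A histogram is only one possible sketch, and it scales poorly with the number of attributes. The abstract's claim is that M^2M treats the sketching mechanism as a black box, so it would apply to other sketches from the literature as well.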


research
10/02/2022

Composition of Differential Privacy & Privacy Amplification by Subsampling

This chapter is meant to be part of the book "Differential Privacy for A...
research
06/21/2022

Differentially Private Maximal Information Coefficients

The Maximal Information Coefficient (MIC) is a powerful statistic to ide...
research
12/29/2017

Private Exploration Primitives for Data Cleaning

Data cleaning is the process of detecting and repairing inaccurate or co...
research
01/27/2022

Plume: Differential Privacy at Scale

Differential privacy has become the standard for private data analysis, ...
research
03/08/2021

Efficient Accuracy Prediction for Differentially Private Algorithms

Differential privacy is a strong mathematical notion of privacy. Still, ...
research
06/22/2020

Overlook: Differentially Private Exploratory Visualization for Big Data

Data exploration systems that provide differential privacy must manage a...
