Private measures, random walks, and synthetic data

04/20/2022
by March Boedihardjo, et al.

Differential privacy is a mathematical concept that provides an information-theoretic security guarantee. While differential privacy has emerged as a de facto standard for guaranteeing privacy in data sharing, the known mechanisms to achieve it come with some serious limitations. Utility guarantees are usually provided only for a fixed, a priori specified set of queries. Moreover, there are no utility guarantees for more complex, but very common, machine learning tasks such as clustering or classification. In this paper we overcome some of these limitations. Working with metric privacy, a powerful generalization of differential privacy, we develop a polynomial-time algorithm that creates a private measure from a data set. This private measure allows us to efficiently construct private synthetic data that are accurate for a wide range of statistical analysis tools. Moreover, we prove an asymptotically sharp min-max result for private measures and synthetic data for general compact metric spaces. A key ingredient in our construction is a new superregular random walk, whose joint distribution of steps is as regular as that of independent random variables, yet which deviates from the origin logarithmically slowly.
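
For readers unfamiliar with the term, metric privacy is commonly defined as follows. This is the standard textbook form, given here as a sketch: the abstract does not state the definition, and the symbols $\mathcal{M}$, $\alpha$, $d$, $\mathcal{X}$, and $S$ are notation assumed for illustration, not necessarily the paper's.

```latex
% Assumed notation: \mathcal{M} is a randomized mechanism taking inputs from
% a metric space (\mathcal{X}, d), and \alpha > 0 is the privacy parameter.
% \mathcal{M} is \alpha-metrically private if, for all inputs
% x, x' \in \mathcal{X} and every measurable set S of outputs,
\[
  \Pr\bigl[\mathcal{M}(x) \in S\bigr]
  \;\le\;
  e^{\alpha\, d(x, x')}\,
  \Pr\bigl[\mathcal{M}(x') \in S\bigr].
\]
% Taking d to be the Hamming distance between data sets recovers ordinary
% \alpha-differential privacy: for data sets differing in one record,
% d(x, x') = 1 and the bound becomes e^{\alpha}.
```

Under this definition, inputs that are close in the metric must produce nearly indistinguishable output distributions, while distant inputs may be told apart more easily.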


