New Oracle-Efficient Algorithms for Private Synthetic Data Release

07/10/2020
by Giuseppe Vietri, et al.

We present three new algorithms for constructing differentially private synthetic data: a sanitized version of a sensitive dataset that approximately preserves the answers to a large collection of statistical queries. All three algorithms are oracle-efficient, meaning that they run efficiently when given access to an optimization oracle. Such an oracle can be implemented with many existing (non-private) optimization tools, such as sophisticated integer program solvers. Although the accuracy of the synthetic data depends on how well the oracle optimizes, the algorithms satisfy differential privacy even in the worst case, including when the oracle fails. For all three algorithms, we provide theoretical guarantees for both accuracy and privacy. Through empirical evaluation, we demonstrate that our methods scale well with both the dimensionality of the data and the number of queries. Compared to the state-of-the-art High-Dimensional Matrix Mechanism (HDMM), our algorithms provide better accuracy in the large-workload, high-privacy regime (that is, when the privacy loss ε is small).
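To make "approximately preserves the answers to a large collection of statistical queries" concrete, the sketch below measures the largest gap between query answers on a real dataset and on a synthetic one. It is only an illustration of the accuracy notion, not any of the paper's three algorithms, and it adds no privacy protection. The function names, the binary-feature encoding, the choice of 2-way conjunction queries, and the toy data are all assumptions made for this example.

```python
from itertools import combinations

def marginal_queries(num_features, width=2):
    """Build all `width`-way conjunction queries over binary features (assumed workload)."""
    return list(combinations(range(num_features), width))

def answer(dataset, query):
    """Fraction of rows in which every feature named by the query equals 1."""
    hits = sum(1 for row in dataset if all(row[j] == 1 for j in query))
    return hits / len(dataset)

def max_error(real, synthetic, queries):
    """Largest absolute gap between real and synthetic answers over the workload."""
    return max(abs(answer(real, q) - answer(synthetic, q)) for q in queries)

# Toy data, invented for illustration only.
real = [(1, 1, 0), (1, 0, 0), (1, 1, 1), (0, 1, 1)]
synthetic = [(1, 1, 0), (1, 1, 1)]

queries = marginal_queries(num_features=3, width=2)
error = max_error(real, synthetic, queries)
print(error)
```

A synthetic dataset is useful for a workload when this maximum error is small. A private algorithm would additionally need to bound how much any single row of `real` can influence the released `synthetic` data, which this sketch does not attempt.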


research
06/06/2018

d_X-Private Mechanisms for Linear Queries

Differential Privacy is one of the strongest privacy guarantees, which a...
research
06/14/2021

Iterative Methods for Private Synthetic Data: Unifying Framework and New Methods

We study private synthetic data generation for query release, where the ...
research
11/19/2018

How to Use Heuristics for Differential Privacy

We develop theory for using heuristics to solve computationally hard pro...
research
03/11/2021

Differentially Private Query Release Through Adaptive Projection

We propose, implement, and evaluate a new algorithm for releasing answer...
research
07/13/2021

Covariance's Loss is Privacy's Gain: Computationally Efficient, Private and Accurate Synthetic Data

The protection of private information is of vital importance in data-dri...
research
06/05/2023

Generating Private Synthetic Data with Genetic Algorithms

We study the problem of efficiently generating differentially private sy...
research
02/17/2021

Leveraging Public Data for Practical Private Query Release

In many statistical problems, incorporating priors can significantly imp...
