Generating Private Synthetic Data with Genetic Algorithms

06/05/2023
by   Terrance Liu, et al.

We study the problem of efficiently generating differentially private synthetic data that approximate the statistical properties of an underlying sensitive dataset. In recent years, a growing line of work has approached this problem using first-order optimization techniques. However, such techniques can only optimize differentiable objectives, which severely limits the types of analyses that can be conducted. For example, first-order mechanisms have been successful primarily in approximating statistical queries in the form of marginals over discrete data domains. In some cases, one can circumvent this issue by relaxing the task's objective so that it remains differentiable. However, even when such a relaxation is possible, it imposes a fundamental limitation: every modification to the minimization problem becomes an additional source of error. Therefore, we propose Private-GSD, a private genetic algorithm based on zeroth-order optimization heuristics that do not require modifying the original objective. As a result, it avoids the aforementioned limitations of first-order optimization. We empirically evaluate Private-GSD against baseline algorithms on data derived from the American Community Survey across a variety of statistics (otherwise known as statistical queries), for both discrete and real-valued attributes. We show that Private-GSD outperforms state-of-the-art methods on non-differentiable queries while matching their accuracy in approximating differentiable ones.
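
As a rough illustration of the zeroth-order idea in the abstract, the sketch below evolves a synthetic dataset so that its answers to threshold queries match noisy answers computed from the sensitive data. It is not the authors' Private-GSD implementation: the function names, the mutation and crossover rules, the elitist selection, and the assumption that attributes lie in [0, 1] are all hypothetical choices made for this example, and the noise scale `sigma` is left uncalibrated (a real mechanism must set it from the query sensitivity and the privacy budget).

```python
import numpy as np

def threshold_queries(data, thresholds):
    # Fraction of rows with column j <= t for each (j, t).
    # These indicator-based statistics are not differentiable in the data.
    return np.array([(data[:, j] <= t).mean() for j, t in thresholds])

def private_answers(data, thresholds, sigma, rng):
    # Gaussian noise added to each answer. ASSUMPTION: sigma is supplied by
    # the caller; calibrating it to a privacy budget is not done here.
    true = threshold_queries(data, thresholds)
    return true + rng.normal(0.0, sigma, size=true.shape)

def loss(synth, thresholds, target):
    # Squared error between synthetic answers and the noisy targets.
    return float(np.sum((threshold_queries(synth, thresholds) - target) ** 2))

def mutate(parent, rng):
    # Resample one randomly chosen cell uniformly from [0, 1).
    child = parent.copy()
    i = rng.integers(child.shape[0])
    j = rng.integers(child.shape[1])
    child[i, j] = rng.random()
    return child

def crossover(a, b, rng):
    # Overwrite one random row of a with one random row of b.
    child = a.copy()
    child[rng.integers(child.shape[0])] = b[rng.integers(b.shape[0])]
    return child

def evolve(target, thresholds, n_rows, n_cols,
           generations=300, pop_size=20, seed=0):
    rng = np.random.default_rng(seed)
    population = [rng.random((n_rows, n_cols)) for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the best quarter unchanged (elitism), refill with offspring.
        population.sort(key=lambda s: loss(s, thresholds, target))
        elite = population[: pop_size // 4]
        children = []
        while len(elite) + len(children) < pop_size:
            a = elite[rng.integers(len(elite))]
            b = elite[rng.integers(len(elite))]
            if rng.random() < 0.5:
                children.append(mutate(a, rng))
            else:
                children.append(crossover(a, b, rng))
        population = elite + children
    return min(population, key=lambda s: loss(s, thresholds, target))
```

Only the noisy answers returned by `private_answers` are passed to `evolve`, so the search itself never touches the sensitive data. Because the loop only evaluates the loss and never differentiates it, the threshold queries can be used directly without a smooth relaxation.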


research
06/13/2022

Private Synthetic Data with Hierarchical Structure

We study the problem of differentially private synthetic data generation...
research
06/13/2023

Continual Release of Differentially Private Synthetic Data

Motivated by privacy concerns in long-term longitudinal studies in medic...
research
01/29/2022

AIM: An Adaptive and Iterative Mechanism for Differentially Private Synthetic Data

We propose AIM, a novel algorithm for differentially private synthetic d...
research
07/10/2020

New Oracle-Efficient Algorithms for Private Synthetic Data Release

We present three new algorithms for constructing differentially private ...
research
09/15/2023

DP-PQD: Privately Detecting Per-Query Gaps In Synthetic Data Generated By Black-Box Mechanisms

Synthetic data generation methods, and in particular, private synthetic ...
research
06/14/2021

Iterative Methods for Private Synthetic Data: Unifying Framework and New Methods

We study private synthetic data generation for query release, where the ...
