Winning the NIST Contest: A scalable and general approach to differentially private synthetic data

08/11/2021
by Ryan McKenna, et al.

We propose a general approach for differentially private synthetic data generation that consists of three steps: (1) select a collection of low-dimensional marginals, (2) measure those marginals with a noise addition mechanism, and (3) generate synthetic data that preserves the measured marginals well. Central to this approach is Private-PGM, a post-processing method that estimates a high-dimensional data distribution from noisy measurements of its marginals. We present two mechanisms, NIST-MST and MST, that are instances of this general approach. NIST-MST was the winning mechanism in the 2018 NIST differential privacy synthetic data competition, and MST is a new mechanism that works in more general settings while still performing comparably to NIST-MST. We believe our general approach should be of broad interest and can be adopted in future mechanisms for synthetic data generation.
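To make the three steps concrete, here is a minimal toy sketch in plain Python. It is not the NIST-MST or MST mechanism and does not use Private-PGM. The marginal in step (1) is fixed by hand rather than selected privately, Gaussian noise is assumed as one possible noise addition mechanism for step (2), and step (3) is replaced by a naive clip-and-normalize sampler over a single marginal. The function names, the toy domain, and the value of `sigma` are all hypothetical, and calibrating `sigma` to a privacy budget is not shown.

```python
import random
from collections import Counter
from itertools import product

def measure_marginal(rows, cols, domain, sigma, rng):
    """Step 2 (sketch): count every value combination of `cols`,
    then add independent Gaussian noise of scale `sigma` to each cell."""
    counts = Counter(tuple(row[c] for c in cols) for row in rows)
    cells = product(*(range(domain[c]) for c in cols))
    return {cell: counts[cell] + rng.gauss(0.0, sigma) for cell in cells}

def sample_from_marginal(noisy, n, rng):
    """Step 3 (naive stand-in, NOT Private-PGM): clip negative counts,
    use the rest as sampling weights, and draw n synthetic records."""
    cells = list(noisy)
    weights = [max(v, 0.0) for v in noisy.values()]
    if sum(weights) == 0:  # every cell was clipped: fall back to uniform
        weights = [1.0] * len(cells)
    return rng.choices(cells, weights=weights, k=n)

rng = random.Random(0)
domain = [3, 4, 2]  # hypothetical number of values per attribute
rows = [tuple(rng.randrange(d) for d in domain) for _ in range(1000)]

cols = (0, 1)  # Step 1 (assumption): marginal chosen by hand
noisy = measure_marginal(rows, cols, domain, sigma=5.0, rng=rng)
synthetic = sample_from_marginal(noisy, n=500, rng=rng)
```

The sampler above only reproduces the one measured marginal. Combining many noisy marginals into a single consistent high-dimensional distribution is the role the abstract assigns to Private-PGM.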
