PrivSyn: Differentially Private Data Synthesis

12/30/2020
by   Zhikun Zhang, et al.

In differential privacy (DP), a challenging problem is to generate synthetic datasets that efficiently capture the useful information in the private data. A synthetic dataset lets any task be performed without privacy concerns and without modifying existing algorithms. In this paper, we present PrivSyn, the first automatic synthetic data generation method that can handle general tabular datasets (with 100 attributes and domain size >2^500). PrivSyn is composed of a new method to automatically and privately identify correlations in the data, and a novel method to generate sample data from a dense graphical model. We extensively evaluate different methods on multiple datasets to demonstrate the performance of our method.
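
To give a sense of the scale quoted above, the following minimal sketch computes the domain size of a hypothetical table with 100 attributes of 32 values each, which comes to exactly 2^500, and compares it with the total number of cells across all pairwise marginals. The attribute cardinalities and both function names are illustrative assumptions, not taken from the paper, and the pairwise-marginal comparison is a general illustration of why low-dimensional summaries are tractable rather than a description of how PrivSyn works.

```python
import math


def domain_size(cardinalities):
    """Number of distinct possible records: product of per-attribute cardinalities."""
    return math.prod(cardinalities)


def pairwise_marginal_cells(cardinalities):
    """Total number of cells summed over every 2-way marginal table."""
    total = 0
    n = len(cardinalities)
    for i in range(n):
        for j in range(i + 1, n):
            total += cardinalities[i] * cardinalities[j]
    return total


# Hypothetical schema: 100 attributes, each taking 32 distinct values.
cards = [32] * 100

full = domain_size(cards)
# For an exact power of two, bit_length() - 1 gives the exponent.
print("log2(domain size):", full.bit_length() - 1)               # 500
print("cells in all 2-way marginals:", pairwise_marginal_cells(cards))  # 5068800
```

A full histogram over such a domain cannot be materialized, whereas the pairwise marginals together hold only a few million cells.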

01/27/2020

DP-CGAN: Differentially Private Synthetic Data and Label Generation

Generative Adversarial Networks (GANs) are one of the well-known models ...
05/23/2018

pMSE Mechanism: Differentially Private Synthetic Data with Maximal Distributional Similarity

We propose a method for the release of differentially private synthetic ...
12/31/2020

Kamino: Constraint-Aware Differentially Private Data Synthesis

Organizations are increasingly relying on data to support decisions. Whe...
08/11/2021

Winning the NIST Contest: A scalable and general approach to differentially private synthetic data

We propose a general approach for differentially private synthetic data ...
06/14/2021

Iterative Methods for Private Synthetic Data: Unifying Framework and New Methods

We study private synthetic data generation for query release, where the ...
09/03/2021

Privacy of synthetic data: a statistical framework

Privacy-preserving data analysis is emerging as a challenging problem wi...
12/22/2020

Differentially Private Synthetic Medical Data Generation using Convolutional GANs

Deep learning models have demonstrated superior performance in several a...