Kamino: Constraint-Aware Differentially Private Data Synthesis

by Chang Ge, et al.

Organizations increasingly rely on data to support decisions. When data contains private and sensitive information, the data owner often wants to publish a synthetic database instance that is as useful as the true data while protecting the privacy of individual records. Existing differentially private data synthesis methods aim to generate data that is useful for specific applications, but they fail to preserve one of the most fundamental properties of structured data: the underlying correlations and dependencies among tuples and attributes (i.e., the structure of the data). This structure is often expressed as integrity and schema constraints, or as a probabilistic generative process. As a result, the synthesized data is not useful for downstream tasks that require this structure to be preserved. This work presents Kamino, a data synthesis system that ensures differential privacy and preserves the structure and correlations present in the original dataset. Kamino takes as input a database instance, along with its schema (including integrity constraints), and produces a synthetic database instance with differential privacy and structure preservation guarantees. We empirically show that, while preserving the structure of the data, Kamino achieves comparable or even better usefulness than state-of-the-art differentially private data synthesis methods when training classification models and answering marginal queries.
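To make the notion of structure violation concrete, the following minimal sketch counts violations of one kind of integrity constraint, a functional dependency, in a small table. It is an illustration only and is not Kamino's algorithm: the function name `fd_violations`, the `zip -> city` dependency, and the toy rows are all assumptions introduced here.

```python
from collections import defaultdict
from itertools import combinations


def fd_violations(rows, lhs, rhs):
    """Count tuple pairs that agree on all `lhs` attributes but differ on `rhs`.

    Hypothetical helper for illustration; not part of Kamino.
    """
    # Group the rhs values by the lhs key.
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[a] for a in lhs)
        groups[key].append(row[rhs])

    # Every pair within a group with different rhs values is one violation.
    violations = 0
    for values in groups.values():
        for a, b in combinations(values, 2):
            if a != b:
                violations += 1
    return violations


# Toy "true" data (assumed) that satisfies zip -> city.
true_rows = [
    {"zip": "10001", "city": "New York"},
    {"zip": "10001", "city": "New York"},
    {"zip": "60601", "city": "Chicago"},
]

# Toy synthetic data (assumed) where attributes were sampled without
# regard to the dependency, so one pair of tuples conflicts.
synthetic_rows = [
    {"zip": "10001", "city": "New York"},
    {"zip": "10001", "city": "Chicago"},
    {"zip": "60601", "city": "Chicago"},
]

true_count = fd_violations(true_rows, ["zip"], "city")
synthetic_count = fd_violations(synthetic_rows, ["zip"], "city")
```

A synthesizer that treats tuples as independent samples can match per-attribute statistics and still produce a nonzero violation count like the one in `synthetic_rows`, which is the kind of structural loss the abstract describes.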




Differentially Private Data Releasing for Smooth Queries with Synthetic Database Output

We consider accurately answering smooth queries while preserving differe...

Learning Differentially Private Probabilistic Models for Privacy-Preserving Image Generation

A number of deep models trained on high-quality and valuable images have...

PrivSyn: Differentially Private Data Synthesis

In differential privacy (DP), a challenging problem is to generate synth...

PrivLava: Synthesizing Relational Data with Foreign Keys under Differential Privacy

Answering database queries while preserving privacy is an important prob...

Private Algorithms Can Always Be Extended

We consider the following fundamental question on ϵ-differential privacy...

A Differentially Private Multi-Output Deep Generative Networks Approach For Activity Diary Synthesis

In this work, we develop a privacy-by-design generative model for synthe...

Differentially Private Controller Synthesis With Metric Temporal Logic Specifications

Privacy is an important concern in various multiagent systems in which d...
