Bayesian Pseudo Posterior Mechanism under Differential Privacy

09/25/2019
by Terrance D. Savitsky, et al.

We propose a Bayesian pseudo posterior mechanism to generate record-level synthetic datasets with a differential privacy (DP) guarantee from any proposed synthesizer model. The pseudo posterior mechanism employs a data record-indexed, risk-based weight vector with weights ∈ [0, 1] to surgically downweight high-risk records for the generation and release of record-level synthetic data. The pseudo posterior synthesizer constructs the weights from Lipschitz bounds on the log-likelihood of each data record. This provides a practical, general formulation for using weights based on record-level sensitivities, which we show achieves dramatic improvements in the DP guarantee as compared to the unweighted, non-private synthesizer. We compute a local sensitivity specific to our Consumer Expenditure Surveys (CE) dataset for family income, published by the U.S. Bureau of Labor Statistics, and reveal mild conditions that guarantee its contraction to a global sensitivity result over all x ∈ X. We show that utility is better preserved for our pseudo posterior mechanism as compared to the exponential mechanism (EM) estimated on the same non-private synthesizer. Our results may be applied to any synthesizing mechanism envisioned by the data analyst in a computationally tractable way that only involves estimation of a pseudo posterior distribution for θ, unlike recent approaches that use naturally-bounded utility functions under application of the EM.
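The weighting idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and everything beyond what the abstract states is an assumption: the helper names are hypothetical, the per-record Lipschitz bound is approximated by the largest absolute log-likelihood over a finite set of posterior draws of θ, and the bounds are mapped to weights in [0, 1] by a simple linear rescaling. The pseudo log-likelihood raises each record's likelihood contribution to the power of its weight, which on the log scale is a weighted sum.

```python
# Illustrative sketch only; helper names and the linear weight mapping
# are assumptions, not taken from the paper.

def record_lipschitz_bounds(loglik):
    """loglik[s][i] is the log-likelihood of record i under posterior draw s.

    Approximates each record's bound by the largest absolute
    log-likelihood seen across the supplied draws.
    """
    n_records = len(loglik[0])
    return [max(abs(row[i]) for row in loglik) for i in range(n_records)]


def risk_weights(bounds):
    """Map bounds linearly to weights in [0, 1]; higher risk gets a lower weight."""
    lo, hi = min(bounds), max(bounds)
    if hi == lo:
        # All records carry the same risk, so no record is downweighted.
        return [1.0] * len(bounds)
    return [1.0 - (b - lo) / (hi - lo) for b in bounds]


def weighted_bound(bounds, weights):
    """Largest weighted per-record bound under the pseudo likelihood."""
    return max(w * b for w, b in zip(weights, bounds))


def pseudo_log_likelihood(loglik_row, weights):
    """Weighted sum of per-record log-likelihoods for one value of theta."""
    return sum(w * l for w, l in zip(weights, loglik_row))


# Toy input: 2 posterior draws, 3 records; the third record is high-risk.
loglik = [[-1.0, -2.0, -8.0],
          [-1.5, -2.5, -6.0]]

bounds = record_lipschitz_bounds(loglik)
weights = risk_weights(bounds)
unweighted = max(bounds)
weighted = weighted_bound(bounds, weights)
```

In this toy input the third record has the largest bound and receives weight 0, so the largest weighted bound is smaller than the largest unweighted one. How such a bound translates into a formal DP guarantee is established in the full paper and is not reproduced here.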


