Exact and Efficient Bayesian Inference for Privacy Risk Quantification (Extended Version)

08/31/2023
by   Rasmus C. Rønneberg, et al.

Data analysis has high value for both commercial and research purposes. However, disclosing analysis results may pose severe privacy risks to individuals. Privug is a method to quantify the privacy risks of data analytics programs by analyzing their source code. The method uses probability distributions to model attacker knowledge and Bayesian inference to update that knowledge based on observable outputs. Currently, Privug uses Markov Chain Monte Carlo (MCMC) to perform inference, which is a flexible but approximate solution. This paper presents an exact Bayesian inference engine based on multivariate Gaussian distributions to accurately and efficiently quantify privacy risks. The inference engine is implemented for a subset of Python programs that can be modeled as multivariate Gaussian models. We evaluate the method by analyzing privacy risks in programs that release public statistics. The evaluation shows that our method accurately and efficiently analyzes privacy risks, and outperforms existing methods. Furthermore, we demonstrate the use of our engine to analyze the effect of differential privacy in public statistics.
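
To make the idea of exact inference on a multivariate Gaussian model concrete, the following is a minimal sketch, not the authors' engine. It assumes the attacker's prior over the private inputs is Gaussian and that the released statistic is a linear function of those inputs, optionally with added Gaussian noise. Under those assumptions the posterior is again Gaussian and follows in closed form from the standard conditioning formulas, so no sampling is needed. The function name `condition_on_linear_output`, the parameter names, and all numbers are hypothetical and chosen only for illustration.

```python
import numpy as np


def condition_on_linear_output(mu, sigma, a, observed, noise_var=0.0):
    """Exact posterior of x ~ N(mu, sigma) after observing y = a @ x + e.

    e is zero-mean Gaussian noise with variance noise_var (0.0 means the
    output is released exactly). Returns the posterior mean and covariance.
    """
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    a = np.asarray(a, dtype=float)

    cov_xy = sigma @ a                  # Cov(x, y), one entry per input
    var_y = a @ sigma @ a + noise_var   # Var(y), a scalar
    gain = cov_xy / var_y               # how strongly y informs each input

    post_mu = mu + gain * (observed - a @ mu)
    post_sigma = sigma - np.outer(gain, cov_xy)
    return post_mu, post_sigma


# Hypothetical attacker prior: four ages, each N(40, 10^2), independent.
n = 4
prior_mu = np.full(n, 40.0)
prior_sigma = np.eye(n) * 100.0

# The released statistic is the mean of the four ages (a linear map).
mean_weights = np.full(n, 1.0 / n)

# Attacker knowledge after seeing an exact released mean of 50.
exact_mu, exact_sigma = condition_on_linear_output(
    prior_mu, prior_sigma, mean_weights, observed=50.0
)

# Same release, but with Gaussian noise of variance 25 added to the output.
noisy_mu, noisy_sigma = condition_on_linear_output(
    prior_mu, prior_sigma, mean_weights, observed=50.0, noise_var=25.0
)

print("prior variance of one age:     ", prior_sigma[0, 0])
print("posterior variance (exact):    ", exact_sigma[0, 0])
print("posterior variance (with noise):", noisy_sigma[0, 0])
```

With these illustrative numbers, the attacker's variance about any single age drops from 100 to 75 after the exact release, and only to 87.5 when noise is added. Comparing prior and posterior uncertainty in this way is one simple reading of the privacy risk of a release; the noisy case only hints at how output perturbation limits what an attacker learns and is not a differential privacy analysis.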


research
11/17/2020

Privug: Quantifying Leakage using Probabilistic Programming for Privacy Risk Analysis

Disclosure of data analytics has important scientific and commercial jus...
research
03/10/2023

DP-Fast MH: Private, Fast, and Accurate Metropolis-Hastings for Large-Scale Bayesian Inference

Bayesian inference provides a principled framework for learning from com...
research
05/01/2021

Bayesian Inference of a Dependent Competing Risk Data

Analysis of competing risks data plays an important role in the lifetime...
research
11/18/2022

A Unified Framework for Quantifying Privacy Risk in Synthetic Data

Synthetic data is often presented as a method for sharing sensitive info...
research
05/20/2014

Gaussian Approximation of Collective Graphical Models

The Collective Graphical Model (CGM) models a population of independent ...
research
09/26/2018

Bayesian Data Synthesis and Disclosure Risk Quantification: An Application to the Consumer Expenditure Surveys

The release of synthetic data generated from a model estimated on the da...
research
07/15/2019

Confidentiality and linked data

Data providers such as government statistical agencies perform a balanci...
