On Consistent Bayesian Inference from Synthetic Data

05/26/2023
by Ossi Räisä, et al.
Generating synthetic data, with or without differential privacy, has attracted significant attention as a potential solution to the dilemma between making data easily available and protecting the privacy of data subjects. Several works have shown that consistency of downstream analyses from synthetic data, including accurate uncertainty estimation, requires accounting for the synthetic data generation. Very few methods do so, and most of them target frequentist analysis. In this paper, we study how to perform consistent Bayesian inference from synthetic data. We prove that mixing posterior samples obtained separately from multiple large synthetic datasets converges to the posterior of the downstream analysis under standard regularity conditions when the analyst's model is compatible with the data provider's model. We show experimentally that this works in practice, unlocking consistent Bayesian inference from synthetic data while reusing existing downstream analysis methods.
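
The mixing idea in the abstract can be illustrated with a toy sketch. This is not the authors' code: it assumes a conjugate Gaussian model with known unit noise variance, the same model for data provider and analyst (a simple case of compatibility), no differential privacy, and hypothetical function names and parameter values chosen only for illustration. The data provider draws each synthetic dataset from its posterior predictive; the analyst computes a posterior from each synthetic dataset separately and pools the samples.

```python
import numpy as np


def gaussian_mean_posterior(data, prior_mean=0.0, prior_var=100.0, noise_var=1.0):
    """Conjugate posterior (mean, variance) for the mean of N(mu, noise_var) data."""
    n = len(data)
    post_var = 1.0 / (1.0 / prior_var + n / noise_var)
    post_mean = post_var * (prior_mean / prior_var + np.sum(data) / noise_var)
    return post_mean, post_var


def mixed_posterior_samples(real_data, n_datasets=200, n_syn=20000,
                            draws_per_dataset=50, seed=0):
    """Pool analyst posterior samples obtained separately from many synthetic datasets."""
    rng = np.random.default_rng(seed)
    # Data provider: posterior of mu given the real data.
    prov_mean, prov_var = gaussian_mean_posterior(real_data)
    pooled = []
    for _ in range(n_datasets):
        # Posterior predictive draw: sample mu, then one large synthetic dataset.
        mu_i = rng.normal(prov_mean, np.sqrt(prov_var))
        synthetic = rng.normal(mu_i, 1.0, size=n_syn)
        # Analyst: ordinary posterior computed from this synthetic dataset alone.
        m, v = gaussian_mean_posterior(synthetic)
        pooled.append(rng.normal(m, np.sqrt(v), size=draws_per_dataset))
    # Mixing step: concatenate the per-dataset posterior samples.
    return np.concatenate(pooled)


rng = np.random.default_rng(1)
real_data = rng.normal(2.0, 1.0, size=100)  # stand-in for the private data

target_mean, target_var = gaussian_mean_posterior(real_data)
samples = mixed_posterior_samples(real_data)

print("real-data posterior: mean", target_mean, "sd", np.sqrt(target_var))
print("mixed synthetic:     mean", samples.mean(), "sd", samples.std())
```

In this sketch each synthetic dataset is much larger than the real one, so every per-dataset posterior is narrow and the spread of the pooled samples comes almost entirely from variation between synthetic datasets. That is the intuition behind the "multiple large synthetic datasets" condition; with small synthetic datasets the pooled samples would be wider than the real-data posterior.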

research
12/30/2021

Bayesian inference with scaled Brownian motion

We present a Bayesian inference scheme for scaled Brownian motion, and i...
research
08/01/2017

PROBE-GK: Predictive Robust Estimation using Generalized Kernels

Many algorithms in computer vision and robotics make strong assumptions ...
research
11/16/2020

Foundations of Bayesian Learning from Synthetic Data

There is significant growth and interest in the use of synthetic data as...
research
06/02/2023

Generation of Probabilistic Synthetic Data for Serious Games: A Case Study on Cyberbullying

Synthetic data generation has been a growing area of research in recent ...
research
09/12/2022

Rule-adhering synthetic data – the lingua franca of learning

AI-generated synthetic data allows to distill the general patterns of ex...
research
05/16/2023

Synthetic data, real errors: how (not) to publish and use synthetic data

Generating synthetic data through generative models is gaining interest ...
research
09/10/2023

A supervised generative optimization approach for tabular data

Synthetic data generation has emerged as a crucial topic for financial i...
