Bayesian Adaptive Data Analysis Guarantees from Subgaussianity

10/31/2016
by Sam Elder, et al.

The new field of adaptive data analysis seeks to provide algorithms and provable guarantees for models of machine learning that allow researchers to reuse their data, a practice that normally falls outside the usual statistical paradigm of static data analysis. In 2014, Dwork, Feldman, Hardt, Pitassi, Reingold and Roth introduced one potential model and proposed several solutions based on differential privacy. In previous work in 2016, we described a problem with this model and proposed a Bayesian variant instead, but also found that the analogous Bayesian methods cannot achieve the same statistical guarantees as in the static case. In this paper, we prove the first positive results for the Bayesian model, showing that with a Dirichlet prior, the posterior mean algorithm indeed matches the statistical guarantees of the static case. The main ingredient is a new theorem showing that the Beta(α,β) distribution is subgaussian with variance proxy O(1/(α+β+1)), a concentration result that is also of independent interest. We provide two proofs of this result: a probabilistic proof utilizing a simple condition for the raw moments of a positive random variable, and a learning-theoretic proof based on considering the beta distribution as a posterior. Both proofs have implications for other related problems.
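To make the subgaussianity claim concrete: a random variable X is subgaussian with variance proxy σ² if log E[exp(λ(X − E X))] ≤ λ²σ²/2 for every real λ. The following is a minimal numerical sketch, not the paper's proof, that checks this inequality for a few Beta(α,β) distributions on a finite grid of λ values. The constant c = 1/4 in the proxy c/(α+β+1) is an assumption made for illustration (the abstract only states the order O(1/(α+β+1))), and the helper names `beta_mgf`, `centered_log_mgf` and `max_violation` are hypothetical.

```python
import math


def beta_mgf(a, b, lam, max_terms=2000):
    """E[exp(lam * X)] for X ~ Beta(a, b), summed from the MGF power series."""
    if lam < 0:
        # 1 - X ~ Beta(b, a), so E[e^{lam X}] = e^{lam} * E[e^{-lam (1 - X)}].
        # Using this keeps every series term nonnegative (no cancellation).
        return math.exp(lam) * beta_mgf(b, a, -lam, max_terms)
    total, term = 1.0, 1.0
    for k in range(1, max_terms + 1):
        # Ratio of consecutive terms: (lam / k) * (a + k - 1) / (a + b + k - 1).
        term *= lam / k * (a + k - 1) / (a + b + k - 1)
        total += term
        if term <= 1e-17 * total:
            break
    return total


def centered_log_mgf(a, b, lam):
    """log E[exp(lam * (X - E X))], using E X = a / (a + b)."""
    return math.log(beta_mgf(a, b, lam)) - lam * a / (a + b)


def max_violation(a, b, lams, c=0.25):
    """Largest value of centered log-MGF minus lam^2 * proxy / 2 over the grid.

    The proxy c / (a + b + 1) with c = 1/4 is an assumed constant.
    A result <= 0 (up to rounding) means the bound held on this grid.
    """
    proxy = c / (a + b + 1)
    return max(centered_log_mgf(a, b, lam) - 0.5 * lam * lam * proxy
               for lam in lams)


if __name__ == "__main__":
    grid = [x / 4 for x in range(-80, 81)]  # lambda from -20 to 20
    for a, b in [(0.5, 0.5), (2.0, 2.0), (1.0, 10.0), (10.0, 1.0)]:
        print(a, b, max_violation(a, b, grid))
```

A grid check of this kind can only fail to find a counterexample; it does not establish the bound for all λ, which is what the theorem asserts.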

research
04/08/2016

Challenges in Bayesian Adaptive Data Analysis

Traditional statistical analysis requires that the analysis process and ...
research
10/03/2021

Differential Privacy of Dirichlet Posterior Sampling

Besides the Laplace distribution and the Gaussian distribution, there ar...
research
04/16/2021

A Bivariate Beta Distribution with Arbitrary Beta Marginals and its Generalization to a Correlated Dirichlet Distribution

We discuss a bivariate beta distribution that can model arbitrary beta-d...
research
12/19/2017

Calibrating Noise to Variance in Adaptive Data Analysis

Datasets are often used multiple times and each successive analysis may ...
research
09/09/2019

A New Analysis of Differential Privacy's Generalization Guarantees

We give a new proof of the "transfer theorem" underlying adaptive data a...
research
03/05/2019

A New Approach to Adaptive Data Analysis and Learning via Maximal Leakage

There is an increasing concern that most current published research find...
research
01/15/2023

A Simple Proof of Posterior Robustness

Conditions for Bayesian posterior robustness have been examined in recen...