Improved Differentially Private Analysis of Variance

03/01/2019
by Marika Swanberg, et al.

Hypothesis testing is one of the most common types of data analysis and forms the backbone of scientific research in many disciplines. Analysis of variance (ANOVA) in particular is used to detect dependence between a categorical and a numerical variable. Here we show how one can carry out this hypothesis test under the restrictions of differential privacy. We show that the F-statistic, the optimal test statistic in the public setting, is no longer optimal in the private setting, and we develop a new test statistic F_1 with much higher statistical power. We show how to rigorously compute a reference distribution for the F_1 statistic and give an algorithm that outputs accurate p-values. We implement our test and experimentally optimize several parameters. We then compare our test to the only previous work on private ANOVA testing, using the same effect size as that work. We see an order of magnitude improvement, with our test requiring only 7
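For orientation, the classical (non-private) one-way ANOVA F-statistic that the abstract takes as its baseline can be sketched as follows. This is a minimal illustration of the standard textbook formula only: the function name `f_statistic` and the list-of-lists input format are assumptions made for this example, and it does not implement the F_1 statistic, the privacy mechanism, or the p-value algorithm described in the paper.

```python
from statistics import fmean


def f_statistic(groups):
    """Classical one-way ANOVA F-statistic (public setting, no privacy).

    `groups` is a list of lists: one list of numerical observations per
    category. This input format is an assumption made for illustration.
    """
    k = len(groups)                        # number of categories
    n_total = sum(len(g) for g in groups)  # total number of observations
    if k < 2 or n_total <= k:
        raise ValueError("need at least 2 groups and more observations than groups")

    grand_mean = fmean(x for g in groups for x in g)
    group_means = [fmean(g) for g in groups]

    # Between-group variation: how far each group mean sits from the grand mean.
    ssa = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Within-group variation: how far each observation sits from its group mean.
    sse = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    # Ratio of mean squares; large values suggest the numerical variable
    # depends on the category. Raises ZeroDivisionError if sse is zero.
    return (ssa / (k - 1)) / (sse / (n_total - k))
```

For example, `f_statistic([[1, 2, 3], [2, 3, 4], [5, 6, 7]])` returns 13.0: the between-group sum of squares is 26 on 2 degrees of freedom and the within-group sum of squares is 6 on 6 degrees of freedom.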


