
Evaluating the Fairness Impact of Differentially Private Synthetic Data

05/09/2022
by Blake Bullwinkel, et al.

Differentially private (DP) synthetic data is a promising approach to maximizing the utility of data containing sensitive information. Due to the suppression of underrepresented classes that is often required to achieve privacy, however, it may be in conflict with fairness. We evaluate four DP synthesizers and present empirical results indicating that three of these models frequently degrade fairness outcomes on downstream binary classification tasks. We draw a connection between fairness and the proportion of minority groups present in the generated synthetic data, and find that training synthesizers on data that are pre-processed via a multi-label undersampling method can promote more fair outcomes without degrading accuracy.
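To make the abstract's two key ideas concrete, here is a minimal sketch in Python: a demographic parity difference metric for a downstream binary classifier, and a simplified single-attribute undersampling step standing in for the multi-label undersampling the paper describes. The function names and the balancing strategy are illustrative assumptions, not the authors' actual implementation.

```python
import numpy as np

def demographic_parity_difference(y_pred, group):
    """Absolute gap in positive-prediction rates between two groups.

    A value of 0 means both groups receive positive predictions at the
    same rate; larger values indicate less fair outcomes under the
    demographic parity criterion.
    """
    y_pred = np.asarray(y_pred, dtype=float)
    group = np.asarray(group)
    return abs(y_pred[group == 0].mean() - y_pred[group == 1].mean())

def undersample_to_balance(X, group, rng=None):
    """Randomly drop rows from the larger group until group sizes match.

    Simplified single-attribute stand-in for the multi-label
    undersampling pre-processing mentioned in the abstract (hypothetical
    sketch, not the paper's method).
    """
    rng = np.random.default_rng(rng)
    X = np.asarray(X)
    group = np.asarray(group)
    idx0 = np.flatnonzero(group == 0)
    idx1 = np.flatnonzero(group == 1)
    n = min(len(idx0), len(idx1))  # size of the smaller group
    keep = np.concatenate([rng.choice(idx0, n, replace=False),
                           rng.choice(idx1, n, replace=False)])
    keep.sort()  # preserve the original row order
    return X[keep], group[keep]
```

A synthesizer would then be trained on the balanced data, and the metric computed on predictions from a classifier trained on the resulting synthetic dataset.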

Related research

06/15/2021 · An Analysis of the Deployment of Models Trained on Private Tabular Synthetic Data: Unexpected Surprises
Differentially private (DP) synthetic datasets are a powerful approach fo...

12/20/2022 · PreFair: Privately Generating Justifiably Fair Synthetic Data
When a database is protected by Differential Privacy (DP), its usability...

04/27/2022 · Spending Privacy Budget Fairly and Wisely
Differentially private (DP) synthetic data generation is a practical met...

02/11/2021 · Investigating Trade-offs in Utility, Fairness and Differential Privacy in Neural Networks
To enable an ethical and legal use of machine learning algorithms, they ...

10/17/2022 · Stochastic Differentially Private and Fair Learning
Machine learning models are increasingly used in high-stakes decision-ma...

03/09/2022 · Downstream Fairness Caveats with Synthetic Healthcare Data
This paper evaluates synthetically generated healthcare data for biases ...