Leveraging gradient-derived metrics for data selection and valuation in differentially private training

05/04/2023
by Dmitrii Usynin, et al.

Obtaining high-quality data for collaborative training of machine learning models can be challenging due to A) regulatory concerns and B) a lack of incentive to participate. The first issue can be addressed through privacy-enhancing technologies (PETs), one of the most frequently used being differentially private (DP) training. The second can be addressed by identifying which data points are beneficial for model training and rewarding data owners for sharing them. However, DP in deep learning typically adversely affects atypical (often informative) data samples, making it difficult to assess the usefulness of individual contributions. In this work, we investigate how to leverage gradient information to identify training samples of interest in private training settings. We show that there exist techniques that can provide clients with tools for principled data selection even in the strictest privacy settings.
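
To make the idea of a gradient-derived metric concrete, the sketch below computes per-sample gradient norms inside one DP-SGD-style update (per-sample clipping followed by Gaussian noise). It is a minimal illustration only, not the authors' method: the abstract does not name the specific metrics studied, so the use of the gradient norm as a score, the logistic-regression model, and every name and parameter here (`per_sample_gradients`, `dp_sgd_step`, `clip_norm`, `noise_multiplier`) are assumptions.

```python
import numpy as np


def per_sample_gradients(w, X, y):
    """Per-sample gradients of the logistic loss for a linear model, shape (n, d)."""
    p = 1.0 / (1.0 + np.exp(-(X @ w)))  # predicted probabilities
    return (p - y)[:, None] * X


def dp_sgd_step(w, X, y, clip_norm=1.0, noise_multiplier=1.0, lr=0.1, rng=None):
    """One DP-SGD-style update; also returns the raw per-sample gradient norms.

    Assumption: the raw norms serve as an illustrative per-sample score.
    They are computed before clipping and noising, so they are NOT
    privacy-protected and are meant to stay with the data owner.
    """
    rng = np.random.default_rng() if rng is None else rng
    grads = per_sample_gradients(w, X, y)
    norms = np.linalg.norm(grads, axis=1)
    # Scale each gradient so that its L2 norm is at most clip_norm.
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = grads * scale[:, None]
    # Gaussian noise calibrated to the clipping bound.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=w.shape)
    noisy_mean = (clipped.sum(axis=0) + noise) / len(X)
    return w - lr * noisy_mean, norms
```

In this sketch a sample with a large norm is one the current model fits poorly, which is also the kind of sample that clipping attenuates most. Releasing such scores outside the data owner's machine would itself require privacy accounting.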


research
11/11/2020

Differentially Private Synthetic Data: Applied Evaluations and Enhancements

Machine learning practitioners frequently seek to leverage the most info...
research
06/15/2021

An Analysis of the Deployment of Models Trained on Private Tabular Synthetic Data: Unexpected Surprises

Differentially private (DP) synthetic datasets are a powerful approach fo...
research
08/05/2022

DP^2-VAE: Differentially Private Pre-trained Variational Autoencoders

Modern machine learning systems achieve great success when trained on la...
research
06/05/2019

Interpretable and Differentially Private Predictions

Interpretable predictions, where it is clear why a machine learning mode...
research
12/01/2022

Differentially Private Adaptive Optimization with Delayed Preconditioners

Privacy noise may negate the benefits of using adaptive optimizers in di...
research
02/16/2022

Private Online Prefix Sums via Optimal Matrix Factorizations

Motivated by differentially-private (DP) training of machine learning mo...
research
06/15/2022

Disparate Impact in Differential Privacy from Gradient Misalignment

As machine learning becomes more widespread throughout society, aspects ...
