Privacy Accounting and Quality Control in the Sage Differentially Private ML Platform

09/04/2019
by Mathias Lecuyer, et al.

Companies increasingly expose machine learning (ML) models trained over sensitive user data to untrusted domains, such as end-user devices and wide-access model stores. We present Sage, a differentially private (DP) ML platform that bounds the cumulative leakage of training data through models. Sage builds upon the rich literature on DP ML algorithms and contributes pragmatic solutions to two of the most pressing systems challenges of global DP: running out of privacy budget and the privacy-utility tradeoff. To address the former, we develop block composition, a new privacy loss accounting method that leverages the growing database regime of ML workloads to keep training models endlessly on a sensitive data stream while enforcing a global DP guarantee for the stream. To address the latter, we develop privacy-adaptive training, a process that trains a model on growing amounts of data and/or with increasing privacy parameters until, with high probability, the model meets developer-configured quality criteria. Together, these two techniques illustrate how a systems focus on the characteristics of ML workloads enables pragmatic solutions that are not apparent when one focuses on individual algorithms, as most DP ML literature does.
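The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not Sage's implementation or API: the names `BlockAccountant`, `adaptive_train`, and `train_and_validate` are hypothetical, the accounting uses plain additive (basic) composition of epsilon, and the quality check is a stand-in callback, whereas the abstract describes an acceptance test that holds with high probability. The sketch assumes the stream is split into blocks (for example, one per day), that each block carries its own copy of the global budget, and that a training run is charged to every block it reads.

```python
from dataclasses import dataclass, field


@dataclass
class BlockAccountant:
    """Toy per-block privacy ledger (hypothetical; not Sage's API)."""
    epsilon_global: float
    spent: dict = field(default_factory=dict)  # block id -> epsilon consumed

    def remaining(self, block):
        # Blocks never seen before still hold the full global budget.
        return self.epsilon_global - self.spent.get(block, 0.0)

    def charge(self, blocks, epsilon):
        """Charge epsilon to every block a run reads, or refuse the whole run."""
        if any(self.remaining(b) < epsilon for b in blocks):
            return False
        for b in blocks:
            self.spent[b] = self.spent.get(b, 0.0) + epsilon
        return True


def adaptive_train(accountant, blocks, train_and_validate,
                   eps_start=0.25, max_doublings=3):
    """Retry training with a doubled epsilon until the quality check accepts.

    train_and_validate(blocks, eps) is an assumed callback that returns
    (model, accepted). Failed attempts still consume budget.
    """
    eps = eps_start
    for _ in range(max_doublings + 1):
        if not accountant.charge(blocks, eps):
            return None  # some block is exhausted; wait for newer blocks
        model, accepted = train_and_validate(blocks, eps)
        if accepted:
            return model, eps
        eps *= 2
    return None
```

In this sketch, a block that has just arrived in the stream has spent nothing, so training can continue on newer data even after older blocks are exhausted. The sketch only raises epsilon on retries; the abstract also describes retrying on growing amounts of data, which would correspond to passing a longer list of blocks.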

