Privacy Budget Scheduling

06/29/2021
by   Tao Luo, et al.
5

Machine learning (ML) models trained on personal data have been shown to leak information about users. Differential privacy (DP) enables model training with a guaranteed bound on this leakage. Each new model trained with DP increases the bound on data leakage and can be seen as consuming part of a global privacy budget that should not be exceeded. This budget is a scarce resource that must be carefully managed to maximize the number of successfully trained models. We describe PrivateKube, an extension to the popular Kubernetes datacenter orchestrator that adds privacy as a new type of resource to be managed alongside other traditional compute resources, such as CPU, GPU, and memory. The abstractions we design for the privacy resource mirror those defined by Kubernetes for traditional resources, but there are also major differences. For example, traditional compute resources are replenishable while privacy is not: a CPU can be regained after a model finishes execution while privacy budget cannot. This distinction forces a re-design of the scheduler. We present DPF (Dominant Private Block Fairness) – a variant of the popular Dominant Resource Fairness (DRF) algorithm – that is geared toward the non-replenishable privacy resource but enjoys similar theoretical properties as DRF. We evaluate PrivateKube and DPF on microbenchmarks and an ML workload on Amazon Reviews data. Compared to existing baselines, DPF allows training more models under the same global privacy guarantee. This is especially true for DPF over Rényi DP, a highly composable form of DP.

READ FULL TEXT

page 1

page 5

page 7

page 10

page 16

page 17

page 18

page 22

research
12/26/2022

Packing Privacy Budget Efficiently

Machine learning (ML) models can leak information about users, and diffe...
research
09/04/2019

Privacy Accounting and Quality Control in the Sage Differentially Private ML Platform

Companies increasingly expose machine learning (ML) models trained over ...
research
12/24/2021

DP-UTIL: Comprehensive Utility Analysis of Differential Privacy in Machine Learning

Differential Privacy (DP) has emerged as a rigorous formalism to reason ...
research
06/27/2023

Probing the Transition to Dataset-Level Privacy in ML Models Using an Output-Specific and Data-Resolved Privacy Profile

Differential privacy (DP) is the prevailing technique for protecting use...
research
10/25/2021

DP-XGBoost: Private Machine Learning at Scale

The big-data revolution announced ten years ago does not seem to have fu...
research
03/29/2020

Dealer: End-to-End Data Marketplace with Model-based Pricing

Data-driven machine learning (ML) has witnessed great successes across a...

Please sign up or login with your details

Forgot password? Click here to reset