Privacy of Noisy Stochastic Gradient Descent: More Iterations without More Privacy Loss

05/27/2022
by Jason M. Altschuler, et al.

A central issue in machine learning is how to train models on sensitive user data. Industry has widely adopted a simple algorithm: Stochastic Gradient Descent with noise (a.k.a. Stochastic Gradient Langevin Dynamics). However, foundational theoretical questions about this algorithm's privacy loss remain open, even in the seemingly simple setting of smooth convex losses over a bounded domain. Our main result resolves these questions: for a large range of parameters, we characterize the differential privacy up to a constant factor. This result reveals that all previous analyses for this setting have the wrong qualitative behavior. Specifically, whereas previous privacy bounds grow without bound as the number of iterations increases, we show that after a small burn-in period, running SGD longer leaks no further privacy. Our analysis departs completely from previous approaches based on fast mixing, instead using techniques based on optimal transport (namely, Privacy Amplification by Iteration) and the Sampled Gaussian Mechanism (namely, Privacy Amplification by Sampling). Our techniques readily extend to other settings, e.g., strongly convex losses, non-uniform stepsizes, arbitrary batch sizes, and random or cyclic choice of batches.
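To make the object of study concrete, here is a minimal sketch of the Noisy-SGD algorithm the abstract describes: projected stochastic gradient descent over a bounded domain, where each step averages gradients over a randomly sampled batch and adds Gaussian noise. Everything below (the least-squares loss, the L2-ball domain, and all hyperparameter values) is an illustrative assumption, not taken from the paper.

```python
# Illustrative sketch of Noisy-SGD (projected noisy stochastic gradient
# descent over a bounded domain). All losses, domain radii, and
# hyperparameters here are assumptions for demonstration only.
import numpy as np

def project_ball(x, radius):
    """Project x onto the L2 ball of the given radius (the bounded domain)."""
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)

def noisy_sgd(data, grad_fn, radius=1.0, step_size=0.1, noise_std=1.0,
              batch_size=32, iterations=1000, rng=None):
    """One run of Noisy-SGD: sample a batch, take a noisy averaged-gradient
    step, project back onto the domain. Returns the final iterate."""
    rng = rng if rng is not None else np.random.default_rng(0)
    n, d = data.shape
    x = np.zeros(d)  # start inside the domain
    for _ in range(iterations):
        # Random choice of batch (cf. Privacy Amplification by Sampling).
        batch = data[rng.choice(n, size=batch_size, replace=False)]
        g = np.mean([grad_fn(x, z) for z in batch], axis=0)
        # Gaussian noise on the gradient is the source of privacy.
        g = g + rng.normal(scale=noise_std, size=d)
        x = project_ball(x - step_size * g, radius)
    return x

if __name__ == "__main__":
    # Example usage with a smooth convex loss: 0.5 * (x . a - b)^2 per example.
    rng = np.random.default_rng(1)
    features = rng.normal(size=(256, 5))
    labels = features @ np.ones(5) + 0.1 * rng.normal(size=256)
    data = np.hstack([features, labels[:, None]])
    grad = lambda x, z: (x @ z[:-1] - z[-1]) * z[:-1]
    print("final iterate:", np.round(noisy_sgd(data, grad, iterations=200, rng=rng), 3))
```

Per the paper's main result, in this smooth convex bounded-domain setting the privacy loss of such a loop stops growing after a small burn-in period: increasing `iterations` beyond that point leaks no additional privacy.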


