Normalized/Clipped SGD with Perturbation for Differentially Private Non-Convex Optimization

06/27/2022
by Xiaodong Yang, et al.

By ensuring differential privacy in the learning algorithm, one can rigorously mitigate the risk of large models memorizing sensitive training data. In this paper, we study two algorithms for this purpose, DP-SGD and DP-NSGD, which first clip or normalize per-sample gradients to bound the sensitivity and then add noise to obfuscate the exact information. We analyze the convergence behavior of these two algorithms in the non-convex optimization setting under two common assumptions and achieve a rate 𝒪(√(d log(1/δ)/(N^2 ϵ^2))) on the gradient norm for a d-dimensional model, N samples and (ϵ,δ)-DP, which improves over previous bounds under much weaker assumptions. Specifically, we introduce a regularizing factor in DP-NSGD and show that it is crucial in the convergence proof and subtly controls the bias-noise trade-off. Our proof deliberately handles the per-sample gradient clipping and normalization that are specific to the private setting. Empirically, we demonstrate that the two algorithms achieve similar best accuracy, while DP-NSGD is comparatively easier to tune than DP-SGD and hence may help further save the privacy budget when accounting for the tuning effort.
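
Based only on the description above, here is a minimal sketch of one update step for each algorithm. It is not the authors' code: the function names, the clipping threshold clip_norm, the noise_multiplier, and the regularizer argument are illustrative assumptions, not the paper's exact notation.

```python
import numpy as np

def dp_sgd_step(params, per_sample_grads, clip_norm, noise_multiplier, lr, rng):
    """DP-SGD sketch: clip each per-sample gradient to norm <= clip_norm,
    average, add Gaussian noise calibrated to the clipping threshold, descend."""
    clipped = [g * min(1.0, clip_norm / (np.linalg.norm(g) + 1e-12))
               for g in per_sample_grads]
    mean_grad = np.mean(clipped, axis=0)
    # Per-sample sensitivity is clip_norm, so the averaged noise scale is
    # noise_multiplier * clip_norm / batch_size.
    noise = rng.normal(0.0, noise_multiplier * clip_norm / len(per_sample_grads),
                       size=params.shape)
    return params - lr * (mean_grad + noise)

def dp_nsgd_step(params, per_sample_grads, regularizer, noise_multiplier, lr, rng):
    """DP-NSGD sketch: normalize each per-sample gradient by (||g|| + r),
    where the regularizing factor r > 0 keeps every term's norm below 1."""
    normalized = [g / (np.linalg.norm(g) + regularizer) for g in per_sample_grads]
    mean_grad = np.mean(normalized, axis=0)
    # Each normalized gradient has norm < 1, so sensitivity is bounded by 1.
    noise = rng.normal(0.0, noise_multiplier / len(per_sample_grads),
                       size=params.shape)
    return params - lr * (mean_grad + noise)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    params = np.zeros(5)
    grads = [rng.normal(size=5) for _ in range(8)]  # stand-in per-sample gradients
    params = dp_nsgd_step(params, grads, regularizer=0.1,
                          noise_multiplier=1.0, lr=0.1, rng=rng)
```

In this sketch the regularizer plays the role described in the abstract: a larger value shrinks the normalized gradients (more bias toward small updates, less relative noise), while a smaller value makes the update closer to pure normalization.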

Related research

06/24/2020 · Private Stochastic Non-Convex Optimization: Adaptive Algorithms and Tighter Generalization Bounds
We study differentially private (DP) algorithms for stochastic non-conve...

06/14/2022 · Automatic Clipping: Differentially Private Deep Learning Made Easier and Stronger
Per-example gradient clipping is a key algorithmic step that enables pra...

06/16/2022 · On Private Online Convex Optimization: Optimal Algorithms in ℓ_p-Geometry and High Dimensional Contextual Bandits
Differentially private (DP) stochastic convex optimization (SCO) is ubiq...

11/26/2019 · Gradient Perturbation is Underrated for Differentially Private Convex Optimization
Gradient perturbation, widely used for differentially private optimizati...

04/21/2023 · DP-Adam: Correcting DP Bias in Adam's Second Moment Estimation
We observe that the traditional use of DP with the Adam optimizer introd...

01/26/2020 · Boosted and Differentially Private Ensembles of Decision Trees
Boosted ensemble of decision tree (DT) classifiers are extremely popular...

02/28/2023 · Arbitrary Decisions are a Hidden Cost of Differentially-Private Training
Mechanisms used in privacy-preserving machine learning often aim to guar...
