Large Scale Private Learning via Low-rank Reparametrization

06/17/2021, by Da Yu, et al.

We propose a reparametrization scheme to address the challenges of applying differentially private SGD to large neural networks: 1) the huge memory cost of storing individual gradients, and 2) the notorious dimensional dependence of the added noise. Specifically, we reparametrize each weight matrix with two low-dimensional gradient-carrier matrices and a residual weight matrix. We argue that such reparametrization keeps the forward/backward process unchanged while enabling us to compute the projected gradient without computing the gradient itself. To learn with differential privacy, we design reparametrized gradient perturbation (RGP), which perturbs the gradients of the gradient-carrier matrices and reconstructs an update for the original weight from the noisy gradients. Importantly, we use historical updates to find the gradient-carrier matrices, whose optimality is rigorously justified under linear regression and empirically verified on deep learning tasks. RGP significantly reduces the memory cost and improves utility. For example, we are the first to apply differential privacy to the BERT model, achieving an average accuracy of 83.9% on four downstream tasks with ϵ=8, which is within 5% of the non-private baseline while enjoying a much lower risk of privacy leakage.
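To make the idea concrete, below is a minimal NumPy sketch of the reparametrized-gradient-perturbation step described in the abstract. It is an illustration under stated assumptions, not the paper's implementation: the names `decompose`, `rgp_step`, `clip_norm`, and `noise_multiplier` are made up here, the carrier matrices are obtained from a historical update via a truncated SVD, per-example gradients of the full weight are materialized only for clarity (the actual method avoids this), and the final reconstruction of the weight update from the carrier gradients is simplified.

```python
# Illustrative sketch of one RGP-style update (assumed names; not the authors' code).
import numpy as np

rng = np.random.default_rng(0)

def decompose(historical_update, rank):
    """Obtain gradient-carrier matrices L (d_out x r) and R (r x d_in)
    from a historical weight update via truncated SVD (one possible choice)."""
    U, _, Vt = np.linalg.svd(historical_update, full_matrices=False)
    return U[:, :rank], Vt[:rank, :]

def rgp_step(W, per_example_grads, L, R, clip_norm, noise_multiplier, lr):
    """One reparametrized gradient perturbation step for a single weight matrix.

    per_example_grads: (batch, d_out, d_in) gradients w.r.t. W, materialized
    here only for illustration; the method itself works with the small
    projected gradients directly."""
    batch = per_example_grads.shape[0]
    # Project each example's gradient onto the carrier subspaces.
    g_L = np.einsum('bij,kj->bik', per_example_grads, R)   # (batch, d_out, r)
    g_R = np.einsum('bij,ik->bkj', per_example_grads, L)   # (batch, r, d_in)
    # Clip the concatenated projected gradient per example.
    norms = np.sqrt((g_L ** 2).sum(axis=(1, 2)) + (g_R ** 2).sum(axis=(1, 2)))
    scale = np.minimum(1.0, clip_norm / (norms + 1e-12))
    g_L = (g_L * scale[:, None, None]).sum(axis=0)
    g_R = (g_R * scale[:, None, None]).sum(axis=0)
    # Perturb the small carrier gradients with Gaussian noise calibrated to clip_norm.
    sigma = noise_multiplier * clip_norm
    g_L = (g_L + sigma * rng.standard_normal(g_L.shape)) / batch
    g_R = (g_R + sigma * rng.standard_normal(g_R.shape)) / batch
    # Reconstruct a (simplified) noisy update for the original weight matrix.
    update = g_L @ R + L @ g_R
    return W - lr * update
```

The memory saving comes from the fact that only the projected gradients of shape (d_out, r) and (r, d_in) are ever stored per example, rather than the full (d_out, d_in) gradient.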


