Scaling up Differentially Private Deep Learning with Fast Per-Example Gradient Clipping

09/07/2020
by Jaewoo Lee, et al.

Recent work on Rényi Differential Privacy has shown the feasibility of applying differential privacy to deep learning tasks. Despite their promise, however, differentially private deep networks often lag far behind their non-private counterparts in accuracy, showing the need for more research into model architectures, optimizers, etc. One of the barriers to this expanded research is the training time, which is often orders of magnitude longer than for non-private networks. The reason for this slowdown is a crucial privacy-related step called "per-example gradient clipping" whose naive implementation undoes the benefits of batch training with GPUs. By analyzing the back-propagation equations we derive new methods for per-example gradient clipping that are compatible with auto-differentiation (e.g., in PyTorch and TensorFlow) and provide better GPU utilization. Our implementation in PyTorch showed significant training speed-ups (by factors of 54x to 94x when training various models with batch sizes of 128). These techniques work for a variety of architectural choices including convolutional layers, recurrent networks, attention, residual blocks, etc.
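The slowdown the abstract refers to comes from how per-example clipping is usually implemented: one backward pass per example rather than one per batch. Below is a minimal PyTorch sketch of that naive DP-SGD baseline, not the paper's fast method; the toy model, clip norm C, noise multiplier sigma, and learning rate are illustrative assumptions.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy setup -- the model, clip norm C, noise multiplier sigma, and learning
# rate are illustrative assumptions, not values from the paper.
model = nn.Linear(20, 2)
loss_fn = nn.CrossEntropyLoss()
C, sigma, lr = 1.0, 1.1, 0.05

x = torch.randn(128, 20)                 # one batch of 128 examples
y = torch.randint(0, 2, (128,))

params = [p for p in model.parameters() if p.requires_grad]
summed = [torch.zeros_like(p) for p in params]

# Naive per-example clipping: one backward pass per example, which
# serializes the batch and forfeits the GPU parallelism of batch training.
for i in range(x.shape[0]):
    loss = loss_fn(model(x[i:i + 1]), y[i:i + 1])
    grads = torch.autograd.grad(loss, params)
    norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
    scale = torch.clamp(C / (norm + 1e-6), max=1.0)   # clip to norm at most C
    for s, g in zip(summed, grads):
        s.add_(g * scale)

# Add Gaussian noise calibrated to the clip norm, average, and take a step.
with torch.no_grad():
    for p, s in zip(params, summed):
        noisy_grad = (s + sigma * C * torch.randn_like(s)) / x.shape[0]
        p.add_(noisy_grad, alpha=-lr)
```

A fast implementation replaces the Python loop over examples with per-example gradient norms obtained from a single batched backward pass, which is the direction of the auto-differentiation-compatible derivations the abstract describes.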
