Differentially private training of residual networks with scale normalisation

03/01/2022
by Helena Klause, et al.

We investigate the optimal choice of replacement layer for Batch Normalisation (BN) in residual networks (ResNets) trained with Differentially Private Stochastic Gradient Descent (DP-SGD), and we study the phenomenon of scale mixing in residual blocks, whereby the activations on the two branches are scaled differently. Our experimental evaluation indicates that a hyperparameter search over the number of Group Normalisation (GN) groups, from 1 to 64, improves the accuracy of ResNet-9 and ResNet-50 considerably in both benchmark (CIFAR-10) and large-image (ImageNette) tasks. Moreover, Scale Normalisation, a simple modification to the model architecture in which an additional normalisation layer is introduced after the residual block's addition operation, further improves the utility of ResNets, allowing us to achieve state-of-the-art results on CIFAR-10.
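The idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the 1x1-convolution residual branch, the absence of learnable affine parameters, the function names (`group_norm`, `residual_block`, `candidate_group_counts`) and the choice to search every divisor of the channel count up to 64 are all assumptions made for illustration. The sketch shows that, when the skip branch carries activations at a much larger scale than the normalised residual branch, an extra GN layer after the addition brings the block output back to unit scale.

```python
import numpy as np


def group_norm(x, num_groups, eps=1e-5):
    """Group normalisation over an (N, C, H, W) array, without affine parameters."""
    n, c, h, w = x.shape
    if c % num_groups != 0:
        raise ValueError("number of channels must be divisible by num_groups")
    # Each group holds C / num_groups contiguous channels of one sample.
    grouped = x.reshape(n, num_groups, -1)
    mean = grouped.mean(axis=2, keepdims=True)
    var = grouped.var(axis=2, keepdims=True)
    return ((grouped - mean) / np.sqrt(var + eps)).reshape(n, c, h, w)


def residual_block(x, weight, num_groups, scale_norm=True):
    """Toy residual block (assumed layout): 1x1 conv -> GN -> ReLU, plus skip."""
    # Residual branch: a 1x1 convolution is a linear map over channels.
    branch = np.einsum("oc,nchw->nohw", weight, x)
    branch = np.maximum(group_norm(branch, num_groups), 0.0)
    # Addition of two branches that may live at different scales.
    out = branch + x
    if scale_norm:
        # Scale Normalisation: an additional normalisation after the addition.
        out = group_norm(out, num_groups)
    return out


def candidate_group_counts(channels, max_groups=64):
    """Group counts in 1..max_groups that divide the channel count (assumed grid)."""
    return [g for g in range(1, max_groups + 1) if channels % g == 0]


rng = np.random.default_rng(0)
x = 5.0 * rng.normal(size=(2, 16, 8, 8))  # skip input at a deliberately large scale
w = rng.normal(size=(16, 16))

y_plain = residual_block(x, w, num_groups=4, scale_norm=False)
y_scaled = residual_block(x, w, num_groups=4, scale_norm=True)

print("std without the extra normalisation:", y_plain.std())
print("std with the extra normalisation:", y_scaled.std())
print("group counts to search for 16 channels:", candidate_group_counts(16))
```

In a DP-SGD setting the search over group counts would be run as an outer loop around training. That loop is omitted here because the abstract gives no details about it.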

