An Empirical Analysis of the Shift and Scale Parameters in BatchNorm

03/22/2023
by   Yashna Peerthum, et al.

Batch Normalization (BatchNorm) is a technique that improves the training of deep neural networks, especially Convolutional Neural Networks (CNNs). It has been empirically demonstrated that BatchNorm increases performance, stability, and accuracy, although the reasons for these improvements remain unclear. BatchNorm comprises a normalization step as well as trainable shift and scale parameters. In this paper, we empirically examine the relative contribution to the success of BatchNorm of the normalization step, as compared to the re-parameterization via shifting and scaling. To conduct our experiments, we implement two new layers in PyTorch: a version of BatchNorm that we refer to as AffineLayer, which includes the re-parameterization step without normalization, and a version with just the normalization step, which we call BatchNorm-minus. We compare the performance of our AffineLayer and BatchNorm-minus implementations to standard BatchNorm, and we also compare these to the case where no batch normalization is used. We experiment with four ResNet architectures (ResNet18, ResNet34, ResNet50, and ResNet101) over a standard image dataset and multiple batch sizes. Among other findings, we provide empirical evidence that the success of BatchNorm may derive primarily from improved weight initialization.
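The two ablations described above can be sketched in PyTorch. This is a minimal illustration, not the authors' released code: the class names follow the paper's terminology, but the implementation details (per-channel parameters, the `eps` value, 4-D convolutional inputs) are assumptions.

```python
import torch
import torch.nn as nn


class AffineLayer(nn.Module):
    """Sketch of the paper's AffineLayer: trainable per-channel
    scale (gamma) and shift (beta), with NO normalization step."""

    def __init__(self, num_features):
        super().__init__()
        self.gamma = nn.Parameter(torch.ones(num_features))
        self.beta = nn.Parameter(torch.zeros(num_features))

    def forward(self, x):
        # x: (N, C, H, W); broadcast gamma/beta over batch and spatial dims
        return x * self.gamma.view(1, -1, 1, 1) + self.beta.view(1, -1, 1, 1)


class BatchNormMinus(nn.Module):
    """Sketch of BatchNorm-minus: per-channel batch normalization only,
    with no learned scale or shift (eps value is an assumption)."""

    def __init__(self, num_features, eps=1e-5):
        super().__init__()
        self.eps = eps

    def forward(self, x):
        # Statistics over batch and spatial dimensions, one pair per channel
        mean = x.mean(dim=(0, 2, 3), keepdim=True)
        var = x.var(dim=(0, 2, 3), unbiased=False, keepdim=True)
        return (x - mean) / torch.sqrt(var + self.eps)
```

At initialization, `AffineLayer` is the identity (gamma = 1, beta = 0), and in training mode `BatchNormMinus` behaves like `nn.BatchNorm2d(num_features, affine=False)`; standard BatchNorm is the composition of the two.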


