Comparison of Batch Normalization and Weight Normalization Algorithms for the Large-scale Image Classification

09/24/2017
by Igor Gitman, et al.

Batch normalization (BN) has become a de facto standard for training deep convolutional networks. However, BN accounts for a significant fraction of training run-time and is difficult to accelerate, since it is a memory-bandwidth-bound operation. This drawback of BN motivates us to explore recently proposed weight normalization (WN) algorithms, namely weight normalization, normalization propagation, and weight normalization with translated ReLU. These algorithms do not slow down training iterations and were experimentally shown to outperform BN on relatively small networks and datasets. However, it is not clear whether these algorithms could replace BN in practical, large-scale applications. We answer this question by providing a detailed comparison of BN and WN algorithms using a ResNet-50 network trained on ImageNet. We found that although WN achieves better training accuracy, the final test accuracy is significantly lower (≈ 6%) than that of BN. This result demonstrates the surprising strength of the BN regularization effect, which we were unable to compensate for using standard regularization techniques such as dropout and weight decay. We also found that training deep networks with WN algorithms is significantly less stable than with BN, limiting their practical applications.
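To make the contrast concrete, below is a minimal sketch of the weight normalization reparameterization w = g · v/‖v‖ (per output unit) that the WN algorithms build on. It assumes PyTorch; the class name, initialization scale, and layer choice are illustrative, not taken from the paper:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WeightNormLinear(nn.Module):
    """Linear layer with the weight normalization reparameterization
    w = g * v / ||v||, computed per output unit. Unlike batch
    normalization, this touches only the weights: the forward pass
    needs no batch statistics and no extra memory-bandwidth-heavy
    pass over the activations, which is why WN does not slow down
    training iterations."""

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        # Illustrative initialization; the paper's setup may differ.
        self.v = nn.Parameter(torch.randn(out_features, in_features) * 0.05)
        self.g = nn.Parameter(torch.ones(out_features))
        self.b = nn.Parameter(torch.zeros(out_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Direction comes from v, magnitude from the learned scalar g.
        w = self.g.unsqueeze(1) * self.v / self.v.norm(dim=1, keepdim=True)
        return F.linear(x, w, self.b)
```

PyTorch also ships a built-in helper (torch.nn.utils.weight_norm) that applies the same reparameterization to an existing layer's weight.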


Related research

- L2 Regularization versus Batch and Weight Normalization (06/16/2017): Batch Normalization is a commonly used trick to improve the training of ...
- Accelerating Training of Deep Neural Networks with a Standardization Loss (03/03/2019): A significant advance in accelerating neural network training has been t...
- Logit Attenuating Weight Normalization (08/12/2021): Over-parameterized deep networks trained using gradient-based optimizers...
- Improving training of deep neural networks via Singular Value Bounding (11/18/2016): Deep learning methods achieve great success recently on many computer vi...
- High-Performance Large-Scale Image Recognition Without Normalization (02/11/2021): Batch normalization is a key component of most image classification mode...
- LocalNorm: Robust Image Classification through Dynamically Regularized Normalization (02/18/2019): While modern convolutional neural networks achieve outstanding accuracy ...
- How to Use Dropout Correctly on Residual Networks with Batch Normalization (02/13/2023): For the stable optimization of deep neural networks, regularization meth...
