Weight Standardization

03/25/2019
by Siyuan Qiao, et al.

In this paper, we propose Weight Standardization (WS) to accelerate deep network training. WS targets the micro-batch training setting, where each GPU typically holds only 1-2 images. Micro-batch training is hard because small batch sizes are not enough for training networks with Batch Normalization (BN), while normalization methods that do not rely on batch knowledge still have difficulty matching the performance of BN in large-batch training. WS resolves this problem: when used with Group Normalization and trained with 1 image/GPU, it matches or outperforms BN trained with large batch sizes, and requires only two extra lines of code. In micro-batch training, WS significantly outperforms other normalization methods. WS achieves these results by standardizing the weights in the convolutional layers, which we show smooths the loss landscape by reducing the Lipschitz constants of the loss and the gradients. The effectiveness of WS is verified on many tasks, including image classification, object detection, instance segmentation, video recognition, semantic segmentation, and point cloud recognition. The code is available here: https://github.com/joe-siyuan-qiao/WeightStandardization.
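The standardization itself is a small change to a convolutional layer: each output filter's weights are rescaled to zero mean and unit variance over their fan-in before the convolution is applied. Below is a minimal PyTorch sketch of that idea (the class name WSConv2d and the epsilon of 1e-5 are illustrative choices, not necessarily the authors' exact values; see the linked repository for their implementation):

    import torch.nn as nn
    import torch.nn.functional as F

    class WSConv2d(nn.Conv2d):
        """Conv2d that standardizes its weights before each forward pass:
        every output filter is rescaled to zero mean and unit variance
        over its fan-in (in_channels x kernel_h x kernel_w)."""

        def forward(self, x):
            w = self.weight
            # Per-filter mean and std over the fan-in dimensions.
            mean = w.mean(dim=(1, 2, 3), keepdim=True)
            std = w.std(dim=(1, 2, 3), keepdim=True) + 1e-5  # eps for stability
            w = (w - mean) / std
            return F.conv2d(x, w, self.bias, self.stride,
                            self.padding, self.dilation, self.groups)

In a network, such a layer would typically be paired with Group Normalization as the paper describes, e.g. WSConv2d(64, 128, 3, padding=1) followed by nn.GroupNorm(32, 128). Because the standardization is applied to the weights rather than the activations, it is independent of the batch size, which is what makes it usable with 1 image/GPU.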


