Filtered Batch Normalization

10/16/2020
by Andras Horvath, et al.

It is a common assumption that the activations of different layers in a neural network follow a Gaussian distribution. This distribution can be transformed using normalization techniques, such as batch normalization, increasing convergence speed and improving accuracy. In this paper we demonstrate that activations do not necessarily follow a Gaussian distribution in all layers. Neurons in deeper layers are more selective and specific, which can result in extremely large, out-of-distribution activations. We show that filtering out these activations produces more consistent mean and variance estimates for batch normalization during training, which can further improve convergence speed and yield higher validation accuracy.
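The abstract does not spell out the exact filtering criterion, so the following is only a minimal sketch of the idea in PyTorch, assuming a simple Z-score threshold decides which activations are excluded from the batch statistics; the class name FilteredBatchNorm1d, the threshold parameter, and its default value are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

class FilteredBatchNorm1d(nn.Module):
    """Batch normalization whose batch statistics are computed only from
    activations within `threshold` standard deviations of the batch mean
    (a hypothetical stand-in for the paper's filtering rule)."""

    def __init__(self, num_features, threshold=3.0, eps=1e-5, momentum=0.1):
        super().__init__()
        self.threshold = threshold
        self.eps = eps
        self.momentum = momentum
        self.weight = nn.Parameter(torch.ones(num_features))
        self.bias = nn.Parameter(torch.zeros(num_features))
        self.register_buffer("running_mean", torch.zeros(num_features))
        self.register_buffer("running_var", torch.ones(num_features))

    def forward(self, x):  # x: (batch, num_features)
        if self.training:
            # Ordinary batch statistics, used only to locate outliers.
            mean = x.mean(dim=0)
            var = x.var(dim=0, unbiased=False)
            z = (x - mean) / torch.sqrt(var + self.eps)
            # Mask out activations lying far outside the batch distribution.
            mask = (z.abs() <= self.threshold).float()
            kept = mask.sum(dim=0).clamp(min=1.0)
            # Recompute mean and variance from the filtered activations only.
            f_mean = (x * mask).sum(dim=0) / kept
            f_var = ((x - f_mean) ** 2 * mask).sum(dim=0) / kept
            with torch.no_grad():
                self.running_mean.mul_(1 - self.momentum).add_(self.momentum * f_mean)
                self.running_var.mul_(1 - self.momentum).add_(self.momentum * f_var)
            mean, var = f_mean, f_var
        else:
            mean, var = self.running_mean, self.running_var
        # All activations, including the filtered ones, are still normalized
        # with the (filtered) statistics before the affine transform.
        x_hat = (x - mean) / torch.sqrt(var + self.eps)
        return self.weight * x_hat + self.bias
```

As a usage example, fbn = FilteredBatchNorm1d(128); y = fbn(torch.randn(64, 128)) normalizes a batch of 64 feature vectors while ignoring out-of-distribution activations when estimating the batch mean and variance.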
