Understanding Regularization in Batch Normalization

09/04/2018
by Ping Luo, et al.

Batch Normalization (BN) makes the output of each hidden neuron have zero mean and unit variance, improving convergence and generalization when training neural networks. This work studies these phenomena theoretically. We analyze BN using a basic building block of neural networks, consisting of a weight layer, a BN layer, and a nonlinear activation function. This simple network helps us understand the characteristics of BN, and the results are generalized to deep models in numerical studies. We explore BN in three aspects. First, by viewing BN as a stochastic process, we derive an analytical form of the regularization inherent in BN. Second, the optimization dynamics under this regularization show that BN enables training to converge with large maximum and effective learning rates. Third, BN's generalization with regularization is explored using random matrix theory and statistical mechanics. Both simulations and experiments support our analyses.
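As a quick illustration of the building block analyzed here (a weight layer, a BN layer, and a nonlinear activation), the following minimal NumPy sketch normalizes each hidden neuron over a batch to zero mean and unit variance before the activation. The ReLU choice, the shapes, and all variable names are illustrative assumptions, not the authors' code:

    import numpy as np

    def batch_norm(h, gamma=1.0, beta=0.0, eps=1e-5):
        """Normalize each hidden neuron over the batch to zero mean and
        unit variance, then apply the learnable scale and shift."""
        mu = h.mean(axis=0)                # per-neuron batch mean
        var = h.var(axis=0)                # per-neuron batch variance
        h_hat = (h - mu) / np.sqrt(var + eps)
        return gamma * h_hat + beta

    # Building block: weight layer -> BN layer -> nonlinearity (ReLU assumed)
    rng = np.random.default_rng(0)
    x = rng.normal(size=(128, 64))         # a batch of 128 inputs, 64 features
    W = rng.normal(size=(64, 32)) / np.sqrt(64)
    h = x @ W                              # weight layer output
    y = np.maximum(batch_norm(h), 0.0)     # BN, then ReLU

    print(h.mean(), h.std())                            # pre-BN statistics vary
    print(batch_norm(h).mean(), batch_norm(h).std())    # approx. 0 mean, 1 std

Because the batch mean and variance are themselves random (they depend on which samples land in the batch), BN can be viewed as the stochastic process the abstract refers to, which is the source of its implicit regularization.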

Related research

05/03/2019 · Static Activation Function Normalization
Recent seminal work at the intersection of deep neural networks practice...

02/20/2017 · Cosine Normalization: Using Cosine Similarity Instead of Dot Product in Neural Networks
Traditionally, multi-layer neural networks use dot product between the o...

02/05/2018 · Learning Compact Neural Networks with Regularization
We study the impact of regularization for learning neural networks. Our ...

08/24/2021 · Improving Generalization of Batch Whitening by Convolutional Unit Optimization
Batch Whitening is a technique that accelerates and stabilizes training ...

10/19/2020 · MimicNorm: Weight Mean and Last BN Layer Mimic the Dynamic of Batch Normalization
Substantial experiments have validated the success of Batch Normalizatio...

09/15/2022 · Theroretical Insight into Batch Normalization: Data Dependant Auto-Tuning of Regularization Rate
Batch normalization is widely used in deep learning to normalize interme...

10/02/2020 · The Efficacy of L_1 Regularization in Two-Layer Neural Networks
A crucial problem in neural networks is to select the most appropriate n...
