Revisit Batch Normalization: New Understanding from an Optimization View and a Refinement via Composition Optimization

10/15/2018
by   Xiangru Lian, et al.
0

Batch Normalization (BN) has been used extensively in deep learning to achieve faster training process and better resulting models. However, whether BN works strongly depends on how the batches are constructed during training and it may not converge to a desired solution if the statistics on a batch are not close to the statistics over the whole dataset. In this paper, we try to understand BN from an optimization perspective by formulating the optimization problem which motivates BN. We show when BN works and when BN does not work by analyzing the optimization problem. We then propose a refinement of BN based on compositional optimization techniques called Full Normalization (FN) to alleviate the issues of BN when the batches are not constructed ideally. We provide convergence analysis for FN and empirically study its effectiveness to refine BN.

READ FULL TEXT
research
08/18/2020

Training Deep Neural Networks Without Batch Normalization

Training neural networks is an optimization problem, and finding a decen...
research
05/27/2018

Towards a Theoretical Understanding of Batch Normalization

Normalization techniques such as Batch Normalization have been applied v...
research
03/14/2023

Context Normalization for Robust Image Classification

Normalization is a pre-processing step that converts the data into a mor...
research
02/25/2016

Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks

We present weight normalization: a reparameterization of the weight vect...
research
06/29/2021

On the Periodic Behavior of Neural Network Training with Batch Normalization and Weight Decay

Despite the conventional wisdom that using batch normalization with weig...
research
05/29/2018

How Does Batch Normalization Help Optimization? (No, It Is Not About Internal Covariate Shift)

Batch Normalization (BatchNorm) is a widely adopted technique that enabl...
research
06/09/2018

An Optimization Approach to the Langberg-Médard Multiple Unicast Conjecture

The Langberg-Médard multiple unicast conjecture claims that for any stro...

Please sign up or login with your details

Forgot password? Click here to reset