Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

02/11/2015
by Sergey Ioffe and Christian Szegedy

Training Deep Neural Networks is complicated by the fact that the distribution of each layer's inputs changes during training, as the parameters of the previous layers change. This slows down the training by requiring lower learning rates and careful parameter initialization, and makes it notoriously hard to train models with saturating nonlinearities. We refer to this phenomenon as internal covariate shift, and address the problem by normalizing layer inputs. Our method draws its strength from making normalization a part of the model architecture and performing the normalization for each training mini-batch. Batch Normalization allows us to use much higher learning rates and be less careful about initialization. It also acts as a regularizer, in some cases eliminating the need for Dropout. Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin. Using an ensemble of batch-normalized networks, we improve upon the best published result on ImageNet classification: reaching 4.9% top-5 validation error (and 4.8% test error), exceeding the accuracy of human raters.
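
In code, the training-time transform described in the abstract amounts to normalizing each feature by its mini-batch mean and variance and then applying a learned scale and shift. Below is a minimal NumPy sketch, not the authors' implementation; the function and parameter names (batch_norm_train, gamma, beta, eps) are illustrative, and at inference time the mini-batch statistics would be replaced by running averages accumulated during training.

    import numpy as np

    def batch_norm_train(x, gamma, beta, eps=1e-5):
        # x: mini-batch of activations, shape (batch_size, num_features)
        mu = x.mean(axis=0)                     # per-feature mini-batch mean
        var = x.var(axis=0)                     # per-feature mini-batch variance
        x_hat = (x - mu) / np.sqrt(var + eps)   # normalize to zero mean, unit variance
        return gamma * x_hat + beta             # learned scale/shift restore representational power

    # Illustrative usage on a poorly scaled random mini-batch
    x = np.random.randn(32, 100) * 5.0 + 3.0
    gamma, beta = np.ones(100), np.zeros(100)    # learnable parameters, identity initialization
    y = batch_norm_train(x, gamma, beta)
    print(y.mean(axis=0)[:3], y.std(axis=0)[:3]) # approximately 0 mean and unit std per feature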

research
02/09/2018

Batch Kalman Normalization: Towards Training Deep Neural Networks with Micro-Batches

As an indispensable component, Batch Normalization (BN) has successfully...
research
10/27/2019

Inherent Weight Normalization in Stochastic Neural Networks

Multiplicative stochasticity such as Dropout improves the robustness and...
research
08/21/2020

A constrained recursion algorithm for batch normalization of tree-structured LSTM

Tree-structured LSTM is a promising way to consider long-distance interact...
research
10/04/2019

Farkas layers: don't shift the data, fix the geometry

Successfully training deep neural networks often requires either batch n...
research
02/11/2021

High-Performance Large-Scale Image Recognition Without Normalization

Batch normalization is a key component of most image classification mode...
research
01/09/2020

An Internal Covariate Shift Bounding Algorithm for Deep Neural Networks by Unitizing Layers' Outputs

Batch Normalization (BN) techniques have been proposed to reduce the so-...
research
05/29/2018

How Does Batch Normalization Help Optimization? (No, It Is Not About Internal Covariate Shift)

Batch Normalization (BatchNorm) is a widely adopted technique that enabl...
