An Internal Covariate Shift Bounding Algorithm for Deep Neural Networks by Unitizing Layers' Outputs

01/09/2020
by You Huang, et al.

Batch Normalization (BN) techniques aim to reduce the so-called Internal Covariate Shift (ICS) by keeping the distributions of layer outputs unchanged during training, and experiments have shown their effectiveness for training deep neural networks. However, because these BN techniques control only the first two moments, they impose a weak constraint on the layer distributions, and whether such a constraint can actually reduce ICS has been unknown. This paper therefore proposes a measure of ICS based on the Earth Mover (EM) distance and derives upper and lower bounds on the measure to give a theoretical analysis of BN. The upper bound shows that BN techniques can control ICS only for outputs with low dimension and small noise, and that their control is not effective in other cases; the paper also proves that this control merely bounds ICS rather than reducing it. Moreover, the analysis shows that the high-order moments and the noise, which BN cannot control, strongly affect the lower bound. Based on this analysis, the paper proposes an algorithm that unitizes the layer outputs with an adjustable parameter in order to further bound ICS and thereby address the shortcomings of BN. The upper bound for the proposed unitization is noise-free and dominated only by the parameter, so the parameter can be trained to tune the bound and hence to control ICS. In addition, the unitization is embedded into the framework of BN to reduce information loss. Experiments show that the proposed algorithm outperforms existing BN techniques on the CIFAR-10, CIFAR-100 and ImageNet datasets.
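As an illustration only, here is a minimal PyTorch sketch of how such a unitization might be embedded in a BN-style layer, reconstructed from the abstract alone. The class name UnitizedBN, the parameter alpha, and the choice of per-sample L2 unitization applied after batch standardization are assumptions for this sketch, not the paper's actual formulation.

```python
import torch
import torch.nn as nn

class UnitizedBN(nn.Module):
    """Hypothetical sketch: output unitization embedded in a BN-style layer.

    Reconstructed from the abstract only; the paper's exact algorithm
    may differ. Activations are standardized as in BatchNorm, then
    rescaled ("unitized") to a norm governed by a trainable parameter.
    """

    def __init__(self, num_features, eps=1e-5):
        super().__init__()
        # Adjustable unitization parameter from the abstract; the claimed
        # upper bound on ICS is noise-free and dominated by this value.
        self.alpha = nn.Parameter(torch.ones(1))
        # Standard BN affine parameters (scale and shift), kept so that
        # embedding into the BN framework reduces information loss.
        self.gamma = nn.Parameter(torch.ones(num_features))
        self.beta = nn.Parameter(torch.zeros(num_features))
        self.eps = eps

    def forward(self, x):
        # Standardize each feature over the batch, as in BatchNorm.
        mean = x.mean(dim=0, keepdim=True)
        var = x.var(dim=0, unbiased=False, keepdim=True)
        x_hat = (x - mean) / torch.sqrt(var + self.eps)
        # Unitize each sample's activation vector so the output lies on
        # a sphere of radius |alpha|, independent of input noise
        # (our reading of "unitizing layers' outputs").
        norm = x_hat.norm(dim=1, keepdim=True).clamp_min(self.eps)
        u = self.alpha * x_hat / norm
        # BN-style affine transform to retain representational power.
        return self.gamma * u + self.beta
```

The snippet keeps to the fully connected case for brevity; for convolutional layers the same idea would presumably be applied per channel over the flattened spatial dimensions.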
