New Interpretations of Normalization Methods in Deep Learning

06/16/2020
by Jiacheng Sun et al.

In recent years, a variety of normalization methods have been proposed to help train neural networks, such as batch normalization (BN), layer normalization (LN), weight normalization (WN), and group normalization (GN). However, mathematical tools for analyzing all of these normalization methods have been lacking. In this paper, we first establish a lemma that provides the necessary tools. We then use these tools to conduct an in-depth analysis of popular normalization methods and reach the following conclusions: 1) most normalization methods can be interpreted in a unified framework, namely as normalizing pre-activations or weights onto a sphere; 2) since most existing normalization methods are scaling invariant, optimization can be carried out on a sphere with the scaling symmetry removed, which helps stabilize network training; 3) we prove that training with these normalization methods makes the norm of the weights increase, which could cause adversarial vulnerability, because a larger weight norm amplifies the effect of an attack perturbation. Finally, a series of experiments is conducted to verify these claims.
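To make the scaling-invariance claim concrete, the sketch below (a hypothetical PyTorch toy example, not code from the paper; all shapes, names, and constants are illustrative) checks that rescaling the weights of a linear layer by a positive constant leaves the batch-normalized output essentially unchanged, so the loss depends only on the weight direction, i.e., a point on the sphere.

import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy batch and a linear layer followed by pure batch normalization
# (affine=False: no learned scale/shift, just the normalizing step).
x = torch.randn(32, 16)
linear = nn.Linear(16, 8, bias=False)
bn = nn.BatchNorm1d(8, affine=False)
bn.train()  # normalize with the current batch statistics

out = bn(linear(x)).detach()

# Rescale the weights by an arbitrary positive constant c.
c = 7.3
with torch.no_grad():
    linear.weight.mul_(c)

out_scaled = bn(linear(x)).detach()

# BN divides by sqrt(var + eps), so the invariance is exact only up to
# the small eps term; the outputs agree to within numerical tolerance.
print(torch.allclose(out, out_scaled, atol=1e-4))  # True

Without the BN layer the two outputs would differ exactly by the factor c; this is the scaling symmetry that, in the paper's framing, can be removed by restricting optimization to the sphere of unit-norm weights.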


Related research

Weight and Gradient Centralization in Deep Neural Networks (10/02/2020)
Batch normalization is currently the most widely used variant of interna...

Training Scale-Invariant Neural Networks on the Sphere Can Happen in Three Regimes (09/08/2022)
A fundamental property of deep learning normalization techniques, such a...

Scale Normalization (04/26/2016)
One of the difficulties of training deep neural networks is caused by im...

A Comprehensive and Modularized Statistical Framework for Gradient Norm Equality in Deep Neural Networks (01/01/2020)
In recent years, plenty of metrics have been proposed to identify networ...

Normalization of regressor excitation as a part of dynamic regressor extension and mixing procedure (05/01/2021)
The method of excitation normalization of the regressor, which is used i...

Breaking Time Invariance: Assorted-Time Normalization for RNNs (09/28/2022)
Methods such as Layer Normalization (LN) and Batch Normalization (BN) ha...

On problematic practice of using normalization in Self-modeling/Multivariate Curve Resolution (S/MCR) (08/05/2023)
The paper is briefly dealing with greater or lesser misused normalizatio...
