Normalization Techniques in Training DNNs: Methodology, Analysis and Application

09/27/2020
by Lei Huang, et al.
Normalization techniques are essential for accelerating the training and improving the generalization of deep neural networks (DNNs), and have been used successfully in various applications. This paper reviews and comments on the past, present and future of normalization methods in the context of DNN training. We provide a unified picture of the main motivation behind different approaches from the perspective of optimization, and present a taxonomy for understanding the similarities and differences between them. Specifically, we decompose the pipeline of the most representative activation-normalizing methods into three components: normalization area partitioning, the normalization operation, and normalization representation recovery. In doing so, we provide insight for designing new normalization techniques. Finally, we discuss the current progress in understanding normalization methods, and provide a comprehensive review of the applications of normalization to particular tasks, in which it can effectively solve key issues.
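
The three-component decomposition named in the abstract can be illustrated with a small NumPy sketch for a batch-normalization-style layer. This is only an illustration under stated assumptions, not the authors' code: the function names (`partition_batch`, `standardize`, `recover`, `batch_norm_sketch`), the `(N, C, H, W)` input layout, and the choice of standardization with a per-channel affine transform are all assumptions made for the example, and the sketch covers only training-time batch statistics.

```python
import numpy as np


def partition_batch(x):
    """Normalization area partitioning (assumed batch-norm style).

    Groups an (N, C, H, W) tensor into C rows, one per channel, each
    holding the N*H*W values that will share the same statistics.
    """
    n, c, h, w = x.shape
    return x.transpose(1, 0, 2, 3).reshape(c, n * h * w)


def standardize(groups, eps=1e-5):
    """Normalization operation: zero mean, unit variance within each row."""
    mean = groups.mean(axis=1, keepdims=True)
    var = groups.var(axis=1, keepdims=True)
    return (groups - mean) / np.sqrt(var + eps)


def recover(x_hat, gamma, beta):
    """Normalization representation recovery: per-channel scale and shift."""
    return gamma[:, None] * x_hat + beta[:, None]


def batch_norm_sketch(x, gamma, beta, eps=1e-5):
    """Chain the three components and restore the (N, C, H, W) layout."""
    n, c, h, w = x.shape
    groups = partition_batch(x)
    x_hat = standardize(groups, eps)
    y = recover(x_hat, gamma, beta)
    return y.reshape(c, n, h, w).transpose(1, 0, 2, 3)
```

Under the same assumptions, swapping `partition_batch` for a function that groups values per sample instead of per channel would give a layer-normalization-style variant while leaving the other two components unchanged.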

05/27/2018

Towards a Theoretical Understanding of Batch Normalization

Normalization techniques such as Batch Normalization have been applied v...

05/23/2018

A Unified Framework for Training Neural Networks

The lack of mathematical tractability of Deep Neural Networks (DNNs) has...

06/10/2021

Beyond BatchNorm: Towards a General Understanding of Normalization in Deep Learning

Inspired by BatchNorm, there has been an explosion of normalization laye...

04/20/2020

Towards Understanding Normalization in Neural ODEs

Normalization is an important and vastly investigated technique in deep ...

09/05/2021

Statistical computation methods for microbiome compositional data network inference

Microbes can affect processes from food production to human health. Such...

06/15/2020

Weighted Optimization: better generalization by smoother interpolation

We provide a rigorous analysis of how implicit bias towards smooth inter...