Stochastic Whitening Batch Normalization

06/03/2021
by Shengdong Zhang, et al.

Batch Normalization (BN) is a popular technique for training Deep Neural Networks (DNNs). BN uses scaling and shifting to normalize activations of mini-batches to accelerate convergence and improve generalization. The recently proposed Iterative Normalization (IterNorm) method improves these properties by whitening the activations iteratively using Newton's method. However, since Newton's method initializes the whitening matrix independently at each training step, no information is shared between consecutive steps. In this work, instead of computing the whitening matrix exactly at each training step, we estimate it gradually during training in an online fashion, using our proposed Stochastic Whitening Batch Normalization (SWBN) algorithm. We show that while SWBN improves the convergence rate and generalization of DNNs, its computational overhead is less than that of IterNorm. Due to the high efficiency of the proposed method, it can be easily employed in most DNN architectures with a large number of layers. We provide comprehensive experiments and comparisons between BN, IterNorm, and SWBN layers to demonstrate the effectiveness of the proposed technique in conventional (many-shot) image classification and few-shot classification tasks.
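
To illustrate the idea of refining a whitening matrix online rather than recomputing it at every step, here is a minimal NumPy sketch. The class name, the step size eta, the momentum value, and the specific update formula (a generic stochastic whitening step that nudges the covariance of the whitened output toward the identity) are illustrative assumptions, not the exact SWBN update rule from the paper; the learnable scale and shift parameters of a BN-style layer are also omitted for brevity.

```python
import numpy as np

class StochasticWhiteningSketch:
    """Illustrative online-whitening layer; not the exact SWBN update rule."""

    def __init__(self, num_features, eta=0.01, momentum=0.9):
        self.W = np.eye(num_features)            # whitening matrix, refined step by step
        self.running_mean = np.zeros(num_features)
        self.eta = eta                            # step size of the whitening update (assumed)
        self.momentum = momentum                  # momentum for the running mean (assumed)

    def forward(self, x, training=True):
        # x: (batch_size, num_features) mini-batch of activations
        if training:
            mu = x.mean(axis=0)
            self.running_mean = self.momentum * self.running_mean + (1 - self.momentum) * mu
        else:
            mu = self.running_mean

        xc = x - mu                               # center the activations
        y = xc @ self.W.T                         # whiten with the current estimate of W

        if training:
            # Online update: move W so that cov(y) drifts toward the identity.
            # This generic stochastic rule stands in for the update derived in the paper.
            cov_y = (y.T @ y) / y.shape[0]
            self.W += self.eta * (np.eye(self.W.shape[0]) - cov_y) @ self.W
        return y
```

In this sketch the whitening matrix carries information across consecutive mini-batches, which is the property the abstract contrasts with IterNorm's per-step Newton iterations; the per-step cost is a few matrix multiplications rather than an iterative matrix-root computation.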

