Training Faster by Separating Modes of Variation in Batch-normalized Models

06/07/2018
by Mahdi M. Kalayeh, et al.

Batch Normalization (BN) is essential for effectively training state-of-the-art deep Convolutional Neural Networks (CNNs). It normalizes the inputs to each layer during training using the statistics of each mini-batch. In this work, we study BN from the viewpoint of Fisher kernels. We show that, assuming the samples within a mini-batch come from the same probability density function, BN is identical to the Fisher vector of a Gaussian distribution. This means BN can be explained in terms of kernels that naturally emerge from the probability density function of the underlying data distribution. However, given the rectifying non-linearities employed in CNN architectures, the distributions of inputs to the layers are heavy-tailed and asymmetric. Therefore, we propose approximating the underlying data distribution not with one Gaussian density but with a mixture of them. Deriving the Fisher vector for a Gaussian Mixture Model (GMM) reveals that BN can be improved by independently normalizing with respect to the statistics of disentangled sub-populations. We refer to our proposed soft piecewise version of BN as Mixture Normalization (MN). Through an extensive set of experiments on CIFAR-10 and CIFAR-100, we show that MN not only effectively accelerates the training of image classification networks and Generative Adversarial Networks, but also yields higher-quality models.
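To make the contrast between the two normalizations concrete, here is a minimal sketch on a one-dimensional mini-batch. It is an illustration under stated assumptions, not the authors' formulation: the function names (`batch_norm`, `mixture_norm`, `gaussian_pdf`) are hypothetical, the mixture parameters are assumed to be given rather than estimated from the batch, and the responsibility-weighted blend is a simplified reading of "soft piecewise" normalization.

```python
import math

def batch_norm(xs, eps=1e-5):
    """Normalize a 1-D mini-batch with its own mean and variance."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return [(x - mean) / math.sqrt(var + eps) for x in xs]

def gaussian_pdf(x, mean, var):
    """Density of a univariate Gaussian at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def mixture_norm(xs, weights, means, variances, eps=1e-5):
    """Illustrative soft piecewise normalization (an assumption, not the paper's exact rule).

    Each sample is normalized against every mixture component, and the
    per-component results are blended by the sample's responsibilities.
    """
    out = []
    for x in xs:
        # Unnormalized posterior of each component for this sample.
        joint = [w * gaussian_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
        total = sum(joint)
        resp = [j / total for j in joint]
        out.append(sum(r * (x - m) / math.sqrt(v + eps)
                       for r, m, v in zip(resp, means, variances)))
    return out

# A bimodal mini-batch: two well-separated sub-populations.
batch = [-5.2, -5.0, -4.8, 4.8, 5.0, 5.2]
bn_out = batch_norm(batch)
# Mixture parameters are assumed known here purely for illustration.
mn_out = mixture_norm(batch, weights=[0.5, 0.5],
                      means=[-5.0, 5.0], variances=[0.04, 0.04])
```

On this bimodal batch, `batch_norm` leaves the two clusters sitting on opposite sides of zero, whereas `mixture_norm` centers each cluster on zero with respect to its own component. In practice the mixture parameters would have to be estimated from the data, which this sketch leaves out.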


research
07/31/2016

Deep FisherNet for Object Classification

Despite the great success of convolutional neural networks (CNN) for the...
research
02/09/2018

Batch Kalman Normalization: Towards Training Deep Neural Networks with Micro-Batches

As an indispensable component, Batch Normalization (BN) has successfully...
research
06/08/2020

Passive Batch Injection Training Technique: Boosting Network Performance by Injecting Mini-Batches from a different Data Distribution

This work presents a novel training technique for deep neural networks t...
research
06/30/2020

PriorGAN: Real Data Prior for Generative Adversarial Nets

Generative adversarial networks (GANs) have achieved rapid progress in l...
research
12/02/2022

Compound Batch Normalization for Long-tailed Image Classification

Significant progress has been made in learning image classification neur...
research
05/20/2022

Kernel Normalized Convolutional Networks

Existing deep convolutional neural network (CNN) architectures frequentl...
research
07/12/2019

Estimating densities with nonlinear support using Fisher-Gaussian kernels

Current tools for multivariate density estimation struggle when the dens...
