Universal Statistics of Fisher Information in Deep Neural Networks: Mean Field Approach

06/04/2018
by Ryo Karakida, et al.

This study analyzes the Fisher information matrix (FIM) by applying mean-field theory to deep neural networks with random weights. We theoretically find novel statistics of the FIM, which are universal among a wide class of deep networks with any number of layers and various activation functions. Although most of the FIM's eigenvalues are close to zero, the maximum eigenvalue takes on a huge value and the eigenvalue distribution has an extremely long tail. These statistics suggest that the shape of a loss landscape is locally flat in most dimensions, but strongly distorted in the other dimensions. Moreover, our theory of the FIM leads to quantitative evaluation of learning in deep networks. First, the maximum eigenvalue enables us to estimate an appropriate size of a learning rate for steepest gradient methods to converge. Second, the flatness induced by the small eigenvalues is connected to generalization ability through a norm-based capacity measure.
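The spectral picture described in the abstract can be explored numerically. What follows is a minimal sketch, not the authors' code. It assumes a small bias-free tanh network with a linear output layer, Gaussian random weights with variance 1/fan_in, standard normal inputs, and a unit-variance Gaussian output model, so that the FIM reduces to the average of J^T J over inputs, where J is the Jacobian of the outputs with respect to the weights. All function names are hypothetical. The bound 2 / lambda_max is the standard stability condition for gradient descent on a quadratic loss, used here as an illustration and not necessarily the exact expression derived in the paper.

```python
import numpy as np


def random_mlp(widths, rng, sigma_w=1.0):
    """Draw weights i.i.d. from N(0, sigma_w^2 / fan_in). Biases are omitted (assumption)."""
    return [rng.normal(0.0, sigma_w / np.sqrt(n_in), size=(n_out, n_in))
            for n_in, n_out in zip(widths[:-1], widths[1:])]


def forward(weights, x):
    """tanh hidden layers followed by a linear output layer."""
    h = x
    for W in weights[:-1]:
        h = np.tanh(W @ h)
    return weights[-1] @ h


def output_jacobian(weights, x):
    """Jacobian of the outputs w.r.t. all weights, shape (n_outputs, n_params)."""
    # Forward pass: acts[l] is the input fed into layer l.
    acts = [x]
    for W in weights[:-1]:
        acts.append(np.tanh(W @ acts[-1]))
    n_out = weights[-1].shape[0]
    # delta[k, i] = d f_k / d (pre-activation i of the current layer).
    delta = np.eye(n_out)
    grads = [None] * len(weights)
    for l in reversed(range(len(weights))):
        # d f_k / d W_l[i, j] = delta[k, i] * acts[l][j], flattened row-major.
        grads[l] = (delta[:, :, None] * acts[l][None, None, :]).reshape(n_out, -1)
        if l > 0:
            # Backpropagate through W_l and the tanh that produced acts[l].
            delta = (delta @ weights[l]) * (1.0 - acts[l] ** 2)[None, :]
    return np.concatenate(grads, axis=1)


def fim_spectrum(widths, n_samples, seed=0):
    """Eigenvalues (ascending) of the empirical FIM, the sample mean of J^T J."""
    rng = np.random.default_rng(seed)
    weights = random_mlp(widths, rng)
    n_params = sum(W.size for W in weights)
    fim = np.zeros((n_params, n_params))
    for _ in range(n_samples):
        x = rng.normal(size=widths[0])
        J = output_jacobian(weights, x)
        fim += J.T @ J
    fim /= n_samples
    return np.linalg.eigvalsh(fim)


def max_stable_lr(eigs):
    """Quadratic-loss stability bound for gradient descent: eta < 2 / lambda_max."""
    return 2.0 / eigs.max()


if __name__ == "__main__":
    eigs = fim_spectrum([5, 20, 20, 1], n_samples=500)
    print("mean eigenvalue:", eigs.mean())
    print("max eigenvalue: ", eigs.max())
    print("learning-rate bound 2 / lambda_max:", max_stable_lr(eigs))
```

Comparing the printed mean and maximum eigenvalues for increasing hidden widths is one way to see how the gap between the bulk and the largest eigenvalue behaves. The sample count should stay comparable to or above the number of parameters, otherwise many eigenvalues are zero simply because the empirical matrix is rank-deficient.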


research
03/06/2019

Mean-field Analysis of Batch Normalization

Batch Normalization (BatchNorm) is an extremely useful component of mode...
research
10/14/2019

Pathological spectra of the Fisher information metric and its variants in deep neural networks

The Fisher information matrix (FIM) plays an essential role in statistic...
research
10/09/2018

Information Geometry of Orthogonal Initializations and Training

Recently mean field theory has been successfully used to analyze propert...
research
11/30/2018

Measure, Manifold, Learning, and Optimization: A Theory Of Neural Networks

We present a formal measure-theoretical theory of neural networks (NN) b...
research
06/16/2016

Exponential expressivity in deep neural networks through transient chaos

We combine Riemannian geometry with the mean field theory of high dimens...
research
10/13/2019

Large Deviation Analysis of Function Sensitivity in Random Deep Neural Networks

Mean field theory has been successfully used to analyze deep neural netw...
research
10/27/2021

Does the Data Induce Capacity Control in Deep Learning?

This paper studies how the dataset may be the cause of the anomalous gen...
