Pathological spectra of the Fisher information metric and its variants in deep neural networks

10/14/2019
by Ryo Karakida, et al.

The Fisher information matrix (FIM) plays an essential role in statistics and machine learning as a Riemannian metric tensor. Focusing on the FIM and its variants in deep neural networks (DNNs), we reveal their characteristic behavior when the network is sufficiently wide and has random weights and biases. Various FIMs asymptotically show pathological eigenvalue spectra in the sense that a small number of eigenvalues take on large values while most of them are close to zero. This implies that the local shape of the parameter space or loss landscape is very steep in a few specific directions and almost flat in the other directions. Similar pathological spectra appear in other variants of FIMs: one is the neural tangent kernel; another is a metric for the input signal and feature space that arises from feedforward signal propagation. The quantitative understanding of the FIM and its variants provided here offers important perspectives on learning and signal processing in large-scale DNNs.
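To make the claimed spectral shape concrete, here is a minimal sketch (illustrative, not the authors' code) that estimates the empirical FIM spectrum of a random two-layer tanh network with a scalar output under a Gaussian output model. It uses the standard duality that the nonzero eigenvalues of the parameter-space FIM F = J^T J / N coincide with those of the N x N Gram matrix J J^T / N, where row n of J is the gradient of the output on example n with respect to all parameters. The layer widths, sample count, initialization scales, and Gaussian inputs are illustrative assumptions, not values from the paper.

```python
# A minimal sketch (illustrative, not the paper's code): empirical FIM
# spectrum of a random two-layer tanh network with a scalar output,
# under a Gaussian output model. For squared-error likelihoods the FIM
# is F = J^T J / N; its nonzero eigenvalues equal those of the small
# N x N Gram matrix J J^T / N, which we diagonalize instead.
import numpy as np

rng = np.random.default_rng(0)
d_in, width, n_samples = 100, 2000, 50   # sizes are arbitrary choices

# Random weights and biases (variance scales chosen for illustration).
W1 = rng.normal(0.0, 1.0 / np.sqrt(d_in), (width, d_in))
b1 = rng.normal(0.0, 0.1, width)
W2 = rng.normal(0.0, 1.0 / np.sqrt(width), width)  # scalar output head
b2 = rng.normal(0.0, 0.1)

X = rng.normal(0.0, 1.0, (n_samples, d_in))  # random Gaussian inputs

def output_gradient(x):
    """Gradient of f(x) = W2 . tanh(W1 x + b1) + b2 w.r.t. all parameters."""
    a = np.tanh(W1 @ x + b1)       # hidden activations, shape (width,)
    dh = W2 * (1.0 - a**2)         # backprop through tanh, shape (width,)
    dW1 = np.outer(dh, x)          # df/dW1, shape (width, d_in)
    # Flatten [df/dW1, df/db1, df/dW2, df/db2] into one parameter vector.
    return np.concatenate([dW1.ravel(), dh, a, [1.0]])

J = np.stack([output_gradient(x) for x in X])   # (n_samples, n_params)
gram = J @ J.T / n_samples                      # dual of the FIM
eigs = np.sort(np.linalg.eigvalsh(gram))[::-1]  # descending eigenvalues

print(f"largest eigenvalue: {eigs[0]:.3f}")
print(f"mean eigenvalue:    {eigs.mean():.3f}")
print(f"max / mean ratio:   {eigs[0] / eigs.mean():.1f}")
```

On typical runs the largest eigenvalue sits far above the mean of the nonzero spectrum, the concentration in a few steep directions that the abstract describes; widening the network tends to make the gap more pronounced, in line with the paper's wide-network setting.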


Related research

06/04/2018
Universal Statistics of Fisher Information in Deep Neural Networks: Mean Field Approach
This study analyzes the Fisher information matrix (FIM) by applying mean...

05/27/2019
Lightlike Neuromanifolds, Occam's Razor and Deep Learning
Why do deep neural networks generalize with a very high dimensional para...

10/06/2021
Which Shortcut Cues Will DNNs Choose? A Study from the Parameter-Space Perspective
Deep neural networks (DNNs) often rely on easy-to-learn discriminatory f...

06/14/2020
The Spectrum of Fisher Information of Deep Networks Achieving Dynamical Isometry
The Fisher information matrix (FIM) is fundamental for understanding the...

06/07/2019
The Normalization Method for Alleviating Pathological Sharpness in Wide Neural Networks
Normalization methods play an important role in enhancing the performanc...

07/24/2019
A Fine-Grained Spectral Perspective on Neural Networks
Are neural networks biased toward simple functions? Does depth always he...

03/24/2021
Asymptotic Freeness of Layerwise Jacobians Caused by Invariance of Multilayer Perceptron: The Haar Orthogonal Case
Free Probability Theory (FPT) provides rich knowledge for handling mathe...
