Beyond Random Matrix Theory for Deep Networks

by Diego Granziol, et al.

We investigate whether the Wigner semi-circle and Marcenko-Pastur distributions, often used in the theoretical analysis of deep neural networks, match empirically observed spectral densities. We find that, even allowing for outliers, the observed spectral shapes deviate strongly from these theoretical predictions. This raises major questions about the usefulness of these models in deep learning. We further show that theoretical results, such as the layered nature of critical points, depend strongly on the exact form of these limiting spectral densities. We consider two new classes of matrix ensembles: random Wigner/Wishart ensemble products and percolated Wigner/Wishart ensembles, both of which better match observed spectra. They also give large discrete spectral peaks at the origin, providing a theoretical explanation for the observation that various optima can be connected by one-dimensional paths of low loss values. We further show that, in the case of a random matrix product, the weight of the discrete spectral component at 0 depends on the ratio of the dimensions of the weight matrices.
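The last claim, that the spectral mass at 0 of a random matrix product depends on the ratio of the matrix dimensions, can be illustrated with a minimal numerical sketch. It is not the authors' construction: the Gaussian entries, the scaling, the function name `zero_mass_of_product`, and the tolerance `tol` are all assumptions chosen for illustration. The sketch relies only on the fact that a product of an n × m and an m × n matrix has rank at most min(n, m), so for m < n a fraction of at least 1 - m/n of the eigenvalues of the resulting Gram matrix must vanish.

```python
import numpy as np


def zero_mass_of_product(n, m, seed=0, tol=1e-8):
    """Fraction of (numerically) zero eigenvalues of (W1 W2)(W1 W2)^T.

    W1 is n x m and W2 is m x n with i.i.d. Gaussian entries (an
    illustrative assumption, not the ensemble defined in the paper).
    """
    rng = np.random.default_rng(seed)
    w1 = rng.standard_normal((n, m)) / np.sqrt(m)
    w2 = rng.standard_normal((m, n)) / np.sqrt(n)

    product = w1 @ w2           # n x n matrix of rank at most min(n, m)
    gram = product @ product.T  # symmetric positive semi-definite

    eigenvalues = np.linalg.eigvalsh(gram)
    # Treat eigenvalues far below the largest one as zero.
    return float(np.mean(eigenvalues < tol * eigenvalues.max()))


if __name__ == "__main__":
    n = 200
    for m in (50, 100, 150):
        mass = zero_mass_of_product(n, m)
        print(f"m/n = {m / n:.2f}: zero-eigenvalue fraction = {mass:.3f}")
```

Under these assumptions, shrinking the inner dimension m relative to n increases the share of exactly zero eigenvalues, which is the rank argument behind a discrete peak at the origin. The shape of the remaining bulk spectrum is a separate question that this sketch does not address.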


Spectral Ergodicity in Deep Learning Architectures via Surrogate Random Matrices

In this work a novel method to quantify spectral ergodicity for random m...

The Perturbative Resolvent Method: spectral densities of random matrix ensembles via perturbation theory

We present a simple, perturbative approach for calculating spectral dens...

Traces of Class/Cross-Class Structure Pervade Deep Learning Spectra

Numerous researchers recently applied empirical spectral analysis to the...

Applicability of Random Matrix Theory in Deep Learning

We investigate the local spectral statistics of the loss surface Hessian...

Limiting Distributions of Spectral Radii for Product of Matrices from the Spherical Ensemble

Consider the product of m independent n × n random matrices from the sphe...

Random matrix analysis of deep neural network weight matrices

Neural networks have been used successfully in a variety of fields, whic...

The Emergence of Spectral Universality in Deep Networks

Recent work has shown that tight concentration of the entire spectrum of...