Extended critical regimes of deep neural networks

03/24/2022
by Cheng Kevin Qu, et al.

Deep neural networks (DNNs) have been successfully applied to many real-world problems, but a complete understanding of their dynamical and computational principles is still lacking. Conventional theoretical frameworks for analysing DNNs often assume random networks with coupling weights obeying Gaussian statistics. However, non-Gaussian, heavy-tailed coupling is ubiquitous in DNNs. Here, by weaving together theories of heavy-tailed random matrices and non-equilibrium statistical physics, we develop a new type of mean-field theory for DNNs that predicts that heavy-tailed weights enable the emergence of an extended critical regime without fine-tuning of parameters. In this extended critical regime, DNNs exhibit rich and complex propagation dynamics across layers. We further show that this extended criticality endows DNNs with profound computational advantages: it balances the contraction and expansion of internal neural representations and speeds up training, thereby providing a theoretical guide for the design of efficient neural architectures.
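To make the propagation claim concrete, the following minimal NumPy sketch (not the authors' code; the width, depth, tanh nonlinearity, and the 1/width scaling of the Cauchy weights are illustrative assumptions) contrasts how activation norms evolve layer by layer under Gaussian versus heavy-tailed weight initialization. Standard Cauchy entries serve as a simple stand-in for heavy-tailed, alpha-stable coupling with alpha = 1.

```python
# Minimal sketch, not the authors' code: compare layerwise signal propagation
# in a deep tanh network under Gaussian vs. heavy-tailed (Cauchy) weights.
# width, depth, and the weight scalings are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
width, depth = 500, 50

def propagate(sample_weights, x):
    """Push x through `depth` tanh layers, recording the per-layer activation norm."""
    norms = []
    for _ in range(depth):
        W = sample_weights()          # fresh random weight matrix per layer
        x = np.tanh(W @ x)
        norms.append(np.linalg.norm(x) / np.sqrt(width))
    return norms

# Gaussian weights at the classical critical scale (std ~ 1/sqrt(width)).
gaussian = lambda: rng.normal(0.0, 1.0 / np.sqrt(width), size=(width, width))

# Heavy-tailed weights: standard Cauchy entries (alpha-stable, alpha = 1)
# with an assumed 1/width scaling.
cauchy = lambda: rng.standard_cauchy(size=(width, width)) / width

x0 = rng.normal(size=width)
print("Gaussian:", [f"{n:.3f}" for n in propagate(gaussian, x0)[::10]])
print("Cauchy:  ", [f"{n:.3f}" for n in propagate(cauchy, x0)[::10]])
```

The comparison is only qualitative: varying the Gaussian scale illustrates the fine-tuning that classical mean-field criticality requires, while the paper derives the extended critical regime of heavy-tailed networks analytically using random-matrix and non-equilibrium statistical-physics methods rather than simulations of this kind.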


Related research

01/24/2019 · Heavy-Tailed Universality Predicts Trends in Test Accuracies for Very Large Pre-Trained Deep Neural Networks
Given two or more Deep Neural Networks (DNNs) with the same or similar a...

01/24/2019 · Traditional and Heavy-Tailed Self Regularization in Neural Network Models
Random Matrix Theory (RMT) is applied to analyze the weight matrices of ...

06/01/2019 · A mean-field limit for certain deep neural networks
Understanding deep neural networks (DNNs) is a key challenge in the theo...

10/02/2018 · Implicit Self-Regularization in Deep Neural Networks: Evidence from Random Matrix Theory and Implications for Learning
Random Matrix Theory (RMT) is applied to analyze weight matrices of Deep...

10/06/2021 · Characterizing Learning Dynamics of Deep Neural Networks via Complex Networks
In this paper, we interpret Deep Neural Networks with Complex Network Th...

07/28/2016 · Variational perturbation and extended Plefka approaches to dynamics on random networks: the case of the kinetic Ising model
We describe and analyze some novel approaches for studying the dynamics ...

03/24/2021 · Asymptotic Freeness of Layerwise Jacobians Caused by Invariance of Multilayer Perceptron: The Haar Orthogonal Case
Free Probability Theory (FPT) provides rich knowledge for handling mathe...
