Statistical Physics of Deep Neural Networks: Initialization toward Optimal Channels

12/04/2022
by Kangyu Weng, et al.

In deep learning, neural networks serve as noisy channels between input data and its representation. This perspective naturally relates deep learning to the pursuit of constructing channels with optimal performance in information transmission and representation. While considerable effort has been devoted to realizing optimal channel properties during network optimization, we study a frequently overlooked possibility: that neural networks can be initialized toward optimal channels. Our theory, consistent with experimental validation, identifies the primary mechanisms underlying this overlooked possibility and suggests intrinsic connections between statistical physics and deep learning. Unlike conventional theories that characterize neural networks using the classic mean-field approximation, we offer analytic proof that this widely used simplification scheme is not valid for studying neural networks as information channels. To fill this gap, we develop a corrected mean-field framework for characterizing the limiting behaviors of information propagation in neural networks without strong assumptions on the inputs. Based on it, we propose an analytic theory proving that mutual information between inputs and propagated signals is maximized when neural networks are initialized at dynamical isometry, a regime in which information propagates through norm-preserving mappings. These theoretical predictions are validated by experiments on real neural networks, suggesting that our theory is robust to finite-size effects. Finally, we analyze our findings through the lens of information bottleneck theory to establish the precise relations among dynamical isometry, mutual information maximization, and optimal channel properties in deep learning.
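To make the dynamical-isometry condition concrete, the sketch below is our own illustration, not code from the paper: it assumes a deep tanh multilayer perceptron (the width, depth, and small input scale are arbitrary choices that keep tanh near its linear regime) and compares the singular-value spectrum of the input-output Jacobian under orthogonal initialization, which the dynamical-isometry literature associates with norm-preserving signal propagation, against i.i.d. Gaussian initialization, where the spectrum spreads exponentially with depth.

# Minimal sketch (our illustration, not the paper's code): contrast the
# input-output Jacobian spectrum of a deep tanh network under orthogonal
# vs. i.i.d. Gaussian initialization. Width, depth, and input scale are
# illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def orthogonal(n):
    # Random orthogonal matrix via QR of a Gaussian matrix,
    # with column signs fixed by the diagonal of R.
    q, r = np.linalg.qr(rng.standard_normal((n, n)))
    return q * np.sign(np.diag(r))

def jacobian_singular_values(weights, x):
    # Accumulate J = prod_l D_l W_l with D_l = diag(tanh'(h_l)),
    # the Jacobian of the network output with respect to the input x.
    J = np.eye(len(x))
    h = x
    for W in weights:
        pre = W @ h
        J = np.diag(1.0 - np.tanh(pre) ** 2) @ W @ J
        h = np.tanh(pre)
    return np.linalg.svd(J, compute_uv=False)

width, depth = 128, 50
x = 0.01 * rng.standard_normal(width)  # small scale: tanh is near-linear

inits = {
    "orthogonal": [orthogonal(width) for _ in range(depth)],
    "gaussian": [rng.standard_normal((width, width)) / np.sqrt(width)
                 for _ in range(depth)],
}
for name, ws in inits.items():
    s = jacobian_singular_values(ws, x)
    # Dynamical isometry: all singular values concentrate near 1.
    print(f"{name:10s} mean sv = {s.mean():.3f}, max/min = {s.max() / s.min():.2e}")

Under orthogonal weights the singular values stay tightly clustered near 1, so an input perturbation in any direction is transmitted at roughly unit gain; this is the sense in which the mapping is norm-preserving. Under Gaussian weights the max/min ratio blows up with depth, so some directions are amplified and others suppressed.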
