Initial Guessing Bias: How Untrained Networks Favor Some Classes

by Emanuele Francazi et al.

The initial state of a neural network plays a central role in shaping its subsequent training dynamics. In the context of classification problems, we provide a theoretical analysis demonstrating that the architecture of a neural network can predispose the model to assign all predictions to the same class, even before training begins and in the absence of explicit biases. We show that the presence of this phenomenon, which we call "Initial Guessing Bias" (IGB), depends on architectural choices such as activation functions, max-pooling layers, and network depth. Our analysis of IGB has practical consequences, in that it guides architecture selection and initialization. We also highlight theoretical consequences, such as the breakdown of node-permutation symmetry, the violation of self-averaging, the validity of some mean-field approximations, and the non-trivial differences arising with depth.
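
A minimal sketch of how IGB can be observed empirically (this is not the paper's experimental setup): draw an untrained fully connected network, feed it random inputs, and count how often each class wins the argmax. The helper name init_predictions, the Gaussian inputs, the Kaiming-style initialization, and all hyperparameters below are illustrative assumptions; NumPy stands in for a deep learning framework.

```python
import numpy as np

rng = np.random.default_rng(0)


def init_predictions(depth, act, width=512, n_classes=10, n_inputs=2000):
    """Class predictions of an untrained MLP on random Gaussian inputs."""
    h = rng.standard_normal((n_inputs, width))
    for _ in range(depth):
        # Kaiming-style weight scale, zero biases, no explicit class bias.
        w = rng.standard_normal((width, width)) * np.sqrt(2.0 / width)
        h = act(h @ w)
    w_out = rng.standard_normal((width, n_classes)) * np.sqrt(2.0 / width)
    return (h @ w_out).argmax(axis=1)


relu = lambda z: np.maximum(z, 0.0)
for name, act in [("ReLU", relu), ("tanh", np.tanh)]:
    counts = np.bincount(init_predictions(depth=10, act=act), minlength=10)
    print(f"{name:4s} predictions per class: {counts}")
```

Under these assumptions, the deep ReLU network typically concentrates most of its initial predictions on a single class, while the antisymmetric tanh network stays much closer to balance, consistent with the abstract's claim that activation choice and depth drive IGB.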
