
Neural networks are a priori biased towards Boolean functions with low entropy
Understanding the inductive bias of neural networks is critical to explaining their ability to generalise. Here, for one of the simplest neural networks, a single-layer perceptron with n input neurons, one output neuron, and no threshold bias term, we prove that upon random initialisation of weights, the a priori probability P(t) that it represents a Boolean function classifying t points in {0,1}^n as 1 has a remarkably simple form: P(t) = 2^{-n} for 0 ≤ t < 2^n. Since a perceptron can express far fewer Boolean functions with small or large values of t (low "entropy") than with intermediate values of t (high "entropy"), there is, on average, a strong intrinsic a priori bias towards individual functions with low entropy. Furthermore, within a class of functions with fixed t, we often observe a further intrinsic bias towards functions of lower complexity. Finally, we prove that, regardless of the distribution of inputs, the bias towards low entropy becomes monotonically stronger upon adding ReLU layers, and we show empirically that increasing the variance of the bias term has a similar effect.
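The uniform form of P(t) is easy to check by simulation. The sketch below is not from the paper; it is a minimal Monte Carlo illustration that assumes i.i.d. Gaussian weights and the convention that an input is classified as 1 only when its activation is strictly positive (so the all-zeros input, whose activation is exactly 0, is always classified as 0, and t ranges over 0, …, 2^n − 1).

```python
import numpy as np
from itertools import product

def sample_t(n, num_samples, rng):
    """Sample t = #{x in {0,1}^n : w.x > 0} for random Gaussian weights w."""
    # All 2^n Boolean inputs, one per row.
    X = np.array(list(product([0, 1], repeat=n)))   # shape (2^n, n)
    W = rng.standard_normal((num_samples, n))       # one weight vector per row
    # Count inputs with strictly positive activation for each weight sample.
    return (W @ X.T > 0).sum(axis=1)

rng = np.random.default_rng(0)
n = 3
t = sample_t(n, 100_000, rng)
freqs = np.bincount(t, minlength=2**n + 1) / len(t)
print(freqs)  # each of t = 0 .. 2^n - 1 should be close to 2^{-n} = 0.125
```

For n = 3 the empirical frequencies of t = 0, …, 7 all come out near 0.125, matching P(t) = 2^{-n}, while t = 8 never occurs under the strict-inequality convention above.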