Sparsity-depth Tradeoff in Infinitely Wide Deep Neural Networks

05/17/2023
by Chanwoo Chun, et al.

We investigate how sparse neural activity affects the generalization performance of a deep Bayesian neural network in the large-width limit. To this end, we derive a neural network Gaussian process (NNGP) kernel with rectified linear unit (ReLU) activation and a predetermined fraction of active neurons. Using the NNGP kernel, we observe that sparser networks outperform non-sparse networks at shallow depths on a variety of datasets. We validate this observation by extending the existing theory on the generalization error of kernel ridge regression.
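
For orientation, the following is a minimal sketch of the standard dense ReLU NNGP kernel recursion (the arc-cosine form). It is not the sparse kernel derived in the paper, which additionally fixes the fraction of active neurons. The function name `relu_nngp_kernel` and the parameters `sigma_w2` and `sigma_b2` (weight and bias variances) are assumptions made for illustration, and the inputs are assumed to be nonzero vectors.

```python
import numpy as np


def relu_nngp_kernel(X, depth, sigma_w2=2.0, sigma_b2=0.0):
    """Dense ReLU NNGP kernel after `depth` hidden layers (illustrative sketch).

    X        : array of shape (n_samples, n_features), rows assumed nonzero
    depth    : number of ReLU layers to propagate through
    sigma_w2 : weight variance (hypothetical default; 2.0 preserves the diagonal)
    sigma_b2 : bias variance (hypothetical default)
    """
    n_features = X.shape[1]
    # Kernel of the first pre-activations: scaled Gram matrix of the inputs.
    K = sigma_w2 * (X @ X.T) / n_features + sigma_b2
    for _ in range(depth):
        std = np.sqrt(np.diag(K))
        norm = np.outer(std, std)
        # Clip to guard against round-off pushing correlations outside [-1, 1].
        cos_theta = np.clip(K / norm, -1.0, 1.0)
        theta = np.arccos(cos_theta)
        # Closed-form Gaussian expectation E[relu(u) relu(v)] for ReLU.
        expectation = norm * (np.sin(theta) + (np.pi - theta) * cos_theta) / (2.0 * np.pi)
        K = sigma_w2 * expectation + sigma_b2
    return K
```

With `sigma_w2=2.0` and `sigma_b2=0.0`, the diagonal of the kernel is unchanged from layer to layer, so varying `depth` changes only the correlations between distinct inputs. A kernel of this kind can then be passed to kernel ridge regression to compare generalization across depths.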
