On the Neural Tangent Kernel Analysis of Randomly Pruned Wide Neural Networks

03/27/2022
by   Hongru Yang, et al.

We study the behavior of ultra-wide neural networks whose weights are randomly pruned at initialization, through the lens of neural tangent kernels (NTKs). We show that for a fully-connected network pruned randomly at initialization, as the width of each layer grows to infinity, the empirical NTK of the pruned network converges to that of the original (unpruned) network, up to an extra scaling factor. Further, if an appropriate rescaling is applied after pruning at initialization, the empirical NTK of the pruned network converges to the exact NTK of the original network, and we provide a non-asymptotic bound on the approximation error in terms of the pruning probability. Moreover, when we specialize our result to an unpruned network (i.e., set the probability of pruning each weight to zero), our analysis is optimal up to a logarithmic factor in width compared with the result in <cit.>. We conduct experiments to validate our theoretical results. We further test our theory by evaluating random pruning across different architectures on MNIST and CIFAR-10 image classification and comparing its performance with other pruning strategies.
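To make the setup concrete, the sketch below illustrates the objects the abstract refers to: a fully-connected network in NTK parameterization, a random pruning mask applied to the weights at initialization, and the empirical NTK Theta(x, x') = <grad_theta f(x), grad_theta f(x')>. This is not the authors' code; the rescaling of surviving weights by 1/sqrt(1 - p) and all function names are assumptions chosen for illustration.

# Minimal sketch (assumptions noted above), not the paper's implementation.
import jax
import jax.numpy as jnp

def init_params(key, widths):
    """Fully-connected layers in NTK parameterization: W ~ N(0, 1), scaled by 1/sqrt(fan_in) in the forward pass."""
    params = []
    for d_in, d_out in zip(widths[:-1], widths[1:]):
        key, sub = jax.random.split(key)
        params.append(jax.random.normal(sub, (d_in, d_out)))
    return params

def prune_at_init(key, params, p):
    """Zero each weight independently with probability p; rescale survivors by 1/sqrt(1 - p) (assumed scaling)."""
    pruned = []
    for W in params:
        key, sub = jax.random.split(key)
        mask = jax.random.bernoulli(sub, 1.0 - p, W.shape)
        pruned.append(W * mask / jnp.sqrt(1.0 - p))
    return pruned

def forward(params, x):
    h = x
    for W in params[:-1]:
        h = jax.nn.relu(h @ W / jnp.sqrt(W.shape[0]))
    W = params[-1]
    return (h @ W / jnp.sqrt(W.shape[0])).squeeze(-1)

def empirical_ntk(params, x1, x2):
    """Theta[i, j] = <grad_theta f(x1_i), grad_theta f(x2_j)>, summed over all weight matrices."""
    j1 = jax.jacobian(forward)(params, x1)  # list of arrays, each of shape (n1, *W.shape)
    j2 = jax.jacobian(forward)(params, x2)
    return sum(jnp.einsum('i...,j...->ij', a, b) for a, b in zip(j1, j2))

key_x, key_w, key_m = jax.random.split(jax.random.PRNGKey(0), 3)
x = jax.random.normal(key_x, (4, 16))
params = init_params(key_w, [16, 1024, 1024, 1])
ntk_dense = empirical_ntk(params, x, x)
ntk_pruned = empirical_ntk(prune_at_init(key_m, params, p=0.5), x, x)
print(jnp.max(jnp.abs(ntk_dense - ntk_pruned)))  # gap should shrink as the widths grow

Setting p = 0.0 makes the mask all ones and the rescaling factor one, so the pruned and dense empirical NTKs coincide, matching the abstract's remark about specializing the result to an unpruned network.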


Related research

06/14/2019 · A Signal Propagation Perspective for Pruning Neural Networks at Initialization
Network pruning is a promising avenue for compressing deep neural networ...

06/25/2022 · A Fast, Well-Founded Approximation to the Empirical Neural Tangent Kernel
Empirical neural tangent kernels (eNTKs) can provide a good understandin...

01/01/2023 · Theoretical Characterization of How Neural Network Pruning Affects its Generalization
It has been observed in practice that applying pruning-at-initialization...

06/14/2020 · Optimal Lottery Tickets via SubsetSum: Logarithmic Over-Parameterization is Sufficient
The strong lottery ticket hypothesis (LTH) postulates that one can appro...

06/30/2022 · A note on Linear Bottleneck networks and their Transition to Multilinearity
Randomly initialized wide neural networks transition to linear functions...

11/02/2021 · Subquadratic Overparameterization for Shallow Neural Networks
Overparameterization refers to the important phenomenon where the width ...

06/21/2022 · Renormalized Sparse Neural Network Pruning
Large neural networks are heavily over-parameterized. This is done becau...
