A Signal Propagation Perspective for Pruning Neural Networks at Initialization

06/14/2019
by Namhoon Lee, et al.

Network pruning is a promising avenue for compressing deep neural networks. A typical approach to pruning starts by training a model and then removing unnecessary parameters while minimizing the impact on what is learned. Alternatively, a recent approach shows that pruning can be done at initialization, prior to training. However, it remains unclear exactly why pruning an untrained, randomly initialized neural network is effective. In this work, we consider the pruning problem from a signal propagation perspective, formally characterizing initialization conditions that ensure faithful signal propagation throughout a network. Based on the singular values of a network's input-output Jacobian, we find that orthogonal initialization enables more faithful signal propagation than other initialization schemes, thereby enhancing pruning results on a range of modern architectures and datasets. We also empirically study the effect of supervision for pruning at initialization and show that unsupervised pruning can often be as effective as supervised pruning. Furthermore, we demonstrate that our signal propagation perspective, combined with unsupervised pruning, can be useful in various scenarios where pruning is applied to non-standard, arbitrarily designed architectures.
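The signal propagation argument can be illustrated with a small toy experiment: for a deep linear network, the input-output Jacobian is just the product of the layer weight matrices, so its singular value spectrum directly shows how well signals are preserved through depth. The NumPy sketch below (not the authors' code; the width, depth, and Gaussian variance scaling are illustrative assumptions) compares that spectrum under orthogonal and variance-scaled Gaussian initialization: orthogonal weights keep every singular value at exactly 1, while the Gaussian spectrum spreads out as depth grows.

```python
# Minimal sketch: singular values of a deep linear network's input-output
# Jacobian under two initialization schemes. Width/depth are illustrative.
import numpy as np

def jacobian_singular_values(depth=20, width=100, init="orthogonal", seed=0):
    rng = np.random.default_rng(seed)
    jacobian = np.eye(width)
    for _ in range(depth):
        if init == "orthogonal":
            # QR decomposition of a Gaussian matrix yields an orthogonal factor.
            w, _ = np.linalg.qr(rng.standard_normal((width, width)))
        else:
            # Variance-scaled Gaussian init (std = 1/sqrt(width)), which only
            # preserves signal norm on average, not per direction.
            w = rng.standard_normal((width, width)) * np.sqrt(1.0 / width)
        # For a linear network, the input-output Jacobian is the product of weights.
        jacobian = w @ jacobian
    return np.linalg.svd(jacobian, compute_uv=False)

for scheme in ("orthogonal", "gaussian"):
    sv = jacobian_singular_values(init=scheme)
    print(f"{scheme:>10}: min={sv.min():.3f}  max={sv.max():.3f}  mean={sv.mean():.3f}")
```

A spectrum concentrated at 1 is what the abstract refers to as faithful signal propagation; the nonlinear case analyzed in the paper requires more care than this linear sketch shows.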


Related research

05/27/2023 · Pruning at Initialization – A Sketching Perspective
The lottery ticket hypothesis (LTH) has increased attention to pruning n...

03/27/2022 · On the Neural Tangent Kernel Analysis of Randomly Pruned Wide Neural Networks
We study the behavior of ultra-wide neural networks when their weights a...

01/26/2021 · A Unified Paths Perspective for Pruning at Initialization
A number of recent approaches have been proposed for pruning neural netw...

10/21/2021 · Towards strong pruning for lottery tickets with non-zero biases
The strong lottery ticket hypothesis holds the promise that pruning rand...

02/18/2022 · Amenable Sparse Network Investigator
As the optimization problem of pruning a neural network is nonconvex and...

05/12/2021 · Dynamical Isometry: The Missing Ingredient for Neural Network Pruning
Several recent works [40, 24] observed an interesting phenomenon in neur...

10/22/2020 · PHEW: Paths with higher edge-weights give "winning tickets" without training data
Sparse neural networks have generated substantial interest recently beca...
