Connectivity Matters: Neural Network Pruning Through the Lens of Effective Sparsity

by Artem Vysogorets, et al.

Neural network pruning is a fruitful area of research with surging interest in high sparsity regimes. Benchmarking in this domain relies heavily on a faithful representation of the sparsity of subnetworks, which has traditionally been computed as the fraction of removed connections (direct sparsity). This definition, however, fails to recognize unpruned parameters that have become detached from the input or output layers of the underlying subnetworks, potentially underestimating the actual effective sparsity: the fraction of inactivated connections. While this effect might be negligible for moderately pruned networks (compression rates up to 10-100), we find that it plays an increasing role for thinner subnetworks, greatly distorting comparisons between different pruning algorithms. For example, we show that the effective compression of a randomly pruned LeNet-300-100 can be orders of magnitude larger than its direct counterpart, while no discrepancy is ever observed when using SynFlow for pruning [Tanaka et al., 2020]. In this work, we adopt the lens of effective sparsity to reevaluate several recent pruning algorithms on common benchmark architectures (e.g., LeNet-300-100, VGG-19, ResNet-18) and discover that their absolute and relative performance changes dramatically in this new and more appropriate framework. To aim for effective, rather than direct, sparsity, we develop a low-cost extension to most pruning algorithms. Further, equipped with effective sparsity as a reference frame, we partially reconfirm that random pruning with appropriate sparsity allocation across layers performs as well as or better than more sophisticated algorithms for pruning at initialization [Su et al., 2020]. In response to this observation, using a simple analogy to pressure distribution in coupled cylinders from physics, we design novel layerwise sparsity quotas that outperform all existing baselines in the context of random pruning.
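The distinction between direct and effective sparsity can be illustrated with a small sketch for a fully connected network. This is not the authors' implementation: the function names (`direct_sparsity`, `effective_masks`, `effective_sparsity`) are hypothetical, biases are ignored, and each layer is assumed to be described by a binary mask of shape (outputs, inputs). A connection is counted as effective only if it is unpruned, its source neuron is reachable from the input layer, and its target neuron reaches the output layer.

```python
import numpy as np

def direct_sparsity(masks):
    """Fraction of removed connections across all layers."""
    total = sum(m.size for m in masks)
    kept = sum(int(np.count_nonzero(m)) for m in masks)
    return 1.0 - kept / total

def effective_masks(masks):
    """Keep only unpruned connections that lie on some input-to-output path.

    Assumption (not from the source): plain MLP, no biases, masks[l] has
    shape (n_out, n_in) for the l-th layer.
    """
    masks = [np.asarray(m).astype(bool) for m in masks]

    # Forward pass: which neurons are reachable from the input layer.
    fwd = [np.ones(masks[0].shape[1], dtype=bool)]
    for m in masks:
        fwd.append((m & fwd[-1][None, :]).any(axis=1))

    # Backward pass: which neurons can reach the output layer.
    bwd = [np.ones(masks[-1].shape[0], dtype=bool)]
    for m in reversed(masks):
        bwd.append((m & bwd[-1][:, None]).any(axis=0))
    bwd.reverse()  # bwd[l] now refers to the neurons feeding layer l

    return [m & fwd[l][None, :] & bwd[l + 1][:, None]
            for l, m in enumerate(masks)]

def effective_sparsity(masks):
    """Fraction of inactivated connections (pruned or disconnected)."""
    return direct_sparsity(effective_masks(masks))

# Toy 2-2-1 network: both surviving connections are dead ends.
toy_masks = [
    np.array([[1, 0],
              [0, 0]]),   # hidden neuron 0 <- input 0; hidden neuron 1 has no inputs
    np.array([[0, 1]]),   # output <- hidden neuron 1 only
]
toy_direct = direct_sparsity(toy_masks)
toy_effective = effective_sparsity(toy_masks)
print(toy_direct, toy_effective)
```

In this toy case two of six connections survive pruning, yet neither lies on a complete input-to-output path, so every connection is inactivated while direct sparsity still reports one third of the weights as present. The same functions could be applied to random masks for a 784-300-100-10 network to compare the two measures at high compression.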


