The Unreasonable Effectiveness of Random Pruning: Return of the Most Naive Baseline for Sparse Training

02/05/2022
by Shiwei Liu, et al.

Random pruning is arguably the most naive way to attain sparsity in neural networks, but it has long been deemed uncompetitive compared with both post-training pruning and sparse training methods. In this paper, we focus on sparse training and highlight a perhaps counter-intuitive finding: random pruning at initialization can be quite powerful for the sparse training of modern neural networks. Without any delicate pruning criteria or carefully pursued sparsity structures, we empirically demonstrate that sparsely training a randomly pruned network from scratch can match the performance of its dense equivalent. Two key factors contribute to this revival: (i) network size matters: as the original dense networks grow wider and deeper, the performance of a sparsely trained, randomly pruned network quickly rises to match that of its dense equivalent, even at high sparsity ratios; (ii) appropriate layer-wise sparsity ratios can be pre-chosen for sparse training, which proves to be another important performance booster. Simple as it looks, a randomly pruned subnetwork of Wide ResNet-50 can be sparsely trained to outperform a dense Wide ResNet-50 on ImageNet. We also observe that such randomly pruned networks outperform their dense counterparts in other favorable respects, such as out-of-distribution detection, uncertainty estimation, and adversarial robustness. Overall, our results strongly suggest that there is larger-than-expected room for sparse training at scale, and that the benefits of sparsity may extend well beyond carefully designed pruning. Our source code can be found at https://github.com/VITA-Group/Random_Pruning.
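For readers who want to try the idea, the sketch below illustrates the basic recipe in PyTorch: choose a sparsity ratio per layer, draw a random binary mask at initialization, and keep that mask fixed while training the surviving weights. The function name, the uniform 80% ratios, and the toy model are illustrative assumptions, not the authors' exact configuration; their layer-wise ratio choices and full training setup are in the linked repository.

```python
# Minimal sketch of random pruning at initialization with pre-chosen
# layer-wise sparsity ratios (illustrative; not the authors' exact recipe).
import torch
import torch.nn as nn


def random_prune_at_init(model: nn.Module, layer_sparsity: dict) -> dict:
    """Randomly zero out weights of each listed layer and return binary masks.

    layer_sparsity maps a module name to the fraction of weights to remove,
    e.g. {"conv1": 0.5, "fc": 0.8}. The masks should be re-applied after
    every optimizer step so the subnetwork stays fixed during sparse training.
    """
    masks = {}
    for name, module in model.named_modules():
        if name not in layer_sparsity:
            continue
        weight = module.weight.data
        sparsity = layer_sparsity[name]
        # Keep a random subset of weights: 1 = keep, 0 = pruned.
        mask = (torch.rand_like(weight) > sparsity).float()
        weight.mul_(mask)
        masks[name] = mask
    return masks


# Usage: prune a toy model uniformly at 80% sparsity, then train as usual,
# multiplying each weight (or its gradient) by its mask after every update.
model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
masks = random_prune_at_init(model, {"0": 0.8, "2": 0.8})
```

In a real sparse-training loop the returned masks would be re-applied (or the gradients masked) after each optimizer step, so that the randomly chosen subnetwork remains fixed throughout training.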

Related research

05/18/2023
Learning Activation Functions for Sparse Neural Networks
Sparse Neural Networks (SNNs) can potentially demonstrate similar perfor...

03/03/2023
Sparsity May Cry: Let Us Fail (Current) Sparse Neural Networks Together!
Sparse Neural Networks (SNNs) have received voluminous attention predomi...

06/21/2023
Fantastic Weights and How to Find Them: Where to Prune in Dynamic Sparse Training
Dynamic Sparse Training (DST) is a rapidly evolving area of research tha...

08/23/2022
Lottery Pools: Winning More by Interpolating Tickets without Increasing Training or Inference Cost
Lottery tickets (LTs) are able to discover accurate and sparse subnetwork...

04/04/2022
APP: Anytime Progressive Pruning
With the latest advances in deep learning, there has been a lot of focus...

01/09/2020
Campfire: Compressible, Regularization-Free, Structured Sparse Training for Hardware Accelerators
This paper studies structured sparse training of CNNs with a gradual pru...

03/08/2022
Dual Lottery Ticket Hypothesis
Fully exploiting the learning capacity of neural networks requires overp...
