Rigging the Lottery: Making All Tickets Winners

11/25/2019
by Utku Evci, et al.

Sparse neural networks have been shown to be more parameter- and compute-efficient than dense networks, and in some cases they are used to decrease wall-clock inference times. There is a large body of work on training dense networks to yield sparse networks for inference, but this limits the size of the largest trainable sparse model to that of the largest trainable dense model. In this paper we introduce a method to train sparse neural networks with a fixed parameter count and a fixed computational cost throughout training, without sacrificing accuracy relative to existing dense-to-sparse training methods. Our method updates the topology of the network during training by using parameter magnitudes and infrequent gradient calculations. We show that this approach requires fewer floating-point operations (FLOPs) to achieve a given level of accuracy than prior techniques. Importantly, because it adjusts the topology, it can start from any initialization, not just "lucky" ones. We demonstrate state-of-the-art sparse training results with ResNet-50, MobileNet v1 and MobileNet v2 on the ImageNet-2012 dataset, WideResNets on the CIFAR-10 dataset, and RNNs on the WikiText-103 dataset. Finally, we provide some insights into why allowing the topology to change during optimization can overcome local minima encountered when the topology remains static.
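To make the drop/grow rule described above concrete, here is a minimal sketch of one topology update in the spirit of the abstract, assuming NumPy arrays; the function name rigl_mask_update, the drop_fraction parameter, and the flat masking scheme are illustrative assumptions, not the authors' implementation. It removes the lowest-magnitude active connections and regrows an equal number of inactive connections with the largest gradient magnitude, so the nonzero parameter count stays fixed.

```python
import numpy as np

def rigl_mask_update(weights, grad, mask, drop_fraction=0.3):
    """One sketch of a magnitude-drop / gradient-grow connectivity update.

    weights, grad, mask are arrays of the same shape; mask holds 0/1 entries
    marking which connections are currently active. Returns a new mask with
    the same number of active connections.
    """
    active = np.flatnonzero(mask)          # indices of currently active weights
    inactive = np.flatnonzero(mask == 0)   # indices of currently inactive weights
    n_update = min(int(drop_fraction * active.size), inactive.size)
    if n_update == 0:
        return mask

    # Drop: active connections with the smallest weight magnitude.
    w_active = np.abs(weights.ravel()[active])
    drop_idx = active[np.argsort(w_active)[:n_update]]

    # Grow: inactive connections with the largest gradient magnitude
    # (the "infrequent gradient calculation" mentioned in the abstract).
    g_inactive = np.abs(grad.ravel()[inactive])
    grow_idx = inactive[np.argsort(g_inactive)[-n_update:]]

    new_mask = mask.ravel().copy()
    new_mask[drop_idx] = 0
    new_mask[grow_idx] = 1
    return new_mask.reshape(mask.shape)
```

In the paper, such mask updates are applied only every few hundred training steps, with ordinary gradient updates on the currently active weights in between, and newly grown connections are initialized to zero so the network's output is unchanged at the moment of the update.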


Related research

Selfish Sparse RNN Training (01/22/2021)
Sparse neural networks have been widely applied to reduce the necessary ...

[Reproducibility Report] Rigging the Lottery: Making All Tickets Winners (03/29/2021)
RigL, a sparse training algorithm, claims to directly train sparse netwo...

Truly Sparse Neural Networks at Scale (02/02/2021)
Recently, sparse training methods have started to be established as a de...

Superposing Many Tickets into One: A Performance Booster for Sparse Neural Network Training (05/30/2022)
Recent works on sparse neural network training (sparse training) have sh...

On the training of sparse and dense deep neural networks: less parameters, same performance (06/17/2021)
Deep neural networks can be trained in reciprocal space, by acting on th...

Evolving and Understanding Sparse Deep Neural Networks using Cosine Similarity (03/17/2019)
Training sparse neural networks with adaptive connectivity is an active ...

Dual Lottery Ticket Hypothesis (03/08/2022)
Fully exploiting the learning capacity of neural networks requires overp...
