
The Lottery Ticket Hypothesis: Finding Small, Trainable Neural Networks

03/09/2018
by   Jonathan Frankle, et al.

Neural network compression techniques are able to reduce the parameter counts of trained networks by over 90%--decreasing storage requirements and improving inference performance--without compromising accuracy. However, contemporary experience is that it is difficult to train small architectures from scratch, which would similarly improve training performance. We articulate a new conjecture to explain why it is easier to train large networks: the "lottery ticket hypothesis." It states that large networks that train successfully contain subnetworks that--when trained in isolation--converge in a comparable number of iterations to comparable accuracy. These subnetworks, which we term "winning tickets," have won the initialization lottery: their connections have initial weights that make training particularly effective. We find that a standard technique for pruning unnecessary network weights naturally uncovers a subnetwork which, at the start of training, comprised a winning ticket. We present an algorithm to identify winning tickets and a series of experiments that support the lottery ticket hypothesis. We consistently find winning tickets that are less than 20% of the size of several fully-connected, convolutional, and residual architectures for MNIST and CIFAR10. Furthermore, winning tickets at moderate levels of pruning (20-50% of the original network size) converge up to 6.7x faster than the original network and exhibit higher test accuracy.
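The train-prune-rewind procedure the abstract describes lends itself to a compact sketch. Below is a minimal, one-shot PyTorch version (the paper also prunes iteratively over several rounds with a smaller per-round fraction); the model, `train_fn`, and `prune_fraction` here are illustrative placeholders, not the authors' exact setup.

```python
# A hedged sketch of one-shot magnitude pruning to find a candidate
# "winning ticket": train, prune the smallest-magnitude weights, and
# rewind the survivors to their original initialization.
import copy
import torch
import torch.nn as nn

def find_winning_ticket(model, train_fn, prune_fraction=0.8):
    # Save the random initialization theta_0 before any training.
    init_state = copy.deepcopy(model.state_dict())

    # Train the full network; train_fn updates the model in place.
    train_fn(model)

    # Build per-tensor masks that keep the largest-magnitude weights.
    masks = {}
    for name, param in model.named_parameters():
        if param.dim() < 2:  # skip biases and other 1-D parameters
            continue
        k = max(1, int(prune_fraction * param.numel()))
        threshold = param.abs().flatten().kthvalue(k).values
        masks[name] = (param.abs() > threshold).float()

    # Rewind survivors to theta_0 and zero out pruned weights: the
    # masked, reinitialized subnetwork is the candidate winning ticket.
    model.load_state_dict(init_state)
    with torch.no_grad():
        for name, param in model.named_parameters():
            if name in masks:
                param.mul_(masks[name])
    return model, masks

# Illustrative usage with a tiny MLP and a dummy training loop.
net = nn.Sequential(nn.Linear(784, 300), nn.ReLU(), nn.Linear(300, 10))

def dummy_train(model):
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    for _ in range(100):
        x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
        loss = nn.functional.cross_entropy(model(x), y)
        opt.zero_grad()
        loss.backward()
        opt.step()

ticket, masks = find_winning_ticket(net, dummy_train, prune_fraction=0.8)
```

Note that when retraining the ticket, the masks must be re-applied after each optimizer step (or the gradients masked) to keep pruned weights at zero; that bookkeeping is omitted here for brevity.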

Related Research

03/09/2018
The Lottery Ticket Hypothesis: Training Pruned Neural Networks
Recent work on neural network pruning indicates that, at training time, ...

05/10/2022
Robust Learning of Parsimonious Deep Neural Networks
We propose a simultaneous learning and pruning algorithm capable of iden...

06/23/2020
Principal Component Networks: Parameter Reduction Early in Training
Recent works show that overparameterized networks contain small subnetwo...

05/03/2019
Deconstructing Lottery Tickets: Zeros, Signs, and the Supermask
The recent "Lottery Ticket Hypothesis" paper by Frankle & Carbin showed ...

01/31/2022
Signing the Supermask: Keep, Hide, Invert
The exponential growth in numbers of parameters of neural networks over ...

12/13/2021
On the Compression of Natural Language Models
Deep neural networks are effective feature extractors but they are prohi...

02/24/2022
Rare Gems: Finding Lottery Tickets at Initialization
It has been widely observed that large neural networks can be pruned to ...

Code Repositories

lottery-ticket-hypothesis

Lottery Ticket Hypothesis in Chainer

