# Provably Efficient Lottery Ticket Discovery

The lottery ticket hypothesis (LTH) claims that randomly-initialized, dense neural networks contain (sparse) subnetworks that, when trained an equal amount in isolation, can match the dense network's performance. Although LTH is useful for discovering efficient network architectures, its three-step process – pre-training, pruning, and re-training – is computationally expensive, as the dense model must be fully pre-trained. Luckily, "early-bird" tickets can be discovered within neural networks that are minimally pre-trained, allowing for the creation of efficient, LTH-inspired training procedures. Yet, no theoretical foundation of this phenomenon exists. We derive an analytical bound for the number of pre-training iterations that must be performed for a winning ticket to be discovered, thus providing a theoretical understanding of when and why such early-bird tickets exist. By adopting a greedy forward selection pruning strategy, we directly connect the pruned network's performance to the loss of the dense network from which it was derived, revealing a threshold in the number of pre-training iterations beyond which high-performing subnetworks are guaranteed to exist. We demonstrate the validity of our theoretical results across a variety of architectures and datasets, including multi-layer perceptrons (MLPs) trained on MNIST and several deep convolutional neural network (CNN) architectures trained on CIFAR10 and ImageNet.


## 1 Contributions

The contributions of this work are as follows:

• Propose a new greedy forward selection strategy for pruning wide, two-layer networks, which allows for the loss of a winning ticket network to be expressed with respect to the loss of the pre-trained network.

• First work to incorporate the amount of training time of a pre-trained network into the theoretical analysis of the winning tickets derived from this network.

• First work to generalize the analysis on winning ticket networks to pre-trained networks that are trained with stochastic gradient descent (SGD).

Extensions (i.e., contributions in progress):

• Analyze the relationship between the performance of the winning ticket and the width of the two-layer network.

• Demonstrate that state-of-the-art pruning rates can be achieved (i.e., $O(1/k^2)$ or better, where $k$ is the number of nodes in the pruned network) without full pre-training of the global network.

• Extension to more complex pruning strategies, such as [log_pruning].

• Analyze the loss at initialization to see if it can be reduced in any way or become negligible.

## 2 Introduction

Background. The current trend in deep learning is towards larger models and datasets [xlmr, gpt3, 16by16]. Despite the widespread moral and practical questioning of this trend [costofnlp, transcarbonemit, sotaaimodels, energy_consider], the deep learning community continues to push the limits of experimental scale, finding that severely overparameterized models generalize surprisingly well [doubledescent]. In contrast, the proposal of the Lottery Ticket Hypothesis (LTH) [lth] has led to significant interest in using pruning techniques to discover small (sometimes sparse) models that perform well. LTH claims that, given an overparameterized pre-trained network, a smaller network (i.e., a “winning ticket”) can be discovered within the pre-trained model that, if trained in isolation from the same initial weights, performs comparably to the full model. LTH has been empirically explored in detail, which led to questioning of its applicability to larger models and datasets [rethink_pruning, state_of_sparsity]. However, LTH was subsequently shown to be applicable at scale, given proper hyperparameter tuning and the correct pruning methodology [to_prune_or_not, stable_lth, deconstructing_lth]. In general, such results highlight that good performance can be obtained at a lesser computational cost by pruning large, pre-trained models to form smaller networks, then fine-tuning the weights of these smaller networks (i.e., either from the previous initialization or some later point) [lth_bert, objdet_lth, finetune_rewind].

Despite the value provided by LTH, the methodology still requires an overparameterized, pre-trained model to be present [provable_overparam]. In practice, obtaining this model, especially if it must be trained from scratch, can be quite expensive. In an attempt to discover alternatives to pre-training, several works studied the possibility of pruning networks at initialization [pruning_at_init, gradflow_prune, whats_hidden]. The ability to discover winning tickets within randomly-weighted networks is commonly referred to as the “strong lottery ticket hypothesis” [whats_hidden]. Although randomly-initialized winning tickets cannot yet match the performance of those obtained from fully pre-trained models, several works have shown that high-performing tickets can be obtained from models with limited pre-training [early_bert, early_bird]. Furthermore, winning tickets generated on one dataset can even generalize well to other datasets, given a big enough dataset and sufficient fine-tuning [pretrn_lth, one_ticket_wins]. Therefore, there is hope that winning tickets can be discovered without incurring the full cost of pre-training, thus allowing for the creation of efficient networks without massive, up-front costs.

The extensive empirical analysis of LTH has inspired the development of theoretical foundations for winning tickets [pruning_is_all_you_need, log_pruning_is_all_you_need, subset_lth]. Several works have derived bounds for the performance and size of winning tickets discovered in randomly-initialized networks [pruning_is_all_you_need, log_pruning_is_all_you_need, subset_lth]. However, these works require that the original model be sufficiently overparameterized and typically produce sparse networks that do not provably outperform similarly-sized networks trained with stochastic gradient descent (SGD). In contrast, other theoretical works explore how the performance of winning tickets, derived from pre-trained networks, compares to the performance of networks trained from scratch with gradient descent [provable_subnetworks, log_pruning]. More specifically, greedy forward selection strategies are proposed for pruning pre-trained networks, allowing the loss of the pruned network to be theoretically analyzed as a function of its size. It should be noted that some findings from these works may still be applicable to randomly-initialized networks given the proper assumptions [provable_subnetworks].

A theoretical comparison of winning tickets to networks trained with gradient descent requires that convergence rates for the training dynamics of neural networks be developed. Such convergence rates were originally explored for wide, two-layer neural networks [mf_2layer] using mean-field analysis techniques. Similar theory was later expanded to deeper models, such as ResNets and transformers [resnet, transformer, mf_resnet, mf_transformer]. Generally, the analysis of neural network training dynamics has become a large topic of interest in recent years, leading to novel analysis techniques [ntk, finite_ntk], extensions to alternate optimization techniques [moderate_overparam, relu_alt_min], and even introductions of different architectural components [one_hid_relu, one_conv_layer, two_layer_relu]. However, many of these novel, theoretical developments have yet to be applied to the analysis of LTH.

This paper. Within this paper, a novel pruning methodology based on greedy forward selection is proposed. This pruning methodology is structured such that the loss of the pruned network can be explicitly expressed with respect to the loss of the network being pruned. As a result, the loss of the network being pruned can be unrolled with respect to the amount of training (i.e., with gradient descent or SGD) and used to analyze the amount of training needed to achieve good performance with the pruned network, demonstrating that the amount of training needed for the discovery of winning tickets scales logarithmically with the size of the dataset. Such analysis provides a theoretical foundation for the idea of early-bird lottery tickets [early_bird, early_bert] within the deep learning community and explains why LTH is more difficult to replicate at scale [state_of_sparsity, rethink_pruning]. Furthermore, we then show that the proposed pruning methodology can be used to achieve an $O(1/k^2)$ convergence rate, where $k$ is the number of nodes within the pruned network, without the need for any overparameterization assumptions.

## 3 Preliminaries

Notation. Vectors are represented with bold type (e.g., $\mathbf{x}$), while scalars are represented by normal type (e.g., $x$). $\|\cdot\|$ represents the vector norm, while $\|\cdot\|_{\mathrm{Lip}}$ and $\|\cdot\|_\infty$ represent Lipschitz and infinity norms. Unless otherwise specified, vector norms are always considered to be $\ell_2$ norms. $[N]$ is used to represent the set of positive integers from $1$ to $N$ (i.e., $[N] = \{1, 2, \dots, N\}$). We denote the ball of radius $r$ centered at $c$ as $B(c, r)$.

In addition to basic notation, several relevant constructions are used within our theoretical analysis. For all problems, we consider two-layer neural networks of width $N$, defined by (1):

$$f(x, \Theta) = \frac{1}{N} \cdot \sum_{i=1}^{N} \sigma(x, \theta_i) \tag{1}$$

Here, $\Theta = \{\theta_i\}_{i=1}^{N}$ represents all weights within the two-layer neural network. In (1), $\sigma(x, \theta_i)$ represents the activation of a single neuron within the network (i.e., the $i$-th neuron of $N$ total neurons). It should be noted that two-layer neural networks have the special property of output activations being separable between neurons. Note to Tasos: remind the literature how two-layer NNs can be used for training of deeper models. To produce the full network output, the activation is computed for every neuron in the network; then, a uniform average is taken over the activations. The activation of a single neuron is expressed in (2):

$$\sigma(x, \theta_i) = b_i \, \sigma^+(a_i^\top x) \tag{2}$$

In (2), $\sigma^+$ represents a smooth activation function (e.g., sigmoid or tanh). Our analysis is agnostic of the choice of the activation function, so long as associated smoothness assumptions are satisfied (i.e., see Section 5). The weights of a single neuron are represented by $\theta_i = (a_i, b_i)$, where $a_i \in \mathbb{R}^d$ and $d$ is the dimension of the input vector (i.e., $x \in \mathbb{R}^d$). We assume that our network is modeling a dataset $\mathcal{D}$ of size $n$. The dataset has the form $\mathcal{D} = \{(x_j, y_j)\}_{j=1}^{n}$, where $x_j \in \mathbb{R}^d$ and $y_j \in \mathbb{R}$ for all $j \in [n]$. In all cases, we consider an $\ell_2$-norm regression loss over this dataset during training, defined by (3):

$$L[f] = \frac{1}{2} \cdot \mathbb{E}_{(x, y) \sim \mathcal{D}}\left[(f(x, \Theta) - y)^2\right] \tag{3}$$

We define $y = \frac{1}{\sqrt{2n}}\left[y_1, \dots, y_n\right]^\top$. In words, $y$ represents a vector of all labels within the dataset $\mathcal{D}$, where each label is divided by a constant factor (taken as $\sqrt{2n}$ so that the loss defined below matches (3)). Similarly, we define $\sigma(x_j, \theta_i)$ as the output of neuron $i$ for the $j$-th input vector in the dataset. Utilizing this definition, we then construct the vector $\Phi_i = \frac{1}{\sqrt{2n}}\left[\sigma(x_1, \theta_i), \dots, \sigma(x_n, \theta_i)\right]^\top$, which is a scaled vector of output activations for a single neuron across the entire dataset $\mathcal{D}$. We use $\mathcal{M}_N$ to denote the convex hull over such activation vectors for all neurons, as shown in (4):

$$\mathcal{M}_N = \mathrm{Conv}\{\Phi_i : i \in [N]\} \tag{4}$$

In (4), we use $\mathrm{Conv}\{\cdot\}$ to represent the convex hull. It should be noted that $\mathcal{M}_N$ forms a marginal polytope of the feature map for all neurons in the 2-layer network and all examples within the dataset $\mathcal{D}$. We use $\mathrm{Vert}(\mathcal{M}_N)$ to denote the vertices of the marginal polytope (i.e., $\mathrm{Vert}(\mathcal{M}_N) \subseteq \{\Phi_i : i \in [N]\}$). Furthermore, using this construction of the marginal polytope $\mathcal{M}_N$, the loss on this space can easily be defined by $\ell(u) = \|u - y\|_2^2$ for some $u \in \mathcal{M}_N$. Similarly, we define the diameter of the space as $D_{\mathcal{M}_N} = \max_{u, v \in \mathcal{M}_N} \|u - v\|_2$. Unlike [provable_subnetworks], we never make the assumption that $y \in \mathcal{M}_N$.
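To make these constructions concrete, the following sketch builds the scaled label vector $y$, the activation vectors $\Phi_i$, and the polytope loss $\ell$ for a small random network. All sizes, the tanh activation, and the $\sqrt{2n}$ scaling constant are illustrative assumptions, not values taken from our experiments.

```python
import numpy as np

rng = np.random.default_rng(0)
N, n, d = 50, 200, 10  # neurons, examples, input dim (arbitrary test sizes)

A = rng.normal(size=(N, d))   # a_i: inner weights, one row per neuron
b = rng.normal(size=N)        # b_i: output weights
X = rng.normal(size=(n, d))   # x_j: inputs, one row per example
y_raw = rng.normal(size=n)    # y_j: labels

def neuron_activation(i, x):
    # sigma(x, theta_i) = b_i * sigma_plus(a_i^T x), with tanh playing the
    # role of the smooth activation sigma_plus (Eq. 2).
    return b[i] * np.tanh(A[i] @ x)

def network_output(x):
    # f(x, Theta) = (1/N) * sum_i sigma(x, theta_i) (Eq. 1).
    return np.mean([neuron_activation(i, x) for i in range(N)])

# Scaled label vector y and activation vectors Phi_i over the dataset.
# We scale by 1/sqrt(2n) so that ell(u) = ||u - y||^2 reproduces the
# regression loss in Eq. 3 (the exact constant is our assumption).
scale = 1.0 / np.sqrt(2 * n)
y = scale * y_raw
Phi = scale * np.array([[neuron_activation(i, X[j]) for j in range(n)]
                        for i in range(N)])  # shape (N, n)

def ell(u):
    # Loss on the marginal polytope M_N: ell(u) = ||u - y||_2^2.
    return float(np.sum((u - y) ** 2))

# Loss of the global network: the uniform average over all Phi_i.
L_N = ell(Phi.mean(axis=0))
```

With this scaling, evaluating $\ell$ at the uniform average of all $\Phi_i$ recovers exactly the regression loss (3) of the full network over the dataset.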

## 4 Methodology

In all cases, we assume the existence of a global network of width $N$ from which the pruned model is constructed (i.e., notice no assumption is made on the amount of training for the global model). Given a subset of neurons $S \subseteq [N]$ within the global model, we define the forward pass with this subset of neurons as shown in (5):

$$f_S(x) = \frac{1}{|S|} \sum_{i \in S} \sigma(x, \theta_i) \tag{5}$$

Beginning from an empty network (i.e., $S_0 = \emptyset$), our methodology aims to discover a subset of neurons $S$ such that $L[f_S] \approx L[f]$. We find an approximate solution to this combinatorial optimization problem using greedy forward selection. In particular, a pruned network is constructed by greedily selecting the neuron that yields the largest decrease in loss in an iterative fashion. Such a strategy is formalized by (6), where $t$ represents the number of forward selection steps:

$$S_{t+1} = S_t \cup \{i^\star\}, \quad \text{where} \quad i^\star = \operatorname*{argmin}_{i \in [N]} L[f_{S_t \cup \{i\}}] \tag{6}$$

Tasos: pick $t$ or $k$ as the iteration counter. Note that $S_t$ is permitted to contain duplicate elements. From an analytical perspective, we formalize this forward selection strategy in (7) using the constructions introduced in Section 3:

$$u_k = \frac{1}{k} \cdot z_k, \qquad z_k = z_{k-1} + q_k, \qquad q_k = \operatorname*{argmin}_{q \in \mathrm{Vert}(\mathcal{M}_N)} \ell\left(\frac{1}{k} \cdot (z_{k-1} + q)\right) \tag{7}$$

Tasos: this description comes out of nowhere: it might be good to have this in proper pseudocode form + have a verbal description of the steps.

Given the forward selection strategy in (7), we obtain iterates $u_k$ within each iteration of forward selection, where $u_k \in \mathcal{M}_N$ for all $k$. It should be noted that the analytical formulation of the forward selection strategy provided in (7), which is used within all theoretical analysis, perfectly matches the practical pruning algorithm outlined in (6). Such alignment between the theoretical analysis and practical algorithm is lacking in previous work [provable_subnetworks]; a further discussion is provided in Appendix D. To better understand the motivation for this pruning methodology, notice that (7) is structured such that the output of both pruned and global networks is a uniform average over a subset of neuron activations. Such structure is suited to analyzing the loss of the pruned network with respect to the performance of the network being pruned. The exact implementation of our forward selection strategy used in experiments is further elaborated in Section XX.
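The greedy step in (6)/(7) can be sketched directly on the activation vectors; the function and variable names below are ours, and, as noted above, repeated selection of the same neuron is allowed.

```python
import numpy as np

def greedy_forward_selection(Phi, y, k_steps):
    """Greedy forward selection over activation vectors (Eqs. 6 and 7).

    Phi: (N, n) array; row i is the scaled activation vector Phi_i of
         neuron i (a vertex candidate of the marginal polytope).
    y:   (n,) scaled label vector.
    Returns the selected neuron indices (repeats allowed) and the loss
    ell(u_k) = ||u_k - y||^2 after each selection step.
    """
    z = np.zeros_like(y)  # running unnormalized sum z_k
    selected, losses = [], []
    for k in range(1, k_steps + 1):
        # Evaluate ell((z_{k-1} + q) / k) for every candidate vertex q.
        candidates = (z[None, :] + Phi) / k            # (N, n)
        obj = np.sum((candidates - y[None, :]) ** 2, axis=1)
        i_star = int(np.argmin(obj))                   # greedy choice
        z = z + Phi[i_star]
        selected.append(i_star)
        losses.append(float(obj[i_star]))
    return selected, losses
```

As a degenerate sanity check, if the label vector $y$ happens to coincide with one of the $\Phi_i$, the first greedy step selects that neuron and the loss is immediately zero, with the same neuron re-selected on every subsequent step.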

## 5 Theoretical Results

Within this section, we overview the main theoretical contributions of this work. All proofs are deferred to Appendix A. We begin by stating the main assumption considered within our analysis.

###### Assumption 1.

For some constant $c > 0$, we assume for all $(x, y) \in \mathcal{D}$ that $\|x\|_2 \leq c$ and $|y| \leq c$. Furthermore, we assume $\|\sigma^+\|_{\mathrm{Lip}} \leq c$ and $\|\sigma^+\|_\infty \leq c$ for $\sigma^+$ defined in (2).

Assumption 1 amounts to a boundedness and smoothness assumption on the data and activation function. Although this assumption may not hold for certain activation functions (e.g., ReLU), the performance of smooth activation functions is comparable to activations such as ReLU in practice, which makes this a mild assumption.

Tasos: can we support this argument with some citations? Under this assumption, we are able to derive an expression for the loss of the pruned network with respect to the loss of the global network.

###### Theorem 1.

If Assumption 1 holds, the following expression for the objective loss is true for iterates $u_k$ derived after $k$ iterations of the update rule in Eq. (7):

$$\ell(u_k) \leq \frac{1}{k}\left(\ell(u_1) - L_N + \frac{1}{2} D_{\mathcal{M}_N}^2\right) + L_N \approx O\left(\frac{1}{k}\right) + L_N. \tag{8}$$

Here, $L_N$ represents the loss achieved by the global network, a two-layer neural network of width $N$, and $D_{\mathcal{M}_N}$ is defined in Section 3.

Theorem 1 expresses the loss of the pruned network as a function of the global network’s loss. By examining the convergence rate provided via Theorem 1, one can observe that the decrease in the loss of the pruned network is only achieved if the value of $L_N$ does not dominate the expression. This observation raises the question: how can we ensure $L_N$ is small enough to not dominate the expression? To analyze the impact of $L_N$ on the loss of the pruned network, we draw upon previous work that provides convergence rates for 2-layer neural networks, trained with both gradient descent (GD) and stochastic gradient descent (SGD), under mild overparameterization assumptions [moderate_overparam]. Using these convergence rates, we can unroll the value of $L_N$ within Theorem 1 and determine how much the global network must be trained to achieve a good loss in the pruned network.

###### Theorem 2.

Assume that Assumption 1 holds and a two-layer neural network of width $m$ was trained for $t$ iterations with SGD over a dataset $\mathcal{D}$ of size $n$. After $t$ iterations, this two-layer network has parameters $\Theta_t$ that will be used to create a pruned network. Furthermore, assume that $n \gtrsim d$ and $md \gtrsim n^2$, where $d$ represents the input dimension of data in $\mathcal{D}$. When the network is pruned via (7) for $k$ iterations, it will achieve a loss given by (8), if the amount of training for the global network satisfies the following condition:

$$t \gtrsim O\left(\frac{-\log(kn)}{\log\left(1 - \frac{cd}{m^2}\right)}\right) \tag{9}$$

Otherwise, the loss of the pruned network will not improve during successive iterations of (7).

In the case that $n$ is large, (9) implies that $t \gtrsim O(\log n)$ (i.e., the amount of required training scales logarithmically with the size of the dataset). This reveals that when the dataset is large the global network must be trained more for the pruned network to achieve a good loss. Such a result aligns with empirical observations for the behavior of winning tickets [stable_lth]. Similarly, in the case that $n$ is small, the width $m$ needed to satisfy the overparameterization requirement shrinks, and the magnitude of the denominator of (9) grows. As a result, $t$ can be taken to be small, revealing that pruned networks can achieve high performance with minimal prior training on smaller datasets [lth]. Such results provide a theoretical explanation for the fact that winning tickets can be discovered with minimal pre-training [early_bird, early_bert], implying that conducting full pre-training of the global network may be a waste of resources for certain datasets. To provide further theoretical validation of this idea, we prove that similar results hold for networks trained with plain gradient descent; see Appendix A.2.2.
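The logarithmic scaling in $n$ can be illustrated numerically by evaluating the bound in (9) directly. All constants below ($k$, $c$, $d$, $m$, and the dataset sizes) are illustrative assumptions, not values fitted to any experiment, and the formula follows our reading of (9).

```python
import math

def required_pretraining_iters(k, n, c, d, m):
    # Lower bound on pre-training iterations t from Eq. (9):
    #   t >= -log(k * n) / log(1 - c * d / m**2).
    # Constants c, d, m are as in Theorem 2; values are illustrative.
    rate = 1.0 - c * d / m ** 2
    assert 0.0 < rate < 1.0, "contraction factor must lie in (0, 1)"
    return -math.log(k * n) / math.log(rate)

# Larger datasets require more pre-training, but only logarithmically so.
t_1k = required_pretraining_iters(k=200, n=1_000, c=1.0, d=10, m=500)
t_50k = required_pretraining_iters(k=200, n=50_000, c=1.0, d=10, m=500)
```

Holding $k$, $c$, $d$, and $m$ fixed, multiplying $n$ by a constant factor adds only a constant number of iterations to the bound, which is exactly the plateauing behavior discussed above.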

The overparameterization requirements in Theorem 5, which are adopted from [moderate_overparam], imply that a larger pre-trained model must be used to achieve good pruning performance on larger datasets, as has been shown in practical experiments [whats_hidden, provable_subnetworks]. Our overparameterization requirements make no assumption about the performance of the global network (e.g., we do not assume $y \in \mathcal{M}_N$), and assume a mild amount of overparameterization in comparison to previous work [li2018learning, allen2018learning, du2019gradient]. As a result, our results hold in more general cases and account for variable performance of the global network.

In addition to enabling the above analysis, the pruning algorithm in (7) achieves the same convergence rates as previous work, under milder assumptions [provable_subnetworks]. Namely, the following result can be shown to be true for a two-layer network pruned via (7).

###### Theorem 3.

We assume that Assumption 1 holds. Additionally, it is assumed that $B(y, \gamma) \subseteq \mathcal{M}_N$. Then, the following bound is achieved for two-layer neural networks of width $N$ after $k$ iterations of (7):

$$\ell(u_k) = O\left(\frac{1}{k^2 \min(1, \gamma)^2}\right)$$

where $\gamma$ is a positive constant.

Although Theorem 3 relies on the assumption that $B(y, \gamma) \subseteq \mathcal{M}_N$ for the faster rate to be achieved, this assumption is mild in practice given a small amount of prior training. To show this, we analyze the assumption for two-layer neural networks trained on image classification datasets and find that it is often satisfied in practice; see Appendix B.

## 6 Experiments

In this section, we empirically analyze our theoretical results from Section 5. We show that the amount of pre-training required to discover a good winning ticket is dependent upon the size of the underlying dataset. Although previous work has demonstrated that winning tickets can be discovered with minimal pre-training [early_bert, early_bird], our expression in Theorem 2 provides a theoretical foundation for this empirical observation. Furthermore, such theory also provides an explanation of why “rewinding” is necessary to achieve good winning ticket results for large-scale datasets [stable_lth] (i.e., larger datasets require more pre-training to achieve reasonable pruning loss). We leverage our experiments to gain an in-depth understanding of the scaling properties of LTH with an overall aim of better understanding how winning tickets can be discovered with minimal training costs for different datasets.

The empirical performance of the pruning rule in (6) has already been experimentally validated by previous work [provable_subnetworks]. To avoid needlessly replicating existing experimental analysis, we specifically focus on studying the scaling properties of this pruning algorithm with respect to different sizes and types of datasets. We perform both small-scale analysis with MLPs, as described in Section 3, and large-scale experiments with modern CNN architectures on ImageNet. We aim to match the performance demonstrated in [provable_subnetworks] with significantly reduced pre-training cost, demonstrating that lottery tickets can be discovered with provably better efficiency in numerous different domains. Such efficient strategies for discovering winning tickets are especially useful when fully pre-trained models for a desired target domain are not available through open-sourced packages online [tensorflow, pytorch] (i.e., this is often the case for industrial applications and other niche domains) and one must obtain a pre-trained model from scratch for pruning purposes.

### 6.1 Small-Scale Experiments

We conduct experiments on the MNIST dataset with a two-layer MLP model. We binarize MNIST labels to match the single-output-neuron setup described in Section 3 by considering all classes less than five as class zero and all remaining classes as class one. Our model architecture exactly matches the description in Section 3, aside from a few minor differences. Namely, we adopt a ReLU activation function (i.e., instead of a smooth activation) within the hidden layer and apply a sigmoid output transformation so that the model can be trained with a binary cross-entropy loss (i.e., instead of a quadratic loss). These changes are adopted solely for the purpose of improving training stability so that experimental results are more consistent across trials. Tasos: could be a reason for red flag by reviewers, if they want to be adversarial, but we can do nothing about it. We train the MLP with a stochastic gradient descent optimizer, momentum of 0.9, no weight decay, and a batch size of 128, which we found to perform well in multiple different experimental settings. Further hyperparameter choices (e.g., learning rate, size of pruned model, and number of training iterations) are explained in Appendix C.1.

To study how dataset size impacts the performance of a winning ticket, we construct “sub-datasets” of various sizes from the original MNIST dataset by randomly sampling an equal number of examples from each of the 10 original classes. More specifically, sub-datasets of sizes between 1K and 50K are constructed in increments of 5K (i.e., this yields datasets of sizes 1K, 5K, 10K, 15K, …, 50K). Experiments are conducted for MLPs with several different hidden dimensions. We pre-train the MLP for 8000 total iterations and use (6) to construct a new pruned network with 200 hidden nodes every 1000 iterations. Then, the performance of the pruned model over the entire training dataset is recorded, allowing the performance of the pruned model to be examined with respect to both dataset size and pre-training iterations. To exactly match the theoretical analysis in Section 5, no fine-tuning is performed on the pruned model prior to measuring its performance. We report the results of these experiments in Figure 1, where the accuracy is measured across three separate trials with different random seeds.
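The balanced sub-dataset construction can be sketched as follows. This is a minimal version with illustrative names; the real pipeline additionally loads MNIST images and pairs them with the binarized labels described above.

```python
import numpy as np

def binarize_labels(labels):
    # Classes 0-4 -> class 0, classes 5-9 -> class 1 (Section 6.1 setup).
    return (np.asarray(labels) >= 5).astype(np.int64)

def balanced_subdataset(labels, size, num_classes=10, seed=0):
    """Sample `size` indices with an equal count drawn from each of the
    original classes, mirroring the sub-dataset construction used in our
    MNIST experiments (sizes and seed here are illustrative)."""
    rng = np.random.default_rng(seed)
    per_class = size // num_classes
    chosen = [rng.choice(np.flatnonzero(labels == c), size=per_class,
                         replace=False) for c in range(num_classes)]
    return np.concatenate(chosen)
```

Sampling without replacement per class keeps every sub-dataset exactly class-balanced, so differences in pruning performance across sub-dataset sizes cannot be attributed to class imbalance.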

As can be seen in Figure 1, the performance of the pruned network matches the behavior predicted by the theoretical analysis in Section 5. Namely, as the size of the dataset increases, the amount of training needed to achieve comparable accuracy in the pruned model increases. Tasos: the three plots look pretty identical to me. Furthermore, the increase in the number of needed pre-training iterations plateaus as the size of the dataset becomes larger, hinting at a logarithmic relationship between dataset size and the required amount of pre-training. Interestingly, pruning results are strikingly uniform across different hidden dimensions of the model being pruned. Such an observation lends further support to the theoretical analysis in Section 5, which predicts that pruned model performance does not depend on the size of the pre-trained network so long as the original network is sufficiently overparameterized.

TODO

## Appendix A Proofs

### a.1 Convergence Analysis

Prior to presenting the proofs of the main theoretical results, we introduce several relevant technical lemmas.

###### Lemma 1.

Consider $u_{k-1}$ and $u_k$, representing adjacent iterates of (7) at step $k$. Additionally, consider an arbitrary update $q \in \mathrm{Vert}(\mathcal{M}_N)$, such that $z_k = z_{k-1} + q$ and $u_k = \frac{1}{k} \cdot z_k$. Then, we can derive the following expression for the difference between adjacent iterates of (7):

$$u_k - u_{k-1} = \frac{1}{k}\left(q - u_{k-1}\right)$$
###### Lemma 2.

Because the objective $\ell$, defined over the space $\mathcal{M}_N$, is both quadratic and convex, the following expressions hold for any $s, u_{k-1} \in \mathcal{M}_N$:

$$\ell(s) \geq \ell(u_{k-1}) + \langle \nabla \ell(u_{k-1}), s - u_{k-1} \rangle$$
$$\ell(s) \leq \ell(u_{k-1}) + \langle \nabla \ell(u_{k-1}), s - u_{k-1} \rangle + \|s - u_{k-1}\|_2^2$$
###### Observation 1.

From Lemma 2, we can derive the following inequality, where $s_k = \operatorname*{argmin}_{s \in \mathcal{M}_N} \langle \nabla \ell(u_{k-1}), s \rangle$:

$$L_N \geq L_N^\star = \min_{s \in \mathcal{M}_N} \ell(s) \geq \ell(u_{k-1}) + \langle \nabla \ell(u_{k-1}), s_k - u_{k-1} \rangle.$$
###### Lemma 3.

Assume there exists a sequence of values $\{z_k\}_{k \geq 0}$ such that $|z_0| \leq \max\left(\sqrt{C}, \frac{C}{2}, \frac{C}{2\beta}\right)$ and

$$|z_{k+1}|^2 \leq |z_k|^2 - 2\beta |z_k| + C, \quad \forall k \geq 0$$

where $C$ and $\beta$ are positive constants. Then, it must be the case that $|z_k| \leq \max\left(\sqrt{C}, \frac{C}{2}, \frac{C}{2\beta}\right)$ for all $k \geq 0$.

###### Proof.

We use an inductive argument to prove the above claim. For the base case of $k = 0$, the claim is trivially true because $|z_0| \leq \max\left(\sqrt{C}, \frac{C}{2}, \frac{C}{2\beta}\right)$ by assumption. For the inductive case, we define $f(z) = z^2 - 2\beta z + C$. It should be noted that $f$ is a 1-dimensional convex function. Therefore, given some closed interval within the domain of $f$, the maximum value of $f$ over this interval must be achieved on one of the end points. This fact simplifies to the following expression, where $a$ and $b$ are two values within the domain of $f$ such that $a \leq b$:

$$\max_{x \in [a, b]} f(x) = \max\{f(a), f(b)\}$$

From here, we begin the inductive step, for which we consider two cases.

Case 1: Assume that $|z_k| \leq \frac{C}{2\beta}$. Then, the following expression for $|z_{k+1}|^2$ can be derived:

$$|z_{k+1}|^2 \leq f(|z_k|) \leq \max_z\left\{f(z) : z \in \left[0, \frac{C}{2\beta}\right]\right\} = \max\left\{f(0), f\left(\frac{C}{2\beta}\right)\right\}$$

Case 2: Assume that $|z_k| > \frac{C}{2\beta}$. Then, the following expression can be derived:

$$|z_{k+1}|^2 \leq |z_k|^2 - 2\beta |z_k| + C \overset{(i)}{\leq} |z_k|^2 \leq \left[\max\left(\sqrt{C}, \frac{C}{2}, \frac{C}{2\beta}\right)\right]^2$$

where $(i)$ holds because $|z_k| > \frac{C}{2\beta}$ implies $-2\beta |z_k| + C < 0$.

In both cases, it is shown that $|z_{k+1}| \leq \max\left(\sqrt{C}, \frac{C}{2}, \frac{C}{2\beta}\right)$. Therefore, the lemma is shown to be true by induction. ∎
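As a quick numeric sanity check of Lemma 3 (our own test, not part of the proof), iterating the recursion with equality never escapes the stated bound; the constants below are arbitrary.

```python
import math

def lemma3_bound(C, beta):
    # The uniform bound max(sqrt(C), C/2, C/(2*beta)) from Lemma 3.
    return max(math.sqrt(C), C / 2.0, C / (2.0 * beta))

def recursion_sequence(C, beta, z0, steps):
    # Iterate |z_{k+1}|^2 = |z_k|^2 - 2*beta*|z_k| + C with equality,
    # clipping at zero so the square root stays real.
    z, out = z0, [z0]
    for _ in range(steps):
        z = math.sqrt(max(z * z - 2.0 * beta * z + C, 0.0))
        out.append(z)
    return out
```

Starting anywhere below the bound, the sequence approaches a fixed point of the recursion while remaining under `lemma3_bound` at every step, matching the inductive argument above.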

#### a.1.1 Proof of Theorem 1

We now present the proof of Theorem 1.

###### Proof.

We define $L_N^\star = \min_{s \in \mathcal{M}_N} \ell(s)$. Additionally, we define $s_k$ as follows:

$$s_k = \operatorname*{argmin}_{s \in \mathcal{M}_N} \langle \nabla \ell(u_{k-1}), s \rangle = \operatorname*{argmin}_{s \in \mathrm{Vert}(\mathcal{M}_N)} \langle \nabla \ell(u_{k-1}), s \rangle$$

The second equality holds because a linear objective is being optimized on a convex polytope $\mathcal{M}_N$. Thus, the solution to this optimization is known to be achieved on some vertex $s_k \in \mathrm{Vert}(\mathcal{M}_N)$. Recall, as stated in Section 3, that $D_{\mathcal{M}_N}$ denotes the diameter of the marginal polytope $\mathcal{M}_N$.

We assume the existence of some global two-layer neural network with $N$ hidden neurons from which the pruned network is derived. It should be noted that the neurons of this global network are used to define the vertices $\Phi_i$ as described in Section 3. As a result, the loss of this global network, which we denote as $L_N$, is the loss achieved by a uniform average over the vertices of $\mathcal{M}_N$ (i.e., see (1)). In other words, the loss of the global network at the time of pruning is given by the expression below:

$$L_N = \ell\left(\frac{1}{N} \sum_{v \in \mathrm{Vert}(\mathcal{M}_N)} v\right)$$

It is trivially known that $L_N \geq L_N^\star$. Intuitively, the value of $L_N$ has an implicit dependence on the amount of training undergone by the global network. However, we make no assumptions regarding the global network’s training (i.e., $L_N$ can be arbitrarily large for the purposes of this analysis). Using Observation 1, as well as Lemmas 1 and 2, we derive the following expression for the loss of iterates obtained with Eq. (7):

$$\begin{aligned}
\ell(u_k) &\overset{(i)}{=} \min_{q \in \mathrm{Vert}(\mathcal{M}_N)} \ell\left(\frac{1}{k}(z_{k-1} + q)\right) \overset{(ii)}{\leq} \ell\left(\frac{1}{k}(z_{k-1} + s_k)\right) \\
&\overset{(iii)}{\leq} \ell(u_{k-1}) + \left\langle \nabla \ell(u_{k-1}), \frac{1}{k}(z_{k-1} + s_k) - u_{k-1} \right\rangle + \left\|\frac{1}{k}(z_{k-1} + s_k) - u_{k-1}\right\|_2^2 \\
&\overset{(iv)}{\leq} \left(1 - \frac{1}{k}\right)\ell(u_{k-1}) + \frac{1}{k} \cdot L_N^\star + \frac{1}{k^2} \cdot D_{\mathcal{M}_N}^2 \overset{(v)}{\leq} \left(1 - \frac{1}{k}\right)\ell(u_{k-1}) + \frac{1}{k} \cdot L_N + \frac{1}{k^2} \cdot D_{\mathcal{M}_N}^2
\end{aligned}$$

where $(i)$ is due to Eq. (7), $(ii)$ is because $s_k \in \mathrm{Vert}(\mathcal{M}_N)$, $(iii)$ is from Lemma 2, $(iv)$ is from Lemma 1 (since $\frac{1}{k}(z_{k-1} + s_k) - u_{k-1} = \frac{1}{k}(s_k - u_{k-1})$) together with Observation 1 and the definition of $D_{\mathcal{M}_N}$, and $(v)$ is due to $L_N^\star \leq L_N$. This expression can then be rearranged to yield the following recursive expression:

$$\ell(u_k) \leq \left(1 - \frac{1}{k}\right)\ell(u_{k-1}) + \frac{1}{k} \cdot L_N + \frac{1}{k^2} \cdot D_{\mathcal{M}_N}^2 \;\Rightarrow\; \ell(u_k) - L_N - \frac{1}{k} \cdot D_{\mathcal{M}_N}^2 \leq \left(1 - \frac{1}{k}\right) \cdot \left(\ell(u_{k-1}) - L_N - \frac{1}{k} \cdot D_{\mathcal{M}_N}^2\right)$$

By unrolling the recursion in this expression over $k$ iterations, we get the following:

$$\ell(u_k) - L_N - \frac{1}{k} \cdot D_{\mathcal{M}_N}^2 \leq \prod_{i=2}^{k}\left(1 - \frac{1}{i}\right) \cdot \left(\ell(u_1) - L_N - \frac{1}{2} \cdot D_{\mathcal{M}_N}^2\right) = \frac{1}{k} \cdot \left(\ell(u_1) - L_N - \frac{1}{2} \cdot D_{\mathcal{M}_N}^2\right)$$

By rearranging terms, we arrive at the desired expression:

$$\ell(u_k) \leq \frac{1}{k} \cdot \left(\ell(u_1) - L_N + \frac{1}{2} \cdot D_{\mathcal{M}_N}^2\right) + L_N \leq O\left(\frac{1}{k}\right) + L_N \tag{10}$$

∎
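As an independent numeric sanity check of the bound in (10) (our own test, not part of the proof), one can run the update (7) on a polytope built from random vertices and verify that $\ell(u_k)$ never exceeds $\frac{1}{k}\left(\ell(u_1) - L_N + \frac{1}{2}D_{\mathcal{M}_N}^2\right) + L_N$. All sizes below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
N, n, steps = 40, 15, 100
Phi = rng.normal(size=(N, n))  # random vertex candidates Phi_i
y = rng.normal(size=n)

def ell(u):
    # Quadratic loss on the polytope: ell(u) = ||u - y||_2^2.
    return float(np.sum((u - y) ** 2))

L_N = ell(Phi.mean(axis=0))  # loss of the "global" uniform average
# Squared diameter of the polytope (attained at a pair of extreme points).
D2 = max(float(np.sum((p - q) ** 2)) for p in Phi for q in Phi)

# Run the greedy update (7).
z, losses = np.zeros(n), []
for k in range(1, steps + 1):
    obj = np.sum(((z[None, :] + Phi) / k - y[None, :]) ** 2, axis=1)
    i_star = int(np.argmin(obj))
    z = z + Phi[i_star]
    losses.append(float(obj[i_star]))

# Evaluate the bound (10) at each k and count violations.
bounds = [(losses[0] - L_N + D2 / 2.0) / k + L_N for k in range(1, steps + 1)]
violations = sum(l > b + 1e-9 for l, b in zip(losses, bounds))
```

Because the derivation above only uses convexity of $\ell$ and the vertex structure of the polytope, the bound holds for any such random instance, so `violations` should always be zero.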

#### a.1.2 Proof of Theorem 3

Now, we provide the proof for the faster convergence rate of (7), stated in Theorem 3.

###### Proof.

We define $w_k = k \cdot (y - u_k)$, so that $\ell(u_k) = \frac{1}{k^2}\|w_k\|_2^2$ and $\nabla \ell(u_k) \propto -w_k$. Furthermore, we define $s_{k+1}$ as follows:

$$s_{k+1} = \operatorname*{argmin}_{s \in \mathcal{M}_N} \nabla \ell(u_k)^\top (s - u_k) = \operatorname*{argmin}_{s \in \mathcal{M}_N} \nabla \ell(u_k)^\top s = \operatorname*{argmax}_{s \in \mathcal{M}_N} \langle w_k, s \rangle = \operatorname*{argmax}_{s \in \mathcal{M}_N} \langle w_k, s - y \rangle = \operatorname*{argmin}_{s \in \mathcal{M}_N} \langle w_k, y - s \rangle$$

Notice that $s_{k+1}$ minimizes a linear objective (i.e., the inner product with $-w_k$) over the domain of the marginal polytope $\mathcal{M}_N$. As a result, the optimum is achieved on a vertex of the marginal polytope, implying that $s_{k+1} \in \mathrm{Vert}(\mathcal{M}_N)$ for all $k$. We assume that $B(y, \gamma) \subseteq \mathcal{M}_N$. Under this assumption, it is known that $s^\star = y + \gamma \cdot \frac{w_k}{\|w_k\|} \in \mathcal{M}_N$, which allows the following to be derived:

$$\langle w_k, y - s_{k+1} \rangle = \min_{s \in \mathcal{M}_N} \langle w_k, y - s \rangle \leq \langle w_k, y - s^\star \rangle = -\gamma \|w_k\| \tag{11}$$

From (7), the following expressions for $u_k$ and $q_k$ can be derived:

$$u_k = \frac{1}{k} \cdot z_k = \frac{1}{k} \cdot \left[(k-1) u_{k-1} + q_k\right] \tag{12}$$

$$q_k = \operatorname*{argmin}_{q \in \mathrm{Vert}(\mathcal{M}_N)} \ell\left(\frac{1}{k}\left[z_{k-1} + q\right]\right) = \operatorname*{argmin}_{q \in \mathrm{Vert}(\mathcal{M}_N)} \left\|\frac{1}{k}\left[(k-1) u_{k-1} + q\right] - y\right\|_2^2 \tag{13}$$

Combining all of this together, the following expression can be derived for $\|w_k\|_2^2$, where $D_{\mathcal{M}_N}$ is the diameter of $\mathcal{M}_N$:

$$\begin{aligned}
\|w_k\|_2^2 &= \|k(y - u_k)\|_2^2 = \|k(u_k - y)\|_2^2 \\
&\overset{(i)}{=} \min_{q \in \mathrm{Vert}(\mathcal{M}_N)} \left\|k\left(\frac{1}{k}\left[(k-1)u_{k-1} + q\right] - y\right)\right\|_2^2 \\
&= \min_{q \in \mathrm{Vert}(\mathcal{M}_N)} \left\|(k-1)u_{k-1} + q - ky\right\|_2^2 \\
&= \min_{q \in \mathrm{Vert}(\mathcal{M}_N)} \left\|-(k-1)y + (k-1)u_{k-1} + q - y\right\|_2^2 \\
&= \min_{q \in \mathrm{Vert}(\mathcal{M}_N)} \left\|-w_{k-1} + q - y\right\|_2^2 = \min_{q \in \mathrm{Vert}(\mathcal{M}_N)} \left\|w_{k-1} + y - q\right\|_2^2 \\
&\overset{(ii)}{\leq} \|w_{k-1} + y - s_k\|_2^2 = \|w_{k-1}\|_2^2 + 2\langle w_{k-1}, y - s_k \rangle + \|y - s_k\|_2^2 \\
&\overset{(iii)}{\leq} \|w_{k-1}\|_2^2 - 2\gamma\|w_{k-1}\| + D_{\mathcal{M}_N}^2
\end{aligned}$$

where $(i)$ follows from (12) and (13), $(ii)$ follows from the fact that $s_k \in \mathrm{Vert}(\mathcal{M}_N)$, and $(iii)$ follows from (11) and the fact that $\|y - s_k\| \leq D_{\mathcal{M}_N}$ (note $y \in \mathcal{M}_N$ because $B(y, \gamma) \subseteq \mathcal{M}_N$). Therefore, from this analysis, the following recursive expression for the value of $\|w_k\|_2^2$ is derived:

$$\|w_k\|_2^2 \leq \|w_{k-1}\|_2^2 - 2\gamma\|w_{k-1}\| + D_{\mathcal{M}_N}^2 \tag{14}$$

Then, by invoking Lemma 3 on the recursion in (14) (with $C = D_{\mathcal{M}_N}^2$ and $\beta = \gamma$), we derive the following inequality:

$$\|w_k\|_2 \leq \max\left\{D_{\mathcal{M}_N}, \frac{D_{\mathcal{M}_N}^2}{2}, \frac{D_{\mathcal{M}_N}^2}{2\gamma}\right\} = O\left(\frac{1}{\min(1, \gamma)}\right)$$

With this in mind, the following expression can then be derived for the loss achieved by (7) after $k$ iterations:

$$\ell(u_k) = \frac{\|w_k\|_2^2}{k^2} = O\left(\frac{1}{k^2 \min(1, \gamma)^2}\right)$$

This yields the desired expression, thus completing the proof. ∎

### a.2 Training Analysis

Prior to analyzing the amount of training needed for a good pruning loss, several supplemental theorems and lemmas must be introduced. From [moderate_overparam], we utilize theorems regarding the convergence rates of two-layer neural networks trained with GD and SGD. We begin with the theorem for the convergence of GD in Theorem 4, then provide the associated convergence rate for SGD in Theorem 5. Both Theorems 4 and 5 are simply restated from [moderate_overparam] for convenience purposes.

###### Theorem 4 (Tasos: replace this comment with citation to the theorem in [moderate_overparam]).

Assume there exists a two-layer neural network and associated dataset as described in Section 3. Denote by $m$ the number of hidden neurons in the two-layer neural network, by $N$ the number of unique examples in the dataset, and by $d$ the input dimension of examples in the dataset. Assume without loss of generality that the input data within the dataset is normalized so that $\|x_i\|_2 = 1$ for all $i \in [N]$. A moderate amount of overparameterization within the two-layer network is assumed, given by $md \gtrsim N^2$. Furthermore, it is assumed that $N \ge d$ and that the first and second derivatives of the network's activation function $\sigma$ are bounded (i.e., $|\sigma'(z)| \le B$ and $|\sigma''(z)| \le B$ for some constant $B$). Given these assumptions, the following bound is achieved with high probability by training the neural network with gradient descent.

$$\|f(X, \Theta_t) - Y\|_2 \le \left(1 - \frac{cd}{m}\right)^t \cdot \|f(X, \Theta_0) - Y\|_2 \tag{15}$$

In (15), $c$ is a constant, $\Theta_t$ represents the network weights at iteration $t$ of gradient descent, $f(X, \Theta_t)$ represents the network output over the full dataset $X$, and $Y$ represents a vector of all dataset labels. The initialization $\Theta_0$ is assumed to be randomly sampled from a normal distribution (i.e., with i.i.d. entries drawn from $\mathcal{N}(0, 1)$).

###### Theorem 5 (restated from [moderate_overparam]).

Here, all assumptions of Theorem 4 are adopted, but we assume the two-layer neural network is trained with SGD instead of GD. For SGD, parameter updates are performed over a sequence of randomly-sampled examples within the training dataset (i.e., the true gradient is not computed for each update). Given the assumptions, there exists some event $E$ with probability $P(E) \ge \max\left(\kappa,\ 1 - c_1\left(c_2\sqrt{md}\right)^{\frac{1}{Nd}}\right)$, where $\kappa \in (0, 1)$ and $c_1, c_2 > 0$ are constants. Given the event $E$, with high probability the following bound is achieved when training the two-layer neural network with SGD.

$$\mathbb{E}\left[\|f(X, \Theta_t) - Y\|_2^2 \, \mathbb{1}_E\right] \le \left(1 - \frac{cd}{m^2}\right)^t \cdot \|f(X, \Theta_0) - Y\|_2^2 \tag{16}$$

In (16), $c$ is a constant, $\Theta_t$ represents the network weights at iteration $t$ of SGD, $\mathbb{1}_E$ is the indicator function for event $E$, $f(X, \Theta_t)$ represents the output of the two-layer neural network over the entire dataset $X$, and $Y$ represents a vector of all labels in the dataset.

It should be noted that the overparameterization assumptions within Theorems 4 and 5 are very mild, which leads us to adopt this analysis within our work. Namely, we only require that the number of examples in the dataset exceeds the input dimension (i.e., $N \ge d$) and that the number of parameters within the first neural network layer exceeds the squared size of the dataset (i.e., $md \gtrsim N^2$). In comparison, previous work lower bounds the number of hidden neurons in the two-layer neural network (i.e., more restrictive than bounding the number of parameters in the first layer) with higher-order polynomials of $N$ to achieve similar convergence guarantees [li2018learning, allen2018learning, du2019gradient].

In comparing the convergence rates of Theorems 4 and 5, one can notice that these linear convergence rates are very similar. The extra factor of $m$ within the denominator of the rate in Theorem 5 is intuitively due to the fact that updates are performed in a single pass through the dataset for SGD, while GD uses the full dataset at every parameter update. Such alignment between the convergence guarantees for GD and SGD allows our analysis to proceed similarly for both algorithms.
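As a rough numerical illustration of this gap, using the rates as stated above with arbitrary illustrative constants ($c = 1$, $d = 10$, $m = 100$, not taken from the paper), one can compare how many iterations each algorithm needs to halve the loss; the ratio is close to $m$:

```python
import math

# Illustrative constants (not taken from the paper): c = 1, d = 10, m = 100.
c, d, m = 1.0, 10.0, 100.0
rate_gd = 1 - c * d / m        # per-iteration contraction factor from (15)
rate_sgd = 1 - c * d / m ** 2  # per-iteration contraction factor from (16)

def iters_to_halve(rate):
    # Number of iterations t with rate**t = 1/2.
    return math.log(0.5) / math.log(rate)

ratio = iters_to_halve(rate_sgd) / iters_to_halve(rate_gd)
print(round(ratio))  # roughly m, reflecting the extra factor of m in (16)
```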

We now state a relevant technical lemma based on Theorem 1. Beginning from (10), the following can be shown.

###### Lemma 4.

Assume $u_k$ is the $k$-th iterate of (7). Then, the following is true.

$$\ell(u_k) \le \frac{1}{k}\ell(u_1) + \frac{1}{2k}D_{\mathcal{M}_N}^2 + \frac{k-1}{k} \cdot L_N$$

where $L_N$ is the loss achieved by the dense two-layer neural network of width $N$.

###### Proof.

Recall that $D_{\mathcal{M}_N}$ represents the diameter of the marginal polytope $\mathcal{M}_N$. Beginning with (10), the following can be shown.

$$\begin{aligned}
\ell(u_k) &\le \frac{1}{k}\left(\ell(u_1) - L_N + \frac{1}{2}D_{\mathcal{M}_N}^2\right) + L_N \\
&= \frac{1}{k}\left(\ell(u_1) + \frac{1}{2}D_{\mathcal{M}_N}^2\right) + \frac{k-1}{k}L_N \\
&= \frac{1}{k}\ell(u_1) + \frac{1}{2k}D_{\mathcal{M}_N}^2 + \frac{k-1}{k}L_N
\end{aligned}$$

∎
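The rearrangement above is purely algebraic; as a sanity check it can be verified exactly with rational arithmetic, using arbitrary rational stand-ins for $\ell(u_1)$, $L_N$, and $D_{\mathcal{M}_N}^2$:

```python
from fractions import Fraction

a = Fraction(7, 3)   # stands in for l(u_1)
L = Fraction(2, 5)   # stands in for L_N
D2 = Fraction(9, 4)  # stands in for D_{M_N}^2

for k in range(1, 50):
    first = Fraction(1, k) * (a - L + D2 / 2) + L
    last = Fraction(1, k) * a + D2 / (2 * k) + Fraction(k - 1, k) * L
    assert first == last  # the first and last lines of the chain agree
print("identity verified for k = 1..49")
```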

###### Observation 2.

We commonly refer to the value of $L_N$, representing the quadratic loss of the two-layer network over the full dataset. The value of $L_N$ can be expressed as follows.

$$L_N = \ell\left(\frac{1}{N}\sum_{v \in \mathrm{Vert}(\mathcal{M}_N)} v\right) = \frac{1}{2m}\|f(X, \Theta) - Y\|_2^2$$

The expression above is derived by simply applying the definitions of $\ell$, $\mathcal{M}_N$, and $f$ that are provided in Section 3.

#### a.2.1 Proof of Theorem 2

We now provide the proof for Theorem 2.

###### Proof.

From Theorem 5, we have a bound for $\mathbb{E}[\|f(X, \Theta_t) - Y\|_2^2 \mathbb{1}_E]$, where $\|f(X, \Theta_t) - Y\|_2^2$ represents the loss over the entire dataset after $t$ iterations of SGD (i.e., without the factor of $\frac{1}{2m}$). Two sources of stochasticity exist within this expectation: randomness over the event $E$ and randomness over the $t$-th iteration of SGD given the first $t-1$ iterations. The probability of event $E$ is independent of the randomness over SGD iterations, which allows the following expression to be derived.

$$\mathbb{E}\left[\|f(X, \Theta_t) - Y\|_2^2 \mathbb{1}_E\right] \overset{i)}{=} \mathbb{E}\left[\|f(X, \Theta_t) - Y\|_2^2\right] \cdot \mathbb{E}\left[\mathbb{1}_E\right] \overset{ii)}{\ge} \max\left(\kappa,\ 1 - c_1\left(c_2\sqrt{md}\right)^{\frac{1}{Nd}}\right)\mathbb{E}\left[\|f(X, \Theta_t) - Y\|_2^2\right]$$

where $i)$ holds from the independence of the two sources of randomness and $ii)$ is derived from the probability expression for event $E$ in Theorem 5. Combining the above expression with (16) from Theorem 5 yields the following, where two possible cases exist.

Case 1:

$$\left(1 - c_1\left(c_2\sqrt{md}\right)^{\frac{1}{Nd}}\right)\mathbb{E}\left[\|f(X, \Theta_t) - Y\|_2^2\right] \le \left(1 - \frac{cd}{m^2}\right)^t \|f(X, \Theta_0) - Y\|_2^2$$
$$\Rightarrow\ \mathbb{E}\left[\|f(X, \Theta_t) - Y\|_2^2\right] \le \left(1 - c_1\left(c_2\sqrt{md}\right)^{\frac{1}{Nd}}\right)^{-1}\left(1 - \frac{cd}{m^2}\right)^t \|f(X, \Theta_0) - Y\|_2^2 \tag{17}$$

Case 2:

$$\kappa \cdot \mathbb{E}\left[\|f(X, \Theta_t) - Y\|_2^2\right] \le \left(1 - \frac{cd}{m^2}\right)^t \|f(X, \Theta_0) - Y\|_2^2$$
$$\Rightarrow\ \mathbb{E}\left[\|f(X, \Theta_t) - Y\|_2^2\right] \le \kappa^{-1}\left(1 - \frac{cd}{m^2}\right)^t \|f(X, \Theta_0) - Y\|_2^2 \tag{18}$$

From Observation 2, we can derive the following, where the expectation is with respect to randomness over SGD iterations (i.e., we assume the dense two-layer network of width $N$ is trained with SGD).

$$\mathbb{E}\left[L_N\right] = \mathbb{E}\left[\frac{1}{2m}\|f(X, \Theta_t) - Y\|_2^2\right] \overset{i)}{=} \frac{1}{2m}\mathbb{E}\left[\|f(X, \Theta_t) - Y\|_2^2\right]$$

Here, the equality in $i)$ holds because $\frac{1}{2m}$ is a constant given a fixed dataset. Then, the above expression can be combined with (17) and (18) to yield the following.

$$\mathbb{E}\left[L_N\right] \le \frac{1}{2m}\max\left(\kappa^{-1},\ \left(1 - c_1\left(c_2\sqrt{md}\right)^{\frac{1}{Nd}}\right)^{-1}\right)\left(1 - \frac{cd}{m^2}\right)^t \|f(X, \Theta_0) - Y\|_2^2 \tag{19}$$

Then, we can substitute (19) into Lemma 4 as follows, where expectations are with respect to randomness over SGD iterations. We also define $\zeta = \max\left(\kappa^{-1},\ \left(1 - c_1\left(c_2\sqrt{md}\right)^{\frac{1}{Nd}}\right)^{-1}\right)$ and $L_0 = \|f(X, \Theta_0) - Y\|_2^2$.

$$\begin{aligned}
\mathbb{E}\left[\ell(u_k)\right] &\le \frac{1}{k}\mathbb{E}\left[\ell(u_1)\right] + \frac{1}{2k}\mathbb{E}\left[D_{\mathcal{M}_N}^2\right] + \frac{k-1}{k}\cdot\mathbb{E}\left[L_N\right] \\
&\le \frac{1}{k}\mathbb{E}\left[\ell(u_1)\right] + \frac{1}{2k}\mathbb{E}\left[D_{\mathcal{M}_N}^2\right] + \frac{(k-1)\zeta}{2mk}\left(1 - \frac{cd}{m^2}\right)^t \|f(X, \Theta_0) - Y\|_2^2 \\
&= \frac{1}{k}\mathbb{E}\left[\ell(u_1)\right] + \frac{1}{2k}\mathbb{E}\left[D_{\mathcal{M}_N}^2\right] + \frac{(k-1)\zeta}{2mk}\left(1 - \frac{cd}{m^2}\right)^t L_0
\end{aligned} \tag{20}$$

In (20), all terms on the right-hand side of the equation decay to zero as $k$ increases, aside from the rightmost term. The rightmost term of (20) remains relatively fixed as $k$ increases due to its factor of $k-1$ in the numerator. Within (20), there are two parameters that can be modified by the practitioner: $t$ and $m$ (i.e., notice that $\zeta$ depends on $m$). All other factors within the expression are constants based on the dataset that cannot be modified. $t$ only appears in the factor of $\left(1 - \frac{cd}{m^2}\right)^t$, and increasing $m$ drives this factor toward one, thus revealing that the value of $m$ cannot be naively increased within (20) to remove the factor of $k-1$.

To determine how $t$ can be modified to achieve a better pruning rate, we notice that a setting of $t$ such that $\left(1 - \frac{cd}{m^2}\right)^t = O\left(\frac{1}{k}\right)$ would cancel the factor of $k-1$ in (20). With this in mind, we observe the following.

$$\left(1 - \frac{cd}{m^2}\right)^t = O\left(\frac{1}{k}\right)\ \Rightarrow\ t \cdot \log\left(1 - \frac{cd}{m^2}\right) = O(-\log(k))\ \Rightarrow\ t = O\left(\frac{-\log(k)}{\log\left(1 - \frac{cd}{m^2}\right)}\right)\ \Rightarrow$$
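Concretely, the smallest $t$ satisfying $(1 - \frac{cd}{m^2})^t \le \frac{1}{k}$ can be computed directly; with the illustrative constants below (not taken from the paper), the required amount of pre-training grows only logarithmically in $k$:

```python
import math

def pretrain_iters(k: int, c: float = 1.0, d: float = 10.0, m: float = 100.0) -> int:
    """Smallest t with (1 - c*d/m**2)**t <= 1/k, i.e.,
    t = ceil(log(k) / -log(1 - c*d/m**2)). Constants are illustrative."""
    rate = 1 - c * d / m ** 2
    return math.ceil(math.log(k) / -math.log(rate))

for k in (10, 100, 1000):
    t = pretrain_iters(k)
    assert (1 - 10.0 / 100.0 ** 2) ** t <= 1 / k  # bound is met
    print(k, t)  # t grows like log(k), not like k
```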