Drawing early-bird tickets: Towards more efficient training of deep networks

09/26/2019
by Haoran You et al.

(Frankle & Carbin, 2019) shows that dense, randomly initialized networks contain winning tickets (small but critical subnetworks) that can be trained in isolation to reach accuracies comparable to those of the full networks in a similar number of iterations. However, identifying these winning tickets still requires a costly train-prune-retrain process, limiting their practical benefit. In this paper, we discover for the first time that winning tickets can be identified at a very early training stage, which we term early-bird (EB) tickets, via low-cost training schemes (e.g., early stopping and low-precision training) at large learning rates. Our finding of EB tickets is consistent with recently reported observations that the key connectivity patterns of neural networks emerge early in training. Furthermore, we propose a mask-distance metric that identifies EB tickets with low computational overhead, without needing to know the true winning tickets that emerge only after full training. Finally, we leverage the existence of EB tickets and the proposed mask distance to develop efficient training methods: we first identify EB tickets via low-cost schemes, then continue training only the EB tickets toward the target accuracy. Experiments on various deep networks and datasets validate 1) the existence of EB tickets and the effectiveness of the mask distance in efficiently identifying them, and 2) that the proposed efficient training via EB tickets achieves up to 4.7x energy savings while maintaining comparable or even better accuracy, demonstrating a promising and easily adopted method for tackling cost-prohibitive deep network training.
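The mask-distance idea can be sketched in a few lines. The snippet below is an illustrative reconstruction, not the authors' code: it assumes pruning masks are obtained by keeping the largest-magnitude entries of some per-parameter score (e.g., channel scaling factors), measures the normalized Hamming distance between masks from consecutive epochs, and declares an EB ticket once that distance stays below a small threshold for several epochs. Function names, the threshold `eps`, and the `window` length are all hypothetical choices.

```python
import numpy as np

def prune_mask(scores: np.ndarray, keep_ratio: float) -> np.ndarray:
    """Binary mask keeping the top `keep_ratio` fraction of entries by magnitude."""
    k = max(1, int(round(keep_ratio * scores.size)))
    thresh = np.sort(np.abs(scores).ravel())[-k]  # k-th largest magnitude
    return (np.abs(scores) >= thresh).astype(np.uint8)

def mask_distance(m1: np.ndarray, m2: np.ndarray) -> float:
    """Normalized Hamming distance: fraction of positions where the masks disagree."""
    return float(np.mean(m1 != m2))

def found_eb_ticket(mask_history, eps=0.1, window=5):
    """Declare an early-bird ticket once the last `window` consecutive
    epoch-to-epoch mask distances all fall below `eps`."""
    if len(mask_history) < window + 1:
        return False
    recent = [mask_distance(mask_history[i], mask_history[i + 1])
              for i in range(len(mask_history) - window - 1, len(mask_history) - 1)]
    return all(d < eps for d in recent)
```

In use, one would append a fresh `prune_mask(...)` to `mask_history` after each epoch and stop the full-network training as soon as `found_eb_ticket` returns `True`, then train only the retained subnetwork.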

Related research

GEBT: Drawing Early-Bird Tickets in Graph Convolutional Network Training (03/01/2021)
Graph Convolutional Networks (GCNs) have emerged as the state-of-the-art...

Efficient Lottery Ticket Finding: Less Data is More (06/06/2021)
The lottery ticket hypothesis (LTH) reveals the existence of winning tic...

Class Means as an Early Exit Decision Mechanism (03/01/2021)
State-of-the-art neural networks with early exit mechanisms often need c...

Statistically Significant Stopping of Neural Network Training (03/01/2021)
The general approach taken when training deep learning classifiers is to...

Effective Early Stopping of Point Cloud Neural Networks (09/30/2022)
Early stopping techniques can be utilized to decrease the time cost, how...

Understanding and Improving Early Stopping for Learning with Noisy Labels (06/30/2021)
The memorization effect of deep neural network (DNN) plays a pivotal rol...

DessiLBI: Exploring Structural Sparsity of Deep Networks via Differential Inclusion Paths (07/04/2020)
Over-parameterization is ubiquitous nowadays in training neural networks...
