Lottery Ticket Implies Accuracy Degradation, Is It a Desirable Phenomenon?

02/19/2021
by   Ning Liu, et al.
8

In deep model compression, the recent finding "Lottery Ticket Hypothesis" (LTH) (Frankle Carbin, 2018) pointed out that there could exist a winning ticket (i.e., a properly pruned sub-network together with original weight initialization) that can achieve competitive performance than the original dense network. However, it is not easy to observe such winning property in many scenarios, where for example, a relatively large learning rate is used even if it benefits training the original dense model. In this work, we investigate the underlying condition and rationale behind the winning property, and find that the underlying reason is largely attributed to the correlation between initialized weights and final-trained weights when the learning rate is not sufficiently large. Thus, the existence of winning property is correlated with an insufficient DNN pretraining, and is unlikely to occur for a well-trained DNN. To overcome this limitation, we propose the "pruning fine-tuning" method that consistently outperforms lottery ticket sparse training under the same pruning algorithm and the same total training epochs. Extensive experiments over multiple deep models (VGG, ResNet, MobileNet-v2) on different datasets have been conducted to justify our proposals.

READ FULL TEXT

page 1

page 2

page 3

page 4

page 6

page 9

page 11

page 12

research
03/05/2020

Comparing Rewinding and Fine-tuning in Neural Network Pruning

Many neural network pruning algorithms proceed in three steps: train the...
research
03/19/2021

Cascade Weight Shedding in Deep Neural Networks: Benefits and Pitfalls for Network Pruning

We report, for the first time, on the cascade weight shedding phenomenon...
research
05/07/2021

Network Pruning That Matters: A Case Study on Retraining Variants

Network pruning is an effective method to reduce the computational expen...
research
09/20/2021

Reproducibility Study: Comparing Rewinding and Fine-tuning in Neural Network Pruning

Scope of reproducibility: We are reproducing Comparing Rewinding and Fin...
research
03/06/2020

Towards Practical Lottery Ticket Hypothesis for Adversarial Training

Recent research has proposed the lottery ticket hypothesis, suggesting t...
research
06/12/2020

How many winning tickets are there in one DNN?

The recent lottery ticket hypothesis proposes that there is one sub-netw...
research
07/01/2021

Sanity Checks for Lottery Tickets: Does Your Winning Ticket Really Win the Jackpot?

There have been long-standing controversies and inconsistencies over the...

Please sign up or login with your details

Forgot password? Click here to reset