Strong Lottery Ticket Hypothesis with ε-perturbation

10/29/2022
by Zheyang Xiong, et al.

The strong Lottery Ticket Hypothesis (LTH) claims that a sufficiently large, randomly initialized neural network contains a subnetwork that approximates some target neural network without any training. We extend the theoretical guarantees of the strong LTH literature to a setting closer to the original LTH by generalizing the weight change in the pre-training step to a perturbation around initialization. In particular, we focus on the following open questions: By allowing an ε-scale perturbation of the random initial weights, can we reduce the over-parameterization requirement for the candidate network in the strong LTH? Furthermore, does the weight change induced by SGD coincide with a good set of such perturbations? We answer the first question by extending the theoretical result on subset sum to allow perturbation of the candidates. Applying this result to the neural network setting, we show that such an ε-perturbation reduces the over-parameterization requirement of the strong LTH. To answer the second question, we show via experiments that the perturbed weights obtained by projected SGD perform better under strong LTH pruning.
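
To make the subset-sum question concrete, the following is a minimal illustrative sketch, not the paper's construction or proof. It assumes candidates drawn uniformly from [-1, 1], a brute-force search over subsets, and that each selected candidate may be shifted independently by at most ε. The function name `best_subset_error` and all parameter values are hypothetical. Under these assumptions, a subset of size r can reach any sum within r·ε of its unperturbed sum, so its residual error is max(0, gap − r·ε).

```python
import itertools
import random


def best_subset_error(candidates, target, eps=0.0):
    """Smallest |target - subset sum| over all subsets, when every selected
    candidate may be shifted by at most eps (assumption: shifts are independent)."""
    best = abs(target)  # error of the empty subset
    for r in range(1, len(candidates) + 1):
        for subset in itertools.combinations(candidates, r):
            gap = abs(target - sum(subset))
            # r selected candidates can jointly absorb up to r * eps of the gap
            best = min(best, max(0.0, gap - r * eps))
    return best


rng = random.Random(0)  # fixed seed, chosen only for reproducibility
candidates = [rng.uniform(-1.0, 1.0) for _ in range(8)]
target = 0.5  # hypothetical target value

plain = best_subset_error(candidates, target)
perturbed = best_subset_error(candidates, target, eps=0.01)
print(f"error without perturbation: {plain:.6f}")
print(f"error with eps = 0.01:      {perturbed:.6f}")
```

The perturbed error can never exceed the unperturbed one, which is the intuition behind needing fewer random candidates to reach a given accuracy; how large the reduction is in general is the subject of the paper's analysis, which this toy search does not reproduce.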


research
06/14/2020

Optimal Lottery Tickets via SubsetSum: Logarithmic Over-Parameterization is Sufficient

The strong lottery ticket hypothesis (LTH) postulates that one can appro...
research
02/03/2020

Proving the Lottery Ticket Hypothesis: Pruning is All You Need

The lottery ticket hypothesis (Frankle and Carbin, 2018), states that a ...
research
03/05/2019

The Lottery Ticket Hypothesis at Scale

Recent work on the "lottery ticket hypothesis" proposes that randomly-in...
research
02/02/2022

Robust Training of Neural Networks using Scale Invariant Architectures

In contrast to SGD, adaptive gradient methods like Adam allow robust tra...
research
08/08/2022

Understanding Weight Similarity of Neural Networks via Chain Normalization Rule and Hypothesis-Training-Testing

We present a weight similarity measure method that can quantify the weig...
research
08/04/2020

Recovering a perturbation of a matrix polynomial from a perturbation of its linearization

A number of theoretical and computational problems for matrix polynomial...
research
06/22/2020

Logarithmic Pruning is All You Need

The Lottery Ticket Hypothesis is a conjecture that every large neural ne...
