PHEW: Paths with higher edge-weights give "winning tickets" without training data

10/22/2020
by Shreyas Malakarjun Patil, et al.

Sparse neural networks have generated substantial recent interest because they can be more efficient in learning and inference, without any significant drop in performance. The "lottery ticket hypothesis" has shown that such sparse subnetworks exist at initialization. Given a fully-connected, initialized architecture, our aim is to find such "winning ticket" networks without any training data. We first show the advantages of forming input-output paths, rather than pruning individual connections, to avoid bottlenecks in gradient propagation. We then show that Paths with Higher Edge-Weights (PHEW) at initialization have higher loss gradient magnitude, resulting in more efficient training, and that such paths can be selected without any data. We empirically validate the proposed approach against pruning-before-training methods on CIFAR10, CIFAR100 and Tiny-ImageNet with VGG-Net and ResNet architectures. PHEW achieves significant improvements over the current state-of-the-art methods at 10%, 5% and 2% network density. We also evaluate the structural similarity between PHEW networks and the pruned networks constructed through Iterative Magnitude Pruning (IMP), concluding that the former belong to the family of winning-ticket networks.
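The key step, selecting input-output paths that favor higher-magnitude weights at initialization, requires only the initialized weights and no training data. The sketch below (Python/NumPy) illustrates one simple way to do this for a fully-connected network stored as a list of weight matrices, by sampling walks from input to output units with transition probabilities proportional to weight magnitudes. The function name phew_style_mask, the uniform choice of starting input unit, and the density-based stopping rule are illustrative assumptions, not the exact procedure from the paper.

import numpy as np

def phew_style_mask(weights, target_density, rng=None):
    """Keep edges that lie on sampled input-output paths, where each step of a
    walk is taken with probability proportional to the weight magnitude of the
    outgoing connections, until a target fraction of edges has been kept.

    weights: list of dense matrices W_l of shape (fan_in_l, fan_out_l).
    Returns binary masks of the same shapes (assumed interpretation).
    """
    rng = np.random.default_rng() if rng is None else rng
    masks = [np.zeros_like(W) for W in weights]
    budget = int(target_density * sum(W.size for W in weights))

    kept = 0
    while kept < budget:
        # Start a walk at a uniformly random input unit (an assumption here).
        unit = rng.integers(weights[0].shape[0])
        for l, W in enumerate(weights):
            # Step to the next layer with probability proportional to |w|.
            p = np.abs(W[unit]) + 1e-12
            p = p / p.sum()
            nxt = rng.choice(W.shape[1], p=p)
            if masks[l][unit, nxt] == 0:
                masks[l][unit, nxt] = 1.0
                kept += 1
            unit = nxt
    return masks

# Example: a small 3-layer MLP kept at roughly 10% density.
weights = [np.random.randn(784, 300), np.random.randn(300, 100), np.random.randn(100, 10)]
masks = phew_style_mask(weights, target_density=0.10)
print([m.mean() for m in masks])  # per-layer densities

Applying the resulting masks elementwise to the initial weights yields a sparse subnetwork at, for example, the 10%, 5% or 2% densities discussed above, which can then be trained as usual.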

Related research

06/19/2021 · Sparse Training via Boosting Pruning Plasticity with Neuroregeneration
Works on lottery ticket hypothesis (LTH) and single-shot network pruning...

02/18/2020 · Picking Winning Tickets Before Training by Preserving Gradient Flow
Overparameterization has been shown to benefit both the optimization and...

06/14/2019 · A Signal Propagation Perspective for Pruning Neural Networks at Initialization
Network pruning is a promising avenue for compressing deep neural networ...

02/18/2022 · Amenable Sparse Network Investigator
As the optimization problem of pruning a neural network is nonconvex and...

01/26/2021 · A Unified Paths Perspective for Pruning at Initialization
A number of recent approaches have been proposed for pruning neural netw...

06/08/2023 · Magnitude Attention-based Dynamic Pruning
Existing pruning methods utilize the importance of each weight based on ...

09/24/2018 · Dense neural networks as sparse graphs and the lightning initialization
Even though dense networks have lost importance today, they are still us...
