Pruning neural networks without any data by iteratively conserving synaptic flow

06/09/2020
by Hidenori Tanaka, et al.

Pruning the parameters of deep neural networks has generated intense interest due to potential savings in time, memory and energy both during training and at test time. Recent works have identified, through an expensive sequence of training and pruning cycles, the existence of winning lottery tickets or sparse trainable subnetworks at initialization. This raises a foundational question: can we identify highly sparse trainable subnetworks at initialization, without ever training, or indeed without ever looking at the data? We provide an affirmative answer to this question through theory-driven algorithm design. We first mathematically formulate and experimentally verify a conservation law that explains why existing gradient-based pruning algorithms at initialization suffer from layer-collapse, the premature pruning of an entire layer rendering a network untrainable. This theory also elucidates how layer-collapse can be entirely avoided, motivating a novel pruning algorithm, Iterative Synaptic Flow Pruning (SynFlow). This algorithm can be interpreted as preserving the total flow of synaptic strengths through the network at initialization subject to a sparsity constraint. Notably, this algorithm makes no reference to the training data and consistently outperforms existing state-of-the-art pruning algorithms at initialization over a range of models (VGG and ResNet), datasets (CIFAR-10/100 and Tiny ImageNet), and sparsity constraints (up to 99.9 percent). Thus our data-agnostic pruning algorithm challenges the existing paradigm that data must be used to quantify which synapses are important.
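
The abstract's description of SynFlow translates into a short, data-free scoring loop. Below is a minimal sketch, assuming a PyTorch implementation; the helper names (synflow_scores, synflow_prune), the toy multilayer perceptron, and the default compression and iteration values are illustrative assumptions, not the authors' released code. Per the paper, each parameter is scored by the synaptic saliency |dR/dtheta * theta|, where R is the summed output of a forward pass on an all-ones input through the network with absolute-valued weights, and the global pruning threshold is tightened over roughly 100 iterations so the scores are recomputed after every round of pruning.

```python
import torch
import torch.nn as nn


@torch.no_grad()
def apply_masks(model, masks):
    # Zero out pruned weights in place.
    for name, p in model.named_parameters():
        if name in masks:
            p.mul_(masks[name])


def synflow_scores(model, input_shape):
    # Data-free saliency: R is the summed output for an all-ones "input",
    # and each weight's score is |dR/dtheta * theta|.
    ones = torch.ones(1, *input_shape)
    model.zero_grad()
    R = model(ones).sum()
    R.backward()
    return {name: (p.grad * p).detach().abs()
            for name, p in model.named_parameters() if p.dim() > 1}


def synflow_prune(model, input_shape, compression=100.0, iterations=100):
    # Work with |theta| so the synaptic flow R is a sum of products of
    # non-negative weights; signs are restored at the end.
    signs = {name: torch.sign(p.data) for name, p in model.named_parameters()}
    for p in model.parameters():
        p.data.abs_()

    masks = {name: torch.ones_like(p)
             for name, p in model.named_parameters() if p.dim() > 1}

    for k in range(1, iterations + 1):
        apply_masks(model, masks)
        scores = synflow_scores(model, input_shape)

        # Exponential schedule: keep compression**(-k/n) of the weights at step k.
        keep_frac = compression ** (-k / iterations)
        flat = torch.cat([s.flatten() for s in scores.values()])
        num_keep = max(1, int(keep_frac * flat.numel()))
        threshold = torch.topk(flat, num_keep, largest=True).values.min()
        masks = {name: (s >= threshold).float() for name, s in scores.items()}

    # Restore original signs and apply the final masks.
    with torch.no_grad():
        for name, p in model.named_parameters():
            p.mul_(signs[name])
    apply_masks(model, masks)
    return masks


if __name__ == "__main__":
    # Toy fully connected network standing in for VGG/ResNet.
    net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 256),
                        nn.ReLU(), nn.Linear(256, 10))
    masks = synflow_prune(net, input_shape=(3, 32, 32), compression=100.0)
    kept = sum(int(m.sum()) for m in masks.values())
    total = sum(m.numel() for m in masks.values())
    print(f"kept {kept}/{total} weights ({kept / total:.2%})")
```

The iterative re-scoring is the point of the algorithm: because synaptic saliency is conserved layer-wise, a single one-shot cut at extreme sparsities tends to wipe out entire layers, whereas re-evaluating the scores on the progressively masked network keeps some flow through every layer.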


Related research

Picking Winning Tickets Before Training by Preserving Gradient Flow (02/18/2020)
Amenable Sparse Network Investigator (02/18/2022)
Prospect Pruning: Finding Trainable Weights at Initialization using Meta-Gradients (02/16/2022)
SR-init: An Interpretable Layer Pruning Method (03/14/2023)
The Elastic Lottery Ticket Hypothesis (03/30/2021)
Lottery Tickets in Evolutionary Optimization: On Sparse Backpropagation-Free Trainability (05/31/2023)
[Reproducibility Report] Rigging the Lottery: Making All Tickets Winners (03/29/2021)
