FreezeNet: Full Performance by Reduced Storage Costs

11/28/2020
by Paul Wimmer, et al.

Pruning generates sparse networks by setting parameters to zero. In this work we improve one-shot pruning methods that are applied before training, without adding any additional storage costs and while preserving the sparse gradient computations. The main difference from pruning is that we do not sparsify the network's weights: we train only a few key parameters and keep the remaining ones fixed at their randomly initialized values. We call this mechanism freezing the parameters. The frozen weights can be stored efficiently with a single 32-bit random seed. The parameters to be frozen are determined one-shot, by a single forward and backward pass performed before training starts. We call the introduced method FreezeNet. In our experiments we show that FreezeNets achieve good results, especially for extreme freezing rates. Freezing weights preserves the gradient flow throughout the network; consequently, FreezeNets train better and have an increased capacity compared to their pruned counterparts. On the classification tasks MNIST and CIFAR-10/100 we outperform SNIP, the best reported one-shot pruning method applied before training in this setting. On MNIST, FreezeNet reaches 99.2% of the performance of its dense baseline architecture while compressing the number of trained and stored parameters by a factor of ×157.
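The abstract does not spell out the selection criterion, but a minimal PyTorch sketch of the freezing mechanism might look as follows. It assumes a SNIP-style saliency |g ⊙ w| computed from one forward and backward pass; the function names select_trainable and freeze, the keep_ratio, the cross-entropy loss, and the restriction to Conv2d/Linear weights are illustrative assumptions, not the paper's exact setup.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def select_trainable(model, data, target, keep_ratio=0.01, seed=0):
    """One-shot selection of the few weights that will be trained.

    Returns {param_name: bool mask}; True marks a trainable weight, False a
    weight frozen at its seeded random initialization. The saliency, loss and
    keep_ratio are assumptions for this sketch, not the paper's exact recipe.
    """
    torch.manual_seed(seed)                      # frozen values are recoverable from this single seed
    for m in model.modules():
        if isinstance(m, (nn.Conv2d, nn.Linear)):
            nn.init.kaiming_normal_(m.weight)    # deterministic re-draw of the random initialization

    weights = {n: p for n, p in model.named_parameters() if p.dim() > 1}
    loss = F.cross_entropy(model(data), target)                 # single forward pass
    grads = torch.autograd.grad(loss, list(weights.values()))   # single backward pass

    # SNIP-style connection sensitivity |g * w|, thresholded globally
    # so that only the top keep_ratio fraction stays trainable.
    scores = {n: (g * p.detach()).abs()
              for (n, p), g in zip(weights.items(), grads)}
    flat = torch.cat([s.flatten() for s in scores.values()])
    k = max(1, int(keep_ratio * flat.numel()))
    threshold = torch.topk(flat, k).values.min()
    return {n: s >= threshold for n, s in scores.items()}


def freeze(model, masks):
    """Zero the gradients of frozen weights so their values never change."""
    for n, p in model.named_parameters():
        if n in masks:
            p.register_hook(lambda g, m=masks[n].float(): g * m)
```

A typical use would be masks = select_trainable(net, x, y) followed by freeze(net, masks) before the usual training loop: only the selected weights receive nonzero gradients, while the frozen ones stay at their initialization and can be regenerated from the stored seed at load time.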


