1 Introduction
The ability of deep neural networks to learn complex transformations by example, together with their strong generalization ability, has been key to their success in domains ranging from computer vision to machine translation to automatic speech recognition. Even though they generalize well, deep networks learn more effectively when they are highly overparameterized (Brutzkus et al., 2017; Zhang et al., 2016). Emerging evidence attributes this need for overparameterization to the geometry of the high-dimensional loss landscapes of overparameterized deep neural networks (Dauphin et al., 2014; Choromanska et al., 2014; Goodfellow et al., 2014; Im et al., 2016; Wu et al., 2017; Liao & Poggio, 2017; Cooper, 2018; Novak et al., 2018) and to the implicit regularization properties of SGD (Brutzkus et al., 2017; Zhang et al., 2018a; Poggio et al., 2017), though a thorough theoretical understanding is not yet complete.

Several techniques can trim down the post-training model size, such as distillation methods (Bucilua et al., 2006; Hinton et al., 2015), reduced bit-precision methods (Hubara et al., 2016; McDonnell, 2018), low-rank decomposition methods (Jaderberg et al., 2014; Denil et al., 2013), and pruning methods (Han et al., 2015a; Zhang et al., 2018b). While these methods are highly effective at reducing the number of network parameters with little to no degradation in accuracy, they either operate on a pre-trained model or require the full overparameterized model to be maintained during training. The success of these compression methods indicates that shallow and/or small networks contain parameter configurations that allow them to reach accuracies on par with bigger and deeper networks. This gives a tantalizing hint that overparameterization is not a strict necessity and that alternative training or parameterization methods might be able to find these compact networks directly.
The problem of achieving training-time parameter efficiency^1 can be approached in a number of ways. Innovations in this direction for deep convolutional neural networks (CNNs) include the development of skip connections (He et al., 2015), the elimination of fully-connected layers in favor of global average pooling layers followed directly by the classifier layer (Lin et al., 2013), and depthwise separable convolutions (Sifre & Mallat, 2014; Howard et al., 2017). These architectural innovations drastically improved the accuracy of CNNs at reduced parameter budgets.

^1 If model family $A$ achieves a specific level of generalization performance with fewer parameters than model family $B$, we say $A$ is more parameter efficient than $B$ at that performance level.

An alternative approach is to reparameterize an existing model architecture. In general, any differentiable reparameterization can be used to augment training of a given model. Let an original network (or a layer therein) be denoted by $f(x; \theta)$, parameterized by $\theta$. Reparameterize it by $\phi$ and $\psi$ through $\theta = g(\phi; \psi)$, where $g$ is differentiable w.r.t. $\phi$ but not necessarily w.r.t. $\psi$. Denote the reparameterized network by $\tilde{f}$, considering $\psi$ as metaparameters^2:

$$\tilde{f}(x; \phi, \psi) = f\big(x;\, g(\phi; \psi)\big) \qquad (1)$$

We can train $\tilde{f}$ using gradient descent. If $(\phi, \psi)$ is more compact than $\theta$ and $\tilde{f}$ can be trained to match the generalization performance of $f$, then $g$ provides a more efficient parameterization of the network.

^2 We use the term metaparameter to refer to the parameters $\psi$ of the reparameterization function $g$. They differ from the parameters $\phi$ in that they are not optimized through gradient descent, and they differ from hyperparameters in that they define meaningful features of the model which are required for inference.
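The reparameterized forward pass of Eq. (1) can be sketched numerically. The layer sizes below and the choice of $g$ as a fixed random linear projection are our own illustrative assumptions, not the paper's method:

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, n_free = 64, 128, 512      # hypothetical layer sizes

# Metaparameters psi: a fixed linear map (not trained).
# Free parameters phi: what SGD would actually optimize.
psi = rng.standard_normal((d_out * d_in, n_free)) / np.sqrt(n_free)
phi = rng.standard_normal(n_free) * 0.01

def f_tilde(x, phi):
    """Reparameterized layer f(x; g(phi; psi)), with g a linear projection.
    g is differentiable in phi, so gradients flow to phi via the chain rule."""
    theta = (psi @ phi).reshape(d_out, d_in)   # generate the dense weights
    return x @ theta.T

x = rng.standard_normal((4, d_in))
y = f_tilde(x, phi)
# 512 trainable parameters stand in for 64 * 128 = 8192 dense weights.
```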
Sparse reparameterization is a special case where $g$ is a linear projection: $\phi$ holds the nonzero entries (i.e. "weights") and $\psi$ their indices (i.e. "connectivity") in the original parameter tensor $\theta$. Likewise, parameter sharing is a similar special case of linear reparameterization where $\phi$ holds the tied parameters and $\psi$ the indices at which each parameter is placed (with repetition) in the original parameter tensor $\theta$. Furthermore, if the metaparameters $\psi$ are fixed during the course of training, the reparameterization is static, whereas if $\psi$ is adjusted adaptively during training, we call it dynamic reparameterization.

In this paper, we examine multiple parameterizations of deep residual CNNs, both static and dynamic. We build upon previous sparse dynamic parameterization schemes to develop a novel dynamic parameterization method that yields the highest parameter efficiency when training deep residual CNNs, outperforming previous static and dynamic parameterization methods. Our method dynamically changes the sparse network structure during learning, and its superior performance implies that, given a certain storage and computational budget to train a residual CNN, we are better off allocating part of the budget to describing and evolving the structure of the network rather than spending it all on the parameters of a conventional dense network.
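The sparse special case can be written as a scatter of the free weights into a dense tensor; the tensor shape and density below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
shape = (8, 8)        # hypothetical dense tensor shape
n_active = 10

# psi: flat indices of the active positions (the connectivity metaparameters);
# phi: the nonzero weights placed at those positions.
psi = rng.choice(shape[0] * shape[1], size=n_active, replace=False)
phi = rng.standard_normal(n_active)

def scatter(phi, psi, shape):
    """g(phi; psi) for sparse reparameterization: a linear projection that
    scatters the free weights into an otherwise all-zero dense tensor."""
    theta = np.zeros(shape)
    theta.flat[psi] = phi
    return theta

theta = scatter(phi, psi, shape)
sparsity = 1.0 - n_active / theta.size   # fraction of zero positions
```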
We show that the success of our dynamic parameterization method is not solely due to the final structure of the resultant sparse networks, or to a combination of final structure and initial weight values. Rather, training-time structural exploration is needed to reach the best accuracies, even if a high-performance structure and its initial values are known a priori. This implies that optimizing structure in tandem with weight optimization through gradient descent helps the latter find better-performing weights. Structure exploration thus improves the trainability of sparse deep residual CNNs.
2 Related work
Training of differentiably reparameterized networks has been proposed in numerous previous studies.
Dense reparameterization Several dense reparameterization techniques sought to reduce the size of fully-connected layers. These include low-rank decomposition (Denil et al., 2013), the Fastfood transform (Yang et al., 2014), the ACDC transform (Moczulski et al., 2015), HashedNet (Chen et al., 2015), low displacement rank (Sindhwani et al., 2015), and block-circulant matrix parameterization (Treister et al., 2018).
Note that similar reparameterizations were also used to introduce certain algebraic properties to the parameters for purposes other than reducing model sizes, e.g. to make training more stable as in unitary evolution RNNs (Arjovsky et al., 2015) and in weight normalization (Salimans & Kingma, 2016), to inject inductive biases (Thomas et al., 2018), and to alter (Dinh et al., 2017) or to measure (Li et al., 2018) properties of the loss landscape. All dense reparameterization methods to date are static.
Sparse reparameterization Successful training of sparse reparameterized networks usually employs iterative pruning and retraining, e.g. Han et al. (2015b); Narang et al. (2017); Zhu & Gupta (2017)^3. Training typically starts with a large pre-trained model, and sparsity is gradually increased during the course of fine-tuning. Training a small, static, and sparse model de novo typically fares worse than obtaining the sparse model by pruning a large dense model (Zhu & Gupta, 2017).

^3 Note that these, as well as all other techniques we benchmark against in this paper, impose non-structured sparsification on parameter tensors, yielding sparse models. There also exists a class of structured pruning methods that "sparsify" at channel or layer granularity, e.g. Luo et al. (2017) and Huang & Wang (2017), generating essentially small dense models. We describe the full landscape of existing methods in Appendix C.
Frankle & Carbin (2018) successfully identified small and sparse subnetworks post-training which, when trained in isolation, reached accuracies similar to that of the enclosing big network. They further showed that these subnetworks were sensitive to initialization, and hypothesized that the role of overparameterization is to provide a large number of candidate subnetworks, thereby increasing the likelihood that one of these subnetworks will have the structure and initialization needed for effective learning.
Most closely related to our work are dynamic sparse reparameterization techniques that emerged only recently. Like ours, these methods adaptively alter, by certain heuristic rules, the locations of nonzero parameters during training. Sparse evolutionary training (SET) (Mocanu et al., 2018) used magnitude-based pruning and random growth at the end of each training epoch. NeST (Dai et al., 2017, 2018) iteratively grew and pruned parameters and neurons during training; parameter growth was guided by parameter gradients and pruning by parameter magnitudes. Deep rewiring (Bellec et al., 2017) combined dynamic sparse parameterization with stochastic parameter updates for training. These methods were mostly concerned with sparsifying fully-connected layers and were applied to relatively small and shallow networks. We show that the method we propose in this paper is more scalable and computationally efficient than these previous approaches, while achieving better performance on deep convolutional networks.

3 Methods
We train deep CNNs in which the majority of layers have sparse weight tensors. All sparse weight tensors are initialized at the same sparsity (percentage of zeros) level. We use a full (non-sparse) parameterization for all bias parameters and the parameters of batch normalization layers. Throughout training, we always maintain the same total number of nonzero parameters in the network. Parameters are moved within and across tensors in two phases, a pruning phase followed immediately by a growth phase, as shown in Algorithm 1. We carry out the parameter reallocation step described by Algorithm 1 every few hundred training iterations.

We use magnitude-based pruning with an adaptive global threshold $H$: all network weights with magnitude smaller than $H$ are pruned. $H$ adapts to roughly maintain a fixed number $K$ of pruned/grown parameters during each reallocation step. This makes pruning particularly efficient, as no sorting operations are needed and only a single global threshold is used. After removing parameters during the pruning phase, zero-initialized parameters are redistributed among the network tensors in the growth phase.
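Threshold-based pruning of this kind can be sketched as follows. The multiplicative update of the threshold is only a plausible illustration of "adapt the threshold to roughly maintain a fixed pruning count", not necessarily the authors' exact rule:

```python
import numpy as np

def adaptive_prune(w, H, K):
    """Magnitude-based pruning with a single global threshold H (no sorting).
    Returns the keep-mask, the number of weights pruned, and an updated H.
    The factor-of-two adaptation band for H is an assumption for illustration."""
    nonzero = w != 0
    pruned = nonzero & (np.abs(w) < H)
    k = int(pruned.sum())
    if k < 0.5 * K:       # pruned far too few: raise the threshold
        H *= 2.0
    elif k > 2.0 * K:     # pruned far too many: lower it
        H *= 0.5
    return nonzero & ~pruned, k, H

rng = np.random.default_rng(2)
w = rng.standard_normal(10_000)
keep, k, H = adaptive_prune(w, H=0.01, K=500)
```

Because only a comparison against one scalar is needed, the cost per step is linear in the number of weights, with no sort.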
Intuitively, we should allocate more parameters to layers where they can more quickly reduce the training classification loss. To first order, we should allocate more parameters to layers whose parameters receive larger classification-loss gradients. If a layer has been heavily pruned, this indicates that, for a large portion of its parameters, the training loss gradients were not large or consistent enough to counteract the pull towards zero arising from weight regularization. We thus use a simple heuristic in which the available parameters are allocated preferentially to layers with a higher percentage of nonzero weights, as shown in Algorithm 1. The parameters allocated to a layer are randomly placed in the non-active (zero) positions of its weight tensor. See Appendix F for a more detailed description of the algorithm.
To simplify exposition, we do not include in Algorithm 1 the guards against rounding errors that could introduce a discrepancy between the numbers of pruned and grown parameters. We also do not include the special case where more parameters are allocated to a tensor than it has non-active positions; in that case, the extra parameters that do not fit in the now fully dense tensor are redistributed among the other sparse tensors.
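The growth heuristic, including the rounding and overflow handling just described, can be sketched as follows. The proportional-to-nonzero-count rule follows the text; the round-robin redistribution of spilled parameters is our own simplification:

```python
import numpy as np

def reallocate_growth(masks, n_grow, rng):
    """Grow n_grow parameters across sparse tensors in proportion to each
    tensor's count of nonzero weights, spilling parameters that do not fit
    in a (now fully dense) tensor back to the others. Tie-breaking and the
    spill order are assumptions; assumes enough free positions exist overall."""
    alive = np.array([int(m.sum()) for m in masks], dtype=float)
    free = np.array([m.size - int(m.sum()) for m in masks])
    assert free.sum() >= n_grow
    quota = np.floor(n_grow * alive / alive.sum()).astype(int)
    quota = np.minimum(quota, free)          # cap at available zero positions
    spill = n_grow - int(quota.sum())        # rounding leftovers + overflow
    while spill > 0:
        for i in range(len(masks)):
            if spill > 0 and quota[i] < free[i]:
                quota[i] += 1
                spill -= 1
    for m, q in zip(masks, quota):
        zeros = np.flatnonzero(~m.ravel())   # candidate positions to activate
        m.flat[rng.choice(zeros, size=q, replace=False)] = True
    return masks

rng = np.random.default_rng(3)
masks = [np.zeros((10, 10), dtype=bool) for _ in range(2)]
masks[0].flat[:30] = True                    # a denser tensor...
masks[1].flat[:10] = True                    # ...and a sparser one
before = sum(int(m.sum()) for m in masks)
reallocate_growth(masks, n_grow=21, rng=rng)
after = sum(int(m.sum()) for m in masks)
```

In this example the denser tensor receives roughly three quarters of the new parameters, matching the proportional rule.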
The algorithm most closely related to ours is SET (Mocanu et al., 2018). Our algorithm differs from SET in two respects: we use an adaptive threshold for pruning instead of pruning a fixed fraction of weights at each reallocation step, and we reallocate parameters across layers during training rather than imposing a fixed sparsity level on each layer. The first difference reduces computational overhead, as it obviates the need for sorting operations; the second leads to better-performing networks, as shown in the next section, and to the ability to train extremely sparse networks, as shown in Appendix E.
We evaluate our method, together with other static and dynamic parameterization methods, on the deep residual CNNs shown in Table 1. We did not include AlexNet (Krizhevsky et al., 2012) or VGG-style networks (Simonyan & Zisserman, 2014), as their parameter efficiency is inferior to that of residual nets; such a setup makes the improvement in parameter efficiency achieved by our dynamic parameterization method more relevant. Dynamic sparse parameterization was applied to the weight tensors of all convolutional layers (with the exception of downsampling convolutions and the first convolutional layer acting on the input image), while all biases and parameters of normalization layers were kept dense. Global sparsity is defined in relation to the sparse tensors only, i.e., it is the number of non-active (zero) positions in all sparse tensors as a fraction of the number of parameters in dense tensors of the same dimensions.
At a specific global sparsity, we compared our method (dynamic sparse) against six baselines:

Full dense: the original large and dense model;

Thin dense: the original model with thinner layers, sized to match dynamic sparse;

Static sparse: the original model initialized at the same sparsity level with a random sparsity pattern, then trained with connectivity (sparsity pattern) fixed;

Compressed sparse: compression of the original model by iteratively pruning and retraining it to the target sparsity (Zhu & Gupta, 2017);

DeepR: sparse model trained by using Deep Rewiring (Bellec et al., 2017);

SET: sparse model trained by using Sparse Evolutionary Training (SET) (Mocanu et al., 2018).
Appendix B compares our method against an additional static parameterization method based on weight tying: HashedNet (Chen et al., 2015).
Using the number of parameters to compare network sizes across sparse and non-sparse models can be misleading if the extra information needed to specify the connectivity structure of the sparse models is not taken into account. We therefore compare models that have the same size in bits, instead of the same number of weights. While the number of bits needed to specify the connectivity is implementation dependent, we assume a simple scheme where one bit is used for each position in the weight tensors to indicate whether that position is active (contains a nonzero weight). A sparse tensor is fully defined by this bitmask together with the nonzero weights. This scheme was previously used in CNN accelerators that natively operate on sparse structures (Aimar et al., 2018). For a network whose dense tensors contain $N$ weights at 32 bits each, a sparse version with sparsity $s$ would have a size of $32(1-s)N + N$ bits and would thus be equivalent to a thinner dense network with $(1 - s + 1/32)N$ weights. We use this formula to obtain the size of the only non-sparse baseline we have, thin dense, which will thus have more weights than the equivalently-sized sparse models.
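This size accounting can be made concrete with a short helper; the function name is ours:

```python
def equivalent_dense_width(n_positions, sparsity, bits_per_weight=32):
    """Size accounting from the text: a sparse tensor stores its nonzero
    weights plus a 1-bit-per-position mask, so its size in bits is
    bits_per_weight * (1 - s) * N + N. Dividing by bits_per_weight gives
    the number of weights in an equally-sized dense network."""
    total_bits = bits_per_weight * (1.0 - sparsity) * n_positions + n_positions
    return total_bits / bits_per_weight

# A 90%-sparse network over 1M positions is as big as a dense network with
# (1 - 0.9 + 1/32) * 1e6 ≈ 131250 weights.
width = equivalent_dense_width(1_000_000, 0.9)
```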
A recent work (Liu et al., 2018) shows that training small networks from scratch can match the accuracy of networks obtained through post-training pruning of larger networks; the authors show this is almost always the case if the small networks are trained long enough. To address potential concerns that the performance of our dynamic parameterization scheme could be matched by statically parameterized networks trained for more epochs, we always train the thin dense and static sparse baselines for double the number of epochs used to train our dynamic sparse models. This ensures that any superior accuracy achieved by our method cannot merely be due to its ability to converge faster during training. As we show in the results section, our dynamic parameterization scheme incurs minimal computational overhead, which means the thin dense and static sparse baselines are trained using significantly more computational resources than dynamic sparse.
Note that compressed sparse is a compression method that initially trains a large dense model and iteratively prunes it down, whereas all other baselines maintain the same model size throughout training. For compressed sparse, we train the large dense model for the same number of epochs used for dynamic sparse, and then iteratively and gradually prune it down across many additional training epochs; compressed sparse thus trains for more epochs than dynamic sparse. See Appendix A for the hyperparameters used in the experiments.
Table 1: Networks and datasets used in our experiments.

Dataset         CIFAR10     Imagenet
Model           WRN-28-2    ResNet-50
# Parameters    1.5M        25.6M

For brevity, architecture specifications omit batch normalization and activations. Pre-activation batch normalization was used in all cases. Convolutional (C) layers are specified with output size and kernel size, and max pooling (MaxPool) layers with kernel size. Brackets enclose residual blocks postfixed with repetition numbers; the downsampling convolution in the first block of a scale group is implied.
4 Experimental results
Figure 1: WRN-28-2 on CIFAR10. (a) Test accuracy plotted against the number of trainable parameters in the sparse models for different methods. Dashed lines mark the full dense model and models obtained through compression, whereas methods that maintain a constant parameter count throughout training and inference are shown with solid lines. Circular symbols mark the median of 5 runs, and error bars show the standard deviation. Parameter counts include all trainable parameters, i.e., parameters in sparse tensors plus all other dense tensors, such as those of batch normalization layers. (b) Breakdown of the final sparsities of the parameter tensors in the three residual blocks that emerged from our dynamic sparse parameterization algorithm (Algorithm 1) at different levels of global sparsity.

WRN-28-2 on CIFAR10: We ran experiments using a Wide ResNet model, WRN-28-2 (Zagoruyko & Komodakis, 2016), trained to classify CIFAR10 images (see Appendix A for details of implementation). We varied the level of global sparsity and evaluated the accuracy of different dynamic and static parameterization training methods. As shown in Figure 1, static sparse and thin dense significantly underperformed the compressed sparse model, as expected, whereas our method, dynamic sparse, performed slightly better on average. Deep rewiring significantly lagged all other methods. While the performance of SET was on par with compressed sparse, it lagged behind dynamic sparse at high sparsity levels; at low sparsity levels, SET largely closed the gap to compressed sparse. Even though the statically parameterized models static sparse and thin dense were trained for twice the number of epochs, they still failed to match the performance of our method or of SET. Keep in mind that thin dense even had more SGD-trainable weights than all the sparse models, as described in the methods section.
Our dynamic parameterization method automatically adjusts the sparsity of the parameter tensors in different layers by moving parameters across layers. We examined the sparsity patterns that emerged at different global sparsity levels and observed consistent trends: (a) larger parameter tensors tended to be sparser than smaller ones, and (b) deeper layers tended to be sparser than shallower ones. Figure 1 shows a breakdown of the sparsity levels in the different residual blocks at different global sparsity levels.
ResNet-50 on Imagenet: We also experimented with the ResNet-50 bottleneck architecture (He et al., 2015) trained on Imagenet (see Appendix A for details of implementation). We tested two global sparsity levels (Table 2). Models obtained using our method (dynamic sparse) outperformed models obtained using all other dynamic and static parameterization methods, and even slightly outperformed models obtained through post-training compression of a large dense model. We also list in Table 2 two representative structured pruning methods (see Appendix C), ThiNet (Luo et al., 2017) and Sparse Structure Selection (Huang & Wang, 2017), which, consistent with recent criticism (Liu et al., 2018), underperformed the static dense baselines. As with the previous experiments on WRN-28-2, reliable sparsity patterns across the parameter tensors in different layers emerged from dynamic parameter reallocation during training, displaying the same empirical trends described above (Figure 2).
Table 2: ResNet-50 on Imagenet. Accuracies of models trained by different reparameterization methods at two final overall sparsity levels (7.3M and 5.1M parameters), compared against the full dense model (25.6M parameters). Rows cover thin dense, static sparse, compression (also evaluated at 8.7M and 15.6M parameter counts), DeepR, SET, and dynamic sparse. (Numerical entries are not recoverable here.)
Table 3: Computational overhead of dynamic reparameterization methods, for the two benchmark settings (mean ± std):

DeepR           4.466 ± 0.358    5.636 ± 0.218
SET             1.087 ± 0.049    1.009 ± 0.002
Dynamic sparse  1.083 ± 0.051    1.005 ± 0.004
Computational overhead of dynamic reparameterization: We assessed the additional computational cost incurred by the reparameterization steps (Algorithm 1) during training, and compared our method with the existing dynamic sparse reparameterization techniques DeepR and SET (Table 3). Because both SET and our method reallocate parameters only intermittently (every few hundred training iterations), the computational overhead was negligible for the experiments presented here^4. DeepR, however, requires adding noise to gradient updates as well as reallocating parameters every training iteration, which led to a significantly larger overhead.

^4 Because of the rather negligible overhead, the reduced operation count from eliminating sorting operations did not amount to a substantial improvement over SET on GPUs. Our method's advantage over SET lies in its ability to produce better sparse models and to reallocate free parameters automatically (see Appendix E).
Disentangling the effects of dynamic reparameterization: Our dynamic parameter reallocation method consistently yields better accuracy than static parameterization methods, even though the latter were trained for more epochs and, in the case of thin dense, had more SGD-trainable parameters. The most immediate hypothesis to explain this is that our method discovers suitable sparse network structures that can be trained to reach high accuracies. To investigate whether the high performance of networks discovered by our method can be attributed solely to their sparse structure, we ran the following experiment on WRN-28-2 trained on CIFAR10: after training with our dynamic reallocation method, the structure (i.e. the positions of nonzero entries in the sparse parameter tensors) of the final network was retained, and this network was randomly reinitialized and retrained with the structure fixed (green bars in Fig. 3). Even though this network had the same structure as the final network found by our method, its training failed to reach the same accuracy.
One might argue that it is not just the network structure but also its initialization that allows it to reach high accuracies (Frankle & Carbin, 2018). To assess this argument, we took the final network structure found by our method as described above and initialized it with the same initial values used when training with our method. As shown in Fig. 3 (blue bars), the combination of final structure and original initialization still fell significantly short of the accuracy achieved by our dynamic parameter reallocation method, and its performance was not significantly different from training the same network with random initialization (green bars). As a control, we also show the static sparse case where the sparse network structure and its initialization were both random (red bars in Fig. 3); unsurprisingly, these networks performed worst. Similar trends hold for ResNet-50 trained on Imagenet, as shown in Fig. 3. All static networks, whether their structure and initialization were random or copied from networks trained using our dynamic parameterization method, were trained for double the number of epochs used by our method.
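The three statically parameterized controls described in this experiment can be sketched as follows. The function and dictionary-key names are ours, chosen to mirror the text; each resulting network would then be trained with its sparsity pattern held fixed:

```python
import numpy as np

def make_static_controls(init_theta, final_mask, rng):
    """Build the three controls: final structure with the original
    initialization, final structure with a fresh random initialization,
    and a fully random structure (same density) with a fresh initialization.
    The fresh-init scale (matching init_theta's std) is an assumption."""
    n_active = int(final_mask.sum())
    fresh = rng.standard_normal(init_theta.shape) * init_theta.std()
    rand_mask = np.zeros(final_mask.size, dtype=bool)
    rand_mask[rng.choice(final_mask.size, size=n_active, replace=False)] = True
    rand_mask = rand_mask.reshape(final_mask.shape)
    return {
        "structure + original init": init_theta * final_mask,  # blue bars
        "structure only":            fresh * final_mask,       # green bars
        "static sparse (random)":    fresh * rand_mask,        # red bars
    }

rng = np.random.default_rng(4)
theta0 = rng.standard_normal((16, 16))       # stand-in initial weights
mask = rng.random((16, 16)) < 0.2            # stand-in for a learned structure
controls = make_static_controls(theta0, mask, rng)
```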
These results indicate that the dynamics of parameter reallocation are themselves important for learning, as the success of the networks our method discovers cannot be attributed solely to their structure or initialization. For WRN-28-2, we experimented with stopping the parameter reallocation mechanism (i.e., fixing the network structure) at various points during training. As shown in Fig. 4, dynamic parameter reallocation does not need to be active for the entire course of training, but only for some initial epochs.
5 Discussion
In this work, we investigated the following problem: given a small, fixed budget of parameters for a deep residual CNN throughout training, how should it be trained to yield the best generalization performance? While this is an open-ended question, we showed that dynamic parameterization methods can achieve significantly better accuracies than static methods at the same model size. Dynamic parameterization methods have received relatively little attention, with the two principal techniques so far, SET and DeepR, applied only to relatively small and shallow networks. We showed that these techniques are indeed applicable to deep CNNs, with SET consistently outperforming DeepR while incurring a lower computational cost. We proposed a dynamic parameterization method that adaptively allocates free parameters across the network based on a simple heuristic; it yields better accuracies than previous dynamic parameterization methods and outperforms all the static parameterization methods we tested. In Appendix B, we show that our method also outperforms another static parameterization method based on weight tying, HashedNet (Chen et al., 2015). As we show in Appendix E, our method is additionally able to train networks at extreme sparsity levels where previous static and dynamic parameterization methods often fail catastrophically.
High-performance sparse networks are often obtained through post-training pruning of dense networks. Recent work examines how sparse networks can be trained directly using post-hoc information obtained from a pruned model. Liu et al. (2018) argue that it is the structure alone of the pruned model that matters, i.e., training a model with the same structure, starting from random weights, can reach the same level of accuracy as the pruned model. Other results (Frankle & Carbin, 2018) argue that a standalone sparse network can only be trained effectively if it copies both the pruned network's structure and the initial weights it had when it was part of the dense model. We performed experiments in the same spirit: we trained statically parameterized networks that copy only the structure, and networks that copy both the structure and the initial weight values, of the sparse networks trained using our scheme. Interestingly, neither matched the performance of sparse networks trained with our dynamic parameterization scheme. The value of our dynamic parameter reallocation scheme thus goes beyond discovering a good sparse network structure; the dynamics of the structure exploration process itself help gradient descent converge to better weights. Further work is needed to explain the mechanism underlying this phenomenon. One hypothesis is that the discontinuous jumps in network response when the structure changes provide the 'jolts' necessary to pull the network out of sharp minima that generalize badly (Keskar et al., 2016).
Structural degrees of freedom are qualitatively different from the degrees of freedom introduced by overparameterization. The latter can be directly exploited by gradient descent. Structural degrees of freedom, however, are explored using non-differentiable heuristics that interact only indirectly with the dynamics of gradient descent, for example when gradient descent pulls weights towards zero, causing the associated connections to be removed. Our results indicate that for residual CNNs, and as far as model size is concerned, we are better off allocating bits to describe and explore structural degrees of freedom in a reasonably sparse network than allocating them to conventional weights.
Besides model size, computational efficiency is also a primary concern. Current mainstream compute architectures such as CPUs and GPUs have trouble handling unstructured sparsity patterns efficiently. To maintain the standard CNN structure, various pruning techniques prune a trained model at the level of entire feature maps. Recent evidence suggests the resulting networks perform no better than conventionally trained thin networks (Liu et al., 2018), calling into question the value of such coarse pruning. In Appendix D, we show that our method can easily be extended to operate at an intermediate level of structured sparsity, that of kernel slices. Imposing this sparsity structure causes performance to degrade: the resulting networks perform only on par with statically parameterized thin dense networks trained for double the number of epochs.
In summary, our results indicate that for deep residual CNNs, it is possible to train sparse models directly to reach generalization performance comparable to sparse networks produced by iterative pruning of large dense models. Moreover, our dynamic parameterization method results in models that significantly outperform equivalentsized dense models. Exploring structural degrees of freedom during training is key and our method is the first that is able to fully explore these degrees of freedom using its ability to move parameters within and across layers. Our results do not contradict the common wisdom that extra degrees of freedom are needed while training deep networks, but they point to structural degrees of freedom as an alternative to the degrees of freedom introduced by overparameterization.
References
 Aimar et al. (2018) Aimar, A., Mostafa, H., Calabrese, E., Rios-Navarro, A., Tapiador-Morales, R., Lungu, I.-A., Milde, M. B., Corradi, F., Linares-Barranco, A., Liu, S.-C., et al. NullHop: A flexible convolutional neural network accelerator based on sparse representations of feature maps. IEEE Transactions on Neural Networks and Learning Systems, (99):1–13, 2018.
 Arjovsky et al. (2015) Arjovsky, M., Shah, A., and Bengio, Y. Unitary Evolution Recurrent Neural Networks. nov 2015. URL http://arxiv.org/abs/1511.06464.
 Bellec et al. (2017) Bellec, G., Kappel, D., Maass, W., and Legenstein, R. Deep Rewiring: Training very sparse deep networks. nov 2017. URL http://arxiv.org/abs/1711.05136.
 Brutzkus et al. (2017) Brutzkus, A., Globerson, A., Malach, E., and Shalev-Shwartz, S. SGD Learns Over-parameterized Networks that Provably Generalize on Linearly Separable Data. oct 2017. URL http://arxiv.org/abs/1710.10174.
 Bucilua et al. (2006) Bucilua, C., Caruana, R., and Niculescu-Mizil, A. Model Compression. Technical report, 2006. URL https://www.cs.cornell.edu/~caruana/compression.kdd06.pdf.
 Chen et al. (2015) Chen, W., Wilson, J. T., Tyree, S., Weinberger, K. Q., and Chen, Y. Compressing Neural Networks with the Hashing Trick. apr 2015. URL http://arxiv.org/abs/1504.04788.
 Choromanska et al. (2014) Choromanska, A., Henaff, M., Mathieu, M., Arous, G. B., and LeCun, Y. The Loss Surfaces of Multilayer Networks. nov 2014. URL http://arxiv.org/abs/1412.0233.
 Cooper (2018) Cooper, Y. The loss landscape of overparameterized neural networks. apr 2018. URL http://arxiv.org/abs/1804.10200.
 Dai et al. (2017) Dai, X., Yin, H., and Jha, N. K. NeST: A Neural Network Synthesis Tool Based on a Grow-and-Prune Paradigm. pp. 1–15, 2017. URL http://arxiv.org/abs/1711.02017.
 Dai et al. (2018) Dai, X., Yin, H., and Jha, N. K. Grow and Prune Compact, Fast, and Accurate LSTMs. may 2018. URL http://arxiv.org/abs/1805.11797.
 Dauphin et al. (2014) Dauphin, Y., Pascanu, R., Gulcehre, C., Cho, K., Ganguli, S., and Bengio, Y. Identifying and attacking the saddle point problem in high-dimensional non-convex optimization. arXiv, pp. 1–14, 2014. URL http://arxiv.org/abs/1406.2572.
 Denil et al. (2013) Denil, M., Shakibi, B., Dinh, L., Ranzato, M., and de Freitas, N. Predicting Parameters in Deep Learning. jun 2013. URL http://arxiv.org/abs/1306.0543.
 Dinh et al. (2017) Dinh, L., Pascanu, R., Bengio, S., and Bengio, Y. Sharp Minima Can Generalize For Deep Nets. 2017. ISSN 19387228. URL http://arxiv.org/abs/1703.04933.
 Frankle & Carbin (2018) Frankle, J. and Carbin, M. The Lottery Ticket Hypothesis: Finding Small, Trainable Neural Networks. mar 2018. URL http://arxiv.org/abs/1803.03635.
 Goodfellow et al. (2014) Goodfellow, I. J., Vinyals, O., and Saxe, A. M. Qualitatively characterizing neural network optimization problems. dec 2014. URL http://arxiv.org/abs/1412.6544.
 Han et al. (2015a) Han, S., Mao, H., and Dally, W. J. Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding. pp. 1–14, 2015a. URL http://arxiv.org/abs/1510.00149.
 Han et al. (2015b) Han, S., Pool, J., Tran, J., and Dally, W. J. Learning both Weights and Connections for Efficient Neural Networks. jun 2015b. URL http://arxiv.org/abs/1506.02626.
 He et al. (2015) He, K., Zhang, X., Ren, S., and Sun, J. Deep Residual Learning for Image Recognition. 2015. URL http://arxiv.org/abs/1512.03385.
 He et al. (2017) He, Y., Zhang, X., and Sun, J. Channel Pruning for Accelerating Very Deep Neural Networks. jul 2017. URL http://arxiv.org/abs/1707.06168.
 Hinton et al. (2015) Hinton, G., Vinyals, O., and Dean, J. Distilling the Knowledge in a Neural Network. pp. 1–9, 2015. URL http://arxiv.org/abs/1503.02531.
 Howard et al. (2017) Howard, A. G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., Andreetto, M., and Adam, H. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. apr 2017. URL http://arxiv.org/abs/1704.04861.
 Huang & Wang (2017) Huang, Z. and Wang, N. Data-Driven Sparse Structure Selection for Deep Neural Networks. jul 2017. URL https://arxiv.org/abs/1707.01213.
 Hubara et al. (2016) Hubara, I., Courbariaux, M., Soudry, D., El-Yaniv, R., and Bengio, Y. Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations. sep 2016. URL http://arxiv.org/abs/1609.07061.
 Im et al. (2016) Im, D. J., Tao, M., and Branson, K. An empirical analysis of the optimization of deep network loss surfaces. dec 2016. URL http://arxiv.org/abs/1612.04010.
 Jaderberg et al. (2014) Jaderberg, M., Vedaldi, A., and Zisserman, A. Speeding up Convolutional Neural Networks with Low Rank Expansions. may 2014. URL http://arxiv.org/abs/1405.3866.
 Keskar et al. (2016) Keskar, N. S., Mudigere, D., Nocedal, J., Smelyanskiy, M., and Tang, P. T. P. On LargeBatch Training for Deep Learning: Generalization Gap and Sharp Minima. sep 2016. URL http://arxiv.org/abs/1609.04836.
 Krizhevsky et al. (2012) Krizhevsky, A., Sutskever, I., and Hinton, G. E. ImageNet Classification with Deep Convolutional Neural Networks. Technical report, 2012.
 Lebedev & Lempitsky (2015) Lebedev, V. and Lempitsky, V. Fast ConvNets Using Group-wise Brain Damage. jun 2015. URL https://arxiv.org/abs/1506.02515.
 Li et al. (2018) Li, C., Farkhoor, H., Liu, R., and Yosinski, J. Measuring the Intrinsic Dimension of Objective Landscapes. apr 2018. URL http://arxiv.org/abs/1804.08838.
 Li et al. (2016) Li, H., Kadav, A., Durdanovic, I., Samet, H., and Graf, H. P. Pruning Filters for Efficient ConvNets. aug 2016. URL http://arxiv.org/abs/1608.08710.
 Liao & Poggio (2017) Liao, Q. and Poggio, T. Theory of Deep Learning II: Landscape of the Empirical Risk in Deep Learning. arXiv, mar 2017. URL http://arxiv.org/abs/1703.09833.
 Lin et al. (2013) Lin, M., Chen, Q., and Yan, S. Network In Network. arXiv preprint, 2013. URL http://arxiv.org/abs/1312.4400.
 Liu et al. (2017) Liu, Z., Li, J., Shen, Z., Huang, G., Yan, S., and Zhang, C. Learning Efficient Convolutional Networks through Network Slimming. aug 2017. URL https://arxiv.org/abs/1708.06519.
 Liu et al. (2018) Liu, Z., Sun, M., Zhou, T., Huang, G., and Darrell, T. Rethinking the Value of Network Pruning. oct 2018. URL http://arxiv.org/abs/1810.05270.
 Luo et al. (2017) Luo, J.H., Wu, J., and Lin, W. ThiNet: A Filter Level Pruning Method for Deep Neural Network Compression. jul 2017. URL http://arxiv.org/abs/1707.06342.
 McDonnell (2018) McDonnell, M. D. Training wide residual networks for deployment using a single bit for each weight. feb 2018. URL http://arxiv.org/abs/1802.08530.
 Mittal et al. (2018) Mittal, D., Bhardwaj, S., Khapra, M. M., and Ravindran, B. Recovering from Random Pruning: On the Plasticity of Deep Convolutional Neural Networks. jan 2018. URL http://arxiv.org/abs/1801.10447.
 Mocanu et al. (2018) Mocanu, D. C., Mocanu, E., Stone, P., Nguyen, P. H., Gibescu, M., and Liotta, A. Scalable training of artificial neural networks with adaptive sparse connectivity inspired by network science. Nature Communications, 9(1):2383, dec 2018. ISSN 2041-1723. doi: 10.1038/s41467-018-04316-3. URL http://www.nature.com/articles/s41467-018-04316-3.
 Moczulski et al. (2015) Moczulski, M., Denil, M., Appleyard, J., and de Freitas, N. ACDC: A Structured Efficient Linear Layer. nov 2015. URL http://arxiv.org/abs/1511.05946.
 Narang et al. (2017) Narang, S., Elsen, E., Diamos, G., and Sengupta, S. Exploring Sparsity in Recurrent Neural Networks. apr 2017. URL http://arxiv.org/abs/1704.05119.
 Novak et al. (2018) Novak, R., Bahri, Y., Abolafia, D. A., Pennington, J., and SohlDickstein, J. Sensitivity and Generalization in Neural Networks: an Empirical Study. feb 2018. URL http://arxiv.org/abs/1802.08760.
 Poggio et al. (2017) Poggio, T., Kawaguchi, K., Liao, Q., Miranda, B., Rosasco, L., Boix, X., Hidary, J., and Mhaskar, H. Theory of Deep Learning III: explaining the non-overfitting puzzle. 2017. URL http://arxiv.org/abs/1801.00173.
 Salimans & Kingma (2016) Salimans, T. and Kingma, D. P. Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks. feb 2016. URL http://arxiv.org/abs/1602.07868.
 Sifre & Mallat (2014) Sifre, L. and Mallat, S. Rigid-motion scattering for image classification. PhD thesis, 2014.
 Simonyan & Zisserman (2014) Simonyan, K. and Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. sep 2014. URL http://arxiv.org/abs/1409.1556.
 Sindhwani et al. (2015) Sindhwani, V., Sainath, T. N., and Kumar, S. Structured Transforms for Small-Footprint Deep Learning. oct 2015. URL http://arxiv.org/abs/1510.01722.
 Suau et al. (2018) Suau, X., Zappella, L., and Apostoloff, N. Network Compression using Correlation Analysis of Layer Responses. jul 2018. URL http://arxiv.org/abs/1807.10585.
 Thomas et al. (2018) Thomas, A. T., Gu, A., Dao, T., Rudra, A., and Ré, C. Learning invariance with compact transforms. pp. 1–7, 2018.
 Treister et al. (2018) Treister, E., Ruthotto, L., Sharoni, M., Zafrani, S., and Haber, E. LowCost Parameterizations of Deep Convolution Neural Networks. may 2018. URL http://arxiv.org/abs/1805.07821.
 Wen et al. (2016) Wen, W., Wu, C., Wang, Y., Chen, Y., and Li, H. Learning structured sparsity in deep neural networks. In Advances in Neural Information Processing Systems, pp. 2074–2082, 2016.
 Wu et al. (2017) Wu, L., Zhu, Z., and E, W. Towards Understanding Generalization of Deep Learning: Perspective of Loss Landscapes. jun 2017. URL http://arxiv.org/abs/1706.10239.
 Yang et al. (2014) Yang, Z., Moczulski, M., Denil, M., de Freitas, N., Smola, A., Song, L., and Wang, Z. Deep Fried Convnets. dec 2014. URL http://arxiv.org/abs/1412.7149.
 Zagoruyko & Komodakis (2016) Zagoruyko, S. and Komodakis, N. Wide Residual Networks. may 2016. URL http://arxiv.org/abs/1605.07146.
 Zhang et al. (2016) Zhang, C., Bengio, S., Hardt, M., Recht, B., and Vinyals, O. Understanding deep learning requires rethinking generalization. nov 2016. URL http://arxiv.org/abs/1611.03530.
 Zhang et al. (2018a) Zhang, C., Liao, Q., Rakhlin, A., Miranda, B., Golowich, N., and Poggio, T. Theory of Deep Learning IIb: Optimization Properties of SGD. jan 2018a. URL http://arxiv.org/abs/1801.02254.
 Zhang et al. (2018b) Zhang, T., Ye, S., Zhang, K., Tang, J., Wen, W., Fardad, M., and Wang, Y. A Systematic DNN Weight Pruning Framework using Alternating Direction Method of Multipliers. apr 2018b. URL http://arxiv.org/abs/1804.03294.
 Zhu & Gupta (2017) Zhu, M. and Gupta, S. To prune, or not to prune: exploring the efficacy of pruning for model compression. 2017. URL http://arxiv.org/abs/1710.01878.
Appendix A Details of implementation
We implemented all models and reparameterization mechanisms in PyTorch. Experiments were run on GPUs, and all sparse tensors were represented as dense tensors filtered by a binary mask. This was purely an implementational choice, made for ease of experimentation on available hardware and software; it does not realize the memory savings that sparsity permits. On a computing substrate optimized for sparse linear algebra, our method is expected to deliver the promised memory efficiency.
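As a concrete sketch of this representational choice, a sparse layer can be stored as a dense weight tensor multiplied elementwise by a fixed binary mask. The class and its names below are hypothetical illustrations, not our actual implementation.

```python
import torch


class MaskedLinear(torch.nn.Module):
    """Sparse-in-effect linear layer: a dense weight tensor filtered by a
    binary mask (hypothetical sketch of the representation described above)."""

    def __init__(self, in_features, out_features, sparsity=0.9):
        super().__init__()
        self.weight = torch.nn.Parameter(
            0.01 * torch.randn(out_features, in_features))
        # The mask is a non-trainable buffer; roughly `sparsity` of its
        # entries are zero.
        mask = (torch.rand(out_features, in_features) > sparsity).float()
        self.register_buffer("mask", mask)

    def forward(self, x):
        # Pruned weights are zeroed at every forward pass, but the dense
        # storage means no memory is actually saved.
        return torch.nn.functional.linear(x, self.weight * self.mask)


layer = MaskedLinear(100, 10, sparsity=0.9)
y = layer(torch.randn(4, 100))
```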
Table 4. Hyperparameters used in the experiments. Columns correspond to LeNet-300-100 on MNIST, WRN-28-2 on CIFAR10, and Resnet-50 on Imagenet, respectively.

| Hyperparameter | LeNet-300-100 (MNIST) | WRN-28-2 (CIFAR10) | Resnet-50 (Imagenet) |
|---|---|---|---|
| Hyperparameters for training | | | |
| Number of training epochs | 100 | 200 | 100 |
| Minibatch size | 100 | 100 | 256 |
| Momentum (Nesterov) | 0.9 | 0.9 | 0.9 |
| L1 regularization multiplier | 0.0001 | 0.0 | 0.0 |
| L2 regularization multiplier | 0.0 | 0.0005 | 0.0001 |
| Hyperparameters for sparse compression (compressed sparse) (Zhu & Gupta, 2017) | | | |
| Number of pruning iterations | 10 | 20 | 20 |
| Training epochs per pruning iteration | 2 | 2 | 2 |
| Number of training epochs post-pruning | 20 | 10 | 10 |
| Total number of pruning epochs | 40 | 50 | 50 |
| Hyperparameters for dynamic sparse reparameterization (dynamic sparse) (ours) | | | |
| Number of parameters to prune per reallocation step | 600 | 20,000 | 200,000 |
| Fractional tolerance of the pruned-parameter count | 0.1 | 0.1 | 0.1 |
| Initial pruning threshold | 0.001 | 0.001 | 0.001 |
| Hyperparameters for Sparse Evolutionary Training (SET) (Mocanu et al., 2018) | | | |
| Number of parameters pruned and regrown per step | 600 | 20,000 | 200,000 |
| Hyperparameters for Deep Rewiring (DeepR) (Bellec et al., 2017) | | | |
| L1 regularization multiplier | | | |
Training Hyperparameter settings for training are listed in the first block of Table 4. Standard mild data augmentation was used in all CIFAR10 experiments (random translation, cropping, and horizontal flipping) and in all Imagenet experiments (random cropping and horizontal flipping). The last linear layer of WRN-28-2 was always kept dense, as it contains a negligible number of parameters. The thin dense and static sparse baselines were trained for double the number of training epochs shown in Table 4.
Sparse compression baseline We compared our method against iterative pruning methods (Han et al., 2015b; Zhu & Gupta, 2017). We started from a full dense model trained with the hyperparameters in the first block of Table 4 and then gradually pruned the network to a target sparsity over a number of pruning iterations. Following Zhu & Gupta (2017), the pruning schedule we used was
$$ s_n = s_f \left( 1 - \left( 1 - \frac{n}{N} \right)^{3} \right), \qquad n = 1, \ldots, N \tag{2} $$
where $n$ indexes pruning steps, $N$ is the total number of pruning iterations, and $s_f$ is the target sparsity reached at the end of training. Thus, this baseline (labeled compressed sparse in the paper) was effectively trained for more iterations (the original training phase plus the compression phase) than our dynamic sparse method. Hyperparameter settings for sparse compression are listed in the second block of Table 4.
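This schedule can be sketched in a few lines, assuming the cubic ramp of Zhu & Gupta (2017) starting from a fully dense model (zero initial sparsity); `pruning_schedule` is a hypothetical helper name.

```python
def pruning_schedule(n, N, s_f):
    """Sparsity after pruning step n of N under a cubic ramp from a dense
    model (sketch): aggressive pruning early, gentle near the target."""
    return s_f * (1.0 - (1.0 - n / N) ** 3)


# With 10 pruning steps and a target sparsity of 0.9, the first step already
# removes about a quarter of the weights, and the ramp flattens at the end.
schedule = [pruning_schedule(n, 10, 0.9) for n in range(1, 11)]
```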
Dynamic reparameterization (ours) Hyperparameter settings for dynamic sparse reparameterization (Algorithm 1) are listed in the third block of Table 4.
Sparse Evolutionary Training (SET) Because the larger-scale experiments here (WRN-28-2 on CIFAR10 and Resnet-50 on Imagenet) were not attempted by Mocanu et al. (2018), the original paper provides no reparameterization settings for these cases. To make a fair comparison, we used the same hyperparameters as in our dynamic reparameterization scheme (third block of Table 4). At each reparameterization step, the weights in each layer were sorted by magnitude and the smallest fraction was pruned; an equal number of parameters were then randomly allocated within the same layer and initialized to zero. As a control, the total number of reallocated weights at each step was chosen to match our dynamic reparameterization method, as was the reparameterization schedule.
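A minimal sketch of one such SET-style rewiring step on a single layer, with hypothetical names and NumPy standing in for the actual training framework:

```python
import numpy as np


def set_step(weights, mask, prune_fraction=0.3, rng=None):
    """One SET-style rewiring step on a layer (sketch; names hypothetical).

    Prunes the smallest-magnitude fraction of currently active weights, then
    regrows the same number at randomly chosen inactive positions, with the
    regrown weights initialized to zero, as described above.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    n_prune = int(prune_fraction * mask.sum())
    # Prune: deactivate the n_prune smallest-magnitude active weights.
    mags = np.where(mask, np.abs(weights), np.inf)
    drop = np.argsort(mags, axis=None)[:n_prune]
    mask.flat[drop] = False
    weights.flat[drop] = 0.0
    # Grow: activate an equal number of random inactive positions; the new
    # weights start at zero, so the parameter count stays exactly fixed.
    grow = rng.choice(np.flatnonzero(~mask), size=n_prune, replace=False)
    mask.flat[grow] = True
    return weights, mask


rng = np.random.default_rng(1)
mask0 = rng.random((20, 20)) < 0.2           # ~20% of weights active
w0 = rng.standard_normal((20, 20)) * mask0
k0 = int(mask0.sum())
w1, mask1 = set_step(w0.copy(), mask0.copy())
```

The active-weight count is conserved by construction, since every pruned position is matched by a grown one in the same layer.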
Deep Rewiring (DeepR) The fourth block of Table 4 contains the hyperparameters for the DeepR experiments. We refer the reader to Bellec et al. (2017) for details of the deep rewiring algorithm and an explanation of its hyperparameters. We chose the DeepR hyperparameters for the different networks based on a parameter sweep.
Appendix B Comparison to hash nets
We also compared our dynamic sparse reparameterization method to a number of static dense reparameterization techniques, e.g. Denil et al. (2013); Yang et al. (2014); Moczulski et al. (2015); Sindhwani et al. (2015); Chen et al. (2015); Treister et al. (2018). Instead of sparsification, these methods impose structure on large parameter tensors through parameter sharing. Most of these methods have not been applied to convolutional layers, except for recent ones (Chen et al., 2015; Treister et al., 2018). We found that HashedNet (Chen et al., 2015) performed best among these static dense reparameterization methods, and therefore benchmarked our method against it. Instead of reparameterizing a parameter tensor as a sparse tensor with a reduced number of nonzero components, HashedNet scatters a small pool of free parameters across all positions of the parameter tensor through a random mapping computed by cheap hashing, resulting in a dense parameter tensor with shared components.
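To make the contrast concrete, here is a minimal sketch of HashedNet-style weight sharing; a seeded random index array stands in for the cheap hash function, and all names are hypothetical.

```python
import numpy as np


def hashed_weight(shape, free_params, seed=0):
    """Build a dense virtual weight matrix whose every entry is drawn from a
    small pool of shared free parameters via a random position-to-parameter
    mapping (a seeded RNG stands in for HashedNet's cheap hash)."""
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, len(free_params), size=shape)
    return np.asarray(free_params)[idx]


params = np.array([0.5, -1.0, 2.0])    # only 3 unique free parameters...
W = hashed_weight((64, 32), params)    # ...but a dense 64x32 weight matrix
```

The matrix is dense, so it composes with ordinary dense matrix multiplies, yet gradients for its 2048 entries accumulate onto only three trainable values.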
Results for LeNet-300-100 on MNIST and for WRN-28-2 on CIFAR10 are presented in Figure 5, and those for Resnet-50 on Imagenet in Table 5. At each global sparsity level of our method, we compared against a HashedNet with all reparameterized tensors hashed such that each had a fraction of unique parameters equal to our method's density. Our dynamic sparse method significantly outperformed HashedNet.
Table 5. Top-1 and top-5 accuracy (%) of Resnet-50 on Imagenet for HashedNet and our method at two parameter budgets. Bracketed values are the accuracy deficit relative to the full dense model.

| Final global sparsity (# parameters) | Top-1 (7.3M) | Top-5 (7.3M) | Top-1 (5.1M) | Top-5 (5.1M) |
|---|---|---|---|---|
| HashedNet | 70.0 [4.9] | 89.6 [2.8] | 66.9 [8.0] | 87.4 [5.0] |
| Dynamic sparse (ours) | 73.3 [1.6] | 92.4 [0.0] | 71.6 [3.3] | 90.5 [1.9] |
Appendix C A taxonomy of training methods that yield “sparse” deep CNNs
As an extension to Section 2 of the main text, here we elaborate on existing methods related to ours, how they compare and contrast with each other, and what features, apart from effectiveness, distinguish our approach from all previous ones. We confine the scope of comparison to training methods that produce smaller versions (i.e. ones with fewer parameters) of a given modern (i.e. post-AlexNet) deep convolutional neural network model. We list representative methods in Table 6 and classify them by three key features.
Table 6. Taxonomy of training methods that yield "sparse" deep CNNs, classified by the three key features discussed below.

| Method | Strict parameter budget throughout training | Granularity of sparsity | Automatic discovery of layerwise sparsity |
|---|---|---|---|
| Direct training methods: | | | |
| Dynamic sparse (ours) | yes | non-structured | yes |
| DeepR (Bellec et al., 2017) | yes | non-structured | no |
| SET (Mocanu et al., 2018) | yes | non-structured | no |
| Non-structured compression methods: | | | |
| | no | non-structured | yes |
| | no | non-structured | no |
| | no | non-structured | no |
| | no | non-structured | no |
| Structured compression methods: | | | |
| | no | channel | no |
| | no | channel | no |
| | no | channel/kernel/layer | yes |
| | no | channel | no |
| | no | channel | no |
| | no | channel | yes |
| | no | layer | yes |
| | no | channel | yes/no |
Strict parameter budget throughout training and inference This feature was discussed in depth in the main text. Most methods to date are compression techniques: they start training with a fully parameterized, dense model and then reduce the parameter count. To the best of our knowledge, only three methods, namely DeepR (Bellec et al., 2017), SET (Mocanu et al., 2018), and ours, strictly impose throughout the entire course of training a fixed, small parameter budget equal to the size of the final sparse model used for inference. We make a distinction between these direct training methods (first block of Table 6) and compression methods (second and third blocks of Table 6). An intermediate case is NeST (Dai et al., 2017; 2018), which starts training with a small network, grows it to a large size, and finally prunes it down again; since a fixed parameter footprint is not strictly imposed throughout training, we list NeST among the compression methods in the second block of Table 6.
This distinction is meaningful in two ways: (a) practically, direct training methods are more memory-efficient on an appropriate computing substrate, requiring parameter storage no larger than the final compressed model; (b) theoretically, these methods, if they perform on par with or better than compression methods (as this work suggests), shed light on an important open question: whether gross overparameterization during training is necessary for good generalization performance.
Granularity of sparsity The granularity of sparsity refers to the additional structure imposed on the placement of the nonzero entries of a sparsified parameter tensor. The finest-grained case, namely non-structured, allows each individual weight in a parameter tensor to be zero or nonzero independently. Early compression techniques, e.g. Han et al. (2015b), and more recent pruning-based compression methods built on them, e.g. Zhu & Gupta (2017), are non-structured (second block of Table 6), as are all direct training methods, including ours (first block of Table 6).
Non-structured sparsity cannot be fully exploited by mainstream compute devices such as GPUs. To tackle this problem, a class of compression methods, structured pruning methods (third block of Table 6), constrains "sparsity" to a much coarser granularity. Typically, pruning is performed at the level of an entire feature map, e.g. ThiNet (Luo et al., 2017), of whole layers, or even of entire residual blocks (Huang & Wang, 2017). This way, the compressed "sparse" model simply has smaller and/or fewer dense parameter tensors, and computation can be accelerated on GPUs in the same way as for dense neural networks.
These structured compression methods, however, did not make a useful baseline in this work, for two reasons. First, because they produce dense models, they are far less relevant to our method (non-structured, non-compression) than non-structured compression techniques that yield sparse models. Second, typical structured pruning methods substantially underperform non-structured ones (see Table 2 in the main text for two examples, ThiNet and SSS), and emerging evidence has called the fundamental value of structured pruning into question: Mittal et al. (2018) found that the channel pruning criteria used in a number of state-of-the-art structured pruning methods performed no better than random channel elimination, and Liu et al. (2018) found that fine-tuning in a number of state-of-the-art pruning methods fared no better than directly training a randomly initialized pruned model, which, in the case of channel/layer pruning, is simply a less wide and/or less deep dense model (see Table 2 in the main text for a comparison of ThiNet and SSS against thin dense).
In addition, we performed extra experiments in which we constrained our method to operate on networks with structured sparsity and obtained significantly worse results, see Appendix D.
Predefined versus automatically discovered sparsity levels across layers The last key feature (rightmost column of Table 6) in our classification is whether the sparsity levels of the network's different layers are automatically discovered during training or predefined by manual configuration. The value of automatic sparsification, as in our method, is twofold. First, it is conceptually more general, because parameter reallocation heuristics can be applied to diverse model architectures, whereas layer-specific configuration has to be cognizant of the network architecture, and at times also of the task to be learned. Second, it is practically more scalable, because it obviates manual configuration of layerwise sparsity, keeping the overhead of hyperparameter tuning constant rather than scaling with model depth/size. Beyond efficiency, extra experiments in Appendix E show how automatic parameter reallocation across layers contributes to our method's effectiveness.
In conclusion, our method is unique in that it:

strictly maintains a fixed parameter footprint throughout the entire course of training.

automatically discovers layerwise sparsity levels during training.
Appendix D Structured versus nonstructured sparsity
Table 7. Top-1 and top-5 accuracy (%) of Resnet-50 on Imagenet for structured (kernel-granularity) versus non-structured dynamic sparse reparameterization. Bracketed values are the accuracy deficit relative to the full dense model.

| Final overall sparsity (# parameters) | Top-1 (7.3M) | Top-5 (7.3M) | Top-1 (5.1M) | Top-5 (5.1M) |
|---|---|---|---|---|
| Thin dense | 72.4 [2.5] | 90.9 [1.5] | 70.7 [4.2] | 89.9 [2.5] |
| Dynamic sparse (kernel granularity) | 72.6 [2.3] | 91.0 [1.4] | 70.2 [4.7] | 89.8 [2.6] |
| Dynamic sparse (non-structured) | 73.3 [1.6] | 92.4 [0.0] | 71.6 [3.3] | 90.5 [1.9] |
We investigated how our method performs when constrained to train sparse models at a coarser granularity. Consider the weight tensor of a convolutional layer, of size $c_{out} \times c_{in} \times h \times w$, where $c_{out}$ and $c_{in}$ are the numbers of output and input channels, respectively, and $h \times w$ is the kernel size. Our method performs dynamic sparse reparameterization by pruning and reallocating individual weights of this 4-dimensional parameter tensor, the finest granularity. To adapt our procedure to coarse-grained sparsity over groups of parameters, we modified our algorithm (Algorithm 1 in the main text) in the following ways:

- The pruning step removed entire groups of weights by comparing their norms with the adaptive threshold.
- The adaptive threshold was updated based on the difference between the target number and the actual number of groups pruned/grown at each step.
- The growth step reallocated groups of weights within and across parameter tensors using the heuristic in Line 17 of Algorithm 1.
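The modified pruning step can be sketched as follows, assuming kernel-level groups and an L2 group norm (the helper name is hypothetical):

```python
import numpy as np


def prune_kernels(weight, threshold):
    """Prune whole kernels of a conv weight tensor shaped (c_out, c_in, h, w),
    zeroing every 2D kernel whose L2 norm falls below the adaptive threshold
    (sketch; assumes an L2 group norm)."""
    norms = np.sqrt((weight ** 2).sum(axis=(2, 3)))   # one norm per kernel
    keep = norms >= threshold                          # (c_out, c_in) mask
    return weight * keep[:, :, None, None], keep


w = np.random.default_rng(0).standard_normal((8, 4, 3, 3))
pruned, keep = prune_kernels(w, threshold=3.0)
```

Surviving kernels are left untouched; pruned kernels are zeroed as a unit, which is the coarser granularity evaluated in Table 7.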
We show results at kernel-level granularity (i.e. groups are kernels) in Figure 6 and Table 7, for WRN-28-2 on CIFAR10 and Resnet-50 on Imagenet, respectively. Enforcing kernel-level sparsity leads to significantly worse accuracy than non-structured sparsity. For WRN-28-2, kernel-level parameter reallocation still outperforms the thin dense baseline, though this advantage disappears as the level of sparsity decreases. Note that the thin dense baseline was always trained for double the number of epochs used to train the models with dynamic parameter reallocation.
When we further coarsened the granularity of sparsity to channel level (i.e. groups are slices that generate output feature maps), our method failed to produce performant models.
Appendix E Multilayer perceptrons and training at extreme sparsity levels
We carried out experiments on small multilayer perceptrons to assess whether our dynamic parameter reallocation method can effectively distribute parameters in small networks at extreme sparsity levels. We experimented with a simple LeNet-300-100 trained on MNIST; hyperparameters for the experiments are reported in Appendix A. The results are shown in Fig. 7. Other than pruning from a large dense model, ours is the only method capable of effectively training the network at the highest sparsity setting, as it automatically moves parameters between layers to realize per-layer sparsities that can be trained effectively. The per-layer sparsities discovered by our method are also shown in Fig. 7: our method automatically arrives at a top layer with much lower sparsity than the two hidden layers. Similar sparsity patterns were found through hand-tuning to improve the performance of DeepR (Bellec et al., 2017). All layers were initialized at the same sparsity level (equal to the global sparsity level). While hand-tuning the per-layer sparsities should allow SET and DeepR to learn at the highest sparsity setting, our method discovers the per-layer sparsities automatically and dispenses with such a tuning step.
Appendix F Full description of the dynamic parameter reallocation algorithm
Algorithm 1 in the main text informally describes our parameter reallocation scheme. In this appendix, we present a more rigorous description of the algorithm. Let all reparameterized weight tensors in the original network be denoted by $\{W^\ell\}$, where $\ell$ indexes layers. Let $N^\ell$ be the number of parameters in $W^\ell$, and $N = \sum_\ell N^\ell$ the total parameter count.
We sparsely reparameterize each tensor as $W^\ell = \psi(\theta^\ell; i^\ell)$, where the function $\psi$ places the components of the parameter vector $\theta^\ell$ into the positions of $W^\ell$ indexed by $i^\ell$, an ordered subset of $\{1, \ldots, N^\ell\}$, such that $(W^\ell)_{i^\ell_k} = (\theta^\ell)_k$. Let $M^\ell$ be the dimensionality of both $\theta^\ell$ and $i^\ell$, i.e. the number of nonzero weights in $W^\ell$. Define $s^\ell = 1 - M^\ell / N^\ell$ as the sparsity of $W^\ell$; global sparsity is then defined as $s = 1 - M/N$, where $M = \sum_\ell M^\ell$.
Throughout the whole course of training, we kept the global sparsity constant at a level specified by the hyperparameter $s$. The reparameterization was initialized by uniformly sampling, for each weight tensor, $M^\ell = \mathrm{round}\big((1 - s) N^\ell\big)$ positions at random, so that every tensor started at the global sparsity; the associated parameters were randomly initialized.
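The initialization just described amounts to the following sketch (hypothetical helper name; assumes each tensor's nonzero count is its dense size times the density, rounded):

```python
import numpy as np


def init_sparse_indices(layer_sizes, global_sparsity, seed=0):
    """For each weight tensor, sample uniformly at random the positions of
    its nonzero weights, so that every layer starts at the global sparsity
    (sketch; helper name hypothetical)."""
    rng = np.random.default_rng(seed)
    indices = []
    for n in layer_sizes:
        m = round((1.0 - global_sparsity) * n)  # nonzero count for this tensor
        indices.append(rng.choice(n, size=m, replace=False))
    return indices


# Dense sizes of a LeNet-300-100's three weight matrices, at 90% sparsity.
idx = init_sparse_indices([784 * 300, 300 * 100, 100 * 10], global_sparsity=0.9)
```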
Dynamic reparameterization was done periodically by repeating the following steps during training:

Train the model, under its current sparse reparameterization, for a fixed number of batch iterations;

Reallocate free parameters within and across weight tensors following Algorithm 2 to arrive at a new reparameterization.
The adaptive reallocation is in essence a two-step procedure: a global pruning step followed by tensor-wise growth. Specifically, our algorithm has the following key features:

Pruning was based on the magnitude of weights, comparing all parameters to a single global threshold; this makes the algorithm much more scalable than methods that rely on layer-specific pruning.

We made the threshold adaptive, subject to simple set-point control dynamics that ensured that roughly the target number of weights was pruned globally at each step. This is computationally cheaper than pruning exactly the smallest weights, which would require sorting all weights in the network.

Growth was tensor-specific and performed by uniformly sampling zero weights, thereby achieving a reallocation of parameters across layers. The heuristic guiding growth is
$$ g^\ell = K \, \frac{r^\ell}{\sum_{\ell'} r^{\ell'}} \tag{3} $$
where $g^\ell$ is the number of weights grown in tensor $\ell$, and $K$ and $r^\ell$ are the total pruned parameter count and the surviving (nonzero) parameter count of tensor $\ell$, respectively. This rule allocated more free parameters to weight tensors with more surviving entries, while keeping the global sparsity constant by balancing the numbers of parameters pruned and grown. (An exact match is not guaranteed, due to rounding errors in Eq. 3 and the possibility of a tensor's free parameter count exceeding its dense size after reallocation; in these cases, an extra step redistributed parameters randomly to other tensors, thereby assuring an exact global sparsity.)
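A proportional-allocation sketch of this growth heuristic (hypothetical names; the surplus handling here is a simplification of the random redistribution described above):

```python
def reallocate_growth(pruned_counts, surviving_counts):
    """Distribute the total pruned budget across tensors in proportion to
    each tensor's surviving (nonzero) parameter count (sketch of Eq. 3)."""
    K = sum(pruned_counts)                  # total weights pruned this step
    R = sum(surviving_counts)
    growth = [int(round(K * r / R)) for r in surviving_counts]
    # Rounding can leave a small surplus or deficit; absorb it in the last
    # tensor so the global parameter count stays exactly fixed (the full
    # algorithm instead redistributes randomly across tensors).
    growth[-1] += K - sum(growth)
    return growth


g = reallocate_growth(pruned_counts=[50, 120, 30],
                      surviving_counts=[500, 3000, 1500])
```

Here 200 pruned weights are regrown in proportion 500:3000:1500 across the three tensors, so tensors with more surviving entries receive more of the budget.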
The entire procedure is fully specified by the small set of hyperparameters listed in the third block of Table 4.