Do We Actually Need Dense Over-Parameterization? In-Time Over-Parameterization in Sparse Training

02/04/2021
by Shiwei Liu, et al.

In this paper, we introduce a new perspective on training deep neural networks that achieves state-of-the-art performance without expensive dense over-parameterization, by proposing the concept of In-Time Over-Parameterization (ITOP) in sparse training. By starting from a random sparse network and continuously exploring sparse connectivities during training, we can perform over-parameterization in the space-time manifold, closing the gap in expressibility between sparse training and dense training. We further use ITOP to understand the underlying mechanism of Dynamic Sparse Training (DST) and show that the benefits of DST come from its ability to consider, across time, all possible parameters when searching for the optimal sparse connectivity. As long as sufficiently many parameters have been reliably explored during training, DST can outperform the dense neural network by a large margin. We present a series of experiments to support our conjecture and achieve state-of-the-art sparse training performance with ResNet-50 on ImageNet. More impressively, our method achieves dominant performance over the over-parameterization-based sparse methods at extreme sparsity levels. When trained on CIFAR-100, our method can match the performance of the dense model even at an extreme sparsity (98%).
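To make the idea concrete, below is a minimal NumPy sketch of the prune-and-grow loop that underlies Dynamic Sparse Training, in the spirit of SET-style magnitude pruning and random growth. It illustrates the In-Time Over-Parameterization view rather than the authors' implementation: the instantaneous sparsity of the mask stays fixed at every step, yet the cumulative set of weights that have ever been active keeps growing. The layer size, the 98% sparsity, the exploration rate, and the helper name explore_step are illustrative assumptions, not values from the paper.

# Minimal sketch of a prune-and-grow loop behind Dynamic Sparse Training.
# Illustrative only: magnitude prune / random grow on one flattened weight
# tensor, with the cumulative "explored" set tracked to show the ITOP idea.
import numpy as np

rng = np.random.default_rng(0)

n_weights = 1000          # size of one (flattened) weight tensor (assumed)
density = 0.02            # 98% sparsity, as in the extreme-sparsity setting
n_active = int(n_weights * density)
prune_frac = 0.3          # fraction of active weights replaced per step (assumed)

# Random sparse initialization: a fixed budget of active connections.
weights = rng.normal(size=n_weights)
mask = np.zeros(n_weights, dtype=bool)
mask[rng.choice(n_weights, n_active, replace=False)] = True

explored = mask.copy()    # every index that has ever been active

def explore_step(weights, mask):
    """Drop the lowest-magnitude active weights, grow the same number at random."""
    active = np.flatnonzero(mask)
    k = int(len(active) * prune_frac)
    # Prune: remove the k active weights with the smallest magnitude.
    drop = active[np.argsort(np.abs(weights[active]))[:k]]
    mask[drop] = False
    weights[drop] = 0.0
    # Grow: activate k currently inactive positions, re-initialized fresh.
    inactive = np.flatnonzero(~mask)
    grow = rng.choice(inactive, k, replace=False)
    mask[grow] = True
    weights[grow] = rng.normal(scale=0.01, size=k)
    return weights, mask

for step in range(1, 201):
    # (A real run would take gradient steps on the masked weights here.)
    weights, mask = explore_step(weights, mask)
    explored |= mask
    if step % 50 == 0:
        print(f"step {step:3d}: instantaneous density = {mask.mean():.2%}, "
              f"explored so far = {explored.mean():.2%}")

In a full training run, gradient updates on the masked weights would occur between exploration steps; the printed "explored so far" fraction corresponds to the quantity ITOP argues must be large enough for sparse training to match or exceed dense training.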

Related research

06/28/2021  FreeTickets: Accurate, Robust and Efficient Deep Ensemble by Training with Dynamic Sparsity
Recent works on sparse neural networks have demonstrated that it is poss...

11/30/2022  Dynamic Sparse Training via Balancing the Exploration-Exploitation Trade-off
Over-parameterization of deep neural networks (DNNs) has shown high pred...

01/18/2022  Design Space Exploration of Dense and Sparse Mapping Schemes for RRAM Architectures
The impact of device and circuit-level effects in mixed-signal Resistive...

01/22/2021  Selfish Sparse RNN Training
Sparse neural networks have been widely applied to reduce the necessary ...

05/30/2022  Superposing Many Tickets into One: A Performance Booster for Sparse Neural Network Training
Recent works on sparse neural network training (sparse training) have sh...

07/10/2019  Sparse Networks from Scratch: Faster Training without Losing Performance
We demonstrate the possibility of what we call sparse learning: accelera...

02/15/2019  Parameter Efficient Training of Deep Convolutional Neural Networks by Dynamic Sparse Reparameterization
Deep neural networks are typically highly over-parameterized with prunin...
