Dynamic Sparse Training via Balancing the Exploration-Exploitation Trade-off

11/30/2022
by Shaoyi Huang, et al.

Over-parameterization of deep neural networks (DNNs) has yielded high prediction accuracy for many applications. Although effective, the large number of parameters hinders deployment on resource-limited devices and carries an outsized environmental impact. Sparse training (maintaining a fixed number of nonzero weights in each iteration) can significantly reduce training costs by shrinking the model size. However, existing sparse training methods mainly rely on random-based or greedy-based drop-and-grow strategies, which get stuck in local minima and yield low accuracy. In this work, to support explainable sparse training, we propose important-weight Exploitation and coverage Exploration to characterize Dynamic Sparse Training (DST-EE), and provide a quantitative analysis of these two metrics. We further design an acquisition function, provide theoretical guarantees for the proposed method, and clarify its convergence property. Experimental results show that sparse models (up to 98% sparsity) obtained by our method outperform state-of-the-art (SOTA) sparse training methods on a wide variety of deep learning tasks. On VGG-19 / CIFAR-100, ResNet-50 / CIFAR-10, and ResNet-50 / CIFAR-100, our method achieves even higher accuracy than dense models. On ResNet-50 / ImageNet, the proposed method achieves up to an 8.2% accuracy improvement over SOTA sparse training methods.

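For intuition, the sketch below shows one drop-and-grow update that balances exploitation and exploration in dynamic sparse training. The specific acquisition score (gradient magnitude plus a UCB-style coverage bonus) and all names (drop_and_grow, visit_counts, c) are illustrative assumptions for a minimal NumPy example, not the exact DST-EE formulation from the paper.

    # Minimal sketch of one drop-and-grow update in dynamic sparse training.
    # The acquisition score (gradient magnitude plus a coverage-based
    # exploration bonus) is an illustrative assumption, not DST-EE's exact rule.
    import numpy as np

    def drop_and_grow(weights, grads, mask, visit_counts, k, c=1.0, step=1):
        """weights, grads, mask, visit_counts: 1-D arrays of the same length.
        k: number of connections to drop and regrow in this update."""
        # Exploitation on the active set: drop the k active weights with the
        # smallest magnitude.
        active = np.flatnonzero(mask)
        drop = active[np.argsort(np.abs(weights[active]))[:k]]
        mask[drop] = 0

        # Acquisition over inactive weights: gradient magnitude (exploitation)
        # plus a UCB-style bonus that favors rarely grown weights (exploration).
        inactive = np.flatnonzero(mask == 0)
        score = np.abs(grads[inactive]) \
                + c * np.sqrt(np.log(step + 1) / (visit_counts[inactive] + 1))
        grow = inactive[np.argsort(score)[-k:]]

        mask[grow] = 1
        weights[grow] = 0.0          # regrown weights start at zero
        visit_counts[grow] += 1      # update coverage statistics
        return mask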

Related research

04/24/2023 · Neurogenesis Dynamics-inspired Spiking Neural Network Training Acceleration
Biologically inspired Spiking Neural Networks (SNNs) have attracted sign...

02/04/2021 · Do We Actually Need Dense Over-Parameterization? In-Time Over-Parameterization in Sparse Training
In this paper, we introduce a new perspective on training deep neural ne...

08/14/2023 · HyperSparse Neural Networks: Shifting Exploration to Exploitation through Adaptive Regularization
Sparse neural networks are a key factor in developing resource-efficient...

05/27/2022 · Spartan: Differentiable Sparsity via Regularized Transportation
We present Spartan, a method for training sparse neural network models w...

05/29/2019 · Less is More: An Exploration of Data Redundancy with Active Dataset Subsampling
Deep Neural Networks (DNNs) often rely on very large datasets for traini...

06/08/2023 · Magnitude Attention-based Dynamic Pruning
Existing pruning methods utilize the importance of each weight based on ...

02/18/2023 · Calibrating the Rigged Lottery: Making All Tickets Reliable
Although sparse training has been successfully used in various resource-...
