Learning N:M Fine-grained Structured Sparse Neural Networks From Scratch

02/08/2021
by Aojun Zhou, et al.

Sparsity in Deep Neural Networks (DNNs) has been widely studied to compress and accelerate models in resource-constrained environments. It can be broadly categorized into unstructured fine-grained sparsity, which zeroes out individual weights distributed across the network, and structured coarse-grained sparsity, which prunes blocks of sub-networks. Fine-grained sparsity can achieve a high compression ratio but is not hardware friendly and hence yields limited speed gains. Coarse-grained sparsity, on the other hand, cannot concurrently achieve both apparent acceleration on modern GPUs and decent accuracy. In this paper, we are the first to study training N:M fine-grained structured sparse networks from scratch, which combine the advantages of unstructured fine-grained sparsity and structured coarse-grained sparsity on specifically designed GPUs. For example, a 2:4 sparse network can achieve a 2x speed-up without a performance drop on Nvidia A100 GPUs. Furthermore, we propose a novel and effective ingredient, the sparse-refined straight-through estimator (SR-STE), to alleviate the negative influence of the approximated gradients computed by the vanilla STE during optimization. We also define a metric, Sparse Architecture Divergence (SAD), to measure how the sparse network's topology changes during training. Finally, we justify SR-STE's advantages with SAD and demonstrate its effectiveness through comprehensive experiments on various tasks. Source code and models are available at https://github.com/NM-sparsity/NM-sparsity.
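The abstract describes two key pieces: an N:M mask that keeps at most N non-zero weights in every group of M consecutive weights, and SR-STE, which refines the straight-through gradient by additionally decaying the currently pruned weights. Below is a minimal PyTorch sketch of both ideas under that reading; the helper names (nm_mask, SparseSRSTE) and the decay value are illustrative, and the authors' exact formulation should be taken from the linked repository rather than from this sketch.

```python
import torch


def nm_mask(weight, n=2, m=4):
    # Binary mask that keeps the n largest-magnitude entries in every
    # group of m consecutive weights (the flattened size must be divisible by m).
    groups = weight.abs().reshape(-1, m)
    _, pruned = torch.topk(groups, m - n, dim=1, largest=False)
    mask = torch.ones_like(groups)
    mask.scatter_(1, pruned, 0.0)
    return mask.reshape_as(weight)


class SparseSRSTE(torch.autograd.Function):
    # Forward: apply the N:M mask. Backward: straight-through gradient plus a
    # decay term on the pruned weights (one reading of the "sparse-refined" part).

    @staticmethod
    def forward(ctx, weight, n, m, decay):
        mask = nm_mask(weight, n, m)
        ctx.save_for_backward(weight, mask)
        ctx.decay = decay
        return weight * mask

    @staticmethod
    def backward(ctx, grad_output):
        weight, mask = ctx.saved_tensors
        # Vanilla STE would pass grad_output through unchanged; SR-STE also
        # pulls the pruned weights toward zero.
        grad_weight = grad_output + ctx.decay * (1.0 - mask) * weight
        return grad_weight, None, None, None


if __name__ == "__main__":
    w = torch.randn(8, 16, requires_grad=True)
    w_sparse = SparseSRSTE.apply(w, 2, 4, 2e-4)   # 2:4 sparse weights for the forward pass
    w_sparse.sum().backward()                     # dense w receives the SR-STE gradient
    print((w_sparse.detach().reshape(-1, 4) != 0).sum(dim=1))  # 2 non-zeros per group of 4
```

In a full training loop this masking would be applied to each linear or convolution weight at every step, with the decay kept small; the reference implementation in the repository above is the authoritative version.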


Related research

05/24/2017 · Exploring the Regularity of Sparse Structure in Convolutional Neural Networks
Sparsity helps reduce the computational complexity of deep neural networ...

05/30/2023 · Dynamic Sparsity Is Channel-Level Sparsity Learner
Sparse training has received an upsurging interest in machine learning d...

06/14/2022 · Learning Best Combination for Efficient N:M Sparsity
By forcing at most N out of M consecutive weights to be non-zero, the re...

10/08/2021 · LCS: Learning Compressible Subspaces for Adaptive Network Compression at Inference Time
When deploying deep learning models to a device, it is traditionally ass...

03/25/2022 · Deformable Butterfly: A Highly Structured and Sparse Linear Transform
We introduce a new kind of linear transform named Deformable Butterfly (...

02/16/2021 · Accelerated Sparse Neural Training: A Provable and Efficient Method to Find N:M Transposable Masks
Recently, researchers proposed pruning deep neural network weights (DNNs...

07/14/2023 · Learning Sparse Neural Networks with Identity Layers
The sparsity of Deep Neural Networks is well investigated to maximize th...
