Spatial Re-parameterization for N:M Sparsity

06/09/2023
by Yuxin Zhang et al.

This paper presents a Spatial Re-parameterization (SpRe) method for N:M sparsity in CNNs. SpRe stems from an observation about the limited variety of spatial sparsity in N:M sparsity compared with unstructured sparsity. In particular, N:M sparsity exhibits a fixed sparsity rate across spatial positions, because its pattern mandates N non-zero components among every M successive weights along the input-channel dimension of convolution filters. In contrast, we observe that unstructured sparsity displays substantial variation in sparsity across spatial positions, which we experimentally verify to be crucial for its superior performance retention relative to N:M sparsity. SpRe therefore uses the spatial-sparsity distribution of unstructured sparsity to construct an extra branch alongside the original N:M branch at training time, allowing the N:M sparse network to maintain a spatial-sparsity distribution similar to that of unstructured sparsity. At inference time, the extra branch is re-parameterized into the main N:M branch, introducing neither distortion of the sparse pattern nor additional computation cost. Across various benchmarks, N:M sparsity trained with SpRe matches the performance of state-of-the-art unstructured sparsity methods. Code and models are anonymously available at <https://github.com/zyxxmu/SpRe>.
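
As a rough illustration of the two mechanisms the abstract describes, the PyTorch sketch below applies a 2:4 mask along the input-channel dimension of a convolution weight and folds a parallel training-time branch back into the main branch by weight addition. The names `nm_mask` and `SpatialReparamConv`, and the simplifying choice of letting the extra branch share the main branch's N:M support, are illustrative assumptions rather than the paper's actual construction, which derives the extra branch from the spatial-sparsity distribution of unstructured sparsity.

```python
# Minimal sketch (see assumptions above): an N:M mask along the input-channel
# dimension, plus a parallel training-time branch that is folded back into the
# main convolution by weight addition.
import torch
import torch.nn as nn
import torch.nn.functional as F


def nm_mask(weight: torch.Tensor, n: int = 2, m: int = 4) -> torch.Tensor:
    """Binary mask keeping the n largest-magnitude entries in every group of m
    consecutive weights along the input-channel dimension (requires in_c % m == 0)."""
    out_c, in_c, kh, kw = weight.shape
    groups = weight.permute(0, 2, 3, 1).reshape(-1, m)    # groups of m along in_c
    idx = groups.abs().topk(n, dim=1).indices             # kept positions per group
    mask = torch.zeros_like(groups).scatter_(1, idx, 1.0)
    return mask.reshape(out_c, kh, kw, in_c).permute(0, 3, 1, 2)


class SpatialReparamConv(nn.Module):
    """Training time: N:M-masked main branch plus an extra parallel branch whose
    output is added. Inference time: both branches fold into a single conv."""

    def __init__(self, cin: int, cout: int, k: int = 3, n: int = 2, m: int = 4):
        super().__init__()
        self.n, self.m = n, m
        self.main = nn.Conv2d(cin, cout, k, padding=k // 2, bias=False)
        self.extra = nn.Conv2d(cin, cout, k, padding=k // 2, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        mask = nm_mask(self.main.weight, self.n, self.m)   # support of the N:M pattern
        w_main = self.main.weight * mask                   # main N:M branch
        # Extra branch; sharing the main branch's support is a simplifying
        # assumption of this sketch, not SpRe's spatial-sparsity-driven design.
        w_extra = self.extra.weight * mask
        y = F.conv2d(x, w_main, padding=self.main.padding)
        return y + F.conv2d(x, w_extra, padding=self.main.padding)

    @torch.no_grad()
    def reparameterize(self) -> nn.Conv2d:
        """Two parallel convolutions of identical shape applied to the same input
        and summed equal one convolution with the summed weights; because both
        summands share the same N:M support, the fused kernel keeps the N:M pattern."""
        mask = nm_mask(self.main.weight, self.n, self.m)
        fused = nn.Conv2d(self.main.in_channels, self.main.out_channels,
                          self.main.kernel_size, padding=self.main.padding, bias=False)
        fused.weight.copy_((self.main.weight + self.extra.weight) * mask)
        return fused
```

In this sketch, calling `reparameterize()` after training returns a plain `nn.Conv2d` whose output matches the two-branch forward pass by linearity of convolution, so the extra branch adds no cost at inference.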

