Coarsening the Granularity: Towards Structurally Sparse Lottery Tickets

02/09/2022
by Tianlong Chen, et al.

The lottery ticket hypothesis (LTH) has shown that dense models contain highly sparse subnetworks (i.e., winning tickets) that can be trained in isolation to match full accuracy. Despite many exciting efforts, one piece of "common sense" is seldom challenged: a winning ticket is found by iterative magnitude pruning (IMP), and hence the resulting pruned subnetworks have only unstructured sparsity. That gap limits the appeal of winning tickets in practice, since highly irregular sparse patterns are challenging to accelerate on hardware. Meanwhile, directly substituting structured pruning for unstructured pruning in IMP damages performance more severely and usually fails to locate winning tickets. In this paper, we demonstrate the first positive result that a structurally sparse winning ticket can be effectively found in general. The core idea is to append "post-processing techniques" after each round of (unstructured) IMP to enforce the formation of structural sparsity. Specifically, we first "re-fill" pruned elements back into channels deemed important, and then "re-group" the non-zero elements to create flexible group-wise structural patterns. Both our identified channel- and group-wise structural subnetworks win the lottery, with substantial inference speedups readily supported by existing hardware. Extensive experiments, conducted on diverse datasets across multiple network backbones, consistently validate our proposal, showing that the hardware acceleration roadblock of LTH is now removed. Specifically, the structural winning tickets obtain up to 64.93% savings at 36% sparsity while maintaining comparable accuracy. Code is available at https://github.com/VITA-Group/Structure-LTH.
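To make the "re-fill" step concrete, below is a minimal sketch, not the authors' released implementation (see the repository above for that). It assumes a binary pruning mask produced by one round of unstructured IMP, scores each output channel by the L1 norm of its surviving weights, and uses a hypothetical channel_keep_ratio parameter: the top-scoring channels are restored to fully dense, the rest are pruned entirely, and the result is a channel-wise structured mask.

    import torch

    def refill_channels(weight, mask, channel_keep_ratio=0.25):
        # Score each output channel by the L1 norm of its surviving (unpruned) weights.
        scores = (weight.abs() * mask).flatten(1).sum(dim=1)
        # Keep the top-scoring channels; the ratio here is an illustrative knob,
        # not a value taken from the paper.
        num_keep = max(1, int(channel_keep_ratio * weight.shape[0]))
        keep = torch.zeros(weight.shape[0], dtype=torch.bool, device=weight.device)
        keep[torch.topk(scores, num_keep).indices] = True
        # "Re-fill": make the kept channels fully dense again and prune the
        # remaining channels entirely, yielding a channel-wise structured mask.
        new_mask = torch.zeros_like(mask)
        new_mask[keep] = 1.0
        return new_mask

    # Example: applied to one convolutional layer after an IMP round.
    conv = torch.nn.Conv2d(64, 128, kernel_size=3)
    imp_mask = (torch.rand_like(conv.weight) > 0.8).float()  # stand-in for an IMP mask
    structured_mask = refill_channels(conv.weight.data, imp_mask)
    conv.weight.data.mul_(structured_mask)

The subsequent "re-group" step, which rearranges the remaining non-zero elements into dense group-wise blocks, is not shown in this sketch.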



Related research

Dynamic Sparsity Is Channel-Level Sparsity Learner (05/30/2023)
Sparse training has received an upsurging interest in machine learning d...

Dynamic Sparse Training with Structured Sparsity (05/03/2023)
DST methods achieve state-of-the-art results in sparse neural network tr...

SparseRT: Accelerating Unstructured Sparsity on GPUs for Deep Learning Inference (08/26/2020)
In recent years, there has been a flurry of research in deep neural netw...

The Elastic Lottery Ticket Hypothesis (03/30/2021)
Lottery Ticket Hypothesis raises keen attention to identifying sparse tr...

DepGraph: Towards Any Structural Pruning (01/30/2023)
Structural pruning enables model acceleration by removing structurally-g...

Accelerating Sparse DNN Models without Hardware-Support via Tile-Wise Sparsity (08/29/2020)
Network pruning can reduce the high computation cost of deep neural netw...

Spatial Re-parameterization for N:M Sparsity (06/09/2023)
This paper presents a Spatial Re-parameterization (SpRe) method for the ...
