FSCNN: A Fast Sparse Convolution Neural Network Inference System

12/17/2022
by Bo Ji, et al.

Convolutional neural networks (CNNs) have achieved remarkable success, but they typically come with high computation costs and numerous redundant weight parameters. To reduce FLOPs, structured pruning is a popular approach that removes entire hidden structures by introducing coarse-grained sparsity. Meanwhile, many pruning works instead leverage fine-grained sparsity (the nonzero weights are randomly distributed), but their sparse models lack a specially designed computing library to realize the potential speedup. In this technical report, we study and present an efficient convolutional neural network inference system that accelerates the forward pass by exploiting the fine-grained sparsity of compressed CNNs. Our FSCNN is built on a set of specially designed sparse data structures, operators, and associated algorithms. Experimentally, we validate that FSCNN outperforms the standard deep learning library PyTorch on popular CNN architectures such as VGG16 when the sparsity is sufficiently high. However, due to the memory-contiguity issue of sparse operators, FSCNN is typically not competitive with highly optimized dense operators. We therefore recommend coarse-grained (structured) sparsity for generic model compression.
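To make the role of such sparse data structures concrete, the sketch below shows one common way to implement a direct sparse convolution: the filter weights are stored in a CSR-like layout so the inner loop only visits nonzero entries. This is a minimal illustrative example, not the authors' FSCNN code; the names SparseFilter and sparse_conv2d, the weight layout, and the stride-1/no-padding assumptions are ours.

```cpp
// Illustrative direct sparse convolution (not the FSCNN implementation).
#include <vector>
#include <cstddef>

struct SparseFilter {
    // Filters for all output channels, flattened over (in_ch, kh, kw).
    std::vector<float> values;   // nonzero weights
    std::vector<int>   offsets;  // flattened index c*K*K + r*K + s of each nonzero
    std::vector<int>   row_ptr;  // row_ptr[oc] .. row_ptr[oc+1] delimit filter oc
};

// input: C x H x W (row-major); output: OC x H_out x W_out; stride 1, no padding.
void sparse_conv2d(const std::vector<float>& input, int C, int H, int W,
                   const SparseFilter& f, int K, int OC,
                   std::vector<float>& output) {
    const int H_out = H - K + 1, W_out = W - K + 1;
    output.assign(static_cast<std::size_t>(OC) * H_out * W_out, 0.0f);
    for (int oc = 0; oc < OC; ++oc) {
        // Only the nonzero weights of filter oc are visited.
        for (int nz = f.row_ptr[oc]; nz < f.row_ptr[oc + 1]; ++nz) {
            const float w = f.values[nz];
            const int off = f.offsets[nz];
            const int c = off / (K * K), r = (off / K) % K, s = off % K;
            // Accumulate this single weight's contribution over the output plane.
            for (int y = 0; y < H_out; ++y)
                for (int x = 0; x < W_out; ++x)
                    output[(oc * H_out + y) * W_out + x] +=
                        w * input[(c * H + (y + r)) * W + (x + s)];
        }
    }
}
```

The scatter-style loop also illustrates the contiguity issue noted above: each nonzero weight reads a strided window of the input, so memory access is less regular than in a dense, GEMM-based convolution.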


