SparseRT: Accelerating Unstructured Sparsity on GPUs for Deep Learning Inference

08/26/2020
by Ziheng Wang, et al.

In recent years, there has been a flurry of research in deep neural network pruning and compression. Early approaches prune weights individually. However, it is difficult to take advantage of the resulting unstructured sparsity patterns on modern hardware like GPUs. As a result, pruning strategies which impose sparsity structure on the weights have become more popular. However, these structured pruning approaches typically lead to higher accuracy losses than unstructured pruning. In this paper, we present SparseRT, a code generator that leverages unstructured sparsity to accelerate sparse linear algebra operations in deep learning inference on GPUs. For 1x1 convolutions and fully connected layers, we demonstrate a geometric mean speedup of 3.4x over the equivalent dense computation at 90% sparsity, evaluated on hundreds of test cases in deep learning. For sparse 3x3 convolutions, we show speedups of over 5x on use cases in ResNet-50.
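The core computation being accelerated is a sparse-matrix times dense-matrix product (SpMM): a pruned 1x1 convolution (or fully connected layer) applied to an activation tensor reduces to multiplying a sparse weight matrix by a dense activation matrix. Below is a minimal NumPy/SciPy sketch of that reduction; the layer shapes, the 90% sparsity level, and the variable names are illustrative assumptions, and the SciPy call merely stands in for the specialized GPU kernels that SparseRT generates.

```python
# Minimal sketch: a pruned 1x1 convolution is a sparse-matrix x dense-matrix
# product (SpMM). Shapes and sparsity level are illustrative assumptions.
import numpy as np
from scipy import sparse

C_in, C_out, H, W = 256, 256, 56, 56   # hypothetical layer dimensions
density = 0.1                          # 90% of weights pruned

# Dense 1x1 conv weights (C_out x C_in), randomly pruned to 90% sparsity.
rng = np.random.default_rng(0)
weights = rng.standard_normal((C_out, C_in))
weights *= rng.random((C_out, C_in)) < density

# Input activations, flattened to a (C_in, H*W) dense matrix.
activations = rng.standard_normal((C_in, H * W))

# Dense reference: a 1x1 convolution is just this matrix multiply.
dense_out = weights @ activations

# Sparse version: store the pruned weights in CSR format and run SpMM.
# SparseRT would instead emit a GPU kernel specialized to this particular
# sparsity pattern; scipy stands in for that generated kernel here.
sparse_weights = sparse.csr_matrix(weights)
sparse_out = sparse_weights @ activations

assert np.allclose(dense_out, sparse_out)
print("nonzero weight fraction:", sparse_weights.nnz / (C_out * C_in))
```

The same reduction applies to fully connected layers directly; 3x3 convolutions require an additional lowering step (e.g., an im2col-style transformation) before the sparse weights can be applied as a matrix product.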
