Sparse GPU Kernels for Deep Learning

06/18/2020
by   Trevor Gale, et al.
11

Scientific workloads have traditionally exploited high levels of sparsity to accelerate computation and reduce memory requirements. While deep neural networks can be made sparse, achieving practical speedups on GPUs is difficult because these applications have relatively moderate levels of sparsity that are not sufficient for existing sparse kernels to outperform their dense counterparts. In this work, we study sparse matrices from deep learning applications and identify favorable properties that can be exploited to accelerate computation. Based on these insights, we develop high-performance GPU kernels for two sparse matrix operations widely applicable in neural networks: sparse matrix-dense matrix multiplication and sampled dense-dense matrix multiplication. Our kernels reach 27 V100 GPUs. Using our kernels, we demonstrate sparse Transformer and MobileNet models that achieve 1.2-2.1x speedups and up to 12.8x memory savings without sacrificing accuracy.

READ FULL TEXT

page 7

page 8

page 9

page 11

research
09/29/2020

Accelerating Sparse Matrix-Matrix Multiplication with GPU Tensor Cores

Sparse general matrix-matrix multiplication (spGEMM) is an essential com...
research
07/28/2020

At-Scale Sparse Deep Neural Network Inference with Efficient GPU Implementation

This paper presents GPU performance optimization and scaling results for...
research
03/29/2023

PopSparse: Accelerated block sparse matrix multiplication on IPU

Reducing the computational cost of running large scale neural networks u...
research
06/30/2020

Adaptive SpMV/SpMSpV on GPUs for Input Vectors of Varied Sparsity

Despite numerous efforts for optimizing the performance of Sparse Matrix...
research
06/30/2021

Efficient Sparse Matrix Kernels based on Adaptive Workload-Balancing and Parallel-Reduction

Sparse matrix-vector and matrix-matrix multiplication (SpMV and SpMM) ar...
research
05/29/2020

Efficient Sparse-Dense Matrix-Matrix Multiplication on GPUs Using the Customized Sparse Storage Format

Multiplication of a sparse matrix to a dense matrix (SpDM) is widely use...
research
12/10/2022

Optimized Sparse Matrix Operations for Reverse Mode Automatic Differentiation

Sparse matrix representations are ubiquitous in computational science an...

Please sign up or login with your details

Forgot password? Click here to reset