Design Principles for Sparse Matrix Multiplication on the GPU

03/22/2018
by Carl Yang, et al.

We implement two novel algorithms for sparse-matrix dense-matrix multiplication (SpMM) on the GPU. Our algorithms expect the sparse input in the popular compressed-sparse-row (CSR) format and thus do not require expensive format conversion. While previous SpMM work concentrates on thread-level parallelism, we additionally focus on latency hiding with instruction-level parallelism and load-balancing. We show, both theoretically and experimentally, that the proposed SpMM is a better fit for the GPU than previous approaches. We identify a key memory access pattern that allows efficient access into both input and output matrices and is crucial to getting excellent performance on SpMM. By combining these two ingredients---(i) merge-based load-balancing and (ii) row-major coalesced memory access---we demonstrate a 3.6x peak speedup and a 23.5% geomean speedup over state-of-the-art SpMM implementations on real-world datasets.
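
To make the CSR and row-major terminology above concrete, here is a minimal sequential sketch of SpMM in Python. It is not the authors' GPU kernel and does not show merge-based load-balancing; the function name `csr_spmm` and its parameters are hypothetical, chosen only for illustration. The sketch assumes the dense matrices B and C are stored row-major as flat lists, so the innermost loop walks contiguous memory. On a GPU, neighbouring threads reading those addresses is what would make the accesses coalesced.

```python
def csr_spmm(row_ptr, col_idx, vals, B, n_cols):
    """Compute C = A @ B with A in CSR form and B, C as row-major flat lists.

    row_ptr, col_idx, vals: the usual CSR arrays of the sparse matrix A.
    B: dense matrix with n_cols columns, flattened row by row.
    (Hypothetical helper for illustration; not the paper's implementation.)
    """
    n_rows = len(row_ptr) - 1
    C = [0.0] * (n_rows * n_cols)
    for i in range(n_rows):
        c_row = i * n_cols                      # start of row i in C
        for k in range(row_ptr[i], row_ptr[i + 1]):
            a = vals[k]                         # nonzero A[i, col_idx[k]]
            b_row = col_idx[k] * n_cols         # start of the matching row in B
            for j in range(n_cols):
                # Consecutive j touch consecutive addresses in both B and C.
                C[c_row + j] += a * B[b_row + j]
    return C


# A = [[1, 0, 2],
#      [0, 0, 3]]  stored in CSR
row_ptr = [0, 2, 3]
col_idx = [0, 2, 2]
vals = [1.0, 2.0, 3.0]

# B is 3x2, row-major: [[1, 2], [3, 4], [5, 6]]
B = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

C = csr_spmm(row_ptr, col_idx, vals, B, 2)
print(C)  # [11.0, 14.0, 15.0, 18.0], i.e. [[11, 14], [15, 18]]
```

Because each row's work in this loop is proportional to its number of nonzeros, assigning whole rows to workers can be badly unbalanced when row lengths vary. That imbalance is what the load-balancing ingredient mentioned in the abstract is meant to address.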

research
07/07/2020

GE-SpMM: General-purpose Sparse Matrix-Matrix Multiplication on GPUs for Graph Neural Networks

Graph Neural Networks (GNNs) have achieved significant improvements in v...
research
06/06/2023

Towards Memory-Efficient Training for Extremely Large Output Spaces – Learning with 500k Labels on a Single Commodity GPU

In classification problems with large output spaces (up to millions of l...
research
06/02/2019

Sparse Matrix to Matrix Multiplication: A Representation and Architecture for Acceleration (long version)

Accelerators for sparse matrix multiplication are important components i...
research
02/20/2020

SpArch: Efficient Architecture for Sparse Matrix Multiplication

Generalized Sparse Matrix-Matrix Multiplication (SpGEMM) is a ubiquitous...
research
07/24/2023

Entropy Maximization in Sparse Matrix by Vector Multiplication (max_E SpMV)

The peak performance of any SpMV depends primarily on the available memo...
research
06/14/2022

Accelerating CPU-Based Sparse General Matrix Multiplication With Binary Row Merging

Sparse general matrix multiplication (SpGEMM) is a fundamental building ...
research
03/15/2022

Distributed-Memory Sparse Kernels for Machine Learning

Sampled Dense Times Dense Matrix Multiplication (SDDMM) and Sparse Times...