Multi-threaded Sparse Matrix-Matrix Multiplication for Many-Core and GPU Architectures

01/09/2018
by Mehmet Deveci, et al.

Sparse matrix-matrix multiplication is a key kernel with applications in several domains, such as scientific computing and graph analysis. Several algorithms have been studied in the past for this foundational kernel. In this paper, we develop parallel algorithms for sparse matrix-matrix multiplication with a focus on performance portability across different high-performance computing architectures. The performance of these algorithms depends on the data structures they use. We compare different types of accumulators in these algorithms and demonstrate the performance differences between these data structures. Furthermore, we develop a meta-algorithm, kkSpGEMM, to choose the right algorithm and data structure based on the characteristics of the problem. We show performance comparisons on three architectures and demonstrate the need for the community to develop two-phase sparse matrix-matrix multiplication implementations for efficient reuse of the data structures involved.
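
To make the two-phase idea concrete, here is a minimal serial sketch in Python. It is not the paper's kkSpGEMM implementation. It assumes that "two-phase" means a symbolic phase that determines the sparsity structure of C = A * B, followed by a numeric phase that fills in the values, and that matrices are stored in CSR form. The function names `spgemm_symbolic` and `spgemm_numeric` are hypothetical, and a Python `set` and `dict` stand in for the hash-based accumulators the abstract alludes to.

```python
def spgemm_symbolic(a_ptr, a_idx, b_ptr, b_idx):
    """Phase 1 (assumed): compute the CSR row pointers of C from patterns only."""
    c_ptr = [0]
    for i in range(len(a_ptr) - 1):
        cols = set()  # set used as a pattern-only accumulator for row i
        for p in range(a_ptr[i], a_ptr[i + 1]):
            k = a_idx[p]
            # Every nonzero A[i, k] contributes the column pattern of B's row k.
            cols.update(b_idx[b_ptr[k]:b_ptr[k + 1]])
        c_ptr.append(c_ptr[-1] + len(cols))
    return c_ptr


def spgemm_numeric(a, b, c_ptr):
    """Phase 2 (assumed): fill column indices and values of C, reusing c_ptr."""
    a_ptr, a_idx, a_val = a
    b_ptr, b_idx, b_val = b
    c_idx = [0] * c_ptr[-1]
    c_val = [0.0] * c_ptr[-1]
    for i in range(len(a_ptr) - 1):
        acc = {}  # dict used as a hash accumulator: column -> partial sum
        for p in range(a_ptr[i], a_ptr[i + 1]):
            k = a_idx[p]
            for q in range(b_ptr[k], b_ptr[k + 1]):
                j = b_idx[q]
                acc[j] = acc.get(j, 0.0) + a_val[p] * b_val[q]
        # Write the row out in sorted column order at the preallocated offset.
        for off, j in enumerate(sorted(acc)):
            c_idx[c_ptr[i] + off] = j
            c_val[c_ptr[i] + off] = acc[j]
    return c_idx, c_val


# A = [[1, 0], [2, 3]] and B = [[0, 4], [5, 0]] in CSR form.
A = ([0, 1, 3], [0, 0, 1], [1.0, 2.0, 3.0])
B = ([0, 1, 2], [1, 0], [4.0, 5.0])

c_ptr = spgemm_symbolic(A[0], A[1], B[0], B[1])
c_idx, c_val = spgemm_numeric(A, B, c_ptr)

# Reuse: new values with the same pattern need only the numeric phase.
A2 = (A[0], A[1], [2.0, 4.0, 6.0])
c_idx2, c_val2 = spgemm_numeric(A2, B, c_ptr)
```

In this sketch, the second multiplication skips the symbolic phase entirely because the sparsity pattern of A is unchanged, which is one way to read the abstract's point about reusing data structures. A production version would parallelize over rows and choose the accumulator type per problem, which this example does not attempt.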


research
02/23/2023

A simple division-free algorithm for computing Pfaffians

We present a very simple algorithm for computing Pfaffians which uses no...
research
04/02/2018

Sparse Matrix-Matrix Multiplication on Multilevel Memory Architectures : Algorithms and Experiments

Architectures with multiple classes of memory media are becoming a commo...
research
02/26/2020

A Systematic Survey of General Sparse Matrix-Matrix Multiplication

SpGEMM (General Sparse Matrix-Matrix Multiplication) has attracted much ...
research
05/29/2018

Optimizing Sparse Matrix-Vector Multiplication on Emerging Many-Core Architectures

Sparse matrix vector multiplication (SpMV) is one of the most common ope...
research
02/24/2020

Optimizing High Performance Markov Clustering for Pre-Exascale Architectures

HipMCL is a high-performance distributed memory implementation of the po...
research
02/07/2018

High Performance Rearrangement and Multiplication Routines for Sparse Tensor Arithmetic

Researchers are increasingly incorporating numeric high-order data, i.e....
research
02/26/2020

Bandwidth-Optimized Parallel Algorithms for Sparse Matrix-Matrix Multiplication using Propagation Blocking

Sparse matrix-matrix multiplication (SpGEMM) is a widely used kernel in ...
