Sparse matrix-vector multiplication on GPGPU clusters: A new storage format and a scalable implementation

12/23/2011
by Moritz Kreutzer et al.

Sparse matrix-vector multiplication (spMVM) is the dominant operation in many sparse solvers. We investigate performance properties of spMVM with matrices of various sparsity patterns on the nVidia "Fermi" class of GPGPUs. A new "padded jagged diagonals storage" (pJDS) format is proposed which may substantially reduce the memory overhead intrinsic to the widespread ELLPACK-R scheme. In our test scenarios the pJDS format cuts the overall spMVM memory footprint on the GPGPU by up to 70%. Using a suitable performance model we identify performance bottlenecks on the node level that invalidate some types of matrix structures for efficient multi-GPGPU parallelization. For appropriate sparsity patterns we extend previous work on distributed-memory parallel spMVM to demonstrate a scalable hybrid MPI-GPGPU code, achieving efficient overlap of communication and computation.
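For context, the CUDA sketch below illustrates the ELLPACK-R baseline that the abstract refers to: one thread per matrix row, values and column indices padded to the longest row and stored column-major so that neighboring threads access neighboring addresses. This is only an illustrative sketch under those assumptions; the kernel name spmv_ellr and the parameters pitch and rowlen are invented here and are not taken from the paper, and the pJDS format proposed in the paper is not implemented by this code.

```cuda
// Illustrative ELLPACK-R style SpMV kernel (a sketch, not the paper's pJDS code).
// val/col hold the matrix padded to the width of the longest row and stored
// column-major with leading dimension "pitch", so consecutive threads (rows)
// read consecutive memory addresses. rowlen[] keeps the true number of
// nonzeros per row, letting each thread stop before the zero padding.
__global__ void spmv_ellr(int nrows, int pitch,
                          const double *__restrict__ val,
                          const int    *__restrict__ col,
                          const int    *__restrict__ rowlen,
                          const double *__restrict__ x,
                          double       *__restrict__ y)
{
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row >= nrows) return;

    double sum = 0.0;
    for (int j = 0; j < rowlen[row]; ++j) {
        int idx = j * pitch + row;      // column-major access: coalesced loads
        sum += val[idx] * x[col[idx]];
    }
    y[row] = sum;
}
```

A kernel like this would be launched with one thread per row, e.g. spmv_ellr<<<(nrows + 127) / 128, 128>>>(...). The pJDS idea described in the abstract keeps the same coalesced access pattern but sorts rows by descending nonzero count and pads only within small blocks of consecutive rows instead of to the global maximum, which is what reduces the stored and transferred zero fill relative to ELLPACK-R.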


Related research

- 09/29/2020: Performance Modeling of Streaming Kernels and Sparse Matrix-Vector Multiplication on A64FX
  The A64FX CPU powers the current number one supercomputer on the Top500 ...

- 11/07/2022: AlphaSparse: Generating High Performance SpMV Codes Directly from Sparse Matrices
  Sparse Matrix-Vector multiplication (SpMV) is an essential computational...

- 12/18/2018: MatRox: A Model-Based Algorithm with an Efficient Storage Format for Parallel HSS-Structured Matrix Approximations
  We present MatRox, a novel model-based algorithm and implementation of H...

- 09/05/2022: Orthogonal layers of parallelism in large-scale eigenvalue computations
  We address the communication overhead of distributed sparse matrix-(mult...

- 05/11/2021: Accelerating the SpMV kernel on standard CPUs by exploiting the partially diagonal structures
  Sparse Matrix Vector multiplication (SpMV) is one of basic building bloc...

- 12/10/2020: Efficient Distributed Transposition Of Large-Scale Multigraphs And High-Cardinality Sparse Matrices
  Graph-based representations underlie a wide range of scientific problems...

- 11/22/2020: Copernicus: Characterizing the Performance Implications of Compression Formats Used in Sparse Workloads
  Sparse matrices are the key ingredients of several application domains, ...
