SlimSell: A Vectorizable Graph Representation for Breadth-First Search

10/19/2020
by   Maciej Besta, et al.

Vectorization and GPUs will profoundly change graph processing. Traditional graph algorithms tuned for 32- or 64-bit based memory accesses will be inefficient on architectures with 512-bit wide (or larger) instruction units that are already present in the Intel Knights Landing (KNL) manycore CPU. Anticipating this shift, we propose SlimSell: a vectorizable graph representation to accelerate Breadth-First Search (BFS) based on sparse-matrix dense-vector (SpMV) products. SlimSell extends and combines the state-of-the-art SIMD-friendly Sell-C-sigma matrix storage format with tropical, real, boolean, and sel-max semiring operations. The resulting design reduces the necessary storage (by up to 50%) and thus pressure on the memory subsystem. We augment SlimSell with the SlimWork and SlimChunk schemes that reduce the amount of work and improve load balance, further accelerating BFS. We evaluate all the schemes on Intel Haswell multicore CPUs, the state-of-the-art Intel Xeon Phi KNL manycore CPUs, and NVIDIA Tesla GPUs. Our experiments indicate which semiring offers the highest speedups for BFS and illustrate that SlimSell accelerates a tuned Graph500 BFS code by up to 33%. This work shows that vectorization can secure high performance in BFS based on SpMV products; the proposed principles and designs can be extended to other graph algorithms.
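To make the idea of BFS as a sequence of SpMV products concrete, the following is a minimal Python sketch over the tropical (min, +) semiring. It is an illustration only and not the authors' implementation: it uses plain adjacency lists rather than the Sell-C-sigma or SlimSell layouts, it is not vectorized, and the function name `bfs_tropical` and the input format are assumptions made for this example.

```python
import math

INF = math.inf  # additive identity of the tropical (min, +) semiring


def bfs_tropical(adj, root):
    """Hop distances from `root`, computed by repeated min-plus SpMV.

    `adj[v]` lists the neighbors of vertex v in an undirected graph
    (for a directed graph it would have to list in-neighbors).
    Each edge acts as a matrix entry of weight 1; missing entries are INF.
    This input format is a hypothetical choice for the sketch.
    """
    n = len(adj)
    dist = [INF] * n
    dist[root] = 0
    while True:
        new = dist[:]  # keeping the old value plays the role of a 0 on the diagonal
        for v in range(n):
            # Row v of the matrix "times" the vector: min over w of (1 + dist[w]).
            for w in adj[v]:
                cand = dist[w] + 1
                if cand < new[v]:
                    new[v] = cand
        if new == dist:  # fixed point reached: no distance improved
            return dist
        dist = new


# Small example: edges 0-1, 1-2, 0-3; vertex 4 is isolated.
example = [[1, 3], [0, 2], [1], [0], []]
print(bfs_tropical(example, 0))
```

Each pass of the outer loop corresponds to one SpMV product and extends the search frontier by one hop; unreachable vertices keep the distance `inf`. Swapping the (min, +) pair for another semiring changes what the vector encodes while the loop structure stays the same.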

research
07/23/2013

A unified sparse matrix data format for efficient general sparse matrix-vector multiply on modern processors with wide SIMD units

Sparse matrix-vector multiplication (spMVM) is the most time-consuming k...
research
01/21/2022

Bit-GraphBLAS: Bit-Level Optimizations of Matrix-Centric Graph Processing on GPU

In a general graph data structure like an adjacency matrix, when edges a...
research
04/06/2019

Load-Balanced Sparse MTTKRP on GPUs

Sparse matricized tensor times Khatri-Rao product (MTTKRP) is one of the...
research
04/28/2022

Black-Scholes Option Pricing on Intel CPUs and GPUs: Implementation on SYCL and Optimization Techniques

The Black-Scholes option pricing problem is one of the widely used finan...
research
10/29/2020

Log(Graph): A Near-Optimal High-Performance Graph Representation

Today's graphs used in domains such as machine learning or social networ...
research
03/10/2022

Heterogeneous Sparse Matrix-Vector Multiplication via Compressed Sparse Row Format

Sparse matrix-vector multiplication (SpMV) is one of the most important ...
research
07/16/2019

Coprocessors: failures and successes

The appearance and disappearance of coprocessors by integration into the...
