SIMD-X: Programming and Processing of Graph Algorithms on GPUs

12/10/2018
by Hang Liu, et al.

With high computation power and memory bandwidth, graphics processing units (GPUs) lend themselves to accelerating data-intensive analytics, especially when such applications fit the single instruction multiple data (SIMD) model. However, graph algorithms such as breadth-first search and k-core often fail to take full advantage of GPUs because of irregular memory access and control flow. To address this challenge, we have developed SIMD-X, for programming and processing of single instruction multiple, complex, data on GPUs. Specifically, the new Active-Compute-Combine (ACC) model not only makes programming easier, but more importantly creates opportunities for system-level optimizations. To this end, SIMD-X utilizes just-in-time task management, which filters out inactive vertices at runtime and maps tasks to different numbers of GPU cores in pursuit of workload balancing. In addition, SIMD-X leverages push-pull based kernel fusion that, with the help of a new deadlock-free global barrier, reduces a large number of computation kernels to very few. Using SIMD-X, a user can program a graph algorithm in tens of lines of code, while achieving 3×, 6×, 24×, and 3× speedups over Gunrock, Galois, CuSha, and Ligra, respectively.
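To make the Active-Compute-Combine idea concrete, here is a minimal CPU-side sketch in Python of how breadth-first search might be phrased as three small user functions. It is an illustration only, not the SIMD-X API: the names `acc_run`, `active`, `compute`, and `combine`, their signatures, and the sequential driver loop are all assumptions made for this example, and none of the GPU-specific machinery (task management, kernel fusion, global barrier) is modeled.

```python
INF = float("inf")  # marks vertices that have not been reached


def acc_run(graph, metadata, active, compute, combine):
    """Hypothetical driver: repeat active -> compute -> combine until no vertex is active.

    graph is an adjacency list (graph[u] lists the out-neighbors of u);
    metadata holds one value per vertex and is updated in place.
    """
    iteration = 0
    while True:
        # Active: keep only the vertices that have work to do this iteration.
        frontier = [v for v in range(len(graph)) if active(v, metadata, iteration)]
        if not frontier:
            return metadata
        # Compute: each edge leaving an active vertex proposes an update.
        updates = {}
        for u in frontier:
            for v in graph[u]:
                candidate = compute(u, v, metadata)
                # Combine: merge several proposals for the same destination.
                updates[v] = candidate if v not in updates else combine(updates[v], candidate)
        # Combine the merged proposal with the value the vertex already holds.
        for v, value in updates.items():
            metadata[v] = combine(metadata[v], value)
        iteration += 1


def bfs(graph, source):
    """BFS levels expressed with the three hypothetical ACC callbacks."""
    level = [INF] * len(graph)
    level[source] = 0
    return acc_run(
        graph,
        level,
        active=lambda v, lvl, it: lvl[v] == it,  # vertices on the current level
        compute=lambda u, v, lvl: lvl[u] + 1,    # a neighbor is one hop further
        combine=min,                             # keep the smallest level seen
    )


if __name__ == "__main__":
    # Vertex 4 has no incoming edges, so it stays unreached.
    print(bfs([[1, 2], [3], [3], [], []], 0))
```

In this sketch the algorithm-specific part is the three lambdas passed by `bfs`, which is one way to read the abstract's claim that a graph algorithm can be written in tens of lines while the runtime handles scheduling.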

research
05/21/2019

Low Overhead Instruction Latency Characterization for NVIDIA GPGPUs

The last decade has seen a shift in the computer systems industry where ...
research
02/11/2022

Lightning: Scaling the GPU Programming Model Beyond a Single GPU

The GPU programming model is primarily aimed at the development of appli...
research
05/27/2020

GraFS: Graph Analytics Fusion and Synthesis

Graph analytics elicits insights from large graphs to inform critical de...
research
09/09/2020

GPA: A GPU Performance Advisor Based on Instruction Sampling

Developing efficient GPU kernels can be difficult because of the complex...
research
05/21/2019

Instructions' Latencies Characterization for NVIDIA GPGPUs

The last decade has seen a shift in the computer systems industry where ...
research
03/11/2018

Scalable Breadth-First Search on a GPU Cluster

On a GPU cluster, the ratio of high computing power to communication ban...
