A parallel priority queue with fast updates for GPU architectures

08/25/2019
by John Iacono, et al.

The high computational throughput of modern graphics processing units (GPUs) makes them the de facto architecture for high-performance computing applications. However, to achieve peak performance, GPUs require highly parallel workloads as well as memory access patterns that exhibit good locality of reference. As a result, many state-of-the-art algorithms and data structures designed for GPUs sacrifice work-optimality to achieve the necessary parallelism. Furthermore, some abstract data types are avoided completely because no corresponding data structure performs well on the GPU. One such abstract data type is the priority queue. Many well-known algorithms rely on priority queue operations as a building block. While various priority queue structures have been developed that are parallel, cache-aware, or cache-oblivious, none has been shown to be efficient on GPUs. In this paper, we present the parBucketHeap, a parallel, cache-efficient data structure designed for modern GPU architectures that supports standard priority queue operations as well as bulk update. We analyze the structure in several well-known computational models and show that it is both optimally parallel and cache-efficient. We implement the parBucketHeap and use it to solve the single-source shortest path (SSSP) problem. Experimental results indicate that, for sufficiently large, dense graphs with high diameter, we outperform current state-of-the-art SSSP algorithms on the GPU by up to a factor of 5. Unlike existing GPU SSSP algorithms, our approach is work-optimal and places significantly less load on the GPU, reducing power consumption.
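The abstract describes a priority queue supporting standard operations plus bulk update, used to drive SSSP. As a rough illustration of that pattern only, the following minimal sequential C++ sketch shows SSSP driven by batched relaxations through such an interface. The BulkUpdatePQ class, its extract_min and bulk_update names, and the lazy-deletion strategy are assumptions made for this sketch; they are not the authors' parBucketHeap or its GPU implementation, where the same interface would instead be backed by the paper's cache-efficient structure.

// Hypothetical CPU stand-in for a bulk-update priority queue interface.
// Stale (distance, vertex) pairs are skipped at extraction time instead of
// being decreased in place (lazy deletion).
#include <cstdint>
#include <functional>
#include <iostream>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

constexpr uint64_t kInf = std::numeric_limits<uint64_t>::max();

struct Edge { uint32_t to; uint32_t w; };

class BulkUpdatePQ {
 public:
  explicit BulkUpdatePQ(size_t n) : best_(n, kInf) {}

  // Apply a batch of (vertex, candidate distance) updates; keep improvements.
  void bulk_update(const std::vector<std::pair<uint32_t, uint64_t>>& batch) {
    for (auto [v, d] : batch)
      if (d < best_[v]) { best_[v] = d; heap_.push({d, v}); }
  }

  // Extract the vertex with the smallest still-current key; false if empty.
  bool extract_min(uint32_t& v, uint64_t& d) {
    while (!heap_.empty()) {
      auto [dist, vert] = heap_.top();
      heap_.pop();
      if (dist == best_[vert]) { v = vert; d = dist; return true; }
    }
    return false;
  }

 private:
  std::vector<uint64_t> best_;
  std::priority_queue<std::pair<uint64_t, uint32_t>,
                      std::vector<std::pair<uint64_t, uint32_t>>,
                      std::greater<>> heap_;
};

// SSSP driven by bulk updates: each settled vertex issues its outgoing
// relaxations as one batch rather than one decrease-key at a time.
std::vector<uint64_t> sssp(const std::vector<std::vector<Edge>>& g, uint32_t s) {
  std::vector<uint64_t> dist(g.size(), kInf);
  BulkUpdatePQ pq(g.size());
  pq.bulk_update({{s, 0}});
  uint32_t v; uint64_t d;
  while (pq.extract_min(v, d)) {
    dist[v] = d;
    std::vector<std::pair<uint32_t, uint64_t>> batch;
    for (const Edge& e : g[v]) batch.push_back({e.to, d + e.w});
    pq.bulk_update(batch);
  }
  return dist;  // kInf marks unreachable vertices
}

int main() {
  // Tiny example graph: 0 -> 1 (w=2), 0 -> 2 (w=5), 1 -> 2 (w=1).
  std::vector<std::vector<Edge>> g = {{{1, 2}, {2, 5}}, {{2, 1}}, {}};
  for (uint64_t d : sssp(g, 0)) std::cout << d << '\n';  // prints 0, 2, 3
}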

