Ripple : Simplified Large-Scale Computation on Heterogeneous Architectures with Polymorphic Data Layout

04/17/2021
by   Robert Clucas, et al.
0

GPUs are now used for a wide range of problems within HPC. However, making efficient use of the computational power available with multiple GPUs is challenging. The main challenges in achieving good performance are memory layout, affecting memory bandwidth, effective use of the memory spaces with a GPU, inter-GPU communication, and synchronization. We address these problems with the Ripple library, which provides a unified view of the computational space across multiple dimensions and multiple GPUs, allows polymorphic data layout, and provides a simple graph interface to describe an algorithm from which inter-GPU data transfers can be optimally scheduled. We describe the abstractions provided by Ripple to allow complex computations to be described simply, and to execute efficiently across many GPUs with minimal overhead. We show performance results for a number of examples, from particle motion to finite-volume methods and the eikonal equation, as well as showing good strong and weak scaling results across multiple GPUs.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
10/28/2018

SoaAlloc: A Lock-free Hierarchical Bitmap-based Object Allocator for GPUs

Designing dynamic memory allocators for GPUs is challenging because appl...
research
08/16/2019

Memory-Efficient Object-Oriented Programming on GPUs

Object-oriented programming is often regarded as too inefficient for hig...
research
09/20/2018

SoaAlloc: Accelerating Single-Method Multiple-Objects Applications on GPUs

We propose SoaAlloc, a dynamic object allocator for Single-Method Multip...
research
12/13/2020

Fast and Scalable Sparse Triangular Solver for Multi-GPU Based HPC Architectures

Designing efficient and scalable sparse linear algebra kernels on modern...
research
10/11/2017

Synkhronos: a Multi-GPU Theano Extension for Data Parallelism

We present Synkhronos, an extension to Theano for multi-GPU computations...
research
07/07/2023

CODAG: Characterizing and Optimizing Decompression Algorithms for GPUs

Data compression and decompression have become vital components of big-d...
research
03/11/2018

Scalable Breadth-First Search on a GPU Cluster

On a GPU cluster, the ratio of high computing power to communication ban...

Please sign up or login with your details

Forgot password? Click here to reset