Towards Performance Portable Programming for Distributed Heterogeneous Systems

10/03/2022
by Polykarpos Thomadakis, et al.

Hardware heterogeneity is here to stay for high-performance computing. Large-scale systems are currently equipped with multiple GPU accelerators per compute node and are expected to incorporate more specialized hardware in the future. This shift in the computing ecosystem offers many opportunities for performance improvement; however, it also increases the complexity of programming for such architectures. This work introduces a runtime framework that simplifies programming for heterogeneous systems while utilizing hardware resources efficiently. The framework is integrated within a distributed and scalable runtime system to facilitate performance portability across heterogeneous nodes. Along with the design, this paper describes the implementation and the optimizations performed, which achieve up to a 300% improvement in a shared memory benchmark and up to a 10-fold improvement in distributed device communication. Preliminary results indicate that our software incurs low overhead and achieves a 40% improvement while hiding the idiosyncrasies of the hardware.

research
03/05/2023

Runtime Support for Performance Portability on Heterogeneous Distributed Platforms

Hardware heterogeneity is here to stay for high-performance computing. L...
research
05/18/2020

HaoCL: Harnessing Large-scale Heterogeneous Processors Made Easy

The pervasive adoption of Deep Learning (DL) and Graph Processing (GP) m...
research
03/24/2022

GX-Plug: a Middleware for Plugging Accelerators to Distributed Graph Processing

Recently, research communities highlight the necessity of formulating a ...
research
10/28/2018

FFT, FMM, and Multigrid on the Road to Exascale: performance challenges and opportunities

FFT, FMM, and multigrid methods are widely used fast and highly scalable...
research
05/07/2018

EngineCL: Usability and Performance in Heterogeneous Computing

Heterogeneous systems composed by a CPU and a set of hardware accelerato...
research
02/26/2018

Tornado: A Practical And Efficient Heterogeneous Programming Framework For Managed Languages

This paper describes our experiences creating Tornado: a practical and e...
research
03/27/2021

Effective GPU Parallelization of Distributed and Localized Model Predictive Control

To effectively control large-scale distributed systems online, model pre...
