DAG-based Scheduling with Resource Sharing for Multi-task Applications in a Polyglot GPU Runtime

12/17/2020
by Alberto Parravicini, et al.

GPUs are readily available in cloud computing and personal devices, but their use for data processing acceleration has been slowed down by their limited integration with common programming languages such as Python or Java. Moreover, using GPUs to their full capabilities requires expert knowledge of asynchronous programming. In this work, we present a novel GPU runtime scheduler for multi-task GPU computations that transparently provides asynchronous execution, space-sharing, and transfer-computation overlap without requiring any advance information about the program dependency structure. We leverage the GrCUDA polyglot API to integrate our scheduler with multiple high-level languages and provide a platform for fast prototyping and easy GPU acceleration. We validate our work on 6 benchmarks created to evaluate task-parallelism and show an average of 44% speedup against synchronous execution, with no execution time slowdown compared to hand-optimized host code written using the C++ CUDA Graphs API.
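To make the idea of inferring dependencies at run time more concrete, the following is a minimal sketch in Python. It is not the authors' scheduler and does not use the GrCUDA API: the class `DagScheduler`, the `submit` method, the array names, and the stream-assignment rule (the first child of a task inherits its stream, any other task gets a new one) are all assumptions made for illustration. The sketch only records which task last wrote or read each array, derives the DAG edges from that, and labels each task with a stream number; no GPU work is launched.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    reads: tuple
    writes: tuple
    parents: list = field(default_factory=list)
    stream: int = -1


class DagScheduler:
    """Hypothetical scheduler that builds the dependency DAG as tasks arrive."""

    def __init__(self):
        self.last_writer = {}   # array name -> task that last wrote it
        self.readers = {}       # array name -> tasks that read it since that write
        self._claimed = set()   # names of tasks whose stream a child already inherited
        self._next_stream = 0

    def submit(self, name, reads=(), writes=()):
        task = Task(name, tuple(reads), tuple(writes))
        candidates = []
        # Read-after-write and write-after-write: depend on the last writer.
        for array in task.reads + task.writes:
            writer = self.last_writer.get(array)
            if writer is not None:
                candidates.append(writer)
        # Write-after-read: depend on every reader since the last write.
        for array in task.writes:
            candidates.extend(self.readers.get(array, []))
        # Drop duplicate parents while keeping the order of discovery.
        for parent in candidates:
            if all(parent is not p for p in task.parents):
                task.parents.append(parent)
        task.stream = self._pick_stream(task)
        # Update the bookkeeping for later submissions.
        for array in task.writes:
            self.last_writer[array] = task
            self.readers[array] = []
        for array in task.reads:
            if array not in task.writes:
                self.readers.setdefault(array, []).append(task)
        return task

    def _pick_stream(self, task):
        # Assumed policy: reuse the stream of the first parent that has not
        # been inherited yet, otherwise open a new stream.
        for parent in task.parents:
            if parent.name not in self._claimed:
                self._claimed.add(parent.name)
                return parent.stream
        stream = self._next_stream
        self._next_stream += 1
        return stream


# Usage: two independent tasks, a join, a second reader, and an overwrite.
sched = DagScheduler()
k1 = sched.submit("init_x", writes=["x"])
k2 = sched.submit("init_y", writes=["y"])
k3 = sched.submit("reduce", reads=["x", "y"], writes=["z"])
k4 = sched.submit("scale", reads=["x"], writes=["w"])
k5 = sched.submit("overwrite_x", writes=["x"])
```

In this sketch `k1` and `k2` share no arrays, so they land on different streams and could overlap. `k3` depends on both, while `k5` must wait for every earlier reader of `x`. A real runtime would also prune finished tasks and insert event-based synchronization between streams, which is omitted here.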

research
10/26/2018

Integration of CUDA Processing within the C++ library for parallelism and concurrency (HPX)

Experience shows that on today's high performance systems the utilizatio...
research
10/30/2020

Transparent Compiler and Runtime Specializations for Accelerating Managed Languages on FPGAs

In recent years, heterogeneous computing has emerged as the vital way to...
research
06/14/2019

A Performance Study of the 2D Ising Model on GPUs

The simulation of the two-dimensional Ising model is used as a benchmark...
research
10/27/2021

JACC: An OpenACC Runtime Framework with Kernel-Level and Multi-GPU Parallelization

The rapid development in computing technology has paved the way for dire...
research
01/04/2021

Implementing CUDA Streams into AstroAccelerate – A Case Study

To be able to run tasks asynchronously on NVIDIA GPUs a programmer must ...
research
02/20/2021

A Python Framework for Fast Modelling and Simulation of Cellular Nonlinear Networks and other Finite-difference Time-domain Systems

This paper introduces and evaluates a freely available cellular nonlinea...
research
06/30/2023

Safe, Seamless, And Scalable Integration Of Asynchronous GPU Streams In PETSc

Leveraging Graphics Processing Units (GPUs) to accelerate scientific sof...