Implementing CUDA Streams into AstroAccelerate – A Case Study

01/04/2021
by Jan Novotný, et al.

To run tasks asynchronously on NVIDIA GPUs, a programmer must explicitly implement asynchronous execution in their code using CUDA streams. Streams allow a programmer to launch independent tasks concurrently, so that different functional units of the GPU can be used at the same time. For example, by placing the tasks in different CUDA streams, it is possible to transfer the results of a previous computation on input data n-1 over the PCIe bus whilst computing the result for input data n. The benefit of this approach is that the time taken for data transfers between the host and the device can be hidden behind computation. This case study deals with the implementation of CUDA streams in AstroAccelerate, a GPU-accelerated real-time signal processing pipeline for time-domain radio astronomy.
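The overlap described above can be illustrated with a minimal sketch. It is not taken from AstroAccelerate: the kernel `scale`, the function `run_chunked_pipeline`, the choice of two streams, and the chunk sizes are all hypothetical and chosen only for illustration. The sketch splits the input into chunks and assigns them to streams in round-robin order, so that the copy for one chunk may overlap with the kernel working on another. It assumes a CUDA-capable device and uses pinned host memory, which asynchronous copies need in order to overlap with computation.

```cuda
#include <cassert>
#include <cstdio>
#include <cuda_runtime.h>

// Return -1 from the enclosing function if a CUDA runtime call fails.
#define CHECK(call) do { if ((call) != cudaSuccess) return -1; } while (0)

// Hypothetical stand-in for a real processing kernel: scales every sample.
__global__ void scale(const float* in, float* out, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i] * factor;
}

// Processes num_chunks chunks of chunk_len floats using two streams.
// Returns the number of wrong output values, or -1 on a CUDA error.
int run_chunked_pipeline(int num_chunks, int chunk_len) {
    assert(num_chunks > 0 && chunk_len > 0);
    const int kStreams = 2;  // assumption: two streams are enough for this demo
    const size_t chunk_bytes = (size_t)chunk_len * sizeof(float);
    const size_t total = (size_t)num_chunks * chunk_len;

    // Pinned (page-locked) host buffers so that cudaMemcpyAsync can overlap.
    float *h_in = nullptr, *h_out = nullptr;
    CHECK(cudaMallocHost(&h_in, total * sizeof(float)));
    CHECK(cudaMallocHost(&h_out, total * sizeof(float)));
    for (size_t i = 0; i < total; ++i) h_in[i] = (float)i;

    // One pair of device buffers per stream.
    cudaStream_t streams[kStreams];
    float *d_in[kStreams], *d_out[kStreams];
    for (int s = 0; s < kStreams; ++s) {
        CHECK(cudaStreamCreate(&streams[s]));
        CHECK(cudaMalloc(&d_in[s], chunk_bytes));
        CHECK(cudaMalloc(&d_out[s], chunk_bytes));
    }

    const int threads = 256;
    const int blocks = (chunk_len + threads - 1) / threads;
    for (int c = 0; c < num_chunks; ++c) {
        const int s = c % kStreams;  // round-robin stream assignment
        const size_t off = (size_t)c * chunk_len;
        // Work within one stream runs in issue order, so reusing the
        // per-stream buffers for a later chunk is safe. Work in different
        // streams is allowed to overlap.
        CHECK(cudaMemcpyAsync(d_in[s], h_in + off, chunk_bytes,
                              cudaMemcpyHostToDevice, streams[s]));
        scale<<<blocks, threads, 0, streams[s]>>>(d_in[s], d_out[s],
                                                  chunk_len, 2.0f);
        CHECK(cudaGetLastError());
        CHECK(cudaMemcpyAsync(h_out + off, d_out[s], chunk_bytes,
                              cudaMemcpyDeviceToHost, streams[s]));
    }
    CHECK(cudaDeviceSynchronize());  // wait for all streams to finish

    // Check the results on the host.
    int mismatches = 0;
    for (size_t i = 0; i < total; ++i)
        if (h_out[i] != 2.0f * h_in[i]) ++mismatches;

    for (int s = 0; s < kStreams; ++s) {
        cudaFree(d_in[s]);
        cudaFree(d_out[s]);
        cudaStreamDestroy(streams[s]);
    }
    cudaFreeHost(h_in);
    cudaFreeHost(h_out);
    return mismatches;
}

int main() {
    const int mismatches = run_chunked_pipeline(4, 1 << 16);
    std::printf("mismatches: %d\n", mismatches);
    return mismatches == 0 ? 0 : 1;
}
```

Whether the copies and kernels actually overlap depends on the hardware (for example, the number of copy engines) and on the chunk size, so a timeline from a profiler is the usual way to confirm that transfer time is hidden.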


