Contention-Aware GPU Partitioning and Task-to-Partition Allocation for Real-Time Workloads

05/21/2021
by   Houssam Eddine Zahaf, et al.
0

In order to satisfy timing constraints, modern real-time applications require massively parallel accelerators such as General Purpose Graphic Processing Units (GPGPUs). Generation after generation, the number of computing clusters made available in novel GPU architectures is steadily increasing, hence, investigating suitable scheduling approaches is now mandatory. Such scheduling approaches are related to mapping different and concurrent compute kernels within the GPU computing clusters, hence grouping GPU computing clusters into schedulable partitions. In this paper we propose novel techniques to define GPU partitions; this allows us to define suitable task-to-partition allocation mechanisms in which tasks are GPU compute kernels featuring different timing requirements. Such mechanisms will take into account the interference that GPU kernels experience when running in overlapping time windows. Hence, an effective and simple way to quantify the magnitude of such interference is also presented. We demonstrate the efficiency of the proposed approaches against the classical techniques that considered the GPU as a single, non-partitionable resource.

READ FULL TEXT
research
01/25/2021

RTGPU: Real-Time GPU Scheduling of Hard Deadline Parallel Tasks with Fine-Grain Utilization

Many emerging cyber-physical systems, such as autonomous vehicles and ro...
research
05/06/2016

A Graph-based Model for GPU Caching Problems

Modeling data sharing in GPU programs is a challenging task because of t...
research
09/13/2022

Deep Learning Training on Multi-Instance GPUs

Deep learning training is an expensive process that extensively uses GPU...
research
04/01/2021

Energy-aware Task Scheduling with Deadline Constraint in DVFS-enabled Heterogeneous Clusters

Energy conservation of large data centers for high-performance computing...
research
06/23/2014

Preemptive Thread Block Scheduling with Online Structural Runtime Prediction for Concurrent GPGPU Kernels

Recent NVIDIA Graphics Processing Units (GPUs) can execute multiple kern...
research
07/06/2017

Cooperative Kernels: GPU Multitasking for Blocking Algorithms (Extended Version)

There is growing interest in accelerating irregular data-parallel algori...
research
03/15/2023

A Two-level GPU-Accelerated Incomplete LU Preconditioner for General Sparse Linear Systems

This paper presents a parallel preconditioning approach based on incompl...

Please sign up or login with your details

Forgot password? Click here to reset