Hybrid Workload Scheduling on HPC Systems

09/12/2021
by Yuping Fan, et al.

Traditionally, on-demand, rigid, and malleable applications have been scheduled and executed on separate systems. Ever-growing workload demands and rapidly developing HPC infrastructure have spurred interest in converging these applications on a single HPC system. Although scheduling such hybrid workloads within one system could improve system efficiency, it is difficult to balance the responsiveness of on-demand requests, the incentive for malleable jobs, and the performance of rigid applications. In this study, we present several scheduling mechanisms that address the issues involved in co-scheduling on-demand, rigid, and malleable jobs on a single HPC system. We extensively evaluate and compare their performance under various configurations and workloads. Our experimental results show that the proposed mechanisms can serve on-demand workloads with minimal delay, offer incentives for declaring malleability, and improve system performance.
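
To make the tradeoff concrete, here is a minimal toy sketch of one way an on-demand request might be admitted by shrinking malleable jobs while leaving rigid jobs untouched. It is not the mechanism proposed in the paper, which the abstract does not detail. The names Job, min_nodes, and admit_on_demand, and the greedy shrink order, are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Job:
    """Hypothetical job record; field names are illustrative assumptions."""
    name: str
    kind: str           # "on_demand", "rigid", or "malleable"
    nodes: int          # nodes currently held (or requested, if not yet running)
    min_nodes: int = 0  # smallest size a malleable job may shrink to


def admit_on_demand(request: Job, running: List[Job], total_nodes: int) -> bool:
    """Try to start `request` immediately.

    If free nodes are insufficient, shrink malleable jobs (in list order)
    toward their min_nodes. Rigid jobs are never resized. Returns True if
    the request was started, False if it could not be served right away.
    """
    free = total_nodes - sum(j.nodes for j in running)
    deficit = request.nodes - free
    if deficit > 0:
        reclaimable = sum(
            j.nodes - j.min_nodes for j in running if j.kind == "malleable"
        )
        if reclaimable < deficit:
            return False  # cannot serve without touching rigid jobs
        for j in running:
            if deficit <= 0:
                break
            if j.kind == "malleable":
                give = min(j.nodes - j.min_nodes, deficit)
                j.nodes -= give
                deficit -= give
    running.append(request)
    return True


# Example: a 100-node system running one rigid and one malleable job.
running_jobs = [
    Job("sim", "rigid", nodes=50),
    Job("sweep", "malleable", nodes=40, min_nodes=10),
]
started = admit_on_demand(Job("urgent", "on_demand", nodes=30), running_jobs, 100)
```

In this toy setting the malleable job pays for the on-demand job's responsiveness by giving up nodes, which is why a real scheduler needs some incentive for users to declare malleability in the first place.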

research
10/20/2019

RLScheduler: Learn to Schedule HPC Batch Jobs Using Deep Reinforcement Learning

We present RLScheduler, a deep reinforcement learning based job schedule...
research
02/11/2021

Deep Reinforcement Agent for Scheduling in HPC

Cluster scheduler is crucial in high-performance computing (HPC). It det...
research
07/23/2018

Measuring the Impact of Spectre and Meltdown

The Spectre and Meltdown flaws in modern microprocessors represent a new...
research
12/10/2020

Scheduling Beyond CPUs for HPC

High performance computing (HPC) is undergoing significant changes. The ...
research
05/30/2018

Predictive Performance Modeling for Distributed Computing using Black-Box Monitoring and Machine Learning

In many domains, the previous decade was characterized by increasing dat...
research
04/12/2022

The MIT Supercloud Workload Classification Challenge

High-Performance Computing (HPC) centers and cloud providers support an ...
research
05/29/2019

Evaluation of pilot jobs for Apache Spark applications on HPC clusters

Big Data has become prominent throughout many scientific fields and, as ...
