Scheduling Beyond CPUs for HPC

12/10/2020
by Yuping Fan, et al.

High performance computing (HPC) is undergoing significant changes. Emerging HPC workloads comprise both compute-intensive and data-intensive applications. To meet the intense I/O demand of the data-intensive applications, burst buffers are being deployed in production systems. Existing HPC schedulers, however, are mainly CPU-centric. The extreme heterogeneity of hardware devices, combined with these workload changes, forces schedulers to consider multiple resources beyond CPUs (e.g., burst buffers) in their decision making. In this study, we present a multi-resource scheduling scheme named BBSched that schedules user jobs based not only on their CPU requirements but also on other schedulable resources such as burst buffers. BBSched formulates the scheduling problem as a multi-objective optimization (MOO) problem and solves it rapidly using a multi-objective genetic algorithm. The multiple solutions generated by BBSched enable system managers to explore potential tradeoffs among the various resources and thereby obtain better utilization of all of them. Trace-driven simulations with real system workloads demonstrate that BBSched improves scheduling performance by up to 41% compared to existing methods, indicating that explicitly optimizing multiple resources beyond CPUs is essential for HPC scheduling.
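To make the idea of "multiple solutions" and "tradeoffs among various resources" concrete, here is a minimal sketch of a two-objective job selection. It is not BBSched itself: it replaces the multi-objective genetic algorithm with exhaustive enumeration, which is only practical for a tiny window of jobs, and it assumes the two objectives are the number of nodes used and the amount of burst buffer used. The job names, their requests, and the free capacities are all hypothetical.

```python
from itertools import combinations

# Hypothetical queue window: (job name, nodes requested, burst buffer TB requested).
JOBS = [("A", 4, 1), ("B", 2, 4), ("C", 3, 3), ("D", 1, 0)]
FREE_NODES, FREE_BB = 6, 4  # assumed free capacity at this scheduling point


def feasible_selections(jobs, nodes, bb):
    """Yield (names, used_nodes, used_bb) for every job subset that fits."""
    for r in range(len(jobs) + 1):
        for subset in combinations(jobs, r):
            used_nodes = sum(job[1] for job in subset)
            used_bb = sum(job[2] for job in subset)
            if used_nodes <= nodes and used_bb <= bb:
                yield tuple(job[0] for job in subset), used_nodes, used_bb


def dominates(a, b):
    """True if selection a is at least as good as b in both objectives
    and strictly better in at least one (higher usage is better)."""
    return a[1] >= b[1] and a[2] >= b[2] and (a[1] > b[1] or a[2] > b[2])


def pareto_front(candidates):
    """Keep only the selections that no other selection dominates."""
    candidates = list(candidates)
    return [c for c in candidates
            if not any(dominates(o, c) for o in candidates)]


front = pareto_front(feasible_selections(JOBS, FREE_NODES, FREE_BB))
for names, used_nodes, used_bb in front:
    print(names, "nodes:", used_nodes, "burst buffer TB:", used_bb)
```

With these made-up numbers, three selections survive: one favors node utilization, one favors burst buffer utilization, and one sits in between. A system manager would pick among such non-dominated options according to site policy; a genetic algorithm approximates the same front when the job window is too large to enumerate.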


research
08/18/2021

ROME: A Multi-Resource Job Scheduling Framework for Exascale HPC Systems

High-performance computing (HPC) is undergoing significant changes. Next...
research
01/09/2023

Efficient Intra-Rack Resource Disaggregation for HPC Using Co-Packaged DWDM Photonics

The diversity of workload requirements and increasing hardware heterogen...
research
02/25/2021

Optimized Memoryless Fair-Share HPC Resources Scheduling using Transparent Checkpoint-Restart Preemption

Common resource management methods in supercomputing systems usually inc...
research
10/14/2022

Probabilistic Scheduling of Dynamic I/O Requests via Application Clustering for Burst-Buffer Equipped HPC

Burst-Buffering is a promising storage solution that introduces an inter...
research
07/27/2022

On-Device CPU Scheduling for Sense-React Systems

Sense-react systems (e.g. robotics and AR/VR) have to take highly respon...
research
09/12/2021

Hybrid Workload Scheduling on HPC Systems

Traditionally, on-demand, rigid, and malleable applications have been sc...
research
03/16/2021

Intelligent colocation of HPC workloads

Many HPC applications suffer from a bottleneck in the shared caches, ins...
