Probabilistic Scheduling of Dynamic I/O Requests via Application Clustering for Burst-Buffer Equipped HPC

10/14/2022
by Benbo Zha, et al.

Burst-Buffering is a promising storage solution that introduces an intermediate high-throughput storage buffer layer to mitigate the I/O bottleneck that current High-Performance Computing (HPC) platforms suffer from. The existing Markov-Chain based probabilistic I/O scheduling uses the load state of the Burst-Buffers and the periodic characteristics of applications to reduce the I/O congestion caused by the limited capacity of Burst-Buffers. However, this probabilistic approach requires applications to have consistent I/O characteristics, including similar I/O duration and long application length, in order to obtain an accurate I/O load estimation. These consistency conditions often do not hold in realistic situations. In this paper, we propose a generic framework of dynamic probabilistic I/O scheduling based on application clustering (DPSAC) so that the applications within each cluster meet the consistency requirements. Our scheme first applies a one-dimensional K-means clustering algorithm to group the applications by the I/O phase length of each application. Next, it calculates the expected workload of each cluster through the probabilistic model of applications and partitions the Burst-Buffers among the clusters in proportion to that workload. Then, to handle dynamic changes (applications joining and exiting), it updates the clusters based on a heuristic strategy. Finally, it applies the probabilistic I/O scheduling, which is based on the distribution of application workload and the state of the Burst-Buffers, to schedule I/O for all the concurrent applications and mitigate I/O congestion. Simulation results on synthetic data show that DPSAC is effective and efficient.
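
The first two steps of the scheme (clustering applications by I/O phase length, then splitting the Burst-Buffer capacity in proportion to each cluster's expected workload) can be illustrated with a minimal sketch. This is not the authors' implementation. The `App` fields, the function names, the deterministic centroid initialization, and in particular the expected-workload formula (I/O volume weighted by the fraction of a period spent in I/O) are assumptions made here for illustration, since the abstract does not specify the probabilistic model.

```python
from dataclasses import dataclass


@dataclass
class App:
    """Hypothetical description of a periodic application."""
    name: str
    io_len: float     # length of one I/O phase (seconds)
    period: float     # length of one compute + I/O period (seconds)
    io_volume: float  # data written per I/O phase (GB)


def kmeans_1d(values, k, iters=100):
    """Lloyd's algorithm on scalars; returns one cluster label per value."""
    pts = sorted(set(values))
    if not 1 <= k <= len(pts):
        raise ValueError("k must be between 1 and the number of distinct values")
    # Assumed initialization: spread centroids over the sorted distinct values.
    if k == 1:
        centroids = [pts[0]]
    else:
        centroids = [pts[i * (len(pts) - 1) // (k - 1)] for i in range(k)]
    labels = []
    for _ in range(iters):
        # Assignment step: nearest centroid by absolute distance.
        new_labels = [
            min(range(k), key=lambda c: abs(v - centroids[c])) for v in values
        ]
        if new_labels == labels:
            break  # converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return labels


def partition_buffer(apps, labels, k, capacity):
    """Split `capacity` across k clusters proportionally to expected load.

    Assumption: an application's expected load is its I/O volume times the
    fraction of its period spent in I/O. The paper's model may differ.
    """
    load = [0.0] * k
    for app, lab in zip(apps, labels):
        load[lab] += app.io_volume * app.io_len / app.period
    total = sum(load)
    if total == 0:
        return [capacity / k] * k  # no load information: split evenly
    return [capacity * x / total for x in load]


apps = [
    App("A", io_len=1.0, period=10.0, io_volume=10.0),
    App("B", io_len=1.2, period=12.0, io_volume=10.0),
    App("C", io_len=10.0, period=40.0, io_volume=8.0),
]
labels = kmeans_1d([a.io_len for a in apps], k=2)
shares = partition_buffer(apps, labels, k=2, capacity=100.0)
print(labels, shares)
```

In this toy input, the two applications with short I/O phases end up in one cluster and the long-phase application in the other. Handling applications that join or exit would require re-running or incrementally updating the clustering, which the sketch does not attempt.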

research
09/29/2021

Optimisation of job scheduling for supercomputers with burst buffers

The ever-increasing gap between compute and I/O performance in HPC platf...
research
12/10/2020

Scheduling Beyond CPUs for HPC

High performance computing (HPC) is undergoing significant changes. The ...
research
12/14/2020

Application-aware Congestion Mitigation for High-Performance Computing Systems

High-performance computing (HPC) systems frequently experience congestio...
research
08/31/2021

Plan-based Job Scheduling for Supercomputers with Shared Burst Buffers

The ever-increasing gap between compute and I/O performance in HPC platf...
research
05/16/2018

Client-side Straggler-Aware I/O Scheduler for Object-based Parallel File Systems

Object-based parallel file systems have emerged as promising storage sol...
research
05/07/2015

Development of a Burst Buffer System for Data-Intensive Applications

Modern parallel filesystems such as Lustre are designed to provide high,...
research
11/02/2021

Towards Enabling I/O Awareness in Task-based Programming Models

Storage systems have not kept the same technology improvement rate as co...
