rDLB: A Novel Approach for Robust Dynamic Load Balancing of Scientific Applications with Parallel Independent Tasks

05/20/2019
by Ali Mohammed, et al.

Scientific applications often contain large and computationally intensive parallel loops. Dynamic loop self-scheduling (DLS) is used to achieve a load-balanced execution of such applications on high performance computing (HPC) systems. Large HPC systems are vulnerable to processor or node failures and to perturbations in the availability of resources. Most self-scheduling approaches either do not consider fault-tolerant scheduling, or depend on failure or perturbation detection and react by rescheduling failed tasks. In this work, a robust dynamic load balancing (rDLB) approach is proposed for the robust self-scheduling of independent tasks. The proposed approach is proactive and does not depend on failure or perturbation detection. The theoretical analysis of the proposed approach shows that it is linearly scalable and that its cost decreases quadratically as the system size increases. rDLB is integrated into an MPI DLS library to evaluate its performance experimentally with two computationally intensive scientific applications. Results show that rDLB enables the tolerance of up to P-1 processor failures, where P is the number of processors executing an application. In the presence of perturbations, rDLB boosted the robustness of DLS techniques by up to 30 times and decreased application execution time by up to 7 times compared to their counterparts without rDLB.
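
The abstract does not spell out how the proactive scheme works, so the following is only a minimal Python sketch of one way such a scheme could behave, not the authors' MPI implementation. It assumes (hypothetically) that once every chunk has been handed out, idle workers are given chunks that were handed out but never reported as finished, with no failure detection involved. Guided self-scheduling (chunk size = ceil(remaining / P)) is used here only as one example of a DLS technique; the names `gss_chunks`, `simulate`, `will_fail`, and `reissue` are illustrative.

```python
import math
from collections import deque


def gss_chunks(n_tasks, n_workers):
    """Guided self-scheduling: each chunk holds ceil(remaining / P) tasks."""
    chunks, start = [], 0
    while start < n_tasks:
        size = math.ceil((n_tasks - start) / n_workers)
        chunks.append(range(start, start + size))
        start += size
    return chunks


def simulate(n_tasks, n_workers, will_fail, reissue):
    """Round-based toy simulation of self-scheduling.

    will_fail: ids of workers that accept one chunk and then fail silently.
    reissue:   assumption of this sketch; if True, idle workers are handed
               chunks that were given out but never reported as finished.
    Returns the set of task ids that were actually completed.
    """
    pending = deque(gss_chunks(n_tasks, n_workers))
    handed_out = []   # every chunk that has been given to some worker
    finished = set()  # task ids reported as done
    dead = set()      # workers that have already failed

    while len(finished) < n_tasks:
        alive = [w for w in range(n_workers) if w not in dead]
        if not alive or (not pending and not reissue):
            break  # nobody left to work, or lost chunks are never retried
        for w in alive:
            if pending:
                chunk = pending.popleft()
                handed_out.append(chunk)
            elif reissue:
                # No failure detection: just pick any chunk not yet finished.
                chunk = next(
                    (c for c in handed_out if not finished.issuperset(c)), None
                )
                if chunk is None:
                    break
            else:
                break
            if w in will_fail:
                dead.add(w)  # the chunk is lost together with the worker
            else:
                finished.update(chunk)
    return finished


if __name__ == "__main__":
    P = 4
    failures = {1, 2, 3}  # P-1 workers fail after taking their first chunk
    print(len(simulate(100, P, failures, reissue=False)))
    print(len(simulate(100, P, failures, reissue=True)))
```

In this toy model, with P-1 of the P workers failing, the variant without re-issuing leaves the lost chunks unfinished, while the re-issuing variant completes all tasks as long as one worker survives. This mirrors the P-1 tolerance claimed in the abstract, under the stated assumptions.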


research
10/15/2019

An Approach for Realistically Simulating the Performance of Scientific Applications on High Performance Computing Systems

Scientific applications often contain large, computationally-intensive, ...
research
11/15/2019

Two-level Dynamic Load Balancing for High Performance Scientific Applications

Scientific applications are often complex, irregular, and computationall...
research
12/04/2019

SimAS: A Simulation-assisted Approach for the Scheduling Algorithm Selection under Perturbations

Many scientific applications consist of large and computationally-intens...
research
12/14/2018

Dynamic Loop Scheduling Using MPI Passive-Target Remote Memory Access

Scientific applications often contain large computationally-intensive pa...
research
11/14/2021

Practical Scheduling for Real-World Serverless Computing

Serverless computing has seen rapid growth due to the ease-of-use and co...
research
10/24/2018

A Bag-of-Tasks Scheduler Tolerant to Temporal Failures in Clouds

Cloud platforms have emerged as a prominent environment to execute high ...
