An Efficient Fault Tolerant Workflow Scheduling Approach using Replication Heuristics and Checkpointing in the Cloud

10/15/2018
by S. Jaya Nirmala, et al.

Scientific workflows are widely used for complex, large-scale data analysis and for scientific computation and automation, and the need for robust workflow scheduling techniques has grown considerably. However, most existing workflow scheduling algorithms do not provide the required reliability and robustness. This paper proposes a new fault-tolerant workflow scheduling algorithm that learns replication heuristics in an unsupervised manner. In addition, lightweight synchronized checkpointing enables efficient resubmission of failed tasks and ensures workflow completion even in precarious environments. The proposed technique improves on metrics such as Resource Wastage and Resource Usage compared with the Replicate-All algorithm, while keeping the increase in Makespan acceptable relative to vanilla Heterogeneous Earliest Finish Time (HEFT).
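To make the checkpoint-and-resubmit idea concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a task made of equal-sized steps, a checkpoint written every `checkpoint_every` steps, and failures simulated by a seeded random number generator. The function name `run_with_checkpoints` and all of its parameters are hypothetical, chosen only for illustration.

```python
import random


def run_with_checkpoints(total_steps, checkpoint_every, fail_prob,
                         max_attempts, seed=0):
    """Simulate one task that is resubmitted from its last checkpoint.

    Returns (completed, attempts, executed_steps). executed_steps counts
    every unit of work performed, including work redone after a failure.
    All names here are illustrative assumptions, not from the paper.
    """
    rng = random.Random(seed)  # seeded so the simulated failures repeat
    checkpoint = 0             # last step that was saved
    executed = 0
    for attempt in range(1, max_attempts + 1):
        step = checkpoint      # a resubmission resumes here, not at step 0
        failed = False
        while step < total_steps:
            if rng.random() < fail_prob:
                failed = True  # simulated resource or task failure
                break
            step += 1
            executed += 1
            if step % checkpoint_every == 0:
                checkpoint = step  # save progress
        if not failed:
            return True, attempt, executed
    return False, max_attempts, executed
```

In this simplified model each failure loses at most `checkpoint_every - 1` steps of work, so a smaller checkpoint interval trades extra checkpoint writes for less repeated computation after resubmission.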

