How Workflow Engines Should Talk to Resource Managers: A Proposal for a Common Workflow Scheduling Interface

02/15/2023
by Fabian Lehmann, et al.

Scientific workflow management systems (SWMSs) and resource managers together ensure that tasks are scheduled on provisioned resources so that all dependencies are obeyed and some optimization goal, such as makespan minimization, is achieved. In practice, however, scheduling responsibilities are not clearly divided between an SWMS and a resource manager, because there is no agreed-upon separation of concerns between their components. This has two consequences. First, the lack of a standardized API for exchanging scheduling information between SWMSs and resource managers hinders portability: replacing one component with another (e.g., one SWMS with another SWMS on the same resource manager) requires costly adaptations. Second, due to overlapping functionality, current installations often effectively have two schedulers, each making partial scheduling decisions under incomplete information, which leads to suboptimal workflow scheduling. In this paper, we propose a simple REST interface between SWMSs and resource managers, which allows any SWMS to pass dynamic workflow information to a resource manager, enabling maximally informed scheduling decisions. We provide an implementation of this API as an example, using Nextflow as an SWMS and Kubernetes as a resource manager. Our experiments with nine real-world workflows show that this strategy reduces makespan by up to 25.1% compared to the standard Nextflow/Kubernetes configuration. Furthermore, a more widespread implementation of this API would enable leaner code bases, a simpler exchange of components of workflow systems, and a unified place to implement new scheduling algorithms.
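To make the idea of passing dynamic workflow information to the resource manager more concrete, the following is a minimal in-memory sketch in Python. It is not the API proposed in the paper: the class `WorkflowScheduler`, its methods, and the task names are hypothetical, and the prioritization rule (prefer ready tasks with the most downstream tasks) is just one simple heuristic that becomes possible once the scheduler knows the task dependencies. In a real deployment, each method call would presumably correspond to a REST request from the SWMS.

```python
from dataclasses import dataclass, field


@dataclass
class WorkflowScheduler:
    """Hypothetical stand-in for the resource-manager side of such an interface."""
    deps: dict = field(default_factory=dict)  # task name -> set of parent task names
    done: set = field(default_factory=set)    # names of finished tasks

    def submit_task(self, name, depends_on=()):
        # The SWMS announces a task and its dependencies (assumed payload shape).
        self.deps[name] = set(depends_on)

    def report_finished(self, name):
        # The SWMS or the executor reports that a task has completed.
        self.done.add(name)

    def _descendants(self, name):
        # All known tasks that transitively depend on `name`.
        children = {t for t, parents in self.deps.items() if name in parents}
        result = set(children)
        for child in children:
            result |= self._descendants(child)
        return result

    def ready_queue(self):
        # Unfinished tasks whose parents are all finished, ordered so that
        # tasks unlocking the most downstream work come first (ties by name).
        ready = [t for t, parents in self.deps.items()
                 if t not in self.done and parents <= self.done]
        return sorted(ready, key=lambda t: (-len(self._descendants(t)), t))


if __name__ == "__main__":
    sched = WorkflowScheduler()
    sched.submit_task("align")
    sched.submit_task("qc")
    sched.submit_task("call_variants", depends_on=["align"])
    sched.submit_task("annotate", depends_on=["call_variants"])
    print(sched.ready_queue())  # ['align', 'qc']
    sched.report_finished("align")
    print(sched.ready_queue())  # ['call_variants', 'qc']
```

A scheduler that only sees individual pods or jobs cannot distinguish `align` from `qc` here; with the dependency information it can tell that `align` lies on the longer chain and start it first.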


