Scheduling Coflows with Dependency Graph

12/21/2020
by Mehrnoosh Shafiee, et al.

Applications in data-parallel computing typically consist of multiple stages. In each stage, a set of intermediate parallel data flows (a coflow) is produced and transferred between servers so that the next stage can start. While there has been much research on scheduling isolated coflows, the dependency between coflows in multi-stage jobs has been largely ignored. In this paper, we consider scheduling coflows of multi-stage jobs represented by general DAGs (directed acyclic graphs) in a shared data center network, so as to minimize the total weighted completion time of jobs. This problem is significantly more challenging than traditional coflow scheduling, as scheduling even a single multi-stage job to minimize its completion time is shown to be NP-hard. We propose a polynomial-time algorithm with an approximation ratio of O(μ log(m)/log(log(m))), where μ is the maximum number of coflows in a job and m is the number of servers. For the special case in which the jobs' underlying dependency graphs are rooted trees, we modify the algorithm and improve its approximation ratio. To verify the performance of our algorithms, we present simulation results using real traffic traces that show up to 53% improvement over the prior approach. We conclude the paper by providing a result concerning an optimality gap for scheduling coflows with general DAGs.
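
To make the objective concrete, the following is a minimal sketch of how a multi-stage job can be modeled as a DAG of coflows and how the total weighted completion time is evaluated. It is not the algorithm proposed in the paper. It assumes each coflow has a known, fixed transfer time and that there is no network contention, so the value computed per job is only the critical-path lower bound on its completion time. The names `critical_path_length` and `total_weighted_completion_time`, and all durations and weights, are hypothetical and chosen for illustration.

```python
from graphlib import TopologicalSorter


def critical_path_length(durations, parents):
    """Longest dependency chain in one job's coflow DAG.

    durations: dict mapping coflow id -> assumed transfer time
    parents:   dict mapping coflow id -> iterable of coflows it depends on
    Assumes no contention, so this is only a lower bound on completion time.
    """
    # Make sure every coflow is a node, including those without parents.
    graph = {c: tuple(parents.get(c, ())) for c in durations}
    finish = {}
    # static_order() yields each coflow after all of its parents.
    for c in TopologicalSorter(graph).static_order():
        start = max((finish[p] for p in graph[c]), default=0)
        finish[c] = start + durations[c]
    return max(finish.values())


def total_weighted_completion_time(jobs):
    """Sum of weight * completion time over all jobs.

    jobs: iterable of (weight, durations, parents) tuples.
    """
    return sum(w * critical_path_length(d, p) for w, d, p in jobs)


# Hypothetical job A: coflow a3 can start only after a1 and a2 finish.
job_a = (2, {"a1": 2, "a2": 3, "a3": 1}, {"a3": ["a1", "a2"]})
# Hypothetical job B: a two-stage chain b1 -> b2.
job_b = (1, {"b1": 5, "b2": 2}, {"b2": ["b1"]})

objective = total_weighted_completion_time([job_a, job_b])
```

In a shared network the coflows of different jobs compete for server port bandwidth, so actual completion times can exceed these per-job bounds; handling that contention together with the dependency constraints is the scheduling problem the paper studies.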

research
05/05/2022

Scheduling Coflows with Precedence Constraints for Minimizing the Total Weighted Completion Time in Identical Parallel Networks

Coflow is a recently proposed network abstraction for data-parallel comp...
research
01/27/2018

On Scheduling Two-Stage Jobs on Multiple Two-Stage Flowshops

Motivated by the current research in data centers and cloud computing, w...
research
05/28/2022

An efficient polynomial-time approximation scheme for parallel multi-stage open shops

Various new scheduling problems have been arising from practical product...
research
12/21/2021

A Scalable Deep Reinforcement Learning Model for Online Scheduling Coflows of Multi-Stage Jobs for High Performance Computing

Coflow is a recently proposed networking abstraction to help improve the...
research
01/17/2019

Scheduling Jobs with Random Resource Requirements in Computing Clusters

We consider a natural scheduling problem which arises in many distribute...
research
05/12/2020

Data-driven Algorithm for Scheduling with Total Tardiness

In this paper, we investigate the use of deep learning for solving a cla...
research
04/01/2020

Scheduling Parallel-Task Jobs Subject to Packing and Placement Constraints

Motivated by modern parallel computing applications, we consider the pro...
