Coflow Scheduling in Data Centers: Routing and Bandwidth Allocation

12/17/2018
by Li Shi, et al.

In distributed computing frameworks such as MapReduce, Spark, and Dryad, a coflow is a set of flows that transfer data between two stages of a job. The job cannot start its next stage until all flows in the coflow finish. To improve the execution performance of such a job, it is crucial to reduce the completion time of a coflow, which can contribute more than 50% [...] While several schedulers have been proposed, we observe that routing, a factor that greatly impacts the Coflow Completion Time (CCT), has not been well considered. In this paper, we focus on the coflow scheduling problem and jointly consider routing and bandwidth allocation. We first provide an analytical solution to the problem of optimal bandwidth allocation with pre-determined routes. We then formulate the coflow scheduling problem as a Mixed Integer Non-linear Programming problem and present its relaxed convex optimization problem. We further propose two algorithms, CoRBA and its simplified version CoRBA-fast, that jointly perform routing and bandwidth allocation for a given coflow while minimizing the CCT. Through both offline and online simulations, we demonstrate that CoRBA reduces the CCT by 40% [...] compared to the state-of-the-art algorithms. Simulation results also show that CoRBA-fast can be tens of times faster than all other algorithms with around 10% [...] CoRBA-fast very applicable in practice.
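
To make the bandwidth allocation step with pre-determined routes concrete, here is a minimal sketch of the standard bottleneck argument. It is not necessarily the paper's own derivation. It assumes a single coflow, fixed link capacities, strictly positive flow volumes, and no competing traffic. The function name min_cct_allocation and the sample flows, routes, and capacities are hypothetical. With routes fixed, no schedule can finish faster than the most loaded link allows, and giving each flow a rate of volume / CCT makes all flows finish together at that bound.

```python
from collections import defaultdict


def min_cct_allocation(volumes, routes, capacities):
    """Return (cct, rates) for one coflow whose routes are already fixed.

    volumes:    flow id -> data volume (assumed > 0)
    routes:     flow id -> list of link ids the flow traverses
    capacities: link id -> link capacity (volume per unit time)
    """
    # Total volume that must cross each link.
    load = defaultdict(float)
    for flow, links in routes.items():
        for link in links:
            load[link] += volumes[flow]

    # The most loaded link relative to its capacity bounds the CCT from below.
    cct = max(load[link] / capacities[link] for link in load)

    # Rate volume / cct lets every flow finish exactly at time cct
    # without exceeding any link capacity.
    rates = {flow: volumes[flow] / cct for flow in volumes}
    return cct, rates


# Hypothetical example: two flows sharing link "b".
volumes = {"f1": 100.0, "f2": 50.0}
routes = {"f1": ["a", "b"], "f2": ["b", "c"]}
capacities = {"a": 10.0, "b": 10.0, "c": 10.0}
cct, rates = min_cct_allocation(volumes, routes, capacities)
print(cct, rates)
```

In this example link "b" carries 150 units over a capacity of 10, so it is the bottleneck and determines the CCT. Choosing different routes changes the per-link loads, which is why routing and bandwidth allocation interact.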


