Do the Hard Stuff First: Scheduling Dependent Computations in Data-Analytics Clusters

04/25/2016
by   Robert Grandl, et al.

We present a scheduler that improves cluster utilization and job completion times by packing tasks that have multi-resource requirements and inter-dependencies. While the problem is algorithmically very hard, we achieve near-optimality on the job DAGs that appear in production clusters at a large enterprise and in benchmarks such as TPC-DS. A key insight is that carefully handling the long-running tasks and those with tough-to-pack resource needs will produce good-enough schedules. However, which subset of tasks to treat carefully is not clear (and intractable to discover). Hence, we offer a search procedure that evaluates various possibilities and outputs a preferred schedule order over tasks. An online component enforces the schedule orders desired by the various jobs running on the cluster. In addition, it packs tasks, overbooks the fungible resources, and guarantees bounded unfairness for a variety of desirable fairness schemes. Relative to the state-of-the-art schedulers, we speed up 50
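To make the "hard stuff first" idea concrete, here is a minimal toy sketch in Python. It is not the paper's algorithm: the `Task` fields, the single-machine capacity vector, the greedy list scheduler in `simulate`, and the duration-threshold search in `best_order` are all assumptions chosen for illustration. The sketch tries several guesses at which tasks count as "troublesome" (here, simply the longest ones), gives those tasks priority, and keeps the order with the smallest simulated makespan.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    # All fields are hypothetical; demands are integer units of (cpu, mem).
    name: str
    duration: int
    demand: tuple
    deps: tuple = ()


def simulate(tasks, capacity, priority):
    """Greedy list scheduling on one machine; returns the makespan.

    At every event time, ready tasks (all dependencies finished) are
    started in `priority` order whenever their demand fits in what is free.
    """
    by_name = {t.name: t for t in tasks}
    rank = {name: i for i, name in enumerate(priority)}
    free = list(capacity)
    running = []            # (finish_time, task_name)
    done = set()
    pending = set(by_name)
    now = 0
    while pending or running:
        ready = sorted(
            (n for n in pending if all(d in done for d in by_name[n].deps)),
            key=rank.__getitem__,
        )
        for n in ready:
            t = by_name[n]
            if all(d <= f for d, f in zip(t.demand, free)):
                free = [f - d for f, d in zip(free, t.demand)]
                running.append((now + t.duration, n))
                pending.remove(n)
        if not running:
            raise ValueError("a task can never be placed (cycle or oversized demand)")
        # Jump to the next completion and release its resources.
        now = min(f for f, _ in running)
        for f, n in [r for r in running if r[0] == now]:
            running.remove((f, n))
            done.add(n)
            free = [fr + d for fr, d in zip(free, by_name[n].demand)]
    return now


def best_order(tasks, capacity):
    """Toy search: try each duration threshold for the 'troublesome' set."""
    names = [t.name for t in tasks]
    best = (simulate(tasks, capacity, names), names)
    for threshold in sorted({t.duration for t in tasks}):
        hard = sorted(
            (t for t in tasks if t.duration >= threshold),
            key=lambda t: -t.duration,
        )
        hard_names = [t.name for t in hard]
        order = hard_names + [n for n in names if n not in hard_names]
        span = simulate(tasks, capacity, order)
        if span < best[0]:
            best = (span, order)
    return best


# Four short tasks and one long task; s4 depends on s1.
tasks = [
    Task("s1", 2, (5, 5)),
    Task("s2", 2, (5, 5)),
    Task("s3", 2, (5, 5)),
    Task("s4", 2, (5, 5), deps=("s1",)),
    Task("L", 8, (5, 5)),
]
capacity = (10, 10)

fifo_span = simulate(tasks, capacity, [t.name for t in tasks])
best_span, order = best_order(tasks, capacity)
print(fifo_span, best_span, order)
```

In this toy instance, running the tasks in arrival order leaves the long task for last, while starting it first lets the short tasks fill the remaining capacity alongside it. A real scheduler would also have to handle multiple machines, fairness, and online arrivals, which this sketch ignores.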

research
04/09/2018

PingAn: An Insurance Scheme for Job Acceleration in Geo-distributed Big Data Analytics System

Geo-distributed data analysis in a cloud-edge system is emerging as a da...
research
05/22/2019

Two stage cluster for resource optimization with Apache Mesos

As resource estimation for jobs is difficult, users often overestimate t...
research
02/14/2021

Hugo: A Cluster Scheduler that Efficiently Learns to Select Complementary Data-Parallel Jobs

Distributed data processing systems like MapReduce, Spark, and Flink are...
research
02/09/2019

Linear Time Algorithms for Multiple Cluster Scheduling and Multiple Strip Packing

We study the Multiple Cluster Scheduling problem and the Multiple Strip ...
research
08/24/2021

The Case for Task Sampling based Learning for Cluster Job Scheduling

The ability to accurately estimate job runtime properties allows a sched...
research
05/28/2021

A Sum-of-Ratios Multi-Dimensional-Knapsack Decomposition for DNN Resource Scheduling

In recent years, to sustain the resource-intensive computational needs f...
research
12/07/2019

BoPF: Mitigating the Burstiness-Fairness Tradeoff in Multi-Resource Clusters

Simultaneously supporting latency- and throughput-sensitive workloads in...
