Scheduling Jobs with Random Resource Requirements in Computing Clusters

01/17/2019
by   Konstantinos Psychas, et al.

We consider a natural scheduling problem that arises in many distributed computing frameworks. Jobs with diverse resource requirements (e.g., memory requirements) arrive over time and must be served by a cluster of servers, each with a finite resource capacity. To improve throughput and delay, the scheduler can pack as many jobs as possible into the servers, subject to their capacity constraints. Motivated by the ever-increasing complexity of workloads in shared clusters, we consider a setting where the jobs' resource requirements belong to a very large number of diverse types or, in the extreme, even infinitely many types, e.g., when resource requirements are drawn from an unknown distribution over a continuous support. Classical scheduling approaches, which crucially rely on a predefined finite set of types, are unattractive in this high- (or infinite-) dimensional setting. We first characterize a fundamental limit on the maximum throughput in such a setting, and then develop oblivious scheduling algorithms that have low complexity and can achieve at least 1/2 and 2/3 of the maximum throughput, without knowledge of the traffic or the resource requirement distribution. Extensive simulation results, using both synthetic and real traffic traces, are presented to verify the performance of our algorithms.
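
To make the packing constraint concrete, the following is a minimal sketch of a generic best-fit heuristic: each job goes to the server with the least residual capacity that can still hold it, and a job that fits nowhere waits. This is not the algorithm developed in the paper, whose details are not given in the abstract. The function name `best_fit_pack`, its parameters, and the single-resource model with identical server capacities are all assumptions made for illustration.

```python
from typing import List, Tuple


def best_fit_pack(
    job_sizes: List[float], num_servers: int, capacity: float = 1.0
) -> Tuple[List[List[float]], List[float]]:
    """Illustrative best-fit packing (hypothetical helper, not the paper's method).

    Jobs are considered in arrival order. Each job is placed on the server
    with the smallest residual capacity that can still hold it. Jobs that
    fit on no server are returned in a waiting list.
    """
    eps = 1e-12  # tolerance for floating-point comparisons
    residual = [capacity] * num_servers
    placement: List[List[float]] = [[] for _ in range(num_servers)]
    waiting: List[float] = []

    for size in job_sizes:
        best = None
        for s in range(num_servers):
            fits = size <= residual[s] + eps
            # Prefer the tightest fit; ties keep the lowest server index.
            if fits and (best is None or residual[s] < residual[best]):
                best = s
        if best is None:
            waiting.append(size)  # no server has room; the job must wait
        else:
            residual[best] -= size
            placement[best].append(size)

    return placement, waiting


if __name__ == "__main__":
    # Two servers with 10 units of memory each (made-up numbers).
    placement, waiting = best_fit_pack([6, 5, 4, 5, 3], num_servers=2, capacity=10)
    print(placement, waiting)
```

Note that the heuristic never needs a predefined list of job types: it only compares each job's size with the residual capacities, so it applies equally when sizes are drawn from a continuous distribution.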

