Sharp Waiting-Time Bounds for Multiserver Jobs

by Yige Hong, et al.

Multiserver jobs, which occupy multiple servers simultaneously during service, are prevalent in today's computing clusters, yet little is known about the delay performance of systems that serve them. We consider queueing models for multiserver jobs in a scaling regime where the total number of servers grows large while both the system load and the number of servers that a job needs scale with the total number of servers. Prior work has derived upper bounds on the queueing probability in this scaling regime. However, without proper lower bounds, the existing results cannot be used to differentiate between policies. In this paper, we study the delay performance by establishing sharp bounds on the mean waiting time of multiserver jobs, where the waiting time of a job is the time it spends queueing, excluding its time in service. We first consider the commonly used First-Come-First-Serve (FCFS) policy and characterize the exact order of its mean waiting time. We then prove a lower bound on the mean waiting time of all policies, and demonstrate that there is an order gap between this lower bound and the mean waiting time under FCFS. We finally complement the lower bound with an achievability result: we show that under a priority policy that we call P-Priority, the mean waiting time achieves the order of the lower bound. This achievability result implies the tightness of the lower bound, the asymptotic optimality of P-Priority, and the strict suboptimality of FCFS.
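
To make the waiting-time definition concrete, the following is a minimal sketch of FCFS for multiserver jobs on a fixed, hand-written job trace. It is an illustration only, not the paper's model or analysis: the function name `fcfs_waiting_times`, the tuple layout of a job, and the example trace are all assumptions introduced here. Each job's waiting time is computed as its start time minus its arrival time, so time in service is excluded.

```python
import heapq


def fcfs_waiting_times(n_servers, jobs):
    """Waiting times under FCFS for multiserver jobs (illustrative sketch).

    jobs: list of (arrival_time, servers_needed, service_time) tuples,
          sorted by arrival_time. This layout is an assumption of the sketch.
    Returns one waiting time (start time minus arrival time) per job.
    """
    free = n_servers   # servers currently idle
    running = []       # min-heap of (finish_time, servers_held)
    clock = 0.0        # start time of the most recently started job
    waits = []
    for arrival, need, service in jobs:
        if need > n_servers:
            raise ValueError("job needs more servers than the system has")
        # FCFS: a job cannot start before the job ahead of it has started.
        clock = max(clock, arrival)
        # Release servers held by jobs that have already finished.
        while running and running[0][0] <= clock:
            _, held = heapq.heappop(running)
            free += held
        # Otherwise advance time to successive departures until the job fits.
        while free < need:
            finish, held = heapq.heappop(running)
            clock = max(clock, finish)
            free += held
        free -= need
        heapq.heappush(running, (clock + service, need))
        waits.append(clock - arrival)
    return waits


# Hypothetical trace on 4 servers: (arrival, servers needed, service time).
trace = [(0.0, 3, 10.0), (1.0, 2, 1.0), (2.0, 1, 1.0)]
waits = fcfs_waiting_times(4, trace)
print(waits)
print(sum(waits) / len(waits))  # mean waiting time over the trace
```

In this trace the third job needs only one server, and one server is idle when it arrives, but FCFS holds it behind the second job, which cannot fit until the first job departs. This head-of-line blocking is one informal way to see why a non-FCFS policy might reduce waiting; it is not a substitute for the bounds stated in the abstract.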



heSRPT: Optimal Parallel Scheduling of Jobs With Known Sizes

When parallelizing a set of jobs across many servers, one must balance a...

Zero Queueing for Multi-Server Jobs

Cloud computing today is dominated by multi-server jobs. These are jobs ...

A Theory of Auto-Scaling for Resource Reservation in Cloud Services

We consider a distributed server system consisting of a large number of ...

Delay Asymptotics and Bounds for Multi-Task Parallel Jobs

We study delay of jobs that consist of multiple parallel tasks, which is...

Bankrupting DoS Attackers Despite Uncertainty

On-demand provisioning in the cloud allows for services to remain availa...

Stability, memory, and messaging tradeoffs in heterogeneous service systems

We consider a heterogeneous distributed service system, consisting of n ...

Differential Approximation and Sprinting for Multi-Priority Big Data Engines

Today's big data clusters based on the MapReduce paradigm are capable of...