Learning to Dispatch Multi-Server Jobs in Bipartite Graphs with Unknown Service Rates

04/09/2022
by Hailiang Zhao, et al.

Multi-server jobs are essential in modern cloud computing systems. A multi-server job has multiple components and requests multiple servers in order to be served. How to allocate limited computing devices to jobs is a topic of great concern, and it has driven a thriving body of work on job scheduling and load balancing algorithms. However, current job dispatching algorithms require the service rates to be constant and known, which is difficult to guarantee in production systems. Besides, for multi-server jobs, the dispatching decision for each job component follows the All-or-Nothing property under service locality constraints and resource capacity limits, which is not well supported by mainstream algorithms. In this paper, we propose a dispatching algorithm for multi-server jobs that learns the unknown service rates and simultaneously maximizes the expected Accumulative Social Welfare (ASW). We formulate the ASW as the sum of the utilities of jobs and servers achieved over each time slot. The utility of a job is proportional to its valuation for being served, which is mainly affected by the fluctuating but unknown service rates. We maximize the ASW without knowing the exact valuations, approximating them instead through exploration and exploitation. To this end, we introduce several evolving statistics and maximize the statistical ASW with dynamic programming. The proposed algorithm is proved to have polynomial complexity and a state-of-the-art regret. We validate it with extensive simulations, and the results show that the proposed algorithm outperforms several benchmark policies with improvements by up to 73
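The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea it describes: keeping running statistics on unknown service rates, acting on optimistic estimates (exploration-exploitation), and dispatching a job's components on an all-or-nothing basis under locality and capacity constraints. It is not the authors' method. In particular, it substitutes a standard UCB1-style index and a simple greedy placement for the paper's evolving statistics and dynamic programming, and every name here (`RateEstimator`, `ucb_index`, `dispatch_all_or_nothing`, the `allowed` and `capacity` arguments) is hypothetical.

```python
import math


def ucb_index(mean, pulls, t, scale=1.0):
    """Optimistic (UCB1-style) estimate of a service rate.

    Assumption: this is the textbook UCB1 bonus, not the paper's statistic.
    Unobserved edges get an infinite index so they are explored first.
    """
    if pulls == 0:
        return float("inf")
    return mean + scale * math.sqrt(2.0 * math.log(t) / pulls)


class RateEstimator:
    """Running mean and observation count per (component, server) edge."""

    def __init__(self):
        self.mean = {}
        self.pulls = {}

    def update(self, edge, observed_rate):
        # Incremental mean update after observing one service rate sample.
        n = self.pulls.get(edge, 0) + 1
        m = self.mean.get(edge, 0.0)
        self.mean[edge] = m + (observed_rate - m) / n
        self.pulls[edge] = n

    def index(self, edge, t):
        return ucb_index(self.mean.get(edge, 0.0), self.pulls.get(edge, 0), t)


def dispatch_all_or_nothing(components, allowed, capacity, est, t):
    """Greedily place every component of one job, or place none.

    components: list of component names of a single job.
    allowed[c]: servers that component c may use (service locality).
    capacity[s]: free slots on server s (resource capacity).
    Returns {component: server}, or None if some component cannot be placed.
    Greedy placement is a stand-in for the paper's dynamic programming and
    can miss feasible assignments that a smarter search would find.
    """
    free = dict(capacity)  # work on a copy so a failed job consumes nothing
    plan = {}
    for c in components:
        candidates = [s for s in allowed[c] if free.get(s, 0) > 0]
        if not candidates:
            return None  # all-or-nothing: reject the whole job
        best = max(candidates, key=lambda s: est.index((c, s), t))
        plan[c] = best
        free[best] -= 1
    return plan


est = RateEstimator()
est.update(("a", "s1"), 1.0)
est.update(("a", "s2"), 3.0)
allowed = {"a": ["s1", "s2"], "b": ["s2"]}
plan_ok = dispatch_all_or_nothing(["a", "b"], allowed, {"s1": 1, "s2": 2}, est, t=2)
plan_rejected = dispatch_all_or_nothing(["a", "b"], allowed, {"s1": 1, "s2": 1}, est, t=2)
```

In this toy run, the first call has room for both components on `s2`, while the second call rejects the job entirely because the greedy choice for `a` uses the only slot that `b` could have taken. That rejection illustrates why a greedy rule is only a rough stand-in for an exact optimization over the bipartite graph.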

research
09/02/2022

MaxWeight With Discounted UCB: A Provably Stable Scheduling Policy for Nonstationary Multi-Server Systems With Unknown Statistics

Multi-server queueing systems are widely used models for job scheduling ...
research
05/28/2020

A Theory of Auto-Scaling for Resource Reservation in Cloud Services

We consider a distributed server system consisting of a large number of ...
research
05/11/2023

Scheduling Multi-Server Jobs with Sublinear Regrets via Online Learning

Nowadays, multi-server jobs, which request multiple computing devices an...
research
10/05/2016

10-millisecond Computing

Despite computation becomes much complex on data with an unprecedented s...
research
12/05/2021

Online Social Welfare Maximization with Spatio-Temporal Resource Mesh for Serverless

Serverless computing is leading the way to a simplified and general purp...
research
03/11/2020

Covert Cycle Stealing in a Single FIFO Server

Consider a setting where Willie generates a Poisson stream of jobs and r...
research
05/27/2020

Threshold-based rerouting and replication for resolving job-server affinity relations

We consider a system with several job types and two parallel server pool...
