A Theory of Auto-Scaling for Resource Reservation in Cloud Services

05/28/2020
by Konstantinos Psychas, et al.

We consider a distributed server system consisting of a large number of servers, each with limited capacity on multiple resources (CPU, memory, disk, etc.). Jobs with different rewards arrive over time and require certain amounts of resources for the duration of their service. When a job arrives, the system must decide whether to admit or reject it and, if it is admitted, on which server to schedule it. The objective is to maximize the expected total reward received by the system. This problem is motivated by the control of cloud computing clusters, in which jobs are requests for Virtual Machines or Containers that reserve resources for various services, and rewards represent the service priority of requests or the price paid by clients per time unit of service. We study this problem in an asymptotic regime where the number of servers and the jobs' arrival rates scale by a factor L, as L becomes large. We propose a resource reservation policy that asymptotically achieves at least 1/2 of the optimal expected reward and, under a certain monotone property on the jobs' rewards and resources, at least 1-1/e of it. The policy automatically scales the number of VM slots for each job type as the demand changes, and decides in advance on which servers the slots should be created, without knowledge of the traffic rates. It effectively tracks a low-complexity greedy packing of the existing jobs in the system while maintaining only a small number, g(L)=ω(log L), of reserved VM slots for high-priority jobs that pack well.
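
To make the admit-or-reject and placement decision concrete, here is a minimal sketch in Python. It uses a plain first-fit rule over servers with multi-resource capacities, which is an assumption made purely for illustration and is not the reservation policy proposed in the paper. The names `Server`, `fits`, `place`, and `admit` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """A server with a fixed capacity per resource, e.g. (cpu, memory)."""
    capacity: tuple
    used: list = field(default_factory=list)

    def __post_init__(self):
        # Start with nothing allocated on any resource.
        if not self.used:
            self.used = [0] * len(self.capacity)

    def fits(self, demand):
        """True if the demand fits in the remaining capacity on every resource."""
        return all(u + d <= c
                   for u, d, c in zip(self.used, demand, self.capacity))

    def place(self, demand):
        """Reserve the demanded resources on this server."""
        self.used = [u + d for u, d in zip(self.used, demand)]


def admit(servers, demand):
    """Illustrative first-fit admission (not the paper's policy).

    Returns the index of the first server that can host the job,
    or None if the job is rejected because no server has room.
    """
    for index, server in enumerate(servers):
        if server.fits(demand):
            server.place(demand)
            return index
    return None
```

With two servers of capacity `(4, 8)`, a job demanding `(3, 4)` is placed on the first server, a following job demanding `(2, 2)` goes to the second because the first would exceed its CPU capacity, and a job demanding `(4, 8)` is then rejected. A rule like this ignores rewards entirely, which is the gap that a reward-aware reservation policy is meant to close.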


