Capacity Allocation for Clouds with Parallel Processing, Batch Arrivals, and Heterogeneous Service Requirements

09/19/2022
by Eugene Furman, et al.

Problem Definition: Allocating sufficient capacity to cloud services is a challenging task, especially when demand is time-varying, heterogeneous, contains batches, and requires multiple types of resources for processing. In this setting, providers decide whether to reserve portions of their capacity for individual job classes or to offer it in a flexible manner. Methodology/results: In collaboration with Huawei Cloud, a worldwide provider of cloud services, we propose a heuristic policy that allocates multiple types of resources to jobs and also satisfies their pre-specified service level agreements (SLAs). We model the system as a multi-class queueing network with parallel processing and multiple types of resources, where arrivals (i.e., virtual machines and containers) follow time-varying patterns and require at least one unit of each resource for processing. While virtual machines leave if they are not served immediately, containers can join a queue. We introduce a diffusion approximation of the offered load of such a system and investigate its fidelity against the observed data. Then, we develop a heuristic approach that leverages this approximation to determine capacity levels that satisfy probabilistic SLAs in the system with fully flexible servers. Managerial Implications: Using a data set of cloud computing requests over a representative 8-day period from Huawei Cloud, we show that our heuristic policy results in a 20% cost reduction compared to a benchmark that reserves resources. In addition, we show that the system utilization induced by our policy is superior to the benchmark, i.e., it implies less idling of resources in most instances. Thus, our approach enables cloud operators to both reduce costs and achieve better performance.
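
The idea of sizing capacity against a probabilistic SLA can be illustrated with a minimal sketch. This is not the authors' heuristic. It assumes that the offered load of a single resource type is approximately normal with a known mean and variance, and it returns the smallest integer capacity whose overflow probability stays within a target. The function name `capacity_for_sla` and every numeric value are hypothetical, chosen only for illustration.

```python
import math
from statistics import NormalDist


def capacity_for_sla(mean_load: float, var_load: float, violation_prob: float) -> int:
    """Smallest integer capacity c such that P(load > c) <= violation_prob.

    Assumption (not from the paper): the offered load of one resource
    type is approximately normal with the given mean and variance.
    """
    if not 0.0 < violation_prob < 1.0:
        raise ValueError("violation_prob must lie strictly between 0 and 1")
    if var_load < 0.0:
        raise ValueError("var_load must be non-negative")
    # Standard normal quantile for the required service level.
    z = NormalDist().inv_cdf(1.0 - violation_prob)
    return math.ceil(mean_load + z * math.sqrt(var_load))


# Hypothetical per-resource load estimates as (mean, variance) pairs.
loads = {"cpu_cores": (100.0, 400.0), "memory_gb": (800.0, 10000.0)}
capacities = {name: capacity_for_sla(m, v, 0.05) for name, (m, v) in loads.items()}
```

Applying the function once per resource type, as in the last lines, treats the resources independently. A model with jobs that need several resources at once and with time-varying arrivals, as described in the abstract, would require the mean and variance to be recomputed per time interval and the resources to be considered jointly.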

research
05/28/2020

A Theory of Auto-Scaling for Resource Reservation in Cloud Services

We consider a distributed server system consisting of a large number of ...
research
04/12/2020

Service Level Driven Job Scheduling in Multi-Tier Cloud Computing: A Biologically Inspired Approach

Cloud computing environments often have to deal with random-arrival comp...
research
07/08/2022

Tackling Heterogeneous Traffic in Multi-access Systems via Erasure Coded Servers

Most data generated by modern applications is stored in the cloud, and t...
research
05/20/2021

Approximation Algorithms for the NFV Service Distribution Problem

Distributed cloud networking builds on network functions virtualization ...
research
06/03/2021

Towards Cost-Optimal Policies for DAGs to Utilize IaaS Clouds with Online Learning

Premier cloud service providers (CSPs) offer two types of purchase optio...
research
04/11/2019

A Processor-Sharing model for the Performance of Virtualized Network Functions

The parallel execution of requests in a Cloud Computing platform, as for...
research
01/25/2022

Learning Resource Allocation Policies from Observational Data with an Application to Homeless Services Delivery

We study the problem of learning, from observational data, fair and inte...
