Optimal Service Elasticity in Large-Scale Distributed Systems

03/24/2017
by Debankur Mukherjee, et al.

A fundamental challenge in large-scale cloud networks and data centers is to achieve highly efficient server utilization and limit energy consumption, while providing excellent user-perceived performance in the presence of uncertain and time-varying demand patterns. Auto-scaling provides a popular paradigm for automatically adjusting service capacity in response to demand while meeting performance targets, and queue-driven auto-scaling techniques have been widely investigated in the literature. In typical data center architectures and cloud environments, however, no centralized queue is maintained, and load balancing algorithms immediately distribute incoming tasks among parallel queues. In these distributed settings with vast numbers of servers, centralized queue-driven auto-scaling techniques involve a substantial communication overhead and major implementation burden, or may not even be viable at all. Motivated by these issues, we propose a joint auto-scaling and load balancing scheme which does not require any global queue length information or explicit knowledge of system parameters, and yet provides provably near-optimal service elasticity. We establish the fluid-level dynamics for the proposed scheme in a regime where the total traffic volume and nominal service capacity grow large in proportion. The fluid-limit results show that the proposed scheme achieves asymptotic optimality in terms of user-perceived delay performance as well as energy consumption. Specifically, we prove that both the waiting time of tasks and the relative energy portion consumed by idle servers vanish in the limit. At the same time, the proposed scheme operates in a distributed fashion and involves only constant communication overhead per task, thus ensuring scalability in massive data center operations.


research
04/04/2022

Asynchronous Load Balancing and Auto-scaling: Mean-Field Limit and Optimal Design

We introduce a Markovian framework for load balancing where classical al...
research
03/20/2018

Join-Idle-Queue with Service Elasticity: Large-Scale Asymptotics of a Non-monotone System

We consider the model of a token-based joint auto-scaling and load balan...
research
12/22/2017

Scalable Load Balancing in Networked Systems: Universality Properties and Stochastic Coupling Methods

We present an overview of scalable load balancing algorithms which provi...
research
12/14/2020

Optimal Hyper-Scalable Load Balancing with a Strict Queue Limit

Load balancing plays a critical role in efficiently dispatching jobs in ...
research
06/14/2018

Scalable load balancing in networked systems: A survey of recent advances

The basic load balancing scenario involves a single dispatcher where tas...
research
12/18/2020

Learning and balancing time-varying loads in large-scale systems

Consider a system of n parallel server pools where tasks arrive as a tim...
research
12/25/2019

Large fork-join networks with nearly deterministic service times

In this paper, we study an N server fork-join queueing network with near...
