LSQ: Load Balancing in Large-Scale Heterogeneous Systems with Multiple Dispatchers

03/04/2020
by Shay Vargaftik, et al.

The efficiency and even the feasibility of traditional load-balancing policies are challenged by the rapid growth of cloud infrastructure and increasing server heterogeneity. In such heterogeneous systems with many dispatchers, traditional solutions such as Join-the-Shortest-Queue (JSQ) incur a prohibitively large communication overhead and detrimental incast effects due to herd behavior. Alternative low-communication policies, such as JSQ(d) and the recently proposed Join-the-Idle-Queue (JIQ), are either unstable or provide poor performance. We introduce the Local Shortest Queue (LSQ) family of load-balancing algorithms. In these algorithms, each dispatcher maintains its own local, and possibly outdated, view of the server queue lengths and applies JSQ to that local view. A small amount of infrequent communication is used to update the local view. We formally prove that as long as the error in these local estimates of the server queue lengths is bounded in expectation, the entire system is strongly stable. Finally, in simulations, we show that simple and stable LSQ policies exhibit appealing performance and significantly outperform existing low-communication policies while using an equivalent communication budget. In particular, our simple policies often outperform even JSQ because they reduce herd behavior. We also show how relying on smart servers (i.e., advanced pull-based communication) can further improve performance and lower communication overhead.
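
To make the idea of running JSQ on a local view concrete, the following is a minimal Python sketch, not the authors' implementation. The abstract does not specify how or when the local view is refreshed, so the update rule here (each dispatcher probes one uniformly random server after every dispatch) is an assumption chosen for illustration, as are the class name `Dispatcher`, the function `simulate`, the discrete-round service model, and all parameter values.

```python
import random


class Dispatcher:
    """Holds a local, possibly stale, estimate of every server's queue length."""

    def __init__(self, num_servers):
        self.local_view = [0] * num_servers

    def dispatch(self, rng):
        # JSQ applied to the local view: pick a server whose estimated queue
        # is shortest, breaking ties uniformly at random.
        shortest = min(self.local_view)
        candidates = [i for i, q in enumerate(self.local_view) if q == shortest]
        target = rng.choice(candidates)
        self.local_view[target] += 1  # account for the job just sent
        return target

    def refresh(self, server, true_length):
        # Overwrite one local estimate with the server's actual queue length.
        self.local_view[server] = true_length


def simulate(num_servers=4, num_dispatchers=2, rounds=1000, seed=0):
    rng = random.Random(seed)
    queues = [0] * num_servers  # true queue lengths
    dispatchers = [Dispatcher(num_servers) for _ in range(num_dispatchers)]
    for _ in range(rounds):
        for d in dispatchers:
            target = d.dispatch(rng)
            queues[target] += 1
            # Assumed update rule: probe one random server per dispatch.
            probe = rng.randrange(num_servers)
            d.refresh(probe, queues[probe])
        # Assumed service model: each busy server finishes one job
        # with probability 0.8 per round.
        for s in range(num_servers):
            if queues[s] > 0 and rng.random() < 0.8:
                queues[s] -= 1
    return queues, dispatchers


if __name__ == "__main__":
    final_queues, _ = simulate()
    print("final true queue lengths:", final_queues)
```

Each dispatcher sends only one probe per job in this sketch, whereas exact JSQ would need the current length of every queue for every decision. Because different dispatchers hold different stale views, they are also less likely to all pick the same server at once, which is the herd behavior the abstract refers to.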

research
12/22/2017

Scalable Load Balancing in Networked Systems: Universality Properties and Stochastic Coupling Methods

We present an overview of scalable load balancing algorithms which provi...
research
08/03/2020

Distributed Dispatching in the Parallel Server Model

With the rapid increase in the size and volume of cloud services and dat...
research
02/20/2020

Asymptotically Optimal Load Balancing in Large-scale Heterogeneous Systems with Multiple Dispatchers

We consider the load balancing problem in large-scale heterogeneous syst...
research
06/06/2022

CARE: Resource Allocation Using Sparse Communication

We propose a new framework for studying effective resource allocation in...
research
06/24/2020

Scalable Load Balancing in the Presence of Heterogeneous Servers

Heterogeneity is becoming increasingly ubiquitous in modern large-scale ...
research
10/29/2020

Self-Learning Threshold-Based Load Balancing

We consider a large-scale service system where incoming tasks have to be...
research
12/16/2021

Utility maximizing load balancing policies

Consider a service system where incoming tasks are instantaneously dispa...
