AI-based Resource Allocation: Reinforcement Learning for Adaptive Auto-scaling in Serverless Environments

05/29/2020
by Lucia Schuler, et al.

Serverless computing has emerged as a compelling new cloud computing paradigm in recent years. It promises users services at large scale and low cost while eliminating the need for infrastructure management. On the cloud provider side, flexible resource management is required to meet fluctuating demand; it can be enabled through automated provisioning and deprovisioning of resources. A common approach among both commercial and open-source serverless computing platforms is workload-based auto-scaling, where a designated algorithm scales instances according to the number of incoming requests. The serverless framework Knative proposes a request-based policy, in which the algorithm scales resources according to a configured maximum number of requests that each instance can process in parallel, the so-called concurrency. As we show in a baseline experiment, this predefined concurrency level can strongly influence the performance of a serverless application. However, identifying the concurrency configuration that yields the highest possible quality of service is challenging, because various factors, e.g. varying workload and complex infrastructure characteristics, influence throughput and latency. While there has been considerable research into intelligent techniques for optimizing auto-scaling in virtual machine provisioning, this topic has not yet been discussed in the area of serverless computing. For this reason, we investigate whether a reinforcement learning approach that has proven effective in dynamic virtual machine provisioning can be applied to request-based auto-scaling in a serverless framework. Our results show that, within a limited number of iterations, our proposed model learns an effective scaling policy per workload and improves performance compared to the default auto-scaling configuration.
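To make the idea concrete, the following is a minimal sketch of the two ingredients described above: request-based scaling driven by a concurrency setting, and a reinforcement learning agent that adjusts that setting. Several things here are assumptions rather than details from the abstract: the choice of tabular Q-learning (the abstract does not name the algorithm), the candidate concurrency levels, the toy reward function that stands in for measured throughput and latency, and all function names. The scaling rule is also simplified; a real autoscaler typically averages the observed load over a time window.

```python
import math

# Hypothetical candidate concurrency levels and actions (index shifts).
LEVELS = [10, 20, 30, 40, 50]
ACTIONS = [-1, 0, 1]  # decrease, keep, increase the concurrency level


def desired_instances(in_flight_requests: float, concurrency: int) -> int:
    """Simplified request-based scaling: each instance serves at most
    `concurrency` requests in parallel; keep at least one instance."""
    return max(1, math.ceil(in_flight_requests / concurrency))


def toy_reward(concurrency: int, best: int = 30) -> float:
    """Stand-in for a measured performance signal (assumption): the
    reward is highest at an unknown best concurrency level."""
    return -abs(concurrency - best) / 10.0


def train(episodes: int = 200, steps: int = 20, alpha: float = 0.8,
          gamma: float = 0.9, epsilon: float = 0.2, seed: int = 0) -> dict:
    """Tabular Q-learning over concurrency levels with epsilon-greedy
    exploration. States are indices into LEVELS."""
    import random
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(len(LEVELS)) for a in ACTIONS}
    for episode in range(episodes):
        # Cycle through start levels so every state is visited.
        s = episode % len(LEVELS)
        for _ in range(steps):
            if rng.random() < epsilon:
                a = rng.choice(ACTIONS)  # explore
            else:
                a = max(ACTIONS, key=lambda x: q[(s, x)])  # exploit
            # Apply the action, clamped to the valid range of levels.
            s_next = min(len(LEVELS) - 1, max(0, s + a))
            r = toy_reward(LEVELS[s_next])
            best_next = max(q[(s_next, x)] for x in ACTIONS)
            # Standard Q-learning update.
            q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])
            s = s_next
    return q


def greedy_policy(q: dict) -> dict:
    """Map each concurrency level to its highest-valued action."""
    return {LEVELS[s]: max(ACTIONS, key=lambda a: q[(s, a)])
            for s in range(len(LEVELS))}


if __name__ == "__main__":
    policy = greedy_policy(train())
    print(policy)
    print(desired_instances(250, 50))
```

In this toy setting the learned policy moves the concurrency toward the level with the highest reward and then holds it there. In a real deployment the reward would come from observed throughput and latency under the actual workload, which is what makes the best level hard to set by hand.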

