Interference and Need Aware Workload Colocation in Hyperscale Datacenters

07/25/2022
by Sayak Chakraborti, et al.

Datacenters suffer from resource utilization inefficiencies due to the conflicting goals of service owners and platform providers. Service owners, intent on maintaining their Service Level Objectives (SLOs), typically request a conservative amount of resources. Platform providers want to increase operational efficiency to reduce capital and operating costs. Achieving both operational efficiency and SLOs for individual services at the same time is challenging due to the diversity in service workload characteristics, resource usage patterns that depend on input load, heterogeneity in platform, memory, I/O, and network architecture, and resource bundling. This paper presents a tunable approach to resource allocation that accounts for both dynamic service resource needs and platform heterogeneity. In addition, an online K-Means-based service classification method is used in conjunction with an offline sensitivity component. Our tunable approach allows trading resource utilization efficiency for absolute SLO guarantees based on each service owner's sensitivity to its SLO. We evaluate our tunable resource allocator at scale in a private cloud environment with mostly latency-critical workloads. When tuning for operational efficiency, we demonstrate up to 50% reduction in required machines; 40% reduction in total cost of ownership (TCO); and 60% [...] increasing the number of tasks experiencing degradation of SLO by up to 25% compared to the baseline. When tuning for SLO, by introducing interference-aware colocation, we can tune the solver to reduce tasks experiencing degradation of SLO by up to 22% [...] an additional cost of 30% [...] trade-off between TCO and SLO violations, and offer tuning based on the requirements of the platform owners.
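The abstract names an online K-Means-based service classification step but gives no further detail. As a rough illustration only, the sketch below shows one common way to run K-Means online: the sequential update, where each new sample moves its nearest centroid toward itself. The two-feature samples (CPU and memory utilization fractions), the seed centroids, the class name `OnlineKMeans`, and the choice to count each seed as one observation are all assumptions made for this example, not details taken from the paper.

```python
from dataclasses import dataclass, field
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


@dataclass
class OnlineKMeans:
    """Sequential K-Means: one centroid update per incoming sample (hypothetical helper)."""
    centroids: List[List[float]]                  # seed centroids, one per class
    counts: List[int] = field(default_factory=list)

    def __post_init__(self) -> None:
        if not self.counts:
            # Assumption: each seed centroid counts as one prior observation.
            self.counts = [1] * len(self.centroids)

    def predict(self, sample: Sequence[float]) -> int:
        """Return the index of the nearest centroid."""
        distances = [squared_distance(sample, c) for c in self.centroids]
        return distances.index(min(distances))

    def update(self, sample: Sequence[float]) -> int:
        """Assign the sample to its nearest class and move that centroid toward it."""
        k = self.predict(sample)
        self.counts[k] += 1
        rate = 1.0 / self.counts[k]               # running-mean step size
        self.centroids[k] = [
            c + rate * (x - c) for c, x in zip(self.centroids[k], sample)
        ]
        return k


# Hypothetical (cpu_utilization, memory_utilization) samples for three services.
model = OnlineKMeans(centroids=[[0.2, 0.2], [0.8, 0.8]])
labels = [model.update(s) for s in [(0.1, 0.3), (0.9, 0.7), (0.3, 0.1)]]
print(labels, model.centroids)
```

With a step size of one over the class count, each centroid stays equal to the running mean of its seed and the samples assigned to it, so the classifier can follow a stream of usage samples without storing them.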

