Intelligent Resource Scheduling for Co-located Latency-critical Services: A Multi-Model Collaborative Learning Approach

11/26/2019
by   Lei Liu, et al.

Latency-critical (LC) services are widely deployed in cloud environments. For cost efficiency, multiple services are usually co-located on a single server, so run-time resource scheduling becomes pivotal for QoS control in these complicated co-location cases. However, the scheduling exploration space grows rapidly as server resources increase, which makes it hard for schedulers to find ideal solutions quickly. More importantly, we observe that there are "resource cliffs" in the scheduling exploration space. They reduce exploration efficiency and lead to severe QoS fluctuations, and previous schedulers cannot easily avoid them. To address these problems, we propose OSML, a novel ML-based intelligent scheduler. It learns the correlation between architectural hints (e.g., IPC, cache misses, and memory footprint), scheduling solutions, and QoS demands from a data set we collected from 11 widely deployed services running on off-the-shelf servers. OSML employs multiple ML models that work collaboratively to predict QoS variations, shepherd the scheduling, and recover from QoS violations in complicated co-location cases. OSML can intelligently avoid resource cliffs during scheduling and reach an optimal solution much faster than previous approaches for co-located LC services. Experimental results show that OSML supports higher loads and meets QoS targets with lower scheduling overhead and shorter convergence time than previous work.
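To make the idea of a "resource cliff" concrete, the sketch below scans a small grid of allocations and flags every place where removing a single unit of one resource multiplies latency by a large factor. It is an illustration only, not OSML's method: the working definition of a cliff, the function name `find_cliffs`, the `jump_ratio` threshold, and the latency table are all assumptions, and the numbers are synthetic rather than measurements from the paper.

```python
from typing import Dict, List, Tuple

# An allocation is (cores, LLC ways). Both dimensions are illustrative choices.
Allocation = Tuple[int, int]


def find_cliffs(
    latency_ms: Dict[Allocation, float],
    jump_ratio: float = 3.0,  # hypothetical threshold, not taken from the paper
) -> List[Tuple[Allocation, Allocation]]:
    """Return (from, to) pairs where taking away one unit of a single
    resource raises latency by at least a factor of jump_ratio."""
    cliffs = []
    for (cores, ways), lat in sorted(latency_ms.items()):
        # Neighbors that have one core fewer or one cache way fewer.
        for neighbor in ((cores - 1, ways), (cores, ways - 1)):
            if neighbor in latency_ms and latency_ms[neighbor] >= jump_ratio * lat:
                cliffs.append(((cores, ways), neighbor))
    return cliffs


# Synthetic tail latencies in milliseconds, invented for this example.
SYNTHETIC_LATENCY_MS: Dict[Allocation, float] = {
    (4, 4): 2.0, (4, 3): 2.2, (4, 2): 2.5,
    (3, 4): 2.1, (3, 3): 2.4, (3, 2): 9.0,
    (2, 4): 2.6, (2, 3): 8.5, (2, 2): 30.0,
}

cliffs = find_cliffs(SYNTHETIC_LATENCY_MS)
for src, dst in cliffs:
    print(f"cliff: {src} -> {dst}")
```

In this toy grid, a scheduler that steps from (3, 3) to (3, 2) to reclaim one cache way would fall off a cliff, while stepping from (4, 4) to (4, 3) is safe. A scheduler that explores by trial and error only discovers such edges after the QoS violation has already happened, which is the difficulty the abstract describes.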


research
01/19/2022

PROMPT: Learning Dynamic Resource Allocation Policies for Edge-Network Applications

A growing number of service providers are exploring methods to improve s...
research
05/10/2023

MoCA: Memory-Centric, Adaptive Execution for Multi-Tenant Deep Neural Networks

Driven by the wide adoption of deep neural networks (DNNs) across differ...
research
04/24/2018

Seer: Leveraging Big Data to Navigate the Increasing Complexity of Cloud Debugging

Performance unpredictability in cloud services leads to poor user experi...
research
04/12/2018

Pliant: Leveraging Approximation to Improve Datacenter Resource Efficiency

Cloud multi-tenancy is typically constrained to a single interactive ser...
research
04/30/2021

QoS-Aware Placement of Deep Learning Services on the Edge with Multiple Service Implementations

Mobile edge computing pushes computationally-intensive services closer t...
research
11/16/2020

A Probability Distribution and Location-aware ResNet Approach for QoS Prediction

In recent years, the number of online services has grown rapidly, invoke...
research
01/17/2022

VELTAIR: Towards High-Performance Multi-tenant Deep Learning Services via Adaptive Compilation and Scheduling

Deep learning (DL) models have achieved great success in many applicatio...
