
PROMPT: Learning Dynamic Resource Allocation Policies for Edge-Network Applications

by Drew Penney, et al.
Oregon State University

A growing number of service providers are exploring methods to improve server utilization, reduce power consumption, and reduce total cost of ownership by co-scheduling high-priority latency-critical workloads with best-effort workloads. This practice requires strict resource allocation between workloads to reduce resource contention and maintain Quality of Service (QoS) guarantees. Prior resource allocation works have been shown to improve server utilization under ideal circumstances, yet often compromise QoS guarantees or fail to find valid resource allocations in more dynamic operating environments. Further, prior works rely fundamentally on QoS measurements that can, in practice, exhibit significant transient fluctuations, so stable control behavior cannot be reliably achieved. In this paper, we propose a novel framework for dynamic resource allocation based on proactive QoS prediction. These predictions help guide a reinforcement-learning-based resource controller towards optimal resource allocations while avoiding transient QoS violations due to fluctuating workload demands. Evaluation shows that the proposed method incurs 4.3x fewer QoS violations, reduces the severity of QoS violations by 3.7x, improves best-effort workload performance, and improves overall power efficiency compared with prior work.



