Collaborative Learning-Based Scheduling for Kubernetes-Oriented Edge-Cloud Network

05/10/2023
by Shihao Shen, et al.

Kubernetes (k8s) has the potential to coordinate distributed edge resources and centralized cloud resources, but it currently lacks a specialized scheduling framework for edge-cloud networks. Moreover, the hierarchical distribution of heterogeneous resources makes modeling and scheduling a k8s-oriented edge-cloud network particularly challenging. In this paper, we introduce KaiS, a learning-based scheduling framework for such edge-cloud networks that improves the long-term throughput rate of request processing. First, we design a coordinated multi-agent actor-critic algorithm to handle decentralized request dispatch and dynamic dispatch spaces within the edge cluster. Second, to accommodate diverse system scales and structures, we use graph neural networks to embed system state information, and we combine the embedding results with multiple policy networks to reduce the orchestration dimensionality through stepwise scheduling. Finally, we adopt a two-time-scale scheduling mechanism to harmonize request dispatch and service orchestration, and we present an implementation design that deploys these algorithms in a way compatible with native k8s components. Experiments using real workload traces show that KaiS can successfully learn appropriate scheduling policies, irrespective of request arrival patterns and system scales. Moreover, KaiS can enhance the average system throughput rate by 15.9% compared to baselines.
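
The two-time-scale mechanism mentioned in the abstract can be illustrated with a minimal sketch. This is not the KaiS implementation: the function name run_two_time_scale, the frame_len parameter, the dispatch and orchestrate callbacks, and the choice to orchestrate at the start of each frame are all assumptions made for illustration. The sketch only shows the general idea of running request dispatch on a fast time scale (every slot) and service orchestration on a slow one (once per frame of several slots).

```python
from typing import Callable, List


def run_two_time_scale(
    num_slots: int,
    frame_len: int,
    dispatch: Callable[[int], None],
    orchestrate: Callable[[int], None],
) -> List[str]:
    """Call `dispatch` every slot and `orchestrate` once per frame.

    Hypothetical helper: names and the start-of-frame trigger are
    illustrative assumptions, not taken from the paper.
    """
    if frame_len <= 0:
        raise ValueError("frame_len must be positive")

    log: List[str] = []
    for slot in range(num_slots):
        # Slow time scale: service orchestration at the start of each frame.
        if slot % frame_len == 0:
            orchestrate(slot)
            log.append(f"orchestrate@{slot}")
        # Fast time scale: request dispatch in every slot.
        dispatch(slot)
        log.append(f"dispatch@{slot}")
    return log
```

In a real system the two callbacks would wrap learned policies (for example, the dispatch agents and the orchestration policy networks); here they are left as plain callables so the scheduling cadence itself is easy to see.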

research
01/17/2021

Tailored Learning-Based Scheduling for Kubernetes-Oriented Edge-Cloud System

Kubernetes (k8s) has the potential to merge the distributed edge and the...
research
03/20/2022

EdgeMatrix: A Resources Redefined Edge-Cloud System for Prioritized Services

The edge-cloud system has the potential to combine the advantages of het...
research
08/01/2023

EdgeMatrix: A Resource-Redefined Scheduling Framework for SLA-Guaranteed Multi-Tier Edge-Cloud Computing Systems

With the development of networking technology, the computing system has ...
research
02/01/2023

Task Placement and Resource Allocation for Edge Machine Learning: A GNN-based Multi-Agent Reinforcement Learning Paradigm

Machine learning (ML) tasks are one of the major workloads in today's ed...
research
10/25/2020

LazyBatching: An SLA-aware Batching System for Cloud Machine Learning Inference

In cloud ML inference systems, batching is an essential technique to inc...
research
09/01/2020

Dynamic Scheduling for Stochastic Edge-Cloud Computing Environments using A3C learning and Residual Recurrent Neural Networks

The ubiquitous adoption of Internet-of-Things (IoT) based applications h...
research
07/02/2023

Collaborative Policy Learning for Dynamic Scheduling Tasks in Cloud-Edge-Terminal IoT Networks Using Federated Reinforcement Learning

In this paper, we examine cloud-edge-terminal IoT networks, where edges ...
