Better than the Best: Gradient-based Improper Reinforcement Learning for Network Scheduling

05/01/2021
by Mohammadi Zaki, et al.

We consider the problem of scheduling in constrained queueing networks with a view to minimizing packet delay. Modern communication systems are becoming increasingly complex and are required to handle multiple types of traffic with widely varying characteristics, such as arrival rates and service times. This, coupled with the need for rapid network deployment, renders infeasible a bottom-up approach of first characterizing the traffic and then devising an appropriate scheduling protocol. In contrast, we formulate a top-down approach to scheduling: given an unknown network and a set of scheduling policies, we use a policy-gradient-based reinforcement learning algorithm to produce a scheduler that performs better than the available atomic policies. We derive convergence results and analyze the finite-time performance of the algorithm. Simulation results show that the algorithm performs well even when the arrival rates are nonstationary, and that it can stabilize the system even when the constituent policies are unstable.
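To make the idea of learning over a set of given policies concrete, here is a minimal sketch, not the authors' algorithm. It assumes a toy system that the abstract does not describe: two queues sharing one server, Bernoulli arrivals, and two hypothetical atomic policies that each always serve a single queue, so each is unstable on its own. A softmax weighting over the two policies is tuned with a plain REINFORCE-style update to reduce the average queue length. All function names, parameters, and numeric values are illustrative assumptions.

```python
import math
import random


def softmax(theta):
    """Numerically stable softmax over a list of floats."""
    m = max(theta)
    exps = [math.exp(t - m) for t in theta]
    z = sum(exps)
    return [e / z for e in exps]


# Hypothetical atomic policies: each always serves one fixed queue,
# so the other queue is never served and grows without bound.
def serve_queue_0(queues):
    return 0


def serve_queue_1(queues):
    return 1


def run_episode(theta, base_policies, arrival_probs, service_prob, horizon, rng):
    """Simulate one episode under the softmax mixture of base policies.

    Returns the total queue-length cost and the summed score function,
    i.e. the sum over time of grad_theta log softmax(theta)[k_t].
    """
    probs = softmax(theta)
    queues = [0] * len(arrival_probs)
    score = [0.0] * len(theta)
    total_cost = 0.0
    for _ in range(horizon):
        # Pick which atomic policy controls this time slot.
        k = rng.choices(range(len(theta)), weights=probs)[0]
        target = base_policies[k](queues)
        # Serve the chosen queue; service succeeds with probability service_prob.
        if queues[target] > 0 and rng.random() < service_prob:
            queues[target] -= 1
        # Bernoulli arrivals to every queue.
        for i, p in enumerate(arrival_probs):
            if rng.random() < p:
                queues[i] += 1
        total_cost += sum(queues)
        # Gradient of log softmax at index k is e_k - probs.
        for j in range(len(theta)):
            score[j] += (1.0 if j == k else 0.0) - probs[j]
    return total_cost, score


def train(num_iters=200, horizon=200, lr=0.05, seed=0):
    """REINFORCE with a running-average baseline; all settings are assumptions."""
    rng = random.Random(seed)
    base_policies = [serve_queue_0, serve_queue_1]
    theta = [2.0, 0.0]  # start biased toward the first atomic policy
    baseline = None
    for _ in range(num_iters):
        total_cost, score = run_episode(
            theta, base_policies, [0.3, 0.3], 0.9, horizon, rng
        )
        avg_cost = total_cost / horizon
        if baseline is None:
            baseline = avg_cost
        advantage = avg_cost - baseline
        # Descend on cost: lower the weight of choices tied to above-baseline cost.
        theta = [t - lr * advantage * s / horizon for t, s in zip(theta, score)]
        baseline = 0.9 * baseline + 0.1 * avg_cost
    return theta


if __name__ == "__main__":
    learned = train()
    print("mixture weights:", softmax(learned))
```

In this toy setting an even mixture gives each queue a service rate of 0.45 against an arrival rate of 0.3, so a suitable mixture is stable although neither atomic policy is. The whole-episode REINFORCE estimator used here is noisy, so the sketch illustrates the mechanism and makes no claim about convergence speed.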


research
12/02/2021

Phasic Policy Gradient Based Resource Allocation for Industrial Internet of Things

Time Slotted Channel Hopping (TSCH) behavioural mode has been introduced...
research
04/13/2021

Reinforcement learning for Admission Control in 5G Wireless Networks

The key challenge in admission control in wireless networks is to strike...
research
06/04/2018

Improving rewards in overloaded real-time systems

Competitive analysis of online algorithms has commonly been applied to u...
research
12/28/2018

MEETING BOT: Reinforcement Learning for Dialogue Based Meeting Scheduling

In this paper we present Meeting Bot, a reinforcement learning based con...
research
05/16/2021

DRAS-CQSim: A Reinforcement Learning based Framework for HPC Cluster Scheduling

For decades, system administrators have been striving to design and tune...
research
07/03/2021

TrafPy: Benchmarking Data Centre Network Systems

Benchmarking is commonly used in research fields such as computer archit...
research
03/03/2023

Queue Scheduling with Adversarial Bandit Learning

In this paper, we study scheduling of a queueing system with zero knowle...
