Queue Scheduling with Adversarial Bandit Learning

03/03/2023
by Jiatai Huang, et al.

In this paper, we study scheduling of a queueing system with zero knowledge of instantaneous network conditions. We consider a one-hop single-server queueing system consisting of K queues, each with time-varying and non-stationary arrival and service rates. Our scheduling approach builds on an innovative combination of adversarial bandit learning and Lyapunov drift minimization, and requires no knowledge of the instantaneous network state (the arrival and service rates) of each queue. We present two novel algorithms, SoftMaxWeight and Sliding-window SoftMaxWeight, both capable of stabilizing systems that can be stabilized by some (possibly unknown) sequence of randomized policies whose time-variation satisfies a mild condition. We further generalize our results to the setting where arrivals and departures only have bounded moments instead of being deterministically bounded, and propose two further algorithms that are capable of stabilizing the system in this setting. As a building block of our new algorithms, we also extend the classical multi-armed bandit algorithm of Auer et al. (2002) to handle unboundedly large feedback signals, which can be of independent interest.
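To make the idea of pairing bandit learning with Lyapunov drift concrete, the following is a minimal sketch and not the algorithms proposed in the paper. It runs the textbook EXP3 update over K queues and uses queue length times jobs served as feedback, which is the quantity a quadratic Lyapunov drift argument rewards. The class name `Exp3Scheduler`, the function `simulate`, the Bernoulli arrival and service rates, and the squashing of the feedback into [0, 1) are all assumptions made for illustration. Plain EXP3 needs bounded feedback, which is why the sketch squashes it; the abstract states that the paper instead extends the bandit algorithm to handle unbounded feedback.

```python
import math
import random


class Exp3Scheduler:
    """Textbook EXP3 over K queues; rewards must lie in [0, 1] (hypothetical helper)."""

    def __init__(self, num_queues, gamma=0.1, seed=0):
        self.k = num_queues
        self.gamma = gamma                      # exploration rate, assumed value
        self.log_weights = [0.0] * num_queues   # log-domain weights avoid overflow
        self.rng = random.Random(seed)

    def probabilities(self):
        # Softmax of the weights mixed with uniform exploration.
        m = max(self.log_weights)
        exp_w = [math.exp(lw - m) for lw in self.log_weights]
        total = sum(exp_w)
        return [(1 - self.gamma) * w / total + self.gamma / self.k for w in exp_w]

    def choose(self):
        probs = self.probabilities()
        queue = self.rng.choices(range(self.k), weights=probs)[0]
        return queue, probs

    def update(self, queue, reward, probs):
        # Importance-weighted estimate: only the served queue is observed.
        estimate = reward / probs[queue]
        self.log_weights[queue] += self.gamma * estimate / self.k


def simulate(steps=1000, seed=0):
    """Two queues with Bernoulli arrivals and services (rates are assumptions)."""
    rng = random.Random(seed)
    arrival = [0.2, 0.3]
    service = [0.9, 0.8]
    queues = [0, 0]
    sched = Exp3Scheduler(len(queues), seed=seed)
    for _ in range(steps):
        for i in range(len(queues)):
            if rng.random() < arrival[i]:
                queues[i] += 1
        chosen, probs = sched.choose()
        can_serve = queues[chosen] > 0 and rng.random() < service[chosen]
        served = 1 if can_serve else 0
        # Drift-style feedback (queue length times jobs served), squashed into [0, 1).
        reward = served * queues[chosen] / (1 + queues[chosen])
        queues[chosen] -= served
        sched.update(chosen, reward, probs)
    return queues


if __name__ == "__main__":
    print(simulate())
```

The scheduler never reads the arrival or service rates; it only sees the outcome at the queue it chose to serve, which mirrors the "zero knowledge of instantaneous network conditions" setting described in the abstract.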


