Learning to Emulate an Expert Projective Cone Scheduler

01/30/2018
by Neal Master, et al.

Projective cone scheduling defines a large class of rate-stabilizing policies for queueing models relevant to several applications. While there exists considerable theory on the properties of projective cone schedulers, there is little practical guidance on choosing the parameters that define them. In this paper, we propose an algorithm for designing an automated projective cone scheduling system based on observations of an expert projective cone scheduler. We show that the estimated scheduling policy is able to emulate the expert in the sense that the average loss realized by the learned policy will converge to zero. Specifically, for a system with n queues observed over a time horizon T, the average loss for the algorithm is O(log(T)√(log(n)/T)). This upper bound holds regardless of the statistical characteristics of the system. The algorithm uses the multiplicative weights update method and can be applied online so that additional observations of the expert scheduler can be used to improve an existing estimate of the policy. This provides a data-driven method for designing a scheduling policy based on observations of a human expert. We demonstrate the efficacy of the algorithm with a simple numerical example and discuss several extensions.
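The multiplicative weights update at the heart of the abstract can be sketched in a few lines. The sketch below is illustrative, not the paper's exact construction: it assumes a finite set of candidate parameter vectors, a linear scoring rule (serve the queue maximizing b[j]·q[j]), and a 0/1 loss against the expert's choice; all function and variable names are hypothetical.

```python
import random

def mwu_emulate(candidates, episodes, eta=0.5):
    """Multiplicative weights sketch for emulating an expert scheduler.

    candidates: list of candidate parameter vectors b (one per expert hypothesis)
    episodes:   list of (queue_lengths, expert_action) observations
    eta:        learning rate; each mismatching candidate is downweighted by (1 - eta)
    Returns normalized weights over the candidates.
    """
    w = [1.0] * len(candidates)
    for q, expert_action in episodes:
        for i, b in enumerate(candidates):
            # Candidate i would serve the queue with the largest weighted backlog.
            pred = max(range(len(q)), key=lambda j: b[j] * q[j])
            if pred != expert_action:
                w[i] *= (1.0 - eta)  # penalize candidates that disagree with the expert
        total = sum(w)
        w = [x / total for x in w]
    return w

# Usage: the expert secretly uses the first candidate's parameters.
random.seed(0)
true_b = [3.0, 1.0, 1.0]
candidates = [[3.0, 1.0, 1.0], [1.0, 3.0, 1.0], [1.0, 1.0, 3.0]]
episodes = []
for _ in range(50):
    q = [random.randint(0, 10) for _ in range(3)]
    expert_action = max(range(3), key=lambda j: true_b[j] * q[j])
    episodes.append((q, expert_action))

w = mwu_emulate(candidates, episodes)
learned = candidates[max(range(len(w)), key=lambda i: w[i])]
```

Because the loop processes one observation at a time, the same routine can run online: each new expert observation refines the weights, which matches the abstract's point that additional observations improve an existing estimate of the policy.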

