Reinforcement Learning in Time-Varying Systems: an Empirical Study

01/14/2022
by Pouya Hamadanian, et al.

Recent research has turned to Reinforcement Learning (RL) to solve challenging decision problems, as an alternative to hand-tuned heuristics. RL can learn good policies without the need to model the environment's dynamics. Despite this promise, RL remains an impractical solution for many real-world systems problems. A particularly challenging case occurs when the environment changes over time, i.e., it exhibits non-stationarity. In this work, we characterize the challenges introduced by non-stationarity and develop a framework for addressing them to train RL agents in live systems. Such agents must explore and learn new environments without hurting the system's performance, and must remember them over time. To this end, our framework (1) identifies different environments encountered by the live system, (2) explores and trains a separate expert policy for each environment, and (3) employs safeguards to protect the system's performance. We apply our framework to two systems problems, straggler mitigation and adaptive video streaming, and evaluate it against a variety of alternative approaches using real-world and synthetic data. We show that each component of our framework is necessary to cope with non-stationarity.
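The three components listed in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `MultiExpertController`, the nearest-centroid rule for telling environments apart, the scalar workload `feature`, and the `is_unsafe` and `fallback` callables are all hypothetical stand-ins chosen to keep the example short. A real system would presumably use a proper change-detection method and trained RL policies.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Policy = Callable[[float], int]  # maps a (scalar) state to a discrete action


@dataclass
class MultiExpertController:
    """Hypothetical sketch: one expert policy per detected environment."""
    new_env_threshold: float = 1.0          # assumed distance cutoff
    centroids: List[float] = field(default_factory=list)
    experts: Dict[int, Policy] = field(default_factory=dict)

    def identify(self, feature: float) -> int:
        """Step (1): assign the observed feature to a known environment,
        or register a new environment if none is close enough."""
        if self.centroids:
            nearest = min(range(len(self.centroids)),
                          key=lambda i: abs(self.centroids[i] - feature))
            if abs(self.centroids[nearest] - feature) <= self.new_env_threshold:
                return nearest
        self.centroids.append(feature)
        return len(self.centroids) - 1

    def act(self, feature: float, state: float,
            make_expert: Callable[[], Policy],
            fallback: Policy,
            is_unsafe: Callable[[float], bool]) -> int:
        env_id = self.identify(feature)
        # Step (2): keep a separate expert for each environment, so that
        # learning a new environment does not overwrite an old one.
        if env_id not in self.experts:
            self.experts[env_id] = make_expert()
        # Step (3): safeguard. Hand control to a trusted heuristic
        # whenever the (assumed) safety condition is violated.
        if is_unsafe(state):
            return fallback(state)
        return self.experts[env_id](state)
```

In this sketch, revisiting an environment reuses its stored expert rather than retraining, which is one simple way to "remember" environments over time. Training the experts is deliberately left out.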


