Reinforcement Learning in Possibly Nonstationary Environments

03/03/2022
by Mengbing Li, et al.

We consider reinforcement learning (RL) methods in offline nonstationary environments. Many existing RL algorithms in the literature rely on the stationarity assumption that requires the system transition and the reward function to be constant over time. However, the stationarity assumption is restrictive in practice and is likely to be violated in a number of applications, including traffic signal control, robotics and mobile health. In this paper, we develop a consistent procedure to test the nonstationarity of the optimal policy based on pre-collected historical data, without additional online data collection. Based on the proposed test, we further develop a sequential change point detection method that can be naturally coupled with existing state-of-the-art RL methods for policy optimisation in nonstationary environments. The usefulness of our method is illustrated by theoretical results, simulation studies, and a real data example from the 2018 Intern Health Study. A Python implementation of the proposed procedure is available at https://github.com/limengbinggz/CUSUM-RL
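The abstract does not spell out the test statistic, although the repository name suggests a CUSUM-type construction. As a rough illustration only, the sketch below applies a textbook CUSUM scan for a single mean shift to a scalar series. It is not the authors' procedure, which tests the optimal policy rather than a raw series mean; the function name `cusum_change_point` and the toy data are hypothetical.

```python
import math


def cusum_change_point(x):
    """Scan all split points of a 1-D series for a single mean shift.

    Returns (statistic, k), where k splits the series into x[:k] and x[k:]
    and maximises the weighted difference in segment means.
    Textbook CUSUM-type scan, NOT the procedure from the paper.
    """
    n = len(x)
    if n < 2:
        raise ValueError("need at least two observations")
    total = sum(x)
    best_stat, best_k = 0.0, None
    left_sum = 0.0
    for k in range(1, n):  # candidate split: x[:k] versus x[k:]
        left_sum += x[k - 1]
        left_mean = left_sum / k
        right_mean = (total - left_sum) / (n - k)
        # The weight sqrt(k(n-k)/n) balances short and long segments.
        stat = math.sqrt(k * (n - k) / n) * abs(left_mean - right_mean)
        if stat > best_stat:
            best_stat, best_k = stat, k
    return best_stat, best_k


# Toy series (hypothetical): the mean jumps from 0 to 1 at index 50.
series = [0.0] * 50 + [1.0] * 50
stat, k_hat = cusum_change_point(series)
```

In practice the statistic would be compared against a critical value (for example, from a bootstrap or permutation scheme) before declaring a change; that calibration step is omitted here.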


