Doubly Inhomogeneous Reinforcement Learning

11/08/2022
by Liyuan Hu, et al.

This paper studies reinforcement learning (RL) in doubly inhomogeneous environments, that is, environments subject to both temporal non-stationarity and subject heterogeneity. In many applications, datasets are generated by system dynamics that may change over time and across the population, which makes high-quality sequential decision making challenging. Nonetheless, most existing RL solutions require either temporal stationarity or subject homogeneity, which would result in sub-optimal policies if both assumptions were violated. To address both challenges simultaneously, we propose an original algorithm that alternates between most recent change point detection and cluster identification to determine the “best data chunks” that display similar dynamics over time and across individuals for policy learning. Our method is general and works with a wide range of clustering and change point detection algorithms. It is multiply robust in the sense that it takes multiple initial estimators as input and requires only one of them to be consistent. Moreover, by borrowing information over time and population, it can detect weaker signals and has better convergence properties than applying the clustering algorithm per time point or the change point detection algorithm per subject. Empirically, we demonstrate the usefulness of our method through extensive simulations and a real data application.
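
The alternation between change point detection and cluster identification described above can be illustrated on toy data. The following is a minimal sketch under strong assumptions and is not the authors' algorithm: it uses scalar observations instead of RL transitions, exactly two clusters, a single change point per cluster (so the only change point is also the most recent one), a least-squares split as a stand-in change point detector, and a 1-D two-means as a stand-in clustering method. All function and variable names are hypothetical.

```python
import numpy as np

def single_change_point(series, min_seg=5):
    """Stand-in detector: split index minimising the two-segment squared error."""
    T = len(series)
    best_tau, best_cost = min_seg, np.inf
    for tau in range(min_seg, T - min_seg + 1):
        left, right = series[:tau], series[tau:]
        cost = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if cost < best_cost:
            best_tau, best_cost = tau, cost
    return best_tau

def two_means_1d(x, n_iter=20):
    """Stand-in clustering: 1-D k-means with k=2, initialised at min and max."""
    centers = np.array([x.min(), x.max()], dtype=float)
    labels = np.zeros(len(x), dtype=int)
    for _ in range(n_iter):
        labels = (np.abs(x - centers[0]) > np.abs(x - centers[1])).astype(int)
        for k in (0, 1):
            if np.any(labels == k):
                centers[k] = x[labels == k].mean()
    return labels

def change_points_by_cluster(Y, labels):
    """Pool subjects within each cluster, then detect one change point per cluster."""
    return {int(k): single_change_point(Y[labels == k].mean(axis=0))
            for k in np.unique(labels)}

def alternate(Y, n_rounds=5):
    """Alternate between the change point step and the clustering step."""
    N = Y.shape[0]
    labels = np.zeros(N, dtype=int)  # start with all subjects in one cluster
    for _ in range(n_rounds):
        taus = change_points_by_cluster(Y, labels)
        # Cluster on each subject's mean after its cluster's change point.
        post_means = np.array([Y[i, taus[int(labels[i])]:].mean() for i in range(N)])
        labels = two_means_1d(post_means)
    # Recompute change points so they match the final cluster labels.
    return labels, change_points_by_cluster(Y, labels)

# Toy data (assumed): both groups share mean 0 before t = 40,
# then the first group shifts to +2 and the second group to -2.
rng = np.random.default_rng(0)
N, T, true_tau = 20, 60, 40
truth = np.repeat([0, 1], N // 2)
shift = np.where(truth == 0, 2.0, -2.0)
Y = rng.normal(0.0, 0.5, size=(N, T))
Y[:, true_tau:] += shift[:, None]

labels, taus = alternate(Y)
print("estimated labels:", labels)
print("estimated change points per cluster:", taus)
```

In this toy setup the two shifts cancel when all subjects are pooled, so the first change point step is uninformative. Once the clustering step separates the groups, the per-cluster change point step sees a clear shift, which is the intuition behind alternating the two steps rather than running either one alone.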

research
08/05/2019

Change-point detection in dynamic networks via graphon estimation

We propose a general approach for change-point detection in dynamic netw...
research
03/03/2022

Reinforcement Learning in Possibly Nonstationary Environments

We consider reinforcement learning (RL) methods in offline nonstationary...
research
11/19/2021

Population based change-point detection for the identification of homozygosity islands

In this paper, we propose a new method for offline change-point detectio...
research
09/24/2020

Bandit Change-Point Detection for Real-Time Monitoring High-Dimensional Data Under Sampling Control

In many real-world problems of real-time monitoring high-dimensional str...
research
12/07/2018

Change Point Estimation in a Dynamic Stochastic Block Model

We consider the problem of estimating the location of a single change po...
research
04/01/2021

AdaPool: A Diurnal-Adaptive Fleet Management Framework using Model-Free Deep Reinforcement Learning and Change Point Detection

This paper introduces an adaptive model-free deep reinforcement approach...
research
04/11/2022

Identifying the Dynamics of a System by Leveraging Data from Similar Systems

We study the problem of identifying the dynamics of a linear system when...
