Model Selection for Offline Reinforcement Learning: Practical Considerations for Healthcare Settings

07/23/2021
by Shengpu Tang, et al.

Reinforcement learning (RL) can be used to learn treatment policies and aid decision making in healthcare. However, given the need for generalization over complex state/action spaces, the incorporation of function approximators (e.g., deep neural networks) requires model selection to reduce overfitting and improve policy performance at deployment. Yet a standard validation pipeline for model selection requires running a learned policy in the actual environment, which is often infeasible in a healthcare setting. In this work, we investigate a model selection pipeline for offline RL that relies on off-policy evaluation (OPE) as a proxy for validation performance. We present an in-depth analysis of popular OPE methods, highlighting the additional hyperparameters and computational requirements (fitting/inference of auxiliary models) when used to rank a set of candidate policies. We compare the utility of different OPE methods as part of the model selection pipeline in the context of learning to treat patients with sepsis. Among all the OPE methods we considered, fitted Q evaluation (FQE) consistently leads to the best validation ranking, but at a high computational cost. To balance this trade-off between accuracy of ranking and computational efficiency, we propose a simple two-stage approach to accelerate model selection by avoiding potentially unnecessary computation. Our work serves as a practical guide for offline RL model selection and can help RL practitioners select policies using real-world datasets. To facilitate reproducibility and future extensions, the code accompanying this paper is available online at https://github.com/MLD3/OfflineRL_ModelSelection.
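
The abstract singles out fitted Q evaluation (FQE) as the OPE method that ranks candidate policies most reliably, at the cost of fitting an auxiliary model per candidate. As a rough illustration of the mechanics (not the paper's implementation; see the linked repository for that), the following minimal Python sketch runs a tabular FQE on synthetic logged transitions and ranks a handful of candidate policies by their estimated initial-state values. All names (fqe, estimated_value) and the toy data are illustrative assumptions, not taken from the paper.

import numpy as np

rng = np.random.default_rng(0)
S, A, GAMMA = 5, 3, 0.99  # states, actions, discount

# Synthetic logged transitions (s, a, r, s', done) from a behavior policy.
# Action 0 yields higher reward, so policies favoring it should rank higher.
N = 2000
s = rng.integers(0, S, N)
a = rng.integers(0, A, N)
r = (a == 0).astype(float) + rng.normal(0.0, 0.1, N)
s2 = rng.integers(0, S, N)
done = rng.random(N) < 0.05

def fqe(pi, n_iters=200):
    # Fit Q^pi by repeatedly regressing onto Bellman targets computed
    # under the *evaluation* policy pi (an (S, A) array of action
    # probabilities), not the behavior policy that logged the data.
    Q = np.zeros((S, A))
    for _ in range(n_iters):
        v_next = (pi[s2] * Q[s2]).sum(axis=1)   # E_{a'~pi}[Q(s', a')]
        target = r + GAMMA * (1.0 - done) * v_next
        Q_new = Q.copy()
        for si in range(S):                     # tabular "regression":
            for ai in range(A):                 # average target per (s, a) cell
                mask = (s == si) & (a == ai)
                if mask.any():
                    Q_new[si, ai] = target[mask].mean()
        Q = Q_new
    return Q

def estimated_value(pi, init_states):
    # OPE estimate of pi's value: expected Q^pi over sampled initial states.
    Q = fqe(pi)
    return (pi[init_states] * Q[init_states]).sum(axis=1).mean()

# Rank a set of candidate policies by their FQE value estimates.
candidates = [rng.dirichlet(np.ones(A), size=S) for _ in range(4)]
init_states = rng.integers(0, S, 500)
scores = [estimated_value(pi, init_states) for pi in candidates]
print("FQE scores:", np.round(scores, 3))
print("Ranking (best first):", np.argsort(scores)[::-1])

Note the cost structure this toy exposes: the fqe loop runs once per candidate policy, so FQE's expense scales with the size of the candidate set. That is what motivates the two-stage approach mentioned in the abstract, which avoids potentially unnecessary computation; the specific staging is described in the full text.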


