Deep Offline Reinforcement Learning for Real-World Treatment Optimization Applications

02/15/2023
by Milashini Nambiar, et al.

There is increasing interest in data-driven approaches for dynamically choosing optimal treatment strategies in many chronic disease management and critical care applications. Reinforcement learning (RL) methods are well suited to this sequential decision-making problem, but they must be trained and evaluated exclusively on retrospective medical record datasets, because direct online exploration is unsafe and infeasible. Despite this requirement, the vast majority of dynamic treatment optimization studies use off-policy RL methods (e.g., Double Deep Q-Networks (DDQN) and their variants) that are known to perform poorly in purely offline settings. Recent advances in offline RL, such as Conservative Q-Learning (CQL), offer a suitable alternative. However, challenges remain in adapting these approaches to real-world applications, where suboptimal examples dominate the retrospective dataset and strict safety constraints must be satisfied. In this work, we introduce a practical transition sampling approach to address action imbalance during offline RL training, and an intuitive heuristic to enforce hard constraints during policy execution. We provide theoretical analyses showing that our proposed approach would improve over CQL. We perform extensive experiments on two real-world tasks, diabetes and sepsis treatment optimization, comparing the performance of the proposed approach against prominent off-policy and offline RL baselines (DDQN and CQL). Across a range of principled and clinically relevant metrics, we show that our proposed approach enables substantial improvements in expected health outcomes and in consistency with relevant practice and safety guidelines.
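The abstract names two ideas without giving their details: sampling transitions to counter action imbalance, and enforcing hard constraints when the policy acts. The Python sketch below illustrates one plausible reading of each and is not the authors' method. It assumes that transitions are drawn with probability inversely proportional to the dataset frequency of their action, and that constraints are enforced by masking disallowed actions before taking the greedy action over Q-values. The function names `action_balanced_indices` and `constrained_greedy_action` are hypothetical.

```python
import numpy as np


def action_balanced_indices(actions, batch_size, rng):
    """Sample transition indices so that each distinct action is drawn
    about equally often (ASSUMED scheme: inverse-frequency weighting)."""
    actions = np.asarray(actions)
    values, counts = np.unique(actions, return_counts=True)
    count_of = dict(zip(values.tolist(), counts.tolist()))
    # Each transition's weight is 1 / (number of transitions sharing its action).
    weights = np.array([1.0 / count_of[a] for a in actions.tolist()])
    probs = weights / weights.sum()
    return rng.choice(len(actions), size=batch_size, replace=True, p=probs)


def constrained_greedy_action(q_values, allowed_mask):
    """Return the highest-valued action among those the constraints allow
    (ASSUMED heuristic: mask disallowed actions, then take the argmax)."""
    q_values = np.asarray(q_values, dtype=float)
    allowed_mask = np.asarray(allowed_mask, dtype=bool)
    if not allowed_mask.any():
        raise ValueError("no action satisfies the constraints")
    # Disallowed actions get -inf so they can never win the argmax.
    masked = np.where(allowed_mask, q_values, -np.inf)
    return int(np.argmax(masked))
```

For example, with a dataset in which one action accounts for 90% of transitions, `action_balanced_indices` would return batches in which the two actions appear in roughly equal proportion, and `constrained_greedy_action` would skip the highest-valued action whenever its mask entry is `False`. In a training loop, the sampled indices would select the minibatch passed to the offline RL update (e.g., a CQL step).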

