Quasi-optimal Learning with Continuous Treatments

01/21/2023
by Yuhan Li, et al.

Many real-world applications of reinforcement learning (RL) require making decisions in continuous action environments. In particular, determining the optimal dose level plays a vital role in developing medical treatment regimes. One challenge in adapting existing RL algorithms to medical applications, however, is that popular stochastic policies with infinite support, e.g., the Gaussian policy, may assign dangerously high dosages and seriously harm patients. Hence, it is important to induce a policy class whose support contains only near-optimal actions, and to shrink the action-search region for effectiveness and reliability. To achieve this, we develop a novel quasi-optimal learning algorithm, which can be easily optimized in off-policy settings with guaranteed convergence under general function approximation. Theoretically, we analyze the consistency, sample complexity, adaptability, and convergence of the proposed algorithm. We evaluate our algorithm with comprehensive simulated experiments and a real-world dose-suggestion application to the Ohio Type 1 diabetes dataset.
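
The contrast between an infinite-support policy and a policy whose support contains only near-optimal actions can be illustrated with a small sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the quadratic toy_q value function, the dose grid, the safety threshold of 8.0, the Gaussian standard deviation of 2.0, and the thresholded weighting rule in bounded_support_probs, which is a simple stand-in and not the paper's quasi-optimal policy class.

```python
import math


def gaussian_tail_prob(mean, std, threshold):
    """Return P(A > threshold) for A ~ Normal(mean, std**2)."""
    z = (threshold - mean) / (std * math.sqrt(2.0))
    return 0.5 * math.erfc(z)


def bounded_support_probs(q_values, margin):
    """Probabilities proportional to max(Q(a) - (max Q - margin), 0).

    Actions whose value is `margin` or more below the best value get
    exactly zero probability. Illustrative stand-in only, not the
    policy class proposed in the paper.
    """
    cutoff = max(q_values) - margin
    weights = [max(q - cutoff, 0.0) for q in q_values]
    total = sum(weights)  # positive, because the best action has weight `margin`
    return [w / total for w in weights]


def toy_q(dose):
    """Hypothetical action-value function, peaked at a dose of 4.0."""
    return -(dose - 4.0) ** 2


# Hypothetical dose grid from 0.0 to 20.0 in steps of 0.1.
doses = [0.1 * i for i in range(201)]
probs = bounded_support_probs([toy_q(d) for d in doses], margin=1.0)
support = [d for d, p in zip(doses, probs) if p > 0.0]

# Hypothetical safety threshold: doses above this are considered harmful.
unsafe = 8.0
gaussian_unsafe_mass = gaussian_tail_prob(mean=4.0, std=2.0, threshold=unsafe)
bounded_unsafe_mass = sum(p for d, p in zip(doses, probs) if d > unsafe)

print("support range:", min(support), "to", max(support))
print("Gaussian mass above threshold:", gaussian_unsafe_mass)
print("bounded-support mass above threshold:", bounded_unsafe_mass)
```

Under these assumptions, the Gaussian policy always leaves some probability above the threshold, however small, while the thresholded policy assigns exactly zero probability to every dose whose value is too far below the best one.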

research
07/24/2020

Clinician-in-the-Loop Decision Making: Reinforcement Learning with Near-Optimal Set-Valued Policies

Standard reinforcement learning (RL) aims to find an optimal policy that...
research
10/06/2021

Nested Policy Reinforcement Learning

Off-policy reinforcement learning (RL) has proven to be a powerful frame...
research
02/22/2021

Escaping from Zero Gradient: Revisiting Action-Constrained Reinforcement Learning via Frank-Wolfe Policy Optimization

Action-constrained reinforcement learning (RL) is a widely-used approach...
research
02/01/2023

Sample Complexity of Kernel-Based Q-Learning

Modern reinforcement learning (RL) often faces an enormous state-action ...
research
02/10/2020

Discrete Action On-Policy Learning with Action-Value Critic

Reinforcement learning (RL) in discrete action space is ubiquitous in re...
research
02/24/2022

Policy Learning for Optimal Individualized Dose Intervals

We study the problem of learning individualized dose intervals using obs...
research
09/12/2023

A Q-learning Approach for Adherence-Aware Recommendations

In many real-world scenarios involving high-stakes and safety implicatio...
