POPCORN: Partially Observed Prediction COnstrained ReiNforcement Learning

01/13/2020
by Joseph Futoma, et al.

Many medical decision-making settings can be framed as partially observed Markov decision processes (POMDPs). However, popular two-stage approaches that first learn a POMDP model and then solve it often fail because the model that best fits the data may not be the best model for planning. We introduce a new optimization objective that (a) produces both high-performing policies and high-quality generative models, even when some observations are irrelevant for planning, and (b) does so in the kinds of batch, off-policy settings common in medicine. We demonstrate our approach on synthetic examples and a real-world hypotension management task.
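For intuition, here is a minimal, self-contained sketch of the kind of trade-off such an objective encodes: a generative log-likelihood term plus a weighted off-policy estimate of the value of the policy planned under the current model, optimized jointly rather than in two stages. Everything below (the scalar Gaussian observation model, the threshold-style policy, the per-step importance-sampling estimator, and all function names) is an illustrative assumption for exposition, not the paper's actual algorithm or code.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_toy_batch(n_traj=50, horizon=5):
    """Toy batch data gathered by a uniform behavior policy.
    Each trajectory: (observations, actions, rewards, behavior action probs)."""
    batch = []
    for _ in range(n_traj):
        obs = rng.normal(loc=1.0, scale=1.0, size=horizon)      # 1-D observations
        acts = rng.integers(0, 2, size=horizon)                 # binary actions
        good = (acts == (obs > 1.0).astype(int)).astype(float)  # 1 if action "matches" obs
        rews = np.abs(obs) * good                               # reward for the right action
        beh_p = np.full(horizon, 0.5)                           # uniform behavior policy
        batch.append((obs, acts, rews, beh_p))
    return batch

def log_likelihood(mu, batch):
    """Generative fit: Gaussian observation model with unknown mean mu (unit variance)."""
    return sum(-0.5 * np.sum((obs - mu) ** 2) for obs, _, _, _ in batch)

def policy_from_model(mu):
    """'Planning' stand-in: a stochastic policy that prefers action 1 above the model mean."""
    def pi_prob(o, a):
        p1 = 1.0 / (1.0 + np.exp(-(o - mu)))
        return p1 if a == 1 else 1.0 - p1
    return pi_prob

def off_policy_value(pi_prob, batch):
    """Crude per-step importance-sampling estimate of the policy's value from batch data."""
    vals = []
    for obs, acts, rews, beh_p in batch:
        w = np.array([pi_prob(o, a) for o, a in zip(obs, acts)]) / beh_p
        vals.append(np.mean(w * rews))
    return float(np.mean(vals))

def prediction_constrained_objective(mu, batch, lam=1.0):
    """Single objective trading generative fit against planned-policy value."""
    pi = policy_from_model(mu)
    return log_likelihood(mu, batch) + lam * off_policy_value(pi, batch)

batch = make_toy_batch()
grid = np.linspace(-2.0, 4.0, 61)
best_mu = max(grid, key=lambda m: prediction_constrained_objective(m, batch, lam=25.0))
print(f"model parameter selected by the trade-off objective: {best_mu:.2f}")
```

The weight lam controls how strongly model fitting is steered toward models whose planned policies score well under the off-policy estimate; a pure two-stage approach corresponds to lam = 0, which can select a model that fits observations well but plans poorly.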


Related research

04/27/2023 · Decision Making for Autonomous Vehicles
This paper is on decision making of autonomous vehicles for handling rou...

02/05/2020 · Does the Markov Decision Process Fit the Data: Testing for the Markov Property in Sequential Decision Making
The Markov assumption (MA) is fundamental to the empirical validity of r...

09/09/2019 · Off-Policy Evaluation in Partially Observable Environments
This work studies the problem of batch off-policy evaluation for Reinfor...

12/31/2020 · Robust Asymmetric Learning in POMDPs
Policies for partially observed Markov decision processes can be efficie...

10/18/2019 · Multi-View Reinforcement Learning
This paper is concerned with multi-view reinforcement learning (MVRL), w...

03/24/2019 · Truly Batch Apprenticeship Learning with Deep Successor Features
We introduce a novel apprenticeship learning algorithm to learn an exper...

05/23/2018 · Variational Inference for Data-Efficient Model Learning in POMDPs
Partially observable Markov decision processes (POMDPs) are a powerful a...
