Conformal Prediction Intervals for Markov Decision Process Trajectories

06/10/2022
by   Thomas G. Dietterich, et al.
8

Before delegating a task to an autonomous system, a human operator may want a guarantee about the behavior of the system. This paper extends previous work on conformal prediction for functional data and conformalized quantile regression to provide conformal prediction intervals over the future behavior of an autonomous system executing a fixed control policy on a Markov Decision Process (MDP). The prediction intervals are constructed by applying conformal corrections to prediction intervals computed by quantile regression. The resulting intervals guarantee that with probability 1-δ the observed trajectory will lie inside the prediction interval, where the probability is computed with respect to the starting state distribution and the stochasticity of the MDP. The method is illustrated on MDPs for invasive species management and StarCraft2 battles.

READ FULL TEXT

page 12

page 15

page 16

page 18

page 20

research
05/06/2022

Hitting time for Markov decision process

We define the hitting time for a Markov decision process (MDP). We do no...
research
11/29/2022

Will My Robot Achieve My Goals? Predicting the Probability that an MDP Policy Reaches a User-Specified Behavior Target

As an autonomous system performs a task, it should maintain a calibrated...
research
12/01/2016

Optimizing Quantiles in Preference-based Markov Decision Processes

In the Markov decision process model, policies are usually evaluated by ...
research
08/15/2013

Hidden Parameter Markov Decision Processes: A Semiparametric Regression Approach for Discovering Latent Task Parametrizations

Control applications often feature tasks with similar, but not identical...
research
09/01/2023

Learning Risk Preferences in Markov Decision Processes: an Application to the Fourth Down Decision in Football

For decades, National Football League (NFL) coaches' observed fourth dow...
research
09/26/2019

Markov Decision Process for Video Generation

We identify two pathological cases of temporal inconsistencies in video ...
research
10/27/2021

Play to Grade: Testing Coding Games as Classifying Markov Decision Process

Contemporary coding education often presents students with the task of d...

Please sign up or login with your details

Forgot password? Click here to reset