Combining Parametric and Nonparametric Models for Off-Policy Evaluation

05/14/2019
by Omer Gottesman, et al.

We consider a model-based approach to batch off-policy evaluation in reinforcement learning. Our method takes a mixture-of-experts approach, combining parametric and nonparametric models of the environment so that the final value estimate has the least expected error. We first estimate the local accuracy of each model, then use a planner to select which model to use at every time step so as to minimize the estimated return error along entire trajectories. Across a variety of domains, our mixture-based approach outperforms the individual models alone as well as state-of-the-art importance sampling-based estimators.
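
The per-step model selection described above can be illustrated with a small sketch. It is not the authors' implementation: it replaces the planner with a greedy choice at each step, and the linear parametric model, the nearest-neighbor nonparametric model, the constant `param_err`, and the `lipschitz` scaling of the neighbor distance are all hypothetical stand-ins chosen for illustration.

```python
import numpy as np

def parametric_step(state, action, theta):
    """Hypothetical parametric model: linear dynamics s' = a*s + b*action."""
    a, b = theta
    return a * state + b * action

def nonparametric_step(state, action, data):
    """Hypothetical nonparametric model: reuse the next state of the nearest
    logged transition that took the same action.
    Returns (next_state, distance_to_neighbor)."""
    states, actions, next_states = data
    mask = actions == action
    if not mask.any():
        # No logged transition with this action: signal unbounded error.
        return state, np.inf
    dists = np.abs(states[mask] - state)
    i = np.argmin(dists)
    return next_states[mask][i], dists[i]

def mixture_rollout(s0, policy, reward, theta, data, horizon,
                    param_err=0.1, lipschitz=1.0):
    """Simulate one trajectory, greedily picking at each step the model with
    the smaller assumed local error. A greedy rule is a simplification; the
    abstract describes a planner that minimizes error over whole trajectories."""
    state, ret, choices = s0, 0.0, []
    for _ in range(horizon):
        action = policy(state)
        np_next, dist = nonparametric_step(state, action, data)
        np_err = lipschitz * dist  # assumed error proxy: scaled neighbor distance
        if np_err <= param_err:
            next_state = np_next
            choices.append("nonparametric")
        else:
            next_state = parametric_step(state, action, theta)
            choices.append("parametric")
        ret += reward(state, action)
        state = next_state
    return ret, choices

# Toy usage: two logged transitions, all with action 1.
data = (np.array([0.0, 1.0]), np.array([1, 1]), np.array([0.5, 1.5]))
ret, choices = mixture_rollout(
    s0=0.0,
    policy=lambda s: 1,
    reward=lambda s, a: s,
    theta=(1.0, 0.5),
    data=data,
    horizon=3,
)
print(ret, choices)
```

In this toy run the nonparametric model is used wherever a logged transition lies close to the simulated state, and the parametric model fills the gaps between them.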
