Policy Improvement for POMDPs Using Normalized Importance Sampling

01/10/2013
by Christian R. Shelton, et al.

We present a new method for estimating the expected return of a POMDP from experience. The method does not assume any knowledge of the POMDP and allows the experience to be gathered from an arbitrary sequence of policies. The return is estimated for any new policy of the POMDP. We motivate the estimator from function-approximation and importance-sampling points of view and derive its theoretical properties. Although the estimator is biased, it has low variance, and the bias is often irrelevant when the estimator is used for pairwise comparisons. We conclude by extending the estimator to policies with memory and comparing its performance in a greedy search algorithm to REINFORCE algorithms, showing an order-of-magnitude reduction in the number of trials required.
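
To make the idea in the abstract concrete, the following is a minimal sketch of a generic normalized (self-normalized) importance sampling estimate of a policy's return. It is not taken from the paper: the function names, the table representation of memoryless policies, and the choice to treat the behavior policies as a uniform mixture are all assumptions made for illustration. Because the unknown environment dynamics appear in both numerator and denominator of each weight, only the policies' action probabilities are needed.

```python
import math


def action_likelihood(policy, history):
    """Product of the policy's action probabilities along one trial.

    policy:  dict mapping observation -> dict mapping action -> probability
             (a memoryless policy; this representation is an assumption).
    history: list of (observation, action) pairs.
    """
    return math.prod(policy[obs][act] for obs, act in history)


def normalized_is_return(target, behaviors, trials):
    """Estimate the expected return of `target` from off-policy trials.

    behaviors: list of policies that generated the data, treated here as a
               uniform mixture (an illustrative assumption).
    trials:    list of (history, observed_return) pairs.
    """
    weights = []
    for history, _ in trials:
        # Environment terms cancel in the ratio, leaving action probabilities.
        mixture = sum(action_likelihood(b, history) for b in behaviors) / len(behaviors)
        weights.append(action_likelihood(target, history) / mixture)

    total = sum(weights)
    if total == 0.0:
        raise ValueError("target policy assigns zero probability to every trial")

    # Dividing by the sum of weights (rather than the number of trials) is the
    # normalization step: it introduces bias but typically lowers variance.
    return sum(w * ret for w, (_, ret) in zip(weights, trials)) / total


# Hypothetical data: one observation (0), two actions (0 and 1).
uniform = {0: {0: 0.5, 1: 0.5}}
greedy = {0: {0: 0.9, 1: 0.1}}
trials = [([(0, 0)], 1.0), ([(0, 1)], 0.0)]
estimate = normalized_is_return(greedy, [uniform], trials)
```

Because the estimate for every candidate policy is computed from the same set of trials, two candidates can be compared without gathering new data, which is the pairwise-comparison use the abstract refers to.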

research
11/10/2016

Importance Sampling with Unequal Support

Importance sampling is often used in machine learning when training and ...
research
09/13/2021

State Relevance for Off-Policy Evaluation

Importance sampling-based estimators for off-policy evaluation (OPE) are...
research
12/07/2022

Low Variance Off-policy Evaluation with State-based Importance Sampling

In off-policy reinforcement learning, a behaviour policy performs explor...
research
10/21/2020

Optimal Off-Policy Evaluation from Multiple Logging Policies

We study off-policy evaluation (OPE) from multiple logging policies, eac...
research
05/25/2020

Importance Sampling for Pathwise Sensitivity of Stochastic Chaotic Systems

This paper proposes a new pathwise sensitivity estimator for chaotic SDE...
research
09/27/2022

Using Importance Samping in Estimating Weak Derivative

In this paper we study simulation-based methods for estimating gradients...
research
03/02/2018

Not All Samples Are Created Equal: Deep Learning with Importance Sampling

Deep neural network training spends most of the computation on examples ...
