Related papers:

- Importance Sampling Policy Evaluation with an Estimated Behavior Policy
  In reinforcement learning, off-policy evaluation is the task of using da...
- Understanding the Curse of Horizon in Off-Policy Evaluation via Conditional Importance Sampling
  We establish a connection between the importance sampling estimators typ...
- Dimension-Wise Importance Sampling Weight Clipping for Sample-Efficient Reinforcement Learning
  In importance sampling (IS)-based reinforcement learning algorithms such...
- Importance Resampling for Off-policy Prediction
  Importance sampling (IS) is a common reweighting strategy for off-policy...
- The Importance of Pessimism in Fixed-Dataset Policy Optimization
  We study worst-case guarantees on the expected return of fixed-dataset p...
- Asymptotic optimality of adaptive importance sampling
  Adaptive importance sampling (AIS) uses past samples to update the sampl...
- Efficiency of adaptive importance sampling
  The sampling policy of stage t, formally expressed as a probability dens...
Conditional Importance Sampling for Off-Policy Learning
The principal contribution of this paper is a conceptual framework for off-policy reinforcement learning, based on conditional expectations of importance sampling ratios. This framework yields new perspectives and understanding of existing off-policy algorithms, and reveals a broad space of unexplored algorithms. We theoretically analyse this space, and concretely investigate several algorithms that arise from this framework.
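To make the idea of conditioning importance sampling ratios concrete, here is a minimal sketch on a hypothetical two-step, two-action toy problem (the policies, rewards, and variable names are illustrative assumptions, not taken from the paper). It compares full-trajectory importance sampling against per-decision importance sampling, a well-known estimator that arises as one instance of taking conditional expectations of IS ratios: each reward is weighted only by the ratios of the actions that precede it.

```python
import random

random.seed(0)

# Toy setup (illustrative, not from the paper): two independent steps,
# action in {0, 1} at each step, reward equal to the action taken.
mu = [0.5, 0.5]   # behavior policy: uniform over actions
pi = [0.2, 0.8]   # target policy we want to evaluate

N = 100_000
ordinary = 0.0       # full-trajectory IS accumulator
per_decision = 0.0   # per-decision (conditional) IS accumulator

for _ in range(N):
    # Sample both actions from the behavior policy.
    a1 = 0 if random.random() < mu[0] else 1
    a2 = 0 if random.random() < mu[0] else 1
    r1, r2 = float(a1), float(a2)        # reward = action taken

    rho1 = pi[a1] / mu[a1]               # per-step IS ratios
    rho2 = pi[a2] / mu[a2]

    # Full-trajectory IS: weight the whole return by the product of ratios.
    ordinary += rho1 * rho2 * (r1 + r2)
    # Per-decision IS: each reward is weighted only by ratios of prior actions,
    # i.e. the later ratio is replaced by its conditional expectation (= 1).
    per_decision += rho1 * r1 + rho1 * rho2 * r2

true_value = 2 * pi[1]                   # E_pi[r1 + r2] = 0.8 + 0.8 = 1.6
print(ordinary / N, per_decision / N, true_value)
```

Both estimators are unbiased and converge to the same target value; the per-decision variant typically has lower variance because it drops IS ratios that are irrelevant to each reward, which is the kind of trade-off the framework in the paper analyses systematically.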