Algorithms and Bounds for Rollout Sampling Approximate Policy Iteration

05/14/2008
by   Christos Dimitrakakis, et al.
0

Several approximate policy iteration schemes without value functions, which focus on policy representation using classifiers and address policy learning as a supervised learning problem, have been proposed recently. Finding good policies with such methods requires not only an appropriate classifier, but also reliable examples of best actions, covering the state space sufficiently. Up to this time, little work has been done on appropriate covering schemes and on methods for reducing the sample complexity of such methods, especially in continuous state spaces. This paper focuses on the simplest possible covering scheme (a discretized grid over the state space) and performs a sample-complexity comparison between the simplest (and previously commonly used) rollout sampling allocation strategy, which allocates samples equally at each state under consideration, and an almost as simple method, which allocates samples only as needed and requires significantly fewer samples.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
05/30/2019

Finite-time Analysis of Approximate Policy Iteration for the Linear Quadratic Regulator

We study the sample complexity of approximate policy iteration (PI) for ...
research
12/12/2022

Variance-Reduced Conservative Policy Iteration

We study the sample complexity of reducing reinforcement learning to a s...
research
02/03/2023

Stochastic Policy Gradient Methods: Improved Sample Complexity for Fisher-non-degenerate Policies

Recently, the impressive empirical success of policy gradient (PG) metho...
research
08/05/2022

Sample Complexity of Policy-Based Methods under Off-Policy Sampling and Linear Function Approximation

In this work, we study policy-based methods for solving the reinforcemen...
research
06/22/2023

Fitted Value Iteration Methods for Bicausal Optimal Transport

We develop a fitted value iteration (FVI) method to compute bicausal opt...
research
06/14/2019

A Distribution Dependent and Independent Complexity Analysis of Manifold Regularization

Manifold regularization is a commonly used technique in semi-supervised ...

Please sign up or login with your details

Forgot password? Click here to reset