On-Line Policy Iteration for Infinite Horizon Dynamic Programming

06/01/2021
by   Dimitri Bertsekas, et al.
32

In this paper we propose an on-line policy iteration (PI) algorithm for finite-state infinite horizon discounted dynamic programming, whereby the policy improvement operation is done on-line, only for the states that are encountered during operation of the system. This allows the continuous updating/improvement of the current policy, thus resulting in a form of on-line PI that incorporates the improved controls into the current policy as new states and controls are generated. The algorithm converges in a finite number of stages to a type of locally optimal policy, and suggests the possibility of variants of PI and multiagent PI where the policy improvement is simplified. Moreover, the algorithm can be used with on-line replanning, and is also well-suited for on-line PI algorithms with value and policy approximations.

READ FULL TEXT
POST COMMENT

Comments

There are no comments yet.

Authors

page 1

page 2

page 3

page 4

05/04/2020

Multiagent Value Iteration Algorithms in Dynamic Programming and Reinforcement Learning

We consider infinite horizon dynamic programming problems, where the con...
09/26/2019

Action Selection for MDPs: Anytime AO* vs. UCT

In the presence of non-admissible heuristics, A* and other best-first al...
09/19/2016

Incremental Sampling-based Motion Planners Using Policy Iteration Methods

Recent progress in randomized motion planners has led to the development...
11/09/2020

Multiagent Rollout and Policy Iteration for POMDP with Application to Multi-Robot Repair Problems

In this paper we consider infinite horizon discounted dynamic programmin...
05/10/2021

Value Iteration in Continuous Actions, States and Time

Classical value iteration approaches are not applicable to environments ...
01/23/2013

My Brain is Full: When More Memory Helps

We consider the problem of finding good finite-horizon policies for POMD...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.