On the convergence of optimistic policy iteration for stochastic shortest path problem

08/27/2018
by   Yuanlong Chen, et al.

In this paper, we prove convergence results for a special case of the optimistic policy iteration algorithm applied to the stochastic shortest path problem. We consider Monte Carlo and TD(λ) methods for the policy evaluation step, under the condition that the termination state is reached almost surely.
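To make the setting concrete, the sketch below illustrates (in a hedged way, not the authors' construction) how optimistic policy iteration interleaves a small number of TD(λ) evaluation episodes with greedy policy improvement on a toy stochastic shortest path problem. The specific MDP, transition probabilities, costs, and hyper-parameters (alpha, lam, episode counts) are illustrative assumptions and do not come from the paper.

```python
"""Minimal sketch of optimistic policy iteration on a toy stochastic
shortest path (SSP) problem, with TD(lambda) used for the partial
policy-evaluation step.  All model data and hyper-parameters below are
assumptions for illustration only."""
import numpy as np

rng = np.random.default_rng(0)

# Toy SSP: states 0..2 are non-terminal, state 3 is the absorbing
# termination state.  P[a][s] is the transition distribution from state s
# under action a, chosen so termination is reached almost surely under
# any policy (every row puts positive mass on state 3).
n_states, n_actions, term = 4, 2, 3
P = np.array([
    [[0.5, 0.3, 0.0, 0.2],   # action 0
     [0.0, 0.5, 0.3, 0.2],
     [0.2, 0.0, 0.5, 0.3],
     [0.0, 0.0, 0.0, 1.0]],
    [[0.1, 0.6, 0.0, 0.3],   # action 1
     [0.0, 0.1, 0.6, 0.3],
     [0.3, 0.0, 0.1, 0.6],
     [0.0, 0.0, 0.0, 1.0]],
])
cost = np.array([[1.0, 2.0], [1.0, 0.5], [2.0, 1.0], [0.0, 0.0]])  # c(s, a)

def greedy_policy(V):
    """Policy improvement: act greedily w.r.t. the current cost-to-go V."""
    Q = cost + np.einsum('asj,j->sa', P, V)
    return Q.argmin(axis=1)

def td_lambda_episode(V, policy, alpha=0.1, lam=0.7):
    """One episode of undiscounted TD(lambda) evaluation of `policy`."""
    e = np.zeros(n_states)            # eligibility traces
    s = rng.integers(0, term)         # random non-terminal start state
    while s != term:
        a = policy[s]
        s_next = rng.choice(n_states, p=P[a, s])
        delta = cost[s, a] + V[s_next] - V[s]   # V[term] stays 0
        e[s] += 1.0
        V += alpha * delta * e
        e *= lam                      # no discounting (gamma = 1) in SSP
        s = s_next
    return V

# Optimistic policy iteration: only a few evaluation episodes per
# improvement step, so the policy is updated from an inexact V.
V = np.zeros(n_states)
for _ in range(200):
    policy = greedy_policy(V)
    for _ in range(5):                # partial (optimistic) evaluation
        V = td_lambda_episode(V, policy)

print("greedy policy:", greedy_policy(V))
print("cost-to-go estimate:", np.round(V, 3))
```

A Monte Carlo variant would replace the TD(λ) episode with a full episodic return used as the target for each visited state; in both cases the evaluation is truncated, which is what makes the scheme "optimistic" rather than exact policy iteration.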


