On the convergence of cycle detection for navigational reinforcement learning

11/27/2015
by Tom J. Ameloot et al.

We consider a reinforcement learning framework in which agents must navigate from start states to goal states. We prove convergence of a cycle-detection learning algorithm on a class of tasks that we call reducible; reducible tasks are those that admit an acyclic solution. We also syntactically characterize the form of the final policy, and this characterization can be used to detect the convergence point precisely in a simulation. Our result demonstrates that even simple algorithms can successfully learn a large class of nontrivial tasks. In addition, our framework is elementary in the sense that we use only basic concepts to formally prove convergence.
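The abstract leaves the algorithm itself unspecified, but its ingredients (episodic navigation from a start state to a goal state, detection of cycles, an acyclic final policy) suggest its overall shape. The Python sketch below is one plausible reading, not the paper's construction: a task is a deterministic successor graph, actions that close a cycle within an episode are penalized, and actions along episodes that reach the goal are reinforced. The names cycle_penalty_learn and is_acyclic_solution, the unit-sized updates, and the assumption that every non-goal state has at least one successor are all hypothetical choices for illustration.

```python
import random
from collections import defaultdict

def cycle_penalty_learn(graph, start, goal, episodes=500, seed=0):
    """Learn a navigation policy by penalizing cycle-closing actions.

    graph maps each state to the set of states reachable in one step;
    for simplicity, every non-goal state is assumed to have at least
    one successor. Hypothetical sketch, not the paper's update rule.
    """
    rng = random.Random(seed)
    value = defaultdict(float)  # preference score for each (state, action)

    for _ in range(episodes):
        s, trail, steps = start, {start}, []
        while s != goal:
            # Greedy action choice, with random tie-breaking for exploration.
            a = max(graph[s], key=lambda x: (value[(s, x)], rng.random()))
            if a in trail:              # cycle detected within this episode
                value[(s, a)] -= 1.0    # demote the action that closed it
                break                   # abandon the episode
            steps.append((s, a))
            trail.add(a)                # trail grows each step, so the loop terminates
            s = a
        if s == goal:
            for u, a in steps:
                value[(u, a)] += 1.0    # reinforce the acyclic route found

    # Final greedy policy over the learned preferences.
    return {u: max(acts, key=lambda x: value[(u, x)])
            for u, acts in graph.items() if acts}

def is_acyclic_solution(policy, start, goal):
    """Syntactic convergence check: following the policy from the start
    reaches the goal without revisiting any state."""
    s, seen = start, set()
    while s != goal:
        if s in seen or s not in policy:
            return False
        seen.add(s)
        s = policy[s]
    return True
```

On a toy reducible task such as graph = {"s": {"a", "b"}, "a": {"s", "g"}, "b": {"a"}, "g": set()} with start "s" and goal "g", the learner should come to avoid the cycle through "s" and return a policy for which is_acyclic_solution holds; checking such a predicate after each episode is in the spirit of the syntactic convergence test the abstract alludes to.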
