TD Convergence: An Optimization Perspective

06/30/2023
by Kavosh Asadi, et al.
We study the convergence behavior of the celebrated temporal-difference (TD) learning algorithm. By looking at the algorithm through the lens of optimization, we first argue that TD can be viewed as an iterative optimization algorithm in which the function to be minimized changes at each iteration. By carefully investigating the divergence displayed by TD on a classical counterexample, we identify two forces that determine the convergent or divergent behavior of the algorithm. We next formalize our discovery in the linear TD setting with quadratic loss and prove that convergence of TD hinges on the interplay between these two forces. We extend this optimization perspective to prove convergence of TD in a much broader setting than just linear approximation and squared loss. Our results provide a theoretical explanation for the successful application of TD in reinforcement learning.
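As a rough illustration of the optimization view described in the abstract, the sketch below writes expected linear TD(0) as a gradient step on an objective H(theta, w) whose target is rebuilt from the current iterate theta at every iteration. Everything in it is an assumption made for illustration rather than taken from the paper: the two-state chain (in the style of the well-known "theta to 2 theta" construction), the function names `td_objective_grad` and `run_expected_td`, the state weightings, the discount factor, and the step size.

```python
import numpy as np


def td_objective_grad(theta, w, Phi, P, r, d, gamma):
    """Gradient in w of H(theta, w) = 0.5 * ||r + gamma * P Phi theta - Phi w||_D^2.

    The bootstrapped target depends on theta and is held fixed while
    differentiating, so the objective itself changes whenever theta changes.
    """
    target = r + gamma * P @ Phi @ theta
    return -Phi.T @ (d * (target - Phi @ w))


def run_expected_td(d, gamma=0.9, alpha=0.1, steps=100):
    """Run expected linear TD(0) on a hypothetical two-state chain.

    State 0 moves to state 1, state 1 loops on itself, and all rewards are
    zero, so the true value function is zero everywhere.
    """
    Phi = np.array([[1.0], [2.0]])          # one feature per state
    P = np.array([[0.0, 1.0], [0.0, 1.0]])  # transition matrix
    r = np.zeros(2)                         # rewards
    theta = np.array([1.0])                 # initial weight
    for _ in range(steps):
        # One gradient step on H(theta, .) evaluated at w = theta.
        grad = td_objective_grad(theta, theta, Phi, P, r, d, gamma)
        theta = theta - alpha * grad
    return float(theta[0])


if __name__ == "__main__":
    # Weighting only state 0 (an off-policy style weighting, assumed here).
    print(run_expected_td(np.array([1.0, 0.0])))
    # Weighting only the absorbing state 1 (the chain's stationary distribution).
    print(run_expected_td(np.array([0.0, 1.0])))
```

With these assumed settings, the iterate grows without bound under the first weighting and shrinks toward the true value of zero under the second, even though both runs apply the same update rule. The sketch does not attempt to reproduce the paper's analysis of the two forces; it only shows how the per-iteration objective can be made explicit in code.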

research
02/11/2022

Regularized Q-learning

Q-learning is a widely used algorithm in the reinforcement learning community....
research
08/18/2023

Baird Counterexample Is Solved: with an example of How to Debug a Two-time-scale Algorithm

Baird counterexample was proposed by Leemon Baird in 1995, first used to...
research
06/15/2017

Reinforcement Learning under Model Mismatch

We study reinforcement learning under model misspecification, where we d...
research
06/06/2016

Learning to Optimize

Algorithm design is a laborious process and often requires many iteratio...
research
04/25/2019

Zap Q-Learning for Optimal Stopping Time Problems

We propose a novel reinforcement learning algorithm that approximates so...
research
01/11/2023

An Analysis of Quantile Temporal-Difference Learning

We analyse quantile temporal-difference learning (QTD), a distributional...
research
11/27/2015

On the convergence of cycle detection for navigational reinforcement learning

We consider a reinforcement learning framework where agents have to navi...
