Control Theoretic Analysis of Temporal Difference Learning

12/29/2021
by   Donghwan Lee, et al.

This paper presents a control theoretic analysis of linear stochastic iterative algorithms and temporal difference (TD) learning. TD-learning, one of the most popular and fundamental reinforcement learning algorithms, is a linear stochastic iterative algorithm that estimates the value function of a given policy for a Markov decision process. Although a series of works has successfully analyzed TD-learning theoretically, guarantees on its statistical efficiency were established only recently. In this paper, we propose a control theoretic finite-time analysis of TD-learning that exploits standard notions from the linear system control community. The proposed work thereby provides additional insights into TD-learning and reinforcement learning using simple concepts and analysis tools from control theory.
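
To make the phrase "linear stochastic iterative algorithm" concrete, the following is a minimal sketch of tabular TD(0), not the paper's own formulation or analysis. The 3-state chain, the rewards, the discount factor, the constant step size, and the i.i.d. uniform state sampling are all assumptions chosen for illustration. Each update is affine in the current estimate V, so the averaged recursion behaves like a discrete-time linear system driven by noise, which is the kind of object linear control tools are designed to handle.

```python
import random

# Hypothetical 3-state Markov reward process induced by a fixed policy.
# These numbers are illustrative assumptions, not taken from the paper.
P = [[0.5, 0.5, 0.0],
     [0.25, 0.5, 0.25],
     [0.0, 0.5, 0.5]]      # P[s][t] = probability of moving from s to t
R = [0.0, 1.0, 2.0]        # reward received in state s
GAMMA = 0.9                # discount factor


def true_values(P, R, gamma, sweeps=2000):
    """Solve the Bellman equation V = R + gamma * P V by fixed-point iteration."""
    n = len(R)
    V = [0.0] * n
    for _ in range(sweeps):
        V = [R[s] + gamma * sum(P[s][t] * V[t] for t in range(n))
             for s in range(n)]
    return V


def td0(P, R, gamma, alpha=0.005, steps=300_000, seed=0):
    """Tabular TD(0) with a constant step size and i.i.d. state sampling."""
    rng = random.Random(seed)
    n = len(R)
    V = [0.0] * n
    for _ in range(steps):
        s = rng.randrange(n)                             # sample a state uniformly
        s_next = rng.choices(range(n), weights=P[s])[0]  # sample the next state
        td_error = R[s] + gamma * V[s_next] - V[s]       # affine in V
        V[s] += alpha * td_error                         # stochastic linear update
    return V


if __name__ == "__main__":
    print("fixed point:", true_values(P, R, GAMMA))
    print("TD(0) est. :", td0(P, R, GAMMA))
```

With a constant step size the iterate does not converge exactly; it fluctuates in a neighborhood of the Bellman fixed point whose size shrinks with the step size, which is the regime a finite-time bound describes.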

research
04/22/2022

Analysis of Temporal Difference Learning: Linear System Approach

The goal of this technical note is to introduce a new finite-time conver...
research
06/06/2018

A Finite Time Analysis of Temporal Difference Learning With Linear Function Approximation

Temporal difference learning (TD) is a simple iterative algorithm used t...
research
07/10/2023

Dynamics of Temporal Difference Reinforcement Learning

Reinforcement learning has been successful across several applications i...
research
05/05/2021

H-TD2: Hybrid Temporal Difference Learning for Adaptive Urban Taxi Dispatch

We present H-TD2: Hybrid Temporal Difference Learning for Taxi Dispatch,...
research
03/08/2022

A Sharp Characterization of Linear Estimators for Offline Policy Evaluation

Offline policy evaluation is a fundamental statistical problem in reinfo...
research
01/11/2023

An Analysis of Quantile Temporal-Difference Learning

We analyse quantile temporal-difference learning (QTD), a distributional...
research
07/25/2022

Finite-Time Analysis of Asynchronous Q-learning under Diminishing Step-Size from Control-Theoretic View

Q-learning has long been one of the most popular reinforcement learning ...
