On the Chattering of SARSA with Linear Function Approximation

02/14/2022
by Shangtong Zhang, et al.

SARSA, a classical on-policy control algorithm for reinforcement learning, is known to chatter when combined with linear function approximation: SARSA does not diverge but oscillates in a bounded region. However, little is known about how fast SARSA converges to that region and how large the region is. In this paper, we make progress towards solving this open problem by showing the convergence rate of projected SARSA to a bounded region. Importantly, the region is much smaller than the ball used for projection, provided that the magnitude of the reward is not too large. Our analysis applies to expected SARSA as well as SARSA(λ). Existing works regarding the convergence of linear SARSA to a fixed point all require the Lipschitz constant of SARSA's policy improvement operator to be sufficiently small; our analysis instead applies to arbitrary Lipschitz constants and thus characterizes the behavior of linear SARSA in a new regime.
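To make the setting concrete, here is a minimal sketch of projected SARSA with linear function approximation on a small tabular MDP. It is an illustration under assumptions, not the paper's exact algorithm: the softmax policy, the constant step size `alpha`, the array layouts of `P`, `R`, and `features`, and all function names are hypothetical choices made for this example. The `temperature` parameter is one way to vary the Lipschitz constant of the policy improvement step (a lower temperature gives a less smooth policy).

```python
import numpy as np

def project_to_ball(w, radius):
    """Project w onto the Euclidean ball of the given radius."""
    norm = np.linalg.norm(w)
    return w if norm <= radius else w * (radius / norm)

def softmax_policy(w, features, state, temperature=1.0):
    """Action probabilities from a softmax over linear action values."""
    q = features[state] @ w            # shape: (n_actions,)
    z = (q - q.max()) / temperature    # subtract max for numerical stability
    p = np.exp(z)
    return p / p.sum()

def projected_sarsa(P, R, features, radius, alpha=0.05, gamma=0.9,
                    temperature=1.0, steps=5000, seed=0):
    """Run projected linear SARSA and return the final weight vector.

    Assumed layouts (hypothetical, for this sketch only):
      P[s, a]        -> next-state distribution, shape (n_states,)
      R[s, a]        -> scalar reward
      features[s, a] -> feature vector, shape (dim,)
    """
    rng = np.random.default_rng(seed)
    n_states, n_actions, dim = features.shape
    w = np.zeros(dim)
    s = 0
    a = rng.choice(n_actions, p=softmax_policy(w, features, s, temperature))
    for _ in range(steps):
        s_next = rng.choice(n_states, p=P[s, a])
        a_next = rng.choice(
            n_actions, p=softmax_policy(w, features, s_next, temperature))
        # On-policy TD error: bootstraps from the action actually taken next.
        td_error = (R[s, a] + gamma * features[s_next, a_next] @ w
                    - features[s, a] @ w)
        # Semi-gradient step followed by projection onto the ball.
        w = project_to_ball(w + alpha * td_error * features[s, a], radius)
        s, a = s_next, a_next
    return w
```

Logging `np.linalg.norm(w)` over the course of a run and comparing it with `radius` gives an informal way to observe whether the iterates settle in a region smaller than the projection ball when rewards are small.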

