Double Q(σ) and Q(σ, λ): Unifying Reinforcement Learning Control Algorithms

11/05/2017

by Markus Dumke, et al.

Temporal-difference (TD) learning is an important field in reinforcement learning. Sarsa and Q-Learning are among the most widely used TD algorithms. The Q(σ) algorithm (Sutton and Barto, 2017) unifies both. This paper extends Q(σ) to an online multi-step algorithm, Q(σ, λ), using eligibility traces, and introduces Double Q(σ) as the extension of Q(σ) to double learning. Experiments suggest that the new Q(σ, λ) algorithm can outperform the classical TD control methods Sarsa(λ), Q(λ) and Q(σ).
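
To make the unification concrete, the following is a minimal sketch of a one-step tabular Q(σ) update. It is not taken from the paper. It assumes the common formulation in which σ interpolates between a sampled next-action value (σ = 1, Sarsa) and an expectation under a target policy (σ = 0, Expected Sarsa, which matches the Q-Learning target when the target policy is greedy). The function names, the tabular NumPy representation and the default step size and discount are all illustrative assumptions.

```python
import numpy as np


def greedy_probs(q_values):
    """Return a one-hot (greedy) target-policy distribution over actions."""
    probs = np.zeros(len(q_values), dtype=float)
    probs[int(np.argmax(q_values))] = 1.0
    return probs


def q_sigma_target(q_next, next_action, target_probs, reward, gamma, sigma):
    """One-step Q(sigma) target (assumed formulation, see lead-in).

    sigma = 1 uses only the sampled next action (Sarsa-style target).
    sigma = 0 uses only the expectation under target_probs
    (Expected Sarsa; equals the Q-Learning target if target_probs is greedy).
    """
    sampled = q_next[next_action]
    expected = float(np.dot(target_probs, q_next))
    return reward + gamma * (sigma * sampled + (1.0 - sigma) * expected)


def q_sigma_update(Q, s, a, r, s_next, a_next, target_probs,
                   alpha=0.1, gamma=0.9, sigma=0.5):
    """Apply one tabular Q(sigma) update to Q in place and return it."""
    target = q_sigma_target(Q[s_next], a_next, target_probs, r, gamma, sigma)
    Q[s, a] += alpha * (target - Q[s, a])
    return Q


# Illustrative usage on a hypothetical 2-state, 2-action table.
Q = np.zeros((2, 2))
Q[1] = [1.0, 3.0]
Q = q_sigma_update(Q, s=0, a=0, r=0.0, s_next=1, a_next=0,
                   target_probs=greedy_probs(Q[1]), sigma=0.5)
```

Intermediate values of σ blend the two targets, which is the sense in which a single parameter spans both algorithms. The multi-step Q(σ, λ) and Double Q(σ) variants described in the abstract are not shown here.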
