Investigating practical linear temporal difference learning

02/28/2016
by Adam White, et al.

Off-policy reinforcement learning has many applications, including learning from demonstration, learning multiple goal-seeking policies in parallel, and representing predictive knowledge. Recently there has been a proliferation of new policy-evaluation algorithms that fill a longstanding algorithmic void in reinforcement learning: combining robustness to off-policy sampling, function approximation, linear complexity, and temporal difference (TD) updates. This paper contains two main contributions. First, we derive two new hybrid TD policy-evaluation algorithms, which fill a gap in this collection of algorithms. Second, we perform an empirical comparison to elicit which of these new linear TD methods should be preferred in different situations, and make concrete suggestions about practical use.
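
To make the four properties named in the abstract concrete, the sketch below shows one update of TDC (linear TD with gradient correction), an existing gradient-TD method, in the form given in standard textbook presentations. It is not one of the paper's two new hybrid algorithms, which the abstract does not specify. The function and parameter names (`tdc_update`, `alpha`, `beta`, `rho`) and the default step sizes are assumptions chosen for illustration.

```python
def dot(a, b):
    """Inner product of two equal-length feature/weight vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def tdc_update(w, v, x, reward, x_next, rho,
               alpha=0.05, beta=0.05, gamma=0.9):
    """One TDC step with linear function approximation (illustrative sketch).

    w      : primary weights; the value estimate is dot(w, x)
    v      : secondary weights used for the gradient correction
    x      : feature vector of the current state
    x_next : feature vector of the next state
    rho    : importance sampling ratio pi(a|s) / mu(a|s) (off-policy correction)
    Cost is O(n) in the number of features: only dot products and
    element-wise updates, no matrices.
    """
    # TD error under the current value estimate.
    delta = reward + gamma * dot(w, x_next) - dot(w, x)
    vx = dot(v, x)
    # Primary update: TD step plus a correction term along x_next.
    new_w = [wi + alpha * rho * (delta * xi - gamma * xni * vx)
             for wi, xi, xni in zip(w, x, x_next)]
    # Secondary update: v tracks the expected TD error given the features.
    new_v = [vi + beta * rho * (delta - vx) * xi
             for vi, xi in zip(v, x)]
    return new_w, new_v


# Usage: a single on-policy-like transition (rho = 1) with one-hot features.
w, v = tdc_update([0.0, 0.0], [0.0, 0.0],
                  x=[1.0, 0.0], reward=1.0, x_next=[0.0, 1.0], rho=1.0)
```

When `rho` is 0 (the target policy would never take the sampled action), both weight vectors are left unchanged, which is how the importance sampling ratio discounts transitions that are irrelevant to the target policy.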


research
06/21/2021

Emphatic Algorithms for Deep Reinforcement Learning

Off-policy learning allows us to learn about possible policies of behavi...
research
02/20/2023

Backstepping Temporal Difference Learning

Off-policy learning ability is an important feature of reinforcement lea...
research
11/06/2018

Online Off-policy Prediction

This paper investigates the problem of online prediction learning, where...
research
02/24/2023

Why Target Networks Stabilise Temporal Difference Methods

Integral to recent successes in deep reinforcement learning has been a c...
research
05/21/2020

Novel Policy Seeking with Constrained Optimization

In this work, we address the problem of learning to seek novel policies ...
research
12/13/2015

True Online Temporal-Difference Learning

The temporal-difference methods TD(λ) and Sarsa(λ) form a core part of m...
research
10/24/2020

An Adiabatic Theorem for Policy Tracking with TD-learning

We evaluate the ability of temporal difference learning to track the rew...
