Convergent Tree-Backup and Retrace with Function Approximation

05/25/2017
by Ahmed Touati, et al.

Off-policy learning is key to scaling up reinforcement learning, as it allows learning about a target policy from experience generated by a different behavior policy. Unfortunately, it has been challenging to combine off-policy learning with function approximation and multi-step bootstrapping in a way that leads to both stable and efficient algorithms. In this paper, we show that the Tree Backup and Retrace algorithms are unstable with linear function approximation, both in theory and with specific examples. Based on our analysis, we then derive stable and efficient gradient-based algorithms, compatible with accumulating or Dutch traces, using a novel methodology based on saddle-point methods. In addition to convergence guarantees, we provide a finite-sample analysis.
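To make the setting concrete, here is a minimal sketch of the standard semi-gradient form of Tree Backup and Retrace with linear action-value features and accumulating traces. This is the naive variant that the abstract says can be unstable, not the gradient-based algorithms the paper derives. The function names, argument layout, and toy numbers are assumptions made for illustration only.

```python
import numpy as np

def trace_coefficient(pi_prob, mu_prob, lam, method):
    """Trace-cutting coefficient for the action actually taken.

    pi_prob: target-policy probability of the action
    mu_prob: behavior-policy probability of the action
    """
    if method == "tree_backup":
        return lam * pi_prob                      # Tree Backup: lambda * pi(a|s)
    if method == "retrace":
        return lam * min(1.0, pi_prob / mu_prob)  # Retrace: truncated importance ratio
    raise ValueError(f"unknown method: {method}")

def td_step(theta, trace, phi, reward, next_phis, next_pi, kappa, gamma, alpha):
    """One semi-gradient update of theta, where Q(s, a) = phi(s, a) @ theta.

    next_phis: array of shape (n_actions, d), features of every action in the next state
    next_pi:   array of shape (n_actions,), target-policy probabilities in the next state
    kappa:     trace coefficient of the current state-action pair
    """
    # Accumulating trace, decayed by gamma * kappa.
    trace = gamma * kappa * trace + phi
    # Expected next-state value under the target policy.
    expected_next_q = next_pi @ (next_phis @ theta)
    delta = reward + gamma * expected_next_q - phi @ theta
    return theta + alpha * delta * trace, trace

# Toy usage with made-up numbers (hypothetical, not from the paper).
theta, trace = np.zeros(2), np.zeros(2)
kappa = trace_coefficient(pi_prob=0.5, mu_prob=0.25, lam=0.9, method="retrace")
theta, trace = td_step(
    theta, trace,
    phi=np.array([1.0, 0.0]), reward=1.0,
    next_phis=np.array([[0.0, 1.0], [1.0, 0.0]]),
    next_pi=np.array([0.5, 0.5]),
    kappa=kappa, gamma=0.9, alpha=0.1,
)
```

Both methods share the same update and differ only in how strongly the trace is cut at each step, which is why a single `kappa` argument is enough in this sketch.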

research
08/05/2022

Sample Complexity of Policy-Based Methods under Off-Policy Sampling and Linear Function Approximation

In this work, we study policy-based methods for solving the reinforcemen...
research
05/10/2021

Parameter-free Gradient Temporal Difference Learning

Reinforcement learning lies at the intersection of several challenges. M...
research
06/06/2020

Stable and Efficient Policy Evaluation

Policy evaluation algorithms are essential to reinforcement learning due...
research
04/15/2013

Off-policy Learning with Eligibility Traces: A Survey

In the framework of Markov Decision Processes, off-policy learning, that...
research
01/21/2021

Breaking the Deadly Triad with a Target Network

The deadly triad refers to the instability of a reinforcement learning a...
research
06/25/2019

Expected Sarsa(λ) with Control Variate for Variance Reduction

Off-policy learning is powerful for reinforcement learning. However, the...
research
05/25/2018

Finite Sample Analysis of LSTD with Random Projections and Eligibility Traces

Policy evaluation with linear function approximation is an important pro...
