Analysis of a Target-Based Actor-Critic Algorithm with Linear Function Approximation

06/14/2021
by Anas Barakat, et al.

Actor-critic methods that integrate target networks have achieved remarkable empirical success in deep reinforcement learning. However, a theoretical understanding of the use of target networks in actor-critic methods is largely missing from the literature. In this paper, we bridge this gap between theory and practice by proposing the first theoretical analysis of an online target-based actor-critic algorithm with linear function approximation in the discounted reward setting. Our algorithm uses three different timescales: one for the actor and two for the critic. Instead of using the standard single-timescale temporal difference (TD) learning algorithm as a critic, we use a two-timescale target-based version of TD learning, closely inspired by practical actor-critic algorithms that implement target networks. First, we establish asymptotic convergence results for both the critic and the actor under Markovian sampling. Then, we provide a finite-time analysis showing the impact of incorporating a target network into actor-critic methods.
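To make the three-timescale structure concrete, here is a minimal sketch of a target-based actor-critic loop with linear function approximation. It is not the paper's exact algorithm. The toy two-state MDP, the one-hot features, the softmax policy, the step-size exponents, and every name (`run`, `omega`, `omega_bar`, `theta`, and so on) are assumptions chosen for illustration. It also assumes one common ordering of the timescales: the critic weights move fastest, the target weights track them more slowly, and the actor moves slowest.

```python
import math
import random

GAMMA = 0.9                    # discount factor (assumed value)
N_STATES, N_ACTIONS = 2, 2     # hypothetical toy MDP


def features(state):
    # One-hot state features: an illustrative choice of linear features.
    return [1.0 if i == state else 0.0 for i in range(N_STATES)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def policy(theta, state):
    # Softmax policy with one preference per (state, action) pair.
    prefs = theta[state]
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    z = sum(exps)
    return [e / z for e in exps]


def step(state, action, rng):
    # Toy dynamics: action 1 usually leads to state 1, which pays reward 1.
    p_go = 0.9 if action == 1 else 0.1
    next_state = 1 if rng.random() < p_go else 0
    reward = 1.0 if next_state == 1 else 0.0
    return next_state, reward


def run(n_steps=5000, seed=0):
    rng = random.Random(seed)
    theta = [[0.0] * N_ACTIONS for _ in range(N_STATES)]  # actor parameters
    omega = [0.0] * N_STATES       # critic weights (fastest timescale)
    omega_bar = [0.0] * N_STATES   # target weights (intermediate timescale)
    state = 0
    for t in range(1, n_steps + 1):
        # Assumed step-size schedules: beta decays slowest, alpha fastest.
        beta = 1.0 / t ** 0.5    # critic
        xi = 1.0 / t ** 0.7      # target
        alpha = 1.0 / t ** 0.9   # actor

        probs = policy(theta, state)
        action = rng.choices(range(N_ACTIONS), weights=probs)[0]
        next_state, reward = step(state, action, rng)
        phi, phi_next = features(state), features(next_state)

        # TD error bootstrapped from the target weights, not the critic's own.
        delta = reward + GAMMA * dot(phi_next, omega_bar) - dot(phi, omega)

        # Critic update: TD step toward the target-based estimate.
        omega = [w + beta * delta * f for w, f in zip(omega, phi)]
        # Target update: slow averaging toward the critic weights.
        omega_bar = [wb + xi * (w - wb) for wb, w in zip(omega_bar, omega)]
        # Actor update: policy-gradient step using the TD error.
        for a in range(N_ACTIONS):
            grad_log = (1.0 if a == action else 0.0) - probs[a]
            theta[state][a] += alpha * delta * grad_log

        state = next_state
    return theta, omega, omega_bar
```

Calling `run()` returns the actor parameters together with the critic and target weights. A single trajectory is used throughout, so the samples are Markovian rather than i.i.d., in line with the sampling setting the abstract describes.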


