Finite-time analysis of single-timescale actor-critic

10/18/2022
by Xuyang Chen, et al.

Despite the great empirical success of actor-critic methods, their finite-time convergence is still poorly understood in their most practical form. In particular, the analysis of single-timescale actor-critic presents significant challenges due to the highly inaccurate critic estimation and the complex error propagation dynamics over iterations. Existing analyses of single-timescale actor-critic focus only on i.i.d. sampling or the tabular setting for simplicity, neither of which is common in practical applications. We consider the more practical online single-timescale actor-critic algorithm on a continuous state space, where the critic is updated with a single Markovian sample per actor step. We prove that the online single-timescale actor-critic method is guaranteed to find an ϵ-approximate stationary point with 𝒪̃(ϵ^-2) sample complexity (that is, 𝒪(ϵ^-2) up to logarithmic factors) under standard assumptions, which can be further improved to 𝒪(ϵ^-2) under i.i.d. sampling. Our analysis develops a novel framework that evaluates and controls the error propagation between actor and critic in a systematic way. To our knowledge, this is the first finite-time analysis for the online single-timescale actor-critic method. Overall, our results compare favorably to the existing literature on analyzing actor-critic, both in considering the most practical settings and in requiring weaker assumptions.
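To make the setting concrete, here is a minimal sketch of an online single-timescale actor-critic loop: one Markovian transition is drawn per iteration from a single trajectory, and that one sample drives both a critic step and an actor step whose step sizes differ only by a constant factor. It is not the paper's algorithm as stated. The toy environment, the polynomial features, the discounted TD(0) critic, the softmax policy, and all names and constants (`env_step`, `features`, `alpha`, `beta`, `gamma`) are assumptions chosen for illustration.

```python
import math
import random


def features(s):
    # Hypothetical polynomial features for a 1-D continuous state.
    return [1.0, s, s * s]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def policy_probs(theta, phi):
    # Softmax policy over two actions with linear preferences theta[a] . phi.
    prefs = [dot(row, phi) for row in theta]
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    z = sum(exps)
    return [e / z for e in exps]


def env_step(s, a, rng):
    # Toy dynamics (an assumption): action 0 moves left, action 1 moves right,
    # with small Gaussian noise; the reward penalizes distance from the origin.
    move = -0.1 if a == 0 else 0.1
    s_next = max(-1.0, min(1.0, s + move + rng.gauss(0.0, 0.02)))
    return s_next, -s_next * s_next


def single_timescale_ac(num_steps=20000, alpha=0.01, beta=0.02, gamma=0.9, seed=0):
    rng = random.Random(seed)
    theta = [[0.0] * 3 for _ in range(2)]  # actor parameters, one row per action
    w = [0.0] * 3                          # linear critic parameters
    s = rng.uniform(-1.0, 1.0)             # one trajectory, never reset
    for _ in range(num_steps):
        phi = features(s)
        probs = policy_probs(theta, phi)
        a = 0 if rng.random() < probs[0] else 1
        s_next, r = env_step(s, a, rng)

        # TD error computed from the single Markovian sample (s, a, r, s_next).
        delta = r + gamma * dot(w, features(s_next)) - dot(w, phi)

        # Critic: one TD(0) step with step size beta.
        w = [wi + beta * delta * p for wi, p in zip(w, phi)]

        # Actor: one policy-gradient step with step size alpha, using
        # grad log pi(a|s) w.r.t. theta[b] = (1[a == b] - probs[b]) * phi.
        for b in range(2):
            coeff = (1.0 if b == a else 0.0) - probs[b]
            theta[b] = [t + alpha * delta * coeff * p
                        for t, p in zip(theta[b], phi)]

        s = s_next
    return theta, w


if __name__ == "__main__":
    theta, w = single_timescale_ac()
    print("actor parameters:", theta)
    print("critic parameters:", w)
```

Because `alpha` and `beta` stay in a fixed ratio, neither update runs on a faster timescale than the other, so the actor always acts on a critic that has not converged. This is the coupling between critic error and actor error that the abstract describes as the main difficulty of the analysis.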
