On the Global Convergence of Actor-Critic: A Case for Linear Quadratic Regulator with Ergodic Cost

07/14/2019
by Zhuoran Yang, et al.

Despite the empirical success of the actor-critic algorithm, its theoretical understanding lags behind. In a broader context, actor-critic can be viewed as an online alternating update algorithm for bilevel optimization, whose convergence is known to be fragile. To understand the instability of actor-critic, we focus on its application to linear quadratic regulators, a simple yet fundamental setting of reinforcement learning. We establish a nonasymptotic convergence analysis of actor-critic in this setting. In particular, we prove that actor-critic finds a globally optimal pair of actor (policy) and critic (action-value function) at a linear rate of convergence. Our analysis may serve as a preliminary step towards a complete theoretical understanding of bilevel optimization with nonconvex subproblems, which is NP-hard in the worst case and is often solved using heuristics.
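
To make the alternating actor and critic updates concrete, here is a minimal sketch on a scalar linear quadratic regulator. It is not the paper's algorithm, which is online and sample-based. It is classical exact policy iteration, used here only to illustrate how a critic step (policy evaluation) and an actor step (policy improvement) alternate. The dynamics x' = a*x + b*u + w, the cost q*x^2 + r*u^2, the linear policy u = -k*x, and all function names and numeric values are assumptions chosen for illustration.

```python
# Illustrative sketch only: exact policy iteration on a scalar LQR.
# All names and numbers here are assumptions, not taken from the paper.

def evaluate(k, a, b, q, r):
    """Critic step: value coefficient p of the linear policy u = -k * x."""
    c = a - b * k  # closed-loop gain
    if abs(c) >= 1.0:
        raise ValueError("policy is not stabilizing")
    # p solves the scalar Lyapunov equation p = q + r*k^2 + c^2 * p.
    return (q + r * k * k) / (1.0 - c * c)


def improve(p, a, b, r):
    """Actor step: gain that minimizes q*x^2 + r*u^2 + p*(a*x + b*u)^2 over u."""
    return a * b * p / (r + b * b * p)


def ergodic_cost(k, a, b, q, r, noise_var):
    """Long-run average cost of u = -k * x when w has variance noise_var."""
    return evaluate(k, a, b, q, r) * noise_var


def policy_iteration(k0, a, b, q, r, noise_var, steps=20):
    """Alternate critic and actor steps, recording the cost of each iterate."""
    k, costs = k0, []
    for _ in range(steps):
        costs.append(ergodic_cost(k, a, b, q, r, noise_var))
        p = evaluate(k, a, b, q, r)   # critic: evaluate the current policy
        k = improve(p, a, b, r)       # actor: update the policy
    return k, costs


if __name__ == "__main__":
    k_final, costs = policy_iteration(0.0, a=0.9, b=0.5, q=1.0, r=1.0, noise_var=1.0)
    print("final gain:", k_final)
    print("first and last cost:", costs[0], costs[-1])
```

In this idealized version the critic is computed exactly, so the instability discussed above does not arise. The paper's setting replaces the exact evaluation with an estimated action-value function, which is where the analysis becomes nontrivial.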

research
06/14/2021

Analysis of a Target-Based Actor-Critic Algorithm with Linear Function Approximation

Actor-critic methods integrating target networks have exhibited a stupen...
research
12/14/2019

Natural Actor-Critic Converges Globally for Hierarchical Linear Quadratic Regulator

Multi-agent reinforcement learning has been successfully applied to a nu...
research
07/10/2020

A Two-Timescale Framework for Bilevel Optimization: Complexity Analysis and Application to Actor-Critic

This paper analyzes a two-timescale stochastic algorithm for a class of ...
research
04/05/2020

Reinforcement Learning Architectures: SAC, TAC, and ESAC

The trend is to implement intelligent agents capable of analyzing availa...
research
05/14/2019

TauRieL: Targeting Traveling Salesman Problem with a deep reinforcement learning inspired architecture

In this paper, we propose TauRieL and target Traveling Salesman Problem ...
research
08/19/2021

Global Convergence of the ODE Limit for Online Actor-Critic Algorithms in Reinforcement Learning

Actor-critic algorithms are widely used in reinforcement learning, but a...
research
01/26/2021

Finite Sample Analysis of Two-Time-Scale Natural Actor-Critic Algorithm

Actor-critic style two-time-scale algorithms are very popular in reinfor...
