Single Time-scale Actor-critic Method to Solve the Linear Quadratic Regulator with Convergence Guarantees

01/31/2022
by Mo Zhou, et al.

We propose a single time-scale actor-critic algorithm to solve the linear quadratic regulator (LQR) problem. A least squares temporal difference (LSTD) method is applied to the critic and a natural policy gradient method is used for the actor. We give a proof of convergence with sample complexity O(ε⁻¹ log(ε⁻¹)²). The proof technique is applicable to general single time-scale bilevel optimization problems. We also numerically validate our theoretical convergence results.
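The abstract's recipe (an LSTD critic paired with a natural policy gradient actor for LQR) can be illustrated with a short sketch. The code below is not the paper's algorithm or experimental setup: the LQR instance, the discounted-cost formulation, the exploration noise, the step sizes, and the rollout length are all illustrative assumptions.

```python
# Minimal sketch of an actor-critic loop for discounted LQR:
# critic = LSTD fit of a quadratic Q-function, actor = natural policy gradient step.
# All problem data and hyperparameters below are assumptions for illustration only.
import numpy as np

rng = np.random.default_rng(0)

n, m = 3, 2                                  # state / input dimensions (assumed)
A = 0.8 * np.eye(n) + 0.05 * rng.standard_normal((n, n))
B = 0.5 * rng.standard_normal((n, m))
Qc = np.eye(n)                               # state cost matrix
Rc = np.eye(m)                               # input cost matrix
gamma = 0.95                                 # discount factor
sigma = 0.1                                  # exploration noise on the input

d = (n + m) * (n + m + 1) // 2               # number of quadratic features

def features(x, u):
    """Upper-triangular entries of zz' for z = [x; u] (quadratic Q-function features)."""
    z = np.concatenate([x, u])
    return np.outer(z, z)[np.triu_indices(n + m)]

def theta_to_matrix(theta):
    """Recover the symmetric matrix Theta of the fitted Q(x, u) = [x; u]' Theta [x; u]."""
    T = np.zeros((n + m, n + m))
    T[np.triu_indices(n + m)] = theta
    return (T + T.T) / 2                     # off-diagonal weights absorb the factor of 2

K = np.zeros((m, n))                         # initial stabilizing gain, policy u = -K x
eta = 0.05                                   # actor step size (illustrative)
T_rollout = 2000                             # samples per critic solve (illustrative)

for it in range(50):
    # Critic: LSTD fit of the quadratic Q-function of the current policy.
    A_lstd = np.zeros((d, d))
    b_lstd = np.zeros(d)
    x = rng.standard_normal(n)
    for _ in range(T_rollout):
        u = -K @ x + sigma * rng.standard_normal(m)   # exploration keeps features excited
        cost = x @ Qc @ x + u @ Rc @ u
        x_next = A @ x + B @ u
        phi = features(x, u)
        phi_next = features(x_next, -K @ x_next)      # on-policy next action
        A_lstd += np.outer(phi, phi - gamma * phi_next)
        b_lstd += cost * phi
        x = x_next
    Theta = theta_to_matrix(np.linalg.lstsq(A_lstd, b_lstd, rcond=None)[0])

    # Actor: natural policy gradient step K <- K - eta * (Theta_uu K - Theta_ux),
    # whose fixed point is the greedy gain Theta_uu^{-1} Theta_ux.
    Theta_uu, Theta_ux = Theta[n:, n:], Theta[n:, :n]
    K = K - eta * (Theta_uu @ K - Theta_ux)

print("learned gain K:\n", K)
```

The quadratic feature map is what makes an LSTD critic natural here: for a linear policy in a noise-free LQR, the Q-function is exactly a quadratic form in (x, u), so the critic only has to estimate the entries of that quadratic form.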


Related research

05/04/2020  A Finite Time Analysis of Two Time-Scale Actor Critic Methods
07/13/2019  Stochastic Convergence Results for Regularized Actor-Critic Methods
06/28/2023  SARC: Soft Actor Retrospective Critic
08/18/2022  Global Convergence of Two-timescale Actor-Critic for Solving Linear Quadratic Regulator
06/18/2023  On the Global Convergence of Natural Actor-Critic with Two-layer Neural Network Parametrization
09/29/2021  A Two-Time-Scale Stochastic Optimization Framework with Applications in Control and Reinforcement Learning
02/26/2020  When Do Drivers Concentrate? Attention-based Driver Behavior Modeling With Deep Reinforcement Learning
