Asynchronous Distributed Reinforcement Learning for LQR Control via Zeroth-Order Block Coordinate Descent

07/26/2021
by Gangshan Jing, et al.

Recently introduced distributed zeroth-order optimization (ZOO) algorithms have shown their utility in distributed reinforcement learning (RL). Unfortunately, in the gradient estimation process, almost all of them require random samples with the same dimension as the global variable and/or require evaluation of the global cost function, which can induce high estimation variance in large-scale networks. In this paper, we propose a novel distributed zeroth-order algorithm that leverages the network structure inherent in the optimization objective, allowing each agent to estimate its local gradient independently via local cost evaluations, without the use of any consensus protocol. The proposed algorithm has an asynchronous update scheme and is built on the block coordinate descent method for stochastic non-convex optimization over a possibly non-convex feasible domain. The algorithm is then employed as a distributed model-free RL algorithm for distributed linear quadratic regulator design, where a learning graph is designed to describe the interaction relationships required among agents during distributed learning. We empirically validate the proposed algorithm, benchmarking its convergence rate and estimation variance against a centralized ZOO algorithm.
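The abstract outlines the key mechanism: each agent perturbs only its own parameter block, evaluates a cost it can compute locally, forms a zeroth-order estimate of its block gradient, and blocks are updated one at a time. The sketch below illustrates this idea on a toy quadratic cost over a ring graph; the cost function, topology, block dimension, step size, smoothing radius, and sample count are all illustrative assumptions and do not reproduce the paper's LQR formulation or exact algorithm.

import numpy as np

rng = np.random.default_rng(0)
n_agents, block_dim = 4, 3
blocks = [rng.standard_normal(block_dim) for _ in range(n_agents)]

def coupling(i, blocks):
    # One term of the global cost; it involves only blocks i and i+1 (ring graph).
    return np.sum((blocks[i] - blocks[(i + 1) % n_agents]) ** 2) + 0.1 * np.sum(blocks[i] ** 2)

def local_cost(i, blocks):
    # Sum of the terms that depend on agent i's block: couplings i and i-1.
    # No other term involves block i, so a zeroth-order estimate of this local
    # cost's gradient w.r.t. block i matches the block gradient of the global cost.
    return coupling(i, blocks) + coupling((i - 1) % n_agents, blocks)

def zo_block_gradient(i, blocks, radius=1e-2, n_samples=20):
    # Two-point zeroth-order estimate of the block-i gradient using only local cost evaluations.
    g = np.zeros(block_dim)
    for _ in range(n_samples):
        u = rng.standard_normal(block_dim)
        u /= np.linalg.norm(u)                          # random direction on the unit sphere
        plus = [b.copy() for b in blocks]
        minus = [b.copy() for b in blocks]
        plus[i] += radius * u
        minus[i] -= radius * u
        g += block_dim * (local_cost(i, plus) - local_cost(i, minus)) / (2 * radius) * u
    return g / n_samples

step = 0.05
global_cost = lambda blocks: sum(coupling(i, blocks) for i in range(n_agents))
print("initial global cost:", round(global_cost(blocks), 4))

for t in range(500):
    i = int(rng.integers(n_agents))                     # one randomly activated agent per step
    blocks[i] = blocks[i] - step * zo_block_gradient(i, blocks)

print("final global cost:  ", round(global_cost(blocks), 4))

In this toy example, the locally evaluated difference quotient is an unbiased estimator of the gradient of the spherically smoothed cost with respect to agent i's block, which is why each update can proceed from local cost evaluations alone, without a consensus step.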
