Diff-DAC: Distributed Actor-Critic for Average Multitask Deep Reinforcement Learning

10/28/2017
by   Sergio Valcarcel Macua, et al.

We propose a fully distributed actor-critic algorithm approximated by deep neural networks, named Diff-DAC, with application to single-task and to average multitask reinforcement learning (MRL). Each agent has access to data from its local task only, but it aims to learn a policy that performs well on average over the whole set of tasks. During the learning process, agents communicate their value and policy parameters to their neighbors, diffusing the information across the network, so that they converge to a common policy with no need for a central node. The method is scalable, since the computational and communication costs per agent grow with its number of neighbors. We derive Diff-DAC from duality theory and provide novel insights into the standard actor-critic framework, showing that it is actually an instance of the dual ascent method that approximates the solution of a linear program. Experiments suggest that Diff-DAC can outperform the only previous distributed MRL approach (i.e., Dist-MTLPS) and even the centralized architecture.
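The diffusion idea in the abstract (each agent updates on local data, then averages parameters with its neighbors) can be illustrated with a minimal sketch. This is not the Diff-DAC algorithm itself: the ring topology, the uniform averaging weights, and the local quadratic loss standing in for the actor-critic gradient are all assumptions made for illustration, and every function name below is hypothetical.

```python
import random


def ring_neighbors(n):
    """Assumed topology: each agent sees itself and its two adjacent agents."""
    return [[(k - 1) % n, k, (k + 1) % n] for k in range(n)]


def adapt(params, targets, step):
    """Local step: gradient descent on 0.5 * ||w - target_k||^2.

    The quadratic loss is a stand-in for an agent's local task objective.
    """
    return [
        [w - step * (w - t) for w, t in zip(p, tgt)]
        for p, tgt in zip(params, targets)
    ]


def combine(params, neighbors):
    """Diffusion step: each agent averages the parameters in its neighborhood."""
    dim = len(params[0])
    return [
        [sum(params[j][d] for j in nbrs) / len(nbrs) for d in range(dim)]
        for nbrs in neighbors
    ]


def run(n_agents=6, dim=3, iters=2000, step=0.01, seed=0):
    rng = random.Random(seed)
    # One private target per agent: a toy version of "one local task per agent".
    targets = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_agents)]
    params = [[0.0] * dim for _ in range(n_agents)]
    neighbors = ring_neighbors(n_agents)
    for _ in range(iters):
        # No central node: each update uses only local data and neighbor parameters.
        params = combine(adapt(params, targets, step), neighbors)
    return params, targets


if __name__ == "__main__":
    params, targets = run()
    n, dim = len(targets), len(targets[0])
    mean_target = [sum(t[d] for t in targets) / n for d in range(dim)]
    print("minimizer of the average loss:", mean_target)
    for k, p in enumerate(params):
        print("agent", k, p)
```

In this toy setting every agent ends up close to the minimizer of the average loss, even though it only ever sees its own target. The per-agent cost of the combine step depends on the neighborhood size rather than on the total number of agents, which mirrors the scalability argument in the abstract.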

research
10/28/2017

Diff-DAC: Distributed Actor-Critic for Multitask Deep Reinforcement Learning

We propose a multiagent distributed actor-critic algorithm for multitask...
research
10/23/2021

Fully Distributed Actor-Critic Architecture for Multitask Deep Reinforcement Learning

We propose a fully distributed actor-critic architecture, named Diff-DAC...
research
10/03/2019

SensorDrop: A Reinforcement Learning Framework for Communication Overhead Reduction on the Edge

In IoT solutions, it is usually desirable to collect data from a large n...
research
04/28/2017

Adaptation and learning over networks for nonlinear system modeling

In this chapter, we analyze nonlinear filtering problems in distributed ...
research
05/14/2019

TauRieL: Targeting Traveling Salesman Problem with a deep reinforcement learning inspired architecture

In this paper, we propose TauRieL and target Traveling Salesman Problem ...
research
11/15/2018

Seq2Seq Mimic Games: A Signaling Perspective

We study the emergence of communication in multiagent adversarial settin...
research
04/13/2018

Robust Dual View Deep Agent

Motivated by recent advance of machine learning using Deep Reinforcement...
