Diff-DAC: Distributed Actor-Critic for Multitask Deep Reinforcement Learning

10/28/2017
by Sergio Valcarcel Macua, et al.

We propose Diff-DAC, a multiagent distributed actor-critic algorithm for multitask reinforcement learning (MRL). The agents are connected in a (possibly sparse) network. Each agent is assigned a task and has access only to data from that local task. During learning, the agents can communicate some parameters to their neighbors. Because each agent incorporates its neighbors' parameters into its own learning rule, information diffuses across the network, and the agents can learn a common policy that generalizes well across all tasks. Diff-DAC is scalable because the computational complexity and communication overhead per agent grow with the number of neighbors rather than with the total number of agents. Moreover, the algorithm is fully distributed in the sense that agents self-organize, with no need for a coordinator node. Diff-DAC follows an actor-critic scheme in which the value function and the policy are approximated with deep neural networks, so it can learn expressive policies from raw data. As a by-product of deriving Diff-DAC from duality theory, we provide novel insights into the standard actor-critic framework, showing that it is an instance of the dual ascent method for approximating the solution of a linear program. Experiments illustrate the performance of the algorithm in the cart-pole, inverted pendulum, and swing-up cart-pole environments.
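The diffusion idea in the abstract (each agent updates on its own task, then mixes in its neighbors' parameters) can be illustrated with a toy sketch. This is not the Diff-DAC algorithm itself: the scalar parameter, the quadratic stand-in loss, the ring topology, the uniform combination weights, and the adapt-then-combine ordering are all assumptions made for illustration, and every function name below is hypothetical.

```python
# Toy sketch of parameter diffusion over a sparse network.
# Assumptions (not from the paper): scalar parameters, a quadratic
# stand-in loss per agent, a ring topology, and uniform averaging.


def ring_neighbors(num_agents):
    """Map each agent to its neighborhood: itself plus its two ring neighbors."""
    return {
        k: sorted({(k - 1) % num_agents, k, (k + 1) % num_agents})
        for k in range(num_agents)
    }


def local_gradient(w, target):
    """Gradient of the stand-in local loss 0.5 * (w - target) ** 2."""
    return w - target


def diffusion_step(params, targets, neighbors, step_size):
    """One adapt-then-combine iteration for all agents."""
    # Adapt: each agent takes a gradient step using only its own task data.
    adapted = [
        w - step_size * local_gradient(w, t) for w, t in zip(params, targets)
    ]
    # Combine: each agent averages the adapted values in its neighborhood.
    # The cost per agent depends on the neighborhood size, not on the
    # total number of agents.
    return [
        sum(adapted[j] for j in neighbors[k]) / len(neighbors[k])
        for k in range(len(params))
    ]


def run_diffusion(targets, num_steps=2000, step_size=0.05):
    """Run the toy diffusion recursion from a zero initialization."""
    neighbors = ring_neighbors(len(targets))
    params = [0.0] * len(targets)
    for _ in range(num_steps):
        params = diffusion_step(params, targets, neighbors, step_size)
    return params


# Each agent's local task prefers a different parameter value.
targets = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
final_params = run_diffusion(targets)
print(final_params)
```

In this toy setting no agent ever sees another agent's target, yet all agents end up close to a shared value near the average of the targets, because information propagates through repeated neighbor averaging. No coordinator node appears anywhere in the loop.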

