DiGrad: Multi-Task Reinforcement Learning with Shared Actions

02/27/2018
by Parijat Dewangan, et al.

Most reinforcement learning algorithms are inefficient for learning multiple tasks in complex robotic systems, where different tasks share a set of actions. In such environments, a compound policy with shared neural network parameters may be learnt that performs multiple tasks concurrently. However, such a compound policy may become biased towards one task, or the gradients from different tasks may negate each other, making learning unstable and sometimes less data efficient. In this paper, we propose a new approach for simultaneous training of multiple tasks sharing a set of common actions in continuous action spaces, which we call DiGrad (Differential Policy Gradient). The proposed framework is based on differential policy gradients and can accommodate multi-task learning in a single actor-critic network. We also propose a simple heuristic in the differential policy gradient update to further improve learning. The proposed architecture was tested on an 8-link planar manipulator and a 27 degrees of freedom (DoF) humanoid, learning multi-goal reachability tasks for 3 and 2 end effectors respectively. We show that our approach supports efficient multi-task learning in complex robotic systems, outperforming related methods in continuous action spaces.
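The failure mode in which gradients from different tasks negate each other can be seen in a toy setting. The sketch below is not DiGrad and is not taken from the paper; it assumes a hypothetical one-parameter shared policy `a = w * state` and two hypothetical tasks with squared-error losses that pull the same shared action in opposite directions. All names (`task_gradient`, `task_a`, `task_b`) are illustrative assumptions.

```python
# Toy illustration (assumed setup, not the paper's method): two tasks share
# a single policy parameter w, and each task wants a different action for
# the same state.


def task_gradient(w, state, target):
    """Gradient of 0.5 * (w * state - target) ** 2 with respect to w."""
    return (w * state - target) * state


w = 0.0          # shared policy parameter
state = 1.0      # same state seen by both tasks
targets = {"task_a": 1.0, "task_b": -1.0}  # opposing desired actions

# Per-task gradients on the shared parameter.
grads = {name: task_gradient(w, state, t) for name, t in targets.items()}

# Naively summing the per-task gradients gives the compound update.
combined = sum(grads.values())

print(grads)
print(combined)
```

Here each task alone produces a non-zero gradient, but their sum is zero, so a naive compound update leaves the shared parameter unchanged even though neither task is solved.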

