Multi-task Learning with Gradient Guided Policy Specialization

09/23/2017
by   Wenhao Yu, et al.

We present a method for efficient learning of control policies for multiple related robotic motor skills. Our approach consists of two stages: joint training and specialization training. During the joint training stage, a neural network policy is trained with minimal information to disambiguate the motor skills. This forces the policy to learn a common representation of the different tasks. Then, during the specialization training stage, we selectively split the weights of the policy based on a per-weight metric that measures the disagreement among the multiple tasks. Splitting part of the control policy allows it to be further trained to specialize to each task. To update the control policy during learning, we use Trust Region Policy Optimization with Generalized Advantage Estimation (TRPOGAE). We propose a modification to the gradient update stage of TRPO to better accommodate multi-task learning scenarios. We evaluate our approach on three continuous motor skill learning problems in simulation: 1) a locomotion task where three single-legged robots with considerable differences in shape and size are trained to hop forward, 2) a manipulation task where three robot manipulators with different sizes and joint types are trained to reach different locations in 3D space, and 3) locomotion of a two-legged robot in which the range of motion of one leg is constrained in different ways. We compare our training method to three baselines. The first baseline uses only joint training for the policy, the second trains independent policies for each task, and the last randomly selects weights to split. We show that our approach learns more efficiently than each of the baseline methods.
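
The selective weight-splitting step can be illustrated with a small sketch. The abstract does not define the per-weight disagreement metric, so the code below assumes one plausible stand-in: the variance, across tasks, of each shared weight's gradient. The function names `disagreement_scores` and `select_weights_to_split`, the `fraction` parameter, and the gradient values are all hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
from statistics import pvariance


def disagreement_scores(per_task_grads):
    """Score each shared weight by how much the tasks disagree on its gradient.

    ASSUMPTION: disagreement is measured as the population variance of the
    per-task gradients. The abstract does not specify the actual metric.
    """
    n_weights = len(per_task_grads[0])
    if any(len(g) != n_weights for g in per_task_grads):
        raise ValueError("every task must supply one gradient entry per weight")
    return [pvariance([g[i] for g in per_task_grads]) for i in range(n_weights)]


def select_weights_to_split(scores, fraction):
    """Return the indices of the top `fraction` of weights by disagreement."""
    k = round(fraction * len(scores))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


# Hypothetical gradients for three tasks over four shared weights.
grads = [
    [0.10, -0.90, 0.30, 0.00],
    [0.12, 0.80, 0.28, 0.01],
    [0.09, -0.10, 0.31, -0.01],
]
scores = disagreement_scores(grads)
split = select_weights_to_split(scores, fraction=0.25)
print(split)
```

In this toy input the tasks roughly agree on every weight except the second one, so that weight is the one selected. After such a selection, each task would receive its own copy of the selected weights while the remaining weights stay shared.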


