Renaissance Robot: Optimal Transport Policy Fusion for Learning Diverse Skills

07/03/2022
by Julia Tan, et al.

Deep reinforcement learning (RL) is a promising approach to solving complex robotics problems. However, learning through trial-and-error interaction is often highly time-consuming, despite recent advances in RL algorithms. Additionally, the success of RL depends critically on how well the reward-shaping function suits the task, and designing that function is itself time-consuming. As agents trained on a variety of robotics problems continue to proliferate, the ability to reuse their valuable learning in new domains becomes increasingly significant. In this paper, we propose a post-hoc technique for policy fusion that uses Optimal Transport theory as a robust means of consolidating the knowledge of multiple agents trained on distinct scenarios. We further demonstrate that this provides an improved weight initialisation of the neural network policy for learning new tasks, requiring less time and fewer computational resources than either retraining the parent policies or training a new policy from scratch. Ultimately, our results on diverse agents commonly used in deep RL show that specialised knowledge can be unified into a "Renaissance agent", allowing for quicker learning of new skills.
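
To make the idea of fusing policies by transport-based alignment concrete, here is a minimal sketch. It is not the authors' implementation, and the abstract does not specify their cost function or solver. It assumes two policies with the same one-hidden-layer architecture, a cost based on the squared distance between incoming weight vectors, and uniform mass on every hidden unit. Under those assumptions the optimal transport plan is a permutation, so it can be computed with SciPy's `linear_sum_assignment`. The function names `align_hidden_units` and `fuse_policies` and the parameter layout are hypothetical.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def align_hidden_units(w1_a, w1_b):
    """Match hidden units of policy B to those of policy A.

    w1_a, w1_b: arrays of shape (hidden, obs_dim), one row per hidden unit.
    Returns perm such that row perm[i] of w1_b is matched to row i of w1_a.
    """
    # cost[i, j] = squared Euclidean distance between unit i of A and unit j of B.
    cost = ((w1_a[:, None, :] - w1_b[None, :, :]) ** 2).sum(axis=-1)
    # With equal layer sizes and uniform marginals (an assumption of this
    # sketch), the optimal transport plan is a one-to-one assignment.
    _, col_idx = linear_sum_assignment(cost)
    return col_idx


def fuse_policies(params_a, params_b):
    """Average two policies after aligning B's hidden units to A's.

    Each params tuple is (w1, b1, w2, b2) with shapes
    (hidden, obs_dim), (hidden,), (act_dim, hidden), (act_dim,).
    """
    w1_a, b1_a, w2_a, b2_a = params_a
    w1_b, b1_b, w2_b, b2_b = params_b

    perm = align_hidden_units(w1_a, w1_b)
    # Reorder B's hidden units; the output layer's columns must follow
    # the same permutation so that B still computes the same function.
    w1_b, b1_b, w2_b = w1_b[perm], b1_b[perm], w2_b[:, perm]

    # Plain averaging of the aligned parameters.
    return (
        0.5 * (w1_a + w1_b),
        0.5 * (b1_a + b1_b),
        0.5 * (w2_a + w2_b),
        0.5 * (b2_a + b2_b),
    )
```

The fused parameters would then serve as the initial weights for training on a new task. Alignment comes before averaging because hidden units in independently trained networks have no shared ordering, so averaging unaligned weights can mix unrelated units.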


research
12/06/2022

Reinforcement Learning for UAV control with Policy and Reward Shaping

In recent years, unmanned aerial vehicle (UAV) related technology has ex...
research
09/12/2023

Risk-Aware Reinforcement Learning through Optimal Transport Theory

In the dynamic and uncertain environments where reinforcement learning (...
research
05/17/2023

Demonstration-free Autonomous Reinforcement Learning via Implicit and Bidirectional Curriculum

While reinforcement learning (RL) has achieved great success in acquirin...
research
06/18/2019

Language as an Abstraction for Hierarchical Deep Reinforcement Learning

Solving complex, temporally-extended tasks is a long-standing problem in...
research
08/05/2021

The AI Economist: Optimal Economic Policy Design via Two-level Deep Reinforcement Learning

AI and reinforcement learning (RL) have improved many areas, but are not...
research
02/09/2021

Scheduling the NASA Deep Space Network with Deep Reinforcement Learning

With three complexes spread evenly across the Earth, NASA's Deep Space N...
research
12/04/2022

Hierarchical Policy Blending As Optimal Transport

We present hierarchical policy blending as optimal transport (HiPBOT). T...
