PBCS: Efficient Exploration and Exploitation Using a Synergy between Reinforcement Learning and Motion Planning

04/24/2020
by Guillaume Matheron, et al.

The exploration-exploitation trade-off is at the heart of reinforcement learning (RL). However, most continuous control benchmarks used in recent RL research require only local exploration. This has led to the development of algorithms with only basic exploration capabilities, which perform poorly on benchmarks that require more versatile exploration. For instance, as demonstrated in our empirical study, state-of-the-art RL algorithms such as DDPG and TD3 are unable to steer a point mass in even small 2D mazes. In this paper, we propose a new algorithm called "Plan, Backplay, Chain Skills" (PBCS) that combines motion planning and reinforcement learning to solve hard exploration environments. In the first phase, a motion planning algorithm is used to find a single good trajectory. In the second phase, an RL algorithm is trained using a curriculum derived from that trajectory, by combining a variant of the Backplay algorithm with skill chaining. We show that this method outperforms state-of-the-art RL algorithms in 2D maze environments of various sizes, and that it is able to improve on the trajectory obtained by the motion planning phase.
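
The curriculum idea in the second phase can be illustrated with a small sketch. This is not the authors' implementation: it leaves out the motion planner, the RL algorithm, and skill chaining. It assumes the planner's trajectory is available as a list of 2D states, and it uses a hypothetical callback, `can_reach_goal_from`, as a stand-in for "train the agent from this reset state and report whether it reaches the goal". The function name `backplay_curriculum` and the `step_back` parameter are likewise assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

# Assumption: a state is a 2D position, as in a point-mass maze.
State = Tuple[float, float]


def backplay_curriculum(
    trajectory: Sequence[State],
    can_reach_goal_from: Callable[[State], bool],
    step_back: int = 1,
) -> List[int]:
    """Move the reset point backward along a known trajectory.

    Starts at the last state of the trajectory (closest to the goal) and
    moves the reset point `step_back` indices earlier each time the
    hypothetical callback reports success. Stops at the first failure or
    once the start of the trajectory has been tried.

    Returns the trajectory indices that were used as reset points,
    in the order they were visited.
    """
    if not trajectory:
        return []

    visited: List[int] = []
    index = len(trajectory) - 1
    while True:
        visited.append(index)
        # Stand-in for training from this reset state and checking success.
        if not can_reach_goal_from(trajectory[index]):
            break  # a fuller method could begin a new skill here
        if index == 0:
            break  # the whole trajectory has been covered
        index = max(0, index - step_back)
    return visited


if __name__ == "__main__":
    path = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
    # Toy success rule: the agent only succeeds when it starts at x >= 2.
    print(backplay_curriculum(path, lambda s: s[0] >= 2.0))
```

In this toy run the reset point moves back from index 4 until the callback first reports a failure. A method that chains skills could treat that failure point as the place where a new skill has to take over.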
