Learning Skills to Patch Plans Based on Inaccurate Models

09/29/2020
by Alex LaGrassa, et al.

Planners using accurate models can be effective for accomplishing manipulation tasks in the real world, but they are typically highly specialized and require significant fine-tuning to be reliable. Meanwhile, learning is useful for adaptation but can require a substantial amount of data collection. In this paper, we propose a method that improves the efficiency of sub-optimal planners that use approximate but simple and fast models by switching to a model-free policy when unexpected transitions are observed. Unlike previous work, our method specifically addresses cases where the planner fails due to transition model error, patching with a local policy only where needed. First, we use a sub-optimal model-based planner to perform a task until model failure is detected. Next, we learn a local model-free policy from expert demonstrations to complete the task in regions where the model failed. To show the efficacy of our method, we perform experiments with a shape insertion puzzle and compare our results to both pure planning and imitation learning approaches. We then apply our method to a door opening task. Our experiments demonstrate that our patch-enhanced planner performs more reliably than pure planning and with lower overall sample complexity than pure imitation learning.
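The two-stage idea in the abstract (plan with an approximate model until a transition disagrees with the model's prediction, then hand control to a local policy learned from demonstrations in that region) can be illustrated with a toy sketch. This is not the authors' implementation. The 1D corridor, the `step` and `wiggle` actions, the jammed cells, the fixed patch radius, the one-action-per-failure expert query, and the nearest-neighbor policy are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

GOAL = 10
STEP, WIGGLE = "step", "wiggle"  # hypothetical action names


def model_step(x, action):
    """Approximate model: 'step' always advances one cell, 'wiggle' does nothing."""
    return x + 1 if action == STEP else x


def true_step(x, action):
    """Assumed true dynamics: in cells 4 and 5 a plain step jams; only a wiggle advances."""
    if 4 <= x < 6:
        return x + 1 if action == WIGGLE else x
    return model_step(x, action)


def plan(x):
    """Trivial planner over the approximate model: always step toward the goal."""
    return STEP


def expert(x):
    """Stand-in for an expert demonstrator that knows the true dynamics."""
    return WIGGLE if 4 <= x < 6 else STEP


@dataclass
class Patch:
    """Local model-free policy, active only near states where the model failed."""
    radius: float = 0.5
    centers: list = field(default_factory=list)  # states where failure was detected
    demos: list = field(default_factory=list)    # (state, expert_action) pairs

    def covers(self, x):
        return any(abs(x - c) <= self.radius for c in self.centers)

    def act(self, x):
        # 1-nearest-neighbor imitation of the stored demonstrations.
        return min(self.demos, key=lambda d: abs(d[0] - x))[1]


def run_episode(patch, learn=True, tol=1e-6, max_steps=50):
    """Return (reached_goal, steps_taken, expert_queries) for one episode."""
    x, steps, queries = 0, 0, 0
    while x < GOAL and steps < max_steps:
        covered = patch.covers(x)
        action = patch.act(x) if covered else plan(x)
        predicted = model_step(x, action)
        observed = true_step(x, action)
        steps += 1
        # Model failure: the observed transition deviates from the prediction
        # while the planner (not the patch) was in control.
        if learn and not covered and abs(observed - predicted) > tol:
            patch.centers.append(x)
            patch.demos.append((x, expert(x)))
            queries += 1
        x = observed
    return x >= GOAL, steps, queries


patch = Patch()
print(run_episode(Patch(), learn=False))  # pure planning, no patching
print(run_episode(patch))                 # first episode: failures detected and patched
print(run_episode(patch))                 # second episode: patch reused, no new queries
```

In this toy setting the pure planner never leaves the jammed cell, while the patched planner asks the expert only in the two cells where the model is wrong and follows the planner everywhere else. That mirrors, in a very simplified way, the claim that patching needs fewer demonstrations than imitating the whole task.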


