Sample-Efficient Learning of Nonprehensile Manipulation Policies via Physics-Based Informed State Distributions

10/24/2018
by Lerrel Pinto, et al.

This paper proposes a simple, sample-efficient approach to learning closed-loop policies for nonprehensile manipulation. Although reinforcement learning (RL) can learn closed-loop policies without access to the underlying physics models, it suffers from poor sample complexity on challenging tasks. To overcome this problem, we leverage rearrangement planning to provide an informative physics-based prior on the environment's optimal state-visitation distribution. Specifically, we present a new technique, Learning with Planned Episodic Resets (LeaPER), that resets the environment's state to one informed by the prior during the learning phase. We experimentally show that LeaPER outperforms traditional RL approaches by a factor of up to 5x on simulated rearrangement. Further, we relax the dynamics from quasi-static to welded contacts to illustrate that LeaPER is robust to the use of simpler physics models. Finally, LeaPER's closed-loop policies significantly improve task success rates relative to both open-loop controls with a planned path and simple feedback controllers that track open-loop trajectories. We demonstrate the performance and behavior of LeaPER on a physical 7-DOF manipulator in a video at https://youtu.be/feS-zFq6J1c.
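The reset idea described above can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract only states that the environment is reset to a state informed by the planner's prior, so everything else here is an assumption made for illustration: the names `sample_reset_state` and `ToyPushEnv`, the 2-D state, the uniform draw over path states, and the mixing probability `p_planned`.

```python
import random
from typing import Optional, Sequence, Tuple

# Hypothetical state: a 2-D object position. The real task state is richer.
State = Tuple[float, float]


def sample_reset_state(
    planned_path: Sequence[State],
    default_start: State,
    rng: random.Random,
    p_planned: float = 0.5,
) -> State:
    """Choose the state an episode starts from.

    Assumption: with probability p_planned, draw uniformly from the states
    on a planner's path; otherwise use the task's usual initial state.
    """
    if planned_path and rng.random() < p_planned:
        return rng.choice(list(planned_path))
    return default_start


class ToyPushEnv:
    """Stand-in environment whose state is just an object position."""

    def __init__(self, start: State, goal: State) -> None:
        self.start = start
        self.goal = goal
        self.state = start

    def reset(self, to_state: Optional[State] = None) -> State:
        # Reset to the given state if provided, else to the default start.
        self.state = to_state if to_state is not None else self.start
        return self.state


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical planner output: states along a path from start to goal.
    path = [(0.0, 0.0), (0.25, 0.1), (0.5, 0.2), (0.75, 0.3), (1.0, 0.4)]
    env = ToyPushEnv(start=path[0], goal=path[-1])
    for episode in range(3):
        start_state = env.reset(sample_reset_state(path, env.start, rng))
        print(f"episode {episode} starts at {start_state}")
```

Starting some episodes from states along a planned path places the learner closer to the goal than the default start does, which is one plausible reading of how a planner-informed reset distribution could reduce the exploration burden. The RL update itself is omitted from the sketch.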


