Learning Dense Rewards for Contact-Rich Manipulation Tasks

11/17/2020
by Zheng Wu, et al.

Rewards play a crucial role in reinforcement learning. Arriving at the desired policy often demands significant domain expertise and trial-and-error to design a suitable reward function. Here, we aim to minimize the effort involved in designing reward functions for contact-rich manipulation tasks. In particular, we present an approach that extracts dense reward functions algorithmically from a robot's high-dimensional observations, such as images and tactile feedback. In contrast to state-of-the-art high-dimensional reward-learning methods, our approach does not rely on adversarial training and is thus less prone to the associated training instabilities. Instead, it learns rewards by estimating task progress in a self-supervised manner. We demonstrate the effectiveness and efficiency of our approach on two contact-rich manipulation tasks, namely peg-in-hole and USB insertion. The experimental results indicate that policies trained with the learned reward function achieve better performance and faster convergence than the baselines.
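To make the idea of a progress-based dense reward concrete, the sketch below shows one possible realization consistent with the abstract's description, not the authors' actual architecture: a small network is trained in a self-supervised way to regress the normalized time index of observations taken from successful trajectories, and its prediction is then used as a dense reward signal for the RL agent. The names ProgressNet, train_progress_estimator, and reward_from_progress, as well as the flattened-observation input, are illustrative assumptions.

```python
# Minimal sketch (PyTorch) of a self-supervised progress estimator used as a
# dense reward. All names and architectural choices are hypothetical.
import torch
import torch.nn as nn


class ProgressNet(nn.Module):
    """Maps a flattened observation (e.g. image + tactile features) to a
    scalar in [0, 1], interpreted as task progress."""

    def __init__(self, obs_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Sigmoid(),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs).squeeze(-1)


def train_progress_estimator(model, trajectories, epochs=50, lr=1e-3):
    """Self-supervised training: the label for frame t of a length-T
    trajectory is simply t / (T - 1); no hand-designed reward is needed."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        for traj in trajectories:          # traj: (T, obs_dim) tensor
            T = traj.shape[0]
            targets = torch.linspace(0.0, 1.0, T)
            opt.zero_grad()
            loss = loss_fn(model(traj), targets)
            loss.backward()
            opt.step()
    return model


def reward_from_progress(model, obs: torch.Tensor) -> float:
    """Dense reward for the current observation: the predicted progress
    (an increment between consecutive steps would be another option)."""
    with torch.no_grad():
        return model(obs.unsqueeze(0)).item()
```

In such a scheme the reward grows as the observation looks more like the end of a successful trajectory, which gives the policy a shaped learning signal without adversarial training.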

