Reward-Based Environment States for Robot Manipulation Policy Learning

12/10/2021

Training robot manipulation policies is a challenging, open problem in robotics and artificial intelligence. In this paper, we propose a novel and compact state representation based on the rewards predicted by an image-based task success classifier. Our experiments, which use the Pepper robot in simulation with two deep reinforcement learning algorithms on a grab-and-lift task, show that the proposed state representation can achieve up to 97% task success with our best policies.
