Simple Sensor Intentions for Exploration

by Tim Hertweck, et al.

Modern reinforcement learning algorithms can learn solutions to increasingly difficult control problems while at the same time reducing the amount of prior knowledge needed for their application. One of the remaining challenges is the definition of reward schemes that appropriately facilitate exploration without biasing the solution in undesirable ways, and that can be implemented on real robotic systems without expensive instrumentation. In this paper we focus on a setting in which goal tasks are defined via simple sparse rewards, and exploration is facilitated via agent-internal auxiliary tasks. We introduce the idea of simple sensor intentions (SSIs) as a generic way to define auxiliary tasks. SSIs reduce the amount of prior knowledge that is required to define suitable rewards. Furthermore, they can be computed directly from raw sensor streams and thus do not require expensive and possibly brittle state estimation on real systems. We demonstrate that a learning system based on these rewards can solve complex robotic tasks in simulation and in real-world settings. In particular, we show that a real robotic arm can learn to grasp and lift, and to solve a Ball-in-a-Cup task, from scratch when only raw sensor streams are used both as controller input and in the auxiliary reward definition.




3D Simulation for Robot Arm Control with Deep Q-Learning

Recent trends in robot arm control have seen a shift towards end-to-end ...

Learning by Playing - Solving Sparse Reward Tasks from Scratch

We propose Scheduled Auxiliary Control (SAC-X), a new learning paradigm ...

Follow your Nose: Using General Value Functions for Directed Exploration in Reinforcement Learning

Exploration versus exploitation dilemma is a significant problem in rein...

Keeping Your Distance: Solving Sparse Reward Tasks Using Self-Balancing Shaped Rewards

While using shaped rewards can be beneficial when solving sparse reward ...

Learning to Solve Tasks with Exploring Prior Behaviours

Demonstrations are widely used in Deep Reinforcement Learning (DRL) for ...

Efficient Exploration via State Marginal Matching

To solve tasks with sparse rewards, reinforcement learning algorithms mu...

Active Predicting Coding: Brain-Inspired Reinforcement Learning for Sparse Reward Robotic Control Problems

In this article, we propose a backpropagation-free approach to robotic c...
