Curiosity-driven Exploration by Self-supervised Prediction

05/15/2017
by Deepak Pathak, et al.

In many real-world scenarios, rewards extrinsic to the agent are extremely sparse, or absent altogether. In such cases, curiosity can serve as an intrinsic reward signal to enable the agent to explore its environment and learn skills that might be useful later in its life. We formulate curiosity as the error in an agent's ability to predict the consequence of its own actions in a visual feature space learned by a self-supervised inverse dynamics model. Our formulation scales to high-dimensional continuous state spaces like images, bypasses the difficulties of directly predicting pixels, and, critically, ignores the aspects of the environment that cannot affect the agent. The proposed approach is evaluated in two environments: VizDoom and Super Mario Bros. Three broad settings are investigated: 1) sparse extrinsic reward, where curiosity allows for far fewer interactions with the environment to reach the goal; 2) exploration with no extrinsic reward, where curiosity pushes the agent to explore more efficiently; and 3) generalization to unseen scenarios (e.g. new levels of the same game) where the knowledge gained from earlier experience helps the agent explore new places much faster than starting from scratch. Demo video and code available at https://pathak22.github.io/noreward-rl/
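For concreteness, here is a minimal sketch of the intrinsic curiosity module the abstract describes, written in PyTorch. The MLP encoder, layer widths, and default hyperparameters (beta, eta) are illustrative assumptions, not the paper's architecture: the paper trains a small convolutional encoder on pixels jointly with an A3C policy, and the authors' actual code is at the linked repository.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ICM(nn.Module):
    """Sketch of an intrinsic curiosity module (ICM).

    An encoder phi is trained through an inverse-dynamics head
    (predict a_t from phi(s_t) and phi(s_{t+1})), so the features
    keep only what the agent's actions can influence. The forward
    model's prediction error in that feature space is the curiosity
    reward.
    """

    def __init__(self, obs_dim, n_actions, feat_dim=64, beta=0.2, eta=0.01):
        super().__init__()
        self.n_actions = n_actions
        self.beta = beta  # weights forward vs. inverse loss
        self.eta = eta    # scales the intrinsic reward
        # Feature encoder phi; a plain MLP on flat observations
        # stands in for the paper's convolutional encoder on pixels.
        self.encoder = nn.Sequential(
            nn.Linear(obs_dim, 128), nn.ReLU(),
            nn.Linear(128, feat_dim))
        # Inverse model: (phi(s_t), phi(s_{t+1})) -> logits over a_t.
        self.inverse = nn.Sequential(
            nn.Linear(2 * feat_dim, 128), nn.ReLU(),
            nn.Linear(128, n_actions))
        # Forward model: (phi(s_t), one-hot a_t) -> predicted phi(s_{t+1}).
        self.forward_model = nn.Sequential(
            nn.Linear(feat_dim + n_actions, 128), nn.ReLU(),
            nn.Linear(128, feat_dim))

    def forward(self, obs, next_obs, action):
        phi = self.encoder(obs)
        phi_next = self.encoder(next_obs)
        a = F.one_hot(action, self.n_actions).float()

        # Inverse-dynamics loss: shapes the features so they ignore
        # aspects of the environment the agent cannot affect.
        logits = self.inverse(torch.cat([phi, phi_next], dim=1))
        inverse_loss = F.cross_entropy(logits, action)

        # Forward-model error in feature space; detaching the target
        # (a common implementation choice) keeps this loss from
        # collapsing the features to a constant.
        phi_pred = self.forward_model(torch.cat([phi, a], dim=1))
        forward_loss = 0.5 * (phi_pred - phi_next.detach()).pow(2).sum(1).mean()

        # Curiosity reward r_t^i = (eta / 2) * ||phi_pred - phi(s_{t+1})||^2,
        # computed per transition and kept out of the gradient graph.
        intrinsic_reward = (self.eta * 0.5 *
                            (phi_pred - phi_next).pow(2).sum(1)).detach()

        icm_loss = (1 - self.beta) * inverse_loss + self.beta * forward_loss
        return intrinsic_reward, icm_loss
```

During training, the intrinsic reward is simply added to whatever (possibly zero) extrinsic reward the environment provides, and the ICM loss is optimized jointly with the policy's loss.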
