Self-Supervised Exploration via Latent Bayesian Surprise

04/15/2021
by Pietro Mazzaglia, et al.

Training with Reinforcement Learning requires a reward function that guides the agent towards achieving its objective. However, designing smooth and well-behaved rewards is in general not trivial and requires significant human engineering effort. Generating rewards in a self-supervised way, by instilling in the agent an intrinsic desire to learn and explore the environment, might induce more general behaviours. In this work, we propose a curiosity-based bonus as an intrinsic reward for Reinforcement Learning, computed as the Bayesian surprise with respect to a latent state variable, learnt by reconstructing fixed random features. We extensively evaluate our model by measuring the agent's performance in terms of environment exploration for continuous tasks, and by looking at the game scores achieved for video games. Our model is computationally cheap and empirically shows state-of-the-art performance on several problems. Furthermore, in experiments on an environment with stochastic actions, our approach proved to be the most resilient to simple stochasticity. Further visualization is available on the project webpage (https://lbsexploration.github.io/).
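As a rough illustration of the quantity named in the abstract, Bayesian surprise is commonly measured as the KL divergence from a prior belief to the posterior belief over a latent variable. The sketch below assumes both beliefs are diagonal Gaussians, which the abstract does not state; the function name `gaussian_kl` and its parameters are hypothetical and are not taken from the authors' code.

```python
import math
from typing import Sequence


def gaussian_kl(
    mu_q: Sequence[float],
    sigma_q: Sequence[float],
    mu_p: Sequence[float],
    sigma_p: Sequence[float],
) -> float:
    """KL(q || p) in nats for two diagonal Gaussians.

    q plays the role of the posterior over the latent (after seeing the
    next observation) and p the prior (before seeing it). The diagonal
    Gaussian form is an assumption made for this sketch.
    """
    if not (len(mu_q) == len(sigma_q) == len(mu_p) == len(sigma_p)):
        raise ValueError("all inputs must have the same length")
    kl = 0.0
    for mq, sq, mp, sp in zip(mu_q, sigma_q, mu_p, sigma_p):
        if sq <= 0.0 or sp <= 0.0:
            raise ValueError("standard deviations must be positive")
        # Closed-form KL between two univariate Gaussians, summed over
        # the independent latent dimensions.
        kl += math.log(sp / sq) + (sq**2 + (mq - mp) ** 2) / (2.0 * sp**2) - 0.5
    return kl


# Hypothetical usage: the surprise serves as the intrinsic reward.
prior_mu, prior_sigma = [0.0, 0.0], [1.0, 1.0]
post_mu, post_sigma = [1.0, 0.0], [1.0, 1.0]
intrinsic_reward = gaussian_kl(post_mu, post_sigma, prior_mu, prior_sigma)
```

Under this reading, a transition whose observation shifts the latent belief far from what was predicted yields a large bonus, while a transition the model already anticipates yields a bonus near zero.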

research
10/17/2020

Variational Dynamic for Self-Supervised Exploration in Deep Reinforcement Learning

Efficient exploration remains a challenging problem in reinforcement lea...
research
02/22/2023

Exploration by self-supervised exploitation

Reinforcement learning can solve decision-making problems and train an a...
research
05/15/2017

Curiosity-driven Exploration by Self-supervised Prediction

In many real-world scenarios, rewards extrinsic to the agent are extreme...
research
05/29/2018

Playing hard exploration games by watching YouTube

Deep reinforcement learning methods traditionally struggle with tasks wh...
research
08/13/2018

Large-Scale Study of Curiosity-Driven Learning

Reinforcement learning algorithms rely on carefully engineering environm...
research
11/05/2018

Contingency-Aware Exploration in Reinforcement Learning

This paper investigates whether learning contingency-awareness and contr...
research
08/24/2022

Self-Supervised Exploration via Temporal Inconsistency in Reinforcement Learning

In real-world scenarios, reinforcement learning under sparse-reward syne...