Latent World Models For Intrinsically Motivated Exploration

10/05/2020
by Aleksandr Ermolov, et al.

In this work, we consider partially observable environments with sparse rewards. We present a self-supervised representation learning method for image-based observations that arranges embeddings so that they respect the temporal distance between observations. Empirically, this representation is robust to stochasticity and is suitable for novelty detection based on the error of a predictive forward model. We consider both episodic and life-long uncertainties to guide exploration. We propose to estimate the missing information about the environment with a world model that operates in the learned latent space. To motivate the method, we analyse the exploration problem in a tabular Partially Observable Labyrinth. We demonstrate the method on image-based hard-exploration environments from the Atari benchmark and report a significant improvement over prior work. The source code of the method and of all the experiments is available at https://github.com/htdt/lwm.
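
As a rough illustration of novelty detection from the error of a predictive forward model, the sketch below scores a transition by the squared error between a predicted and an observed latent vector. It is only a toy under stated assumptions, not the authors' implementation (which is in the repository linked above): the names `intrinsic_reward` and `toy_forward_model` are hypothetical, latent vectors are plain lists of floats, and the forward model is a hand-written function rather than a learned network.

```python
from typing import Callable, List

Latent = List[float]
ForwardModel = Callable[[Latent, int], Latent]


def intrinsic_reward(predict_next: ForwardModel,
                     z_t: Latent, action: int, z_next: Latent) -> float:
    """Squared prediction error in latent space, used as a novelty score."""
    z_pred = predict_next(z_t, action)
    return sum((p - q) ** 2 for p, q in zip(z_pred, z_next))


def toy_forward_model(z: Latent, action: int) -> Latent:
    """Hypothetical stand-in for a learned model: the action shifts
    the first latent coordinate and leaves the others unchanged."""
    return [z[0] + action] + list(z[1:])


# A transition the model predicts exactly yields zero novelty.
familiar = intrinsic_reward(toy_forward_model, [0.0, 1.0], 1, [1.0, 1.0])

# A transition the model predicts poorly yields a larger novelty score.
surprising = intrinsic_reward(toy_forward_model, [0.0, 1.0], 1, [3.0, 0.0])
```

In this toy setting, transitions that the forward model already predicts well receive no exploration bonus, while poorly predicted ones receive a larger one.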

