MIME: Mutual Information Minimisation Exploration

01/16/2020
by Haitao Xu et al.

We show that reinforcement learning agents that learn by surprise (surprisal) get stuck at abrupt environmental transition boundaries because these transitions are difficult to learn. We propose a counter-intuitive solution that we call Mutual Information Minimising Exploration (MIME), where an agent learns a latent representation of the environment without trying to predict future states. We show that our agent performs significantly better over sharp transition boundaries while matching the performance of surprisal-driven agents elsewhere. In particular, we show state-of-the-art performance on difficult exploration games such as Gravitar, Montezuma's Revenge and Doom.
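The abstract does not spell out MIME's objective or architecture, so the sketch below is only a loose illustration of the idea it contrasts with surprisal: a MINE-style Donsker-Varadhan lower bound on the mutual information between latent codes of consecutive observations, whose negative is used as an intrinsic reward so that low-MI transitions are favoured. The encoder, statistics network, variable pairing, and reward sign are assumptions made for the sake of a concrete example, not the authors' implementation.

# Hypothetical sketch, not the authors' code: a MINE-style Donsker-Varadhan
# estimate of the mutual information between latent codes of consecutive
# observations, with its negative used as an intrinsic reward.
import math
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Maps an observation to a latent code; note there is no forward model
    predicting the next state, unlike surprisal-driven exploration."""
    def __init__(self, obs_dim, latent_dim=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU(),
                                 nn.Linear(128, latent_dim))

    def forward(self, obs):
        return self.net(obs)

class StatisticsNet(nn.Module):
    """Scores (z_t, z_{t+1}) pairs for the Donsker-Varadhan bound (as in MINE)."""
    def __init__(self, latent_dim=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2 * latent_dim, 128), nn.ReLU(),
                                 nn.Linear(128, 1))

    def forward(self, z_t, z_next):
        return self.net(torch.cat([z_t, z_next], dim=-1))

def mi_lower_bound(stat_net, z_t, z_next):
    """I(z_t; z_{t+1}) >= E_joint[T] - log E_marginal[exp(T)]."""
    t_joint = stat_net(z_t, z_next)                                # aligned (joint) pairs
    t_marg = stat_net(z_t, z_next[torch.randperm(len(z_next))])    # shuffled (marginal) pairs
    return t_joint.mean() - (torch.logsumexp(t_marg, dim=0) - math.log(len(z_next))).squeeze()

# Usage with dummy data: a batch of (o_t, o_{t+1}) transitions.
obs_dim, batch = 8, 64
encoder, stat_net = Encoder(obs_dim), StatisticsNet()
obs_t, obs_next = torch.randn(batch, obs_dim), torch.randn(batch, obs_dim)
mi_estimate = mi_lower_bound(stat_net, encoder(obs_t), encoder(obs_next))

# Batch-averaged intrinsic bonus: lower estimated MI -> higher reward.
# A real agent would need per-transition bonuses and would train the
# statistics network to maximise the bound while the encoder/policy is
# pushed toward low-MI transitions; those details are omitted here.
intrinsic_reward = -mi_estimate.detach()
print(float(intrinsic_reward))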

Related research

10/11/2018 · Empowerment-driven Exploration using Mutual Information Estimation
Exploration is a difficult challenge in reinforcement learning and is of...

06/12/2020 · Deep Reinforcement and InfoMax Learning
Our work is based on the hypothesis that a model-free agent whose repres...

08/06/2018 · Learning to Share and Hide Intentions using Information Regularization
Learning to cooperate with friends and compete with foes is a key compon...

06/21/2020 · Emergent cooperation through mutual information maximization
With artificial intelligence systems becoming ubiquitous in our society,...

10/09/2022 · Decomposed Mutual Information Optimization for Generalized Context in Meta-Reinforcement Learning
Adapting to the changes in transition dynamics is essential in robotic a...

03/10/2021 · Hard Attention Control By Mutual Information Maximization
Biological agents have adopted the principle of attention to limit the r...

06/19/2023 · Learning Models of Adversarial Agent Behavior under Partial Observability
The need for opponent modeling and tracking arises in several real-world...
