Empowerment-driven Exploration using Mutual Information Estimation

10/11/2018
by   Navneet Madhu Kumar, et al.

Exploration is a difficult challenge in reinforcement learning and is of prime importance in sparse-reward environments. However, many state-of-the-art deep reinforcement learning algorithms that rely on epsilon-greedy exploration fail in these environments. In such cases, empowerment can serve as an intrinsic reward signal that lets the agent maximize the influence it has over the near future. We formulate empowerment as the channel capacity between states and actions, and compute it by estimating the mutual information between the actions and the states that follow them. The mutual information is estimated using the Mutual Information Neural Estimator (MINE) and a forward dynamics model. We demonstrate that an empowerment-driven agent significantly improves on the score of a baseline DQN agent on the game Montezuma's Revenge.
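
To make the estimation step concrete, the sketch below computes the Donsker-Varadhan lower bound on mutual information, I(A; S') >= E_joint[T(a, s')] - log E_marginal[exp(T(a, s'))], which is the quantity MINE maximizes over a neural critic T. It is a minimal illustration, not the authors' implementation: the function names, the toy environment (a binary action that fully determines the next state), and the hand-picked critic are all assumptions made here, and no training loop or forward dynamics model is included.

```python
import math


def dv_lower_bound(joint_scores, marginal_scores):
    """Donsker-Varadhan bound: mean of T on joint samples minus
    log-mean-exp of T on samples from the product of marginals."""
    joint_term = sum(joint_scores) / len(joint_scores)
    # Subtract the max before exponentiating for numerical stability.
    m = max(marginal_scores)
    mean_exp = sum(math.exp(s - m) for s in marginal_scores) / len(marginal_scores)
    return joint_term - (m + math.log(mean_exp))


def toy_critic(action, next_state, scale=10.0):
    """Hypothetical critic: high score when the next state matches the action.
    MINE would learn this function with a neural network instead."""
    return scale if action == next_state else 0.0


# Toy environment (an assumption): the next state equals the action, and
# both actions are equally likely, so the true mutual information is log(2).
joint_pairs = [(0, 0), (1, 1)]
# Product of marginals: every action paired with every next state.
marginal_pairs = [(a, s) for a in (0, 1) for s in (0, 1)]

bound = dv_lower_bound(
    [toy_critic(a, s) for a, s in joint_pairs],
    [toy_critic(a, s) for a, s in marginal_pairs],
)
print(bound, math.log(2))
```

With a sharp enough critic the bound approaches log(2) from below, the largest value one binary action can achieve in this toy setting. An empowerment-style intrinsic reward would be larger in states where such a bound is high, that is, where actions have a distinguishable effect on what happens next.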

research
01/16/2020

MIME: Mutual Information Minimisation Exploration

We show that reinforcement learning agents that learn by surprise (surpr...
research
02/05/2020

Mutual Information-based State-Control for Intrinsically Motivated Reinforcement Learning

In reinforcement learning, an agent learns to reach a set of goals by me...
research
12/04/2019

Learning Efficient Representation for Intrinsic Motivation

Mutual Information between agent Actions and environment States (MIAS) q...
research
10/09/2022

Decomposed Mutual Information Optimization for Generalized Context in Meta-Reinforcement Learning

Adapting to the changes in transition dynamics is essential in robotic a...
research
10/02/2018

EMI: Exploration with Mutual Information Maximizing State and Action Embeddings

Policy optimization struggles when the reward feedback signal is very sp...
research
06/07/2021

Causal Influence Detection for Improving Efficiency in Reinforcement Learning

Many reinforcement learning (RL) environments consist of independent ent...
research
09/29/2015

Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning

The mutual information is a core statistical quantity that has applicati...
