Count-Based Exploration in Feature Space for Reinforcement Learning

The exploration-exploitation dilemma is one of the oldest and most important problems in reinforcement learning (RL). Deliberate, effective exploration is necessary for RL agents to succeed in most environments, yet until recently even sophisticated RL algorithms employed simple, undirected exploration strategies in large-scale tasks. We introduce a new count-based optimistic exploration algorithm for RL that is feasible in high-dimensional MDPs. The success of RL algorithms in these domains depends crucially on generalization from limited training experience. Function approximation techniques enable RL agents to generalize in order to estimate the value of unvisited states, but at present few methods generalize the agent's uncertainty about unvisited states. We present a new method for computing a generalized state visit-count, which allows the agent to estimate the uncertainty associated with any state. In contrast to existing exploration techniques, our ϕ-pseudocount achieves generalization by exploiting the same feature representation of the state space that is used for value function approximation. States whose features have been observed less frequently are deemed more uncertain. The resulting ϕ-Exploration-Bonus algorithm rewards the agent for exploring in feature space rather than in the original state space. The method is simpler and less computationally expensive than some previous proposals, and achieves near state-of-the-art results on high-dimensional RL benchmarks. In particular, we report world-class results on several notoriously difficult Atari 2600 video games, including Montezuma's Revenge.
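To make the idea concrete, below is a minimal Python sketch of a generalized visit-count over a binary feature vector ϕ(s): each feature keeps an independent Bernoulli density, the joint density is their product, and a pseudo-count is recovered via the recoding-probability construction of Bellemare et al. (2016), which this line of work builds on. The class name PhiPseudoCount, the Laplace smoothing constants, the default beta, and the 0.01 regularizer in the bonus are illustrative assumptions, not the paper's exact implementation.

import numpy as np

class PhiPseudoCount:
    # Sketch of a generalized visit-count over a binary feature map phi(s).
    # Each feature keeps an independent Bernoulli density; the joint density
    # is their product, and the pseudo-count follows the recoding-probability
    # construction of Bellemare et al. (2016). All constants are illustrative.

    def __init__(self, num_features, beta=0.05):
        self.n = np.zeros(num_features)  # per-feature activation counts
        self.t = 0                       # total number of observations
        self.beta = beta                 # bonus scale (hyperparameter)

    def _density(self, phi, n, t):
        # Factored Bernoulli density with Laplace smoothing.
        p_on = (n + 1.0) / (t + 2.0)
        factors = np.where(phi > 0, p_on, 1.0 - p_on)
        return float(np.prod(factors))

    def bonus(self, phi):
        # Observe phi and return an optimism bonus beta / sqrt(pseudo-count).
        rho = self._density(phi, self.n, self.t)                       # before update
        rho_next = self._density(phi, self.n + (phi > 0), self.t + 1)  # after update
        self.n += (phi > 0)
        self.t += 1
        if rho_next <= rho:  # numerical guard against underflow
            return 0.0
        pseudo = rho * (1.0 - rho_next) / (rho_next - rho)
        return self.beta / np.sqrt(pseudo + 0.01)  # 0.01 avoids division by zero

At each step the agent would add bonus(phi) for the current state's feature vector to the environment reward, so states whose features have rarely been active receive a larger optimism bonus. With many features, computing the product densities in log space would be more numerically robust.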
