TOMA: Topological Map Abstraction for Reinforcement Learning

by Zhao-Heng Yin, et al.

Animals are able to discover a topological map (graph) of their surrounding environment and use it for navigation. Inspired by this biological phenomenon, researchers have recently proposed generating topological map (graph) representations of Markov decision processes (MDPs) and using such graphs for planning in reinforcement learning (RL). However, existing graph generation methods have two main drawbacks. First, they do not learn an abstraction for the graph, which results in high memory cost. Second, they work only in certain specific settings, which limits their applicability. In this paper, we propose a new graph generation method called TOpological Map Abstraction (TOMA). TOMA generates an abstract graph representation of an MDP that requires much less memory than the graphs produced by existing methods. Furthermore, the graphs generated by TOMA can serve as a basic multi-purpose tool for different RL applications. As an example application, we propose planning to explore, in which TOMA is used to accelerate exploration by guiding the agent towards unexplored states. We also propose a novel experience replay module, called vertex memory, to improve exploration performance. Experimental results show that TOMA robustly generates abstract graph representations on several 2D world environments with different types of observation. Guided by such a graph representation, the agent can escape local minima during exploration.
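
To make the idea of planning on a graph towards unexplored states concrete, here is a minimal illustrative sketch. It is not the TOMA algorithm: the abstract does not specify how vertices are learned or how goals are chosen, so the adjacency dictionary, the per-vertex visit counts, and the names `shortest_path` and `plan_to_explore` are all assumptions made for this example. The sketch simply picks the least-visited vertex reachable from the agent's current vertex and returns a breadth-first shortest path to it.

```python
from collections import deque


def shortest_path(adjacency, start, goal):
    """Breadth-first search on an unweighted graph.

    Returns the list of vertices from start to goal, or None if unreachable.
    """
    parents = {start: None}
    queue = deque([start])
    while queue:
        vertex = queue.popleft()
        if vertex == goal:
            # Walk the parent pointers back to the start, then reverse.
            path = []
            while vertex is not None:
                path.append(vertex)
                vertex = parents[vertex]
            return path[::-1]
        for neighbor in adjacency.get(vertex, ()):
            if neighbor not in parents:
                parents[neighbor] = vertex
                queue.append(neighbor)
    return None


def plan_to_explore(adjacency, visit_counts, current):
    """Hypothetical goal selection: head for the least-visited reachable vertex.

    Assumption for this sketch only: visit counts stand in for "unexplored".
    """
    best_path, best_count = None, None
    for vertex in sorted(adjacency):  # sorted for deterministic tie-breaking
        path = shortest_path(adjacency, current, vertex)
        if path is None:
            continue  # vertex is not reachable on the current graph
        count = visit_counts.get(vertex, 0)
        if best_count is None or count < best_count:
            best_count, best_path = count, path
    return best_path


# Toy abstract graph: a chain A-B-C-D plus an isolated vertex E.
adjacency = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"], "E": []}
visit_counts = {"A": 10, "B": 7, "C": 3, "D": 1, "E": 0}
print(plan_to_explore(adjacency, visit_counts, "A"))
```

In this toy graph the agent at `A` is routed along the chain to `D`, the least-visited vertex it can reach; `E` has a lower count but is disconnected, so it is skipped.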




