Learning Transferable Graph Exploration

10/28/2019
by Hanjun Dai, et al.

This paper considers the problem of efficient exploration of unseen environments, a key challenge in AI. We propose a "learning to explore" framework in which we learn a policy from a distribution of environments. At test time, presented with an unseen environment from the same distribution, the policy aims to generalize the exploration strategy to visit the maximum number of unique states in a limited number of steps. We focus in particular on environments with graph-structured state spaces, which are encountered in many important real-world applications such as software testing and map building. We formulate this task as a reinforcement learning problem in which the "exploration" agent is rewarded for transitioning to previously unseen environment states, and we employ a graph-structured memory to encode the agent's past trajectory. Experimental results demonstrate that our approach is extremely effective for exploration of spatial maps, and that when applied to the challenging problems of coverage-guided software testing of domain-specific programs and real-world mobile applications, it outperforms methods that have been hand-engineered by human experts.
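The reward structure described above can be illustrated with a minimal sketch. It assumes the environment is given as a plain adjacency dictionary and that the agent receives a reward of 1 for each first visit to a state. The names `GraphExplorationEnv`, `run_episode`, and `greedy_unvisited_policy` are hypothetical and do not come from the paper, and the hand-written greedy rule only stands in for the learned policy and graph-structured memory, which are not reproduced here.

```python
class GraphExplorationEnv:
    """Toy graph environment (hypothetical): reward 1.0 for each newly visited state."""

    def __init__(self, adjacency, start, budget):
        self.adjacency = adjacency  # dict: state -> list of neighbouring states
        self.start = start
        self.budget = budget        # maximum number of steps per episode
        self.reset()

    def reset(self):
        self.current = self.start
        self.visited = {self.start}  # the start state counts as already seen
        self.steps_left = self.budget
        return self.current

    def step(self, next_state):
        if next_state not in self.adjacency[self.current]:
            raise ValueError(f"{next_state!r} is not adjacent to {self.current!r}")
        # Reward only transitions into previously unseen states.
        reward = 0.0 if next_state in self.visited else 1.0
        self.visited.add(next_state)
        self.current = next_state
        self.steps_left -= 1
        return next_state, reward, self.steps_left <= 0


def greedy_unvisited_policy(state, neighbors, visited):
    """Placeholder policy (assumption): prefer any unvisited neighbour."""
    for candidate in neighbors:
        if candidate not in visited:
            return candidate
    return neighbors[0]  # everything nearby is seen; fall back to the first neighbour


def run_episode(env, policy):
    """Roll out one episode; return (total reward, number of unique states visited)."""
    state = env.reset()
    total = 0.0
    done = env.steps_left <= 0
    while not done:
        action = policy(state, env.adjacency[state], env.visited)
        state, reward, done = env.step(action)
        total += reward
    return total, len(env.visited)


if __name__ == "__main__":
    # A four-state path graph: 0 - 1 - 2 - 3
    path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
    env = GraphExplorationEnv(path, start=0, budget=3)
    print(run_episode(env, greedy_unvisited_policy))
```

Under this reward, the episode return equals the number of unique states visited minus one (the start state), so maximizing return within the step budget is the same as maximizing coverage.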

research
05/11/2021

Zero-Shot Reinforcement Learning on Graphs for Autonomous Exploration Under Uncertainty

This paper studies the problem of autonomous exploration under localizat...
research
10/30/2019

Continuous Control with Contexts, Provably

A fundamental challenge in artificial intelligence is to build an agent ...
research
03/10/2020

Exploring Unknown States with Action Balance

Exploration is a key problem in reinforcement learning. Recently bonus-b...
research
10/13/2021

Safe Driving via Expert Guided Policy Optimization

When learning common skills like driving, beginners usually have domain ...
research
07/26/2017

Guiding Reinforcement Learning Exploration Using Natural Language

In this work we present a technique to use natural language to help rein...
research
04/30/2023

Learning Achievement Structure for Structured Exploration in Domains with Sparse Reward

We propose Structured Exploration with Achievements (SEA), a multi-stage...
research
03/31/2020

Optimal Bidding Strategy without Exploration in Real-time Bidding

Maximizing utility with a budget constraint is the primary goal for adve...
