A* Search Without Expansions: Learning Heuristic Functions with Deep Q-Networks

02/08/2021
by Forest Agostinelli, et al.

A* search is an informed search algorithm that uses a heuristic function to guide the order in which nodes are expanded. Since the computation required to expand a node and compute the heuristic values for all of its generated children grows linearly with the size of the action space, A* search can become impractical for problems with large action spaces. This computational burden becomes even more apparent when heuristic functions are learned by general, but computationally expensive, deep neural networks. To address this problem, we introduce DeepCubeAQ, a deep reinforcement learning and search algorithm that builds on the DeepCubeA algorithm and deep Q-networks. DeepCubeAQ learns a heuristic function that, with a single forward pass through a deep neural network, computes the sum of the transition cost and the heuristic value of all of the children of a node without explicitly generating any of the children, eliminating the need for node expansions. DeepCubeAQ then uses a novel variant of A* search, called AQ* search, that uses the deep Q-network to guide search. We use DeepCubeAQ to solve the Rubik's cube when formulated with a large action space that includes 1872 meta-actions and show that this 157-fold increase in the size of the action space incurs less than a 4-fold increase in computation time when performing AQ* search and that AQ* search is orders of magnitude faster than A* search.
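The idea of searching over (state, action) pairs instead of expanded nodes can be illustrated with a minimal sketch. This is not the authors' implementation: the toy domain (integers with goal 0 and four unit-cost moves), the names `ACTIONS`, `GOAL`, `q_values`, and `aq_star`, and the hand-written `q_values` function standing in for the deep Q-network are all assumptions made for illustration. In DeepCubeAQ the per-action values come from a single forward pass of a learned network; here they are computed by a simple formula so that the example is runnable.

```python
import heapq
from itertools import count

# Hypothetical toy domain: states are integers, the goal is 0,
# and every action adds a fixed offset at unit cost.
ACTIONS = (+1, -1, +3, -3)
GOAL = 0


def q_values(state):
    """Stand-in for one forward pass of a deep Q-network (assumption).

    Returns, for every action, the transition cost (1) plus a heuristic
    estimate of the cost-to-go of the resulting child, ceil(|child| / 3).
    A learned network would predict these values directly.
    """
    return [1 + (abs(state + a) + 2) // 3 for a in ACTIONS]


def aq_star(start):
    """Best-first search over (state, action) pairs.

    A child state is generated only when its (parent, action) pair is
    popped from the queue, so no node is ever fully expanded.
    Returns (list of actions, number of child states generated).
    """
    tie = count()  # unique tie-breaker so heap entries never compare paths
    # Entry layout: (priority, tie, parent cost, parent state, action index, path)
    open_heap = [(0, next(tie), 0, start, None, [])]
    best_g = {}
    generated = 0
    while open_heap:
        _, _, g, state, a_idx, path = heapq.heappop(open_heap)
        if a_idx is not None:
            # Apply only the single chosen action to obtain the child.
            state = state + ACTIONS[a_idx]
            g += 1
            path = path + [ACTIONS[a_idx]]
            generated += 1
        if state == GOAL:
            return path, generated
        if best_g.get(state, float("inf")) <= g:
            continue  # already reached this state at equal or lower cost
        best_g[state] = g
        # One call scores every action; each pair is queued unexpanded.
        for i, q in enumerate(q_values(state)):
            heapq.heappush(open_heap, (g + q, next(tie), g, state, i, path))
    return None, generated


if __name__ == "__main__":
    path, generated = aq_star(7)
    print(path, generated)
```

In this sketch the priority of a queued pair is the parent's path cost plus the Q-value for that action, which plays the role that the child's f-value plays in A* search. The difference is that the child itself is only materialized when its pair reaches the front of the queue.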

research
06/28/2021

A Meta-Heuristic Search Algorithm based on Infrasonic Mating Displays in Peafowls

Meta-heuristic techniques are important as they are used to find solutio...
research
03/21/2021

Policy-Guided Heuristic Search with Guarantees

The use of a policy and a heuristic function for guiding search can be q...
research
09/19/2022

MAN: Multi-Action Networks Learning

Learning control policies with large action spaces is a challenging prob...
research
09/12/2022

A Differentiable Loss Function for Learning Heuristics in A*

Optimization of heuristic functions for the A* algorithm, realized by de...
research
11/27/2018

Single-Agent Policy Tree Search With Guarantees

We introduce two novel tree search algorithms that use a policy to guide...
research
10/31/2011

Probabilistic Planning via Heuristic Forward Search and Weighted Model Counting

We present a new algorithm for probabilistic planning with no observabil...
research
06/08/2022

Planning with Dynamically Estimated Action Costs

Information about action costs is critical for real-world AI planning ap...
