A Novel Automated Curriculum Strategy to Solve Hard Sokoban Planning Instances

10/03/2021
by   Dieqiao Feng, et al.

In recent years, we have witnessed tremendous progress in deep reinforcement learning (RL) for tasks such as Go, Chess, video games, and robot control. Nevertheless, other combinatorial domains, such as AI planning, still pose considerable challenges for RL approaches. The key difficulty in those domains is that a positive reward signal becomes exponentially rare as the minimal solution length increases, so an RL approach loses its training signal. There has been promising recent progress using a curriculum-driven learning approach that is designed to solve a single hard instance. We present a novel automated curriculum approach that dynamically selects from a pool of unlabeled training instances of varying task complexity, guided by our difficulty quantum momentum strategy. We show how the smoothness of the task hardness impacts the final learning results. In particular, as the size of the instance pool increases, the “hardness gap” decreases, which facilitates a smoother automated curriculum-based learning process. Our automated curriculum approach dramatically improves upon the previous approaches. We show our results on Sokoban, which is a traditional PSPACE-complete planning problem and presents a great challenge even for specialized solvers. Our RL agent can solve hard instances that are far out of reach for any previous state-of-the-art Sokoban solver. In particular, our approach can uncover plans that require hundreds of steps, while the best previous search methods would take many years of computing time to solve such instances. In addition, we show that we can further boost the RL performance with an intricate coupling of our automated curriculum approach with a curiosity-driven search strategy and a graph neural net representation.

research
06/04/2020

Solving Hard AI Planning Instances Using Curriculum-Driven Deep Reinforcement Learning

Despite significant progress in general AI planning, certain domains rem...
research
09/20/2022

Graph Value Iteration

In recent years, deep Reinforcement Learning (RL) has been successful in...
research
04/07/2022

Learning to Solve Travelling Salesman Problem with Hardness-adaptive Curriculum

Various neural network models have been proposed to tackle combinatorial...
research
05/17/2023

Curriculum Learning in Job Shop Scheduling using Reinforcement Learning

Solving job shop scheduling problems (JSSPs) with a fixed strategy, such...
research
10/10/2021

Hard instance learning for quantum adiabatic prime factorization

Prime factorization is a difficult problem with classical computing, who...
research
06/09/2022

Learning to generalize Dispatching rules on the Job Shop Scheduling

This paper introduces a Reinforcement Learning approach to better genera...
research
06/28/2022

Left Heavy Tails and the Effectiveness of the Policy and Value Networks in DNN-based best-first search for Sokoban Planning

Despite the success of practical solvers in various NP-complete domains ...
