Optimal Solving of Constrained Path-Planning Problems with Graph Convolutional Networks and Optimized Tree Search

by   Kevin Osanlou, et al.
Université Paris-Dauphine

Learning-based methods are gaining prominence for planning purposes. However, there are very few approaches for learning-assisted constrained path-planning on graphs, even though there are multiple downstream practical applications. This is the case for constrained path-planning for Autonomous Unmanned Ground Vehicles (AUGV), typically deployed in disaster relief or search and rescue applications. In off-road environments, the AUGV must dynamically optimize a source-destination path under various operational constraints, several of which are difficult to predict in advance and need to be addressed online. We propose a hybrid planner that combines machine learning models and an optimal solver. More specifically, a graph convolutional network (GCN) is used to assist a branch and bound (B&B) algorithm in handling the constraints. We conduct experiments on realistic scenarios and show that GCN support enables substantial speedup and smoother scaling to harder problems.




I Introduction

Automated path-planning is an area of interest in AI with a wide range of applications. The ability to efficiently plan an optimal path in a geometric graph that meets a set of requirements is becoming increasingly critical in a world where autonomy is starting to prevail. The requirements usually consist of a set of constraints imposed on the solution path, making it more difficult to compute. In the case of autonomous unmanned ground vehicles (AUGV), terrain structure is represented through a geometric graph, and maneuvers must consider terrain knowledge. Disaster relief, logistics, or area surveillance are a few among many applications for which online constrained path-planning algorithms enable autonomous mobility using perception and control functionalities. The ability of the AUGV to efficiently come up with an optimal path for a given mission has a direct impact on operational efficiency, underpinning the importance of an efficient path-planner.

For such problems, classical robotic systems integrate A* algorithms [10] as a best-first search approach in the space of available paths. For a complete overview of static algorithms (such as A*), replanning algorithms (D*), anytime algorithms (e.g. ARA*), and anytime replanning algorithms (AD*), we refer the reader to Ferguson et al. [5]. Tree search algorithms stemming from A* require the specification of a planning domain where constraints are modeled, and can be more or less efficient depending on the affinity of the search heuristic with the planning context. In this work, we focus on the branch and bound (B&B) tree search algorithm instead [23]. More specifically, our study focuses on the performance gain when it is coupled with machine learning techniques.

Fig. 1: Proposed framework for solving a path-planning problem with constraints in a graph with a GCN-assisted solver. The GCN takes as input the graph and the problem, and provides relevant information to speed up a tree search, after which an optimal solution path can be built.

Convolutional Neural Networks (CNNs) have been proven to be very efficient for computer vision applications, such as image recognition [16]. They are a multilayer perceptron variant designed for minimal preprocessing and capable of detecting complex patterns in images. In this paper, we are dealing with graphs which represent either maneuvers or off-road navigation. Instead of CNNs, the paper focuses on graph convolutional neural networks (GCNs), a recent architecture for learning complex patterns in graph data [2, 14, 19]. One of the main reasons GCNs are preferred over CNNs for graph processing is that nodes and edges are attributed relevant features. CNNs are not able to learn features of an equivalent quality from an image of the graph. Moreover, CNNs are not invariant to node permutations, an issue GCNs do not share.

In this paper, we study a GCN-based approach for constrained path-planning and stick to exact resolution methods for path-related problems in a specific graph. We run experiments on realistic AUGV scenarios for which we consider mandatory pass-by nodes as the type of constraint. This makes path-planning similar to the traveling salesman problem (TSP). The TSP is an NP-hard problem for which there exist efficient approximate solvers [22]; however, it remains a challenge for exact approaches ([3], [27], [17]). We make the following contributions. First, we define a GCN architecture suited for the considered problem. Second, we propose a self-supervised training strategy for the GCN. Third, we provide a framework which combines the GCN with the depth-first branch and bound (B&B) algorithm. Finally, we conduct experiments on realistic problems for which results exhibit accelerated solving performance.

II Related Work

In the past few years there has been a growing interest in transferring the intuitions and practices from neural networks on structured inputs towards graphs [7, 11, 4, 14]. [11] bridges spectral graph theory with multi-layer neural networks by learning smooth spectral multipliers of the graph Laplacian. Then [4] and [14] approximate these smooth filters in the spectral domain using polynomials of the graph Laplacian. Free parameters of the polynomials are learned by a neural network, avoiding the costly computation of the eigenvectors of the graph Laplacian. We refer the reader to [1] for a comprehensive review on learning on graphs.

Applications of these types of networks are starting to emerge. Recent works suggest GCNs are capable of making key decisions to solve either path-planning problems in graphs [20], or even STRIPS planning tasks [21]. Regarding path-related optimization problems, Li et al. [20] tackle the maximum independent set (MIS) problem with a solver that combines a GCN-guided tree search and local search optimization. The input to their solver is a graph, and the output is a highly optimized solution. Kool et al. [15] propose a reinforcement learning framework to solve the travelling salesman problem (TSP) and variants of the vehicle routing problem (VRP) approximately. While they prefer an encoder-decoder architecture over GCNs, they achieve better results than previous learning-based approaches. These related works focus on the approximate solving of a given task in big graphs. To this end, a learning model is coupled with tree search algorithms to narrow a wide search space in order to retrieve a high quality solution in a short time. In contrast, our work is intended for optimal task solving in smaller graphs. As optimal solving requires visiting most of the search space to ensure proof of optimality, previous approaches are not suitable and we proceed differently.

III Context and Problem Formalization

We consider a weighted connected graph $G = (V, E, A)$, where $V$ is the set of vertices, $E$ the set of edges and $A$ the adjacency matrix of the graph. While in this work we choose to deal with realistic AUGV scenario graphs which are undirected (§VII), our approach is equally applicable to directed graphs. In a typical crisis scenario, the AUGV has to proceed from an initial point to several areas for information gathering, before making its way to a final destination to share its assessment of the ongoing situation. Another frequent scenario consists in delivering food and first-aid equipment to areas likely to be undergoing a shortage. Such problems can be formalized mathematically. Let $I = (v_s, v_d, C)$ be a path-planning problem instance, defined as follows:

  • $v_s$ is the index of the start node in $G$,

  • $v_d$ is the index of the destination node in $G$,

  • $C$ is a set of constraints that need to be satisfied.

Solving $I$ optimally means finding a path $\pi$, i.e. a sequence of nodes (or edges), which begins from $v_s$, ends in $v_d$, satisfies all constraints in $C$ and minimizes the total weight of the edges included in $\pi$. We can consider various types of constraints in $C$. In this work, we experiment with constraints related to mandatory nodes, which require the solution path to include a given set of nodes $M \subseteq V$, so that an instance can be written $I = (v_s, v_d, M)$. In the next sections, we will refer to a path-planning problem instance simply as an instance. A valid solution path is required to include every node in $M$ at least once. Since the order of visit is not imposed for $M$, this problem can be assimilated to a TSP variant.

IV Path-building with Graph Convolutional Networks

In this section, we present our approach for training a neural network on a particular graph. We aim to leverage the learning capacity of a network to approximate the behavior of a model-based planner on the graph.

IV-A Neural Networks

Neural Networks (NNs) enable multiple levels of abstraction of data by using models with trainable parameters coupled with non-linear transformations of the input data.

In spite of the complex structure of a NN, the main mechanism is straightforward. A feedforward neural network, or multi-layer perceptron (MLP), with $L$ layers describes a function $f$ that maps an input vector $x$ to an output vector $y$. Vector $x$ is the input data that we need to analyze (e.g. an image, a signal, a graph, etc.), while $y$ is the expected decision from the NN (e.g. a class index, a heatmap, etc.). The function $f$ performs $L$ successive operations over the input $x$:

$$h^{(l)} = f^{(l)}\left(h^{(l-1)}; W^{(l)}, b^{(l)}\right), \quad l = 1, \dots, L,$$

where $h^{(l)}$ is the hidden state of the network and $f^{(l)}$ is the mapping function performed at layer $l$ and parameterized by trainable parameters $W^{(l)}$ and bias $b^{(l)}$, and piece-wise activation function $\sigma$; $h^{(0)} = x$.

CNNs [6, 18] are a popular architecture for 2D data. They generalize MLPs by sliding groups of parameters across an input vector similarly to filters in image processing, leveraging fewer parameters and parallel computation. Hidden states in CNNs preserve the number of dimensions of the input, i.e. 2D when images are used as input, and are called feature maps.

IV-B Graph Convolutional Networks

GCNs are generalizations of CNNs to non-Euclidean graphs [1]. GCNs are in fact neural networks based on local operators on a graph which are derived from spectral graph theory. The filter parameters are typically shared over all locations in the graph, thus the name convolutional.

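We consider here the approach of Kipf and Welling [14]. The GCNs have the following layer propagation rule:

$$H^{(l+1)} = \sigma\left(\tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2} H^{(l)} W^{(l)}\right),$$

where $H^{(l)}$ is the matrix of node features at layer $l$, $W^{(l)}$ is the trainable weight matrix of that layer, and $\tilde{A} = A + I_N$ is the adjacency matrix of the graph with added self-connections such that when multiplying with $\tilde{A}$ we aggregate feature vectors from both a node and its neighbors. Matrix $I_N$ is the identity matrix; $\tilde{D}$ is the diagonal node degree matrix of $\tilde{A}$; $\sigma$ is the activation function, which we set to ReLU. Matrix $\tilde{D}$ is employed for normalization of $\tilde{A}$ in order to avoid a change of scales in the feature vectors when multiplying with $\tilde{A}$. In [14], the authors argue that using a symmetric normalization, i.e. $\tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2}$, ensures better dynamics compared to simple averaging of neighboring nodes in the one-sided normalization $\tilde{D}^{-1} \tilde{A}$.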
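The propagation rule can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the function names `matmul` and `gcn_layer` are hypothetical, matrices are plain nested lists, and ReLU is assumed as the activation.

```python
import math


def matmul(a, b):
    """Plain nested-list matrix product (rows of a times columns of b)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def gcn_layer(adj, feats, weights):
    """One propagation step: ReLU(D~^-1/2 (A + I) D~^-1/2 H W).

    adj: N x N adjacency matrix, feats: N x F node features,
    weights: F x F' trainable matrix (fixed here for illustration).
    """
    n = len(adj)
    # Add self-connections so a node also aggregates its own features.
    a_tilde = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    # Inverse square root of each node degree in the self-connected graph.
    inv_sqrt_deg = [1.0 / math.sqrt(sum(row)) for row in a_tilde]
    # Symmetric normalization of the adjacency matrix.
    a_hat = [[inv_sqrt_deg[i] * a_tilde[i][j] * inv_sqrt_deg[j]
              for j in range(n)] for i in range(n)]
    # Aggregate neighbor features, apply the linear map, then ReLU.
    out = matmul(matmul(a_hat, feats), weights)
    return [[max(0.0, v) for v in row] for row in out]


# Two connected nodes with scalar features 1 and 3 and an identity weight:
# each node ends up with the normalized average of both features.
h_next = gcn_layer([[0, 1], [1, 0]], [[1.0], [3.0]], [[1.0]])
```

With the symmetric normalization, both nodes of this toy graph receive the value 2, which shows how the scale of the features is preserved when neighbors are aggregated.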

IV-C Problem Instance Encoding

The input of our model is a vector containing information about the problem instance, including the graph representation. For every instance $I$ of a given graph $G$, we associate a vector $x$ made up of triplet features from each node in $G$, making up for a total of $3|V|$ features. The three features for a node $j$ are:

  • start node feature $x_j^{s} = 1$ if node $j$ is the start node in instance $I$, otherwise $x_j^{s} = 0$,

  • end node feature $x_j^{d} = 1$ if node $j$ is the end node in instance $I$, otherwise $x_j^{d} = 0$,

  • mandatory node feature $x_j^{m} = 1$ if node $j$ is a mandatory node in instance $I$, otherwise $x_j^{m} = 0$.

We obtain the input $x$ by stacking node features:

$$x = \left[x_1^{s}, x_1^{d}, x_1^{m}, \dots, x_{|V|}^{s}, x_{|V|}^{d}, x_{|V|}^{m}\right].$$
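The encoding above can be sketched in a few lines. The function name `encode_instance` is hypothetical, and the binary 1/0 indicator values are an assumption made for the illustration; nodes are assumed to be indexed from 0.

```python
def encode_instance(num_nodes, start, dest, mandatory):
    """Return one [start, end, mandatory] indicator triplet per node."""
    mandatory = set(mandatory)
    return [
        [1.0 if j == start else 0.0,      # start node feature
         1.0 if j == dest else 0.0,       # end node feature
         1.0 if j in mandatory else 0.0]  # mandatory node feature
        for j in range(num_nodes)
    ]


# A 4-node graph, start node 0, destination 3, node 1 mandatory.
features = encode_instance(4, 0, 3, [1])
```

Keeping the features as one row per node makes them directly usable as the node feature matrix of a graph convolution, while flattening the rows gives the stacked vector of 3|V| values.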


IV-D Neural network architecture and training

We define a neural network that consists of a sequence of multiple graph convolutions followed by a fully connected layer. This GCN takes as input any instance $I$ over the graph $G$, and outputs a probability vector $\hat{y} \in [0, 1]^{|V|}$. Then, $\arg\max_j \hat{y}_j$ corresponds to the next mandatory node to visit from the start node $v_s$ in an optimal path that solves $I$. The hidden states of the graph convolution layers consist of higher-dimensional features for each node in the graph $G$. After the final convolutional layer, the feature matrix is flattened into a vector by concatenating its rows and linked to a fully connected layer that maps it to a vector $z \in \mathbb{R}^{|V|}$. We use the softmax function to convert $z$ into probabilities.

NNs trained in a supervised manner use labeled training data, i.e. a set of input-output pairs $(x_i, y_i)$ sampled from a large training set. Here, $x_i$ is an instance and $y_i$ is the index of the next mandatory node for $x_i$ in an optimal path solving $x_i$. We train the GCN on instances that have already been optimally solved by an exact planner, which serves here as a teacher. The network learns to approximate the solutions computed by the planner. To this end, we train the network using the negative log-likelihood loss $\mathcal{L}(\hat{y}, y) = -\log \hat{y}_{y}$, used for multi-class classification, and stochastic gradient descent (SGD).

IV-E Mandatory Node Ordering

For a given input instance, the GCN computes one mandatory node prediction at a time. In order to make it compatible with instances with varying numbers of mandatory nodes, we make a few adjustments. Given an instance $I = (v_s, v_d, M)$ with multiple mandatory nodes $M = \{m_1, \dots, m_k\}$, we perform multiple recursive GCN calls to get the next mandatory node predictions $\hat{m}_1, \dots, \hat{m}_k$. More specifically, after a prediction $\hat{m}_i$ is computed, we generate a sub-instance where the start node becomes the current predicted mandatory node and where the list of mandatory nodes contains the remaining nodes except $\hat{m}_i$, i.e. $I_{i+1} = (\hat{m}_i, v_d, M_{i+1})$, where $M_{i+1} = M_i \setminus \{\hat{m}_i\}$ and $M_1 = M$. We use the recursive calls during both training and testing. An interesting side-effect of this strategy is that it ensures an implicit balancing of the training samples by difficulty, as we generate sub-instances ranging from challenging (large $|M|$) to trivial (small $|M|$).
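The recursive calls can be sketched as a simple loop. Here `predict_next` is a hypothetical stand-in for the trained GCN: it is assumed to return one score per node index for the instance (current start, destination, remaining mandatory nodes). Restricting the choice to the remaining mandatory nodes is an added safeguard for the sketch, not something stated in the text.

```python
def order_mandatory_nodes(start, dest, mandatory, predict_next):
    """Build a visiting order by repeatedly asking for the next mandatory node."""
    remaining = set(mandatory)
    order = []
    current = start
    while remaining:
        # Scores for the sub-instance (current, dest, remaining).
        scores = predict_next(current, dest, frozenset(remaining))
        # Keep only nodes that are still mandatory (assumed masking step).
        nxt = max(remaining, key=lambda v: scores[v])
        order.append(nxt)
        remaining.remove(nxt)
        current = nxt  # the predicted node becomes the new start node
    return order


def nearest_index_stub(current, dest, remaining, num_nodes=10):
    """Toy predictor: prefers the node whose index is closest to `current`."""
    return [-abs(v - current) for v in range(num_nodes)]


suggested = order_mandatory_nodes(0, 9, [5, 2, 7], nearest_index_stub)
```

Each iteration corresponds to one sub-instance, so an instance with k mandatory nodes triggers exactly k predictor calls.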

V Self-supervised learning

We present a self-supervised learning strategy aimed at training the GCN for a particular graph. First, we define a planning domain to solve instances. Second, we introduce a modified version of the A* algorithm for generating optimal data on which the GCN is trained. In this section, we refer to an instance $I$ as a planning state $s$. An end state is a termination instance, i.e. an instance for which the destination node has been reached and all mandatory nodes have been visited. We denote the termination instances as $s_e$. There are exactly as many termination instances as there are nodes in $G$. We respectively define the successors and predecessors of a state $s$ as $succ(s)$ and $pred(s)$ in Table I.

  Rule              Successor state           Predecessor state
  Applies when      s is not an end state     s is not an end state
  Transition cost   weight of the edge in G linking the start nodes of both states (both rules)

TABLE I: Transition rules to successors and predecessors.

The transition cost $c(s, s')$ from a state $s$ to a neighboring state $s'$ is the cost of the edge in the graph $G$ linking the start nodes of both states. With these rules, the destination node always remains the same. Therefore, we run the backwards version of A* from every termination instance $s_e$ as initial state (using $pred$ as rule of succession). For each state $s$ visited by A*, a path $\pi(s)$ is built from $s$ to $s_e$ which the algorithm considers the shortest. We define $g(s)$ as the cost of $\pi(s)$, $a(s)$ as the next state visited after $s$ in $\pi(s)$, and $m(s)$ as the first mandatory node of $s$ that is visited in $\pi(s)$.

Furthermore, we perform the following changes to A*. First, when a shorter path is found to a state $s'$ while developing a state $s$, i.e. $g(s') > g(s) + c(s, s')$, the values of $a(s')$ and $m(s')$ are also updated along with $g(s')$ to take into account the shorter path. Secondly, we set the heuristic function to $h = 0$. Since A* is run backwards from a termination state $s_e$, we are not aiming for the algorithm to reach a defined state in particular, but seek to reach as many states as possible. This ensures that when a state $s$ is taken from the OPEN priority list of states left to develop, an optimal path from $s$ to $s_e$ has already been found. Choosing $m(s)$ as the next mandatory node to visit thus enables optimal solving. Consequently we can add the pair $(s, m(s))$ to the training set. We provide the pseudo-code in Algorithm 1.

function ComputePaths()
     while OPEN is not empty do
          remove state s from the front of OPEN;
          if len(mandatory(s)) > 0 then
               insert the pair (s, m(s)) to the training set
          for all s’ in pred(s) do
               if g(s’) > g(s) + c(s, s’) then
                    g(s’) = g(s) + c(s, s’)
                    a(s’) = s
                    if s’ is created from s without adding its start node to the list of mandatory nodes then
                         m(s’) = m(s)
                    else (s’ is created from s by adding its start node to the list of mandatory nodes)
                         m(s’) = start(s)
                    insert s’ into OPEN with value g(s’)
function main()
     for all termination states s_e do
          for all states s do
               g(s) = ∞
          g(s_e) = 0, a(s_e) = null, m(s_e) = null
          OPEN = ∅
          insert s_e into OPEN with value g(s_e)
          ComputePaths();

Notation: len(·) returns the length of a list; mandatory(s) returns the mandatory nodes of instance s; start(s) returns the start node of instance s; pred(s) returns all possible predecessors of instance s.

Algorithm 1 Backwards A* for reverse instance solving
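To make the data-generation idea concrete, here is a minimal runnable sketch for a single destination node. It is an interpretation under stated assumptions, not the authors' code: a state is represented as a pair (start node, set of mandatory nodes still to visit), a state's start node is never part of its own mandatory set, the graph is an undirected dictionary of edge weights, and the hypothetical parameter `max_mandatory` caps the state space, which otherwise grows exponentially.

```python
import heapq
from itertools import count


def backward_training_pairs(adj, dest, max_mandatory):
    """Backward search with h = 0 from the termination state of `dest`.

    adj: dict node -> {neighbor: edge weight}. Returns (g, m, pairs) where
    g is the optimal cost of each state, m its first mandatory node, and
    pairs the list of ((start, dest, mandatory), next mandatory node).
    """
    term = (dest, frozenset())        # at the destination, nothing left to visit
    g = {term: 0.0}
    m = {term: None}
    tie = count()                     # tie-breaker so states are never compared
    open_list = [(0.0, next(tie), term)]
    closed = set()
    pairs = []
    while open_list:
        cost, _, state = heapq.heappop(open_list)
        if state in closed:
            continue                  # stale entry left in the heap
        closed.add(state)
        node, mand = state
        if mand:                      # with h = 0, the cost of a popped state is optimal
            pairs.append(((node, dest, mand), m[state]))
        for prev, w in adj[node].items():
            # Predecessor variants: `node` is not, or is, mandatory in the predecessor.
            for prev_mand, first in ((mand, m[state]), (mand | {node}, node)):
                if prev in prev_mand or len(prev_mand) > max_mandatory:
                    continue
                pred = (prev, prev_mand)
                if cost + w < g.get(pred, float("inf")):
                    g[pred] = cost + w
                    m[pred] = first
                    heapq.heappush(open_list, (cost + w, next(tie), pred))
    return g, m, pairs


# Line graph 0 - 1 - 2 - 3 with unit weights, destination node 3.
line = {0: {1: 1}, 1: {0: 1, 2: 1}, 2: {1: 1, 3: 1}, 3: {2: 1}}
g, m, pairs = backward_training_pairs(line, dest=3, max_mandatory=2)
```

Running the search once per destination node would mirror the outer loop of Algorithm 1; every pair it emits is an optimally labeled training example.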

The data generated by the algorithm is added to the training set and shuffled. The GCN is then trained on this set with supervised learning. Results show that training on the "synthetic" data generated with A* enables the GCN to generalize well on instances that A* did not process. We argue this is because the distribution obtained with A* is related to the path length of resolved instances. In fact, graph patterns already explored in short path solutions are incrementally included into longer ones.

VI Depth-First Branch and Bound Tree Search

It is possible to resolve an instance within a combinatorial search tree, rather than in the path-planning domain defined in §V. To this end, we compute the shortest source-destination paths for every pair of nodes in $G$ using Dijkstra’s algorithm, as well as the associated path cost. Solving an instance $I = (v_s, v_d, M)$ then becomes equivalent to finding the optimal order in which the mandatory nodes in $M$ are visited for the first time. The solution path associated with an order $(m_1, \dots, m_k)$ of the mandatory nodes can be built by concatenating the shortest path from $v_s$ to $m_1$, from $m_1$ to $m_2$, from $m_2$ to $m_3$, … , from $m_{k-1}$ to $m_k$, and from $m_k$ to $v_d$. Its total cost is the sum of the cost of each shortest path used to build it. A particular tree can be searched for identifying an optimal order of the mandatory nodes. For an instance $I$, we define the root of this tree as the $v_s$ node, every leaf node as the $v_d$ node, and every intermediate tree node as a mandatory node in $M$, such that a path from the root of the tree to a leaf defines an order in which to visit the mandatory nodes in $M$. The cost of transitioning from a node $u$ to a child node $v$ in the tree is the cost of the shortest path in $G$ between the pair $(u, v)$. In the following, we refer to this tree as the mandatory search tree. Since we are using here only mandatory node constraints, the combinatorial optimization problem associated with this tree has a structure similar to that of the TSP. However, it differs in the choice of the start and destination nodes, which are fixed.
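The precomputation and the cost of a given visiting order can be sketched as follows. The helper names `dijkstra`, `all_pairs` and `order_cost` are hypothetical, and the graph is assumed to be a dictionary mapping each node to its weighted neighbors.

```python
import heapq


def dijkstra(adj, source):
    """Shortest path cost from `source` to every reachable node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # outdated heap entry
        for v, w in adj[u].items():
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def all_pairs(adj):
    """Shortest path costs for every pair of nodes (one Dijkstra run per node)."""
    return {u: dijkstra(adj, u) for u in adj}


def order_cost(sp, start, dest, order):
    """Cost of the path start -> order[0] -> ... -> order[-1] -> dest."""
    stops = [start, *order, dest]
    return sum(sp[a][b] for a, b in zip(stops, stops[1:]))


graph = {0: {1: 1, 2: 4}, 1: {0: 1, 2: 1, 3: 5},
         2: {0: 4, 1: 1, 3: 1}, 3: {1: 5, 2: 1}}
sp = all_pairs(graph)
cost_12 = order_cost(sp, 0, 3, [1, 2])  # visit node 1, then node 2
cost_21 = order_cost(sp, 0, 3, [2, 1])  # visit node 2, then node 1
```

Once the table `sp` is available, evaluating any order of the mandatory nodes only requires table lookups, which is what makes the mandatory search tree cheap to expand.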

The branch and bound algorithm (B&B) is a popular tree search algorithm that is well known for its computational efficiency ([23]). In this work, we consider a depth-first B&B search algorithm. When developing a node inside the tree, the algorithm checks if each branch is expected to host a better solution than the best solution found so far. Should that not be the case for a given branch, the branch is cut, and the algorithm will not develop nodes further down the branch. This is done by using a lower bound and an upper bound. The lower bound is the sum of the total cost from the root node to the current node and a heuristic function that approximates the remaining cost from the current node to the best achievable solution in branches below. This heuristic function should return a value as close as possible to this remaining cost (to cut as frequently as possible), while staying smaller (for the algorithm to remain optimal).

Next, we define the heuristic function that we use. Let $n$ be a mandatory node in the mandatory search tree, and $R$ the set of remaining mandatory nodes left to the leaf node $v_d$, i.e. the nodes in $M$ that haven’t been included between the root node and $n$. Let $R' = R \cup \{n, v_d\}$. We define two functions, $\min_1$ and $\min_2$, that respectively return for a node $v$ in $R'$ the lowest shortest path cost in $G$ from $v$ to any other node in $R'$, and the second lowest such cost. We build the heuristic function by considering the remaining nodes left. For each node left, we consider the weight of all edges connecting it to other nodes in $R'$ and add the first and second smallest such weights, with the exception of $n$ and $v_d$ for which only the first smallest weight is added, and divide the total by 2:

$$h(n) = \frac{1}{2}\left(\min\nolimits_1(n) + \min\nolimits_1(v_d) + \sum_{r \in R} \left(\min\nolimits_1(r) + \min\nolimits_2(r)\right)\right).$$
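The bound described above can be sketched directly from a table of shortest path costs. The function name `lower_bound` is hypothetical, `sp[u][v]` is assumed to hold the shortest path cost between u and v, and the current node is assumed to differ from the destination.

```python
def lower_bound(sp, current, dest, remaining):
    """Half the sum of the cheapest incident costs over the nodes still in play.

    The two endpoints (current node and destination) contribute their single
    cheapest connection; every remaining mandatory node contributes its two
    cheapest connections, since a path enters and leaves it.
    """
    nodes = set(remaining) | {current, dest}

    def cheapest(v, how_many):
        costs = sorted(sp[v][u] for u in nodes if u != v)
        return sum(costs[:how_many])

    total = cheapest(current, 1) + cheapest(dest, 1)
    total += sum(cheapest(r, 2) for r in remaining)
    return total / 2.0


# Shortest path costs on a unit-weight line graph 0 - 1 - 2 - 3.
sp_line = {a: {b: abs(a - b) for b in range(4)} for a in range(4)}
bound = lower_bound(sp_line, current=0, dest=3, remaining=[1, 2])
```

On this toy instance the bound equals the true remaining cost of 3, and in general it never exceeds it, because any path through the remaining nodes uses one connection at each endpoint and two at each intermediate node.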

Recent progress in learning-assisted tree search has shown that machine learning can be used to narrow the search space in very large domains to allow for efficient solving. Inspiring results have been shown in the game of Go [25, 26], where a very good solution, which is not required to be optimal, is found in record time. Promising parts of the search tree are visited first in accordance with the suggestions of the neural network, and unpromising parts are visited afterwards, only if time allows. On the other hand, in a context where finding an optimal solution is critical, the search cannot be directed in such a way, as proof of optimality is required. Consequently, we keep the GCN out of the tree search procedure. For a given instance, we use the GCN recursively in order to obtain a suggested order of visit of the mandatory nodes (§IV-E), from which we build the associated solution path by concatenating the shortest paths. This is done in negligible time compared to the tree search. The cost of the solution path found in such a probing manner [24] is then used as an initial upper bound for the B&B algorithm (Fig. 2). In §VII we conduct experiments to evaluate the influence of the upper bound obtained with the GCN on search performance.
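The effect of an initial upper bound on a depth-first B&B search can be sketched as follows. This is a simplified illustration, not the authors' solver: the function name `branch_and_bound` is hypothetical, `initial_order` stands for the order suggested by the GCN, and the lower bound used for cutting is the cost so far plus the direct shortest path to the destination rather than the heuristic of this section.

```python
def branch_and_bound(sp, start, dest, mandatory, initial_order=None):
    """Depth-first B&B over visiting orders of the mandatory nodes.

    sp[u][v] is the shortest path cost between u and v.
    Returns (best cost, best order, number of tree nodes visited).
    """
    def order_cost(order):
        stops = [start, *order, dest]
        return sum(sp[a][b] for a, b in zip(stops, stops[1:]))

    best_cost, best_order = float("inf"), None
    if initial_order is not None:
        # The suggested order is a valid solution, so its cost is an upper bound.
        best_order = list(initial_order)
        best_cost = order_cost(best_order)
    visited = 0

    def search(current, remaining, cost, prefix):
        nonlocal best_cost, best_order, visited
        visited += 1
        if not remaining:
            total = cost + sp[current][dest]
            if total < best_cost:
                best_cost, best_order = total, list(prefix)
            return
        for nxt in sorted(remaining):
            new_cost = cost + sp[current][nxt]
            # Cut the branch if it cannot strictly improve on the best solution.
            if new_cost + sp[nxt][dest] >= best_cost:
                continue
            prefix.append(nxt)
            search(nxt, remaining - {nxt}, new_cost, prefix)
            prefix.pop()

    search(start, frozenset(mandatory), 0.0, [])
    return best_cost, best_order, visited


# Unit-weight line graph 0 - 1 - ... - 5, travelling from node 5 to node 0.
sp_line = {a: {b: abs(a - b) for b in range(6)} for a in range(6)}
plain = branch_and_bound(sp_line, 5, 0, [1, 2, 3, 4])
seeded = branch_and_bound(sp_line, 5, 0, [1, 2, 3, 4], initial_order=[4, 3, 2, 1])
```

Both calls return the same optimal cost, but the seeded search cuts every branch at the root because the suggested order already matches the optimum, whereas the plain search must first descend to a leaf before any cut is possible.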

Fig. 2: GCN-assisted branch and bound algorithm pipeline. The GCN is used recursively to build an order of visit for the mandatory nodes. The order is converted into a path using the previously computed shortest path pairs, and the cost of the path is used as an initial upper bound for the algorithm. Here, the upper bound of the GCN allows for a level 2 early cut.

VII Experiments

VII-A Benchmarks and baselines

We run experiments to evaluate the impact of the GCN’s upper bound on the B&B method. Since proof of optimality is necessary in our context, we focus only on small-scale problems for which optimal solving is possible in reasonable time. We consider four different graphs, $G_1$, $G_2$, $G_3$ and $G_4$, with respectively 15, 23, 22 and 23 nodes. These graphs represent realistic AUGV crisis scenarios in which aid has to be provided to key points in operational areas. More details on the graphs are available in [9], from which the scenarios have been built. We generate random instances for each graph. In order to remain close to realistic instances, we generate the instances as follows: using the shortest source-destination paths computed previously, we apply a decimation ratio (typically 80%) to keep only the 20% of source-destination pairs that have the longest shortest paths. For each resulting pair kept, we generate multiple random instances with an increasing cardinality for the set of mandatory nodes, ranging from 5 to 12.
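The instance generation procedure can be sketched as below. Several details are assumptions made for the illustration: the function name `generate_instances`, the parameters `per_size` and `seed`, and the choice to draw mandatory nodes uniformly among nodes other than the start and destination.

```python
import random


def generate_instances(sp, keep_ratio=0.2, sizes=range(5, 13), per_size=1, seed=0):
    """Keep the source-destination pairs with the longest shortest paths,
    then draw random mandatory node sets of increasing size for each pair."""
    rng = random.Random(seed)
    nodes = sorted(sp)
    pairs = [(s, d) for s in nodes for d in nodes if s != d]
    # Decimation: sort by shortest path cost and keep the longest fraction.
    pairs.sort(key=lambda p: sp[p[0]][p[1]], reverse=True)
    kept = pairs[:max(1, int(len(pairs) * keep_ratio))]
    instances = []
    for s, d in kept:
        candidates = [v for v in nodes if v not in (s, d)]
        for k in sizes:
            if k > len(candidates):
                break  # not enough nodes left for this cardinality
            for _ in range(per_size):
                instances.append((s, d, frozenset(rng.sample(candidates, k))))
    return instances


# Toy 15-node line graph: shortest path cost is the index difference.
sp_line = {a: {b: abs(a - b) for b in range(15)} for a in range(15)}
instances = generate_instances(sp_line)
```

For this 15-node example, 42 of the 210 ordered pairs are kept and each receives one instance per cardinality from 5 to 12.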

We consider 4 different baseline solvers on the benchmark instances generated to compare solving performance. All solvers search for an optimal solution path. First, we use a solver based on dynamic programming (DP) which searches the mandatory search tree. Second, we run the B&B algorithm to search the mandatory search tree, both with and without the upper bound provided by the GCN. Lastly, we solve instances using forward A* applied on the planning domain described in §V, with the minimum spanning tree (MST) as heuristic function. The MST heuristic is computed for an instance $I = (v_s, v_d, M)$ by considering the complete graph $G'$, which comprises only the $v_s$, $v_d$ and mandatory nodes $M$. All pairs of nodes $(u, v)$ in $G'$ are connected by an edge which has a weight equal to the cost of the shortest path from $u$ to $v$ in $G$. The MST heuristic value is obtained by adding the following three values: the total weight of the MST of all mandatory nodes in $M$, the minimum edge weight in $G'$ from the $v_s$ node to any node in the MST, and the minimum edge weight in $G'$ from any node in the MST to the $v_d$ node.
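The three terms of the MST heuristic can be computed with Prim's algorithm on the table of shortest path costs. The function name `mst_heuristic` is hypothetical, and returning the direct shortest path cost when no mandatory node is left is an assumption made for the sketch.

```python
def mst_heuristic(sp, start, dest, mandatory):
    """MST weight over the mandatory nodes plus the cheapest links
    from the start node into the tree and from the tree to the destination."""
    mand = list(mandatory)
    if not mand:
        return float(sp[start][dest])  # assumed fallback with no mandatory node
    # Prim's algorithm on the complete graph whose edge weights are
    # the shortest path costs between mandatory nodes.
    best = {v: sp[mand[0]][v] for v in mand[1:]}
    mst_weight = 0.0
    while best:
        v = min(best, key=best.get)   # cheapest node to attach to the tree
        mst_weight += best.pop(v)
        for u in best:
            best[u] = min(best[u], sp[v][u])
    to_tree = min(sp[start][v] for v in mand)
    from_tree = min(sp[v][dest] for v in mand)
    return mst_weight + to_tree + from_tree


# Unit-weight line graph 0 - 1 - 2 - 3 - 4 - 5.
sp_line = {a: {b: abs(a - b) for b in range(6)} for a in range(6)}
h_value = mst_heuristic(sp_line, start=0, dest=5, mandatory=[1, 3, 4])
```

On this line graph the heuristic returns 5, which matches the cost of the optimal path from node 0 to node 5 through the three mandatory nodes.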

VII-B Implementation details

We set the run-time of A* to 10 hours per graph for data generation. We use 3 graph convolutional layers of width 100. During training, we apply batch normalization [12] and dropout, and train the GCN with Adam [13]. We train models on a Tesla P100 GPU using over 1.5M training examples generated by A*, for 5 hours. We conduct the benchmark tests in this section on a laptop with an Intel i5 processor and 8GB of RAM. We point out that our approach requires a GPU only for the training of the GCN, which can be done offline. Problem instances can then be solved online on a CPU.

Fig. 3: Comparison of the performance of different solvers on benchmark instances generated for one of the graphs. The x axis represents the number of mandatory nodes of the instances, the y axis the average solving time. We limit the y axis to a range [0,4] to obtain a better comparison scale.
  Solver      Mandatory #         5       7       9       11      12
  DP          Avg. node visits    11.7K   876K    98.6M   T/O     T/O
              Avg. time (s)       -       0.26    30.52   T/O     T/O
  B&B         Avg. node visits    418     4.94K   146K    11.1M   120M
              Avg. time (s)       -       -       0.07    5.39    71.1
  A*, h=MST   Avg. state visits   26.4    58.5    141     342     503
              Avg. time (s)       0.14    0.29    0.73    1.84    2.70
  B&B + GCN   Avg. node visits    148     1.24K   10.8K   161K    642K
              Avg. time (s)       -       -       0.01    0.08    0.34
TABLE II: Experiments for one benchmark graph. Legend: T/O = timeout; - = solving time too small to be measured.
  Solver      Mandatory #         5       7       9       11      12
  DP          Avg. node visits    11.7K   876K    98.6M   T/O     T/O
              Avg. time (s)       -       0.24    30.53   T/O     T/O
  B&B         Avg. node visits    545     7.58K   193K    11.8M   122M
              Avg. time (s)       -       -       0.09    5.28    70.8
  A*, h=MST   Avg. state visits   41.78   96.8    226     787     1.09K
              Avg. time (s)       0.15    0.44    0.96    3.18    4.36
  B&B + GCN   Avg. node visits    233     2.46K   29.1K   235K    859K
              Avg. time (s)       -       -       0.02    0.11    0.56
TABLE III: Experiments for a second benchmark graph. Legend: T/O = timeout; - = solving time too small to be measured.

VII-C Results

We summarize results for one graph in Figure 3 and detail the experiments for two graphs in Tables II and III. We include solving time only if it is measurable by CPU clock time. Figures for all graphs show a similar trend. We note that for instances with seven mandatory nodes and more, best-first algorithms such as A* applied on the planning domain defined in §V become more suited than depth-first algorithms such as B&B applied on the combinatorial search tree associated with the mandatory constraints. Although each planning state takes longer to compute in order to account for the specifics of the planning domain, overall significantly fewer planning states are visited than mandatory search tree nodes. This is because the planning domain takes advantage of the graph structure, which gives A* a significant edge over depth-first DP and B&B. We note, however, that when the upper bound of the GCN is used, the B&B algorithm is able to outperform A* on all instances, even the most complex ones, while scaling more smoothly with the number of mandatory nodes. In Table IV we provide additional insight on these results through information collected from the mandatory search tree for instances with 11 mandatory nodes. The average number of nodes processed in the subtree of each child node of the root node is given, as well as the average score of the best known solution after the subtree is processed.

  Solver      Root node child #      -        1        2        3
  B&B         Avg. node visits       -        11M      50K      30K
              Avg. best sol. score            10.35K   9940     9781
  B&B + GCN   Avg. node visits       -        74K      11K      7K
              Avg. best sol. score   9890     9300     9264     9249
TABLE IV: Information from the mandatory search tree.

Since our B&B algorithm is depth-first, processing the entire subtree under the first child node of the root node when no initial upper bound is known is highly computationally expensive. Indeed, no cut can be made until a leaf node is reached, and even then, the identified solution is very likely to be costly compared to the optimal solution, thus the updated upper bound would still not allow for frequent cuts, until a good part of the subtree has been processed. On the other hand, if a good upper bound is known in advance, which is generally the case for the one given by the GCN in our experiments, the algorithm does not suffer from this issue, and early cuts can be made.

VIII Discussion and Further Works

We experiment with path-planning problems defined by three features: the start node, destination node, and mandatory nodes. We accelerate optimal depth-first solving of the search tree associated with the mandatory constraints by leveraging the upper bound computed by the GCN. We show that this speedup is significant, competing successfully with A*. This is the case even for scenarios where handling constraints within the planning domain is more appropriate than extracting and solving them separately. Also, our attempts to guide A* search with a GCN heuristic achieved worse results than the MST heuristic. The reason lies in the best-first approach, for which the GCN is unable to provide a suitable heuristic. Moreover, the proposed framework can include additional types of constraints for path-planning problems. Each new constraint type results in additional features on nodes, and potentially also on edges [28]. In this case, the GCN would learn to predict the next node to visit, and not the next mandatory node. Recursive GCN calls would be made until a solution which satisfies all constraints is found, backtracking when necessary. The approach can be combined with state-of-the-art constraint propagation techniques [8]. In the same manner, the solution cost can be used as an initial upper bound for a depth-first search of a combinatorial tree associated with the constraints.

Learning-wise, the more constraint types there are for a path-planning problem, the wider the GCN learning domain will be. Further work will especially focus on this limitation to relate the exhaustiveness of the training phase to the variety of constraint types. Also, the proposed approach requires offline training of the neural network for a given graph (e.g. problem scenario). It can then be used online for path re-planning purposes, as the AUGV drives through the graph, with appealing computational performance.

IX Conclusion

We introduced a method combining graph convolutional networks (GCN) and branch and bound (B&B) tree search to handle constraints in path-planning, successfully accelerating optimal solving. A relevant self-supervised strategy has been developed, based on A*, which provides appropriate data for GCN training. The heuristic information computed by the GCN enables better scaling of the B&B algorithm onto more complex problems. Results exhibit solving times that outperform A* with a handcrafted heuristic function based on minimum spanning trees. Various AUGV applications can benefit from such an approach, especially when known terrains are given and paths or itineraries must be computed on the fly. We also hope this line of work will serve to highlight the merits of learning with GCNs in optimal path-planning problems.


  • [1] M. M. Bronstein, J. Bruna, Y. LeCun, A. Szlam, and P. Vandergheynst (2017) Geometric deep learning: going beyond euclidean data. IEEE Signal Processing Magazine 34 (4), pp. 18–42. Cited by: §II, §IV-B.
  • [2] J. Bruna, W. Zaremba, A. Szlam, and Y. LeCun (2014) Spectral networks and locally connected networks on graphs. In International Conference on Learning Representations, Cited by: §I.
  • [3] Y. Caseau and F. Laburthe (1997) Solving small tsps with constraints. In ICLP, Vol. 97, pp. 104. Cited by: §I.
  • [4] M. Defferrard, X. Bresson, and P. Vandergheynst (2016) Convolutional neural networks on graphs with fast localized spectral filtering. In Advances in Neural Information Processing Systems, pp. 3844–3852. Cited by: §II.
  • [5] D. Ferguson, M. Likhachev, and A. Stentz (2005-06) A guide to heuristic-based path planning. In Proceedings of the International Workshop on Planning under Uncertainty for Autonomous Systems, International Conference on Automated Planning and Scheduling (ICAPS), Cited by: §I.
  • [6] K. Fukushima and S. Miyake (1982) Neocognitron: a self-organizing neural network model for a mechanism of visual pattern recognition. In Competition and cooperation in neural nets, pp. 267–285. Cited by: §IV-A.
  • [7] M. Gori, G. Monfardini, and F. Scarselli (2005) A new model for learning in graph domains. In Neural Networks, 2005. IJCNN’05. Proceedings. 2005 IEEE International Joint Conference on, Vol. 2, pp. 729–734. Cited by: §II.
  • [8] C. Guettier and F. Lucas (2016) A constraint-based approach for planning unmanned aerial vehicle activities. The Knowledge Engineering Review 31 (5), pp. 486–497. Cited by: §VIII.
  • [9] C. Guettier (2007) Solving planning and scheduling problems in network based operations. In Proceedings of Constraint Programming (CP), Cited by: §VII-A.
  • [10] P. Hart, N. Nilsson, and B. Raphael (1968) A formal basis for the heuristic determination of minimum cost paths. IEEE Transactions on Systems Science and Cybernetics 4 (2), pp. 100–107. Cited by: §I.
  • [11] M. Henaff, J. Bruna, and Y. LeCun (2015) Deep convolutional networks on graph-structured data. arXiv preprint arXiv:1506.05163. Cited by: §II.
  • [12] S. Ioffe and C. Szegedy (2015) Batch normalization: accelerating deep network training by reducing internal covariate shift. arXiv preprint arXiv:1502.03167. Cited by: §VII-B.
  • [13] D. P. Kingma and J. Ba (2015) Adam: a method for stochastic optimization. In 3rd International Conference for Learning Representations, Cited by: §VII-B.
  • [14] T. N. Kipf and M. Welling (2017) Semi-supervised classification with graph convolutional networks. In International Conference on Learning Representations, Cited by: §I, §II, §IV-B.
  • [15] W. Kool and M. Welling (2018) Attention solves your tsp. arXiv preprint arXiv:1803.08475. Cited by: §II.
  • [16] A. Krizhevsky, I. Sutskever, and G. E. Hinton (2012) Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems, pp. 1097–1105. Cited by: §I.
  • [17] G. Laporte (1992) The traveling salesman problem: an overview of exact and approximate algorithms. European Journal of Operational Research 59 (2), pp. 231–247. Cited by: §I.
  • [18] Y. LeCun, Y. Bengio, et al. (1995) Convolutional networks for images, speech, and time series. The handbook of brain theory and neural networks 3361 (10), pp. 1995. Cited by: §IV-A.
  • [19] Y. Li, D. Tarlow, M. Brockschmidt, and R. S. Zemel (2016) Gated graph sequence neural networks. In International Conference on Learning Representations, Cited by: §I.
  • [20] Z. Li, Q. Chen, and V. Koltun (2018) Combinatorial optimization with graph convolutional networks and guided tree search. In Advances in Neural Information Processing Systems, pp. 536–545. Cited by: §II.
  • [21] T. Ma, P. Ferber, S. Huo, J. Chen, and M. Katz (2018) Adaptive planner scheduling with graph neural networks. CoRR abs/1811.00210. Cited by: §II.
  • [22] J. Monnot, V. T. Paschos, and S. Toulouse (2003) Approximation algorithms for the traveling salesman problem. Mathematical methods of operations research 56 (3), pp. 387–405. Cited by: §I.
  • [23] P. M. Narendra and K. Fukunaga (1977) A branch and bound algorithm for feature subset selection. IEEE Transactions on Computers C-26, pp. 917–922. Cited by: §I, §VI.
  • [24] K. Osanlou, C. Guettier, A. Bursuc, T. Cazenave, and E. Jacopin (2018) Constrained shortest path search with graph convolutional neural networks. In Workshop on Planning and Learning (PAL-18), Cited by: §VI.
  • [25] D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. Van Den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot, et al. (2016) Mastering the game of go with deep neural networks and tree search. nature 529 (7587), pp. 484–489. Cited by: §VI.
  • [26] D. Silver, J. Schrittwieser, K. Simonyan, I. Antonoglou, A. Huang, A. Guez, T. Hubert, L. Baker, M. Lai, A. Bolton, et al. (2017) Mastering the game of go without human knowledge. Nature 550 (7676), pp. 354. Cited by: §VI.
  • [27] T. Volgenant and R. Jonker (1982-01) A branch and bound algorithm for the symmetric traveling salesman problem based on the 1-tree relaxation. European Journal of Operational Research 9, pp. 83–89. Cited by: §I.
  • [28] W. B. W. Vos (2017-08) End-to-end learning of latent edge weights for graph convolutional networks. Master’s Thesis, University of Amsterdam. Cited by: §VIII.