1.1 Concepts and Terminology
Path planning is, in its general meaning, the task of finding a sequence of actions that allows transitioning from an initial configuration to a goal configuration [3]. Because of the generality of the concept, this report uses the word agent, a term commonly used in the literature, to refer to the object that executes the plan.
Formally speaking, the configuration of an agent in its workspace is a set of parameters that defines its current state. What a configuration represents is application-dependent. For an agent translating in 2D space, the configuration can consist of two parameters defining its position (often denoted x and y). For a robotic arm, the parameters of the configuration can be the angular positions of the arm joints. The minimum number of parameters needed to uniquely describe the configuration of an agent is its number of degrees of freedom (DoF). The parameters of the configuration form what are referred to as the generalized coordinates of the agent. The vector space defined by these coordinates is called the configuration space. It includes all possible configurations of an agent and has a dimension equal to the agent's number of degrees of freedom.
Normally, the workspace of an agent contains obstacles. Configurations that cause the agent to collide with these obstacles are referred to as forbidden. Forbidden configurations also result from internal constraints of the agent (for example, dynamic and kinematic constraints). Generally, a configuration is forbidden if it is impossible or undesirable for the agent to occupy; all other configurations are denoted as free.
Based on this, the task of path planning is to find a continuous path from the start configuration to the goal configuration such that all configurations along the path are members of the free space. If no such path exists, failure should be reported. In addition, quality measures can be defined for paths, and the task of path planning may include optimization with respect to some or all of these measures. For instance, a planner that finds the shortest path for a translating agent optimizes the distance.
1.2 Literature Review
The problem of path planning has received significant attention in research, both as a general problem in fields like artificial intelligence and as an application-specific problem.
Two main ways exist for representing the environment over which the planning is performed [4]. The first is the artificial potential field. In this approach, the motion of the agent is governed by potential force fields applied over the environment. Usually, the goal configuration applies an attractive force and obstacles apply repulsive forces on the agent. The agent then follows the steepest descent of these force fields until it reaches the goal [5]. Potential field methods are less popular nowadays [4]. One of the drawbacks of this technique is that the agent is prone to getting stuck in local minima of the potential field, so, in general, the method does not guarantee success [4]. That is, a valid path might exist even though the method has not found one.
Another way of representing the environment is by capturing the connectivity of the configuration space in some sort of roadmap. This roadmap should cover the free configuration space such that any free configuration in the environment can be connected to the roadmap in a trivial way. The roadmap should also be connected in the sense that if, for any pair of configurations on the roadmap, a valid path exists in the free space, a valid path also exists through the roadmap. A roadmap that satisfies these two conditions allows complete path planning methods; that is, if a valid path exists, it will be found. For many applications, it is very hard, or even impossible, to generate a roadmap that meets these conditions.
To create a roadmap, the configuration space is discretized by selecting (sampling) configurations. These configurations are added to the roadmap, and connections are created between them to capture the connectivity of the free space. The roadmap can then be represented as a directed graph in which the nodes (or vertices) are the sampled configurations, and a directed edge between two nodes exists if a valid transition from the configuration at the tail of the edge to the configuration at the head exists in the free space.
Each edge can be associated with a cost based on the desired characteristics of the path. For instance, if the purpose is to minimize distance, the cost would be the distance between the two configurations. In other words, the cost function encodes the criteria of an acceptable path; a path with a higher cost is generally given a lower priority in path planning. An optimal path planning method is one that finds the optimal path, that is, the path whose sum of edge costs is minimal across all possible paths leading from the start configuration to the goal configuration.
Different discretization (or sampling) schemes are used to create a roadmap for path planning. In complex high-dimensional configuration spaces (e.g. a 6-DoF robotic arm), the Probabilistic RoadMap (PRM) method has proven particularly useful [4]. The idea of the method is to randomly sample the configuration space to find free configurations and connect those close to each other by edges. In low-dimensional configuration spaces, however, it is both tractable and more practical to use uniform sampling, where nodes are uniformly spaced yet close enough to capture the free and forbidden space and produce reliable plans [2]. Moreover, unlike the probabilistic method, the resultant roadmap satisfies the two conditions mentioned above that allow complete planning.
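As an illustration of uniform sampling in a 2D workspace, the following sketch builds a grid roadmap as an adjacency structure. The representation (cells as (x, y) tuples, 4-connectivity, unit edge costs) is an assumption chosen for demonstration, not the implementation used in this report.

```python
def build_grid_roadmap(width, height, obstacles):
    """Return an adjacency dict: node -> list of (neighbor, cost).

    Nodes are (x, y) cells; cells listed in `obstacles` are forbidden
    configurations and are excluded from the roadmap. Edges connect
    4-neighboring free cells with unit cost (an illustrative choice).
    """
    free = {(x, y) for x in range(width) for y in range(height)
            if (x, y) not in obstacles}
    graph = {}
    for (x, y) in free:
        neighbors = []
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            n = (x + dx, y + dy)
            if n in free:
                neighbors.append((n, 1.0))  # uniform edge cost
        graph[(x, y)] = neighbors
    return graph

# A 3x3 grid with the center cell forbidden: its neighbors lose an edge.
roadmap = build_grid_roadmap(3, 3, obstacles={(1, 1)})
```

Edge costs other than unit distance (speed limits, undesirable areas) would simply replace the constant 1.0 with a cost derived from the traversed cells.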
Generating the roadmap is considered a preprocessing phase in path planning and is often done offline. The idea is to put as much effort as needed into creating a good roadmap. At run time, the roadmap can then be used multiple times to answer queries relatively quickly.
There are different approaches in the literature for computing paths given some roadmap of the configuration space. Generally, the popular techniques can be divided into two main categories: deterministic, heuristic-based algorithms (A*-like algorithms) and randomized algorithms (such as the Rapidly-exploring Random Tree (RRT)). When the planning is done over a low-dimensional space (i.e. when the number of degrees of freedom of the agent is low), deterministic algorithms are often preferred because they provide a bound^1 on the quality (or optimality) of the returned path [3].

^1 A factor ε such that the found path has a cost lower than or equal to the cost of the optimal path multiplied by ε.
1.3 Problem Definition and Modeling
For the application investigated in this report, it is suitable to treat the agent (the autonomous car in this case) as a 2D object moving in 2D space. This is a sufficient representation, since there is little to gain from working in 3D space.^2

^2 Colliding with a tall object is probably not better than colliding with a short one.
1.3.1 Planning in a Static Environment
The second chapter deals with the problem of an agent navigating in a complex environment containing stationary obstacles such as walls and buildings. Further, this environment contains roads with different degrees of traversability (e.g. different speed limits) in addition to areas that are desirable or undesirable according to some criteria (known driving protocols). The agent is required to reach a specific goal state while maintaining minimal cost with respect to all the mentioned criteria.
In a practical situation, the agent will never have a complete knowledge of the environment prior to planning. Instead, it may have an initial knowledge of the layout of the environment, including static obstacles, from readily available maps. While it is moving, it acquires new information (using onboard and/or offboard sensors) about stationary obstacles, which makes it necessary to continually update the planned path.
In this case, the configuration of the agent has only two parameters indicating its position in the 2D space. A configuration (position) is forbidden if there is an obstacle at it. Because of this representation, path planning returns a path consisting of a sequence of positions with no time information.
Note that while the configuration space changes (newly received information causes previously free configurations to become forbidden and vice versa), it does not contain any information about the dynamics of the environment, so the speeds and headings of moving obstacles are not known. This is what is referred to as a partially-known static space. In this case, even when there are moving obstacles, the configuration space is dealt with as it is at the moment of planning. The agent plans its path by frequently taking a snapshot of the environment and finding a trajectory according to this static image. This, of course, increases the risk of the plan losing its validity soon after planning and requires replanning more often. Even so, the agent may collide with obstacles moving at high speeds.

Because of the low dimensionality of the problem, it is most suitable to use a roadmap generated by uniformly discretizing the configuration space, allowing complete path planning. Uniform sampling in 2D space creates a grid-like environment, where cells represent configurations (the positions of their centers), as seen in Fig. 2.3. Path planning is then solved as a directed graph search problem using deterministic, heuristic-based algorithms. This allows returning a path with a bound on its optimality. The desired criteria of the found path can be encoded in the edge costs. As new information arrives, the graph is updated (nodes and edges are added or removed, and/or edge costs are changed) and the plan needs to be updated or recalculated (from scratch).
The second chapter focuses on heuristic-based search algorithms and their use in the context shown here. The purpose of this chapter is not to discuss the code of these algorithms but rather to provide insights into them and discuss how they apply to the considered problem. For that reason, the pseudocode of these algorithms is not included here, but can be seen in their respective references.^3 It is surely better to know these algorithms from the papers that presented them (all cited in this report).

^3 I suggest consulting [3] and [4] for a well presented view, history and bibliography on the topic.
1.3.2 Planning in a Dynamic Environment
For complete dynamic planning, the planner has to consider the dynamics of the moving obstacles. Ideally, the agent receives information about the speeds and heading angles of the moving obstacles. This information is used to estimate their future positions, and planning is done such that the resultant path guarantees safety. This is the focus of the third chapter. Different approaches exist to do this. One popular approach is to plan in the so-called time-configuration space [6]. This means adding time as an additional dimension of the configuration space and extrapolating the positions of the moving obstacles along the time dimension. A roadmap is then generated in this 3D space, and heuristic-based search is used to find a path. It will be seen that this approach makes it complex to enforce dynamic and kinematic constraints on the resultant path (such as bounds on speed and acceleration).

Another approach is to divide the planning task into two levels: one global and one local. On the global level, a path is planned in the configuration space without considering any of the moving obstacles. The agent then starts moving along this coarse global path, avoiding obstacles locally in a dynamic way by controlling its speed and heading angle. One of the most widely used local obstacle avoidance methods is the velocity obstacle approach [7]. The idea of this approach is to geometrically construct what is known as the relative velocity space of the agent and the moving obstacles. In this space, the set of velocities that will lead the agent to a collision in the future is defined. The agent then chooses a velocity outside this set based on some predetermined preferences. While this approach nicely allows applying dynamic constraints to the chosen velocities, it does not guarantee global optimality, as the overall path is planned without considering the moving obstacles. Moreover, because of the locality of the approach, an agent based on it can get stuck in local minima (e.g. U-shaped static obstacles) [8].
The third chapter introduces a proposed approach based on incremental search that tries to capture the advantages of both methods discussed above: global planning with optimality bounds while maintaining dynamic and kinematic constraints. A new online roadmap construction approach is also presented, increasing the flexibility of the agent's movements while at the same time preventing the roadmap from growing explosively.
2.1 HeuristicBased Path Planning Algorithms
It is hard to discuss path planning without mentioning Dijkstra's algorithm, introduced in 1959 [9]. Given a graph G = (V, E) consisting of a set of nodes (or vertices) V and a set of edges E between nodes, with each edge (u, v) ∈ E having an associated non-negative cost c(u, v), Dijkstra's algorithm is able to find optimal paths from a single start node u_start to all nodes in V. The algorithm is shown below. It starts at the start node and adds to a queue (often named OPEN) all the successors of that node. It maintains for each node u the cost g(u) of the best found path from the start to u. Nodes are popped out of the queue with priority given to lower g values. In addition, the algorithm maintains for each node u a back pointer that indicates through which of its predecessors (the nodes with directed edges leading to u) the best found path from u_start to u comes. The search finishes when the queue becomes empty. If a path to a node was found successfully, it can be traced using back pointers beginning from that node.
With the use of a priority queue, the running time of Dijkstra's algorithm is O((n + m) log n), where m is the number of edges and n is the number of nodes [10]. For large graphs, Dijkstra's algorithm becomes computationally expensive, as it iterates through all the nodes in the graph. It is easy to notice that, in many applications, it may be unnecessary to compute the optimal paths to all nodes in the graph; only the path to one goal node is needed. This is what A* addresses.
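The procedure described above can be sketched compactly with a binary-heap priority queue, using the adjacency-list graph representation from section 1.2. The lazy handling of stale queue entries and the variable names are illustrative choices, not part of the original formulation.

```python
import heapq

def dijkstra(graph, start):
    """graph: node -> list of (successor, cost) pairs.

    Returns (g, back): the best found cost g(u) for each reached node u,
    and back pointers from which optimal paths can be traced.
    """
    g = {start: 0.0}
    back = {start: None}
    queue = [(0.0, start)]          # the OPEN queue, keyed by g
    while queue:
        cost, u = heapq.heappop(queue)
        if cost > g[u]:             # stale entry superseded by a better path
            continue
        for v, c in graph.get(u, []):
            new_cost = cost + c
            if new_cost < g.get(v, float("inf")):
                g[v] = new_cost     # better path to v found
                back[v] = u
                heapq.heappush(queue, (new_cost, v))
    return g, back
```

Tracing the back pointers from any reached node reconstructs the optimal path from the start to that node.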
A* is an algorithm based on Dijkstra's and published in 1968 [11]. It utilizes a heuristic that helps focus the search toward the goal. The differences between Dijkstra's algorithm and A* are that in A* the search ends as soon as the goal node u_goal is popped out of the priority queue, and that the key of a node (its priority in the queue, with priority given to nodes with lower key values) is computed by adding a heuristic h(u) to g(u). The algorithm is shown below for reference.
The heuristic is a value^1 h(u) associated with each node u and is an estimate of the cost from this node to the goal. It defines the characteristics and affects the performance and results of the A* search. A heuristic is said to be admissible if it is a lower bound of the cost of moving from u to the goal. More formally:

^1 It can be a function that takes a node and returns a real number, or a value for each node saved in a lookup table.
$h(u) \le c^*(u, u_{goal})$    (2.1)
where c*(u, u_goal) is the cost of the optimal path from u to the goal. If the heuristic satisfies this, A* is guaranteed to find the optimal path to u_goal. An admissible heuristic for a graph embedded in the plane, for instance, where edge costs are the Euclidean distances between the nodes, is the direct Euclidean distance to the goal (any other path to the goal will be longer). A tighter condition on the heuristic is consistency, which is satisfied when:
$h(u) \le c(u, u') + h(u') \quad \forall u' \in succ(u), \qquad h(u_{goal}) = 0$    (2.2)
where succ(u) is the set of successors of u (the nodes to which there are directed edges from u).
For two admissible heuristic functions h₁ and h₂, h₂ is said to be more informed than h₁ if h₂(u) ≥ h₁(u) for all nodes u. The least informed, yet admissible, heuristic is h(u) = 0 for all u. This heuristic makes A* equivalent to Dijkstra's algorithm. The better informed a heuristic is, the better the performance of A*, as fewer nodes are expanded. A perfect heuristic, that is, one equal to the actual cost of reaching the goal, would allow A* to reach the goal without expanding any nodes outside the optimal path.
Several notes are worth mentioning about A*. It is important to realize that initializing the g values of all nodes at the start (lines 1–3) is not necessary. Instead, a node can be initialized when it is encountered for the first time in the search. This change can be significant when the graph contains a large number of nodes, making it expensive to initialize all of them while only a subset will be considered. Further, it can be seen that expanding nodes (executing lines 9–15 of the algorithm) is where the major computation time is spent. In other words, the time spent by the search algorithm can be viewed as a linear function of the number of expanded nodes. It is interesting to know that if the heuristic is consistent, preventing any node from being expanded more than once still leads to an optimal result [12]. Finally, A* can also work with more than one goal with almost no addition to the algorithm. In this case, we let U_goal denote the set of goal nodes rather than a single goal node. The heuristic of a node u is then the minimum of the heuristic values between u and each node in U_goal, and the search terminates whenever one of the goal nodes is at the top of OPEN.
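The notes above (lazy initialization of g values, and expanding each node at most once under a consistent heuristic) can be sketched as follows. The closed-set approach and the variable names are illustrative assumptions; this is not the report's reference implementation.

```python
import heapq

def a_star(graph, start, goal, h):
    """graph: node -> list of (successor, cost); h: consistent heuristic.

    Returns (path, cost), or (None, inf) if the goal is unreachable.
    Nodes are initialized lazily on first encounter rather than up front.
    """
    g = {start: 0.0}
    back = {start: None}
    queue = [(h(start), start)]         # keyed by g(u) + h(u)
    closed = set()                      # consistent h: expand each node once
    while queue:
        _, u = heapq.heappop(queue)
        if u == goal:                   # goal popped: trace back pointers
            path = []
            while u is not None:
                path.append(u)
                u = back[u]
            return list(reversed(path)), g[goal]
        if u in closed:
            continue
        closed.add(u)
        for v, c in graph.get(u, []):
            new_cost = g[u] + c
            if new_cost < g.get(v, float("inf")):
                g[v] = new_cost
                back[v] = u
                heapq.heappush(queue, (new_cost + h(v), v))
    return None, float("inf")
```

On a grid roadmap with unit edge costs, the Manhattan distance to the goal is an admissible and consistent heuristic for this search.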
A* is, without any doubt, a big improvement over Dijkstra's algorithm. However, there is still considerable room for further development. For example, the algorithm always computes the optimal path to the goal, which can still be very expensive in applications where the configuration space is very large or highly multidimensional. In these cases, one might prefer a suboptimal path computed in a short time over an optimal path that takes much longer to be ready. Also, whenever a change happens in the graph, the search has to be carried out again from scratch, throwing away all the effort spent before. This is inefficient, especially in environments that are continuously changing by nature, as in the case of autonomous cars. The following subsections review the evolution of heuristic-based search to account for these situations.
2.1.1 Backward A*
A useful note about A* is that it can be performed backwards. This means that the search starts at the goal^2 and proceeds until finding the start. To do this, every occurrence of u_start in the algorithm shown above is replaced by u_goal and vice versa. The heuristic used should be a lower bound of the cost of moving from the start to the node u. In other words, it should be backward admissible and consistent:

^2 Again, this can be easily extended to multiple goals.
$h(u) \le c^*(u_{start}, u), \qquad h(u) \le h(u') + c(u', u) \quad \forall u' \in pred(u)$    (2.3)
Now, g(u) represents the cost of the best found path from u to the goal, and when expanding a node, its predecessors are examined instead of its successors (line 9). The found path can then be traced by back pointers (or perhaps front pointers) starting from u_start.
While this produces the same result, giving the optimal path, the behavior of the search is different. One difference is that forward search spends more time and expands more nodes around obstacles bent towards the start, while backward search struggles more with obstacles bent towards the goal, as in Fig. 2.1. Also, it was found that, when moving in a completely unknown terrain, it can be much more efficient to plan forward rather than backward [3]. This is because in such cases the observed area is the surroundings of the start, while the edge costs near the goal are not known and are typically given optimistic costs (less than the actual). Planning forward allows the agent to efficiently explore the observed area and rapidly progress towards the goal directly. In the opposite case, however, the search starts in the unknown area and soon moves to the observed area, where it encounters costlier edges and begins expanding a large area trying to find a cheaper path.
Figure 2.1: blue: start, red: goal, black: obstacles, yellow: explored area, dashed line: path. Note that the paths produced by forward and backward A* are equivalent in total cost (distance).
However, backward search has a very important property: g(u) represents the cost of the best found path from u towards the goal, not the start. This is useful because when the agent moves, the start position changes. The position of the goal, on the other hand, is fixed, which means the g values of all the nodes remain valid. Retaining these values after planning has finished may allow us to use them again when a change happens in the graph, updating the plan instead of planning from scratch.
2.1.2 Dynamic Replanning (D*)
As mentioned above, in real-world scenarios, the agent initially has an inaccurate graph to plan on. The graph gets continuously and regularly updated as time passes and as the agent moves, and so the planned path may become invalid or suboptimal. It would then be prohibitively expensive to plan from scratch using A* to maintain validity and optimality every time the graph is updated, especially in large and complex configuration spaces with tens of thousands of nodes. Further, the changes might actually not affect the optimality of the planned path, or might affect it in such a minor way that the planned path can easily be fixed. In these cases, fixing the path (replanning) is much more logical than planning from scratch.
Achieving this is the purpose of replanning algorithms. Among the most well known and widely used algorithms in this category are Focused Dynamic^3 A* (D*) (published in 1995) and Dynamic A* Lite (D* Lite) (published in 2002) [13]. The two algorithms behave equivalently, but this report will focus on D* Lite, as it is simpler and considered somewhat more efficient [13].

^3 Note that 'dynamic' here does not mean that it plans in the configuration-time space and accounts for the dynamics of the environment. Instead, it means that it can update a previously prepared plan to account for changes in the graph, whether this graph represents a configuration or configuration-time space.
D* Lite is an extension of backward A*. It is based on the idea that when an update occurs in the graph, the cost of the shortest path to the goal (the g value) of only a few nodes is affected. It tries to recalculate this cost only for nodes that were affected by the update and are relevant for calculating the new optimal path to the goal.^4 This is achieved through some changes to backward A*. Another value, besides the g value, is maintained for each node. It is named rhs(u), a one-step lookahead value based on the g values of the successors, and hence it is better informed than the g value. It always satisfies:

^4 The costs of nodes to the goal are enough to determine the optimal path, as it can be traced from u_start to the goal by always transitioning from a node u to the successor u' that minimizes c(u, u') + g(u').
$rhs(u) = \begin{cases} 0 & \text{if } u = u_{goal} \\ \min_{u' \in succ(u)} \big( c(u, u') + g(u') \big) & \text{otherwise} \end{cases}$    (2.4)
A node u is consistent if g(u) = rhs(u); otherwise, it is overconsistent if g(u) > rhs(u) or underconsistent if g(u) < rhs(u). Overconsistent nodes are those for which the cost of their path to the goal is expected to decrease, while underconsistent nodes are those for which the cost of their path to the goal is expected to increase. The queue always holds the inconsistent nodes (over- or underconsistent). They are expanded based on the key value:
$key(u) = [\min(g(u), rhs(u)) + h(u);\; \min(g(u), rhs(u))]$    (2.5)
The key of a node has two elements, and the nodes in the queue are ordered lexicographically based on them. This means that priority is given to nodes with lower values of the first element, and ties (i.e. when the first element of one node's key equals that of another) are broken by selecting the node with the lowest second element. When a node is expanded, it is made consistent if it was overconsistent and the rhs values of all of its predecessors are updated; otherwise, it is made overconsistent and the rhs values of itself and its predecessors are recalculated.
When D* Lite runs for the first time (u_goal is overconsistent by initialization, and all other nodes have g = rhs = ∞), it works equivalently to a backward A* search. The search ends when there is no node in the queue with a key less than that of u_start. D* Lite then maintains the queue as well as the g and rhs values of all the nodes. When any change is detected, the rhs values of the affected nodes are calculated, then the inconsistent^5 nodes are inserted into the queue (which might still hold nodes from previous searches), and nodes are expanded and updated until there is no node in the queue with a key less than that of the start node. The second element of the key ensures that affected nodes lying along the previously planned path are processed in an efficient order, so a minimal number of nodes is processed to find the new optimal path.

^5 If no inconsistent node is found, the previously calculated path is still optimal.
D* Lite is well suited to the path planning problem in unknown or partially known environments. It works well with a moving start position (because it is a backward search) and replans in response to any change that happens in the graph. Some facts about replanning are significant. Replanning is more expensive when changes occur in nodes closer to the goal, because these changes affect the g values of a larger number of nodes. So, D* Lite is best used when the changes happen close to the agent (e.g. through its observations). It is also important to see that, as replanning is based on reusing previous search results, if the changes are significant and the graph is considerably altered, planning from scratch might be much more efficient than relying on the search results of a nearly different graph. For that reason, it is common for systems utilizing D* Lite to abort the replanning process and start planning from scratch if a significant change is detected or if a certain time threshold is exceeded [3].
While D* Lite expands considerably fewer nodes than a normal A* because it uses previous search information, expanding a node in D* Lite is more expensive, as it usually involves iterating over the successors of the predecessors of the node to update the rhs values of those predecessors. So, for a small environment with no or very few changes happening in the graph, A* might be more efficient than D* Lite, but as the graph gets bigger and changes increase, the merits of D* Lite's replanning feature can be seen and appreciated. An optimized version of D* Lite was presented in [13]. It considerably decreases the frequency of calculating rhs values throughout the search, making it significantly faster.
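The bookkeeping at the heart of D* Lite, the rhs value (2.4), the consistency classification, and the two-element key (2.5), can be sketched as below. This is a deliberately simplified illustration: the full algorithm's queue maintenance and the key offset used to handle a moving start are omitted, and the function names are assumptions.

```python
INF = float("inf")

def rhs(u, goal, succ, g):
    """One-step lookahead value per (2.4): 0 at the goal, otherwise the
    best cost through any successor, given current g values."""
    if u == goal:
        return 0.0
    values = [c + g.get(v, INF) for v, c in succ(u)]
    return min(values) if values else INF

def classify(u, g, rhs_u):
    """Consistency status of node u given its g and rhs values."""
    g_u = g.get(u, INF)
    if g_u == rhs_u:
        return "consistent"
    return "overconsistent" if g_u > rhs_u else "underconsistent"

def key(u, g, rhs_u, h):
    """Two-element lexicographic priority per (2.5); lower keys first."""
    m = min(g.get(u, INF), rhs_u)
    return (m + h(u), m)
```

Python's tuple comparison gives the lexicographic ordering of (2.5) for free: ties on the first element are broken by the second.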
2.1.3 Anytime Planning (ARA*)
All the algorithms shown in this report so far compute the optimal path. However, calculating the optimal path might not be achievable under certain time constraints in complex multidimensional workspaces, as it requires a very large number of nodes to be expanded. A class of search algorithms often named anytime planners addresses this issue. Anytime planners construct an initial suboptimal solution quickly and improve this solution as time and resources allow. Heuristic-based anytime algorithms generally rely on the fact that if an admissible heuristic of A* is inflated by a factor ε ≥ 1, the found path is guaranteed to be no more than ε times longer than the actual optimal path [3].
One of the known heuristic-based anytime algorithms is Anytime Repairing A* (ARA*), published in 2003 [14]. There is a forward version of this algorithm based on A* [14] as well as a backward version based on backward A* [15, 3]. The backward version is superior for moving agents, as it is able to improve its plan while moving (u_start is changing) for the reasons explained in section 2.1.1.
In backward ARA*, the priority of a node in the queue is determined by the key:
$key(u) = g(u) + \varepsilon \cdot h(u)$    (2.6)
where ε is the inflation factor. The algorithm starts by performing an inflated backward A* with an initial inflation factor ε₀ > 1. This search is done very quickly and efficiently for one significant reason: each node in the queue is expanded at most once. This is different from a normal A* search, where a node may naturally be expanded many times depending on its connectivity with neighboring nodes. In ARA*, once a node has been expanded in a particular search, if it is later found inconsistent during that search (the cost of its shortest path to the goal is found to be less or more than its g value), it is inserted into a list named INCONS. The search ends when the start node is popped out of the queue, and a suboptimal plan results. After that, if time and resources allow, while the agent is moving or before moving, the inflation factor is decreased by a certain step. The nodes in the INCONS list are moved to the queue (which might still contain nodes from previous searches), and the priorities of all nodes in the queue are updated according to (2.6). This continues until ε = 1 or until there are no more resources.
From search to search, ARA* retains the queue and the INCONS list as well as the g values of all expanded nodes. The inflation factor is decreased gradually with each search, getting the planned path closer and closer to the optimal path until resources end or until the path is satisfactory. Note that improving the plan (decreasing ε) can only happen in a static environment, where the graph stays as it is and nothing moves except the agent on its previously planned path. What makes the subsequent ARA* searches fast is the fact that in each search only nodes that were found inconsistent during the previous search are inserted into the queue and that each node in the queue is expanded at most once. Even so, ARA* is able to guarantee an optimality bound^6 for each search done [14].

^6 A number ε such that the cost of the resultant plan is less than or equal to ε times the cost of the optimal plan.
Anytime planning allows fast planning in time-critical situations as well as in complex multidimensional configuration spaces where planning an optimal path would take a very long time. However, it has its limitations too. The most significant one is that it is not suitable for dynamic configuration spaces where nodes and edges can change. Any change will require ARA* to reset ε, discard all the information from previous searches, and plan from scratch. Another limitation concerns the generality of the algorithm and comes from the requirement to select an initial inflation factor ε₀ and a step by which ε is decreased at each subsequent search. These values have a strong influence on the performance and results of the search. Experiments done in the simulation environment presented in this report have shown that these parameters are situation-dependent. The larger ε is, the greedier the search through the configuration space. With large values of ε, the algorithm may expand a low number of nodes, but it is then more prone to getting temporarily stuck in local minima. The algorithm's designers recommend decreasing ε by small steps, as large steps may result in expanding too many nodes [14]. However, how small this step should be, as well as the value of the initial inflation factor, are left for users to decide based on their needs.
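A single ARA*-style inflated search can be sketched by weighting the heuristic in the A* key by ε as in (2.6) and expanding each node at most once. The full ARA* machinery (the INCONS list and the reuse of queue contents and g values across searches with decreasing ε) is omitted; the forward direction and all names here are illustrative assumptions.

```python
import heapq

def inflated_search(graph, start, goal, h, eps):
    """One inflated search: keys are g(u) + eps * h(u), per (2.6).

    With an admissible h and eps >= 1, the returned path costs at most
    eps times the optimal cost. Each node is expanded at most once.
    """
    g = {start: 0.0}
    back = {start: None}
    queue = [(eps * h(start), start)]
    expanded = set()                    # expand each node at most once
    while queue:
        _, u = heapq.heappop(queue)
        if u == goal:                   # trace the found (suboptimal) path
            path = []
            while u is not None:
                path.append(u)
                u = back[u]
            return list(reversed(path)), g[goal]
        if u in expanded:
            continue
        expanded.add(u)
        for v, c in graph.get(u, []):
            new_cost = g[u] + c
            if new_cost < g.get(v, float("inf")):
                g[v], back[v] = new_cost, u
                heapq.heappush(queue, (new_cost + eps * h(v), v))
    return None, float("inf")
```

Running this repeatedly with a decreasing ε, while carrying over the inconsistent nodes, is the essence of the ARA* loop described above.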
2.1.4 Anytime Dynamic Replanning (AD*)
Replanning algorithms allow maintaining an optimal path in the presence of changes, fixing the path instead of reconstructing it from scratch, which makes planning in dynamic configuration spaces much cheaper. Anytime algorithms, on the other hand, perform suboptimal searches and improve the plans while time and resources are available, leading to highly reactive planning that is flexible under resource constraints. Many real world applications would benefit from the features of both algorithm categories. For instance, autonomous cars operate in very dynamic configuration spaces where replanning is continuously needed. At the same time, plans are required to be produced quickly and efficiently, allowing the agent to find valid (even if suboptimal) paths in critical situations and to continue improving its plans while moving and replanning. Under this motivation, Anytime Dynamic A* (AD*) was developed in 2005 [15].
AD* takes from D* Lite its two-element key, the lexicographical ordering of the queue, and the rhs value that always satisfies (2.4). It takes from ARA* the idea of inflating the heuristic with a factor ε, starting from an initial value and gradually decreasing it as time allows to improve the path, in addition to the ideas of expanding each node at most once per search and of inserting into the open queue, at the start of a subsequent search, only the nodes that were found inconsistent during the previous search. As a result, it is able to plan and replan (in case of changes) in an anytime fashion while moving.
The key by which nodes are ordered in the queue shows how AD* is an intersection between D* Lite and ARA*:
key(s) = [ rhs(s) + ε · h(s_start, s) ;  rhs(s) ]   if g(s) > rhs(s)
key(s) = [ g(s) + h(s_start, s) ;  g(s) ]           if g(s) ≤ rhs(s)        (2.7)
The heuristic is not inflated for underconsistent nodes (nodes whose g-value is lower than their rhs-value, i.e. the cost of the path going through them is larger than anticipated) entering the queue, since an admissible heuristic is needed there so that their new costs propagate to their neighbors.
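The two-case key of (2.7) can be sketched in a few lines (a Python sketch, not the report's MATLAB code; the returned tuple mirrors the lexicographic queue ordering):

```python
def adstar_key(g, rhs, h, eps):
    """Priority key in the style of Eq. (2.7): the heuristic is inflated
    only for overconsistent nodes (g > rhs); underconsistent nodes keep
    an uninflated, admissible heuristic so that cost increases can
    propagate to their neighbors."""
    if g > rhs:                       # overconsistent: inflate by eps
        return (rhs + eps * h, rhs)
    else:                             # underconsistent or consistent
        return (g + h, g)
```

Because Python compares tuples lexicographically, these keys can be used directly in a binary-heap queue.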
With this, AD* is able to handle changes in both the graph and the inflation factor, giving it great flexibility and capability in dynamic configuration spaces. However, this does not come without complexities and drawbacks. In fact, while AD* has the advantages of both replanning and anytime planning algorithms, it also suffers from the drawbacks of both. From the anytime side, the algorithm still requires the user to select a suitable initial value for ε and the decrease step, as discussed in section 2.1.3. From the replanning side, expanding a node is more expensive than in a normal A* search, as seen in section 2.1.2. Also, if significant changes happen, replanning in AD* may be more expensive than planning from scratch. In fact, it was pointed out that this is more so in AD* than in D* Lite, as AD* needs to reorder the whole queue every time ε is changed [3]. Dealing with this issue in AD* is complex, as one has three options when confronting changes in the graph:

continue decreasing the inflation factor by the usual step and let AD* replan and improve the path at the same time. This is recommended when the changes are minor and insignificant.

increase the inflation factor to an adequately large value to allow AD* to replan quickly, and then improve the plan by gradually decreasing ε as time allows. This is recommended if the changes are significant but replanning is still more affordable than starting from scratch.

discard all the previous search information and start planning from scratch. This should be the decision when replanning is taking too long.
The problem is to determine how significant a change should be for planning from scratch to become more affordable. This is highly application dependent. One can set a time threshold, for example, such that when it is exceeded during replanning, planning from scratch begins, but this wastes valuable time and resources. Instead, simulations and experiments can be done to determine the change conditions that would justify fresh planning.
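As an illustration only, the three options can be folded into a hypothetical decision policy (the names and thresholds here are invented for the sketch; as discussed above, the actual choice is application dependent):

```python
def on_graph_change(significance, eps, eps_step, eps_high=2.5):
    """Hypothetical policy for the three options above. 'significance'
    would in practice come from simulations or application knowledge."""
    if significance == 'minor':
        # Option 1: keep the usual eps schedule and replan while improving.
        return ('replan', max(1.0, eps - eps_step))
    elif significance == 'major':
        # Option 2: re-inflate the heuristic, replan fast, improve later.
        return ('replan', eps_high)
    else:  # 'severe'
        # Option 3: discard previous search information and start over.
        return ('plan_from_scratch', eps_high)
```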
Despite all these complexities, AD* has proven to be very powerful. The algorithm was used for planning in Boss, the autonomous car that won the 2007 DARPA Urban Challenge, traveling 97 km in a dynamic urban environment while obeying the California Driver Handbook [16]. An optimization that considerably decreases the cost of expanding a node and extends the algorithm to plan in the configuration-time space was proposed in 2006 [6]. The original algorithm as well as the optimized version were implemented to plan in simulated dynamic configuration spaces, as seen further below.
2.2 Test Environment
A grid-based simulation environment was designed in MATLAB to test the different algorithms shown. In this section, the backbone of this environment is described. A block diagram of the main environment components is shown in Fig. 2.2. An arrow from a component A to a component B indicates that A uses B.
2.2.1 Forming a Grid
A 2D grid is represented as a MATLAB structure named grid. The structure contains three fields: (1) obstacles, a binary matrix of the width and height of the grid. A cell in the grid can be either free or occupied by an obstacle. If it is free, its respective value in the obstacles matrix is 0 (or false), otherwise it is 1 (or true). (2) start, the cell where the initial state or configuration is. A cell (or node) is represented by a horizontal vector containing two numbers representing the x and y coordinates of the cell in the grid, respectively. (3) goal, a matrix containing one or more goal nodes. This allows a grid to have more than one goal.
It can be concluded from this that the obstacles field is what gives the grid its general structure and size. It is defined such that, given the x and y coordinates of a node as its first and second indices respectively, it determines whether this node has an obstacle on it or not. Hence, it is hard to get a visual image of the grid by simply printing this matrix (note: the first index corresponds to a row in MATLAB, so it represents a position in the vertical direction, while in the definition used here it actually represents the horizontal coordinate). A grid can be generated by calling one of the following two functions:

createEmptyGrid: takes the width and height of the grid and returns a grid with no obstacles (grid.obstacles is all zeros) and with no defined start and goal nodes.

createGridMap: takes an obstacle matrix and returns a grid structure with the given obstacle matrix and with no defined start and goal nodes.
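A rough Python analogue of the grid structure and its two constructors might look as follows (a sketch, not the MATLAB code; field names follow the report, and nested lists are indexed [x][y] to mirror the convention that the first index is the horizontal coordinate):

```python
def create_empty_grid(width, height):
    """Sketch of createEmptyGrid: an obstacle-free grid with
    undefined start and an empty goal list."""
    return {
        'obstacles': [[0] * height for _ in range(width)],  # [x][y]
        'start': None,
        'goal': [],    # may hold several goal nodes
    }

def create_grid_map(obstacles):
    """Sketch of createGridMap: wrap a given obstacle matrix in a
    grid structure with undefined start and goals."""
    return {'obstacles': obstacles, 'start': None, 'goal': []}
```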
Table 2.1 shows additional functions that operate on the grid structure. All of these functions take two arguments: a grid structure and a node. Verification functions return a Boolean that is true when the input node is both in the grid and is an obstacle/free. Modification functions facilitate changes on a node in the grid. These functions do not allow an obstacle to be placed over a goal or a start node and vice versa. However, the start node can be in the same cell as a goal node (in which case no path planning is needed!). While multiple goals can be added, only one start node can be defined, so when calling the defineStart function, the old start position is replaced by the new one. It is worth noting that these modification functions do not, in fact, change the input grid structure. Instead, they return a new grid structure that is a copy of the input one with the changes applied to it.
Grid Verification Functions:  isObstacle, isFree
Grid Modification Functions:  addObstacle, addGoal, defineStart, removeObstacle, removeGoal
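The copy semantics of the modification functions can be sketched as follows (a hypothetical Python analogue of addObstacle; in MATLAB the pass-by-value semantics of structs give this behavior naturally):

```python
import copy

def add_obstacle(grid, node):
    """Sketch of addObstacle's behavior: the input grid is left
    untouched and a modified copy is returned. Placing an obstacle
    over the start or a goal node is refused, as in the report."""
    x, y = node
    if grid['start'] == node or node in grid['goal']:
        return grid                       # disallowed: unchanged grid
    new_grid = copy.deepcopy(grid)
    new_grid['obstacles'][x][y] = 1
    return new_grid
```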
2.2.2 Search Algorithms and Search Information
All the algorithms discussed in section 2.1 were implemented as functions. Seven search functions are provided, namely: AStar and AStarBack for forward and backward A* search respectively, ARAStarBack for backward ARA* search, DStarLite and DStarLite2 for normal and optimized D* Lite search respectively, and ADStar and ADStar2 for normal and optimized AD* search respectively. All of them were implemented such that they support multiple goals.
For any path planning algorithm to be compatible with the designed simulation environment, it should be in the form of a function that takes two arguments: a grid structure and another structure named searchInfo, and returns a new version of the searchInfo structure. As its name suggests, this structure holds the search results produced by the path planning algorithm. If the input searchInfo is empty, the function plans from scratch. Otherwise, different procedures are taken based on the different capabilities of the algorithms. Table 2.2 shows the common fields of the searchInfo structure that are present in the output of all the included search functions. In addition to the fields shown in table 2.2, searchInfo may contain additional fields depending on the needs of the search function that uses it. For instance, ADStar and ADStar2 output a searchInfo structure that has the additional fields rhs, open, incons and eps (for ε).
Field  Description 

success  A binary that is true if a path was found successfully. 
start  The start of the found plan. This is particularly important when the grid has multiple goals and a backward search is done. This field then determines which goal is in the found plan. Note that, for a backward search, the search is from goals towards the start, so the start of the plan is a goal not the start node. 
bp  A backpointer matrix. It indicates for each node the next node to be taken to arrive to the goal according to the found plan. 
g  A matrix containing the g-value of each node calculated in the search. 
expanded  A binary matrix that is true for nodes that were expanded during the search. 
grid  The grid structure on which the plan was found. This is needed so that when a search function is called again, a comparison between the input grid and the grid field in the input searchInfo can determine if the grid was changed and planning/replanning is needed. 
time  The time consumed to complete the search in seconds. Note that this is machine dependent. 
Normally, search algorithms do not output a complete path from the start to the goal. This is because the complete path is not needed; only the next step matters. In the simulation environment presented here, however, computing the complete path was necessary in order to visualize it. It was important to separate the task of calculating the path from the search functions themselves, so that time measurements only reflect the effort spent on planning. For that, a function named tracePath is included. This function takes a searchInfo structure and returns a copy of it with an additional field named path that is computed in the function. The function simply checks the success field; if it is 1, it traces the path beginning with the node searchInfo.start and moving to the next one using the back pointers searchInfo.bp until it reaches the end of the path. It is able to do this for both forward and backward searches, and in both cases the output path is a matrix containing the nodes of the path, beginning with the start node and ending with a goal node. Note that although most of the implemented algorithms (D* Lite, ARA* and AD*) did not include back pointers in their published versions, it was possible, yet not trivial, to extend them to produce back pointers.
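The back-pointer tracing can be sketched as follows (Python; bp is modeled here as a dictionary mapping a node to its successor, whereas the report stores it as a matrix):

```python
def trace_path(search_info):
    """Sketch of tracePath: follow back pointers from the plan's start
    node until a node with no successor is reached, and return a copy
    of the input structure with the traced path added."""
    info = dict(search_info)          # leave the input untouched
    if not info['success']:
        return info
    path, node = [info['start']], info['start']
    while info['bp'].get(node) is not None:
        node = info['bp'][node]
        path.append(node)
    info['path'] = path
    return info
```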
2.2.3 Heuristic and Connectivity Functions
The search functions need several pieces of information to find a valid plan, as explained in section 2.1. The needed information includes a heuristic function that takes two nodes and returns an admissible estimate of the cost of transitioning from one to the other. In addition to that, the search may need to access the predecessors and/or the successors of a node. For that, predecessors and successors functions are required. They take as their arguments a grid and a node and return the predecessor/successor nodes of the input node in the input grid. Finally, the edge cost of moving from one node to another (if an edge exists between them) is required.
The simulation environment was designed with generality in mind. So at the start of all search functions, there is a code section named “options” where the handles of the four functions (h for the heuristic, pred for predecessors, succ for successors and c for cost) should be given. The functions can be provided by the user according to the purposes of the needed simulation and the whole code will work smoothly without needing to change anything else.
For simplicity, the tests presented in this report are all based on an 8-connected grid model (Fig. 2.2(a)). In this model, a node is connected to the eight surrounding nodes except obstacle nodes, which are not connected. This implies that the predecessors of a node are the same as its successors. So for both predecessors and successors, one function named neighbors8 is used to return the non-obstacle nodes among the surrounding 8 nodes of a node. In addition to that, a simple cost function named cost8 that only considers distance is used. If there is no edge between a node and another (as is the case with obstacles), the cost of transitioning is infinity. Moving to an adjacent node (straight) costs 1 and moving to a corner node (diagonally) costs √2. For the heuristic, the diagonal heuristic [17] is used in a function named hDiagonalDistance. In the heuristic, the unit straight distance is taken as 1 and the unit diagonal distance as √2.
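Python sketches of the three functions, under the conventions above (straight cost 1, diagonal cost √2; the grid dictionary follows the structure described earlier, and these are illustrative analogues rather than the MATLAB implementations):

```python
import math

def neighbors8(grid, node):
    """Sketch of neighbors8: the non-obstacle cells among the 8 cells
    surrounding a node (used for both predecessors and successors)."""
    x, y = node
    w, h = len(grid['obstacles']), len(grid['obstacles'][0])
    out = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            nx, ny = x + dx, y + dy
            if (dx or dy) and 0 <= nx < w and 0 <= ny < h \
                    and not grid['obstacles'][nx][ny]:
                out.append((nx, ny))
    return out

def cost8(a, b):
    """Sketch of cost8: 1 for a straight move, sqrt(2) for a diagonal
    move, infinity if the cells are not adjacent."""
    dx, dy = abs(a[0] - b[0]), abs(a[1] - b[1])
    if max(dx, dy) != 1:
        return math.inf
    return math.sqrt(2) if dx and dy else 1.0

def h_diagonal_distance(a, b):
    """Sketch of the diagonal-distance heuristic [17]: as many diagonal
    steps as possible, the remainder straight."""
    dx, dy = abs(a[0] - b[0]), abs(a[1] - b[1])
    return (dx + dy) + (math.sqrt(2) - 2) * min(dx, dy)
```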
Note that, with the provided functions, the agent is allowed to move backward at any time. In addition, it cannot move more than one node at a time. These conditions can easily be changed by changing the connectivity functions pred and succ. Suppose, for example, we want the agent to only move forward in the direction of the goal; then pred and succ cannot be linked to the same function. Instead, pred should return neighbors that are in the opposite direction of the goal and succ should return neighbors that are in the direction of the goal, as in Fig. 2.2(b). Similarly, the agent can be allowed to move more than one node at a time by connecting more surrounding nodes than eight to it, provided that there is no obstacle between the node and any of the connected nodes. Connecting the surrounding 24 nodes, for example, would allow moving one or two nodes at a time in any direction. Clearly, the more connectivity there is in the graph, the more expensive the computation needed to find a plan.
2.2.4 Graphing and Animation
The principal graphing function in the presented simulation environment is named plotGridPath. Its purpose is to allow plotting grids, paths, and the cells expanded during the search. For that, it takes a group of inputs. The inputs include a grid and a searchInfo structure. This allows plotting the grid, the path found by the search, and the nodes expanded during the search. If the input searchInfo is empty, the function only plots the grid; if the input grid is empty while searchInfo is not, the function updates the path on an already available plot. For that, a handle to the figure where the grid was plotted before has to be passed in. In addition, handles to specific plotted objects are among the inputs, namely handles to the plotted start cell, expanded cells and path line. These inputs allow updating the plotted figure with minimal computational effort instead of redrawing the whole figure again and again. If a plot is being produced for the first time, the input handles can be replaced by empty inputs. The function will then create a new plot from scratch.
At the top of the function file, there is a code section named “options” where the user can control various options regarding the produced plots. This includes controlling whether specific elements appear or not, like expanded cells, the path line, grid lines and coordinate ticks. The options also allow changing the colors of all elements of the figure and enabling or disabling a color map for the expanded cells. When this color map option is enabled, the expanded cells are plotted with different colors based on their g-value. The color is yellow for low g-values close to zero (the expanded cell was found by the search to be very close to the search start) and moves gradually to cyan as the g-value increases (meaning that the cell is considered far from the search start). This creates more informative figures, giving the user an idea of why the found path was chosen by a specific search algorithm. The effect of some of these options can be seen in the figures included in this report. For instance, the plots in Fig. 2.1 are generated with grid lines and the color map disabled, while the plots in Fig. 2.4 have grid lines and the color map enabled, and Fig. 2.5 and 2.7 have grid lines disabled and the color map enabled. In all of these figures, the coordinate ticks option was disabled.
After its execution, plotGridPath returns handles to the plotted figure as well as the objects inside it (again: start cell, expanded cells and path line). These outputs can then be used as inputs to the same function in future calls, which allows creating fast animations. Suppose, for example, we have a grid that has not been plotted before, together with its associated search information from a search function that operated on the grid. We can call plotGridPath with these two inputs and empty inputs for the handles. The function will produce a plot of the grid and the path on it and return handles to the created figure and the individual items inside it. Suppose we then want to animate movement of the agent along the planned path; the function can be called repeatedly with the output handles of each call as the input handles of the following call, and with an updated start position in the input grid and an updated path in the input searchInfo. The same principle can be used to animate the updated paths and plans while the agent is moving in the case of anytime planning or replanning.
To make animation easier and more accessible to the user, an animation function named planMove is also included in the simulation environment. It takes a grid and a searchInfo structure as well as all the handles taken by plotGridPath and a handle to a search function. If there is no plan, planMove uses the search function to plan a path and animates movement of the agent along the planned path for one step using plotGridPath. It then outputs all the handles outputted by plotGridPath as well as an updated version of the input grid (where the start position is changed to the node the agent moved to) and searchInfo (where the path is reduced by one node because the agent has already executed part of the path). Using these outputs as inputs to the same function in a future call (naturally in a loop) will animate another one-step movement, and so on until the agent reaches the goal.
It is also possible, if a plan was already found in an initial call of planMove, to input a grid where some obstacles have been changed (using the addObstacle and removeObstacle functions) to represent the situation of the grid after the one-step movement. planMove will then animate moving the agent one step along the old path, plot the new situation of the grid, use the search function to update the plan (or plan from scratch if it is not capable of replanning), and show the new plan in the figure. With this, planMove is able to animate moving the agent, changes in the grid and the subsequent replanning, as well as anytime planning. Because updating the figure can be too fast, depending on the size of the grid and the time needed to do the search, the function includes an “options” section to change the pause time between updates to make the animation easier to follow.
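Stripped of all plotting and handle passing, the plan-then-step cycle of planMove can be sketched as follows (hypothetical Python; the fake structures stand in for the MATLAB grid/searchInfo pair):

```python
def plan_move(grid, search_info, search_fn):
    """Sketch of planMove without the graphics: plan if there is no
    plan yet or if the grid changed since the last search, then
    advance the agent one node along the path. Returns the updated
    grid (start moved) and searchInfo (path shortened by one node)."""
    if search_info is None or search_info.get('grid') != grid:
        search_info = search_fn(grid, search_info)   # plan or replan
    path = search_info['path']
    new_grid = dict(grid, start=path[1])             # move one step
    new_info = dict(search_info, path=path[1:], grid=new_grid)
    return new_grid, new_info
```

Calling this in a loop, with each call's outputs fed into the next call, mirrors how the animation loop drives the agent to the goal.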
In addition to the outputs mentioned above, planMove also outputs the number of expanded cells and the time taken to do the search, if a search was done. This allows the function to be used for time tests too. To make the tests faster and more accurate, the function does not show any animation if the input figure handle is empty, which allows fast and accurate time tests by using planMove to move the agent along the path and compute the total time taken by all planning/replanning. This can be done repeatedly, and the average of this total time can then be taken as a performance measure.
2.3 Results and Analysis
Several tests have been performed in the simulation environment. The tests started on a simple grid, so that it is easy to inspect and debug when there is a problem in a search algorithm. This grid is the same grid that can be seen in [15, 3]. The idea was to reproduce the results shown in these papers to ensure the correct functioning of the whole system. The grid can be generated in the environment by calling a function named paperGrid.
Another, larger and more complex grid was designed and can be generated in the environment by calling a function named complexGrid. This grid has two goal nodes. Its size gives a better opportunity to test replanning and the effect of different ε decrease rates and inflation factors for anytime planning.
Finally, a third, much larger grid is included in the environment and can be generated by calling largeGrid. This grid was initially generated by a maze generator. However, the generated maze had very narrow passages only one cell wide, so it was extensively modified to have wider passages and multiple paths to the different places in it. Because the grid is very large, its obstacles matrix is saved in a .txt file. When largeGrid is called, it loads the matrix and adds a start position and four different goal positions. The large size of this grid allows more reliable and realistic testing of the search functions. Table 2.3 gives a summary of the properties of the three grids.
Property                       paperGrid   complexGrid   largeGrid
Dimensions (verti. × horiz.)
Number of cells                42          390           5445
Number of goals                1           2             4
Start position
Obstacles density
Movement scenario changes      1           3             7
The reason for including multiple goal positions in some of the test grids is that dealing with multiple goals is particularly important when planning in the configuration-time space (to be inspected later), where the search has multiple configuration-time goal nodes, usually at the same configuration but at multiple points in time.
For each grid there is an associated MATLAB test script. The script contains what we call a movement (or navigation) scenario, where the planMove function is used to plan a path and move the agent through it. During movement, the grid is changed and the agent is faced with a new situation where it has to replan or plan from scratch. The number of times each grid is changed during the movement scenario is shown in table 2.3. The test script iterates through all seven included search functions, performing the test on each. Every test script has two modes: (1) Time tests disabled. In this case, an animation of the movement and planning of the agent is shown for the user to follow. This is useful for what the author names “behavior” tests, where the behavior of the search can be followed and verified. (2) Time tests enabled. Plotting is then disabled and the script moves the agent silently through the grid until it reaches a goal. While doing this, it records the total time spent and the total number of cells expanded in planning, replanning and anytime improvement, as well as the length of the followed path to the goal. It does this repeatedly for a number of repetitions input by the user, depending on the available time and computational power. The tests are done for each search function and the average metrics over the repetitions are printed on the screen at the end.
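A hypothetical harness mirroring the time-test mode of these scripts might look like this (Python sketch; run_scenario stands for silently moving the agent to a goal and returning the total time, expanded-cell count and path length):

```python
def time_test(make_grid, search_fns, run_scenario, reps=10):
    """Hypothetical test harness: run the movement scenario `reps`
    times per search function and average the recorded metrics."""
    results = {}
    for name, fn in search_fns.items():
        totals = {'time': 0.0, 'expanded': 0, 'path': 0}
        for _ in range(reps):
            t, expanded, path_len = run_scenario(make_grid(), fn)
            totals['time'] += t
            totals['expanded'] += expanded
            totals['path'] += path_len
        results[name] = {k: v / reps for k, v in totals.items()}
    return results
```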
2.3.1 Behavior Tests
The first test performed was, as mentioned before, the reproduction of the search results shown in [15, 3] on a small grid. The test script of the paperGrid performs the same simple navigation scenario shown in these papers, where the agent plans, moves for two cells, then receives new information about the environment, realizing that there was a falsely placed obstacle in the old graph. So, it plans or replans depending on its capabilities.
The grid is small enough for all the results to be shown. The resultant plans can be seen in Fig. 2.4. It can be seen that they are very similar to the ones shown in [15, 3]. However, the number of cells expanded by all the search functions is generally slightly lower. This is due to the fact that these papers use a different heuristic and cost function. The heuristic they use is the minimum of the horizontal and vertical distance between two nodes, and the cost is the same for moving straight or diagonally. While this heuristic is valid and admissible, it creates ties (situations where two paths are equivalent according to the provided cost function) that are sometimes broken arbitrarily in favor of visually undesirable trajectories. Because the same discussion shown in [15, 3] applies here, it is not repeated.






Because of the large sizes of the other grids, it is not possible to show the complete behavior of all the search functions on these grids in this brief report. Instead, we highlight interesting and significant details of the different functions in different situations.
Figure 2.5 shows the behavior of both backward A* and D* Lite in the complexGrid. Initially (first row of the figure), the two plan optimal paths (note that although the two resultant paths are slightly different, they both have the same length and are optimal) using a backward search. It can be noticed, however, that backward A* expanded fewer nodes than D* Lite in the initial plan. This is due to the different tie-breaking strategies used by the two algorithms. While backward A* breaks ties arbitrarily when there are two nodes with equal keys in the queue, D* Lite uses keys with two elements to break ties in a more tactical way. The strategy of D* Lite is beneficial in replanning, but in planning from scratch it results in more expanded cells. In fact, the author believes that rethinking how D* Lite breaks ties (redesigning the key of the queue) may lead to significant improvements for both D* Lite and AD*.


After the agent moves for 3 cells, it finds that one of the obstacles has been removed (second row of Fig. 2.5). Here, the difference between backward A* and D* Lite is seen clearly. D* Lite was able to quickly find the new optimal path, expanding a small number of nodes, while backward A* had to plan from scratch. After moving again, the agent discovers that the goal it was headed to is inaccessible (third row of Fig. 2.5). It can be seen that, in that case, planning from scratch was almost equivalent to replanning, again due to the way ties are broken. Finally, while approaching the second goal, the agent realizes that there is a shorter path to it (last row of Fig. 2.5). Backward A* plans from scratch, expanding states from the goal to the agent position; D* Lite replans. In short, while D* Lite spends significant effort on initial plans, it manages to save a lot when facing a change in the environment in most cases. The total number of expanded cells and the time spent over the whole grid are shown in table 2.4.
An example of how AD* works is shown in Fig. 2.6. The agent starts planning on the complexGrid (upper left of the figure), expanding as few nodes as possible at the expense of the optimality of the found path. At the subsequent step (upper right), the inflation factor is decreased and nodes are expanded in an attempt to find a better path. AD* succeeds in finding the optimal path after the second step (bottom left) as ε was decreased further. After three steps (bottom right), the grid is changed and AD* is able to replan and find the new optimal path expanding a minimal number of nodes. An interesting case happens here. The initially suboptimal path left the agent closer to the shortcut road that appears in the grid after the third step (compare this to the second row of Fig. 2.5). This made the final total path taken by AD* from start to goal actually shorter than the one taken by methods that find the optimal path from the beginning, like A* and D* Lite. In addition, AD* expands fewer nodes to find this shorter path, as seen in table 2.4.


Finally, Fig. 2.7 compares the behavior of D* Lite, ARA* and AD* at different stages on the largeGrid. It can be seen in Fig. 2.7(a) (top) that D* Lite expands most of the grid (2824 nodes) to find an initial optimal path of length 143 nodes. After moving for some time, the agent is faced with a new situation (middle) where the old path is no longer valid. D* Lite expands only 13 nodes to find the new optimal path of length 124. Later, a new, more significant change is detected (bottom), so D* Lite replans again, expanding 307 nodes to find an optimal path of length 95. (Other changes that required modifying the plan occurred in the grid at instances not shown in the figure, because of the large number of changes programmed to occur in the largeGrid.)
Figure 2.7(b) shows how ARA* deals with the same situation. Initially (top), the algorithm finds a suboptimal path using an inflation factor of 4.5, expanding only 359 cells. The found path is, surprisingly, optimal with a length of 143. However, neither the user nor the algorithm has any guarantee of that. So as the agent moves, ARA* expands more cells trying to find shorter paths. These intermediate searches are not shown in the figure. At some point, it finds that the path is not valid (middle), so all the previous search results are discarded. The inflation factor is reset to 4.5 again and a quick new plan is prepared, expanding 326 cells to find a path of length 136. The same thing happens again at a later point (bottom), and again ARA* discards all the previous effort, resets ε and plans from scratch, expanding 240 nodes to find a path of length 105. Note that the paths found in the last two situations are not optimal and are different from the ones D* Lite found.
As the agent moves, ARA* will keep trying to improve the found path. While ARA* is able to converge to the optimal path after some amount of time, the continuous changes in the grid make ARA* lose its optimality frequently. The total path followed by ARA* from start to goal is highly suboptimal compared to the other methods, as seen in table 2.4. One way to fix this is to stop resetting the inflation factor whenever a change happens and keep decreasing it as usual. While this results in paths closer to optimal, it gradually turns ARA* into backward A* as ε gets closer to 1, which means that ARA* will expand many more nodes when a change happens in the grid, as seen in table 2.5. Not resetting ε caused more than four times as many node expansions as resetting ε at each change in the grid.
In Fig. 2.7(c), we see how AD* deals with this situation. Initially (top), AD* finds a quick plan the same way ARA* did. It begins with an inflation factor of 4.5 and finds a plan by expanding only 359 nodes. Again, the found path happens to be optimal with length 143, but AD* is not aware of that, and as it moves, it will expand larger parts of the grid trying to find a shorter path, but will not find one. The intermediate searches are not shown in the figure. At some point, the agent finds that the old path is not valid (middle), so replanning is needed. However, AD* does not have to start from scratch and does not have to reset ε (unless the change was very significant, which is not the case here). At that instance, ε has not reached 1 yet, and AD* is doing something between ARA* and D* Lite. It expands 270 nodes (much fewer than ARA* did in the same situation, but also much more than D* Lite did) to find a suboptimal path of length 135. Note that the expanded cells are not around the goals but mostly around the changed area. After some time, when the agent detects a new change in the grid, AD* replans again. At this moment, AD* has converged to D* Lite, as ε has reached 1, and it is able to replan and find optimal paths. It expands 301 nodes (slightly fewer than D* Lite) to repair its path and find the optimal path of length 95, the same one found by D* Lite. It can be seen in table 2.4 that the total number of cells expanded by D* Lite on the largeGrid is much lower than that of AD*. However, this comes with a great advantage for AD* that will be discussed in the next section.
2.3.2 Performance Tests
Time tests were done in MATLAB to obtain an indicator of the relative performance of the different search functions on a real system. Table 2.4 summarizes the results of these tests for the three grids using the seven different search functions: A*, backward A* (B. A*), D* Lite, optimized D* Lite (O. D* Lite), ARA*, AD* and optimized AD* (O. AD*). The time measurements are taken as averages over a number of repetitions depending on the grid size. The other included metrics are the number of expanded cells and the length of the followed path from start to goal.
Metric   | A*     | B. A*  | D* Lite | O. D* Lite | ARA*   | AD*     | O. AD*

paperGrid (average over 100 repetitions)
time (s) | 0.0065 | 0.0052 | 0.0196  | 0.0063     | 0.0077 | 0.0240  | 0.0079
expanded | 26     | 26     | 25      | 23         | 21     | 21      | 19
path     | 8      | 8      | 8       | 8          | 8      | 8       | 8

complexGrid (average over 50 repetitions)
time (s) | 0.1721 | 0.0994 | 0.5560  | 0.5113     | 0.1359 | 0.5900  | 0.2016
expanded | 734    | 439    | 354     | 349        | 357    | 243     | 245
path     | 29     | 29     | 29      | 29         | 28     | 28      | 28

largeGrid (average over 10 repetitions)
time (s) | 3.4615 | 2.6417 | 3.1217  | 1.2692     | 0.9787 | 22.9397 | 9.8100
expanded | 15,903 | 13,528 | 3,357   | 3,349      | 2,265  | 10,462  | 10,658
path     | 159    | 159    | 159     | 159        | 183    | 159     | 159
The time taken on the tiny paperGrid is not very indicative of the performance of the functions due to the very low number of nodes, but it is included for completeness. The search steps and the resultant paths over the paperGrid were also shown in Fig. 2.4. It can be seen that the length of the followed path is the same for all the functions. The number of cells expanded by the different search functions on this small grid will not be discussed here, as it was discussed before in [15, 3] and is not as good a representative of performance as in the case of larger grids.
It was possible to repeat the tests on the complexGrid 50 times and average the time measurements. It can be seen that A* expands more nodes than any other method. It expands more than backward A* due to the nature of the obstacles, as discussed in section 2.1.1. However, the cost of expanding a node is roughly the same in both methods. D* Lite expands fewer nodes than both A* and backward A*, but the cost of expanding one node is much higher than in A*, so the total time is longer. The optimized D* Lite slightly decreases the number of expanded cells and also the time spent per node; however, the significance of the optimization is not as clear in this grid as it is in the largeGrid. The total path followed by these four methods is 29 cells long. ARA* expands slightly more nodes than D* Lite but still far fewer than forward and backward A*. Its cost of expanding one node is higher than that of A*, though still considerably lower than that of D* Lite, needing only one fifth of its total time. AD* expands fewer nodes than any other method, but its cost of expanding one node is higher than that of all other methods; the optimization succeeds in significantly decreasing this cost, and hence the total time needed, by more than 65%. The path followed by the last three functions is shorter than that followed by the methods that plan optimally, for the reason explained in section 2.3.1 and seen in Fig. 2.6.
It appears from the tests over the complexGrid that the benefits of methods like D* Lite and AD* can only be seen in situations where the number of nodes they expand is much smaller than the number of nodes expanded by other methods, which should be the case in the largeGrid. Indeed, forward and backward A* expand a huge number of nodes, reaching 15,903 and 13,528 expanded nodes respectively. Again, forward A* expands more because of the nature of the grid, which slightly favors backward searches. D* Lite expands about one fifth of the nodes expanded by A*, but because of the higher cost of expanding nodes, the total times of the two methods are close. The optimized version of D* Lite tremendously decreases the total time taken by the algorithm, by about 60%, making its total time less than half that of backward A*. The four methods follow a path of length 159 from start to finish.
ARA* expands fewer nodes than all previous methods but follows a highly suboptimal path of length 183 cells. This is because it resets the inflation factor ε at each change in the grid, as discussed in section 2.3.1. The initial inflation factor is 4.5 and the step by which this factor is decreased each time is 0.08. This means that 44 steps are needed until ε reaches 1 and ARA* is guaranteed to find an optimal path. However, changes happen so frequently that ε never reaches 1. If ε is not reset at each change, ARA* soon behaves like backward A* and expands a large number of nodes whenever a change happens. This can be seen in table 2.5: not resetting ε made the number of nodes expanded by ARA* close to that of backward A*, but allowed ARA* to follow an optimal trajectory. On the other hand, AD* expands a large number of nodes, comparable to that of backward A*, and because of the high cost of expanding a node, its total time is much higher than that of all other functions at 23 s. The optimized version decreases this time by more than half, reaching less than 10 s, but this is still much higher than the other methods, which might make the reader question the whole usefulness of AD*.
2.3.3 Insights on the Performance of AD*
The performance of AD* depends highly on the values of the initial inflation factor ε₀ and the step by which this factor is decreased each time. Different choices of these two parameters lead to different results, as shown in table 2.5. So, why choose ε₀ = 4.5 and step = 0.08 while this leads to one of the largest numbers of expanded cells?
         | ARA*   | ARA*   | Optimized AD*                              
Settings |        |        |        |        |        |        |        
ε₀       | 4.5    | 4.5    | 3.0    | 3.5    | 4.5    | 6      | 4.5    
step     | 0.08   | 0.08   | 0.2    | 0.08   | 0.08   | 0.2    | 0.08   
reset    | yes    | no     | no     | no     | no     | no     | yes    
Metrics  |        |        |        |        |        |        |        
time (s) | 0.9787 | 3.3126 | 3.5613 | 9.6405 | 9.8100 | 4.1915 | 5.0884 
expanded | 2,265  | 9,331  | 6,162  | 11,844 | 10,658 | 6,789  | 4,374  
path     | 183    | 159    | 159    | 159    | 159    | 159    | 155    
D* Lite does a very good job in replanning, but it spends a very long time producing its initial plan, as was seen in Fig. 2.6(a). AD* can be thought of as a way to use D* Lite without needing to spend a long time (that might not be available) on producing an initial plan. Therefore, the usefulness of AD* is not in decreasing the total number of expanded nodes, but in avoiding having to expand a large number of nodes in a single search until AD* converges to the optimal replanning behavior of D* Lite. To better understand this, refer to Fig. 2.10, which shows the number of expanded cells at each step along the path followed by the agent from the start until reaching a goal on the largeGrid.
AD* begins by behaving like ARA*, performing a suboptimal search that expands a small number of nodes. The value of ε₀ forces a bound on the optimality of the initial plan, as explained in section 2.1.3. It can be seen in Fig. 2.9(a) that the number of nodes expanded at the first step by AD* with different values of ε₀ is almost identical at around 400 nodes. The value of the step defines how the optimality bound changes from one search to the next. If it is too large, AD* might need to expand many nodes in one search to satisfy the new optimality level. The ratio between ε₀ and the step determines the number of searches needed for AD* to converge to D* Lite: the lower this ratio, the faster AD* becomes D* Lite.
For the curves shown in Fig. 2.9(a), the ratio between ε₀ and the step is lowest for AD* with ε₀ = 3 and step = 0.2. With these parameters, AD* converges quickly to D* Lite and hence expands a minimal total number of nodes, as in table 2.5. However, AD* needs to expand a large number of nodes per search, reaching 1500 nodes at the peak. When this step is kept and ε₀ is increased to 6, AD* takes longer to converge to D* Lite and the total number of expanded nodes is slightly higher than in the previous case, but AD* still needs to expand a large number of nodes per search, reaching around 1350. This indicates that the step should be decreased. When the step is decreased to 0.08 and ε₀ is set to 3.5, AD* takes longer to converge to D* Lite, and at some searches it needs to expand as many as 1400 nodes to improve the path while planning and replanning, which means that the initial inflation factor was not large enough for AD* to be able to expand fewer nodes per search. The total number of expanded cells in this situation is almost double that of the previous two situations. If this step is kept and ε₀ is increased to 4.5, we get a smoother curve with a peak at 700 nodes per search. Note that most searches stay below the peak, so when time is available AD* might do more than one search per step. However, a relatively low upper bound is successfully set on the number of nodes expanded per search. This comes at the expense of the total number of expanded nodes, as seen in table 2.5. It can also be seen in Fig. 2.9(a) that after AD* converges to D* Lite, it behaves the same way whatever the values of ε₀ and the step were, replanning only when changes are detected in the grid (small peaks after the 50th step).
To conclude, decreasing the step means that optimality is increased more slowly, which means that more searches are done until ε reaches 1 and hence more nodes are expanded in total. Selecting a suitably small step and an adequately large value for ε₀ allows bounding the maximum number of nodes AD* expands per search based on the needs and the limitations of the task. However, the appropriate values of these parameters depend on the nature of the graph, which makes selecting them difficult.
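The dependence of convergence on these two parameters can be sketched in a few lines of Python (the formula and parameter names are ours; this is an illustration, not the report's MATLAB implementation):

```python
import math

def searches_to_converge(eps0, step):
    """Decrements needed before an inflation factor eps0 is reduced to 1."""
    return math.ceil((eps0 - 1.0) / step)

# the settings compared in table 2.5
settings = [(4.5, 0.08), (3.0, 0.2), (3.5, 0.08), (6.0, 0.2)]
counts = [searches_to_converge(e, s) for e, s in settings]
# (4.5, 0.08) needs 44 searches, matching the 44 steps mentioned for ARA*
```

A larger count means more (smaller) searches before optimal replanning, which is exactly the trade against the per-search peak discussed above.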
Figure 2.9(b) shows the number of cells expanded by AD* compared to other methods. Backward A* expands a large number of nodes at the start, then does nothing until a change is detected in the grid, at which point it plans from scratch, again expanding a large number of nodes. The number of nodes expanded in planning decreases as the agent gets closer to the goal, because the search space gets smaller. D* Lite has a very high peak at the start, expanding more than 2800 nodes (slightly more than backward A*); it then expands very few nodes (generally fewer than 500) whenever a change is detected in the grid. Again, the number of cells expanded in replanning decreases as the agent gets closer to the goal. ARA* expands few nodes at the start, then expands fewer or no nodes at the subsequent steps until a change happens, when it again expands a small number of nodes. This is because it is never given more than 30 consecutive steps without a change in the grid, making ε unable to get close to 1. Finally, AD* starts by expanding few nodes, then improves the plan by expanding more nodes whenever possible. It never expands more than 800 nodes in a search. Around the 50th step, AD* becomes equivalent to D* Lite, expanding nearly the same number of nodes for replanning.
3.1 Dynamic Constraints in Incremental Search
As discussed in section 1.3.2, a popular way of dynamic planning using incremental heuristic-based algorithms is to plan in the time-configuration space, as in [6]. In this method, the state is described by the parameters of the configuration in addition to time. A backward AD* search is then done over this space after constructing a roadmap that extrapolates the positions of the moving obstacles along the time dimension. As the space is large and complex, a probabilistic roadmap technique is used.
Because the search is backward, it starts with a goal state and tries to reach the start state. However, the goal state is not known, because the time of arrival of the agent at the goal cannot be predetermined with certainty. To get around this, minimum and maximum arrival times need to be determined. Multiple goals are then fed to the search, all at the same position but with different time values, starting from the minimum arrival time and going up to the maximum arrival time with a small time step between each goal and the next.
In addition to that, the heuristic and cost functions have to include time, such that the planner can be asked to prioritize time or distance, as well as other factors (like smoothness, closeness to the center line, etc.), while finding a path. The cost of traversing a path P is given by:

c(P) = w_t · t(P) + w_p · c_p(P)    (3.1)

where t(P) is the time taken to go through the path, c_p(P) is the cost of the path based on all relevant criteria other than time (in the simplest form, it is only the distance), and w_t and w_p are weighting factors specifying the extent to which each of these values contributes to the overall cost of the path. If w_t is larger than w_p, paths that take shorter time will be favored in the planning.
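This trade-off can be illustrated with a minimal sketch (w_t and w_p are our names for the time and path weighting factors described above; the numbers are hypothetical):

```python
def path_cost(t, c_path, w_t, w_p):
    """Weighted combination of travel time and the remaining path criteria."""
    return w_t * t + w_p * c_path

# a fast but long option vs. a slow but short one (hypothetical values)
fast = dict(t=10.0, c_path=120.0)
slow = dict(t=20.0, c_path=100.0)

# time-prioritizing weights favor the fast path...
time_first = path_cost(**fast, w_t=0.8, w_p=0.2) < path_cost(**slow, w_t=0.8, w_p=0.2)
# ...while path-prioritizing weights favor the short one
dist_first = path_cost(**slow, w_t=0.2, w_p=0.8) < path_cost(**fast, w_t=0.2, w_p=0.8)
```

The same cost structure reappears in the proposed planner of section 3.2.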
Obtaining a heuristic is more complex. The heuristic in this case should be a lower-bound estimate on the cost from the start node to a given node. Since the cost is composed of two parts, time and other aspects, lower-bound estimates of both the needed time and the cost according to the other criteria from the start node to that node are needed. To make these two estimates as accurate as possible (so that the search becomes as efficient as possible), Dijkstra's algorithm is run once at the beginning over the configuration space only (without considering the time dimension) to find the heuristic values.
Despite all these complexities, this scheme, the way it is presented in [6], only allows the agent to either move with a fixed speed or stop. Stopping after moving with a certain fixed speed, and vice versa, results in an infinite acceleration, violating the dynamic constraints of a car-like agent. The reason this planner cannot plan for multiple speeds is that the state itself does not include any information about speed. To set a bound on the change of speed, speed has to be included in the state, such that when the planner expands a state, it checks the speed of the agent at this state and selects successors (or predecessors, because this is a backward search) with speeds within a limited range of the speed at the current state.
The most straightforward way of adding speed to the state in the scheme described above is to make the state space four-dimensional, consisting of position, time and speed. This would explode the state space, making the search very time-consuming. In addition, it would create new complexities for this backward search. As the goal states are needed at the start of the search, and as there is no information about the speed of the agent when it arrives at the goal, goal states that include all possible speeds and all possible arrival times at the goal position must be added to the queue at the start of the search. This would create a huge number of goals.
From this, it can be seen that this scheme is unsuitable for applying the desired constraints on the agent. The complexities of this method and its disadvantages, from the perspective of a car-like agent and in the context of the autonomous driving system this path planner should be integrated in, come from several factors summarized here:

State representation: performing the search in the time-configuration space makes it very complex to have control over the speed. In fact, the standard way of adding multiple speeds to time-configuration roadmaps is to make connections between nodes in the roadmap denser along the time axis, such that an agent at a node can arrive at the next time step at a nearby node or at further nodes. The further the node reached in one time step, the higher the speed. Clearly, this dramatically increases the complexity of constructing the roadmap and of planning itself. Even when a dense roadmap allowing multiple speeds is constructed, no rational decision can be made about the speed of the next state, because the speed at the current state is not known to the planner (speed is not part of the state).

Backward search: the use of backward search complicates controlling the speed along the planned path, as it requires entering multiple goal states that cover all possible times and speeds of the agent when it arrives at the goal position. The range of possible values might be large, and taking only part of them exposes the agent to the risk of losing the optimality of the planned path. To set bounds on the change of speed, it is more natural and intuitive to start the search from the start node, where the initial speed of the agent is already known. Successors can then be chosen according to the current speed.
It is important to remember that the reason for using backward search is to allow anytime planning and replanning while the agent is moving, for the reasons mentioned in section 2.1.1. However, the tests done in chapter 2 showed that AD* and D* methods are only efficient for systems where replanning is frequently required. While they expand much fewer nodes, the time needed to expand one node in these algorithms is much higher than in A*. So, for roadmaps that contain a relatively small number of nodes and do not require replanning many times, A* is more efficient. This is the case with the system the designed path planning method will be integrated in. The system only plans for a relatively short horizon of between 100 and 200 meters. Roadmaps created over this distance would not be of a size that justifies the use of AD* or D*. This is especially true given that the system runs periodically, such that no replanning would be needed between successive runs.

Preprocessing: the scheme described above uses a lot of preprocessing (which is supposed to be done offline) so that path planning is done fast online. This includes constructing a large roadmap in the time-configuration space and running Dijkstra's algorithm on the configuration space to obtain heuristics. This is unhandy for a system like the one dealt with here, because road data are obtained periodically from navigation systems, and this preprocessed information is, most probably, not available from them. It would be difficult to do all this preprocessing every time the agent enters a new road and receives new data, because of the very reactive nature of the task.
3.2 A Proposed Efficient Planning
At this point, a new scheme for dynamic planning using incremental search algorithms is proposed. The proposed planner is based on forward search, for the reasons explained in the previous section. First, the idea of constructing the roadmap as needed while the planner is searching is explained. This is combined with a new idea that gives the planner more flexibility in selecting the heading angle of the agent. All of this is done in the configuration space only. The dynamics of the environment are then considered by adding velocity and time to the state space while at the same time limiting the search to three dimensions only rather than four.
3.2.1 Graph Formation and Position Discretization
In the traditional 8-connected grid shown in chapter 2, the agent is at the center of a square cell and can move to the free centers of the eight surrounding cells. While this representation of the environment is efficient and simple, it is clearly not suitable for a car-like agent traversing a road. Such an agent is typically not able to take very sharp turns due to its kinematic constraints. Moreover, the flow of the road itself prevents the agent from having some heading angles that would violate the road protocols. For example, an agent cannot move backward on a road that goes forward.
To solve this issue, road data received from navigation systems usually includes a parameter known as the heading angle. This parameter indicates the forward direction of the road at each point along its center line, as can be seen in Fig. 3.1. In this proposed scheme, the idea is to limit the heading angle of the agent such that it is within a certain range of the road's heading angle at any point on the road.
It is more efficient to build the roadmap while searching rather than constructing a complete roadmap first and then searching over it. For that, a successor function is used. This function takes a position and returns a list of positions that are valid for the agent to reach from the input position. The pseudocode below shows the basic operation of this function.
[Algorithm: pseudocode of the successor function]
The function initializes the successor list as an empty list. If no valid successors are found, the list is returned empty as it is. The function then iterates over a range of values for the deviation angle, from the maximum allowed deviation on one side of the road's heading angle to the maximum allowed deviation on the other side, in fixed angular steps. All successors are equally far from the current node; that is, they all lie on the circumference of a circle centered at the current node. This makes calculations simpler and more efficient while at the same time increasing flexibility. A step-length parameter determines the fixed length of each movement. For each deviation angle, the location of the successor is determined, and then a static collision-checking function checks whether there are any static obstacles between the current node and the successor node. If no obstacle is found, the successor is added to the successor list.
The successor list does not only hold the successors but also the cost of moving from the current node to each one of them (hence no need for an additional cost function). In the simple case of considering only distance in the cost, all successors have the same cost, equal to the step length. If the cost is more complex (for example, if it also considers the deviation angle), they may have different cost values. It is more efficient to calculate this cost in the successor function, because it has access to all information about the current node and the successors. For example, if it were decided to include the deviation angle in the cost, only this function could do that, because it is the only place where this data is known.
The complexity of the collision-checking function depends greatly on the chosen step length. If this value is small enough that any static obstacle between the current node and one of its successors would be touched by that successor (on real roads this can safely be 1 or 0.5 meter), then it is enough to check whether the successor location touches any of the static obstacles in the environment. Otherwise, multiple points should be checked along the line between the node and the successor.
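A possible Python sketch of the Succ function as described (all names except Succ are our own; the obstacle check is a stub standing in for the static collision-checking function):

```python
import math

def succ(x, y, road_heading, dev_max, dev_step, l_step, is_free):
    """Return (position, cost) pairs reachable from (x, y).

    Successors lie on a circle of radius l_step around the parent; the
    deviation angle is swept from +dev_max to -dev_max around the road's
    heading angle in steps of dev_step.
    """
    successors = []
    dev = dev_max
    while dev >= -dev_max - 1e-9:             # small tolerance for float drift
        angle = road_heading + dev
        sx = x + l_step * math.cos(angle)
        sy = y + l_step * math.sin(angle)
        if is_free(x, y, sx, sy):             # static collision check (stub)
            successors.append(((sx, sy), l_step))   # simple distance cost
        dev -= dev_step
    return successors

# in an obstacle-free spot, every sampled deviation angle yields a successor
always_free = lambda x0, y0, x1, y1: True
nodes = succ(0.0, 0.0, 0.0, math.radians(40), math.radians(20), 1.0, always_free)
```

With these example angles (maximum deviation of 40 degrees, a step of 20 degrees), five candidate directions are sampled, as in the example of Fig. 3.2.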
With this successor function, a modified A* algorithm is used, as shown in algorithm 3. This A* version does not allow any node to be expanded more than once. For this to still guarantee an optimal solution, the heuristic has to be consistent, satisfying the condition in (2.2). Luckily, most naturally admissible heuristics (like the Euclidean distance) are also consistent [12]. A consistent heuristic guarantees that an expanded node was reached at the lowest possible cost, and hence it will not be reached at a lower cost later such that it would have to be expanded again.
The proposed A* search enforces two other limitations that make it very efficient. Once a node is expanded, it enters the closed set. It is then not possible to expand this node again, nor to expand any node within the same cell (line 9). More than that, the queue can only hold one node per cell (line 12). A node in the queue is replaced by another node in the same cell only if the cost of reaching the latter from the start (its g-value) is lower (line 13). After successfully finishing the search, the path can be traced through back pointers, starting from the node found in the goal cell.
The cell in which a node is located is calculated very easily:

cell(x, y) = ( ⌊x / l_cell⌋ , ⌊y / l_cell⌋ )    (3.2)

where l_cell is a system parameter determining the length of one cell. The lower this parameter, the higher the resolution of the search, but also the more expensive it is, as will be seen in section 3.4.1.
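In code, mapping a continuous position to its cell indices, as in (3.2), is a plain floor division (l_cell is our name for the cell-length parameter):

```python
import math

def cell_of(x, y, l_cell):
    """Integer indices of the cell containing position (x, y)."""
    return (math.floor(x / l_cell), math.floor(y / l_cell))

cell = cell_of(2.3, -0.7, 1.0)   # third cell along x, first negative cell along y
```

Note that `math.floor` (rather than integer truncation) keeps the mapping consistent for negative coordinates.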
With these restrictions, and under the condition that the heuristic is consistent with respect to the cost, it can be intuitively argued that the resultant scheme returns paths that are either cheaper than or of the same cost as the paths returned by normal uniform-grid schemes, at most at the same computational cost (but in most cases far lower, because the step of constructing the roadmap is omitted). A mathematical proof of this statement, however, is not within the scope of this report.
The operation of the new successor function and the proposed A* search is best explained through an example. Figure 3.2 shows the operation of the successor function (left part) and the nodes A* keeps in the queue and the closed set (right part). At first (first row, left), the algorithm is given a start (orange) and a goal (green) position. The start position is expanded and the successor function is called. The successor function returns five valid successors. These successors are to be added to the queue. However, the modified A* algorithm does not allow more than one node per cell, so only four nodes are admitted to the queue (first row, right). At the same time, the expanded state is added to the closed set, and hence it is no longer possible to add any node to the same cell.


The modified A* then, based on the priorities of the nodes in the queue, expands another state (second row, left) and calls the successor function on it. The function returns only three successors, because the others pass through an obstacle. Since two of them lie in the same cell, only two of the successors are added to the queue (second row, right), while the expanded state is put in the closed set.
A third node in the queue is expanded (third row, left). The successor function returns four valid successor positions. Since three of these positions are in the same cells as other nodes in the queue, and since they do not have a lower cost from the start (g-value), they do not replace the existing ones. In the end, only one successor is added to the queue, and the parent state is added to the closed set (third row, right).
The search algorithm then expands a fourth node (fourth row, left). The successor function finds five valid nodes. Only one node per cell is accepted to the queue, so three nodes are added (fourth row, right). It can be seen that one of the nodes touches the goal cell. The search will end successfully whenever this node is expanded (most probably at the next step of the search).
Comparing the resultant path (shown with dark arrows in the last plot of Fig. 3.2) to one that would result from a typical uniform roadmap method with the same cell size, as seen in Fig. 3.3, demonstrates the merits of the suggested method. It can be seen that the suggested approach expands fewer nodes (magenta), keeps fewer nodes in the queue (orange) and finds smoother and shorter (or cheaper) paths. Another important advantage is that the position of a node (including the start and goal nodes) can be anywhere on the path, while in the uniform sampling approach, only predefined locations are allowed.
It should be mentioned that for this modified A* algorithm to work properly, the step length has to be greater than the diagonal length of a cell (√2 times the cell length). This ensures that all successor positions lie outside the cell of the parent node. Otherwise, successors inside the cell of the parent node would be automatically eliminated by the modified A*, which can result in the failure of the search.
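The queue and closed-set rules described above can be sketched as a compact modification of a standard A* loop (a simplified illustration with hypothetical names, not the report's algorithm 3; a plain integer grid stands in for the successor function):

```python
import heapq

def modified_astar(start, goal_cell, succ, h, cell_of):
    """A* that expands at most one node per cell and keeps at most one
    node per cell in the queue (replaced only on a lower g-value)."""
    g = {start: 0.0}
    parent = {start: None}
    best_in_cell = {cell_of(start): start}     # one queued node per cell
    open_heap = [(h(start), start)]
    closed_cells = set()
    while open_heap:
        _, node = heapq.heappop(open_heap)
        c = cell_of(node)
        if c in closed_cells or best_in_cell.get(c) != node:
            continue                            # stale or superseded entry
        if c == goal_cell:                      # trace back pointers
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        closed_cells.add(c)                     # no second expansion in this cell
        for s, cost in succ(node):
            sc = cell_of(s)
            if sc in closed_cells:
                continue
            new_g = g[node] + cost
            rival = best_in_cell.get(sc)
            if rival is None or new_g < g[rival]:
                g[s], parent[s] = new_g, node
                best_in_cell[sc] = s            # replace the cell's queued node
                heapq.heappush(open_heap, (new_g + h(s), s))
    return None                                 # no path found

# toy usage on a 4-connected integer grid (each point is its own cell)
grid_succ = lambda p: [((p[0] + dx, p[1] + dy), 1.0)
                       for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))]
manhattan = lambda p: abs(p[0] - 2) + abs(p[1])
path = modified_astar((0, 0), (2, 0), grid_succ, manhattan, lambda p: p)
```

With a consistent heuristic such as the Manhattan distance here, skipping already-closed cells does not sacrifice optimality, mirroring the argument made for algorithm 3.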
3.2.2 Collision Checking and Speed Discretization
To be able to avoid dynamic obstacles and at the same time enforce dynamic constraints on the velocity of the agent, a suitable state representation is needed. In the proposed method, the state is represented by (x, y, v, t), where (x, y) is the position of the center point of the agent, and v and t are the speed the agent will be at when reaching this state and the time at which this state is reached, respectively. This is a 4D representation. However, it will be seen that the search is done over three dimensions only.
A new successor function named DynSucc is used to return successor states with this 4D representation, as shown in the pseudocode below. The new successor function is very similar to the one presented in the previous section, but it returns successors with different speeds. The function iterates over several values of the acceleration, starting from the maximum allowed deceleration until reaching the maximum allowed acceleration, returning valid successors for each acceleration value in the different directions determined by the inner loop over the deviation angles.
[Algorithm: pseudocode of the DynSucc function]
For each acceleration value a, the speed v required to achieve this acceleration is calculated (line 4). The calculation assumes that the agent changes its speed instantly after leaving the parent node and keeps this fixed speed until reaching the successor. This assumption is very important in simplifying the collision-checking calculations, as will be seen later. With this, the time needed to reach a successor at the fixed step distance l_step with this speed is:

t = l_step / v    (3.3)

It can be seen that the average acceleration between a node with speed v_n and a successor is then:

a = (v − v_n) / t = v (v − v_n) / l_step    (3.4)

Since all variables are known except v, the equation above is a quadratic equation, v² − v_n v − a l_step = 0, that can be solved to get:

v = ( v_n + √(v_n² + 4 a l_step) ) / 2    (3.5)

Because speed is not allowed to be negative (the agent is not allowed to go backward), only the non-negative solution is taken.
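This calculation amounts to a few lines of code (a sketch with our variable names, assuming the fixed-speed model above):

```python
import math

def successor_speed(v_n, a, l_step):
    """Fixed speed v giving average acceleration a over a segment of
    length l_step, starting from speed v_n (positive root of the quadratic).
    Returns None when the required deceleration admits no real solution."""
    disc = v_n * v_n + 4.0 * a * l_step
    if disc < 0.0:
        return None
    return (v_n + math.sqrt(disc)) / 2.0

def travel_time(v, l_step):
    """Time to traverse the segment at the fixed speed v."""
    return l_step / v

# starting from rest, an average acceleration of 1 m/s^2 over a 1 m segment
v = successor_speed(v_n=0.0, a=1.0, l_step=1.0)
t = travel_time(v, 1.0)
# consistency check: (v - v_n) / t recovers the requested acceleration
```

A negative discriminant corresponds to a deceleration too strong to realize over the segment, which is exactly the "v is not real" case rejected by the function.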
The function checks whether the resultant speed will be real (line 3). If so, it is calculated (line 4) and checked against the speed limits of the agent (line 5). After ensuring that this speed is valid, the time needed to reach successors at the fixed step distance with this speed is calculated (line 7); then successors with this speed and time but different deviation angles are calculated exactly as done in the function Succ. The only difference is how collisions are checked, which happens in a function named isDynFree. It returns true only if it is safe for the agent to go from the current node to the successor with the specified speed. The principle behind this function is explained below.
After iterating through all acceleration values and deviation angles, DynSucc returns a list of successors. The maximum number of returned successors is equal to the number of possible acceleration values times the number of possible deviation angles. Each successor is returned with an associated cost. The cost is composed of a time part and a path part, weighted by weighting factors w_t and w_p respectively, such that w_t + w_p = 1. The path part includes the costs of all factors other than time; in the simplest form, it is only the distance. The weighting factors allow the search function to find paths optimized for either time or other factors, based on what is desired by the user. For example, if w_t > w_p, paths that take a shorter time will be favored over paths that have shorter distances (if distance is what is considered in the path part of the cost). The cost is calculated in the successor function, for the same reasons explained in the previous section.
To ensure that the transition of the agent from a node to a successor is collision-free, a new method based on collision times is used. In this method, the agent as well as the moving obstacles are modeled as moving 2D shapes with different velocity vectors (a velocity vector has a magnitude equal to the speed and a direction equal to the direction of movement). For simplicity, the agent and the obstacles are considered as circles with various radii, as shown in Fig. 3.4.
Suppose that the agent is positioned, at the time of the current node, at p_A, and has a velocity vector v_A and a radius r_A. Suppose also that the environment has one moving obstacle with position p_O at the time of the current node, velocity vector v_O and radius r_O. Assuming that the agent and the obstacle move with fixed speeds, their positions after a time interval t are given by:

p_A(t) = p_A + v_A t,   p_O(t) = p_O + v_O t    (3.6)

Now, if a collision between the agent and the obstacle will happen at time t_c (the collision time with the obstacle), the distance between their centers at that time will be equal to the sum of their radii:

‖ p_A(t_c) − p_O(t_c) ‖ = r_A + r_O    (3.7)

Squaring both sides and rearranging the terms, with Δp = p_A − p_O and Δv = v_A − v_O, leads to:

(Δv · Δv) t_c² + 2 (Δp · Δv) t_c + Δp · Δp − (r_A + r_O)² = 0    (3.8)
which is a quadratic equation with two solutions for t_c. If neither of them is a positive real number, then no collision will ever happen between the agent and the obstacle. If only one of the solutions is a positive real number, then it is the time of collision. Otherwise, the collision time is the smaller of the two solutions.
If the collision time is greater than the time needed to traverse the segment, the agent is safe from colliding with the obstacle while moving from the node to the successor. When the environment contains multiple moving obstacles, collision times are calculated for all of them. It is enough to find one collision time below the traversal time to conclude that the transition is not valid, as done in the pseudocode of the isDynFree function shown below.
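A sketch of the collision-time computation for a single obstacle (our names; isDynFree would evaluate this for every obstacle and compare the result against the traversal time):

```python
import math

def collision_time(p_a, v_a, r_a, p_o, v_o, r_o):
    """Earliest non-negative time at which two fixed-speed circles touch,
    or None if they never collide."""
    dpx, dpy = p_a[0] - p_o[0], p_a[1] - p_o[1]   # relative position
    dvx, dvy = v_a[0] - v_o[0], v_a[1] - v_o[1]   # relative velocity
    A = dvx * dvx + dvy * dvy
    B = 2.0 * (dpx * dvx + dpy * dvy)
    C = dpx * dpx + dpy * dpy - (r_a + r_o) ** 2
    if A == 0.0:
        return 0.0 if C <= 0.0 else None   # same velocity: touching now or never
    disc = B * B - 4.0 * A * C
    if disc < 0.0:
        return None                        # the circles never come together
    roots = [(-B - math.sqrt(disc)) / (2 * A), (-B + math.sqrt(disc)) / (2 * A)]
    valid = [t for t in roots if t >= 0.0]
    return min(valid) if valid else None

# head-on example: closing speed 2 m/s, center gap 4 m, radii summing to 1 m
tc = collision_time((0, 0), (1, 0), 0.5, (4, 0), (-1, 0), 0.5)
```

In the head-on example the 3 m of free space closes at 2 m/s, so the circles touch after 1.5 s; a transition whose traversal time exceeds this value would be rejected.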
[Pseudocode of the isDynFree function]
Usually the agent receives information about the positions and velocities of the moving obstacles before planning, at the time of the start state. So, before calculating the collision times, isDynFree linearly extrapolates the positions of the moving obstacles to the time of the parent node. In addition to that, the radii can be made to grow with time to compensate for uncertainty. This extrapolation step is better done in the successor function itself, since the result is the same for all successors of one parent node. This is what was done in the MATLAB implementation that will be introduced in the next section.
With this successor function and this collision checking technique, the proposed scheme uses a modified ARA* (anytime repairing A*) algorithm to plan a path and construct the search space while planning. The pseudocode of the algorithm is shown below. The algorithm contains three functions: key(), which calculates the priority of a node in the queue; improvePath, which performs an ε-suboptimal search; and main, which initializes the search and manages it. The space over which this algorithm performs the search is three-dimensional, composed of the 2D position and the speed. Time is only used to calculate the cost and the heuristic functions and to check for collisions.
Several optimization ideas are applied in this algorithm, in addition to an anytime feature that allows producing quick suboptimal paths in time-critical situations. The idea of limiting the number of nodes per cell to one node is still applied here. However, a cell is a 2D area over the 2D space, and speed serves as a third dimension in the search space. So, this idea is extended to allow only one node per speed value in a cell. Figure 3.5 shows a visualization of this idea, where every cell carries a pile of layers. Each layer represents a different speed, and only one position is allowed per layer.
It can be noticed that the number of layers grows as the cell gets farther away from the start position. In general, the number of possible speeds at a node grows fast the deeper this node is in the search tree, reaching very high numbers and making the search very time consuming. In many cases, the differences between these speed values are very small. To solve this problem, the algorithm allows only one node per block. The block to which a node belongs is defined as an array of three integers:

block(n) = [ cell(n), floor( v(n) / δ_v ) ]    (3.9)
where cell(n) is defined in (3.2) and δ_v is a system parameter chosen based on the needs of the user. The lower the value of δ_v, the more precise the search is with respect to speed, but also the more expensive it is. Figure 3.6 shows blocks over two adjacent cells. Some of these blocks contain nodes that have been expanded before, and hence no node can be added to any position in these blocks (magenta); other blocks contain nodes in the queue (orange); while others are empty (white).
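The block mapping of (3.9) can be sketched as follows. This is an illustrative C++ sketch (the names blockOf, l_cell and delta_v are assumptions, not taken from the report's code) that maps a node's position and speed to its block index; two nodes whose speeds differ by less than delta_v inside the same cell land in the same block and therefore compete for a single slot:

```cpp
#include <array>
#include <cmath>

// Block of a node (eq. 3.9): the 2D cell index plus a discretized speed
// index. l_cell is the cell length, delta_v the speed discretization.
std::array<int, 3> blockOf(double x, double y, double v,
                           double l_cell, double delta_v) {
    return { static_cast<int>(std::floor(x / l_cell)),
             static_cast<int>(std::floor(y / l_cell)),
             static_cast<int>(std::floor(v / delta_v)) };
}
```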
The algorithm gets a start state and a goal position. It doesn't get a goal state because it is unknown (the speed and time at reaching the goal are unknown, as explained before). The algorithm then starts by initializing the queue OPEN and the CLOSED and INCONS (for inconsistent) sets to be empty (line 24). Three values are also initialized: the g-value of the start node is initialized to zero, v_goal (a variable that holds the minimum cost at which the goal position has been reached) is initialized to infinity, and the inflation factor ε is initialized to its initial value ε_0. ε must be greater than or equal to 1. If it is one, the algorithm will perform an A* search.
The start node is then inserted into the queue (line 26) and a search is performed using improvePath. improvePath has a while loop that continues running as long as the minimum key in OPEN is lower than the minimum found goal cost (initially infinity) and the queue is not empty. It picks the state that has the minimum key in the queue. Since this node is expanded, it is moved to the CLOSED set (line 6), and it is not possible anymore to add any nodes from the same block to the queue.
The valid successors of the expanded state are obtained using the function DynSucc(). A successor s' is added to the queue only if there is no state inside CLOSED that lies in the same block as s'. If that is not the case, then s' is added to the INCONS set if there is no node inside that set that lies in the same block. The INCONS set contains nodes to be reconsidered in later searches, because they may have g-values lower than the g-values of the nodes that were expanded before in the same blocks (hence the name).
If no node was expanded before in the block of s', the successor is considered for admittance to OPEN. First, it is checked whether it lies in the same cell as the goal position. If so, v_goal is changed to the g-value of s', and s' is labeled as the goal state. If there is another node in OPEN that is in the same block as s', it is replaced by s' only if s' has a lower cost from the start node (g-value). If s' is the first state in its block, it is directly inserted into the queue. When the search finishes successfully, the path can be traced by following back pointers starting from the goal state until reaching the start node.
After improvePath returns, the found solution is published and is guaranteed to be ε-suboptimal (its total cost is guaranteed to be less than or equal to ε times the cost of the optimal path). If time is still available and ε has not reached one yet, the following sequence of actions is repeated: ε is decreased by a certain predetermined amount, states in INCONS are moved to OPEN, and all keys in OPEN are updated (the keys have changed because ε changed). Then improvePath is called again.
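The anytime outer loop described above can be sketched as follows. This is a minimal C++ sketch with the search internals abstracted away behind improvePath(); the structure Search, the field names and the fixed ε schedule are illustrative assumptions, not the report's implementation:

```cpp
#include <algorithm>
#include <vector>

struct Search {
    double epsilon;      // current inflation factor (>= 1)
    double epsilonStep;  // amount epsilon is decreased per iteration
    // OPEN, CLOSED, INCONS and v_goal would live here in a real planner.
};

// Placeholder for one epsilon-suboptimal search pass.
void improvePath(Search&) { /* expand nodes until min key(OPEN) >= v_goal */ }

// Repeats searches with decreasing inflation until epsilon reaches 1 or
// the time budget is exhausted; returns the epsilons actually used.
std::vector<double> anytimePlan(Search& s, int timeBudgetIterations) {
    std::vector<double> used;
    improvePath(s);               // quick initial epsilon_0-suboptimal pass
    used.push_back(s.epsilon);
    while (s.epsilon > 1.0 && timeBudgetIterations-- > 0) {
        s.epsilon = std::max(1.0, s.epsilon - s.epsilonStep);
        // Here INCONS would be moved into OPEN and all keys recomputed.
        improvePath(s);
        used.push_back(s.epsilon);
    }
    return used;
}
```

Each pass publishes a solution whose cost is bounded by the current ε, so the planner always has a usable path while it keeps refining.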
It is to be noted that, for this scheme to work, a consistent heuristic with respect to the path cost and the time cost is needed. A simple one would be as shown below, assuming that the path cost only considers the distance. This heuristic estimates the distance to the goal as the Euclidean distance, and estimates the time needed to reach the goal as the time needed to traverse this Euclidean distance at the maximum speed the agent can have. Note that the weighting factors in the heuristic have to be the same as those in the cost. While this heuristic is consistent, it is not very informative. Obtaining a more informative heuristic would be more computationally expensive. However, the proposed anytime planner can quickly find paths even with this not-so-informative heuristic.
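The described heuristic amounts to one line of arithmetic. A hedged C++ sketch (names are illustrative; the weighting factors must equal those used in the transition cost, as noted above):

```cpp
#include <cmath>

// Consistent heuristic: distance part = Euclidean distance to the goal,
// time part = that distance traversed at the agent's maximum speed.
double heuristic(double x, double y, double gx, double gy,
                 double vMax, double w_time, double w_path) {
    double dist = std::hypot(gx - x, gy - y);
    return w_time * (dist / vMax) + w_path * dist;
}
```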
[Pseudocode of the modified ARA* algorithm and the heuristic function]
The resultant scheme that has been shown in this section produces paths that are more consistent with the kinematic constraint of the agent (heading angle) as well as the dynamic constraint (speed). The presented algorithm gives flexibility to optimize the found path with respect to time or other aspects (like distance, closeness to the center line, smoothness, etc.). It is also able to quickly produce suboptimal paths in time-critical situations and continue improving the path as time allows.
The speed of an agent following a path produced by this scheme is a staircase function of time: whenever the agent leaves a state n, it immediately changes its speed to the speed of the following state n' along the path. This section is concluded by discussing the consequences of changing the speed function of the agent to a more linear one. Suppose that when the agent leaves a state along the path, it does not change its speed immediately. Instead, it linearly accelerates until reaching the next state. In this case, we should go back to the successor function and change some calculations. Suppose that we want to check whether n' is a dynamically valid successor of n for some acceleration value a. From the equations of motion, it can be found that:

l_move = v(n) · moveTime + (1/2) · a · moveTime²    (3.10)
All the variables in this equation are known except moveTime. This is a quadratic equation that can be solved easily and has only one valid solution, due to the fact that moveTime cannot be negative:

moveTime = ( −v(n) + sqrt( v(n)² + 2 · a · l_move ) ) / a    (3.11)
The speed that will be reached by the agent applying constant acceleration a from n to the successor n' is then:

v(n') = v(n) + a · moveTime    (3.12)
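Equations (3.10)–(3.12) can be sketched together in a few lines. This is an illustrative C++ sketch (the names accelStep and Transition are assumptions) that solves (3.10) for the non-negative moveTime and then applies (3.12); the a = 0 case falls back to uniform motion:

```cpp
#include <cmath>

struct Transition { double moveTime; double vNext; };

// Given step length l_move, current speed v and constant acceleration a,
// solve l_move = v*t + a*t^2/2 for the non-negative t (eq. 3.11) and
// compute the speed reached at the successor (eq. 3.12).
Transition accelStep(double l_move, double v, double a) {
    double t;
    if (a == 0.0) {
        t = l_move / v;  // uniform motion, assumes v > 0
    } else {
        t = (-v + std::sqrt(v * v + 2.0 * a * l_move)) / a;  // eq. 3.11
    }
    return { t, v + a * t };  // eq. 3.12
}
```

For instance, starting from rest with a = 2 m/s² over l_move = 1 m gives moveTime = 1 s and a reached speed of 2 m/s.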
Now, the position of n' is calculated as in the DynSucc function (line 9) and the validity of the successor is checked. Suppose the transition between n and n' was found to be free of static obstacles, and the collision times with the dynamic obstacles are to be calculated. With the current modeling of the agent motion between the two states, the agent does not move with a constant speed. So, (3.6) and (3.8) are not valid anymore. For the agent and a moving obstacle o, we continue to assume that the obstacle is moving with a constant speed, but the agent now accelerates. Hence:

p_a(t) = p_a + v_a · t + (1/2) · a_vec · t²,    p_o(t) = p_o + v_o · t    (3.13)

where a_vec is the acceleration vector (with magnitude a, directed along the agent's motion).
When substituting this in (3.7), squaring both sides and rearranging terms, the result is a fourth-degree polynomial in t_c that has four possible solutions. Obtaining these solutions analytically or numerically is so computationally expensive that it would significantly affect the execution time of the whole algorithm. After obtaining the solutions, the collision time can be decided as done with the solutions of the quadratic equation in (3.8).
This modeling, however, allows having a much more informative heuristic, where the time needed to reach the goal is estimated assuming that the agent will continue accelerating with the maximum available acceleration until reaching the goal, instead of assuming that it will directly move with the maximum speed. It could be possible, with a small value of l_move and some assumptions about the environment, to use this motion modeling (linearly accelerating from a node to a successor) and still check the collision time assuming that the agent will move from n to n' with a fixed speed equal to that reached at n'. This idea, however, is not further investigated in this report.
3.3 Simulation Environment
A simulation package, very similar to the one discussed in section 2.2, was written. Because of the strong similarity, this section only discusses the main differences. The main aim of this package is to provide a quick and reliable way to validate and test the ideas presented in the previous section, mainly using fixed-step simulation. A block diagram of the main components of the package is shown in Fig. 3.7.
The main difference is the use of a new structure named searchProblem. This structure holds all the information needed to do the search, including the grid, the moving obstacles and the known information about them, and the successor and heuristic functions, as well as system parameters like the weighting factors of the cost and heuristic, the initial inflation factor, and the discretization parameters. An empty version of this structure can be created using the function createEmptySearchInfo. The different elements of the search problem can then be added to this empty structure.
The grid is still represented with the structure grid, similar to the one seen in section 2.2, with a slight difference. Now, when creating a grid using the grid formation functions, an input representing the length of one cell inside the grid in meters has to be provided. This input does not have to be the same as l_move. However, it is necessary to determine which position lies in which cell (which index in the obstacles matrix) and hence be able to check for static obstacles. The grid is used just as a simple means of representing the shape of the environment and the positions of the static obstacles. It doesn't play any other role beyond that, unlike in the simulation package seen in section 2.2, where the search itself was done over the cells of the grid. Here, the search and the discretization of movement are completely independent of the grid cells. In a practical situation, other representations of the environment might be more efficient (as will be seen in the C++ implementation presented in the next section).
Two readily available grids are provided. The first one is oneRoadGrid. It represents a one-direction road that goes from left to right, as can be seen in Fig. 3.9. The cell length of this grid (which, again, has nothing to do with the discretization parameter l_move) is one meter, meaning that the smallest possible static obstacle is a 1 m × 1 m square. The second grid (oneRoadGrid2) has the same dimensions as the first one, but the length of one cell is twice as large (so the obstacles matrix is half the size of that of the first one, and the smallest possible static obstacle is a larger square). It was used to ensure that the search is independent of the number of cells in the grid, which was successfully confirmed. Note that the direction of this road is always to the right (θ_road = 0 all over the road). This makes it simpler to do the planning and use the successor functions discussed earlier. To allow this package to deal with curved roads, however, a way must be found to encode the information about the forward direction of the road at any specific location in the grid (which is not done in the current version of the package).
The package includes two different static obstacle checking functions. The function isFree returns true simply when the center position of the agent (which is modeled as a circle) is in a cell that does not contain a static obstacle in the obstacles matrix. This makes the calculation very fast. However, it doesn't guarantee that the complete agent circle doesn't touch any static obstacle. isFreeR guarantees this by checking the positions of the four extreme points (two horizontal and two vertical) of the agent circle as well as the center position. However, it takes considerably longer.
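The isFreeR-style check can be sketched as follows. This is an illustrative C++ sketch (the original is MATLAB; cellBlocked stands in for the obstacles-matrix lookup and is an assumption) that tests the center and the four axis-aligned extreme points of the agent circle:

```cpp
#include <functional>

// Returns true when the center and the four extreme points (left, right,
// top, bottom) of the agent circle all lie in obstacle-free cells.
bool isFreeR(double x, double y, double r,
             const std::function<bool(double, double)>& cellBlocked) {
    const double pts[5][2] = {
        {x, y}, {x - r, y}, {x + r, y}, {x, y - r}, {x, y + r}
    };
    for (const auto& p : pts)
        if (cellBlocked(p[0], p[1])) return false;
    return true;
}
```

Note that five point checks bound the circle but do not cover its whole area; for cells larger than the agent radius this is a common, conservative-enough approximation.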
The successor function succRightCSV is a direct implementation of the DynSucc function presented in the previous section. In this function, θ_road is assumed to be always 0, and hence it is only suitable for roads where the forward direction is rightward. The heuristic function hEuclideanMaxV is also a direct implementation of the heuristic shown in the previous section. Both the successor and the heuristic functions take as inputs a state as well as an instance of the searchProblem structure. At the same time, handles of the functions are kept in searchProblem, hence the double arrows in the block diagram.
Moving obstacles are modeled as circles with different radii and fixed velocities and are saved in a matrix named agents inside the structure, where each row of the matrix represents the information of one obstacle. An obstacle can be added using the function addAgent, which takes the initial position, radius, speed and direction (angle). Positions, speeds and other parameters of the moving obstacles can be modified during the fixed-step simulation by directly accessing the matrix or by using functions like updateAgentsBounce and updateAgentsRepeat. Both of them change the current position of the obstacles by extrapolating their initial positions to the current time in the simulation. The first function "bounces back" any obstacle touching the borders of the grid, by changing its direction of motion to the reflection of the angle at which it touched the border. The second one brings the obstacle, when it goes out of the grid, back in from the opposite side of the grid with the same direction of motion. The package also allows adding moving obstacles with zero speed. In this case, they are effectively static obstacles.
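The wrap-around extrapolation used by updateAgentsRepeat can be sketched in one dimension. This is an illustrative C++ sketch (the original is MATLAB; the function name and the 1D simplification are assumptions) of extrapolating an obstacle's initial position to the current simulation time on a road of length L, wrapping around at the borders:

```cpp
#include <cmath>

// Position of an obstacle at simulation time t, starting from x0 with
// constant velocity vx, wrapped onto a road of length L ("repeat" mode).
double extrapolateRepeat(double x0, double vx, double t, double L) {
    double x = std::fmod(x0 + vx * t, L);
    return x < 0.0 ? x + L : x;  // std::fmod keeps the sign of the dividend
}
```

The "bounce" variant would instead reflect the velocity at the borders rather than wrapping the position.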
The search functions in this package are implemented in a completely different way than in the first package. This change is primarily due to the fact that, unlike the 8-connected grid-based planning done in the first package, the maximum number of nodes to be considered in the search here is far greater than the number of cells in the grid. So, g-values cannot be saved in matrices of the same size as the obstacles matrix. Instead, a single matrix is used to keep all the search nodes and the associated information, including keys, g-values, back pointers, etc. Each row in this matrix represents one node, and hence the number of nodes that can be considered is not limited in any way. Two search functions were implemented in the package: ARAStar, which is a direct implementation of the algorithm shown in the previous section, and AStar, which is the same but without the anytime feature (it performs the search for only one value of the inflation factor, which does not change during the search). A path is extracted in a very similar way to what was done before, by the function tracePath. The results of the search are saved in the familiar searchInfo structure, together with some other useful information like the execution time of the search.
The executable test script oneRoadTests is where everything is wrapped together. The script initializes a searchProblem and calls a search function to find a path. oneRoadTests runs a fixed-step simulation whose time step can be determined by the user. At each step, the locations of the moving obstacles are updated. If any change happens (detected by the function isChanged) such that the trajectories of the obstacles are changed, the search function is called to start searching from scratch again. Graphing is done via a function named plotD in a similar way to what was done in the previous package. However, the current graphing function is optimized to make animation very fast by changing the locations of the drawn objects at each time step rather than deleting and redrawing them. The animation loop is run inside the test script itself, and no separate function is provided for that.
3.4 Results and Analysis
Various tests were done to explore the behavior, advantages and limitations of the proposed scheme. A C++ implementation of the modified A* algorithm shown in section 3.2.1 was written and tested over real road data from DLR. In addition to that, MATLAB simulations using the package presented in the previous section were done to validate the ideas shown in section 3.2.2.
3.4.1 C++ Implementation and Tests
A C++ implementation of all the ideas presented in section 3.2.1 was done. The code was written to be used with environments described by a usual format of road data that can be obtained from navigation systems. In this format, the road is described by position points along its center line. From these points, the curvature of the road can be known. For each point along the center line, there is a value for θ_road (the road forward direction) and a parameter indicating the center-line distance from the start point of the road. Combined with the width of the road, this representation has all the information needed to plan a path. The forward direction of the road at any point can be easily obtained by finding the closest point on the center line and getting its associated θ_road.
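The closest-point lookup described above can be sketched as follows. This is an illustrative C++ sketch (CenterPoint and its fields are assumed names, not from the implementation) that scans the center-line samples linearly and returns the forward direction of the nearest one:

```cpp
#include <limits>
#include <vector>

struct CenterPoint { double x, y, theta; };  // one center-line sample

// Forward direction of the road at (x, y): the theta of the closest
// center-line sample (linear scan; a spatial index would be faster).
double roadDirectionAt(double x, double y,
                       const std::vector<CenterPoint>& centerLine) {
    double best = std::numeric_limits<double>::infinity();
    double theta = 0.0;
    for (const auto& p : centerLine) {
        double d2 = (p.x - x) * (p.x - x) + (p.y - y) * (p.y - y);
        if (d2 < best) { best = d2; theta = p.theta; }
    }
    return theta;
}
```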
Data from a real test road at DLR was used to test the implementation. The road can be seen in Fig. 3.8. Planning was done from several start points to ensure that the search method works properly. The implemented method is able to plan from any start location on the road to any goal location. The plan, if found, guarantees that the complete agent circle stays within the road. The current implementation does not consider any static obstacles other than the road borders. However, once a suitable representation of static obstacles within the road is decided, the code can be easily extended to avoid them with minimal additional computational cost.
Table 3.1 shows execution times as well as other important information about the search, including the number of nodes inserted into OPEN, the number of expanded nodes (nodes inserted into CLOSED), the number of nodes in the found path, and the length of the found path (in meters). These metrics are shown for a specific search problem, where the agent starts from the green circle shown in Fig. 3.8 and plans 100 m ahead to the red circle. The execution times are calculated by averaging the times of one thousand executions to get more reliable results. Note that these times were measured by running the code on a desktop operating system. In a practical situation where the code runs on an embedded processor with a real-time operating system (RTOS), the execution times would be much shorter. The cost function used is the Euclidean distance added to the absolute value of the deviation angle from the road direction (θ_road), which causes the planner to favor paths that closely follow the center line of the road (this can be seen in the resultant plan shown in Fig. 3.8). The heuristic used is the center-line distance between a location and the goal, which is simply calculated as the difference between two values of the distance parameter that comes with the road data.
l_move  time (ms)  # OPEN  # Expanded  # Path  Path length  Path cost

l_cell = 0.5:
0.4     —          1       1           —       —            —
0.5     1.392      729     694         202     100.5        108.005
0.6     1.615      682     543         168     100.2        101.564
0.7     2.858      875     856         144     100.1        101.671
0.8     2.645      819     797         127     100.8        101.847
0.9     2.184      731     688         113     100.8        101.673
1.0     1.820      693     577         101     100.0        101.047

l_cell = l_move / 2:
0.5     11.842     2346    1735        202     100.5        101.373
1.0     1.752      693     577         101     100.0        101.047
2.0     0.266      174     149         51      100.0        100.873
3.0     0.170      108     108         35      102.0        103.745
The upper part of the table explores the effect of having different ratios between l_move and l_cell. It can be seen that when l_move is shorter than l_cell, no path can be found at all. Only one node is expanded, and when the successor function is called, it returns the successors, but none of them is accepted into the queue, as they all lie in the same cell as the parent. When l_move is the same as l_cell or just longer (second and third rows of the table), a path could be found, but it can be seen that the number of nodes inside OPEN is lower than in the cases where l_move is higher (at 0.7 and 0.8); this is because many of the nodes are rejected from OPEN for the same reason mentioned above. In some cases, this might prevent the planner from finding a path, but here it did not, most likely because of the position of the start node. However, it can be seen that the cost of the path when l_move = 0.5 is much higher than the length of the path. This indicates that only successors with large deviation angles were accepted into OPEN, and others were rejected. To prevent nodes from being initially rejected from OPEN because they lie in the same cell as their predecessors, l_move must be sufficiently larger than the cell size, as explained in section 3.2.1.
At l_move = 0.9 and 1.0, the numbers of nodes inserted into OPEN and of expanded nodes decrease again. This is because one step along the planned path with a larger l_move gets the agent closer to the goal than one step with a smaller value. It can be noticed that, unexpectedly, when the resolution of the planned path is decreased to l_move = 1.0, the found path has a smaller length than when the resolution is higher (at 0.8 and 0.9). This is because of two facts: 1) the length of the path is always a multiple of l_move (path length = (number of nodes in the path − 1) · l_move); 2) the cost is composed of the distance and the deviation angle. So with l_move = 0.8 or 0.9, the smallest number of steps needed to reach the goal at minimum cost made the path longer. It can be seen, however, that for higher resolutions (l_move = 0.6 or 0.7), the goal cell can be reached with shorter distances, because a higher resolution allows more curvature in the planned path and offers more flexibility, as the length of the path is a multiple of a smaller number. It can be concluded that choosing l_move to be double the length of the cell offers a good tradeoff between allowing as many successors as possible per node to be included in the search and, at the same time, keeping the search efficient.
In the lower part of the table, l_cell is fixed to be half the length of one movement. The effect of changing l_move on the efficiency and quality of the search can then be seen. It can be noticed that as the value of l_move decreases, the execution time increases exponentially, as do the numbers of included (those inserted into OPEN) and expanded nodes. This suggests that l_move should be chosen as long as possible with respect to the desired resolution of the road. It can be seen that when l_move is doubled from 0.5 to 1, the execution time decreases by a factor of almost seven. At the same time, the length of the resultant path decreases by 0.5 m, which means that computational time is saved at almost no cost in quality. The cost and path length at l_move = 0.5 are higher than at l_move = 1.0 because, with smaller values of l_move, the planner is able to produce paths that follow the center line closely, which requires more deviations and more distance. As l_move is further increased, the execution time still drops significantly, but the length of the found path is larger and the cost is higher. This is primarily due to the fact that, with larger values of l_move, the planner is less able to find paths with precise curves such that the center line is followed, and small deviations have greater effects on the trajectory of the agent. This shows that the suggested scheme offers a way to increase the "maneuverability" at the cost of computational complexity.
3.4.2 MATLAB Simulation Results
The ideas presented in section 3.2.2 were verified through MATLAB simulations using the package described in section 3.3. Two search algorithms were implemented and tested. One of them is the modified ARA* algorithm presented in section 3.2.2, and the other is the same method but without the anytime feature, as mentioned before. At first, the behaviors of the algorithms were tested using fixed-timestep animations, which can be created in the package for any searchProblem provided by the user. Figure 3.9 shows instances of one of these animations at different points in time. At the shown instances, the agent changes its previously planned path as changes are observed in the trajectories of the moving obstacles. The agent starts with an initial speed and moves to a goal near the end of the road. Note that the 2D figure does not show any information about the timing of the path. Thus, although the shown path may go through some obstacles, the car does not collide with any of them because, as it moves, the obstacles also move.
Tests were also done to see the effect of changing the weighting factors of the cost and the heuristic. Figure 3.10 shows the paths and speed profiles that result from running an inflated A* search with fixed parameters and a fixed initial speed. Note that an inflated search is used in the tests presented in this section because running a completely optimal A* search takes a very long time. In the upper part of Fig. 3.10, the paths when the search optimizes time (