Conditional Task and Motion Planning through an Effort-based Approach

11/04/2017 ∙ by Nicola Castaman, et al. ∙ Università di Padova 0

This paper proposes a Conditional Task and Motion Planning algorithm able to find a plan that minimizes robot efforts while solving assigned tasks. Unlike most of existing approaches that replan a path only when it becomes unfeasible, the proposed algorithm takes into consideration a replanning every time an effort saving is possible. The effort is here considered as the execution time but it is extensible to the energy consumption. The computed plan is both conditional and dynamically adaptable to the unexpected environment changes. Authors prove the completeness and scalability of their proposal.



There are no comments yet.


page 4

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

I Introduction

Let a human assign a task to a robot. e.g., the pick of a can of coke from a cluttered table or the resolution of a Navigation Among Movable Obstacles (NAMO) [1, 2] problem (see Figure 1). In order to achieve these targets, the robot needs to fulfill high-level task planning in conjunction with low-level motion planning. As stated in [3], efficient algorithms exist to solve task and motion planning problems in isolation; however, their integration is still challenging in terms of generality, completeness, and scalability.

Authors provide a Task and Motion Planning (TAMP) system which combines a Fast-Forward (FF) task planner [4] and a revisited version of the Lazy Kinodynamic Motion Planning by Interior-Exterior Cell Exploration (L-KPIECE) motion planner [5], as proposed in [6, 7]. The goal is to give the possibility to any type of robot to use this system for solving any type of task, from manipulation to navigation, in an unknown, real-world scenario. The implementation aims to reflect the human behavior: humans take plans while efficiently managing their time and energy. In detail, given the actions that the robot can perform, together with its initial and final states, one feasible task plan, namely a reachability graph, is computed by using FF (no optimality check is required). The corresponding motion plan is generated by using [7]. Based on the knowledge of the robot world at its current state, a collision checking is performed so that the final plan would take into consideration only those objects and sub-tasks that minimize robot efforts (the execution time in this case, but the energy used by the robot can also be considered). The robot starts moving. While acting, if failures or environment changes occur (e.g., objects are detected which obstruct the computed path), an online replanning routine is adopted. Starting from the robot current state, it regenerates the remaining task plan by evaluating and substituting those actions which preconditions are no longer feasible. Starting from the new task plan, the algorithm proposed in [7] is invoked to find a new feasible set of motions.

Fig. 1: A Navigation Among Movable Obstacles problem: while navigating towards a goal, the robot takes into consideration the possibility of moving movable obstacles.

Completeness. Acting in the real world is non-deterministic: the planner has no complete and certain knowledge of the environment and actions can have unpredictable effects. This means that a robot should be able to dynamically adapt to changes and efficiently recover from failures while avoiding the explosion of the computed reachability graph because of the new alternatives introduced. In the case of analysis, starting from a relax task, i.e., a plan which ignores the delete lists of operators, FF extracts an explicit solution to the plan by using a search strategy called enforced hill-climbing (EHC). This strategy does not use backtracking, meaning that some parts of the search space are lost, but if FF fails to find a solution using EHC by getting stuck in dead-ends, it switches to standard best-first searches (e.g., greedy or A* search) which expands all search nodes by increasing order of goal distance evaluation. This recovery routine guarantees completeness. As stated in [5]

, KPIECE is, instead, probabilistically complete. Moreover, the proposed planner handles known and unknown events through the online replanning routine until a maximum number of attempts is reached. This means that if the number of attempts tends to infinity, the probability of finding a plan, if one exists, will tend to one.

Scalability. The task planning problem is exponential in the number of sub-tasks and the motion planning problem is exponential in the number of collision objects populating the workspace. Indeed, multiple sub-tasks exist fulfilling the assigned task and different combinations of sub-tasks can bring to the same result. Moreover, when operating in the clutter, a robot has to decide which objects to move, where to place them, if moving them is necessary or convenient. Focusing on the task-level planning, FF employs an helpful actions pruning that maintains in the relaxed plan only those actions that are really useful in a state, so one can restrict the successors of any state to only those produced by members of the respective relaxed solution. Focusing on the motion-level planning, [7] is based on KPIECE because KPIECE uses the information collected during the planning process in order to decrease the amount of forward propagation it needs [5]. Moreover, by using the Lazy strategy, edges of the computed path are not immediately checked for collision: they are checked only when a better path to the goal is found. By eliminating the majority of collision checks and efficiently forward propagating the exploration during the planning, the proposed algorithm speeds up to convergence in complex problems, like that of TAMP, where collision checking is relatively expensive. It also requires restrained runtime and memory so that it can handle high dimensional systems with complex dynamics.

Ii Related Work

Various researchers investigated the problem of combining task and motion planning. Most of them tried to combine the symbolic reasoning of task planning with the geometric reasoning of motion planning. Dornhege et al. [8, 9], for instance, proposed a system that calls the motion planner only to check the geometric feasibility of the planned tasks. Another example is the FFRob approach proposed by Garrett [10, 11]: it integrates geometric information with the state-of-the-art Fast-Forward (FF) [4] task planner by sampling a fixed set of object poses and robot configurations and then planning with them by using FF. Benefits of FFRob are its probabilistic completeness and exponentially convergence. Its major limit is its focus: the approach solves only a particular class of pick-and-place problems. Moreover, it does not use planning to guide sampling.

Other approaches such as [12, 13, 14] still focus on manipulation tasks but they propose solutions based on Hierarchical Planning that evaluate task-level decisions by using low-level geometric-reasoning modules. In particular, Srivastava [12] combines off-the-shelf task planners with an optimization-based motion planner [15, 16]

that exploits a heuristic function to remove potentially-interfering objects. This approach first plans a task and then tries to produce a motion plan that satisfies the computed set of discrete actions. If the induced motion planning problem is infeasible, the task planning is repeated taking into consideration the set of preconditions that identifies the infeasibility.,

Finally, Dantam et al. [3] propose an incremental task and motion planner which combines discrete decisions about objects and actions with geometric decisions about collision free motion: they use an incremental constraint solver that adds motion constraints to the computed candidate task plan. The task plan is computed by using a Satisfiability Modulo Theories (SMT) approach, while the Open Motion Planning Library (OMPL) is used to find a feasible motion plan. At each failure, the algorithm iteratively increases the plan depth and motion planning timeouts such that it guarantees probabilistically completeness for fixed placements and grasps.

Instead of only focusing on manipulation domains, the proposed approach aims to integrate task and motion planning routines while performing generic tasks, that means robots should be able to use this system in order to handle both navigation and manipulation domains. Moreover, while most of the state-of-the-art approaches evaluate the objects relocation only when free-space motion planning is unfeasible, the proposed algorithm revalues a plan every time an action can save effort, not only when the ongoing trajectory becomes unfeasible.

Iii Problem Statement

This Section defines the Deterministic Task (III-A) and Motion Planning (III-B) problems. These concepts will be the definition basis of the Conditional Task and Motion Planning authors will introduce in III-C.

Iii-a Deterministic Task Planning

Suppose the assignment of a task to a robot . A task planner aims to find an optimal plan solving . moves from its start state to a goal state by combining the set of actions that is able to perform according to its capabilities.

The problem is deterministic if the actions domain is fully observable and every action is fully defined as a sentence in the Planning Domain Definition Language (PDDL) [16] with a set of preconditions and a set of effects, described as conjunctive lists of literals in first-order logic.

computes a set of plans , where is defined as

and iff is satisfied by and brings to .

A plan is optimal if it has the lowest cost among all the computed plans:

is the cost of action being executed in state .

Iii-B Deterministic Motion Planning

A motion planner tries to find an optimal path that lets move from to while avoiding collisions. The problem is deterministic if the working space is fully observable. In this case, MP can find a set of paths , where is a path in the free space:

is optimal if its trajectory is of minimum length:

Iii-C Conditional Task and Motion Planning

Suppose the existence of a Task and a Motion Planner (See III-A and III-B). Suppose that is assigned to . finds the optimal plan performing and returns the optimal trajectory executing . The solution is optimal if is of minimum cost:

where is the cost of the trajectory necessary to perform the i-th planned action. Without loss of generality, in this paper the solution is optimal if it minimizes the robot’s effort: and . The effort is defined as the execution time.

The problem is deterministic if the actions space is a priori fully defined and each action is executed infallibly. However, in the real world actions can generate unexpected effects and the robot can perceives changes at its surroundings. This means that the outcomes of environment and actuation actions should be processed in order to address uncertainties due to partial observability at the time of offline planning [17]. The definition of is unchanged but the way used to find it is new: the plan should handle every known condition through the definition of a reachability graph and an online recovery procedure should handle unexpected events by combining sensing and actuation actions and minimizing the global cost of the computed path.

Iv Algorithm

Fig. 2: Algorithm Pipeline. The algorithm computes a candidate Task Plan (TP), explores the best branch, and computes the corresponding candidate Motion Plan (MP). It checks for collisions in the candidate Motion Plan. If a collision exists, it generates a new subtask to handle the collision object, it replans, and updates the reachability graph. If there is no collision, it updates the reachability graph with the MP. It repeats the pipeline until the goal state is reached.

A task is assigned to a robot . asks to reach a goal state . The complete set of actions that can perform and its initial state are known. Actions are expressed in PDDL. Starting from this information, Algorithm 1 is applied.

Let: - be the set of plans found by recursively applying the TP (FF) when an online replanning is necessary; - be the set of objects encountered during the robot motion; - be the reachability graph having as edges the actions of and as nodes the states of . More in detail, every edge models the motion path used to execute that action. Initially, all these sets are empty.

Algorithm 1 starts by applying the TP (FF) and computing one sequence of possible actions letting accomplish : . The plan may not be optimal. Starting from , Algorithm 2 initiates with the set of actions to be performed and the list of states that are consequently reached.

Given the sequence of actions to be performed, [7] (a variation of the Lazy KPIECE) is used to compute the motion path connecting, when possible, every couple of states of through a motion trajectory not checked for collisions. The feasibility of the connection is checked in terms of actions’ preconditions and effects. The aim is driving the robot towards the shortest path, in terms of Euclidean distance to the goal (see [7]). If the environment is unknown, is computed until the last visible state. If expects the execution of a navigation action, is computed in terms of the mobile base reference system. If it expects a manipulation action, combines the motion of the base with that of the manipulator robot. Moreover, [18] is used to detect all possible 2-fingers grasps and a geometric variation of it should be used to detect 3-fingers grasps.

Starting from , the sequence of control inputs letting perform the trajectory is computed. This sequence lets deduce the effort , in terms of costs, needed by the robot to perform motions. In the case in analysis, the algorithm takes into consideration the execution time (see  [7]).

Input: : Start state; : Goal state; : Set of actions that can do
Output: : Path of minimum cost letting execute the best plan
= {};
  // set of plans
= {};
  // list of encountered obstacles
1 Initialize an empty reachibility graph ;
2 ;
3 .pushBack();
4 ;
5 ;
6 ;
7 ;
8 = 0;
9 = {};
10 Traverse();
11 return ;
Algorithm 1 TMP algorithm
Input: : the set of plans
Output: : the reachability graph of P
1 foreach state  do
2       if  then
3             .children[].pushBack();
4             .parent[].pushBack();
Algorithm 2 RGraph()
Input: : the node to be expanded
1 while (.hasChild()) && !(MaxAttempts reached) do
2       ;
3       while  has new  do
4             ;
5             if  then
6                   .pushBack();
8            ;
9             foreach  do
10                   ;
11                   foreach  do
12                         if  then
13                               foreach  of  do
14                                     if  then
15                                           ;
16                                           ;
23      ;
24       ;
25       if  then
26             if  then
27                   ;
28                   ;
29                   return;
31            else
32                   Traverse()
Algorithm 3 Traverse(): Expand to find the best

Being the robot at its current state (i.e., at the very beginning), the graph expansion starts (see Algorithm 3). Given , every node and every edge are checked for collisions. Authors remember that already contains the best plan performing the assigned task: the one which motion path has minimum Euclidean distance to the goal. Let the robot be at state , for every new collision detected in the space of movable obstacles , a state is sampled. Based again on the set of feasible actions , their preconditions, and effects, is extended by adding the task plans from/to (it is a new iteration of FF).

The algorithm evaluates all the task plans that connect the robot current state to the current plan passing through . The related motion paths are evaluated too. In detail, starting from the original path , if the new path has a cost less than , the algorithm continues the exploration of this branch until reaching or until exceeding . In this case, continuing this road is no longer convenient and the exploration must focus on other branches. The exploration proceed until a better solution is found, all children have been visited, or the maximum number of attempts has been reached. This means that is revalued not only when unfeasible, but every time better solutions exist with respect to the selected one. For example, once an obstacle is found and a path avoiding it is computed, the algorithm evaluates both the action of avoiding the object and the one of manipulating it. It then selects the alternative of minimum effort, that in the case in analysis means the one requiring the minimum execution time.

Figure 2 depicts the proposed algorithm pipeline, starting from the desired task assignment to the TAMP solution. The online replanning routine is also depicted; its details follow.

Iv-a On-Line Replanning

(a) TP calculates a sequence of actions that achieves the assigned task. MP computes the corresponding sequence of motions.
(b) During the execution of the motion plan, the robot perceives a new obstacle.
(c) TP samples new states and tries to connect them to the existing plan. If a connection exists, a new sequence of actions is generated, and the less onerous branch is expanded.
Fig. 3: Online Replanning: a robot perceives a new obstacle while trying to pick up a known object. On the left of each figure, the plan of the Task Planner (TP) is depicted. On the right, the Motion Planner (MP)’s search space is visible.

Suppose the robot is executing a planned trajectory , coming from an action , and in the meanwhile it continuously perceives its surrounding. While acting in the real world, unpredictable faults could happen. E.g., the manipulated object may fall down from the gripper or an underestimated effort may be used to move the object itself. Moreover, a new obstacle blocking the way could be perceived by sensors. In all these situations, a task and path replanning is required. It should consider the new updated knowledge of the robot’s world.

The online replaning procedure exploits the same method described above to add a new sub-task to the reachability graph. Indeed, if the ongoing action fails, a new sub-task bypassing the faulty state is generated. The sub-task is computed by the task planner using the set of actions that the robot is able to perform and it is added to the reachability graph. This sub-task is naturally connected to the next non-faulty action node, so that the approach does not need to perform a total replanning from the actual state to the goal state . In the end, the function is invoked in order to try to find a path that minimizes the execution time. A number of attempts is chosen a priori: if no plan is found and totally executed within those attempts, the system outputs a failure.

Figure 3 presents a possible scenario where a robotic hand has to pick up an object. The Task and Motion Planner elaborates a plan to achieve the assigned task in which the robot moves near the object and then picks it up, as in Figure 2(a). During the execution of the Move action a new object that prevents the robot to accomplish the movement is perceived by the sensors (see Figure 2(b)). The online replanning generates a new sub-task corresponding to the obstacle replacement and adds it in the reachability graph, as in Figure 2(c). The algorithm explores the graph and expands the less onerous branch that, in this example, corresponds to the elusion of the obstacle. As expected, the subsequent Pick action is exactly the same for both suitable branches.

Iv-B Conditional Planning

Suppose is executing a trajectory , coming from an action , and meanwhile it is perceiving its surrounding. Contemplated events may occur. E.g., the robot has to manipulate an object but its gripper is not empty or the object is occluded. From the literature, these events are treated by Condition Planning routines. In the case in analysis, they are already handled by the planner through the construction of the reachibility graph. No replanning procedure is involved, just the right sequence of actions is chosen (see Figure 4). This approach’s capability justifies the choice of authors to call the proposal as a Conditional approach.

Fig. 4: Conditional Planning: the robot has to pick an object. The figure depicts the reachability graph generated in accordance with the satisfied preconditions. On the left, there is the sequence of actions to be performed when the gripper is empty. On the right, there is the sequence of actions to be performed when the gripper is not empty and the object it is holding has to be placed down before executing a new pick.

V Experiments

Figure 5 depicts the use cases authors are studying at the time of the submission. Figure 4(a) shows a mobile manipulator robot trying to solve a NAMO problem. Figure 4(b) shows the same robot trying to pick up an occluded can of coke from a cluttered table. The robot can perform four different actions:

  • ;

  • ;

  • ;

  • .

If is not given as input, randomly samples it on the free space and does the same on a flat surface in the neighborhood of the manipulated object.

Experiments aim to prove:

  1. the adaptability of the algorithm when dealing with a perceived workspace;

  2. the effectiveness of weighing paths based on the effort done.

Authors consider the effort as the time spent and they aim to prove that the obtained solution is the fastest one.

Vi Conclusion

Authors presented a new algorithm able to solve a Task and Motion Planning problem through an effort-based approach. The effort is the time spent to accomplish the task and the algorithm finds the plan that can be executed in the shortest possible time. The non-determinism of the real world is faced by providing a Conditional Planning and a recovery routine that handles unexpected events and new scene detections. The proposed algorithm is complete and scalable.

Authors expose some use cases whose implementation is still in progress. They aim to prove the adaptability and effectiveness of the proposed approach.

(a) A mobile manipulator robot trying to solve a NAMO problem.
(b) The same robot trying to pick up an occluded can of coke from a cluttered table.
Fig. 5: Use cases.