Humans trivially perform complex manipulation tasks, such as picking up a tool from a cluttered toolbox, or grabbing a book from a shelf by re-arranging occluding objects. For us, as humans, these tasks seem to be routine, and they do not require much thought. Yet, for robots, this is not the case. Such complex manipulation tasks require symbolic reasoning to decide which objects to re-arrange so as to reach a target object with motion planning to account for the geometric feasibility of the discrete actions. This interaction between symbolic reasoning and motion planning is the subject of integrated Task and Motion Planning (TMP). Single-robot TMP has been an area of active research. Yet, current approaches do not naturally account for the capabilities afforded by the presence of multiple robots, such as tasks that can be decomposed in a variety of ways, task allocation involving many robots, or collision avoidance among robots and between each robot and the workspace. A naive extension of single-robot TMP approaches to the multi-robot case would have to treat the multi-robot system as a combined set of single-robot system, which becomes computationally intractable as the number of robots increases.
TMP finds applications in a variety of areas. In search and rescue, locating/picking an object of interest may require re-arranging cluttering objects in debris. This presents two main challenges: (i) the amount of debris to be re-arranged before being able to reach and pick the target object is unknown beforehand, (ii) the debris should be safely re-arranged so that it poses no further impediment to reaching the target. In this paper, we address the above two challenges from a multi-robot perspective, that is, how multiple robots can interact to achieve reaching a target object in clutter while (i) and (ii) hold. As for challenge (i), in scenarios such as search and rescue, one may not always be interested in finding optimal solutions. Therefore, currently we are not concerned with optimality in terms of the number of object re-arrangements or load-balancing among the involved robots. To tackle challenge (ii), in this work we assume the availability of a safe region where the objects to be re-arranged are placed so that they pose no further challenge to reach the target object.
We present an approach for multi-robot integrated task and motion planning which first allocates tasks to the available robots, and then plans via AND/OR graphs a sequence of actions for the multiply decomposable tasks that are optimal with respect to an available utility function for the multi-robot system. As a case in point, we focus on re-arranging clutter in manipulation tasks using multiple manipulators. Specifically, we consider a cluttered table-top where different target objects are to be picked by the robots by re-arranging occluding objects, as in Fig. 1. This requires task allocation among the available robots, followed by a TMP method to complete the allocated tasks. It is noteworthy that to achieve each task, that is, picking a target object, the number of cluttering object re-arrangements is not known beforehand. Though off-the-shelf Planning Domain Definition Language (PDDL) [mcdermott1998AIPS] based planners are available for task planning, one needs additional expertise to incorporate task-motion interaction that comply with the state-space search of the planner. Moreover, integrating multi-robot capabilities still remains a challenge. As it will be discussed in Section 3, we address these challenges by encoding the task-level abstractions efficiently and compactly within an AND/OR graph that grows iteratively to a tree.
2.1 Related Work
Single-robot TMP is an area of research which has received considerable attention in the literature, and spanning two main areas— TMP for manipulation [erdem2011ICRA, kaelbling2013IJRR, srivastava2014ICRA, toussaint2015IJCAI, dantam2016RSS, garrett2018IJRR] and TMP for navigation [stilman2008IJRR, munoz2016RAS, lo2018AAMAS, jiang2019IROS, thomas2019ISRR]. These approaches combine symbolic-level action selection of task planning with the continuous trajectory generation of motion planning [lagriffoul2018RAL]. Yet, such methods do not consider the implication of having multiple robots in task allocation and collision avoidance, and would have to treat the multi-robot system as a combined single-robot system, which becomes intractable as the number of robots increases.
Current TMP approaches for multi-arm robot systems focus on coordinated planning strategies, and consider simple pick-and-place or assembly tasks. As such, these methods do not scale to complex manipulation tasks [rodriguez2016IROS, umay2019ROSE]. TMP for multi-arm robot systems in the context of welding is considered in [basile2012RCIM], whereas [umay2019ROSE] discusses an approach for multi-arm TMP manipulation. However, the considered manipulation task is a simple pick-and-place operation involving bringing an object from an initial position to a goal position, where two tables and a cylindrical object form the obstacles. A centralized inverse kinematics solver is employed in [mirrazavi2018IJRR]. Motion planning for a multi-arm surgical robot is presented in [preda2015IROS] using predefined motion primitives. Yet, a fine tuning of such primitives towards real experimental platforms remains a challenge.
TMP for multi-robot systems has not been addressed thoroughly, and therefore the literature is not sufficiently developed. Henkel et al. [henkel2019IROS] consider multi-robot transportation problems using a Task Conflict-Based Search (TCBS) algorithm. Such an approach solves a combined task allocation and path planning problem, but assigns a single sub-task at a time and hence may not scale well to an increased number of robots. Interaction Templates (IT) for robot interactions during transportation tasks are presented in [motes2019RAL]. The interactions enable handing over payloads from one robot to another, but the method does not take into account the availability of robots and assumes that there is always a robot available for such an handover. Thus, while considering many tasks at a time this framework does not fare well since a robot may not be immediately available for an handover. A distributed multi-robot TMP method for mobile robot navigation is presented in [thomas2020STAIRS]
. However, they define task-level actions for a pair of robots and therefore optimal solutions are available for an even number of tasks, and only suboptimal solutions are returned for an odd number of tasks. Moteset al. [motes2020RAL] present TMP-CBS, a multi-robot TMP approach with sub-task dependencies. They employ a CBS method [sharon2015AI] in the context of transportation tasks. Constructing a conflict tree for CBS requires the knowledge of different constraints which depend on the sub-task conflicts, for example, two robots being present at a given location at the same time. However, in the table-top scenario considered in this paper, the number of sub-tasks is not known beforehand. The capabilities of the discussed methods are summarized in Table 1.
2.2 Task-Motion Planning and AND/OR Graphs
Task planning or classical planning is the process of finding a discrete sequence of actions from the current state to a desired goal state [ghallab2016book].
A task domain can be represented as a state transition system and is a tuple where:
is a finite set of states;
is a finite set of actions;
such that ;
is the start state;
is the set of goal states.
The task plan for a task domain is the sequence of actions such that , for and satisfies .
Motion planning finds a sequence of collision free configurations from a given start configuration to a desired goal [latombe1991robot].
A motion planning domain is a tuple where:
is the configuration space;
, for collision , else ;
is the initial configuration;
is the set of goal configurations.
A motion plan for finds a collision free trajectory in from to such that for . Alternatively, a motion plan for is a function of the form such that and , where is the configurations where the robot does not collide with other objects or itself.
TMP combines discrete task planning and continuous motion planning to facilitate efficient interaction between the two domains. Below we define the TMP problem formally.
A task-motion planning with task domain and motion planning domain is a tuple where:
, maps states to the configuration space;
, maps actions to motion plans.
The TMP problem for the TMP domain is to find a sequence of discrete actions such that , and a corresponding sequence of motion plans such that for , it holds that (i) , (ii) , and (iii) .
We now provide a brief overview of AND/OR graphs [darvish2020hierarchical, karami2020task]. An AND/OR graph is a graph which represents a problem-solving process [chang1971AI].
An AND/OR graph is a directed graph represented by the tuple where:
is a set of nodes;
is a set of hyper-arcs.
For a given AND/OR graph , we have that , where is a many-to-one mapping from a set of child nodes to a parent node. In that sense, a hyper-arc induces a logical and relationship between the child nodes/states, that is, all the child states should be satisfied to achieve the parent state. Similarly, a single parent node can be the co-domain for different hyper-arcs . These hyper-arcs are in logical or with the parent node. Nodes without any successors or children are called the terminal nodes. The terminal nodes are either a success node or a failure node. We now mathematically define an AND/OR graph network, a detailed treatment of which can be found in [karami2021arxiv].
For an AND/OR graph , an augmented AND/OR graph is a directed graph represented by the tuple where:
with being the virtual node;
An AND/OR graph network is a directed graph where:
is a set of augmented AND/OR graphs ;
is a set of transitions such that ;
where is the total number of graphs in the network. Alternatively, is also the depth of the network.
2.3 Problem Definition
We present a framework which (i) allocates manipulation tasks to a set of robots, and (ii) plans for a sequence of motions-actions for each robot to achieve the respective tasks. To this end, we consider a multi-robot setting with a cluttered table-top wherein different target objects need to be picked by re-arranging the clutter. Thus, each manipulation task corresponds to picking up a target object.
In order to carry out each task, objects need to be re-arranged – depending on the clutter degree – and the number of objects to be re-arranged is not known beforehand. By representing the task-level abstractions within an AND/OR graph network (see Section 3), we can overcome this issue.
3 Multi-robot Task and Motion Planning
In this Section, we detail our multi-robot TMP method. We begin by describing a heuristic approach providing a rough estimate of the objects to be re-arranged to pick a target object. The approach allows us to define a combined utility function for the multi-robot system to perform task allocation. Allocated tasks are then carried out using a TMP method based on an AND/OR graph network and a sampling based motion planner.
3.1 Obstacles Selection
An overview of the method can be seen in Figure 2. First, it finds different feasible plans to the target, each one corresponding to different grasping angles, by ignoring all the obstacles in the workspace. This is done by discretising the set of graspable angles, that is (the axis is located at the center of the target). It is noteworthy that we consider only side grasps to make the scenario more challenging. Among the available plans, the maximum and minimum grasping angles, namely and , are then obtained. The method then constructs two lines starting at the center of the target object and towards the robot with the corresponding angles and . The lines are terminated when there are no more obstacles along their paths. The end-points are then joined to form a triangle. This triangle is then enlarged on all the three sides by the radius of the bounding volume sphere of the end-effector. The objects within the constructed triangle are the objects to be re-arranged to facilitate the target grasp.
We note that what we describe here is an approximate method to identify the objects to be re-arranged. The actual set of objects depend on other such factors as the degrees of freedom of the robot, the size of the links and the end-effector, or thedegree of clutter. As it will be discussed later, for all practical purposes we are interested only in an approximate measure so as to perform multi-robot task allocation.
3.2 Task Allocation
Let be the number of available robots, and the number of tasks to be allocated, that is, we have target objects to be picked, such that . Task allocation is performed offline and we assume that each task, that is, picking up a target object from clutter, is performed by a single robot, and that each robot is able to execute only one task at a time. We also recall here our assumption from the Introduction that the objects to be re-arranged to reach and pick a target are placed in a safe region. Our task allocation strategy falls under the Single-Task, Single-Robot, Time-extended Assignment (ST-SR-TA) taxonomy of Gerkey and Matarić [gerkey2004IJRR], since the multi-robot system contains more tasks than robots. In order to allocate tasks, we define as the utility function for a robot executing a task . In this work, utility is inversely proportional to the number of object re-arrangements required to grasp the target object. To determine such a measure, we first (randomly) select a target object , and then for each robot we run the obstacles selection algorithm described above. For each , such a run returns the set of objects to be re-arranged to reach . Let us denote this set by (and by for the th task). The robot whose set is of minimum cardinality (i.e., maximum utility) is then allocated task . For the next target this step is repeated. We note here that is computed offline and returns the number of objects to be re-arranged by robot to execute task . However, it may be the case that some objects appear in both and , that is, . Since each robot executes one task at a time, can execute only after having performed . Thus, during the execution phase, may have already removed the common objects while executing , and therefore these objects may be ignored to avoid intra-robot double counting while computing the utility offline.
Once each robot is allocated a task, the set of remaining tasks is completed only after the execution of the assigned tasks. For the remaining tasks, the inter-robot or robot-robot double counting must be considered. Reasoning in a similar manner for intra-robot double counting, the set restricted to for the remaining tasks may have common objects with respect to restricted to of the assigned tasks. Thus, the total number of objects to be re-arranged for robot to execute task is
where denotes the cardinality of a set,
with the terms and modeling the intra-robot and inter-robot double counting, respectively. We therefore have the following utility function
The maximum utility of is therefore achieved when no object re-arrangement is required to execute task , that is, . Using the taxonomy in [korsah2013IJRR], we thus have In-schedule Dependencies (ID) – the effective utility of an agent for a task depends on what other tasks that agent is performing as well as Cross-schedule Dependencies (XD) – the effective utility of an agent for a task depends not only on its own task but also on the tasks of other agents.
We now define the combined utility of the multi-robot system, which consists of maximising
From (4) and (5) we see that the robot with the minimum number of object re-arrangements for a given task is thus assigned the maximum utility. In case of a tie, we select the robot which has not been allocated any task. If all the robots with the same utility have been allocated tasks already, or if none has been allotted, then a robot is selected randomly.
3.3 Task Decomposition
Each manipulation task is decomposed into a set of sub-tasks which correspond to pick-and-place tasks, that is, re-arrangement of the objects that hinder the target grasp. As seen above, the number of sub-tasks for a given task are not known beforehand. Moreover, multiple decomposability [zlot2006phd] is possible since the clutter can be re-arranged in different ways. We seek a decomposition minimizing the number of sub-tasks for the multi-robot system. This can be achieved during task allocation since utility is computed based on the obstacles selection method. In this work, we consider complex task decomposition [zlot2006phd] – a multiply decomposable task for which there exists at least one decomposition that is a set of multi-robot allocatable sub-tasks. Though in this work we ignore the multi-robot allocatability property, this can be incorporated trivially. For example, let us consider the case where two robots have the same utility to perform a task . In this case, the sub-tasks can be equally divided to achieve multi-robot allocatability or one robot may be selected randomly (or depending on previous allocation) to perform the entire task.
3.4 Task and Motion Planning
PDDL [mcdermott1998AIPS] is the de facto standard for task planning, and most TMP approaches resort to it. Integrated TMP requires a mapping between the task space and the motion space. Semantic attachments are used in [dornhege2009SSRR, dornhege2009ICAPS, dantam2016RSS, thomas2019ISRR] to associate algorithms to functions and predicate symbols via external procedures. However, it is assumed by these approaches that the workspace be known in advance. Moreover, they reduce motion space to a finite space since the robot configuration, the grasp poses, and other geometrical constraints need to be specified in advance. This issue is addressed in [garrett2020ICAPS] using streams, which enable procedures within PDDL to sample values of continuous variables, therefore encoding an infinite set of actions. Though off-the-shelf PDDL planners are available, one needs additional expertise to incorporate semantic attachments or streams complying with the planner semantics. Furthermore, for the cluttered table-top scenario, the number of objects that need to be re-arranged is not known beforehand. As a result, one would need observation based re-planning within the PDDL framework, which require additional mapping between the PDDL action space and the observation space of the robot [bertolucci2021TPLP].
A cluttered table-top scenario in the context of single-robot TMP is considered in [thomas2020IRIM, karami2021arxiv]. To address the above mentioned challenges associated with PDDL modeling, in [karami2021arxiv] the task-level abstractions of the TMP problem is efficiently and compactly encoded within an augmented AND/OR graph (see Section 2) that grows iteratively until solutions are found, where each iteration is meant at managing a sub-task as defined above.
In general, AND/OR graphs are constructed offline. However, the number of objects to be re-arranged depends on the clutter degree and is not known beforehand. [karami2021arxiv] introduces AND/OR graph networks, consisting of augmented AND/OR graphs which iteratively deepen at run-time till a solution to the modeled problem is found. The key idea is that the set of task-level abstractions, i.e., the states (nodes) and actions (hyper-arcs) of the AND/OR graph defined for the robot remain the same irrespective of the number of re-arrangements, i.e., sub-tasks. Therefore, one can define an initial AND/OR graph encoding task-level actions, which is augmented with the initial workspace configuration. The augmented graph is then iteratively expanded with the updated workspace configuration as long as sub-tasks are carried out resulting in a network of AND/OR graphs. In this work, we use this AND/OR graph network based task planner and customize it for multi-robot task planning. For motion planning, we simply use MoveIt [sucan2013moveit], which supports RRT [kuffner2000ICRA] from OMPL [sucan2012RAM]. The motion planner is first employed to execute the obstacles selection algorithm. Once the tasks are allocated, the motion planner is called to (i) achieve the re-arrangement of each sub-task identified by the task planner (i.e., the AND/OR graph network) and (ii) to grasp the target object.
3.5 The Multi-robot Task-Motion Planning Loop
Fig. 3 depicts the overall system’s architecture of our multi-robot TMP method. Once the tasks have been allocated, Task Planner selects the abstract actions whose geometric execution feasibility is checked by Motion Planner. The Task Planner layer consists of the AND/OR Graphs module – the initial augmented AND/OR graphs for the robots, and the Network Search module – the search procedure iterating the initial augmented graphs. As discussed in Section 3.4, the initial augmented AND/OR graph consists of the task-level actions for each robot augmented with the current workspace configuration. AND/OR Graphs provides a set of achievable transitions between the states to Network Search, and receives the set of allowed states and transitions as the graph is expanded. Task Planner then associates each state or state transition with an ordered set of actions in accordance with the workspace and robot configurations. The Knowledge Base module stores the information regarding the current workspace configuration, that is, the objects and their locations in the workspace as well as the robot configuration. This module augments the graphs with the current workspace configuration to facilitate Network Search. Since we consider here two robots and their respective configurations need to be updated, we have two Network Search modules corresponding to each robot.
TMP interface acts as a bridge between the task planning and the motion planning layers. It receives action commands from Task Planner, converts them to their geometric values (for example, a grasping command requires various geometric values such as the target pose or the robot base pose), and passes them on to Motion Planner to check motion feasibility. To this end, the module retrieves information regarding both the workspace and robots from Knowledge Base. If an action is found to be feasible, it is then sent for execution. Upon execution, Task Planner receives an acknowledgment regarding action completion and the Knowledge Base is updated accordingly.
4 Experimental Results
|Activity||Average [s]||Std. Dev. [s]|
|Motion planner (attempts)||18.320||6.450|
|Motion execution (attempts)||6.3126||0.8176|
|Motion planner (time)||0.8010||0.2170|
|Motion execution (time)||4.7490||2.0240|
We validate and demonstrate the performance of our multi-robot TMP approach by performing experiments in the state-of-the-art robotics simulator CoppeliaSim [coppeliaSim], employing two Franka Emika manipulators.
As seen in Fig. 1, we consider a cluttered table-top scenario where manipulation tasks correspond to picking up different target objects.
For each task, the sub-tasks consist of removing objects hindering each target grasp.
In this work, such objects are picked and placed outside the working area, in a safe space.
The number of objects to be removed is unknown (and therefore likewise the number of sub-tasks) at planning time.
Experiments are conducted on a workstation equipped with an Intel(R) core email@example.com GHz 12 CPU’s and 16 GB of RAM.
The architecture is developed using C++ and Python under ROS Kinetic. A video demonstrating the results can be found at
We begin with 6 objects in the workspace with two of them being target objects. To test the scalability to an increasing number of objects, we perform experiments with up to 64 objects. For a given number of objects the experiment is conducted 3 times, and in each experiment the target objects are chosen randomly. Table 2 reports the average combined planning and execution times for robots and , focusing on the architecture’s modules. AND/OR graph and graph network search times are per number of grown graphs. It must be noted that motion planning failures due to actuation errors or grasping failures lead to re-plans, and therefore to a larger number of motion planning attempts.
Table 3 report the average network depth , the average total task planning time, the average total motion planning time, the average number of motion planning attempts, and the average number of objects to be re-arranged for robots and , respectively. As the AND/OR graph network depth increases, task planning times are almost linear with respect to . This is so because for an AND/OR graph network with each graph consisting of nodes, the time complexity is only [karami2021arxiv]. In contrast, PDDL-based planners are characterized by a search complexity of , where for the table-top scenario.
|Objects||d||TP [s]||MP [s]||MP attempts||Objects re-arranged||TP [s]||MP [s]||MP attempts||Objects re-arranged|
Fig. 4 shows different histograms for an increasing network depth . In particular, Fig. 4(a) shows the total task planning time with . One can readily observes that the linearity in planning time is not strictly followed. For example, the time for is greater than the time for . In many cases, due to motion planning failures a new graph is expanded before reaching the terminal node, which implies that more nodes are traversed for compared to , therefore explaining the variations. Fig. 4(b) plots the number of motion planning attempts and the total executions with an increasing . An increase in in most cases correspond to a higher degree of clutter. Therefore, as depth or the graphs increase the motion planning attempts increases as well. However, motion planning failures can also increase the depth since a new graph need to be expanded. This explains the slight deviation in the trend. Fig. 4(c) reports the total motion planning and execution times with an increasing , and the plot readily follows from the discussions above.
We present a task allocation approach for integrated multi-robot task and motion planning capable of handling tasks with an unknown number of sub-tasks. Our obstacles selection strategy combined with the introduced utility function allocates task to the available robots. Allocated tasks are then achieved by our TMP method, which comprises an AND/OR graph network based task planner that is robust to the issue of an unknown number of sub-tasks. We show that our method is capable of handling varying degree of clutter, i.e., the number of objects in the workspace. Currently, the objects hindering each tasks are removed from the workspace. As a result, subsequent tasks need not consider theses objects while computing the utility. There may be situations in which removing objects from the workspace will not be possible, and the planner will have to select free space for object placement. Future work will seek to develop a non-myopic technique to predict possible future trajectories based on the previous ones and the current environment configuration so as to enable the selection of smart object placements.