Combined task and motion planning (TAMP) breaks down complex sequential manipulation problems into two separate processes: high-level decision-making and motion planning [8, 6, 21, 19, 14, 17, 23, 5, 15]. A high-level decision (or action) is represented by symbolic abstractions that describe its pre- and post-conditions. Given a goal (usually symbolic), a decision sequence is first found, and then a feasible motion trajectory is computed for the robot(s) to realize the given task. However, when the task requires executing a long sequence of high-level actions, and there is a large set of objects to consider in the environment, the combinatorial complexity of those problems makes them computationally intractable. This is exacerbated when a team of robots has to carry out such tasks together.
In essence, developing an efficient TAMP solver for long-horizon tasks remains a challenging problem [22, 9]. There are two main approaches: (i) integrating heuristics to reduce the symbolic search space, and (ii) decomposing the problem into solvable subtasks. Prior studies on TAMP use heuristics [12, 25, 24, 13], some of them leveraging geometric information acquired from the environment to improve planning performance [3, 16]. However, these methods rely on calling the motion planner to compute heuristic cost values, which deteriorates their performance as the required action length increases. In , geometric abstractions, which do not require using the actual motion planner, are utilized for effective task planning. However, the motion planning problem is simplified since the tasks require 2D navigation in maze-like environments. In , a relaxed geometric reasoning approach enables pruning geometrically infeasible actions. However, most of the previous works that used heuristics for TAMP have focused on smaller problem instances, which either required shorter action sequences to be solved [1, 10] or contained few objects in the environment .
Another technique to handle the challenges of long-horizon TAMP is iterative planning or problem decomposition. Such planners plan and/or execute only a portion of the task at each step. The benefit of this approach is that large-scale problems are partitioned to computationally tractable subproblems. Previous works have tackled those tasks by defining several subgoals that have to be fulfilled to complete an entire long-horizon task [11, 22, 18], or receding horizon approaches are adopted, where in each planning iteration, an action sequence of fixed length is planned , and only a subset of those actions of each plan is executed . However, these methods either rely on task-specific subgoal definitions , or focus on single robot pick-and-place tasks , while lacking a computationally efficient integrated TAMP approach that can be applied on various long-horizon settings, including the dynamic formation of modular robots.
We tackle the problem of solving long-horizon robotic manipulation tasks by proposing a heuristics augmented TAMP approach. Our method builds on logic-geometric programming (LGP), which formalizes the relation between discrete decision variables and a continuous nonlinear mathematical program (NLP), that represents the underlying continuous world problem [28, 27, 7]. However, the prior LGP approaches have the drawback of exploring geometrically infeasible action sequences, since the symbolic search over discrete actions ignores geometrical constraints of the scene. In addition, solving the entire problem at once is computationally difficult not only due to the combinatorial complexity of tasks involving many objects but also because motion planning of a long sequence of actions requires solving an NLP with a large set of kinematic constraints. To address those challenges, this work introduces the following contributions:
a novel geometry-based heuristics integrated LGP approach for solving long-horizon TAMP problems,
a receding horizon formulation that further improves solving complex sequential manipulation problems
We demonstrate the effectiveness of the proposed approach in various settings all requiring solving complex long-horizon tasks (Fig. 1). These tasks include picking goal objects (single or multiple) that requires climbing a large set of stairs (up to 64) by snake-like locomoting robots (crawlers, Fig. 0(a)), rearrangement tasks in which heterogeneous robots have to autonomously form novel kinematic structures to carry out manipulation actions (Fig. 0(c)), and a complex parkour setting where the crawler has to find ingenious movement strategies to reach the goal while avoiding obstacles (Fig. 0(d)).
Ii-a Logic Geometric Programming
This work is based on Logic Geometric Programming (LGP),[26, 27] as the underlying framework for solving TAMP problems. We will now briefly outline the optimization formulation of LGP that is given by eq. (1).
Let be the configuration space of rigid objects and articulated joints with initial condition . The goal of LGP is to optimize a sequence of symbolic actions and states and the corresponding continuous path in configuration space to achieve a symbolic goal . Position, velocities and accelerations are denoted with . The space of symbolic states and actions is discrete and finite, and the transitions are determined by a first-order logic language (similar to PDDL).
For a fixed action and state sequence and , the functions are continuous and piecewise differentiable, and the optimization of the path is a nonlinear program (NLP). Therefore, the main idea behind LGP is to combine a discrete search on the symbolic level (i.e., find a sequence of symbolic states that leads to a goal) with joint nonlinear optimization to compute paths that fulfill the constraints and to assert whether a symbolic action sequence admits a continuous realization.
Ii-B Multi-Bound Tree Search
The discrete variables of the LGP formulation in eq. (1
) induce a decision tree that contains sequences of symbolic states, starting from. The leaf nodes where are potential candidates for a solution. Each node can be tested for feasibility by solving the NLP induced by the sequence of states from the root to the current node. However, solving the continuous path problem is expensive and the number of candidate NLPs is generally too high. To alleviate this issue, Multi-Bound Tree Search (MBTS)  first solves relaxed versions of eq. (1). The feasibility of each relaxed problem is a necessary condition for the feasibility of the original NLP, i.e., acting as lower bounds, whilst being computationally simpler. In this work, we use the two bounds: (joint optimization of the mode-switches of the sequence), and (full problem).
Iii Informed Tree Search with Novel Geometric Heuristics
Iii-a Limitations of standard MBTS for solving LGP
The running time of the MBTS algorithm is determined by the number of calls to a nonlinear optimizer to solve NLPs (that could be either feasible or infeasible). The interleaving between symbolic search and continuous optimization of paths can be achieved in two different ways: either by (i) solving NLPs only for the sequences of symbolic actions that lead to the goal or (ii) solving NLPs of intermediate subsequences to guide the symbolic search. Both approaches are inefficient in large symbolic spaces. With approach (i), the goal-reaching sequences have never been informed about the geometry and most of the candidate action sequences are infeasible, while in approach (ii) most of the computed NLPs correspond to subsequences that do not lead to the goal.
Iii-B Heuristics-enriched LGP
In this work, we propose a heuristic-based formulation that guides the symbolic search and prunes geometrically infeasible solutions. Our heuristics are based on fast geometry checks and are an order of magnitude faster than solving intermediate NLPs. The heuristic-based LGP involves two components: action-specific heuristics
, which incorporate geometric information to estimate the costs of performing a given action in a certain state, and theinterface for heuristics which integrates the action-specific heuristics into the LGP planning algorithm.
An interface for heuristics
The optimal (but intractable) cost-to-go of a node in the LGP-tree is defined as the cost of the LGP problem conditioned on the current sequence of symbolic states (the path from the root to the node), i.e. solving (1) with fixed , taking infinite value if the conditioned LGP problem is infeasible.
To inform symbolic search, we propose an LGP-interface for heuristics that uses a cost-to-go function, which assigns a heuristic cost-to-go value to each node expanded in the search tree. This heuristic value can be computed from the current symbolic state , from the previous symbolic action (i.e. transition ) or from the whole subsequence starting at the root . To label a node of the tree as infeasible, the cost-to-go function assigns an infinite cost that prevents future expansions. We define a problem-specific cost-to-go function upon initialization of the LGP problem. For example, we define the cost-to-go function that assigns a infinite cost-to-go value to any action sequence that includes a robot picking up an object which is out of its reach (see details below). The new cost-to-go is used to inform the multi-bound tree search in two steps: choosing which node to expand next and choosing which nodes’ NLP to optimize next.
The complete RHH-LGP algorithm is shown in Alg. LABEL:alg:lgp. Multi-bound tree search with geometric heuristics corresponds to the inner loop (with our contribution highlighted in pink).
algocf[!htb] In short, the multi-bound tree search uses three queues: SymbolicSearchQueue (nodes discovered during tree search); SequenceNLPQueue (candidate nodes for solving the bound sequence NLP); PathNLPQueue (candidate nodes for solving the full path NLP). The tree search performs the following steps iteratively until a path is found: new nodes are discovered, nodes that reach the symbolic goal are moved to the queue of candidates for sequence NLP, and nodes with a feasible sequence NLP are moved to the path queue.
Geometry-based reachability check
The heuristic cost-to-go is computed based on the symbolic actions and states, but the underlying geometric functions require a kinematic configuration. Since calling an NLP solver to compute such configuration would be too expensive, the geometry-based reachability checks are performed using the initial kinematic configuration of the continuous state space. The reachability checks that we use compare the distances between objects with the reach of the robots, which in our case is defined by the crawler robot length. This strategy can be applied when the objects in the environment are immobile, e.g., stairs, floor, and tables. The idea is illustrated in Fig. 1(a), where the distance of the target to the base is longer than the length of the extended crawler, from which it follows that picking the object immediately will be infeasible.
Iii-C Heuristics for Solving Long Horizon Tasks with Modular and Crawler Robots
In this section, we present action-specific heuristics that help to formulate cost-to-go functions for solving long locomotion and manipulation tasks with two types of modular robots: the crawler and the mobile-base:
crawler: snake-like robot, composed of 7 joints and two end-effectors. The end-effectors are spherical and can be used to provide stable support for walking and for touch-based picking and placing. In addition, two crawler modules can connect at their end-effectors to form one large crawler.
mobile-base: cube-shaped robot, able to move freely in the xy-plane. The crawler robots can connect to the mobile base in order to form a mobile manipulator robot that can in turn pick and place objects.
Given a symbolic action, an action-specific heuristic function returns a cost-to-go value that approximates how the remaining costs to reach the goal change when executing that action. We present the following heuristics (Table I), that can be combined as building blocks to create novel cost-to-go functions.
The pickHeuristic, Fig 1(a), checks if the object to be picked is within reach. To this end, the heuristic checks if the target object is within reach of a crawler robot that stands on the given base object , returning infinite costs if it is too far for picking. Similarly, the placeHeuristic checks if the target surface for placing is within reach.
The stepHeuristic, Fig. 1(c), checks if the object , which the robot is standing on before the step is within reach of the target object . If the step is feasible, the heuristic returns the amount by which the distance of the robot to its goal of navigation will be reduced by the step’s size, effectively providing guidance for navigation planning.
The connectHeuristic, Fig. 1(b), checks if the two robot modules that are to connect, are within reach of each other. Furthermore, the heuristic prunes decision sequences that contain contiguous disconnect and disconnect actions. Similarly, the disconnectHeuristic prevents the disconnection of modular robots that have been connected in the prior action, as such a decision sequence will never be optimal.
|step & stepTogether|
Iv Receding horizon-LGP
With the guidance of our heuristic framework, LGP can solve manipulation tasks that require very long action sequences (up to 50 actions, cf. Table III). In this section, we present a further improvement of this framework, which performs iterative planning.
In the standard version of LGP (Sec. II) paths in configuration space (for a fixed sequence of symbolic actions) are optimized jointly, i.e., considering all steps of the manipulation sequences. This strategy does not scale efficiently to very long sequences, as NLP solvers require more iterations to converge, and get often trapped in infeasible local optima, in comparison to shorter manipulation sequences. To tackle this issue, we propose a receding horizon formulation to optimize paths, that we call RHH-LGP. Given a full-length candidate sequence of actions to reach the goal, the RHH-LGP path optimization uses only a predefined number of actions as a planning horizon and iteratively plans motions for that horizon (between 3 and 10 actions in our experiments). Once the motions for a horizon are executed, the optimization time window is shifted, and we optimize motions and actions for the subsequent part of the task, keeping the previous path and actions fixed. See Alg. LABEL:alg:lgp, where the contributions related to the receding horizon formulation are highlighted in blue. With this receding horizon formulation, RHH-LGP can be applied directly to dynamical environments, where planning and execution run iteratively until the symbolic goal is achieved. If the environment changes during the execution of the current horizon, a new action sequence is computed from the current symbolic and kinematic state. This strategy is especially useful when the long action sequence can be decomposed into smaller steps, for example in locomotion tasks. One of the advantages of this approach is that path infeasibility arising from earlier actions in the sequence can be detected efficiently.
|LGP + Action-specific heuristics||LGP + heuristic||LGP + No Heuristics|
|Time in s.||# Tree nodes||# Expanded||Sol. len.||Time in s.||# Tree nodes||# Expanded||Sol. len.||Time in s.||# Tree nodes||# Expanded||Sol. len.|
|Note: scenarios without metrics could not be solved at all|
We report five different tasks that we used to compare the performance of our LGP solver111See supp. video for task demonstrations. The source code for the experiments can be found on the author’s website..
One crawler climbing to reach a distant target: In this scenario, the crawler robot has to climb stairs to touch a distant target (Fig. 0(b)).
Two crawlers climbing to reach a distant target: This scenario is similar to the first one, but with two robot modules in the scene. This means that a possible solution is that the crawlers connect before climbing the stairs to reach the target, taking bigger steps than a one-module crawler, as illustrated in Fig. 3.
Climbing with multiple goals: This task is a variation of the previous task with two targets that have to be reached simultaneously, which makes it a multi-goal task. Furthermore, the crawlers initially stand far away from each other and have to walk towards each other in order to connect (Fig. 0(a)).
Modular robot pick-and-place task: This setting includes one crawler robot and one mobile base robot, initially unconnected (Fig. 3(a)), and several objects that must be transferred from their initial position to a target tray. However, a crawler that holds an object cannot walk, and the solution of this problem involves that the crawler module connects to the mobile base, forming a mobile manipulator (see Fig. 4).
Obstacle avoidance task: A single crawler robot is supposed to navigate towards a target, which is at the end of a series of obstacles (Fig. 5): a gap in the ground that is covered by a barrier, and a barrier that must be climbed.
Scenario 2 is used to benchmark our heuristic framework against two different variations of the standard LGP solver. 1 and 2 are used to highlight the benefits of modular robots. 3 and 5 demonstrate the scaling of our approach to more complex scenarios, including multiple goals. Scenario 4 showcases yet another modular robotics capability that involves dynamic reconfiguration of heterogeneous robot modules. Lastly, we evaluate the impact of the RHH-LGP solver in all scenarios 1 - 5.
In our implementation, the reachability check is a fast geometric check that slightly underestimates the true size of all reachable regions. Although this turns the heuristic inadmissible and one could potentially prune feasible solutions, we did not observe this issue in practice and our experiments demonstrate it is accurate and informative.
V-a The impact of action-specific heuristics
To measure the impact of the action heuristics on solving TAMP problems, we consider task 2 (Fig. 3) and compare the performance of the planner using the action-specific heuristics with the performance of the classic LGP solver and another baseline version of the solver using the result of the bound optimization problem as a heuristic for symbolic search. In the experiment, we compare the planning performance of the three solvers for several numbers of stairs in the scene, to assess the scalability of our approach. The results of the simulation are shown in Table II. We highlight that:
For , only the MBTS solver that used the action-specific heuristics can solve the task. This underlines the impact of the presented heuristics on the planning performance.
Using the action-specific heuristics, instances, which are 16 times bigger than the largest problem instance that is tractable for the other two planners, can be solved.
The heuristic solver expands the smallest number of nodes, but this solver is slower in doing so than the action-specific heuristics solver for .
Just as the action-specific heuristics that we present, the solver considers geometric information during symbolic planning already, but it uses the motion planner of LGP to estimate the feasibility of the action sequence. The results illustrate that the geometry-based heuristics that we present may not be as accurate as the actual motion optimizer, but are much faster and thus more efficient at planning than the optimizer upon each node expansion.
The action-specific heuristics solver enabled to plan action sequences that were up to 3-times longer than the solver and 5-times longer than the standard MBTS solver without any heuristics.
V-B Iterative planning of the previous tasks
|Task||Non-iterative LGP + action-specific heuristics||RHH-LGP (Iterative LGP + action-specific heuristics)|
|Time in s.||# Tree nodes||# Expanded||Sol. len.||Horizon||Time in s.||# Tree nodes||# Expanded||Sol. len.|
|One crawler climbing 1||444.26||5849||171||50||6||85.75||5055||151||33|
|Two crawlers climbing 2||138.03||2066||47||19||3||51.54||2341||63||13|
|Climbing w. multiple goals 3||29.48||1157||77||12||10||50.25||1718||114||13|
|Pick and place 4||17.36||63||16||17||4||8.73||131||109||17|
|Obstacle avoidance 5||58.01||211||34||12||6||9.66||115||19||9|
|*: refers to the number of stairs in the scene. **: refers to the number of objects that need to be placed. ***: refers to the the number of obstacles.|
|Note: scenarios without metrics could not be solved at all|
In the following, we will discuss some of the results (Table III) in more detail:
The RHH-LGP solver can solve more complex tasks than the non-iterative version: Apart from task 2, the results show that the RHH-LGP solver can solve the tasks for bigger than the non-iterative LGP counterpart.
Modular robots facilitate planning: for task 1, we find that reaching an object that is 64 stairs high is computationally intractable. On the other hand if two crawlers perform the same task (setting 2), the task is computationally tractable, because the solution requires fewer steps than a possible solution involving a single crawler. Furthermore, task 3 can only be solved using modular robots, as the initial configuration of the robot modules cannot execute this task because a crawler cannot move while holding an object. Hence, our results demonstrate that modular robots can improve TAMP by facilitating symbolic planning and by increasing the number of available actions.
Our approach generalizes to various types of TAMP tasks: We can see that pick-and-place tasks, climbing tasks and obstacle avoidance tasks can be planned using the action-specific heuristics and RHH-LGP. For each of those tasks we use different combinations of the same set of action-specific heuristics, which underlines the versatility of the mix-and-match principle after which the presented heuristics can be combined.
We demonstrate that our solver can handle heterogeneous robot module configurations in task 4.
Our work can be seen as a first step towards fully autonomous modular robots that can solve a variety of complex and long-horizon tasks. In this light, there are some aspects of our work that we would like to discuss in more detail.
The reachability check that we use is simple: The approximation of the robot position that we make during symbolic planning is based on the objects that the robots are standing on, assuming that these objects do not move. This assumption does not hold for all manipulation tasks.
The horizon length of the RHH-LGP solver is an important hyper-parameter of our algorithm. In the environments we investigated, choosing a suitable horizon length turns out to be sufficient to prevent the receding horizon strategy approach from inducing undesirable infeasibility, i.e., fixing the current path does not result in infeasible paths for future actions.
The issue of fixing previous motions could be solved by adding backtracking: our algorithm could naturally include a backtracking mechanism that restores to full joint optimization if necessary. If a motion for subsequence is not feasible, we iteratively increase the time window backwards and optimize for .
There exist many TAMP approaches, which mostly differ in their integration of logical decision-making process with the motion planning component and/or whether they use sampling- or optimization-based motion planning methods.
Optimization-based approaches provide high computational efficiency, especially for robotic manipulation tasks that require physical contact with the environment.
While standard pick-and-place tasks are common and the robustness of TAMP methods has been the focus,
understanding their scalability and applicability to different types of settings is crucial.
An exploration of use-cases not only informs us about their drawbacks but also offers opportunities for improvement.
This paper is an attempt to investigate such research questions.
By introducing simple yet generic geometry-based heuristics along with an iterative receding horizon approach, which we call RHH-LGP, the performance of logic-geometric programming improves remarkably.
This improvement allows tackling significantly longer horizon tasks than previously shown.
An interesting set of problems, where heterogeneous robots dynamically form novel kinematic structures, has been effectively solved.
Our work highlights that TAMP approaches have the potential to be applied to different types of long-horizon problems by an iterative and heuristics augmented formulation.
-  (2019) Combined heuristic task and motion planning for bi-manual robots. Autonomous robots 43 (6), pp. 1575–1590. Cited by: §I.
-  (2016) Task planning using physics-based heuristics on manipulation actions. In 2016 IEEE 21st International Conference on Emerging Technologies and Factory Automation (ETFA), pp. 1–8. Cited by: §I.
-  (2009) A hybrid approach to intricate motion, manipulation and task planning. Int. J. Robotics Res. 28 (1), pp. 104–126. Cited by: §I.
-  (2020) Receding horizon task and motion planning in dynamic environments. arXiv preprint arXiv:2009.03139. Cited by: §I.
-  (2019) Towards blended reactive planning and acting using behavior trees. In ICRA, pp. 8839–8845. Cited by: §I.
-  (2016) Incremental task and motion planning: A constraint-based approach. In Robotics: Science and Systems, Cited by: §I.
-  (2020) Learning efficient constraint graph sampling for robotic sequential manipulation. arXiv preprint arXiv:2011.04828. Cited by: §I.
-  (2009) Integrating symbolic and geometric planning for mobile manipulation. In 2009 IEEE International Workshop on Safety, Security & Rescue Robotics (SSRR 2009), pp. 1–6. Cited by: §I.
-  (2021) Learning geometric reasoning and control for long-horizon tasks from visual input. In Proc. of the IEEE International Conference on Robotics and Automation (ICRA), Cited by: §I.
-  (2020) Deep visual reasoning: learning to predict action sequences for task and motion planning from an initial scene image. arXiv preprint arXiv:2006.05398. Cited by: §I.
-  (2019) Hierarchical task and motion planning using logic-geometric programming (hlgp). In RSS Workshop on Robust Task and Motion Planning, Cited by: §I.
-  (2011) Combining high-level causal reasoning with low-level geometric reasoning and motion planning for robotic manipulation. In ICRA, pp. 4575–4581. Cited by: §I.
-  (2015) Integrating hybrid diagnostic reasoning in plan execution monitoring for cognitive factories with multiple robots. In ICRA, pp. 2007–2013. Cited by: §I.
-  (2017) Combined task and motion planning as classical AI planning. CoRR abs/1706.06927. Cited by: §I.
-  (2021) Integrated task and motion planning. Annual review of control, robotics, and autonomous systems 4, pp. 265–293. Cited by: §I.
-  (2015) FFRob: an efficient heuristic for task and motion planning. In Algorithmic Foundations of Robotics XI, pp. 179–195. Cited by: §I.
-  (2018) KABouM: knowledge-level action and bounding geometry motion planner. J. Artif. Intell. Res. 61, pp. 323–362. Cited by: §I.
Relay policy learning: solving long-horizon tasks via imitation and reinforcement learning. arXiv preprint arXiv:1910.11956. Cited by: §I.
-  (2016) Sequential quadratic programming for task plan optimization. In IROS, pp. 5040–5047. Cited by: §I.
-  (2020) Robust task and motion planning for long-horizon architectural construction planning. arXiv preprint arXiv:2003.07754. Cited by: §I.
-  (2016) Combining task and motion planning: A culprit detection problem. Int. J. Robotics Res. 35 (8), pp. 890–927. Cited by: §I.
Hierarchical foresight: self-supervised learning of long-horizon tasks via visual subgoal generation. arXiv preprint arXiv:1909.05829. Cited by: §I, §I.
-  (2018) SMC: satisfiability modulo convex programming. Proc. IEEE 106 (9), pp. 1655–1679. Cited by: §I.
-  (2014) Combined task and motion planning through an extensible planner-independent interface layer. In 2014 IEEE international conference on robotics and automation (ICRA), pp. 639–646. Cited by: §I.
-  (2013) Using classical planners for tasks with continuous operators in robotics. In Intl. Conf. on Automated Planning and Scheduling, Vol. 3. Cited by: §I.
-  (2018-06) Differentiable physics and stable modes for tool-use and manipulation planning. In Proceedings of Robotics: Science and Systems, Pittsburgh, Pennsylvania. External Links: Cited by: §II-A.
-  (2017) Multi-bound tree search for logic-geometric programming in cooperative manipulation domains. In 2017 IEEE International Conference on Robotics and Automation (ICRA), pp. 4044–4051. Cited by: §I, §II-A, §II-B.
-  (2015) Logic-geometric programming: an optimization-based approach to combined task and motion planning.. In IJCAI, pp. 1930–1936. Cited by: §I.
-  (2018) Admissible abstractions for near-optimal task and motion planning. arXiv preprint arXiv:1806.00805. Cited by: §I.