Search algorithms are the backbone of symbolic planning. Their task is to find a sequence of actions that allows a system to transition into a state that satisfies a set of goal conditions. However, planning complexity can grow quickly with increasing goal distance. The scalability of plan generation to long-horizon tasks with a large number of objects is identified as an important problem in the planning community. Several countermeasures, like hierarchical planning or macro operator generation [2, 3, 6], have been proposed. Apart from the goal, however, the domain complexity also impacts the time required to generate plans. A planning domain provides blueprints of actions, also called planning operators, which define the available state transitions. Therefore, the more operator choices the planner has to consider during its search, the longer it will take to find a solution.
The operator and domain definition is usually done by hand, which is time-consuming or even requires a human expert. We have therefore recently introduced a system that automatically generates robotic planning operators from demonstrations. Furthermore, we have integrated the operator generation process with a plan generation and execution procedure, see Fig. 1. All collected operators are automatically parsed into a PDDL domain file. Plans are generated with the Fast-Downward planning system and executed on a simulated TIAGo robot. Our system keeps track of how often each of the operators occurred during the demonstrations through the operator count.
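To illustrate the parsing step, a minimal rendering of one collected operator into a PDDL action block could look like the following sketch. The function name and the operator fields are illustrative assumptions, not our actual implementation; the parameter, precondition, and effect strings are assumed to already be in PDDL syntax.

```python
def to_pddl_action(name, params, preconds, effects, cost):
    """Render one collected operator as a PDDL :action block (sketch).

    `params`, `preconds`, and `effects` are assumed to be strings that are
    already in PDDL syntax, e.g. "?c - cube" or "(inHand ?c)".
    """
    return "\n".join([
        f"(:action {name}",
        f"  :parameters ({' '.join(params)})",
        f"  :precondition (and {' '.join(preconds)})",
        f"  :effect (and {' '.join(effects)}",
        f"    (increase (total-cost) {cost})))",
    ])

# Hypothetical Put operator with an action cost of 5.
action = to_pddl_action(
    "put", ["?c - cube"], ["(inHand ?c)"], ["(not (inHand ?c))"], 5
)
```

The concatenated action blocks, together with the predicate declarations, form the generated domain file.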
This count helps us measure each operator’s importance and can help guide the planner’s search. The next logical step is to investigate how the operator count could also be used to reduce the search complexity and thus speed up plan generation.
We therefore present a method that creates planning domains based on a subset of all collected planning operators. We start by analyzing only the most frequently observed operators and iteratively increase the number of operators in case the plan search is not successful. We then analyze the impact of this approach on the search time, using the Fast-Downward planning system as a concrete example. The analysis is based on real operators generated through our system.
II. Operator generation system and operator frequency measure
Our system automatically generates planning operators from human demonstrations performed in a Virtual Reality setup. The first step of the system is activity recognition, which is based on analyzing the hand movements of the demonstrators, as proposed in previous work. The demonstration is segmented and classified into high-level actions like Reach, Take, Put, Stack or Idle. The classification is based on a set of general rules in the form of a C4.5 decision tree, which maps symbolic state variables like inHand, actOn, handOpen, handMove to the segmented activities. Since these parameters describe the state of the hand or gripper, we refer to them as hand-state variables. Because of the generalisability of such semantic-based recognition methods, we could reuse the same set of rules (displayed in Table I) from previous work, which means that no training was required in our application.
The system utilizes the segmentation to create new operators and the classification to assign each operator a meaningful name; additionally, all relevant preconditions and effects need to be extracted. For the state description, we re-used the set of hand-state variables already required for the activity segmentation and classification (inHand, actOn, handOpen, handMove). On top of that, we added three commonly used state variables: graspable, onTop and inTouch. Graspable describes the situation where the hand is in close range to an object. Besides the hand-state variables, the operators should also reflect changes in the environment. In particular, the hand activities need to be mapped to all the environment changes they cause (e.g., in terms of environment-state variables like onTop and inTouch). The resulting set of state variables (displayed with respective groundings in Table II) is minimalistic, but nevertheless expressive enough to represent the cube stacking environment on a high, symbolic level.
|State variable (sv)|Grounding|
|inHand (cube)|A hand/gripper has closed its fingers around a cube.|
|actOn (cube)|Dist. betw. cube and hand below a threshold & hand moving towards cube.|
|handMove (hand)|Hand is moving.|
|graspable (cube)|Dist. betw. cube and hand below a threshold.|
|handOpen (hand)|Hand is open.|
|inTouch (obj, obj)|VR physics engine detects contact between 2 objects.|
|onTop (obj, obj)|Obj. A on top of obj. B if inTouch and A higher than B.|

Table II: State variables (sv) and their respective grounding during demonstration and execution. The object categories inside the brackets after the sv-name indicate the range. The thresholds were determined heuristically.
Our system iteratively collects operators with each new demonstration. If an operator is new in terms of its preconditions and effects, it is added to the list of operators. If the operator has already been encountered on a prior occasion, its operator counter is incremented. The system uses the operator frequency as a quality measure. The goal is to prefer more often observed operators for the following two reasons. First, more frequently observed operators are less susceptible to potential segmentation errors. Second, they are the preferred choice of the human demonstrators and should therefore be considered an essential measure for the plan generation. Since planning systems like Fast-Downward minimize plan cost, we introduced a score inversely proportional to the operator count:

cost(op) ∝ 1 / count(op)
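As a sketch of how such a count-based cost can be computed, consider the following Python fragment. The operator representation is an illustrative assumption, and the integer scaling is one possible choice, motivated by the fact that PDDL action costs are typically non-negative integers.

```python
from collections import Counter

# Hypothetical operator signatures: (type, preconditions, effects).
observed = [
    ("Put",  ("inHand",),    ("onTop",)),
    ("Put",  ("inHand",),    ("onTop",)),
    ("Put",  ("inHand",),    ("onTop",)),
    ("Put2", ("graspable",), ("onTop", "inHand")),
]
counts = Counter(observed)  # operator count per unique configuration

# Score inversely proportional to the operator count: frequently observed
# operators become cheap, rarely observed ones expensive. Scaling by the
# total and rounding yields integer costs usable as PDDL action costs.
total = sum(counts.values())
costs = {op: max(1, round(total / c)) for op, c in counts.items()}
```

With this scaling, the thrice-observed Put configuration receives a lower cost than the once-observed Put2 configuration, so the planner prefers the former.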
In our previous work, we tested the operator generation system based on 12 demonstrations from 3 different participants on the task of stacking one or two cubes with the left or right hand. Our system currently classifies operators into the five categories/types Reach, Take, Put, Stack and Idle. For each of the operator types, several different configurations in terms of preconditions and effects were observed. In total, 115 activities were retrieved from the data, producing 30 unique operator configurations distributed among the five categories. The system, for example, generated five different Put operators, of which the two most often observed configurations are displayed in Figure 2.
In most cases, the user first reaches for a cube, takes it, and then puts it somewhere. This yields the (Put, Count: 12) operator. However, occasionally (Put2, Count: 4), the user picked up objects without resting the hand, which is one of the decision rules for Take (see Table I). Thus, there was a direct transition from Reach to Put. Note, for example, that the inHand predicate (5th row) changes from false to true for Put2, whereas it must already be satisfied as a precondition (3rd row) for Put.
III. Domain optimization
With increasingly longer planning horizons, finding a plan takes substantially more time. For example, finding a plan to build a tower of six cubes takes more than 180 seconds with A* search and iPDB heuristic and even more than 1725 seconds with A* search and landmark-cut heuristic.
One way of improving plan generation performance is to reduce the number of planning operators. We therefore propose a method that slims down our planning domains to the operators that were observed most often, and iteratively widens the scope to less popular operators in case no plan can be found with the more exclusive subset of operators.
The exact procedure of the domain optimization is described in Algorithm 1. The variable prio denotes how many operators of each type should be included in the domain: for prio = 1, only the operators with the highest count; for prio = 2, the first and second most frequently observed operators; and so on. If there is a tie in the demonstration frequency of several operators of the same type, all of these operators are included.
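The procedure can be sketched in Python as follows. The dictionary-based operator representation and the `try_plan` callback (e.g., a wrapper that writes the PDDL domain and invokes a planner) are assumptions made for illustration.

```python
from collections import defaultdict

def slim_domain(operators, prio):
    """Keep, per operator type, the configurations whose observation count
    is among the `prio` highest distinct counts (ties are all included)."""
    by_type = defaultdict(list)
    for op in operators:
        by_type[op["type"]].append(op)
    selected = []
    for ops in by_type.values():
        top_counts = sorted({op["count"] for op in ops}, reverse=True)[:prio]
        selected.extend(op for op in ops if op["count"] in top_counts)
    return selected

def plan_with_slimming(operators, try_plan, max_prio):
    """Iteratively widen the operator subset until a plan is found."""
    for prio in range(1, max_prio + 1):
        domain = slim_domain(operators, prio)
        plan = try_plan(domain)  # e.g. write PDDL and invoke the planner
        if plan is not None:
            return plan
    return None
```

Since the subsets grow monotonically with prio, a failed search on a small domain is never wasted if, as in our experiments, searching the small domains is cheap compared to searching the full one.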
To test the impact of considering different sets of operators, we conducted several experiments. First, we use the same set of 12 demonstrations from our user experiment. Based on our proposed method, we generate five different domains in total, with an increasing number of operators (5, 11, 18, 23 and 30, respectively; see Figure 3) from this data. We then evaluate the search time for each domain on a set of increasingly difficult planning problems (Figure 4).
Our domains require planning systems to support the :action-costs feature of PDDL 3.1. We therefore chose the Fast-Downward planning system, which offers a wide variety of heuristics and planning algorithms. For our experiment, we use the two configurations astar(ipdb()) (A* search with iPDB heuristic) and astar(lmcut()) (A* search with landmark-cut heuristic). Note that the purpose of testing two different heuristics is to show that the search time reduction achieved by slimming the domains is independent of the planning system. Results are displayed in Figures 5 and 6.
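For reference, one such run can be sketched as follows; the path to the driver script is a placeholder, and the helper function is illustrative, but Fast-Downward does accept the search configuration via its --search option.

```python
import subprocess  # used by the commented-out invocation below

def fd_command(domain_pddl, problem_pddl, search):
    """Build a Fast-Downward driver invocation (script path is a placeholder)."""
    return ["./fast-downward.py", domain_pddl, problem_pddl, "--search", search]

cmd = fd_command("domain.pddl", "problem.pddl", "astar(lmcut())")
# result = subprocess.run(cmd, capture_output=True, text=True)  # needs a local install
```

Swapping the search string to "astar(ipdb())" yields the other configuration used in our experiments.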
Tables III and IV: search times for each of the five domains (rows) on planning Goals 1–5 (columns), for A* with the iPDB heuristic and with the landmark-cut heuristic, respectively.
First, we can see that for all test goals of this particular experiment, the specific set of 5 operators (the single most frequently demonstrated operator for each operator type) is enough to produce a plan. We also observe from Tables III and IV that the complexity of the first three goals is not large enough to detect significant differences between the domains in terms of search time. In particular, the minimal operator domain is not necessarily the fastest. However, the difference is significant for the fifth planning goal, which takes at least 20 consecutive actions to reach. Comparing the smallest and the largest domain, the search takes around 72 times more time (2.499 s vs. 180.652 s) for the iPDB heuristic and even 120 times more (14.3 s vs. 1725.18 s) for the landmark-cut heuristic.
The experiment clearly shows significant search time reductions for complex goals when the domain is slimmed to only the most often used operators. Note that, in general, there is no guarantee that the first batch of operators suffices to find a plan. Nevertheless, for long-horizon plans, the difference in search time is so significant that we would still save time even if the first domains failed to produce a plan and the scope of operators had to be extended once or twice. Additionally, this gap in search time will grow even larger with more complex plans.
Another advantage is that reducing the number of operators can also force the planning system to follow the consensus of the human demonstration more closely, rather than using loopholes for optimization purposes. Consider our Put operators (see Figure 2) as an example. A plan like [Reach, Put2, Stack] is cheaper than [Reach, Take, Put, Stack], because the cost of Take + Put is higher than just Put2, even if Put2 was less often observed than Put. In the first domain with just 5 operators, however, only Put is available, which results in a plan that follows more closely what most users were demonstrating.
While the domain slimming method has many upsides in the discussed example, there are also some potential challenges. When scaling up the domain by including operators from different demonstration scenarios, there could be a bias towards environments where the number of observations was large. For example, compare the cube stacking experiment with 12 demonstrations to a potential dish stacking experiment with 100 demonstrations. The resulting dish operators would have a much higher count and would therefore be added earlier into the subset of considered operators, even in applications that involve no dishes at all. Consequently, it will also be essential to work on additional operator merging techniques, like operator generalization over several object types. Generalization would possibly allow extending application environments and tasks. However, its success might depend on how closely the state variables describe the world state. If, for example, we introduce additional state variables that closely describe the shape of objects, the resulting operators will also be more specific and less generalizable. Essentially, it is a question of what an appropriate level of abstraction looks like, and it could also require a hierarchical approach.
In this paper, we have investigated the problem of reducing search time for long-horizon planning goals. We presented a method that creates planning domains based on a subset of all collected planning operators. We start by analyzing only the most frequently observed operators and iteratively increase the number of operators if the plan search is unsuccessful. We showed that this significantly reduces the planning time. As a next step, we would like to work more on merging operators, for example, by generalizing over object types. This could go hand in hand with the transition to probabilistic operators and will also comprise the interesting investigation of how to balance operator expressiveness vs. generalizability in terms of abstraction and choice of state variables.
The research reported in this paper has been supported by Chalmers AI Research Centre (CHAIR).
- (2017) On-line simultaneous learning and recognition of everyday activities from virtual reality performances. In IROS. Cited by: §II.
- (2005) Macro-FF: Improving AI planning with automatically learned macro-operators. Journal of Artificial Intelligence Research 24, pp. 581–621. Cited by: §I.
- (2015) On the online generation of effective macro-operators. In IJCAI, pp. 1544–1550. Cited by: §I.
- (2021) Automated Generation of Robotic Planning Domains from Observations. arXiv preprint arXiv:2105.13604. Cited by: §I, §II, §IV.
- (2006) The Fast Downward planning system. Journal of Artificial Intelligence Research 26 (1), pp. 191–246. Cited by: §I, §II, §IV.
- (2020) Macro operator synthesis for ADL domains. Frontiers in Artificial Intelligence and Applications 325, pp. 761–768. Cited by: §I.
- (2012) A review of machine learning for automated planning. The Knowledge Engineering Review 27 (4), pp. 433–467. Cited by: §I.
- (2011) Hierarchical task and motion planning in the now. In ICRA, pp. 1470–1477. Cited by: §I.
- (2017) Transferring skills to humanoid robots by extracting semantic representations from observations of human activities. Artificial Intelligence 247, pp. 95–118. Cited by: §II.