We propose a human-operator guided planning approach to pushing-based robotic manipulation in clutter. We present an example problem in Figure 1. The target of the robot is to reach and grasp the green object. To reach this target object, however, the robot first has to push other objects out of the way, as in Figure 0(c). This requires the robot to plan which objects to contact, and where and how to push those objects, so that it can reach the target object.
These reaching through clutter problems can be difficult to solve due to several reasons: First, the number of objects make the state space of high-dimensionality; second, this is an underactuated problem, where the robot can control the system state only on a much lower-dimensional space; and third, predicting the evolution of the system state requires running computationally expensive physics simulators, to predict how objects would move as a result of a robot motion. There have been multiple effective algorithms developed to solve this problem [1, 2, 3, 4, 5, 6], however the problem remains a challenging one, where the planning times are still in the order of tens of seconds or minutes, and the success rates are low for difficult instances of the problem.
Further study of the reaching through clutter problem is important to develop planners that are more successful and faster, because reaching through clutter is also a problem that has a potential for major near-term impact in warehouse robotics (where robots need to reach into shelves to retrieve objects) and personal home robots (where a robot might need to reach into a fridge or shelf to reach and retrieve an object). The Amazon Picking Challenge  was a competition which gained particular attention for this potential near-term impact of robotic manipulation to warehouse robotics. The algorithms that we currently have, however, are not able to solve reaching through clutter problems in the real world in a fast and consistent way.
Most recent approaches to this problem employs the power of randomized planning to produce a fast working solution. For example Haustein et al.  use a kinodynamic RRT [8, 9] planner to sample and generate a sequence of robot pushes on objects to reach a goal state. Muhayyuddin et al.  use the KPIECE algorithm  to solve this problem. These planners report the best performance (in terms of planning times and success rates) in this domain so far, to the best of our knowledge.
In this work, we build on these control-based randomized planning approaches, but we investigate using them in conjunction with human-operator input. In our framework, the human operator supplies a high-level plan, to make the randomized algorithms solve the low-level planning problem faster and with higher success rates. For example in Figure 0(a), the human operator supplies the high-level plan of first pushing the object pointed by the arrow to the green region. The robot approaches the pointed object in Figure 0(b) and then pushes the object within the region in Figure 0(c). The human operator, points another object and a goal region in Figure 0(d). Finally, in Figure 0(g) the user indicates to the planner to autonomously plan and grasp the goal object (Figure 0(h)). The human operator’s input here is limited to selection of the object and an approximate goal location for that object. The actual pushing actions are planned by the autonomous low-level planner. After these objects specified by the human operator are pushed, the autonomous low-level planner plans to reach the actual target object.
The use of high-level plans is related to recent work in robotic hierarchical planning [11, 12, 13, 2, 14] and task-and-motion planning (TAMP) [15, 16, 17]. This line of work shows that with a good high-level plan for a task, the search of the low-level motion planner can be guided to a relevant but restricted part of the search space, making the low-level planner faster and more successful. For example, Stilman et al.  formulates the problem of manipulation planning among movable obstacles as a high-level search over the orderings of objects to be moved, combined with a low-level motion planner that try to pick objects up and move in that order. We use a similar high-level plan structure here, but we focus on pushing/non-prehensile manipulation of objects, rather than pick-and-place. Lee et al.  proposes hierarchical planning for non-prehensile manipulation of a single object, where the high-level plan includes the sequence of contacts to be made with the object.
The hierarchical/TAMP planners we mention above generate high-level plans autonomously. In this work we investigate using a human operator to suggest high-level plans.
While a good high-level plan can make the low-level planning problem easier to solve, the autonomous generation of a good high-level plan is itself a computationally expensive problem. For example, for our problem, i.e. reaching through clutter, a high-level planner would need to search in a space of all possible permutations of objects, combined with all possible goal locations for these objects. Furthermore, the high-level planner must be able to choose high-level actions that are feasible for the low-level planner. In our problem, this feasibility would mean checking/predicting whether the robot would be able to push a certain object to a certain location, which either requires the use of computationally expensive physics simulations, or a heuristic to estimate the probability of successful push.
While these decisions are computationally expensive for an autonomous planner, they can be easy for a human.
Human-in-the-loop manipulation planning systems have been investigated before for pick-and-place tasks [18, 19, 20, 21] but to the best of our knowledge, a human-in-the-loop approach has not been applied to non-prehensile manipulation before.
The rest of this paper is structured as follows. In Section II, we formally define the problem of reaching through clutter and new version of this problem which we call guided reaching through clutter. In Section III we present our concrete hypotheses for this work. In Section IV we briefly describe the randomized planners that are used to solve the reaching through clutter problem. In Section V we present our extension of these planners, which can solve the guided reaching through clutter problem, and we also describe the user interface that the human uses to provide the guidance. In Section VI we present our experiments and results, where we test our hypotheses in simulation, and on a real robotic setup.
Ii Problem Formulations
We define the reaching through clutter problem and then a new problem formulation that we name the guided reaching through clutter problem.
Ii-a The Reaching Through Clutter (RTC) problem
Our environment is comprised of a robot , movable obstacles and static obstacles. is the set with all the movable and static obstacles. is a set with the movable obstacles that the robot can interact with and a set with all the static obstacles that the robot should not interact with. We also have that is the goal object we are trying to reach.
The robot motion is constrained on the plane and therefore, its state space, , is . The state of a movable object , , and the goal object, , is its position (), as well as, its velocity and acceleration.
We define the state space, , as the Cartesian product of the corresponding state spaces of the robot, the goal object, and all the movable obstacles.
Let be the initial state of the system, and a set of possible goal states. A goal state, , is defined as a state where is within the robot’s end-effector (see Figure 0(h)).
Let be the control space comprised of the robot velocities. Let the system dynamics defined as that propagates the system from with a control for a duration . The problem can be defined as the tuple .
The solution to the RTC problem is to find a sequence of controls from to move the robot from to a . We will call planners that tackle the RTC problem directly as the baseline planners or the planners without guidance. We describe such planners in Section IV.
In the next subsection we will motivate a new problem, the guided reaching through clutter problem, that aims to solve instances of these problems in shorter (planning) time and with higher success rate.
Ii-B The Guided Reaching Through Clutter (GRTC) problem
In the guided reaching through clutter (GRTC) problem we introduce the notation of a guidance . Each is called a guide. A guide indicates an object and the goal region . The region, , is a region with a centroid on the plane with diameter. The order in which the guides appear in a guidance is important: . For a problem instance, there is an infinite number of possible guidances and we denote this set with . includes all the possible permutations of objects in (repetition of objects allowed), for all possible goal regions in the environment.
The GRTC problem can be formally defined as the tuple . Note that the only difference here is that a guidance is provided and the solution should satisfy this guidance. Therefore, given a RTC problem, , and a guidance, , we say that we convert the RTC problem into a GRTC problem, . In Section V-A we present a planner to solve GRTC problem instances.
The solution to the GRTC problem is to find a sequence of controls from to move the robot from to rearrange every object in the guidance, , to their corresponding regions. Once the environment is rearranged according to the guidance then we say that the guidance is satisfied. Once the guidance is satisfied, we plan to reach for the goal object . We describe such planners in Section V.
Consider Figure 1 as an example. consists of two guides: (Figure 0(a) and Figure 0(d)). The goal object is the green one. The solution to this GRTC problem instance is to find the controls to move the robot from the initial state (Figure 0(a)) to first push to (Figure 0(c)) and then to (Figure 0(f)). Note here that during these two actions other movable obstacles are implicitly rearranged as a result of the manipulation. Once the guidance is satisfied (Figure 0(g)) the problem becomes an instance of the RTC problem and a RTC planner plans to reach (Figure 0(h)).
refers to the empty set which means no guidance. When it implies that the problem is an instance of the RTC problem as there is no guidance involved.
We will call the planners that tackles the GRTC problem as the planners with guidance.
Ii-C Planner Evaluation Metrics
We evaluate the effectiveness of the planners for the RTC and GRTC problems based on two metrics. First, the planning time which is defined as the time it takes for a planner to come up with a solution to the problem with a maximum planning time limit . Second, the success rate which indicates the successful solutions a planner found in over a set of problem instances.
Iii Working Hypotheses
Given a RTC problem, p, there exists a small set of guidances, such that given a the GRTC problem of is faster and more successful.
In the first hypothesis we believe that there is a small set of good guidances in , and if we can find such good guidances we can convert the RTC problem into a GRTC problem and solve it faster and more successfully. If is true that the set is small, then it is difficult to find such guidances among a large set of irrelevant guidances.
In Section V-A we present a planner to solve GRTC problem instances. In Section VI-A we used this planner and we conducted experiments on a large subset of to check Hypothesis 1. We compared these results with a planner with no guidance. This experiment aims to verify whether the set exists and whether yields to better plans than (i.e., planning without guidance).
The problem of planning in clutter with multiple movable obstacles is proven to be an NP-hard problem . Finding a good guidance in a continuous space is difficult. For this reason, we believe that a planner with human intuition could be better than a planner without guidance. Therefore, given that hypotheses 1 holds, our final hypothesis is:
A planner with guidance and human intuition is better than a planner with no guidance.
In Section V-B we present a modified version of the GRTC planner that we name GRTC Planner with Human-In-The-Loop (GRTC-HITL) where the human operator is the source of the guidance at every step. We conducted a number of experiments, both in simulation and on a real robot, to verify Hypothesis 2. These results are presented in Section VI-B and Section VI-C where we compared the GRTC-HITL planner with planners with no guidance.
Iv Sampling-based Kinodynamic Planners
Two well known sampling-based kinodynamic planners for RTC-like problems are Rapidly-exploring Random Trees (RRT)  and Kinodynamic Motion Planning by Interior-Exterior Cell Exploration (KPIECE) . RRT and KPIECE will be referred as the baseline planners or the planners with no guidance throughout this work.
Kinodynamic planning is concerned with planning with dynamic constraints. Sampling-based planners is a family of random algorithms that builds a tree with root until a goal state is added to the tree. These algorithms are characterized by their property of being probabilistically complete. That is, given enough time, and if a solution exist, they will eventually find it .
Iv-a Kinodynamic RRT
RRT is a sampling-based planner that builds a tree from the initial state, , and then samples a random state and tries to extend the tree from the nearest neighbor . Kinodynamic RRT samples controls to bring near to .
KPIECE is a sampling-based planner that operates well in problems with complex dynamics . As like RRT, KPIECE builds a tree from the state and control space until a goal is reached. KPIECE uses the notion of space coverage to guide its exploration in the state space by constructing a discretization of the state space. At each step, a cell chain is sampled. Given the cell chain, a state is sampled. KPIECE, samples a control from the and it tries to expand the tree and update the discretization. Unlike RRT, KPIECE does not require a distance metric, it instead uses a projection of the state space.
Both Kinodynamic RRT and KPIECE have been used before in the literature to solve problems similar to the RTC problem [4, 5, 9, 24, 25]. In this work, when we are planning with a kinodynamic planner (either RRT or KPIECE) we will use the notation kinodynamicPlanning(goal) with some goal input. To solve a RTC problem, , we will use kinodynamicPlanning().
V GRTC Planner
In this section we provide a description of our approach. The GRTC planner is our approach to tackle the GRTC problems as defined in Section II-B. First we describe a generic implementation of the GRTC planner where the guidance is provided to our algorithm as an input. We then present a variation of the GRTC planner that takes the guides from a human operator through a graphical user interface.
V-a Generic GRTC Planner
The GRTC planner is a two-step high-level planner. First, given a guidance, the planner rearranges the environment to satisfy the guidance. Second, once the environment is rearranged, the planner will try to solve the remaining RTC problem. When the robot is planning to push the object to the desired goal region, object-to-object interaction is permitted. Therefore, it is possible for a single guide to rearrange implicitly other movable obstacles that are not part of the guide. We again use kinodynamic planners (e.g. kinodynamic RRT or KPIECE) to support the underlying motion planning. Algorithm 1 presents the GRTC planner. The for loop at line 2-6 performs the first step of the planning which is to rearrange the environment according to the provided guidance, . On line 7, the planner tries to reach a goal (i.e., tackling the RTC problem) using an RTC planner (Section IV).
For each guide, given the initial position of the object, , and the goal region’s centroid, , we compute a point, , that is on a straight line from to plus some that is the minimum enclosing circle of the desired object and a fixed value from end-effector’s coordinate frame to its edges. Given we calculate two approaching states and . The first approaching state has the end-effector’s side edge facing the goal region. The second approaching state has the end-effector facing the goal region. This is shown in Figure 2.
Given the two approaching states, the robot is planning using an underlying sampling-based planner to approach the object as specified by one of the two approaching states and from there to push to object towards the goal region. When the robot plans to approach the object, we steer the controls towards the object from the robot’s current position. Extensive experiments showed that steering towards the approaching state yields to faster solutions.
Note that once the for loop finishes on line 6 we record the overall planning time, . The baseline planner on line 7 has a time bound of if . In the case where the planning is declared as failed.
V-B GRTC with Human-In-The-Loop (GRTC-HITL)
Guidance Planner with Human-In-The-Loop (GRTC-HITL) is a variation of the Guidance Planner where the guidance is not known in advance but instead a human operator through a graphical user interface provides the guidance at each step. In Algorithm 2 we present an overview of the GRTC-HITL planner. An example interaction of the GRTC-HITL is presented in Figure 1. This algorithm, does not accept a guidance in advance, but instead a user through a graphical user interface provides the inputs (guides) at each step.
The state of the system changes at each input and the human operator is presented with the resulting state each time. The interaction is over once the human operator decides that the problem can be solved by a planner without guidance.
We use a simple user interface to communicate with the human operator. The human operator at every step is presented with a window showing the environment and the robot. The operator provides the input by first clicking on the desired object and then a point on the plane that will become the centroid of the guide’s region. The planner uses this input to plan and find a solution to push the object to operator’s desired region. At the end of each sub-planning the operator can decide whether the problem is simple enough for the planner with no guidance to solve or if further guidance is needed.
Vi Experiments & Results
For all the experiments we use the Open Motion Planning Library (OMPL)  implementation of RRT and KPIECE. We use MuJoCo  to implement the system dynamics, , with . For all the experiments we will mention below, is 300 seconds.
In Section VI-A we address Hypothesis 1 by conducting experiments to find out if exists and if is a small set. In Section VI-B we are comparing the GRTC-HITL with the baseline planners in simulation. Finally, in Section VI-C we present real-world experiments comparing RRT and KPIECE with GRTC-HITL on 10 different problems.
Vi-a Does exists? How big is it?
To answer hypothesis 1 we generated a large set of guidances, , in a discretized environment of four goal regions and ten movable obstacles. The scene is presented in Figure 4. Due to the dimensionality of the guidance set, we restricted the guidances to at most five guides per guidance and a guidance includes only distinct objects (i.e., object repetition not allowed). The cardinality of this guidance set is . In this experiment we tested guidances from . The first guidances includes all the guidances of one and two guides for all ten objects. For the remaining guidances (guidances with 3, 4, and 5 guides) we randomized the process of picking a guidance. For each guidance we ran the Guided Planner on the same problem. We repeated this process four times.
In Figure 5 we present the results from the four runs of the experiments we described above. In all four runs, failed to find a solution. There were guidances out of distinct we tested that lead to successful plans. That is, 0.15% of the guidance set. This supports Hypothesis 1. There is a small set, , that simplifies the RTC problem and yields to better plans.
Vi-B Is guided planning better than planning without guidance?
For the simulation results we generated ten random scenes. Our randomiser places the goal object at the back of the shelf in a random position and then incrementally places the remaining movable obstacles in the shelf such that no object is colliding with each other. We ran each baseline planner (KPIECE and RRT) ten times over each of the ten scenes. We then solved each problem using GRTC-HITL. The human operator provided a guidance once per problem. When the human operator provides the last guide, we executed RRT from that state to solve the RTC problem ten times. For all experiments, we recorded the result of the planning (success/failure), the planning time and the solution’s execution time. In case of GRTC-HITL we also record the guidance planning time, as well as, the user interaction time. An example of GRTC-HITL planning is presented in Figure 1.
As can be seen from Figure 5(b) GRTC-HITL improved planning time over 40% in 6 out of 10 problems and in some instances the speed-up was over 60%. GRTC-HITL managed to improve the planning time in every problem instance. GRTC-HITL also improved execution time by 60% in 6 out of 7 times. All interactions (shown in Table I) required less than five guides to simplify the problem. The user interaction time accounts only 16.4 seconds on average. This suggests that the user spent less than 20 seconds on average on each problem which that yielded to better plans.
Out of 100 runs, GRTC-HITL accounts 67% success rate in comparison with 17% and 26% for RRT and KPIECE respectively.
Vi-C Real-robot results
We performed experiments on a UR5 robot using the Ridgeback omnidirectional base. We control the Ridgeback to move the manipulator. We used OptiTrack motion capture system to capture the initial state of the system for the baseline planners and to capture the state of the system after every GRTC-HITL action.
We tested the effectiveness of RRT, KPIECE and GRTC-HITL in ten different problems in the real world. Some of the scenes are shown in Figure 3.
|Planner||S||F (Planning)||F (Execution)|
RRT failed 80% of the time and always during planning. Although, KPIECE managed to find a solution 60% of the time, it failed during execution 50% of the time because of the uncertainty. GRTC-HITL 70% of the time solved the problem successfully. It failed once during execution and two during planning. Due to the fact that GRTC-HITL is executed in a closed-loop manner, the uncertainty of the actions is corrected in the subsequent planning. The state left by the human operator was usually less cluttered than the initial state which minimised uncertainty when solving the RTC problem.
We mentioned two hypotheses about the problem we are trying to solve. First, a small guidance set exists and leads to better plans for a problem instance than . Second, a planner with guidance and human intuition is better than a planner with no guidance for the RTC problems. Through our experiments we proved both hypotheses and we showed an improvement both in terms of the planning time and the success rate of the problems we have tested.
We envision a future where robots can perform complex manipulation actions in cluttered environments in conjunction with human guidance. One such environment are warehouse shelves. We tested GRTC-HITL in such scenario in simulation. We executed four, different, problems in parallel with a single human operator. We evaluated if a single human operator can guide multiple robots at the same time. We then compared this with a parallel planner without guidance. The human operator provided effective guidance that lead to successful plans in three of the four problems. The planners without guidance failed in every problem, however.
To the best of our knowledge, this is the first work to look into non-prehensile manipulation with human-in-the-loop.
In the future, this work will investigate different ways to leverage human intuition to solve challenging manipulation problems in clutter.
-  M. Dogar, K. Hsiao, M. Ciocarlie, and S. Srinivasa, “Physics-based grasp planning through clutter,” in Robotics: Science and Systems, 2012.
-  G. Havur, G. Ozbilgin, E. Erdem, and V. Patoglu, “Geometric rearrangement of multiple movable objects on cluttered surfaces: A hybrid reasoning approach,” in Robotics and Automation (ICRA), 2014 IEEE International Conference on. IEEE, 2014, pp. 445–452.
-  N. Kitaev, I. Mordatch, S. Patil, and P. Abbeel, “Physics-based trajectory optimization for grasping in cluttered environments,” in 2015 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2015, pp. 3102–3109.
-  J. A. Haustein, J. King, S. S. Srinivasa, and T. Asfour, “Kinodynamic randomized rearrangement planning via dynamic transitions between statically stable states,” in Robotics and Automation (ICRA), 2015 IEEE International Conference on. IEEE, 2015, pp. 3075–3082.
-  M. ud din, M. Moll, L. Kavraki, J. Rosell et al., “Randomized physics-based motion planning for grasping in cluttered and uncertain environments,” IEEE Robotics and Automation Letters, vol. 3, no. 2, pp. 712–719, 2018.
-  W. Bejjani, R. Papallas, M. Leonetti, and M. R. Dogar, “Planning with a receding horizon for manipulation in clutter using a learned value function,” in 2018 IEEE-RAS 18th International Conference on Humanoid Robots (Humanoids). IEEE, 2018, pp. 1–9.
-  C. Eppner, S. Höfer, R. Jonschkowski, R. M. Martin, A. Sieverling, V. Wall, and O. Brock, “Lessons from the amazon picking challenge: Four aspects of building robotic systems.” in Robotics: Science and Systems, 2016.
-  S. M. LaValle, “Rapidly-exploring random trees: A new tool for path planning,” 1998.
-  S. M. LaValle and J. J. Kuffner Jr, “Randomized kinodynamic planning,” The international journal of robotics research, vol. 20, no. 5, pp. 378–400, 2001.
-  I. A. Şucan and L. E. Kavraki, “Kinodynamic motion planning by interior-exterior cell exploration,” in Algorithmic Foundation of Robotics VIII. Springer, 2009, pp. 449–464.
-  M. Stilman, J.-U. Schamburek, J. Kuffner, and T. Asfour, “Manipulation planning among movable obstacles,” in Proceedings 2007 IEEE international conference on robotics and automation. IEEE, 2007, pp. 3327–3332.
L. P. Kaelbling and T. Lozano-Pérez, “Hierarchical planning in the now,”
Workshops at the Twenty-Fourth AAAI Conference on Artificial Intelligence, 2010.
-  M. Dogar and S. Srinivasa, “A framework for push-grasping in clutter,” Robotics: Science and systems VII, vol. 1, 2011.
-  G. Lee, T. Lozano-Pérez, and L. P. Kaelbling, “Hierarchical planning for multi-contact non-prehensile manipulation,” in 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2015, pp. 264–271.
-  F. Lagriffoul, D. Dimitrov, J. Bidot, A. Saffiotti, and L. Karlsson, “Efficiently combining task and motion planning using geometric constraints,” The International Journal of Robotics Research, vol. 33, no. 14, pp. 1726–1747, 2014.
-  A. Wells, N. Dantam, A. Shrivastava, and L. Kavraki, “Learning feasibility for task and motion planning in tabletop environments,” IEEE Robotics and Automation Letters, 2019.
-  J. Lee, Y. Cho, C. Nam, J. Park, and C. Kim, “Efficient obstacle rearrangement for object manipulation tasks in cluttered environments,” arXiv preprint arXiv:1902.06907, 2019.
-  A. E. Leeper, K. Hsiao, M. Ciocarlie, L. Takayama, and D. Gossow, “Strategies for human-in-the-loop robotic grasping,” in Proceedings of the seventh annual ACM/IEEE international conference on Human-Robot Interaction. ACM, 2012, pp. 1–8.
-  S. Muszynski, J. Stückler, and S. Behnke, “Adjustable autonomy for mobile teleoperation of personal service robots,” in 2012 IEEE RO-MAN: The 21st IEEE International Symposium on Robot and Human Interactive Communication. IEEE, 2012, pp. 933–940.
-  T. Witzig, J. M. Zöllner, D. Pangercic, S. Osentoski, R. Jäkel, and R. Dillmann, “Context aware shared autonomy for robotic manipulation tasks,” in 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems. IEEE, 2013, pp. 5686–5693.
-  M. Ciocarlie, K. Hsiao, A. Leeper, and D. Gossow, “Mobile manipulation through an assistive home robot,” in 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems. IEEE, 2012, pp. 5313–5320.
-  G. Wilfong, “Motion planning in the presence of movable obstacles,” Annals of Mathematics and Artificial Intelligence, vol. 3, no. 1, pp. 131–150, 1991.
-  M. Elbanhawi and M. Simic, “Sampling-based robot motion planning: A review,” Ieee access, vol. 2, pp. 56–77, 2014.
-  R. B. Rusu, I. A. Şucan, B. Gerkey, S. Chitta, M. Beetz, and L. E. Kavraki, “Real-time perception-guided motion planning for a personal robot,” in 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems. IEEE, 2009, pp. 4245–4252.
-  M. Nieuwenhuisen, D. Droeschel, D. Holz, J. Stückler, A. Berner, J. Li, R. Klein, and S. Behnke, “Mobile bin picking with an anthropomorphic service robot,” in 2013 IEEE International Conference on Robotics and Automation. IEEE, 2013, pp. 2327–2334.
-  I. A. Şucan, M. Moll, and L. E. Kavraki, “The Open Motion Planning Library,” IEEE Robotics & Automation Magazine, vol. 19, no. 4, pp. 72–82, December 2012, http://ompl.kavrakilab.org.
-  E. Todorov, T. Erez, and Y. Tassa, “Mujoco: A physics engine for model-based control,” in Intelligent Robots and Systems (IROS), 2012 IEEE/RSJ International Conference on. IEEE, 2012, pp. 5026–5033.