Pick-and-place is among the most common tasks robot manipulators are applied for today. Grasp planning, which is the process of autonomously selecting grasps, still receives much attention and effort from the robotics community [1, 2, 3]. In contrast, the problem of placement planning, which is the process of autonomously deciding where and how to place an object with a robot, has received considerably less attention.
An autonomous robot tasked with placing a grasped object can generally not assume to know the environment in advance, rather it faces the following challenges when perceiving the environment for the first time:
It needs to identify suitable locations that afford placing. For instance, an object may be placed flat on a horizontal surface, leaned against a wall, placed on a hook, or laid on top of other objects. Determining how and where a particular object can be placed, requires analysis of both the environment’s and the object’s physical properties.
It needs to be able to reach the placement. Placing requires the robot to move close to obstacles, which make it difficult to compute collision-free arm configurations reaching a placement. In addition, the obstacles render planning an approach motion computationally expensive.
Not all placements are equally desirable. For many tasks, there exists an objective such as stability, human-preference on location or clearance from other obstacles, that is to be maximized.
Our contribution is an algorithmic framework that addresses these challenges and our main focus lies on computing reachable placement poses (challenge 2) that maximize a user-specified objective (challenge 3). In particular, we consider a dual-arm robot in difficult to navigate environments, such as cluttered shelves and cupboards, fig:figure_one. Our approach addresses challenge 2 by integrating a motion planning algorithm with a novel hierarchical search for a placement pose. We address challenge 3 by designing the algorithm such that it finds an initial feasible solution quickly, and then incrementally improves the user-specified objective in an anytime fashion.
Ii Related Works
Ii-a Placing objects
Previous works on placing objects predominantly focus on challenge 1, i.e. searching poses in the environment, where an object can rest stably. A naïve solution consists in identifying horizontal surfaces in the environment and placing the object flat on the surface where there is enough space. This technique is, for instance, commonly employed in manipulation planning works which focus on planning complex sequences of pick-and-place operations rather than individual placements [4, 5, 6, 7, 8].
The object’s orientation for a horizontal placement can be obtained by analyzing the object’s convex hull and extracting the faces that support a stable placement [7, 8]. Each of these faces gives rise to a base orientation when aligned with the support surface. Different poses with the same base orientation can then be obtained by translating the object on the support surface and rotating it around the surface’s normal. To locate collision-free and reachable placement poses (challenges 1 and 2), these works then employ rejection sampling of positions and orientations using a collision-checker and inverse kinematics solver. This is sufficiently efficient, if there are few obstacles and most sampled poses are within reach. If this is not the case, however, a more efficient strategy such as the one presented in this work is required.
. Schuster et al. present a data-driven segmentation algorithm to discriminate clutter from support surfaces, and apply this segmentation to extract candidate placement poses. Similarly, Jiang et al. follow a data-driven approach and train a classifier to score the placement suitability of candidate poses based on manually defined features. These features are extracted from 3D point-clouds of the object and the environment, and include physical feasibility, stability, as well as human placement preference. The approach is capable of identifying a variety of placements, such as placing a plate in a dish-rack, hanging a mug on a bar or laying a box on a flat surface. In order to evaluate the classifier, however, the approach requires a set of candidate poses. Obtaining these in cluttered environments is non-trivial, as random sampling, for instance, has low probability of sampling good candidate.
Harada et al.  locate placement poses by matching planar surface patches on the object with planar surface patches in the environment. This allows the approach to locate placements on large, flat surfaces and also placements where the handle of a mug is hanging on a flat bar. While the work also integrates this algorithm with a motion planner (challenge 2), it does not perform any optimization of an objective (challenge 3).
In contrast to these previous works, our work’s focus lies on computing reachable placement poses among obstacles (challenge 2) that maximize a user-specified objective (challenge 3). We follow aforementioned previous works when addressing challenge 1 and place objects on horizontal support surfaces.
Ii-B Integrated grasp and motion planning
Ensuring that a collision-free approach motion to a placement exists (challenge 2) requires us to closely integrate the placement search with a motion planning algorithm. This relates our problem to integrated grasp and motion planning [12, 13, 14, 15, 16]. These works present algorithms that simultaneously compute grasps with corresponding approach motions, and demonstrate that in cluttered environments separate planning of grasps and approach motions is inefficient. This is due to the fact that many potential grasps are in collision or out of reach. Our work addresses the analogous challenge for placing, with the extension of optimizing an objective function on the placement (challenge 3). The method that we employ relates to our previous work on integrated grasp and motion planning .
Iii Problem Definition
We consider a dual-arm robot equipped with two manipulators, , that is tasked to place a rigid object in a user-defined target volume in its workspace. We assume that the object can be grasped by either arm, and the process of acquiring a stable grasp is known a priori. The target volume is a set of positions for the object and restricts the search space for placement poses to . Obviously, not all poses in facilitate a stable placement, since for many of these the object might be, for example, in midair or intersecting obstacles. We denote the constraint that a pose facilitates the stable placement of the object as binary mapping , that is if is a stable placement and otherwise. Additionally, we denote the constraint that a pose is physically feasible, i.e. that there is no intersection of the interior of the object with any obstacle, as binary predicate .
A placement pose must be reachable by the robot manipulator. For this, let denote the configuration space of arm , and let denote the pose of the grasped object when the arm is in configuration . We say a pose is path-reachable, , if for some arm there exists a known collision-free continuous path starting from the initial configuration of the robot and ending in a configuration such that it reaches , i.e. .
With these constraints and a user-provided objective function , we formalize our task as the following constraint optimization problem:
Independently of the objective function, the optimization problem is challenging to solve due to the constraints. The collision-free constraint renders the problem non-convex. The stability constraint, , is difficult to model, as it is a function of the physical properties of the object and the local environment. Lastly, the path-reachability constraint requires a motion planning algorithm to compute an approach path, which is generally computationally expensive.
Note that after releasing an object at a placement pose, the robot might not be able to retreat without colliding with the placed object. Hence, in principle, there is the additional constraint that a collision-free retreat motion must be possible. In this work, however, we choose to exclude this constraint from our problem definition and instead assume that a collision-free retreat is always possible.
Iii-a Assumptions on prior information
We assume access to the kinematic and geometric model of the robot, the geometry of the object, the location of its center of mass, and the geometry of the environment in form of surface points . Furthermore, we assume that the environment is rigid, and that gravity acts antiparallel to the -axis of the world’s reference frame.
We also assume that for each manipulator the grasp transformation matrices from the object’s frame to the respective gripper frame are known. We assume that these grasps are selected such that a stable placement pose can be acquired without releasing the object.
For a pose , let be its position and its orientation expressed in rotation angles around the world’s and axis respectively. Our algorithm treats the objective function as a black-box, however, we will assume that the function is numerically differentiable w.r.t. the components of .
We address the problem in (1) with the framework shown in fig:overview. This framework consists of a pre-processing stage and an optimization stage, algo:high_level. The framework receives the information listed on the left as input and produces paths for the different arms as output. The final configuration of each path represents a placement solution using a particular arm, and places the object at a stable and collision-free placement pose . The optimization algorithm, algo:high_level, operates in an anytime fashion, meaning that it iteratively produces new solutions that achieve better objective than the previous solutions .
The base idea of our approach is to decompose the problem into a search for feasible placement poses that fulfill all constraints, and only subsequently optimize the objective. In general, we can not model all of the constraints in (1) in closed form. However, for a particular pose we can verify whether it fulfills the constraints. We therefore address the optimization problem in a sampling-based manner. For each constraint in (1) our framework has a component designed to verify it or to provide samples fulfilling it:
Stable placement. A necessary condition for a pose to be stable, , is that the object is in contact with the environment. Therefore, in the pre-processing stage, we extract surfaces in the target volume and on the object that afford placing. With these surfaces we can obtain an approximation of the set of stable placement poses that serves as search space for our optimization. In addition, these surfaces allow us to verify, whether the object is placed stably at a given pose.
Physically feasible placement. Within the set we need to locate object poses that are physically feasible, , i.e. poses that do not result in penetration of any obstacles. In addition, we need to verify that these poses can be reached by collision-free arm configurations for at least one arm, as this is a necessary condition for a placement pose to be path-reachable, . This verification as well as the sampling is performed within algo:sample, which we refer to as goal sampling algorithm.
Reachable placement. To verify the path-reachability of candidate poses, , we need to construct approach paths to them. For this, we employ a sampling-based motion planning algorithm  that receives arm configurations sampled by the goal sampling algorithm as goals.
Preferred placement. The optimization of the objective function is achieved through two concepts. First, we employ a greedy local optimization algorithm on the poses for which all constraints have been verified to be fulfilled. Second, whenever the motion planner succeeds in verifying path reachability for a new pose sample , we constrain following iterations to only validate path-reachability for poses that achieve a better objective .
Iv-a Defining Potential Contacts
Modeling the set of stable poses of stable placement poses is challenging, due to the large variety of possible placements. While we only require samples from this set, rejection sampling on does not suffice, due to low probability of satisfying the constraint . We therefore approximate this set by the set of poses at which the object is placed on horizontal surfaces. For this, we extract a discrete set of placement contact regions, , from the surface geometry in the target volume. A placement contact region is a contiguous set of surface points that share the same height, see fig:regions_and_faces. In our implementation, we extract these from an occupancy grid of the environment, however, also other techniques could be employed.
To determine the orientation in which the object should make contact with these regions, we extract contact points from the object’s surface. For this, we follow a similar approach as in aforementioned previous works [7, 8] and select faces from the object’s convex hull to place the object on. The convex hull of a finite set of points, i.e. a point cloud of the object, is a convex polyhedron. A face of this polyhedron supports a stable placement on a horizontal surface, if the projection of the object’s center of mass along the face’s normal falls into this face, see fig:regions_and_faces. We refer to the number of faces for which this is the case as placement faces, .
Only the vertices of the boundary of a placement face, , are guaranteed to be part of the actual object’s surface. It is therefore these vertices that need to be in contact with the support surface in order for the object to be placed stably. For each face, we select one of these vertices as reference contact point and define a transformation matrix as shown in fig:regions_and_faces. With this, the combination of a contact region and a placement face defines a class of object poses
where is the -coordinate of the placement region, the rotation matrix around the -axis by angle , and an operator that combines these to a transformation matrix. These poses vary in translation within the contact region and rotation by around the -axis going through the reference point located at . The union of all these sets is then a parameterized approximation of that we can search for feasible placements.
Note that not all poses in are stable, since it’s definition only guarantees that the reference contact point is in contact with the placement contact region. To verify that a pose is actually stable, we need to verify that all vertices of the respective placement face are in contact with a placement region. They may, however, be in contact with different regions, which allows placements where, for instance, an object is placed over a gap in the support surface.
Iv-B Sampling-based Optimization
The optimization of the objective is performed by algo:high_level. The algorithm alternates between executing the sub-algorithms SampleGoals, PlanMotion and OptimizeLocally until termination is requested by the user. SampleGoals computes a finite set of collision-free arm configurations , such that for each the stability constraint, , and the physical feasibility constraint, , are fulfilled. Furthermore, it only returns configurations for which the placement objective improves over the best solution found so far, . Initially, is set to and is updated, whenever the motion planner succeeds at finding a new path to any of the configuration , where stems from the union of all sampled goals.
Motion planning towards the goals, , is performed in PlanMotion. The set is maintained to only contain configurations that reach placements of greater objective than . This guarantees that whenever a new path is found, it reaches a placement with a better objective than any previous solution. Subsequent to finding a new path, OptimizeLocally is executed to further improve the solution locally.
In each iteration SampleFeasible and PlanMotion receive parameters respectively. The presence of the parameters emphasizes that both sub-algorithms maintain an internal state across iterations of the algorithm, which is crucial for the efficiency of the overall approach and will be detailed in the following sections. The parameters limit the computation time budget for each function, to balance the computational burden of sampling new goals and planning motions to sampled ones.
Iv-C Sampling Kinematically Feasible Placements
The function SampleGoals needs to solve a constraint satisfaction problem:
This subproblem by itself is challenging to solve. While we only require samples from the feasible set of this problem, rejection sampling on does not suffice, due to low or zero probability of satisfying . Hence, instead we sample the parameterized pose set , for which it likely that is fulfilled, and employ an inverse kinematics solver to compute arm configurations for the sampled poses. This, however, can be rather inefficient in the presence of obstacles, where the probability of randomly sampling a collision-free object pose and arm configuration reaching is low. We remedy this by employing a sampling procedure that adapts its sampling and focuses on regions of that are likely to fulfill all constraints.
Sampling a pose from involves choosing a placement contact region and a placement face . In addition, to compute an arm configuration reaching a sampled pose, we need to select an arm . While there is an overlap of the poses that each arm can reach, some may be more easily reached by one than the other. Whether a particular placement face is a good choice to place an object on depends on the grasp, and thus on the arm that is selected. Similarly, whether a placement region allows a stable and obstacle penetration free placement strongly depends on the placement face, as this determines the footprint and the base orientation of the object. Furthermore, if a pose for a particular region and face is reachable by an arm , it is likely that poses in close proximity are also reachable by the arm. Hence, there exists a spatial correlation of feasibility within a set , as well as between different sets of with similar categorical choices for and arms .
This observation leads us to the definition of the AFR-hierarchy shown in fig:arf_hierarchy. On the first level of this hierarchy, an arm is selected, on the second level a placement face , and on the third a placement contact region . From the third level on, each node in the hierarchy defines all quantities that we require to sample poses and compute arm configurations. Subsequent level of this hierarchy recursively partition the sets . On these lower level, every node represents a tuple , where define a range of orientation angles. The nodes at depth cover all of and thus it is and . The children of this node, however, will only cover subsets of that are constrained in the positions and orientations.
Let be a node at depth , then the children of this node arise from subdividing the region and the interval . The placement region is divided into four subregions by splitting it along its mean and positions. The interval is split into equally sized sub-intervals. The resulting children of then arise from combining each subregion with each interval of the orientation ranges. This subdivision is continued until some user-specified minimal region area and interval length.
Iv-C2 Monte Carlo Tree Search-based Goal Sampling
To obtain samples that satisfy (2) we exploit the aforementioned correlation and employ a Monte Carlo Tree search (MCTS) -based algorithm for sampling. The algorithm is shown in algo:sample and algo:select, and uses the AFR-hierarchy to produce the desired samples. algo:sample is the SampleGoal procedure that is called by algo:high_level.
The key idea of the algorithm is that it incrementally constructs a tree of the nodes in the AFR hierarchy, and stores for each node the proportion of valid samples that have been obtained from its subbranch. This knowledge is then used to focus sampling on branches of the hierarchy that are likely to contain valid samples, while still maintaining some exploration. The tree is stored in the variable , and thus steadily constructed across all executions of SampleGoal in algo:high_level.
Every time algo:sample is executed it attempts to produce goal samples , where is a collision-free arm configuration for arm reaching a feasible placement pose, that improves the objective . For each sample, the algorithm first selects a node from the AFR hierarchy using algo:select. algo:select ensures that this node fully specifies a set or a subset thereof as described in IV-C1. It then randomly samples a pose from this set and evaluates whether the pose constraints and are fulfilled. If this is the case, it employs an inverse kinematics solver to compute an arm configuration for the arm specified by the AFR node . If such a configuration exists and it is collision-free, we obtained a new goal sample that can be provided to the motion planning algorithm.
After each sample step, the tree stored in is updated according to whether we successfully obtained a new goal sample or not. For each sampled node , we store the following information in :
, the number of samples obtained from or any of its descendants
, the sum of all rewards obtained for sampling or any of its descendants
, the children of that have been added to
The numbers and are updated by the Update function, whereas is updated within algo:select. When we sampled we obtain a reward
where , if all constraints are fulfilled, i.e. and . This reward is binary, if is a leaf of the AFR hierarchy and there are no further subdivisions of the pose set. If
, however, is not a leaf, the reward is a heuristic valuethat also gives non-zero rewards to samples that fulfill some, but not all of the constraints. In any case, the reward is recursively propagated to the ancestors of to update their respective . The number of samples, , is always increased by one as we acquired a single sample, whereas the accumulated reward is updated by the reward obtained by the sampled node .
The decision on which node to sample is made in the SelectAFRNode function, algo:select. The algorithm always starts at the root of the AFR hierarchy and descends to a node in the hierarchy that it decides to sample. Since we can produce samples only for nodes at depths greater than , the algorithm always needs to descend at least to depth before returning any node. For nodes at depth greater than the algorithm descends to its children, as long as is not a leaf and has been sampled before.
Initially, only contains the root of the AFR hierarchy. Hence, the only option the algorithm has is to select a child in the AFR hierarchy that is not in yet. This is done by the AddChild operation, which selects a random child from the AFR hierarchy and adds it to . Let now be any selected AFR node that is already stored in . We distinguish between its children that are also stored in and its total set of children as defined by the hierarchy. If has children in , the algorithm may either descend to one of them, or add a new child to , if . The decision on what to do is based on the UCB1 policy , which is common to employ in Monte-Carlo Tree search and shown in algo:select. It allows to the algorithm to balance between re-sampling branches (exploitation) that have led to valid samples before and exploring new branches. The score for adding a new child is based on the conservative assumption that any unsampled node is as good as the average of its siblings.
Iv-D Motion Planning
The subalgorithm PlanMotion plans motions for each arm separately, as we assume that only one arm is required to perform the actual placement, while all other arms remain in a resting position. In principle, any motion planning algorithm could be employed for this subalgorithm. The only requirement on the algorithm is the possibility to efficiently add and remove goal configurations from the goal set , desirably without loosing information, e.g. samples in a search tree, that could be beneficial for planning paths to future goals.
In our implementation, we employ a modification of OMPL’s  bidirectional RRT algorithm . The algorithm constructs a single forward tree and one backward tree for each goal in . Whenever the algorithm succeeds in connecting the forward tree with a backward tree, the two trees are merged and success is reported. When is modified, the backward trees rooting in goal configurations that have been removed are still maintained, as they may still prove valuable to reach other goals. When connecting to any of these, however, the algorithm no longer reports success and only merges it into the forward tree.
Iv-E Local Optimization
Whenever the motion planning algorithm succeeds in computing a new path , we locally optimize the reached placement pose by following the gradient of the objective function . For this, let be the final configuration of the path that reaches a placement pose . We can locally improve the solution using the following update rule:
where is the pseudo-inverse of the arm’s Jacobian at , a step size, and lifts the three dimensional gradient to a six dimensional end-effector velocity. As long as the updated is collision-free and is not violating any constraints, we concatenate the new configurations to the path and obtain an improved feasible solution.
We implemented our approach in Python using OpenRAVE  and the Open Motion Planning Library . For evaluation, we plan and optimize placements for four different objects on three different environments with varying degree of clutter, see fig:figure_one and fig:scenes_plots. The objects differ in size, shape and in number of placement faces. As robot model, we employ ABB’s dual-arm robot Yumi, where each arm has DoFs. All experiments were run on an Intel Core i7-4790K CPU @ 4.00GHz with GB RAM running Ubuntu 18.04.
As objective function we employ two variations of clearance to obstacles within the placement volume. We define the clearance to obstacles as:
where denotes a finite set of points approximating the volume of when it is located at pose . The function denotes the distance in and positive direction to the environment’s surface within the target volume . Maximizing this function, i.e. , leads to placements where the object is distant to obstacles. This is useful, for example, if the robot is tasked with manipulating the object further after placing. Minimizing this clearance function, i.e. , on the other hand, is a good heuristic when the robot is tasked to pack multiple objects into a limited volume. This objective is particularly interesting, as the closer the object is to be placed to obstacles, the more difficult it is to obtain a collision-free approach path.
As can be seen in fig:figure_one, and better so in the accompanying video, our algorithm succeeds at computing placements with high objective values for all test cases. To evaluate the planning and optimization performance of our approach, we compare it to two simplified modifications. In the first modification, we replace the Monte-Carlo Tree search-based sampling algorithm, algo:sample, with a naive uniform sampler. In addition, in the second modification we remove the local optimization from algo:high_level.
We ran each instance for each objective, object and scene times for minutes and recorded the objective values of the found solutions. The progress of the average objective values as a function of runtime is shown in fig:scenes_plots.
In all test cases all instances of our algorithm compute initial solutions within a few seconds, and succeed at locating better solutions as time progresses. We observe that our algorithm using MCTS performs better than, or as good, as the baselines. The uniform sampler without local optimization performs worse than the one with. Hence, we conclude it is both the local optimization and the MCTS sampling that enable our algorithm to find better solutions faster, and on average achieve larger final objectives.
The mean objective values increase quickly in the beginning before slowing down as they approach the maximum objective ever observed in the respective scene. This is likely due to the fact that the probability of locating poses that improve the objective declines the higher the best achieved objective is.
Vi Discussion Conclusion
We presented an algorithmic framework that computes robot motions to transport a grasped object to a stable placement that optimizes a user-provided objective. Our approach is capable of achieving this even in environments cluttered with obstacles. The approach considers all available arms to reach the best placement and operates in an anytime-fashion, computing initial low-objective solutions quickly and improving on it as computational resources allow.
Our approach combines sampling-based motion planning and local optimization with a hierarchical sampling algorithm based on MCTS. A key novelty lies in the AFR hierarchy and applying MCTS as sampling algorithm. While our experiments already demonstrate the advantage of this approach, we believe it has more potential. For instance, in this work the grasp, that we place the object with, is determined by the chosen arm. An interesting future extension is to incorporate different choices of grasps into the hierarchy. In addition, the sampling algorithm could employ pruning of branches of the hierarchy, if a branch can be proven to not contain any solutions.
Further, we believe the approach can be extended to different types of placements. The different combinations of placement faces and regions constitute disjoint contact classes of object poses. In future work, we intend to extend the AFR hierarchy to more diverse contact classes, such as an object leaning against a wall.
Lastly, one of the weaknesses of the sampling-based optimization approach is the decreasing convergence rate observed in our experiments. To remedy this, we intend to investigate whether we can exploit gradient information not only in the local optimization step, but also in the goal sampling algorithm.
-  A. Bicchi and V. Kumar, “Robotic grasping and contact: A review,” in ICRA, 2000.
-  J. Bohg, A. Morales, T. Asfour, and D. Kragic, “Data-driven grasp synthesis – a survey,” TRO, vol. 30, no. 2, 2014.
-  M. Roa and R. Suárez, “Grasp quality measures: review and performance,” Auton. Robots, vol. 38, no. 1, 2015.
-  P. S. Schmitt, W. Neubauer, W. Feiten, K. M. Wurm, G. V. Wichert, and W. Burgard, “Optimal, sampling-based manipulation planning,” in ICRA, May 2017.
-  Z. Xian, P. Lertkultanon, and Q. Pham, “Closed-chain manipulation of large objects by multi-arm robotic systems,” IEEE RA-L, vol. 2, no. 4, pp. 1832–1839, Oct 2017.
-  C. R. Garrett, T. Lozano-Pérez, and L. P. Kaelbling, “FFRob: Leveraging symbolic planning for efficient task and motion planning,” IJRR, vol. 37, no. 1, pp. 104–136, 2018.
-  W. Wan, H. Igawa, K. Harada, H. Onda, K. Nagata, and N. Yamanobe, “A regrasp planning component for object reorientation,” Autonomous Robots, pp. 1–15, jul 2018.
-  P. Lertkultanon and Q. Pham, “A certified-complete bimanual manipulation planner,” IEEE Transactions on Automation Science and Engineering, vol. 15, no. 3, pp. 1355–1368, July 2018.
-  M. J. Schuster, J. Okerman, H. Nguyen, J. M. Rehg, and C. C. Kemp, “Perceiving clutter and surfaces for object placement in indoor environments,” in Humanoids, Dec 2010, pp. 152–159.
-  K. Harada, T. Tsuji, K. Nagata, N. Yamanobe, and H. Onda, “Validating an object placement planner for robotic pick-and-place tasks,” Robotics and Autonomous Systems, vol. 62, no. 10, pp. 1463 – 1477, 2014.
-  Y. Jiang, M. Lim, C. Zheng, and A. Saxena, “Learning to place new objects in a scene,” IJRR, vol. 31, no. 9, pp. 1021–1043, 2012.
-  J. A. Haustein, K. Hang, and D. Kragic, “Integrating motion and hierarchical fingertip grasp planning,” in ICRA, May 2017, pp. 3439–3446.
-  J. Rosell, R. Suárez, and A. Pérez, “Path planning for grasping operations using an adaptive pca-based sampling method,” Auton. Robots, 2013.
-  D. Berenson, R. Diankov, K. Nishiwaki, S. Kagami, and J. Kuffner, “Grasp planning in complex scenes,” in Humanoids, 2007.
-  N. Vahrenkamp, T. Asfour, and R. Dillmann, “Simultaneous grasp and motion planning: Humanoid robot armar-iii,” IEEE Robotics Automation Magazine, 2012.
-  J. Fontanals, B. A. Dang-Vu, O. Porges, J. Rosell, and M. A. Roa, “Integrated grasp and motion planning using independent contact regions,” in Humanoids, Nov 2014.
-  M. Elbanhawi and M. Simic, “Sampling-based robot motion planning: A review,” IEEE Access, vol. 2, pp. 56–77, 2014.
-  C. B. Browne, E. Powley, D. Whitehouse, S. M. Lucas, P. I. Cowling, P. Rohlfshagen, S. Tavener, D. Perez, S. Samothrakis, and S. Colton, “A survey of monte carlo tree search methods,” IEEE Transactions on Computational Intelligence and AI in Games, vol. 4, no. 1, pp. 1–43, March 2012.
-  P. Auer, N. Cesa-Bianchi, and P. Fischer, “Finite-time analysis of the multiarmed bandit problem,” Machine Learning, vol. 47, no. 2, pp. 235–256, May 2002.
-  I. A. Şucan, M. Moll, and L. E. Kavraki, “The Open Motion Planning Library,” IEEE Robotics & Automation Magazine, vol. 19, no. 4, pp. 72–82, December 2012, http://ompl.kavrakilab.org.
-  J. J. Kuffner and S. M. LaValle, “Rrt-connect: An efficient approach to single-query path planning,” in ICRA, vol. 2, April 2000, pp. 995–1001 vol.2.
-  R. Diankov, “Automated construction of robotic manipulation programs,” Ph.D. dissertation, Carnegie Mellon University, Robotics Institute, 2010.