I Introduction
Seemingly complex manipulation tasks can often be decomposed into a sequence of simpler behaviors. For example, picking a credit card off a table may consist of a pull to cantilever the card followed by a grasp to acquire the card. In part motivated by this observation, researchers have studied manipulation behavior segmented into manipulation primitives such as grasping, pulling, pushing, etc.
These primitives are often used to facilitate planning and control for robotic manipulation; however, defining primitives, planning within a primitive, and scheduling primitives are all current areas of research. One approach is to use narrowly defined primitives that are simpler to plan and control at the expense of needing more of them; for example, when every contact mode/type is a primitive [31]. On the other hand, complex primitives are often more expressive, but incur a higher computational cost and can be challenging to realize on a physical system. One example is when a sequence of contact switches is treated as a single behavior [20, 15].
This work proposes a fast planning and control framework that supports a small number of hybrid switches for primitives of moderate complexity with underactuated frictional dynamics. Switching contact formations within a primitive increases its expressiveness, which can reduce the number of primitives needed and, consequently, ease their scheduling.
Contributions We develop a hybrid differential dynamic programming (DDP) algorithm for executing manipulation primitives with frictional contact switches. Our approach extends input-constrained DDP to handle hybrid dynamics. We also present a numerical study on the convergence properties and computational requirements of our algorithm for two manipulation primitives: planar pushing and planar pivoting. We can plan a finite-horizon trajectory while considering a small number of contact switches (up to four) within a reasonable amount of time (one to five seconds). Our experiments show that:

The ability to select and switch contact locations is key to the success of a primitive.

Only a couple (one to two) contact location switches are needed to converge from most initial configurations.
Finally, we show that our framework can plan and control hybrid trajectories on a physical planar pushing system.
Paper Structure We begin by reviewing the basics of DDP and extensions to reconcile constraints and hybrid dynamics characteristic of frictional interaction in Sec. III. We derive the motion models for planar pushing and pivoting in Sec. IV. Section V describes a simulation study to evaluate the success rate and computation time of the algorithm for these two primitives. We experimentally validate the algorithm for the planar pushing primitive in Sec. VI, and finally, we summarize the results, limitations, and directions of future work in Sec. VII.
II Relevant Work
In the following sections, we cover relevant research on manipulation primitives and DDP.
Manipulation Primitives There is a long history in robotic manipulation of developing the mechanics of and planning for primitives. Mason [16] introduced the mechanics of planar pushing, which have since been studied by a number of researchers [14, 8, 34]. This line of work has been extended to many other primitives, including prehensile pushing [3], tumbling [21], pivoting [9, 13, 10], scooping [29, 30], tilting [4], and dynamic in-hand sliding [22].
A number of researchers have also focused on sequencing primitives to achieve complex manipulations [28, 26, 1, 23, 6]. For example, Toussaint et al. [27] use a few expressive primitives to realize a diverse set of behaviors; however, this approach is only verified in simulation. On the other hand, Woodruff et al. [31] treat each contact formation as a different primitive and use this to execute closed-loop dynamic motions with a fixed primitive schedule on a physical system. Our framework balances these approaches and is similar to that of Hou et al. [10], who develop controllers for two moderate-complexity primitives and demonstrate pose-to-pose reorientation on a physical system.
Differential Dynamic Programming DDP is an iterative, indirect trajectory optimization method that leverages the temporal structure in Bellman’s equation to achieve local optimality. Originally developed by Jacobson and Mayne [12] for unconstrained systems, it has since been extended to systems with box input constraints [25], linear input constraints [18], and nonlinear constraints on input and states [32].
Relevant to this work, Tassa et al. [24] and Mordatch et al. [17] use DDP with smoothed contact models to plan and stabilize trajectories for legged robots. Moreover, Yamaguchi and Atkeson [33] apply DDP to the problem of planning for graph-dynamical systems, and they use a sample-based approach to determine the mode sequence. Finally, Pajarinen et al. [19] consider DDP for planar pushing, and they optimize over a continuous mixture of discrete actions that is forced back into fully discrete actions at convergence.
III Hybrid Planning and Control
We first review the basics of DDP (Sec. III-A) and its extension to input-constrained systems (Sec. III-B). We then describe our hybrid DDP algorithm in Sec. III-C.
III-A DDP Preliminaries
Consider a discrete-time dynamical system of the form
(1) 
where is a smooth function that maps the system’s state () and control input () to the next state. The goal is to find an input trajectory that minimizes an additive cost function,
(2) 
Here is the timestep, is the running cost, is the final cost, is the time horizon, is the initial state, and are determined by integrating (1) forward in time. We can define the optimal cost-to-go at the th timestep using Bellman’s equation [2],
(3) 
with the terminal condition .
To handle the nonlinearity in (3), DDP iteratively optimizes a quadratic approximation near an initial trajectory. The algorithm starts with a forward pass that integrates (1) for an initial state and input trajectory . This is followed by a backward pass that computes a local solution to (3) using a quadratic Taylor expansion. This sequence of forward and backward passes is repeated until convergence.
The Taylor expansion of the argument of (3) about a nominal pair is given by
(4) 
The quadratic approximation of can be written as:
(5) 
where the block matrices are functions of , , , and their first and second derivatives [25]. The control modification is obtained by minimizing (5) with respect to for some state perturbation :
(6) 
Substituting this for in (5) gives a quadratic model for
(7) 
The backward pass initializes the quadratic approximation of with and its derivatives, and then recursively computes (6) and propagates the value approximation (7).
The algorithm then integrates (1) to compute a new trajectory, completing one iteration. The control during this forward pass is set to with taken as the difference between across consecutive iterations. Moreover, is regularized to ensure (6) has a valid solution, and a line search over is needed to ensure cost improvement and good convergence from an arbitrary initialization [25].
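The forward and backward passes described above can be sketched for a scalar system. The snippet below is a minimal illustration under stated assumptions, not the implementation used in this work: it assumes linear dynamics x_{k+1} = a x_k + b u_k, quadratic running and final costs, and hypothetical names (`backward_pass`, `forward_pass`, `ddp`, and the step size `alpha`). Regularization is omitted because the scalar Q_uu is positive by construction here.

```python
def rollout(x0, us, a, b):
    """Forward-integrate the scalar dynamics x_{k+1} = a*x_k + b*u_k (assumed model)."""
    xs = [x0]
    for u in us:
        xs.append(a * xs[-1] + b * u)
    return xs


def total_cost(xs, us, q, r, qf):
    """Additive cost: quadratic running cost plus quadratic final cost."""
    running = sum(0.5 * (q * x * x + r * u * u) for x, u in zip(xs[:-1], us))
    return running + 0.5 * qf * xs[-1] ** 2


def backward_pass(xs, us, a, b, q, r, qf):
    """Return feedforward terms k and feedback gains K for every timestep."""
    Vx, Vxx = qf * xs[-1], qf  # value model initialized from the final cost
    ks, Ks = [], []
    for x, u in zip(reversed(xs[:-1]), reversed(us)):
        Qx, Qu = q * x + a * Vx, r * u + b * Vx
        Qxx, Quu, Qux = q + a * Vxx * a, r + b * Vxx * b, b * Vxx * a
        ks.append(-Qu / Quu)
        Ks.append(-Qux / Quu)
        # Propagate the quadratic value model one step back in time.
        Vx = Qx - Qux * Qu / Quu
        Vxx = Qxx - Qux * Qux / Quu
    return ks[::-1], Ks[::-1]


def forward_pass(xs, us, ks, Ks, a, b, alpha):
    """Apply u = u_nom + alpha*k + K*(x - x_nom) while re-integrating the dynamics."""
    new_xs, new_us = [xs[0]], []
    for k_ff, K, x_nom, u_nom in zip(ks, Ks, xs[:-1], us):
        u = u_nom + alpha * k_ff + K * (new_xs[-1] - x_nom)
        new_us.append(u)
        new_xs.append(a * new_xs[-1] + b * u)
    return new_xs, new_us


def ddp(x0, us, a=1.0, b=0.1, q=1.0, r=0.1, qf=10.0, iters=20):
    xs = rollout(x0, us, a, b)
    cost = total_cost(xs, us, q, r, qf)
    for _ in range(iters):
        ks, Ks = backward_pass(xs, us, a, b, q, r, qf)
        for alpha in (1.0, 0.5, 0.25, 0.1):  # backtracking line search
            cand_xs, cand_us = forward_pass(xs, us, ks, Ks, a, b, alpha)
            cand_cost = total_cost(cand_xs, cand_us, q, r, qf)
            if cand_cost < cost:
                xs, us, cost = cand_xs, cand_us, cand_cost
                break
        else:
            break  # no step size improved the cost; treat as converged
    return xs, us, cost
```

Calling `ddp(1.0, [0.0] * 20)` starts from a zero-input rollout and returns the improved state trajectory, input trajectory, and cost. For a nonlinear system the derivatives in `backward_pass` would be re-evaluated along each new nominal trajectory.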
III-B Input Constrained DDP
Now consider a system where the control inputs are linearly constrained by inequality (or equality) constraints:
(8) 
Here and are potentially nonlinear functions of state. This class of constraint can represent both planar friction and force-balance constraints for a fixed contact mode. The DDP algorithm is modified in two ways for these constraints [18]. First, the analytic minimization in the backward pass (6) is replaced by a constrained quadratic program (QP) evaluated at the nominal :
s.t.  (9) 
The solution to this QP gives the value of the feedforward control satisfying the input constraints. We then consider the state variation when solving for the feedback gain . Details on this are given by Murray and Yakowitz [18].
Second, even though satisfies the input constraints, the new control computed using during the forward pass (Line 12, Alg. 1) can violate feasibility. Consequently, it must be projected onto the constraint set. When and have a simple geometric representation (e.g., a box or a cone), we can algebraically project the new control input onto the feasible set. In all other cases, we solve another QP detailed by Murray and Yakowitz [18]. The input-constrained DDP algorithm is outlined in Alg. 1.
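For a single input with box limits, both modifications reduce to clamping. The sketch below is illustrative only and makes several assumptions: the constraint is a scalar box (the simplest linear constraint), the feedback gain is set to zero when the bound is active (the convention of control-limited DDP [25], which may differ from the general treatment in [18]), and all function names are hypothetical.

```python
def clamp(value, lo, hi):
    """Project a scalar onto the interval [lo, hi]."""
    return max(lo, min(hi, value))


def constrained_backward_step(Qu, Quu, Qux, u_nom, u_lo, u_hi):
    """Solve min_du 0.5*Quu*du^2 + Qu*du  subject to  u_lo <= u_nom + du <= u_hi.

    Returns the feedforward step du and the feedback gain K.
    """
    if Quu <= 0.0:
        raise ValueError("Quu must be positive (regularize first)")
    u_target = u_nom - Qu / Quu            # unconstrained minimizer
    active = u_target < u_lo or u_target > u_hi
    du = clamp(u_target, u_lo, u_hi) - u_nom
    K = 0.0 if active else -Qux / Quu      # assumption: no feedback on an active bound
    return du, K


def project_control(u_nom, du, K, dx, alpha, u_lo, u_hi):
    """Forward-pass control, projected back onto the feasible interval."""
    return clamp(u_nom + alpha * du + K * dx, u_lo, u_hi)
```

For a convex scalar QP, clamping the unconstrained minimizer is the exact solution; with several coupled inputs a QP solver or a cone projection would take the place of `clamp`.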
III-C Hybrid DDP
Our algorithm extends input-constrained DDP to systems with hybrid switches. We use DDP as a subroutine to (1) explore and rank all feasible mode sequences and to (2) optimize the trajectory and feedback law associated with the best mode sequence. In addition to initial state and input trajectories, the user can specify the maximum number of hybrid switches () and the set of hybrid modes ().
Our algorithm first builds a depth tree of trajectories that enumerates all feasible hybrid possibilities. We use input-constrained DDP with a small iteration limit to optimize each edge (trajectory) in the tree and approximate its cost. Second, we select the branch (a sequence of connected edges) with the lowest cumulative cost, and fix the mode sequence to that of the selected branch. Finally, we optimize the trajectory and controller associated with the best branch using input-constrained DDP.
For computational efficiency, we initialize DDP with the inputs that result in static equilibrium and prune the tree during exploration by eliminating branches whose initial configurations do not satisfy static equilibrium. The hyperparameters of our algorithm (explored in Sec. V) are the number of hybrid switches, the set of hybrid modes, the planning horizon (), and the maximum number of DDP iterations during tree generation ().
In summary, our algorithm can be thought of as an exhaustive search over all hybrid modes with pruning based on static equilibrium.
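The tree search can be sketched as a depth-first enumeration. This is a toy illustration under stated assumptions rather than the algorithm's actual implementation: `edge_optimizer` is a hypothetical stand-in for a short input-constrained DDP solve, each branch has one segment per allowed switch plus one, and the pruning rule in `toy_edge` (discard a pusher that cannot make progress) merely stands in for the static-equilibrium check. The toy object is a point in the plane that each pusher can move a limited distance along one axis.

```python
def best_mode_sequence(x0, modes, max_switches, edge_optimizer):
    """Enumerate mode sequences with max_switches + 1 segments and keep the cheapest.

    edge_optimizer(x, mode) returns (edge_cost, x_next), or None to prune the edge.
    """
    best = (float("inf"), None)

    def expand(x, sequence, cost):
        nonlocal best
        if len(sequence) == max_switches + 1:
            if cost < best[0]:
                best = (cost, tuple(sequence))
            return
        for mode in modes:
            result = edge_optimizer(x, mode)
            if result is None:
                continue  # pruned branch
            edge_cost, x_next = result
            expand(x_next, sequence + [mode], cost + edge_cost)

    expand(x0, [], 0.0)
    return best


# Hypothetical pushers: each one moves the object along a fixed direction.
DIRECTIONS = {"left": (1.0, 0.0), "right": (-1.0, 0.0),
              "bottom": (0.0, 1.0), "top": (0.0, -1.0)}


def toy_edge(x, mode, reach=0.5):
    """Move toward the origin along the pusher direction, at most `reach` per segment."""
    dx, dy = DIRECTIONS[mode]
    needed = -(x[0] * dx + x[1] * dy)
    travel = max(0.0, min(reach, needed))
    if travel == 0.0 and x != (0.0, 0.0):
        return None  # this pusher cannot help from here
    x_next = (x[0] + travel * dx, x[1] + travel * dy)
    return x_next[0] ** 2 + x_next[1] ** 2, x_next


cost, sequence = best_mode_sequence((-0.5, -0.25), list(DIRECTIONS), 1, toy_edge)
```

In this toy problem the selected branch uses the left pusher followed by the bottom pusher, i.e., one hybrid switch. The number of edges grows exponentially with the number of switches, which is why pruning and a small per-edge iteration limit matter.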
IV Manipulation Primitives
Here we derive the equations of motion (EOM) for the primitives used in this work: quasi-static planar pushing (Sec. IV-A) and dynamic planar pivoting (Sec. IV-B).
IV-A Quasi-static Planar Pushing
We consider quasi-static pushing in a horizontal plane with four potential (only one active at any given instant) sticking point contacts (Fig. 1). The object’s state is , where and are the position of its center of mass (COM), and is its orientation. The quasi-static EOM are
(10) 
where is the object’s twist at timestep . Using force balance and an ellipsoidal approximation [11] of the limit surface [5], we can write
(11) 
Here is the index of the active contact, is the active pusher force in the body frame, is the contact Jacobian for the active contact, is the rotation between the body and world frames, and is the gradient of the limit surface with respect to the applied wrench. Hogan et al. [7] give further details on this derivation.
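One discrete step of such a quasi-static pushing model might look as follows. This is a sketch under stated assumptions, not the exact model used here: the limit surface is taken as H(w) = 0.5 wᵀAw with a diagonal A built from hypothetical maximum friction force `f_max` and moment `m_max`, the body-frame twist is taken to be the gradient Aw, and the state is advanced with an explicit Euler step of length `h`. The scaling between force and twist depends on the chosen model.

```python
import math


def pushing_step(state, force_body, contact_body, h, f_max=1.0, m_max=0.05):
    """Advance (x, y, theta) by one step for a sticking point contact.

    force_body:   contact force (fx, fy) in the body frame
    contact_body: contact location (px, py) relative to the COM, body frame
    """
    x, y, theta = state
    fx, fy = force_body
    px, py = contact_body
    # Wrench applied by the contact: planar force plus torque about the COM.
    torque = px * fy - py * fx
    # Gradient of the assumed ellipsoidal limit surface gives the body-frame twist.
    vx_b = fx / f_max ** 2
    vy_b = fy / f_max ** 2
    omega = torque / m_max ** 2
    # Rotate the linear velocity into the world frame and integrate.
    c, s = math.cos(theta), math.sin(theta)
    vx_w = c * vx_b - s * vy_b
    vy_w = s * vx_b + c * vy_b
    return (x + h * vx_w, y + h * vy_w, theta + h * omega)
```

A force whose line of action passes through the COM produces pure translation, while an offset contact also rotates the object, which is the underactuation the planner has to reason about.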
The active pusher force must lie within the corresponding friction cone () for sticking contact, and we also place an upper bound on the pusher normal force. In the contact frame (+x aligned with the contact normal), this can be written as
(12) 
where () are the normal and tangential components of in the contact frame, and is the coefficient of friction between the pusher and the object.
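These conditions are linear in the contact force, so they fit the linear input-constraint form of Sec. III-B. The sketch below writes them as rows of an inequality A u ≤ b for u = (f_n, f_t) in the contact frame. The function names and the normal-force bound of 5.0 are assumptions chosen for illustration; the friction coefficient 0.3 matches the value used in Sec. V.

```python
def friction_cone_rows(mu, f_n_max):
    """Rows (a_n, a_t, b) encoding a_n*f_n + a_t*f_t <= b."""
    return [
        (-1.0, 0.0, 0.0),      # f_n >= 0 (the pusher can only push)
        (1.0, 0.0, f_n_max),   # f_n <= f_n_max
        (-mu, 1.0, 0.0),       # f_t <= mu * f_n
        (-mu, -1.0, 0.0),      # -f_t <= mu * f_n
    ]


def satisfies(rows, f_n, f_t, tol=1e-9):
    """True when the force (f_n, f_t) satisfies every inequality row."""
    return all(a_n * f_n + a_t * f_t <= b + tol for a_n, a_t, b in rows)


rows = friction_cone_rows(mu=0.3, f_n_max=5.0)
inside = satisfies(rows, 1.0, 0.2)    # tangential force within the cone
outside = satisfies(rows, 1.0, 0.4)   # tangential force too large: contact would slip
```

Stacking such rows for every force in the input vector gives the constraint matrices that the backward-pass QP consumes.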
IV-B Dynamic Planar Pivoting
We also consider dynamic pivoting in the gravity plane about a sticking frictional pivot with the ground (Fig. 2). The object is rotated about this pivot by sticking contacts at one of the other three corners. Each corner contact is treated as a point-line contact with the line fixed at with respect to both sides of that corner.
The object’s state is , where is the orientation of the object and is its angular velocity. We write the discrete-time dynamics of the system as
(13) 
where and is
(14) 
Here is the index of the active contact, is the active contact force in the body frame, is the ground reaction force in the world frame, is the rotation between the world and body frames, is the mass moment of inertia of the object, and () is the vector from the COM to the ground (active) contact. We constrain the ground reaction force in terms of the active force, gravity, and the object’s inertia. This is given by the momentum principle:
(15) 
where is the gravitational force and is the time derivative of linear momentum of the COM. Note that can be computed in terms of , and . Finally, we enforce that all contact forces (active and ground) lie within their friction cones and place an upper bound on the normal contact forces.
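A simplified discrete-time pivoting step can be sketched by taking moments about the pivot instead of the COM. This is an assumption-laden alternative to the formulation above, not a transcription of it: the ground reaction passes through the pivot and therefore drops out of the torque balance, so it would have to be recovered separately (e.g., from the momentum principle) to check its friction cone. The integrator is semi-implicit Euler, and all names are hypothetical.

```python
import math

GRAVITY = 9.81  # m/s^2


def pivoting_step(theta, omega, h, mass, inertia_com, com_from_pivot_body,
                  contact_from_pivot_body=(0.0, 0.0), force_body=(0.0, 0.0)):
    """Advance (theta, omega) for rotation about a fixed, sticking pivot.

    com_from_pivot_body:     vector from the pivot to the COM, body frame
    contact_from_pivot_body: vector from the pivot to the active contact, body frame
    force_body:              active contact force, body frame
    """
    cx, cy = com_from_pivot_body
    c, s = math.cos(theta), math.sin(theta)
    com_world_x = c * cx - s * cy
    # Gravity (0, -m*g) at the COM: torque = r_x * F_y - r_y * F_x with F_x = 0.
    tau_gravity = com_world_x * (-mass * GRAVITY)
    # The planar cross product is frame-independent, so use body-frame vectors.
    px, py = contact_from_pivot_body
    fx, fy = force_body
    tau_contact = px * fy - py * fx
    # Parallel-axis theorem: inertia about the pivot.
    inertia_pivot = inertia_com + mass * (cx * cx + cy * cy)
    alpha = (tau_gravity + tau_contact) / inertia_pivot
    omega_next = omega + h * alpha
    theta_next = theta + h * omega_next
    return theta_next, omega_next
```

With the COM directly above the pivot and no active force the object stays at rest, but any offset makes it fall, which illustrates why a static-equilibrium initialization matters for this primitive.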
V Numerical Studies
We use our algorithm to plan pose-to-pose trajectories for the planar pushing and pivoting manipulation primitives. We present a number of representative trajectories in Sec. V-A and conduct ablation studies in Sec. V-B. We use a simple quadratic total cost of the form
(16) 
to generate all trajectories. Here is the distance from the goal and , , and are diagonal matrices.
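A quadratic cost with diagonal weights can be evaluated as in the sketch below. The weight names (`q_diag`, `r_diag`, `qf_diag`) and the split into state-error, input, and final-error terms are assumptions about the form of the cost; any timestep scaling and angle wrapping of the orientation error are omitted.

```python
def quadratic_cost(states, inputs, goal, q_diag, r_diag, qf_diag):
    """Sum of weighted squared state errors and inputs, plus a final-error term.

    states has one more entry than inputs; the weights are the diagonals of
    the (assumed) running-state, input, and final-state weight matrices.
    """
    def weighted_sq(vec, weights):
        return sum(w * v * v for w, v in zip(weights, vec))

    cost = 0.0
    for x, u in zip(states[:-1], inputs):
        err = [xi - gi for xi, gi in zip(x, goal)]
        cost += weighted_sq(err, q_diag) + weighted_sq(u, r_diag)
    final_err = [xi - gi for xi, gi in zip(states[-1], goal)]
    return cost + weighted_sq(final_err, qf_diag)


example = quadratic_cost(
    states=[(1.0, 0.0), (0.5, 0.0), (0.0, 0.0)],
    inputs=[(1.0,), (1.0,)],
    goal=(0.0, 0.0),
    q_diag=(1.0, 1.0), r_diag=(0.1,), qf_diag=(10.0, 10.0),
)
```

Because the weights are diagonal, the gradient and Hessian needed by the backward pass are available in closed form.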
V-A Simulated Trajectory Planning
Representative planar pushing and pivoting trajectories are shown in Fig. 3 and Fig. 4, respectively.
Planar Pushing We compute trajectories from eight initial conditions for available pusher sets of dimension one and three (Fig. 3). The goal is () . Trajectories are considered successful if the final errors in , , and are less than , , and , respectively. We set the maximum number of hybrid switches () to 1, the maximum iterations during tree generation () to 10, and the planning horizon () to 24. Moreover, we use a timestep () of , a coefficient of friction () of 0.3 at both frictional contacts, and allow a maximum normal force of .
We observe that with only the left pusher (Fig. 3a), the algorithm finds solutions for initial conditions that are to the left of the goal. Note that this corresponds to pure input-constrained DDP as there is only one available contact. With three available pushers (Fig. 3b), the algorithm finds trajectories that converge to the goal from all initial conditions. The planner usually only needs to select the best pusher; however, we see a hybrid switch for one of the trajectories. Finally, the mean planning time is and for one and three available pushers, respectively.
Planar Pivoting We compute trajectories for available palm sets of dimension one, two, and three from two initial conditions (Fig. 4). The goal is and . Trajectories are considered successful if the final errors in and are less than and , respectively. The object’s mass is . Moreover, we use , , , , , and allow a maximum normal force of .
For pivoting, we observe that the ability to reason about contact switches is important. For example, we cannot pivot the object from to with only a single active contact (Fig. 4a) using pure input-constrained DDP. Moreover, the planner finds different solutions when there is more than one available contact (Fig. 4b). Finally, the mean planning time is 0.67, 3.12, and for the trajectories with one, two, and three available contacts, respectively.
V-B Ablation Studies
We also conduct a number of one-dimensional ablation studies that explore the effect of the hyperparameters of our algorithm on success rate (defined above) and planning time.
Planar Pushing In Fig. 5, we depict the effect of the number of available pushers and one other ablation parameter: (a) number of hybrid switches, (b) number of DDP iterations during mode selection (tree generation), and (c) the horizon of the trajectory. When not varied, these parameters are fixed to , , and . For each parameter, we consider all active pusher combinations and plan trajectories from 27 initial conditions for each pusher combination.
We find that across all parameters, success rate increases with the number of available pushers (A, Fig. 5). This is intuitive as allowing for more pusher directions increases controllability. This success, however, comes at the cost of increased planning time (B, Fig. 5), though planning time is most affected by the number of hybrid switches (Fig. 5a). We also find that success rate is robust to hyperparameter changes with three or four available pushers (C, Fig. 5). Finally, we can achieve a success rate of 1 with a planning time of for a number of different hyperparameter combinations.
Planar Pivoting We present the effects of varying the same hyperparameters as above for pivoting in Fig. 6. When not varied, these parameters are fixed to , , and . For each parameter combination, we consider all palm contact combinations and plan trajectories from two initial conditions for objects with aspect ratios of 0.5, 1.0, and 1.5.
Similar to the pushing primitive, we find that success rate increases with the number of available palms (A, Fig. 6); however, we are only able to reach a maximum success rate of 0.6 to 0.7. Interestingly, there is not a corresponding increase in planning time (B, Fig. 6), though the overall planning time is higher than for planar pushing due to the more complex dynamics of pivoting. Our results suggest that the planner is also more sensitive to the choice of hyperparameters; for example, planning over an 18-step horizon outperforms planning over 12- and 24-step horizons (C, Fig. 6).
VI Experimental Results
In this section we evaluate our framework on a physical planar pushing system.
Experimental Setup We use an industrial robotic manipulator (ABB IRB 120, Fig. 7). The object rests on a flat plywood surface and is moved by a metallic rod attached to the robot. The feedback controller (6) runs at , and the object pose is tracked using a motion capture system (Vicon, Bonita) at . The object’s physical properties are described in Sec. V-A, and it has a length of .
We convert the inputs to our model (applied force and timestep length) into position commands for the robot manipulator using the following kinematic relation
(17) 
Here is the Cartesian position of the pusher in the world frame, and , , and are defined in Sec. IV-A.
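One way such a conversion can work for a sticking contact is to move the pusher with the material point it touches. The sketch below is an assumption about the form of this relation rather than a transcription of it: it takes a body-frame object twist (vx, vy, ω), computes the velocity of the contact point as v + ω × r, rotates it into the world frame, and integrates over one timestep `h`. All names are hypothetical.

```python
import math


def pusher_command(p_world, theta, twist_body, contact_body, h):
    """Next pusher position for a sticking contact.

    p_world:      current pusher position (x, y) in the world frame
    theta:        object orientation
    twist_body:   object twist (vx, vy, omega) in the body frame
    contact_body: contact location relative to the COM, body frame
    """
    vx, vy, omega = twist_body
    rx, ry = contact_body
    # Velocity of the contact point in the body frame: v + omega x r.
    cvx = vx - omega * ry
    cvy = vy + omega * rx
    # Rotate into the world frame and take one Euler step.
    c, s = math.cos(theta), math.sin(theta)
    return (p_world[0] + h * (c * cvx - s * cvy),
            p_world[1] + h * (s * cvx + c * cvy))
```

Commanding positions rather than forces suits a position-controlled industrial arm, at the price of not regulating the contact force directly.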
Straight Line Pushing We evaluate the performance of our controller on five straight-line pushes (Fig. 8). We compare these against five open-loop straight-line pushes. The open-loop standard deviation for error in , , and is , , and , respectively. The closed-loop controller significantly reduces the standard deviation in error to , , and in , , and , respectively.
Hybrid Pushing We also validate our framework for three pushes starting from more challenging initial conditions, with zero, one, and two contact switches (Fig. 9). Our planner finds pushing trajectories that reach the goal and are effectively stabilized by the controller. However, slipping between the pusher and the object results in slightly higher final pose error than in the straight-line pushing scenario.
VII Discussion
We summarize the major findings of this work (Sec. VII-A), discuss some important limitations (Sec. VII-B), and propose directions for future work (Sec. VII-C).
VII-A Conclusions
We introduce a hybrid DDP algorithm for dynamical systems with frictional interactions, discontinuous switches, and input constraints. We validate this framework on two planar manipulation primitives and demonstrate that we can plan and control over a finite horizon while reasoning about a small number of contact switches. We also present a numerical study on the convergence properties and computational cost of our algorithm, finding that one or two contact switches are sufficient for convergence from most initial conditions and result in a reasonable planning time (one to five seconds). Finally, we validate our approach on a physical planar pushing task, showcasing our closed-loop controller’s ability to track the planned trajectory.
VII-B Limitations
Though we are able to drive any initial condition to the origin for planar pushing, this is not the case for planar pivoting. We believe this is due to a poor initialization of the DDP planner. While quasi-static planar pushing always maintains static equilibrium, planar pivoting can be unstable if not properly initialized. Furthermore, though we achieve closed-loop tracking for straight-line pushing, pose errors are larger for more complex pushing trajectories. This is likely a result of slipping between the pusher and object that is unaccounted for in both the planner and controller. A lower-level controller that enforces sticking [6] or corrects for slipping [8] would complement our approach.
VII-C Future Work
This paper develops a framework that has the potential to enable richer manipulation primitives. One extension is to investigate the performance of our approach on a wider range of primitives, including pulling, prehensile pushing, rolling, tilting, etc. This effort will require both identifying appropriate mechanics models and adapting the hybrid DDP framework. Moreover, we would like to explore more sophisticated controllers, as detailed by Hogan and Rodriguez [8], to reason about contact sliding relative to the object.
References
[1] (2013) Manipulation with multiple action types. In Experimental Robotics: The 13th International Symposium on Experimental Robotics, pp. 531–545.
[2] (1966) Dynamic programming. Science 153 (3731), pp. 34–37.
[3] (2018) In-hand manipulation via motion cones. In Robotics: Science and Systems.
[4] (1988) An exploration of sensorless manipulation. IEEE Journal on Robotics and Automation 4 (4), pp. 369–379.
[5] (1991) Planar sliding with dry friction part 1. Limit surface and moment function. Wear 143 (2), pp. 307–330.
[6] Tactile dexterity: manipulation primitives with tactile feedback. In 2020 IEEE International Conference on Robotics and Automation (ICRA). Note: In review.
[7] (2018-05) Reactive planar manipulation with convex hybrid MPC. In 2018 IEEE International Conference on Robotics and Automation (ICRA), pp. 247–253.
[8] (2016) Feedback control of the pusher-slider system: a story of hybrid and underactuated contact dynamics. In WAFR.
[9] (2015) A general framework for open-loop pivoting. In 2015 IEEE International Conference on Robotics and Automation (ICRA), pp. 3675–3681.
[10] (2018) Fast planning for 3D any-pose-reorienting using pivoting. In 2018 IEEE International Conference on Robotics and Automation (ICRA), pp. 1631–1638.
[11] (1996) Practical force-motion models for sliding manipulation. The International Journal of Robotics Research 15 (6), pp. 557–572.
[12] (1970) Differential dynamic programming. North-Holland.
[13] (2016) Adaptive control for pivoting with visual and tactile feedback. In 2016 IEEE International Conference on Robotics and Automation (ICRA), pp. 399–406.
[14] (1996) Stable pushing: mechanics, controllability, and planning. The International Journal of Robotics Research 15 (6), pp. 533–556.
[15] (2019) Contact-implicit trajectory optimization using variational integrators. The International Journal of Robotics Research, pp. 0278364919849235.
[16] (1986) Mechanics and planning of manipulator pushing operations. The International Journal of Robotics Research 5 (3), pp. 53–71.
[17] (2012-08) Discovery of complex behaviors through contact-invariant optimization. ACM Transactions on Graphics (TOG) 31 (4), pp. 43–43:8.
[18] (1979) Constrained differential dynamic programming and its application to multireservoir control. Water Resources Research 15 (5), pp. 1017–1027.
[19] (2017) Hybrid control trajectory optimization under uncertainty. In 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 5694–5701.
[20] (2014-01) A direct method for trajectory optimization of rigid bodies through contact. International Journal of Robotics Research 33 (1), pp. 69–81.
[21] (1991) Tumbling objects using a multifingered robot. Journal of the Robotics Society of Japan 9 (5), pp. 560–571.
[22] (2017) Dynamic in-hand sliding manipulation. IEEE Transactions on Robotics 33 (4), pp. 778–795.
[23] (2016) A framework for fine robotic assembly. In 2016 IEEE International Conference on Robotics and Automation (ICRA), pp. 421–426.
[24] (2012) Synthesis and stabilization of complex behaviors through online trajectory optimization. In IEEE/RSJ International Conference on Intelligent Robots and Systems.
[25] (2014) Control-limited differential dynamic programming. In 2014 IEEE International Conference on Robotics and Automation (ICRA), pp. 1168–1175.
[26] (1998) Motion planning of intelligent manipulation by a parallel two-fingered gripper equipped with a simple rotating mechanism. IEEE Transactions on Robotics and Automation 14 (2), pp. 207–219.
[27] (2018) Differentiable physics and stable modes for tool-use and manipulation planning. In Robotics: Science and Systems.
[28] (1991) A framework for planning dexterous manipulation. In 1991 IEEE International Conference on Robotics and Automation (ICRA), pp. 1245–1251.
[29] (1988) An investigation of frictionless enveloping grasping in the plane. The International Journal of Robotics Research 7 (3), pp. 33–51.
[30] (1992) On the stability and instantaneous velocity of grasped frictionless objects. IEEE Transactions on Robotics and Automation 8 (5), pp. 560–572.
[31] (2017) Planning and control for dynamic, nonprehensile, and hybrid manipulation tasks. In 2017 IEEE International Conference on Robotics and Automation (ICRA), pp. 4066–4073.
[32] (2017) Differential dynamic programming with nonlinear constraints. In 2017 IEEE International Conference on Robotics and Automation (ICRA), pp. 695–702.
[33] (2016) Differential dynamic programming for graph-structured dynamical systems: Generalization of pouring behavior with different skills. IEEE-RAS International Conference on Humanoid Robots (November 2016), pp. 1029–1036.
[34] (2017) Pushing revisited: differential flatness, trajectory planning and stabilization. In Proceedings of the International Symposium on Robotics Research (ISRR).