I Introduction
While humans naturally make use of sensing and feedback when manipulating objects, robot manipulators traditionally execute actions using open-loop control strategies. Given the uncertainty associated with frictional contact interactions [1, 2] and the inherent inaccuracies of contact models [3, 4], feedback can play an important role in addressing model uncertainty. The long-term goal of this work is to endow robots with real-time decision-making capabilities to enable reactive manipulation.
This paper focuses on planar manipulation tasks where the physical interactions between manipulator, object, and environment can be modeled from first principles, using rigid body dynamics and Coulomb’s friction law. A major challenge in feedback controller design for systems involving contact interactions is the presence of hybridness and underactuation [5]. Hybridness refers to the fact that frictional interactions between manipulator and object exhibit different contact modes (e.g., contact/separation, sticking/sliding), while underactuation results from the limited set of forces and torques that the robot can transmit to the object through frictional interactions. Figure 1 shows an animation of a hybrid manipulation task that exploits multiple contact modalities.
This paper’s contribution is twofold. First, we present a controller design formulation for manipulating an object on a flat surface. The approach generalizes to multiple contact interactions between manipulator and object and to tracking trajectories in the plane. Second, we introduce a method to determine an effective contact mode sequence that leads to a convex hybrid Model Predictive Control (MPC) formulation. Because of the hybridness of frictional interactions, hybrid MPC formulations suffer from a combinatorial explosion in the number of possible future contact mode sequences. To overcome this difficulty, we separate the search for optimal modes from the search for optimal control inputs, using machine learning methods to select mode sequences from prior experience. Once the mode sequence is selected, the control problem reduces to a convex quadratic program, which can be solved at very high frequency.
II Related Work
The mechanics of planar pushing manipulation tasks were first described by [6]. [7] introduced the concept of the limit surface, a useful geometric representation that maps the frictional forces applied to an object to its instantaneous velocity. Under the assumption of quasi-static interactions, the limit surface has been used successfully in simulation [6], planning [8], state estimation [9], and feedback control [10, 5, 11] applications. Because of the high computational cost of building a true limit surface, [12] proposed an ellipsoidal approximation that yields invertible models from force to motion. Recently, [13] exploited the convexity of the limit surface to develop an efficient data-driven algorithm for constructing it from contact interactions.
There is an ongoing effort to find planning frameworks that can effectively handle the underactuation and hybridness associated with contact models. [14] used sampling-based planning algorithms to plan robot motions for in-hand manipulation tasks. [15] developed a nonlinear trajectory optimization framework that includes frictional forces as decision variables within a nonlinear program and uses complementarity constraints to encode the different contact interaction modes of the system. Another promising approach is based on Differential Dynamic Programming (DDP), which iteratively builds locally quadratic models of the dynamics and cost function to find a locally optimal path. Most approaches in this line rely on approximating discontinuous dynamics with continuous relaxations. [16] used penalty methods to smooth contact models, while [17] modeled the discontinuous dynamics of the system with mixture models.
The application of model-based feedback control to contact-rich tasks has been limited to a small number of applications [18, 19, 11]. The control strategies presented in these papers are applied to systems where the contact mode sequence is known a priori. In [5], a feedback controller design is presented for the pusher-slider system using a Model Predictive Control framework, where a set of contact mode schedules is chosen to span a number of dynamic behaviors likely to occur. This method has been shown to work experimentally but requires heuristics to design the candidate contact mode schedules. This paper aims to eliminate the need for contact mode enumeration based on human intuition by developing an algorithm that systematically selects contact mode sequences.
III Nomenclature
The notation used in the paper is described below:

: Convex set representing the limit surface.

: Applied wrench on the object resolved in the body frame.

: Object twist resolved in the body frame.

: Jacobian matrix associated with the contact point resolved in the body frame.

: Matrix of object normal vectors at contact points resolved in body frame.

: Matrix of object tangent vectors at contact points resolved in body frame.

: Vector of applied normal forces at contact points resolved in body frame.

: Vector of applied tangential forces at contact points resolved in body frame.

: Vector of relative angles of pusher relative to body frame.

: System state vector.

: Vector of control inputs at contact point .

: Vector of commanded reaction forces.

: Vector of commanded angular velocities.

: Control input.
IV Planar Pushing Model
This section introduces a motion model for planar manipulation tasks that applies to an arbitrary number of contact points and arbitrary object shapes. We adapt the model of planar pushing interactions from [20], which we briefly summarize below. All contacts in this work are assumed to obey Coulomb friction [21], with a uniform pressure distribution and quasi-static interactions, in which the inertial forces of the object are negligible.
IV-A Motion Model
The limit surface is a geometric representation that describes the relationship between the applied force on an object and its instantaneous velocity. In this paper, we use the ellipsoidal approximation to the limit surface [12, 20] due to its simplicity and invertibility. The ellipsoidal limit surface can be expressed in convex quadratic form as . By the principle of maximal dissipation [7], the object twist is perpendicular to the limit surface for a given wrench
(1) 
where the applied frictional wrench is
(2) 
Consider the planar manipulation task with multiple contact points shown in Fig. 2. The unconstrained motion equations of the system can be expressed as
(3) 
assuming that all points maintain contact with the sliding object.
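As a rough illustration of the force-to-motion map described above, the sketch below computes an object twist from the normal and tangential forces at a single contact point. It is a minimal sketch under stated assumptions, not the model used in the experiments: it assumes the quadratic form 0.5 * w^T A w with a diagonal, positive-definite matrix A (so that the twist normal to the ellipsoid is A w) and the standard planar contact Jacobian. The function names, the entries of `a_diag`, and all numerical values are hypothetical.

```python
def contact_jacobian(x, y):
    """Planar Jacobian of a contact point at (x, y) in the body frame."""
    return [[1.0, 0.0, -y],
            [0.0, 1.0, x]]


def applied_wrench(point, normal, tangent, f_n, f_t):
    """Wrench [fx, fy, tau] = J^T (normal * f_n + tangent * f_t) for one contact."""
    J = contact_jacobian(*point)
    force = [normal[i] * f_n + tangent[i] * f_t for i in range(2)]
    return [J[0][k] * force[0] + J[1][k] * force[1] for k in range(3)]


def twist_from_wrench(a_diag, wrench):
    """Twist normal to the ellipsoid 0.5 * w^T A w = const, i.e. t = A w.

    a_diag holds the diagonal of A (assumed values, not from the paper).
    """
    return [a * w for a, w in zip(a_diag, wrench)]


# Hypothetical contact on the left face of the object, pushing along +x.
wrench = applied_wrench(point=(-0.045, 0.01),
                        normal=(1.0, 0.0),
                        tangent=(0.0, 1.0),
                        f_n=1.0, f_t=0.2)
twist = twist_from_wrench((1.0, 1.0, 400.0), wrench)
print("wrench:", wrench)
print("twist: ", twist)
```

With several contact points, the wrenches of the individual contacts would simply be summed before applying `twist_from_wrench`.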
IV-B Frictional Constraints
The motion equations in (3) do not enforce that the reaction forces between manipulator and sliding object are feasible. To ensure that the motion equations describe physically reasonable behavior, we must impose constraints on the control input , ensuring that the motion model obeys the laws of contact interaction. Due to the hybrid nature of contact, the physical constraints that dictate the magnitude and direction of the frictional forces vary with the contact interaction mode. In accordance with Coulomb’s friction law, the following constraints on the inputs must always be satisfied, independently of the contact mode:
implying the pusher can only exert a compressive force on the object and that the net frictional force applied on the object remains within the bounds of the friction cone in Fig. 3. In addition, we must enforce constraints that depend on the contact interaction mode.
Sticking. When the pusher is sticking relative to the object, the relative tangential velocity is zero, as in Fig. 4(a)
(5) 
Sliding Left. When the pusher is sliding left relative to the object, the tangential velocity is strictly positive and the frictional force must remain on the right-hand side of the friction cone, as in Fig. 4(b)
(6) 
Sliding Right. When the pusher is sliding right relative to the object, the tangential velocity is strictly negative and the frictional force must remain on the left-hand side of the friction cone, as in Fig. 4(c)
(7) 
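The mode-dependent constraints above can be sketched as a simple consistency check for a single contact point. This is only an illustration under assumptions: the sign convention (the right-hand edge of the friction cone taken as f_t = +mu * f_n and the left-hand edge as f_t = -mu * f_n), the function name `mode_is_consistent`, the mode labels, and the tolerance are all hypothetical choices rather than the paper's notation.

```python
def mode_is_consistent(mode, f_n, f_t, v_t, mu, tol=1e-9):
    """Check one contact against the Coulomb constraints of a given mode.

    f_n, f_t: normal and tangential force; v_t: relative tangential velocity.
    Assumed convention: right edge of the cone is f_t = +mu * f_n.
    """
    # Mode-independent constraints: compressive force, inside the friction cone.
    if f_n < -tol or abs(f_t) > mu * f_n + tol:
        return False
    if mode == "sticking":
        # No relative tangential motion; force anywhere inside the cone.
        return abs(v_t) <= tol
    if mode == "sliding_left":
        # Positive tangential velocity; force on the right edge of the cone.
        return v_t > tol and abs(f_t - mu * f_n) <= tol
    if mode == "sliding_right":
        # Negative tangential velocity; force on the left edge of the cone.
        return v_t < -tol and abs(f_t + mu * f_n) <= tol
    raise ValueError(f"unknown mode: {mode}")


print(mode_is_consistent("sticking", f_n=1.0, f_t=0.1, v_t=0.0, mu=0.3))
print(mode_is_consistent("sliding_left", f_n=1.0, f_t=0.3, v_t=0.02, mu=0.3))
print(mode_is_consistent("sliding_left", f_n=1.0, f_t=0.1, v_t=0.02, mu=0.3))
```

Because each mode fixes which equalities and inequalities are active, a known mode leaves only linear constraints on the forces and velocities.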
V Hybrid Model Predictive Control
This section presents a controller design framework for planar manipulation tasks. The proposed controller aims to stabilize the motion of a given object about a nominal trajectory. The approach follows a Model Predictive Control (MPC) formulation, where the goal is to determine a sequence of control inputs over a receding horizon that minimizes the error between the manipulated object and its desired motion. Due to the hybridness associated with Coulomb’s friction law, the optimization program takes the form of a mixed-integer quadratic program (MIQP).
Optimization Problem MPC (MIQP): Given current error state and nominal trajectory (, ), solve
(8)  
subject to  
with , , and . The terms , , , and denote weight matrices associated with the error state, final error state, control input, and contact modes, respectively. We constrain the search to the linearized dynamics of the system and the contact constraints presented in Section IV, where and .
We introduce integer variables into the optimization program to denote the hybrid mode that is active at time step . The integer variables , , and denote sticking, sliding left, and sliding right for contact point , respectively, where each integer variable takes the value 1 if its contact mode is active and 0 otherwise. We enforce that the integer variables sum to one at each time step so that exactly one mode is active at a time.
To speed up computation, it is often practical to constrain adjacent time steps within a prediction horizon to have the same contact mode. This is shown in Fig. 5, where the agglomerated mode sequence is introduced, with denoting sticking, sliding left, and sliding right.
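To make the effect of agglomerating modes concrete, the sketch below counts the candidate mode schedules with and without agglomeration and expands a block-wise schedule into a per-time-step sequence with one-hot mode indicators. The horizon length, block lengths, and helper names are hypothetical and chosen only for illustration.

```python
from itertools import product

MODES = ("sticking", "sliding_left", "sliding_right")


def count_schedules(num_steps, num_contacts=1):
    """Number of mode sequences when every step and contact picks a mode freely."""
    return len(MODES) ** (num_steps * num_contacts)


def agglomerated_schedules(num_blocks):
    """All schedules when the mode is held constant within each block."""
    return list(product(MODES, repeat=num_blocks))


def expand_schedule(block_modes, block_lengths):
    """Repeat each block's mode for the number of time steps in that block."""
    steps = []
    for mode, length in zip(block_modes, block_lengths):
        steps.extend([mode] * length)
    return steps


def one_hot(mode):
    """Binary indicators (sticking, sliding left, sliding right) summing to one."""
    return [1 if m == mode else 0 for m in MODES]


# Hypothetical horizon of 35 steps split into 3 blocks.
print("free schedules:        ", count_schedules(35))
print("agglomerated schedules:", len(agglomerated_schedules(3)))
steps = expand_schedule(("sliding_left", "sticking", "sticking"), (5, 10, 20))
print("first step indicators: ", one_hot(steps[0]))
```

The count grows exponentially with the number of independent mode decisions, which is why holding modes constant over blocks shrinks the search so sharply.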
VI Offline Mode Schedule Learning
The feedback control architecture proposed in Section V is shown in block diagram form in Fig. 6. Because the integer variables make the program in Eq. (8) nonconvex, it is typically too slow to solve for high-bandwidth feedback control applications. To increase the control bandwidth, we aim to move as much of the computation offline as possible. To accomplish this, we present a novel formulation that separates the selection of the mode schedule from the search for the optimal control sequence.
Consider the controller design architecture proposed in Figure 7. Suppose that given the state error , we had access to an oracle function that returned an effective mode schedule to be enforced during the prediction horizon.
Although we do not have direct access to a real-time function that determines the optimal mode schedule, we can query the mixed-integer MPC program offline as often as desired to find optimal mode sequences for given error states. This formulation lends itself well to a supervised learning setting, where the objective is to train a classifier model that selects an effective mode schedule given the error state. We present the learning framework used to design the classifier model in Fig. 8. Using the hybrid MPC (MIQP) formulation presented in Eq. (8), we generate a dataset of training examples , where represents the mode schedule associated with the datapoint. The purpose of the machine learning algorithm is to train a candidate classifier model that minimizes the cross-entropy error function on the labelled training set. This new hybrid control architecture leads to a convex optimization program with a prescribed mode sequence and is referred to as MPC (learned modes).
Optimization Problem MPC (learned modes): Given current error state , nominal trajectory (, ), and mode schedule , solve
subject to  
The main attraction of this approach is that it converts a nonconvex mixed-integer quadratic program into a convex quadratic program that can be solved in real time.
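The benefit of fixing the mode schedule can be illustrated on a deliberately simplified toy problem. The sketch below is not the paper's controller: it assumes a scalar error state with dynamics x_{k+1} = a * x_k + b(mode_k) * u_k, where the input gain b depends on a prescribed mode, and it drops all inequality constraints so that the resulting convex quadratic problem can be solved in closed form with a backward Riccati recursion. The gains in `b_by_mode`, the weights, and the function names are hypothetical.

```python
def solve_fixed_mode(x0, a, b_by_mode, schedule, q=1.0, r=0.1, q_final=10.0):
    """Minimize sum(q*x_k^2 + r*u_k^2) + q_final*x_N^2 for a fixed mode schedule.

    Dynamics (toy assumption): x_{k+1} = a*x_k + b_by_mode[mode_k]*u_k.
    Returns the input sequence, the state sequence, and the optimal cost.
    """
    p = q_final
    gains = []
    # Backward Riccati recursion; the mode at each step only changes b.
    for mode in reversed(schedule):
        b = b_by_mode[mode]
        k = a * b * p / (r + b * b * p)
        p = q + a * a * p - a * b * p * k
        gains.append(k)
    gains.reverse()
    # Forward rollout with the time-varying feedback gains.
    xs, us = [x0], []
    for mode, k in zip(schedule, gains):
        u = -k * xs[-1]
        us.append(u)
        xs.append(a * xs[-1] + b_by_mode[mode] * u)
    return us, xs, p * x0 * x0


def rollout_cost(xs, us, q=1.0, r=0.1, q_final=10.0):
    """Evaluate the quadratic cost directly along a state/input trajectory."""
    stage = sum(q * x * x + r * u * u for x, u in zip(xs, us))
    return stage + q_final * xs[-1] ** 2


b_by_mode = {"sticking": 1.0, "sliding_left": 0.4, "sliding_right": 0.4}
schedule = ["sliding_left", "sticking", "sticking"]
us, xs, cost = solve_fixed_mode(0.5, 1.0, b_by_mode, schedule)
print("inputs:", us)
print("cost from recursion:", cost, "cost from rollout:", rollout_cost(xs, us))
```

In the full problem the friction-cone and mode constraints remain as linear inequalities, so a quadratic programming solver would replace the closed-form recursion, but the key point is the same: with the modes prescribed, no integer variables are left.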
VII Results
In this section, we implement the controller design presented in Section V, along with the planar pushing model described in Section IV-A, on a planar pushing experimental setup. Videos of the experiments can be found at https://youtu.be/bMMlkyue_ZU. In both experiments considered in this section, we parametrize the classifier model introduced in Fig. 7 with a neural network whose properties are summarized in Table 1.
Property  Symbol  Value 
Coefficient of friction (pusher-slider)  
Coefficient of friction (slider-table)  
Mass of slider (experiment A),  m  0.827 
Object radius (experiment A),  r  0.045 
Mass of slider (experiment B),  m  0.827 
Object side length (experiment B),  a  0.09 
Line pusher width (experiment B),  d  0.03 
VII-A Case Study A: Single Point Pushing
First, we investigate the performance of the controller design in Fig. 7 on a planar manipulation system where the goal is to track a 2D 8-shaped trajectory defined by two circles of radii meters at a constant velocity of [m/s]. We build the classifier model following the learning framework displayed in Fig. 8, where the training examples are generated using the MPC (MIQP) program presented in Eq. (8).
We parametrize the classifier using a neural network, as detailed in Table 1. The physical properties of the circular object along with the frictional properties of the system are in Table 2. The controller design parameters used in the numerical simulations are seconds, , , , and . The prediction horizon is split into parts during which the contact modes are held constant. The number of time steps associated with each contact mode section is with the associated contact mode weight matrix .
Property  Value 

Number of hidden layers  3 
Neurons in hidden layer 1  32 
Neurons in hidden layer 2  50 
Neurons in hidden layer 3  50 
Activation functions  ReLU 
Output layer  Softmax 
Loss function  Cross entropy 
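The output stage of a classifier with the properties listed in the table above (ReLU activations, softmax output, cross-entropy loss) can be sketched as follows. This is a minimal illustration in plain Python, not the trained network: the logits, the candidate schedule labels, and the helper names are hypothetical, and the hidden layers that would produce the logits are omitted.

```python
import math


def relu(values):
    """Element-wise rectified linear unit, as used in the hidden layers."""
    return [max(0.0, v) for v in values]


def softmax(logits):
    """Numerically stable softmax over the candidate mode schedules."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Cross-entropy loss for one example whose true class index is `label`."""
    return -math.log(softmax(logits)[label])


def select_schedule(logits, schedules):
    """Return the schedule whose predicted probability is highest."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return schedules[best]


# Hypothetical logits for three candidate schedules.
schedules = ["stick-stick", "left-stick", "right-stick"]
logits = [0.1, 2.0, -1.0]
print("probabilities:", softmax(logits))
print("loss if label is 1:", cross_entropy(logits, 1))
print("selected schedule:", select_schedule(logits, schedules))
```

At run time only `select_schedule` would be needed; the loss is used during offline training on the labels produced by the mixed-integer program.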
Figure 10 shows the prediction accuracy of the neural network trained on labelled data points on a validation set of labelled data points, both generated by sampling the error state from a normal distribution with standard deviation . We evaluate the performance on each mode individually, as defined in Fig. 5.
Figure 11 compares the optimal contact mode associated with the first time step of the MPC (MIQP) with the prediction made by the classifier. The contact modes are shown as a function of the initial state errors in and while holding and constant at degrees. The regions shown in green, yellow, and blue denote the regions where the optimal action is sliding left, sliding right, or sticking. The figure shows that the optimal contact modes are separated into distinct regions, which justifies treating the search for contact modes as a classification problem and facilitates the learning process. The classifier model succeeds in capturing the important trends of the MPC (MIQP) solutions.
Figure 12(a) depicts the robotic point pusher pushing the object about a track without any external perturbations for consecutive laps. The black line represents the desired trajectory and the blue lines track the center of mass of the object. Although there is a small steady-state error, the controller tracks the desired trajectory accurately. Figure 12(b) depicts the robotic point pusher pushing the object about a track with external perturbations for a single lap. The controller quickly rejects each perturbation and returns to the desired trajectory. The novel convex MPC formulation with learned modes achieves good closed-loop performance and permits a much higher bandwidth ( Hz) than the nonconvex MPC (MIQP) formulation ( Hz).
VII-B Case Study B: Pushing with Line Contact
The experimental setup for the planar manipulation task with a line pusher is shown in Fig. 14. The trajectory tracking task is shown in Fig. 13, where the goal is to track a 2D 8-shaped trajectory defined by two circles of radii meters at a constant velocity of [m/s]. We model the line pusher as contact points that are constrained to move as a rigid body, with the position of the center point of the pusher denoted as and the state vector defined by . Following an approach similar to that described in Section VII-A, we train a classifier model using labelled data points to predict the optimal mode schedule based on the error state of the system. The neural network properties used to parametrize the classifier model and the physical properties are reported in Tables 1 and 2, respectively. The controller design parameters are seconds, , , , and . We split the prediction horizon into parts during which the contact modes are held constant. The number of time steps associated with each contact mode section is with the associated weight matrix .
Figure 13(a) depicts the robotic line pusher pushing the square object about a track without any external perturbations for consecutive laps. The black line represents the desired trajectory and the blue lines track the center of mass of the object. The steady-state error is less prominent than in the point pusher case, as the line pusher is a more stable system with additional control authority. Figure 13(b) depicts the robotic line pusher pushing the square object about a track with external perturbations for a single lap. Each time a perturbation is encountered, the pusher reacts to reduce the error with a fast sliding motion that stabilizes the object, and then pushes it back towards the desired trajectory using a sticking phase. The novel MPC (learned modes) controller performs comparably to the MPC (MIQP) formulation and succeeds in tracking the desired trajectory while achieving a significantly higher control bandwidth (200 Hz vs. 15 Hz).
VIII Conclusion
This paper presents a methodology for feedback controller design of hybrid dynamical systems. The control formulation is based on a model predictive control approach, where the hybridness and underactuation associated with contact are explicitly enforced as constraints within a mixed-integer optimization program.
To enable real-time implementation, we address the combinatorial complexity resulting from the hybrid expansion of contact modes by separating the search for optimal mode schedules (offline) from the search for optimal control inputs (online). This is made possible by formulating the contact mode selection as a supervised learning problem. This approach enables us to train a classifier model offline by harnessing the solutions returned by the mixed-integer optimization program and building a dataset of optimal mode schedules.
We validate the controller design methodology on a planar manipulation experimental setup, where it is shown that the proposed convex formulation achieves performance comparable to that of its nonconvex alternative, while obtaining a fold improvement in the control bandwidth. Most importantly, in contrast to mixed-integer MPC formulations, the online component of the hybrid MPC formulation with learned modes has the potential to extend to more complex dynamical systems with additional contact interactions, as its associated convex program can be solved efficiently in real time.
References
 [1] Yu, K.T., Bauza, M., Fazeli, N., Rodriguez, A.: More than a million ways to be pushed. A high-fidelity experimental dataset of planar pushing. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Daejeon, Korea, October 9–14 (2016)
 [2] Bauza, M., Rodriguez, A.: A probabilistic data-driven model for planar pushing. IEEE International Conference on Robotics and Automation (ICRA) arXiv:1704.03033 (2017)
 [3] Fazeli, N., Donlon, E., Drumwright, E., Rodriguez, A.: Empirical evaluation of common contact models for planar impact. IEEE International Conference on Robotics and Automation (ICRA), Singapore, May 29 – June 3 (2017)
 [4] Kolbert, R., Chavan-Dafle, N., Rodriguez, A.: Experimental validation of contact dynamics for in-hand manipulation. International Symposium on Experimental Robotics (ISER). Springer Proceedings in Advanced Robotics. 1 (2016)
 [5] Hogan, F.R., Rodriguez, A.: Feedback control of the pusher-slider system: a story of hybrid and underactuated contact dynamics. In Proceedings of the 12th International Workshop on the Algorithmic Foundations of Robotics (WAFR), San Francisco, CA, USA, December 18 – 20. (2016)
 [6] Mason, M.T.: Mechanics and planning of manipulator pushing operations. The International Journal of Robotics Research 5(3) (1986) 53 – 71

 [7] Goyal, S., Ruina, A., Papadopoulos, J.: Planar sliding with dry friction Part 1. Limit surface and moment function. Wear 143 (1991) 307 – 330
 [8] Lynch, K.M., Mason, M.T.: Stable pushing: mechanics, controllability, and planning. The International Journal of Robotics Research 15(6) (1996) 533 – 556
 [9] Yu, K.T., Leonard, J., Rodriguez, A.: Shape and pose recovery from planar pushing. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Hamburg, Germany, September 28 – October 02 (2015)
 [10] Lynch, K., Maekawa, H., Tanie, K.: Manipulation and active sensing by pushing using tactile feedback. IEEE/RSJ International Conference on Intelligent Robots and Systems, Raleigh, USA, July 7 – 10, pp. 416–421 (1992)
 [11] Woodruff, J.Z., Lynch, K.M.: Planning and control for dynamic, nonprehensile, and hybrid manipulation tasks. IEEE International Conference on Robotics and Automation (ICRA), Singapore, May 29 – June 3 (2017)
 [12] Lee, S.H., Cutkosky, M.: Fixture planning with friction. Journal of Manufacturing Science and Engineering 113(3) (1991) 320 – 327
 [13] Zhou, J., Paolini, R., Bagnell, J.A., Mason, M.T.: A convex polynomial force-motion model for planar sliding: Identification and application. IEEE International Conference on Robotics and Automation (ICRA), Stockholm, Sweden, May 16–21 (2016)
 [14] Chavan-Dafle, N., Rodriguez, A.: Sampling-based planning for in-hand manipulations with external pushes. International Symposium on Robotics Research (ISRR). Under Review. (2017)
 [15] Posa, M., Cantu, C., Tedrake, R.: A direct method for trajectory optimization of rigid bodies through contact. The International Journal of Robotics Research 33(1) (2014) 69 – 81
 [16] Tassa, Y., Todorov, E.: Synthesis and stabilization of complex behaviors through online trajectory optimization. IEEE/RSJ International Conference on Intelligent Robots and Systems, Vilamoura, Portugal, October 7 – 12 (2012)
 [17] Pajarinen, J., Kyrki, V., Koval, M., Srinivasa, S., Peters, J., Neumann, G.: Hybrid control trajectory optimization under uncertainty. arXiv:1702.04396 (2017)
 [18] Ryu, J., Ruggiero, F., Lynch, K.M.: Control of nonprehensile rolling manipulation: Balancing a disk on a disk. IEEE International Conference on Robotics and Automation (ICRA) , St. Paul, MN, USA, May 14–18, pp. 3232–3237 (2012)
 [19] Posa, M., Kuindersma, S., Tedrake, R.: Optimization and stabilization of trajectories for constrained dynamical systems. In Proceedings of the International Conference on Robotics and Automation (ICRA), Stockholm, Sweden, May 16–21 (2016)
 [20] Zhou, J., Bagnell, J., Mason, M.: A fast stochastic contact model for planar pushing and grasping: theory and experimental validation. Robotics Science and Systems, Cambridge, MA, USA, July 12–16 (2017)
 [21] Mason, M.T.: Mechanics of robotic manipulation. MIT Press, Cambridge, Massachusetts (2001)