I. Introduction
Modelling and learning dynamics for contact-rich manipulation is an open problem in robotics. Classical control approaches [1, 2, 3, 4, 5, 6, 7, 8] suffer when the modes of interaction increase, when their respective models are too complicated to be described analytically, or when their variations are too diverse to be accounted for. Especially for contact-rich tasks, this difficulty arises from dynamics that can include discontinuities such as making and breaking contact, complicated frictional phenomena, or the variety of object properties. With the introduction of data-driven methods, many of these shortcomings have been addressed successfully [9]. Their main advantage stems from relying not on analytical models, but on interaction with the environment, or on demonstrations that can initialize a policy for the completion of the task.
In this work, we investigate a data-driven method for robotic food cutting, which is inherently a contact-rich task with complicated interaction dynamics. Modelling the interaction as a mass-spring-damper system is an oversimplification of the contact dynamics, and the tissue fracturing/separation of the fibers is not well-approximated by a smooth, closed-form impedance expression. On the other hand, more realistic and analytical representations [10, 11, 12] are arduous to develop when considering many different classes, or classes with substantial variations in their dynamics. As a task, cutting can be low-dimensional if only operational space quantities are considered. The discontinuities it exhibits are in most cases due to frictional, stick-slip phenomena and not extreme ones such as sudden and complete breaking of contact. However, it is exceptionally difficult to simulate, so any type of data collection or exploration must be done on the real system, which is expensive. As a result, we choose to learn only the dynamics model from data with a deep network and handle the closed-loop control with a Model Predictive Controller (MPC), as seen in Fig. 1.
For methods that are not learning a policy online, as most Reinforcement Learning techniques or end-to-end MPC variants [13] do, the resulting controller behavior primarily depends on the accuracy and expressive capacity of the dynamics model. The idea of choosing the appropriate quantities to describe a task is not new. However, it tends not to be particularly highlighted when the focus is a simulated task where any control quantity can be readily available. In real systems, the available quantities are constrained by hardware and perception capabilities, and the method needs to reflect that.

In this paper, we present a velocity-resolved formulation for contact-rich tasks and propose a prediction method for food-cutting dynamics that can be used in model-based control schemes. The main contributions of this work are summarized below:

In contrast to our previous work [14], the learning problem is reformulated to explicitly incorporate information related to the complicated dynamics of the task and how they are affected by the robot’s cutting actions.

Our modelling and training approach can provide models that have consistently good performance and exhibit a fine-grained understanding of the task dynamics when performing within an MPC.
II. Related Work
Robotic cutting has been treated in a multitude of ways, primarily based on more traditional control approaches. In [16], the authors employed an impedance controller with adaptive force tracking for a simulated object with non-homogeneous stiffness. An adaptive position controller with gradient-based estimation of the desired force was presented in [17], also with simulated results. More recently, [12] proposed force control combined with visual servoing to adjust the tracked trajectories for the task of cutting deformable soft objects, while minimizing the required cutting force. A combined hybrid force/position control approach was presented in [11] to cut two classes of non-deformable objects. These works all introduce physics-based mechanics that lead to a well-defined problem. However, their applicability is limited, as they would require additional computations to be applied to a larger variety of cases.

Data-driven approaches can address this issue by approximating the interaction dynamics, resulting in a single model that is capable of treating several object classes. A method that employs deep networks to approximate the dynamics for this task was first introduced in [15]. Although the method's performance was evaluated on an extensive dataset, the generalization ability to unseen classes was not examined. Additionally, the proposed network outperformed several baselines, but it was not clear whether the complex architecture and training procedure were necessary, as only one architecture was evaluated within the MPC. This approach was revisited by our group in [14] after being reformulated into a velocity-resolved control problem, but with the same underlying network structure and training procedure. Unseen classes were included in the evaluation, but there was no further examination of the network and its training.
In contrast to the present work, our work [14] was an investigation of how a contact-rich task, associated with torque information, can be adapted to a different hardware setup and the constraints this imposes. Embedded in its modelling choices was the assumption that the dynamics of cutting can be described by a nonlinear mapping of the form x_{t+1} = f(x_t). Part of what we considered as state (forces, displacements) reflected the intended action through the external forces measured by the sensor. By choosing this model, the focus shifts to the response of the coupled object-knife system, which has already incorporated the effect of the control input and thus does not offer a proper formulation for a dynamics learning scheme. Furthermore, this hypothesis effectively considers the potentially delayed effect of the control input negligible, which is not always a valid assumption, especially in a nested control scheme where there is no guarantee that the desired force will be reached. Notably, in [18] the authors have worked on the same task, but their focus is how to learn a semantic representation rather than the dynamics.
Despite the limited number of works for this particular application, data-driven methods in contact-rich scenarios have shown promising results. Demonstration-based methods [19, 20, 21, 22] are well suited due to their sample complexity, but are infeasible for cutting with force feedback, as there is no practical way to distinguish the demonstrator's exerted wrench from the object's. Reinforcement Learning, when actively focusing on sample complexity, is a competitive alternative for real-world, contact-rich tasks. Recently, in [23], the authors proposed an actor-critic that is guided by supervised learning to account for sample complexity and safety, but it still required 1.5 hours for an assembly task that has a smaller range of dynamics than cutting. Another method that reduces the sample complexity was presented in [24]. The authors actively leveraged a hand-engineered controller as a basis for a policy they optimize online, thus splitting the problem into a trajectory tracker and an adaptive corrective behavior. Their method greatly reduced the sample requirements through sim-to-real transfer, but still depended on a simulated environment and unsupervised exploration, neither of which is available for our task.

A central part of the suggested training approach is curriculum training [25], which we combine with learning rate decay to avoid prediction error accumulation and facilitate training. Curriculum training has been applied in several different contexts but, to the best of our knowledge, not as a horizon curriculum for multi-step prediction. An exception is the work in [26], where it is used for image registration and the authors gradually increase the temporal distance between the images. Other applications of curriculum training include minibatch frequency selection [27], sequence prediction in natural language processing [28], equation learning [29] and, finally, encoding positions and velocities from pixels in simulated control tasks [30].

III. Problem Formulation
Consider a robotic manipulator equipped with force sensing. Let p denote the translation part of the end-effector pose in the world frame and f the force measurements. Let further p_d, ṗ_d denote the end-effector's desired position and velocity, f_r the reference force and u a velocity control input. In order to follow a predefined trajectory in a compliant manner, we can employ a variant [31] of velocity-resolved (inverse) damping control,
u = ṗ_d + C (f_r − f).    (1)
We can then define the desired compliant behavior as
f_r = K e_p,    (2)

where K, C are the stiffness and compliance gain matrices and e_p = p_d − p the position error.
Substituting Eq. (2) in Eq. (1) and noting that the control input corresponds to the end-effector's Cartesian velocity, results in the desired dynamic behavior
C^{-1} e_v + K e_p = f,    (3)

where e_v = ṗ_d − ṗ is the velocity error.
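As a minimal sketch of the control laws in Eqs. (1)-(2), assuming the standard sign conventions described above (the exact variant of [31] may differ) and purely illustrative gains:

```python
import numpy as np

def reference_force(e_p, K):
    """Desired compliant behavior (Eq. (2)): a virtual spring with
    stiffness K acting on the position error e_p = p_d - p."""
    return K @ e_p

def damping_control(p_dot_d, f_r, f_meas, C):
    """Velocity-resolved (inverse) damping control (Eq. (1)): the
    commanded Cartesian velocity tracks the desired one and yields
    to the force error through the compliance gain C."""
    return p_dot_d + C @ (f_r - f_meas)

# Illustrative gains and a 1 cm position error along the cutting axis
C = 0.1 * np.eye(3)
K = 50.0 * np.eye(3)
e_p = np.array([0.0, 0.0, 0.01])
f_r = reference_force(e_p, K)                       # virtual spring force
u = damping_control(np.zeros(3), f_r, np.zeros(3), C)  # commanded velocity
```

With zero measured force, the commanded velocity is proportional to the virtual spring force, i.e. the controller pushes toward the desired position; once the measured force matches the reference, the velocity reduces to the desired one.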
To plan a cutting trial, it is necessary to define a desired trajectory and choose appropriate controller gains with regard to the object class. A controller capable of handling the variation in the contact properties would require variable stiffness gains. However, it would still be impossible to design a single, fixed trajectory that addresses the non-homogeneous sizes of the objects. Alternatively, we can model the dynamics of the contact as a discrete-time dynamics function F and determine an optimal reference force f_r such that it minimizes a cost J (see Appendix) over a time horizon T, by solving the optimization problem
f_r* = argmin_{f_r} Σ_{t=1}^{T} J(x_t, f_{r,t})   s.t.   x_{t+1} = F(x_t, f_{r,t}).    (4)
In this work, we parametrize the dynamics function F as a deep network that receives current positions, measured and reference forces, and outputs the estimated future positions. We define the model's state as the augmented state vector x_t = [p_t, f_t] and denote the control input u_t = f_{r,t}, resulting in the formulation

x_{t+1} = F(x_t, u_t).    (5)
This network is then used in conjunction with an MPC to determine the optimal reference force in Eq. (4). In contrast to our earlier work [14], we model the effect of the control input explicitly through the reference force and decouple it from the interaction force, as discussed in Section II. Expressing the reference force as a function of the desired velocity and the position error allows us to encompass the possible delayed effects of the control input, as we instill information about the divergence from the desired trajectory due to friction. This results in a clearer and more concise formulation that offers a better representation for the learning task.
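As a minimal illustration of the two input representations compared below (the 6-feature one used in prior work and the 9-feature one proposed here), assuming 3-D positions and forces:

```python
import numpy as np

def features_without_fr(p, f):
    """6 features per timestep: positions and measured forces only,
    as in the earlier formulation."""
    return np.concatenate([p, f])

def features_with_fr(p, f, f_r):
    """9 features per timestep: additionally the reference force,
    which carries the (possibly delayed) effect of the commanded
    action on the system."""
    return np.concatenate([p, f, f_r])

x6 = features_without_fr(np.zeros(3), np.zeros(3))
x9 = features_with_fr(np.zeros(3), np.zeros(3), np.zeros(3))
```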
To further demonstrate the importance of considering the reference force f_r, we can transform the initial data space, which includes multi-step sequences of 6 or 9 features, into a 2D one and visualize the data with t-SNE [32]. t-SNE is a probabilistic dimensionality reduction technique that projects data into a low-dimensional embedding in a nonlinear way, while trying to preserve their probabilistic distribution. To have a fair comparison, we used the same dataset but omitted the reference force inputs for the latter case. (For all the visualizations, the dataset consists of 24 different cutting trials for 6 different objects. The t-SNE hyperparameter "perplexity" was set to 30 and the maximum number of iterations to 3000.) The two resulting datasets went through the same preprocessing as in Eq. (6) and (7). Fig. 1(a) shows the results of the 6-feature dataset and, correspondingly, Fig. 1(b) the ones from the 9-feature dataset we propose for this task. Without f_r, as seen in Fig. 1(a), there is no specific structure in the embedding except for the eggplant class, as its dynamics are the most easily distinguishable during the task due to the object's texture. In comparison, adding and visualizing f_r produces more coherent clusters. The central part of the plot is mostly occupied by easier-to-cut classes and, as we move peripherally outwards, we get cases of stiffer materials. Although we are not interested in classifying the objects, a more cohesive embedding indicates that the augmented representation is more informative and, henceforth, the networks we compare in the experimental section are trained on these features.

IV. Method
IV-A. Modelling Cutting
Modelling cutting analytically is a complicated process due to its frictional properties as well as the separation of fibers [33]. Nevertheless, it can be approximated given appropriate inputs and a model that is adequately expressive to capture the nonlinear temporal and spatial variations.
In the context of this work, we are interested in representing the interaction dynamics between the manipulator and the object as the transition function F in Eq. (5). Therefore, the dataset needs to reflect the current state of the system and the delayed effect of the controller's input. Traditionally, to characterize the dynamics during an interaction task, the notions of mechanical impedance and admittance are introduced, which are defined as mappings between velocities and forces. However, velocities are usually noisy and not easy to learn from. In addition, considering joint velocities unnecessarily increases the difficulty of the task, as the approximator is implicitly required to learn the robot's kinematics.
Instead, we employ relative displacements over time to approximate a generalized notion of velocity, similar to [15, 14]. To achieve that, the input features for the learning module are not treated as single timesteps but form non-overlapping blocks of sequences. Block B_k of length H is then given by
B_k = [x_{(k−1)H+1}, …, x_{kH}].    (6)
If we denote the positional elements of B_k as p_{B_k} and an all-ones vector of length H as 1_H, the transformation from positions to relative displacements is done by subtracting the past block's last position from every position in the current one,
p̃_{B_k} = p_{B_k} − 1_H p_{(k−1)H}.    (7)
Dropping the block index for brevity, the network's input is then the transformed block. Through this transformation, we also ensure that the network will not overfit to absolute positions, which do not carry the same amount of information, as they depend on the object's size. Since we are using relative displacements sampled at a high rate, the magnitude of the positional part is significantly smaller than the remaining features of the input vector. To ensure consistency in the input range, we normalize the features to zero mean and unit standard deviation.
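The preprocessing pipeline of Eqs. (6)-(7) plus normalization can be sketched as follows; the block length, feature layout and the `pos_idx` column selector are illustrative:

```python
import numpy as np

def to_blocks(X, H):
    """Eq. (6): split a trajectory of per-timestep features X (T x d)
    into non-overlapping blocks of length H; trailing samples that
    do not fill a whole block are dropped."""
    n = (len(X) // H) * H
    return X[:n].reshape(-1, H, X.shape[1])

def relative_displacements(blocks, pos_idx):
    """Eq. (7): subtract the previous block's last position from every
    position in the current block. The first block, having no
    predecessor, is dropped; pos_idx selects the positional columns."""
    out = blocks[1:].copy()
    last_pos = blocks[:-1, -1][:, pos_idx]      # previous block's final position
    out[:, :, pos_idx] -= last_pos[:, None, :]
    return out

def standardize(X, mean, std):
    """Normalize features to zero mean and unit standard deviation."""
    return (X - mean) / std

# Toy trajectory: 10 timesteps, 2 features (column 0 is positional)
X = np.arange(20.0).reshape(10, 2)
blocks = to_blocks(X, H=2)
disp = relative_displacements(blocks, pos_idx=[0])
```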
IV-B. Network Architecture and Training
In this work, we chose to employ an LSTM network as opposed to the Recurrent Neural Networks (RNN) used in [15, 14]. While more complex than a regular RNN, LSTMs have proven to be suitable for learning sequences and dependencies further in time [34], which is appealing for a task that requires modelling of temporally and spatially varying dynamics.

Although Eq. (5) refers to one-step predictions, predicting the positions several timesteps into the future can be achieved by recursively using the intermediate predictions as inputs until we reach the desired horizon, i.e., feeding x̂_{t+1} = F(x_t, u_t) back into the network to obtain x̂_{t+2}, and so on. This results in a sequence-to-sequence prediction where, since the inputs are blocks of length H rather than single timesteps, the horizon is counted in blocks ahead instead of individual timesteps.
A common problem with the recursive approach is error accumulation, since predictions are used in place of observations. To avoid that, we propose to train the system with a curriculum strategy that gradually increases the difficulty of the prediction goal. Practically, this amounts to progressively predicting further ahead in the future by increasing the horizon. However, the abrupt difference in difficulty might lead the system into instability, or it might render the hyperparameters used for the easier problem unsuitable. Therefore, we apply learning rate decay when the horizon changes, so that learning is adjusted to the new horizon smoothly and the gradient steps are affected less by the change, especially during the transitions.
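The horizon curriculum with learning-rate decay, together with the recursive rollout it trains, can be sketched as follows (the epoch counts and decay factor follow the Appendix; all names are illustrative):

```python
def curriculum_schedule(max_horizon, epochs_per_stage=10, final_epochs=20,
                        lr0=1e-4, gamma=0.5):
    """Horizon curriculum with learning-rate decay: train a few epochs
    at each prediction horizon and decay the learning rate whenever
    the horizon increases, so gradient steps are perturbed less by the
    harder objective; the final horizon trains longer with no further
    decay. Yields (horizon, epochs, learning_rate) stages."""
    lr = lr0
    for h in range(1, max_horizon):
        yield h, epochs_per_stage, lr
        lr *= gamma                      # decay at every horizon switch
    yield max_horizon, final_epochs, lr  # final stage, lr kept fixed

def rollout(model, x0, controls):
    """Recursive multi-step prediction: each intermediate prediction
    is fed back as the next input until the horizon is reached."""
    x, preds = x0, []
    for u in controls:
        x = model(x, u)
        preds.append(x)
    return preds

stages = list(curriculum_schedule(max_horizon=3))
```

Each stage would drive an ordinary training loop (train `epochs` epochs on `horizon`-block rollouts at learning rate `lr`) before moving to the next.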
IV-C. Model Predictive Control
We treat the problem in Eq. (4) with an MPC [35] that, instead of solving the optimization problem for an infinite horizon, executes the first step of the solution and then resamples the current state. By doing so, it alleviates the need for a global, open-loop plan that would require model-plant mismatches or abrupt discontinuities to be treated a priori. Instead, receding-horizon controllers correct them by sampling the real system state at the next optimization round.
We manage the compliant reaction to the environment separately, as seen in Fig. 1, and use the reference force f_r as a feature for the dynamics model and as the optimization variable of Eq. (4). Finally, for the MPC state we do not consider the full pose of the end-effector, but simplify the problem by only treating the translational parts of the cutting motion (the axes shown in Fig. 1). The motion on the remaining axes is controlled through a setpoint stiffness controller.
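One receding-horizon step can be sketched as below; the paper does not specify the underlying optimizer, so random shooting over reference-force sequences is used purely for illustration, with hypothetical bounds and sample counts:

```python
import numpy as np

def mpc_step(model, cost, x0, horizon, u_dim, n_samples=64, f_max=10.0, seed=0):
    """One receding-horizon step for the problem in Eq. (4): sample
    candidate reference-force sequences, roll each out through the
    learned dynamics model, accumulate the cost, and return only the
    first action of the best sequence. The true state is re-sampled
    before the next optimization round."""
    rng = np.random.default_rng(seed)
    best_u0, best_cost = None, np.inf
    for _ in range(n_samples):
        u_seq = rng.uniform(-f_max, f_max, size=(horizon, u_dim))
        x, total = x0, 0.0
        for u in u_seq:              # recursive rollout of the model
            x = model(x, u)
            total += cost(x, u)
        if total < best_cost:
            best_cost, best_u0 = total, u_seq[0]
    return best_u0                   # execute this, then re-plan

# Toy check with trivial integrator dynamics and a quadratic state cost
toy_model = lambda x, u: x + u
toy_cost = lambda x, u: float(x @ x)
u0 = mpc_step(toy_model, toy_cost, np.array([5.0]), horizon=3, u_dim=1)
```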
V. Evaluation
For all of the following experiments, the training set, or seen classes, includes trials for 6 object classes: cake, zucchini, cucumber, banana, pepper and eggplant. In Sections V-B and V-C we also examine the generalization ability of the different models by adding 3 completely unseen classes: cheese, potato and lemon. For the experiments, the blocks are formed by a fixed number of timesteps and, in Sections V-B and V-C, the prediction horizon is set to 5 blocks. More information regarding the models' parameter values and experimental details is listed in the Appendix.
V-A. Data Collection
We collected data from cutting trials executed with the controller in Eq. (1), where the desired behavior in Eq. (2) was commanded as a trajectory. The gains, as well as the sawing rate for the desired trajectory, were tuned depending on the object class in order to record a large variety of interaction modes. We included interactions with optimal parameters for the specific object, interactions with appropriate parameters for the whole class, as well as parameters that could accommodate all the classes, albeit not in an optimal manner. All the cutting trials were initialized above the object, at a height depending on its size, so as to include enough samples of free-space motion.
TABLE I: Mean cost per object class achieved within the MPC (* denotes unseen classes; ** averages computed with the RNN's lemon trials omitted).

Model  Cake  Zucchini  Cucumber  Banana  Pepper  Eggplant  Cheese*  Potato*  Lemon*  Avg.
RNN  1.08  1.04  1.49  1.59  0.92  1.42  0.71  0.55  N/A  1.13**
LSTM  1.07  2.34  0.96  1.72  2.08  2.11  1.69  0.68  1.43  1.56
LSTMc  1.51  1.41  1.68  1.19  1.21  0.98  2.18  0.51  0.99  1.29
LSTMlrc  1.26  0.91  0.71  2.31  0.72  0.95  0.73  1.76  1.80  1.24
Avg.  1.23  1.42  1.21  1.71  1.23  1.36  1.33  0.88  1.16**
V-B. Prediction Performance
In this section, we investigate the effects of the proposed training approach on the prediction performance of the networks modelling the dynamics. To evaluate this, we compare an RNN architecture (RNN), a baseline LSTM network trained directly for 5-block prediction (LSTM), an LSTM trained with a horizon curriculum (LSTMc) and, finally, an LSTM trained with the proposed combination of horizon curriculum and decaying learning rate (LSTMlrc). The RNN was structured and trained with the 3-stage approach in [14]. In the experiments, we investigate whether a simpler architecture can capture the dynamics, the effect of the proposed training approach and, finally, how these changes affect the generalization ability over different object classes.
Firstly, we examine the evolution of the mean L2 error between predicted and ground-truth trajectories as the prediction horizon increases up to 15 blocks into the future. Note that the trajectories consist of relative displacements, hence their small magnitude. For this experiment, the networks were trained on a dataset containing 34 cutting trials over 6 objects (210564 data points in total), while the validation set includes 15 independent trials over the same object categories (93447 data points in total).
From the results in Fig. 3, it can be seen that for short horizons the LSTM networks have comparable results, while the RNN displays a much higher error. As the horizon increases, the performance of all the networks except LSTMlrc degrades to the same point. For most of the horizon's range, LSTMc has only marginally better results than its simpler counterpart, showcasing that simply employing a learning curriculum is not enough to boost the predictive performance. Finally, throughout the experiment, LSTMlrc significantly outperforms all of the baselines, supporting that the combination of learning rate decay and curriculum training results in better performance that scales well with the prediction horizon.
Secondly, we report the average MSE during forward predictions on a test set for a prediction horizon of 5 blocks. For this purpose, we performed trials for 5 object classes that were also in the training set and the additional 3 unseen classes. We recorded two repetitions for three different controller parameter values, amounting to a total of 30 trials with seen classes and 18 trials with unseen ones.
Table II shows the corresponding results for each model on seen and unseen classes, as well as the total MSE for both cases. It is evident that LSTMlrc is consistently better than the rest of the models and generalizes well to the unseen cases. It is interesting to observe that, despite its poor scaling as the horizon grows, the RNN model shows slightly better results than the LSTM baselines in both datasets. This reinforces the results from the previous section concerning the training procedure and further indicates the usefulness of combining curriculum training with learning rate decay.
TABLE II: Average MSE during forward predictions on seen classes, unseen classes, and overall.

Model  Seen  Unseen  Total
RNN  2.08  3.33  2.55
LSTM  2.26  3.75  2.82
LSTMc  2.30  3.94  2.92
LSTMlrc  1.37  2.29  1.72
V-C. Robotic Experiments
Even though deep networks can efficiently model nonlinear mappings, when training a network for a dynamics model, the focus should be on the closed-loop behavior. Good prediction accuracy is a good indication of the modelling capabilities, but does not necessarily reflect what that behavior will be, as both training and testing are done on trajectories executed by a trajectory-tracking controller, which differs from the cost-based MPC used during online deployment. The desirable properties of that closed-loop system are primarily qualitative and difficult to express quantitatively. Considering that we are aiming to construct an intelligent system that handles the task of food-cutting in general, it is of paramount importance that the results for different models are consistently good and can tackle different object types. In that light, failure to complete trials for a given object type, such as the RNN model in Table I, should be weighted more heavily than good training and validation results.
To evaluate the models' performance within the controller, we executed a series of experiments with the 9 different object classes, seen and unseen. In the experiments, we used a YuMi-IRB 14000 collaborative robot with an OptoForce 6-axis Force/Torque sensor mounted on its wrist. For every model, we executed 5 trials per class. Throughout the trials, we kept the same cost function as reported in the Appendix. The trial ended successfully only when the knife had reached the cutting board. In any other case, e.g. if the execution time exceeded a minute, the trial was considered unsuccessful and the results were discarded. Since we have included sequences of free-space motion in the training data, we did not initialize the trials with the knife already in the object as in [15, 14], but directly above it with no further indications of the object's location.
It should be noted that, due to the robot's hardware limitations, stiffer objects often caused the torque limit to be surpassed, leading the robot to shut down, which constituted a failure, and the trial was repeated. This was especially evident while evaluating the RNN architecture on lemons, as the closed-loop behavior did not exhibit the sawing motion necessary to break friction. This hampered the downwards progress, or simply caused a hardware failure, making it impossible to collect successful trials. For this reason, the associated results in Table I are marked with a double asterisk to denote that RNN trials were omitted when calculating the average scores.
The only modifications made to the objects were size adjustments if their width exceeded the knife's length. We did not alter the objects' height, as that could change the dynamics of certain classes. Additionally, because of the aforementioned hardware limitations, when treating eggplants and lemons, we created a small slit on the object's surface to decrease the exerted torque during the initial contact and avoid reaching the torque limits before the controller had time to employ sawing motions.
Since most of the objects did not have homogeneous shapes, the required cutting length was often not the same. Consequently, the accumulated cost per trial would be a misleading and incomparable metric, as longer trajectories do not necessarily signify worse performance. Instead, in Table I we assess the online performance through the mean cost achieved by the models. For almost half of the classes, LSTMlrc again outperforms the rest of the baselines and achieves better mean costs. However, it is interesting to notice that LSTMc, despite having the highest MSE during forward predictions, still manages to perform well and, more importantly, accomplishes the best scores in 2 out of 3 unseen cases. Finally, even though RNN failed to complete a cut on lemons, it still has notable performance on several classes. This is partially due to the fact that the strategies it arrived at revolved mostly around slicing the object instead of sawing. However, because this behavior was more aggressive, it often led to failed trials due to the hardware limits.
In a successful cutting trial, it is straightforward to surmise that the main objective is downwards motion. Nevertheless, the sawing motion is related to the downwards progress, as it enables it by breaking friction and minimizing the shear force otherwise required. Consequently, apart from the mean cost, a crucial point of evaluation for the dynamics models is whether they lead the controller to infer useful strategies for each object class. For stiffer objects, a fine-grained understanding of the dynamics should drive the strategy around sawing, while for softer ones, it should deem it unnecessary. A qualitative demonstration of these emerging behaviors can be observed in Fig. 3(a) and Fig. 3(b), which depict the trajectories during trials on a soft (cake) and a stiff (eggplant) object.
In the former case, any strategy is viable, as there is no significant resistance from the material. LSTM, which had the best cost for this class, results in minimal sawing, as it is redundant, and so does LSTMlrc despite its worse cost-wise performance. On the other hand, RNN, which had almost the same cost as LSTM, displays similar behavior to the worst model for this class. In the latter case of the eggplant, it is substantially more difficult to cut through the object without sawing, because of its density and firmness. LSTMlrc demonstrates the most insightful behavior, with smooth sawing motions that led to the best cost. Similar behavior is adopted by LSTMc, which has the closest score, as opposed to LSTM and RNN, which only employed low-magnitude sawing that was not suitable for the dynamics of the class. In conclusion, even though LSTMlrc did not have the lowest cost for every class, it exhibited the most appropriately diverse techniques, which were able to adapt efficiently to the dynamics encountered amongst the classes.
VI. Conclusion
In this work, we presented a data-driven framework for the contact-rich task of food-cutting. We showed that by carefully designing every step of the method, we can produce models that have consistently good predictive performance on known cases and generalize well to unseen ones. When evaluated within a predictive controller, the proposed approach achieved the best mean cost in 4 out of 9 object classes and displayed a better understanding of the dynamics, as showcased by the strategies the controller adopted. In the future, it would be interesting to explore avenues that allow adaptation not only to different object sizes or classes, but to completely different and more complicated cases of cutting, such as objects with a large seed. To this end, we will further investigate the design choices so as to seamlessly incorporate behaviors that could otherwise be generated by switching controllers or a high-level planner.
Appendix
Network architectures
The LSTM networks consist of a fully connected input layer of size 90 with a hyperbolic tangent activation, followed by 2 LSTM layers of hidden size 9 and a linear output layer that transforms the LSTM output to size 30. The RNN baseline consists of 6 fully connected layers with hyperbolic tangent activation and 2 recurrent layers with 30 units each.
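A forward-pass sketch of the described LSTM architecture with randomly initialized weights; the input dimension and the initialization scale are assumptions not given in the text:

```python
import numpy as np

def lstm_cell(x, h, c, W, U, b):
    """One LSTM step; gate pre-activations stacked as [input, forget, cell, output]."""
    z = W @ x + U @ h + b
    H = h.size
    sig = lambda v: 1.0 / (1.0 + np.exp(-v))
    i, f, o = sig(z[:H]), sig(z[H:2*H]), sig(z[3*H:])
    g = np.tanh(z[2*H:3*H])
    c = f * c + i * g
    return o * np.tanh(c), c

def make_params(in_dim, fc=90, hid=9, out=30, seed=0):
    """Random weights for FC(in -> 90) + tanh, two LSTM layers of
    hidden size 9, and a linear 9 -> 30 output, matching the layer
    sizes stated above."""
    rng = np.random.default_rng(seed)
    init = lambda *s: 0.1 * rng.standard_normal(s)
    p = {"Wfc": init(fc, in_dim), "bfc": np.zeros(fc),
         "Wout": init(out, hid), "bout": np.zeros(out)}
    for layer, d in ((1, fc), (2, hid)):
        p[f"W{layer}"] = init(4*hid, d)
        p[f"U{layer}"] = init(4*hid, hid)
        p[f"b{layer}"] = np.zeros(4*hid)
    return p

def forward(x_seq, p, hid=9):
    """Run a block of feature vectors through the network; the final
    30-d output corresponds to one block of predicted future positions."""
    h1 = c1 = h2 = c2 = np.zeros(hid)
    for x in x_seq:
        a = np.tanh(p["Wfc"] @ x + p["bfc"])       # FC input layer
        h1, c1 = lstm_cell(a, h1, c1, p["W1"], p["U1"], p["b1"])
        h2, c2 = lstm_cell(h1, h2, c2, p["W2"], p["U2"], p["b2"])
    return p["Wout"] @ h2 + p["bout"]              # linear output layer

params = make_params(in_dim=9)
y = forward([np.ones(9) for _ in range(5)], params)
```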
Curriculum training
During curriculum training, we gradually increase the horizon until we reach the desired one. For every horizon, we train the network for 10 epochs and reduce the learning rate, except for the final-length prediction, for which we allow the network to train for 20 epochs without further changing it.
Training Hyperparameters
For all the networks we used Adam [36] with the hyperparameters learning rate (lr), weight decay (wd) and learning rate decay (gamma) set as listed in Table III.
Model  {lr, wd, gamma}
RNN  {1e-04, 5e-04, N/A}
LSTM  {2e-04, 3e-04, N/A}
LSTMc  {1e-04, 2e-04, N/A}
LSTMlrc  {1e-04, 3e-04, 0.5}
Model Predictive Control
The main components of the cost function are a term that drives the slicing motion towards the table's surface and a sawing term that enables the downward progress. Since there is no fixed trajectory, the sawing term does not penalize motion within a range around the central sawing point, with an ε margin, and is quadratic beyond it. Finally, to motivate smaller-effort solutions, we include the norm of the control input. Namely, for the prediction horizon T, the cost was given by:
J = Σ_{t=1}^{T} ( α c_cut(x_t) + β c_saw(x_t) + γ ||u_t||² ),

where c_cut penalizes the knife's distance from the table's surface, c_saw is the margin-based sawing term, α, β are positive constants weighting the contribution of the costs associated with the cutting and sawing actions, respectively, to the total cost, while γ is the weighting constant for the control input quadratic term.
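A single-timestep sketch of this cost; the exact functional forms, margin and weights are assumptions based on the description above:

```python
import numpy as np

def cutting_cost(z, x_saw, u, x_c=0.0, eps=0.05, a=1.0, b=1.0, g=0.01):
    """Drive the knife height z toward the table, penalize sawing
    positions x_saw only outside an eps-margin band around the central
    sawing point x_c (quadratically beyond it), and add the quadratic
    norm of the control input u."""
    saw = max(0.0, abs(x_saw - x_c) - eps)   # distance outside the band
    return a * z + b * saw**2 + g * float(np.dot(u, u))

# Inside the sawing band only the height term contributes;
# outside it the quadratic sawing penalty is added.
c_in = cutting_cost(0.1, 0.02, np.zeros(2))
c_out = cutting_cost(0.1, 0.12, np.zeros(2))
```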
References
 [1] N. Hogan, “Impedance control: An approach to manipulation,” in 1984 American Control Conference, June 1984, pp. 304–313.
 [2] B. Siciliano and L. Villani, Robot Force Control, 1st ed. Norwell, MA, USA: Kluwer Academic Publishers, 2000.
 [3] J. D. Schutter and H. V. Brussel, “Compliant robot motion i. a formalism for specifying compliant motion tasks,” The International Journal of Robotics Research, vol. 7, no. 4, pp. 3–17, 1988. [Online]. Available: https://doi.org/10.1177/027836498800700401
 [4] Y. Karayiannidis and Z. Doulgeri, “An adaptive law for slope identification and force position regulation using motion variables,” June 2006, pp. 3538–3543.
 [5] D. Zhang and B. Wei, “A review on model reference adaptive control of robotic manipulators,” Annual Reviews in Control, vol. 43, pp. 188–198, 2017. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S1367578816301110
 [6] Y. Karayiannidis, C. Smith, F. E. V. Barrientos, P. Ögren, and D. Kragic, “An adaptive control approach for opening doors and drawers under uncertainties,” IEEE Transactions on Robotics, vol. 32, no. 1, pp. 161–175, Feb 2016.
 [7] S. Chiaverini and L. Sciavicco, “The parallel approach to force/position control of robotic manipulators,” IEEE Transactions on Robotics and Automation, vol. 9, no. 4, pp. 361–373, Aug 1993.
 [8] M. H. Raibert and J. J. Craig, “Hybrid position/force control of manipulators,” Journal of Dynamic Systems, Measurement, and Control, vol. 103, Dec. 1980.
 [9] O. Kroemer, S. Niekum, and G. Konidaris, “A review of robot learning for manipulation: Challenges, representations, and algorithms,” ArXiv, vol. abs/1907.03146, 2019.
 [10] A. Atkins and X. Xu, “Slicing of soft flexible solids with industrial applications,” International Journal of Mechanical Sciences, vol. 47, no. 4, pp. 479 – 492, 2005, a Special Issue in Honour of Professor Stephen R. Reid’s 60th Birthday. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0020740305000457
 [11] X. Mu, Y. Xue, and Y.-B. Jia, “Robotic cutting: Mechanics and control of knife motion,” pp. 3066–3072, 2019.
 [12] P. Long, W. Khalil, and P. Martinet, “Force/vision control for robotic cutting of soft materials.”
 [13] B. Amos, I. D. J. Rodriguez, J. Sacks, B. Boots, and J. Z. Kolter, “Differentiable MPC for end-to-end planning and control,” in Proceedings of the 32nd International Conference on Neural Information Processing Systems, ser. NIPS’18. USA: Curran Associates Inc., 2018, pp. 8299–8310. [Online]. Available: http://dl.acm.org/citation.cfm?id=3327757.3327922
 [14] I. Mitsioni, Y. Karayiannidis, J. A. Stork, and D. Kragic, “Data-driven model predictive control for the contact-rich task of food cutting,” 2019 IEEE-RAS International Conference on Humanoid Robots (Humanoids), 2019.
 [15] I. Lenz, R. A. Knepper, and A. Saxena, “DeepMPC: Learning deep latent features for model predictive control,” in Robotics: Science and Systems, 2015.
 [16] S. Jung and T. Hsia, “Adaptive force tracking impedance control of robot for cutting nonhomogeneous workpiece,” vol. 3, Feb 1999, pp. 1800–1805.
 [17] G. Zeng and A. Hemami, “An adaptive control strategy for robotic cutting,” April 2002, pp. 22–27.
 [18] M. Sharma, K. Zhang, and O. Kroemer, “Learning Semantic Embedding Spaces for Slicing Vegetables,” 2019. [Online]. Available: http://arxiv.org/abs/1904.00303
 [19] K. Kronander and A. Billard, “Learning compliant manipulation through kinesthetic and tactile human-robot interaction,” IEEE Transactions on Haptics, vol. 7, pp. 367–380, 07 2014.
 [20] B. Huang, M. Li, R. L. De Souza, J. J. Bryson, and A. Billard, “A modular approach to learning manipulation strategies from human demonstration,” Autonomous Robots, vol. 40, no. 5, pp. 903–927, Jun 2016. [Online]. Available: https://doi.org/10.1007/s10514-015-9501-9
 [21] A. X. Lee, H. Lu, A. Gupta, S. Levine, and P. Abbeel, “Learning force-based manipulation of deformable objects from multiple demonstrations,” in 2015 IEEE International Conference on Robotics and Automation (ICRA), May 2015, pp. 177–184.
 [22] T. Tang, H.-C. Lin, and M. Tomizuka, “A learning-based framework for robot peg-hole-insertion,” in ASME 2015 Dynamic Systems and Control Conference, 10 2015, p. V002T27A002.
 [23] Y. Fan, J. Luo, and M. Tomizuka, “A learning framework for high precision industrial assembly,” in 2019 International Conference on Robotics and Automation (ICRA), May 2019, pp. 811–817.
 [24] T. Johannink, S. Bahl, A. Nair, J. Luo, A. Kumar, M. Loskyll, J. A. Ojea, E. Solowjow, and S. Levine, “Residual reinforcement learning for robot control,” 2019 International Conference on Robotics and Automation (ICRA), pp. 6023–6029, 2019.

 [25] Y. Bengio, J. Louradour, R. Collobert, and J. Weston, “Curriculum learning,” in Proceedings of the 26th Annual International Conference on Machine Learning, ser. ICML ’09. New York, NY, USA: ACM, 2009, pp. 41–48. [Online]. Available: http://doi.acm.org/10.1145/1553374.1553380
 [26] F. Ebert, S. Dasari, A. X. Lee, S. Levine, and C. Finn, “Robustness via Retrying: Closed-Loop Robotic Manipulation with Self-Supervised Learning,” in Conference on Robot Learning (CoRL), 2018, pp. 1–12. [Online]. Available: http://arxiv.org/abs/1810.03043
 [27] D. Xu, S. Nair, Y. Zhu, J. Gao, A. Garg, L. Fei-Fei, and S. Savarese, “Neural Task Programming: Learning to Generalize Across Hierarchical Tasks,” 2017. [Online]. Available: http://arxiv.org/abs/1710.01813
 [28] M. Ranzato, S. Chopra, M. Auli, and W. Zaremba, “Sequence level training with recurrent neural networks,” in 4th International Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico, May 2–4, 2016, Conference Track Proceedings, 2016. [Online]. Available: http://arxiv.org/abs/1511.06732
 [29] S. Sahoo, C. Lampert, and G. Martius, “Learning Equations for Extrapolation and Control,” ArXiv e-prints, June 2018.
 [30] R. Jonschkowski, R. Hafner, J. Scholz, and M. Riedmiller, “PVEs: Position-Velocity Encoders for Unsupervised Learning of Structured State Representations,” 2017. [Online]. Available: http://arxiv.org/abs/1705.09805
 [31] B. Siciliano and O. Khatib, Springer Handbook of Robotics. Berlin, Heidelberg: SpringerVerlag, 2007.
 [32] L. van der Maaten and G. Hinton, “Visualizing data using t-SNE,” Journal of Machine Learning Research, vol. 9, no. Nov, pp. 2579–2605, 2008.
 [33] S. Cotin, H. Delingette, and N. Ayache, “A hybrid elastic model for real-time cutting, deformations, and force feedback for surgery training and simulation,” The Visual Computer, vol. 16, no. 8, pp. 437–452, 2000. [Online]. Available: http://link.springer.com/10.1007/PL00007215
 [34] J. Chung, Ç. Gülçehre, K. Cho, and Y. Bengio, “Empirical evaluation of gated recurrent neural networks on sequence modeling,” ArXiv, vol. abs/1412.3555, 2014.
 [35] D. Q. Mayne and H. Michalska, “Receding horizon control of nonlinear systems,” IEEE Transactions on Automatic Control, vol. 35, no. 7, pp. 814–824, July 1990.
 [36] D. Kingma and J. Ba, “Adam: A method for stochastic optimization,” International Conference on Learning Representations, 12 2014.
 [37] A. Rao, “A survey of numerical methods for optimal control,” Advances in the Astronautical Sciences, vol. 135, 01 2010.