Learning and Generalisation of Primitive Skills Towards Robust Dual-arm Manipulation

04/02/2019 · by Èric Pairet, et al. · Heriot-Watt University

Robots are becoming a vital ingredient in society. Some of their daily tasks require dual-arm manipulation skills in the rapidly changing, dynamic and unpredictable real-world environments where they have to operate. Given the expertise of humans in conducting these activities, it is natural to study humans' motions to use the resulting knowledge in robotic control. With this in mind, this work leverages human knowledge to formulate a more general, real-time, and less task-specific framework for dual-arm manipulation. The proposed framework is evaluated on the iCub humanoid robot and several synthetic experiments, by conducting a dual-arm pick-and-place task of a parcel in the presence of unexpected obstacles. Results suggest the suitability of the method towards robust and generalisable dual-arm manipulation.


Introduction

The last decades have witnessed a drastic increase in the use of robots in industrial, professional and domestic environments. Among the countless competences that robots have acquired, some of the most outstanding are automating repetitive and exhausting tasks in manufacturing plants, working in hazardous scenarios unreachable to humans, assisting doctors in challenging surgical operations, and taking responsibility for household chores. A common issue in all these applications is the need for manipulating large objects and assembling multi-component elements without external assistance. On top of that, current manipulators lack human-like generalisation capabilities to confront highly dynamic and changing environments. Thus, endowing robots with human-like dual-arm manipulation capabilities is essential to extend their competences and autonomy.

Traditional approaches for governing these dual-arm systems depend upon a deep understanding of the model underlying the system’s behaviour [Smith et al.2012]. Even though deriving an accurate model is possible for some complex systems, approximations are commonly used to keep the calculations computationally tractable, at the cost of increased model uncertainty [Pairet et al.2018]. Furthermore, some of these methods lack scalability and generalisation capabilities along and across tasks. In other words, they require an expert programmer to hand-define all possible scenarios, movements and tasks, as well as extensive manual tuning of the system’s control architecture [Argall et al.2009].

Figure 1: iCub humanoid being taught how to avoid an obstacle (red sphere). Within the proposed framework, this primitive skill provides robustness to novel scenarios.

The growth of AI has popularised more natural techniques for robot learning, reducing the laborious task of coding every possible scenario and, thus, increasing the modularity and flexibility of these systems. An example of this is imitation learning, also referred to as learning by demonstration (LbD). This method allows non-robotics-experts to interact with, teach and modify the robot’s behaviours [Nicolescu and Mataric2003], and, consequently, to obtain more human-like behaviours with enhanced acceptability and compatibility in human workspaces [Ajoudani et al.2017]. Given the possibility to learn from humans’ expertise and dexterity in using both arms for manipulation purposes, it is natural to exploit LbD to transfer human motions to robotic control.

Teaching a robot from human demonstrations can be challenging. The different anatomical characteristics between the teacher and the learner produce the correspondence problem, i.e. the issue of identifying a mapping between teacher and learner which allows transferring information from one to the other [Dautenhahn and Nehaniv2002]. Moreover, complex motions involve a mixture of human intentions, which are difficult to learn accurately when following an all-at-once learning baseline [Bajcsy et al.2018]. On top of that, teaching a dual-arm system can be a considerable endeavour for non-robotics-experts [Akgun et al.2012].

LbD offers some generalisation capabilities, such as changes in initial and goal configurations of a given demonstration [Billard et al.2008]. However, being limited to similar scenarios is not realistic in the rapidly changing, dynamic and unpredictable environments where robots have to operate. Extended robustness can be obtained by letting the system iteratively adapt and improve the learnt task in new scenarios [Guenter et al.2007]. This leads to the well-known exploration-exploitation dilemma, which comes at the cost of needing to fail in order to learn and, consequently, at the risk of harming the robot during the self-learning process.

This paper presents a framework that seeks to jointly overcome the aforementioned issues, namely (i) the complex and ambiguous teaching procedures and (ii) the limited generalisation capabilities. Aiming to provide a dual-arm system with a more general and less task-specific method for real-time and robust manipulation in challenging, even unfamiliar, environments, the proposed framework (i) leverages human knowledge to learn and create a library of primitive skills and (ii) endows dual-arm systems with human-like manipulation capabilities by combining the primitive behaviours.

The main contribution of this work is the formulation of a framework which learns individual primitive skills from human demonstrations and exploits them for robust dual-arm manipulation purposes. Such a framework extends the capabilities of the method in [Pastor et al.2009] to handle the requirements of dual-arm systems. This leads to a framework which reuses its knowledge to generalise according to its environment awareness, in contrast to the proposals in [Zöllner, Asfour, and Dillmann2004, Topp2017]. The potential of this method has been demonstrated in a simulated environment on the iCub humanoid (see Figure 1). The experimental results suggest the suitability of the framework to address the aforementioned challenges.

System Description and Problem Formulation

The aim of this paper is an end-to-end learning-based framework that allows real-time autonomous dual-arm manipulation in unfamiliar environments. Such a framework needs to be able to adapt its plan for achieving a task according to the surrounding environment, while ensuring synchronisation between both end-effectors. Moreover, it needs to be easily programmable, making a dual-arm platform customisable and accessible even to non-robotics-experts. Bearing these requirements in mind, this section first describes the typology and diversity of possible actions in a dual-arm system. It then analyses the challenges that arise when learning actions from human demonstrations. Finally, this section puts the previous pieces together to formulate the modelisation of the dual-arm system and its grasping.

Dual-arm Primitive Skills Taxonomy

Dual-arm manipulators are extremely sophisticated systems, and so, consequently, are the control actions required to achieve a specific performance. This work contemplates that any complex behaviour is composed of a vast repertoire of actions or primitive skills [Montesano et al.2008]. In the context of manipulation via a dual-arm system, a possible classification of any primitive skill falls into these two groups:

  • Absolute skills: imply a change of configuration of the manipulated object in the Cartesian space. Examples: moving, placing and/or turning an object in a particular manner.

  • Relative skills: exert an action on the manipulated object in the object space. Examples: opening a bottle’s screw cap, or holding a parcel by means of force contact.

Each type of primitive skill uniquely produces movement in its own space. In other words, the absolute and relative skills lie in orthogonal spaces. It is natural to expect a dual-arm system to simultaneously carry out, at least, one absolute and one relative skill to successfully accomplish a task. Let us analyse the task of moving a bottle to a certain position while opening its screw cap. Both end-effectors synchronously move to reach a desired configuration (absolute skill). At the same time, the left end-effector is constrained to hold the bottle upright (relative skill), while the right end-effector unscrews the cap (relative skill).

Learning for a Dual-arm Manipulator

LbD provides a large set of recording techniques and mathematical supports for encoding a demonstrated skill. However, learning a particular task from human demonstrations raises some challenges, namely (i) clearly understanding the intentions of a demonstration and (ii) establishing a teacher-learner communication channel. Both issues can drastically affect the learning outcome if they are not well addressed [Argall et al.2009].

The demonstration clarity issue is tackled by leveraging the premise that a vast repertoire of primitive skills forms the basis of any complex behaviour. With this in mind, this work avoids demonstrating a task itself and, instead, teaches the robot the involved primitive skills. This task factorisation provides similar benefits as the work in [Bajcsy et al.2018]: it allows the user to teach one feature of the task at a time, and, if required, to correct them individually.

Factorising a complex behaviour into primitive actions reduces the number of degrees of freedom (DoF) to focus on at demonstration time. As an example, the desired position and orientation of a task can be encoded in separate primitive skills and, thus, demonstrated one at a time. This is particularly useful for easing the complex process of teaching a dual-arm system [Akgun et al.2012]. This work employs kinesthetic guiding to establish a teacher-learner communication channel which does not suffer from the correspondence problem [Argall et al.2009].

Dual-arm System Modelisation

Figure 2: Dual-arm manipulator modelled in the Cartesian space as a spring-damper closed-chain system.

Given the variety of primitive skills that a dual-arm system can execute, this work seeks to model the robotic platform in a generalisable yet modular fashion which accounts for both absolute and relative skills. To this aim, let us consider the closed kinematic chain depicted in Figure 2. Each arm $a$, where $a \in \{L, R\}$, interacts with the same object in the workspace $\mathbb{R}^n$, where $n$ is the dimensionality of the considered Cartesian subspace. In this context, an absolute skill explains the movement of the object in the workspace, while a relative skill describes the actions of each end-effector with respect to the object’s reference frame $\{O\}$. Note that $\{O\}$ is the centre of the closed-chain dual-arm system. Thus, the remainder of the paper uses $\{O\}$ as the object’s and system’s frame indistinguishably.

Let the current state of the closed-chain dual-arm system be defined by the position, velocity and acceleration of its system’s frame in each DoF of the workspace, i.e. the triplet $\{x, \dot{x}, \ddot{x}\}$. The dynamics of such a system are approximated by those of a spring-damper system acting between the object’s frame and its goal configuration $g$ (see Figure 2). This dynamical system generates in each DoF a movement trajectory with velocity and acceleration profiles defined by:

$$\ddot{x} = K\,(g - x) - D\,\dot{x}, \qquad (1)$$

where $g$ is the model’s attractor that the system converges to with critically damped dynamics and null velocity when $K, D > 0$ and $D = 2\sqrt{K}$ [Ijspeert et al.2013].

Given any initial system state, the dynamical system in (1) generates a linear displacement towards the goal configuration $g$. Any other dynamical behaviour can be represented by an external force acting on the system’s frame as:

$$\ddot{x} = K\,(g - x) - D\,\dot{x} + f, \qquad (2)$$

where the coupling term $f$ describes the profile of the external force affecting the natural dynamics of the system. In other words, $f$ characterises the system’s behaviour and, thus, can be used to encode and retrieve a primitive skill.
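To make these dynamics concrete, the following minimal sketch integrates the one-DoF spring-damper system in (2) with an optional coupling term. The gains, time step and function names are illustrative assumptions, not values from the original work:

```python
import numpy as np

def rollout(x0, g, f=None, K=100.0, dt=0.01, T=2.0):
    """Integrate x'' = K(g - x) - D x' + f(t) for one DoF, as in eq. (2).

    D = 2*sqrt(K) yields critically damped convergence to the attractor g.
    `f` is an optional coupling term encoding a primitive skill.
    """
    D = 2.0 * np.sqrt(K)
    x, xd = x0, 0.0
    traj = []
    for step in range(int(T / dt)):
        t = step * dt
        coupling = f(t) if f is not None else 0.0
        xdd = K * (g - x) - D * xd + coupling  # eq. (2)
        xd += xdd * dt                         # explicit Euler integration
        x += xd * dt
        traj.append(x)
    return np.array(traj)

# With f = None the system reproduces the perturbationless dynamics of eq. (1).
path = rollout(x0=0.0, g=1.0)
```

Plugging in a learnt coupling term reshapes the trajectory without altering the convergence to $g$.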

Dual-arm Grasping Geometry

Any action referenced to the object’s frame $\{O\}$ can be projected to the end-effectors using the grasping geometry of the manipulated object. This allows computing the end-effector control commands required to achieve a particular absolute task. To this aim, the grasp matrix needs to be computed. The grasp matrix $G_a$ of the $a$-th end-effector is a transformation map which establishes a velocity relation between the contact point $c_a$ and the system’s reference frame $\{O\}$. For a workspace $\mathbb{R}^6$, i.e. considering the linear and rotational information of the 3D space, the grasping geometry establishes the following relation:

$$\dot{x}_{c_a} = G_a^\top \, \dot{x}_O, \qquad (3)$$

where

$$G_a^\top = \begin{bmatrix} I_3 & -S(r_a) \\ 0_3 & I_3 \end{bmatrix}, \qquad (4)$$

where $I_3$ is the $3 \times 3$ identity matrix, and $S(r_a)$ is the skew-symmetric matrix performing the cross product:

$$S(r) = \begin{bmatrix} 0 & -r_z & r_y \\ r_z & 0 & -r_x \\ -r_y & r_x & 0 \end{bmatrix}, \qquad (5)$$

where $r_a$ is the position vector from the object’s reference frame $\{O\}$ to the contact point $c_a$.

A global grasp map $G$ for the dual-arm manipulator can be defined by horizontally concatenating the grasp matrix of each end-effector, i.e. $G = [G_L \;\, G_R]$, where $G_L$ and $G_R$ are the left and right arm grasp matrices, respectively.
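As an illustration of (3)-(5), the sketch below builds the grasp map and projects an object twist to both contact points. Sign conventions for the skew term vary across the literature; this version assumes $r_a$ is measured from $\{O\}$ to the contact, and all names are our own:

```python
import numpy as np

def skew(r):
    """Skew-symmetric matrix S(r) such that skew(r) @ w == np.cross(r, w), eq. (5)."""
    return np.array([[0.0, -r[2], r[1]],
                     [r[2], 0.0, -r[0]],
                     [-r[1], r[0], 0.0]])

def grasp_matrix_T(r):
    """Transpose of the grasp matrix for a contact at position r w.r.t. {O}.

    Maps the object twist [v; w] in R^6 to the contact-point twist, eq. (3)-(4).
    """
    G_T = np.eye(6)
    G_T[:3, 3:] = -skew(r)  # v_c = v_O + w x r = v_O - S(r) w
    return G_T

# Example: a 0.2 m wide parcel grasped at both sides along the y-axis.
r_left, r_right = np.array([0.0, 0.1, 0.0]), np.array([0.0, -0.1, 0.0])
object_twist = np.array([0.1, 0.0, 0.0, 0.0, 0.0, 0.5])  # [v; w] of {O}
v_left = grasp_matrix_T(r_left) @ object_twist
v_right = grasp_matrix_T(r_right) @ object_twist
```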

Framework for Robust Dual-Arm Manipulation

Figure 3: Scheme of the three stages involved in the proposal. Learning: a human demonstrator teaches some primitive behaviours to the system. Rolling-out: the robot exploits (generalises and combines according to its environment awareness) the acquired knowledge. Evaluation: an evaluator inspects the system’s performance and decides whether reteaching is necessary.
Figure 4: Dual-arm pick-and-place of a parcel (brown prism) in the presence of obstacles (grey prism).

In order to endow robots with real-time, robust and autonomous dual-arm manipulation, while letting non-robotics-experts easily program and customise the system’s behaviour, this work presents the learning-based framework depicted in Figure 3. Such a framework jointly addresses the aforementioned requirements with three sequential parts: (i) the learning module, which learns a set of primitive skills from human demonstrations, (ii) the roll-out module, which combines those primitive skills to plan a trajectory which makes the system succeed at a task, even in unfamiliar environments, and (iii) the evaluation module, which lets a human-in-the-loop supervise the robot’s behaviour and reteach a specific skill, if required.

Given a learnt repertoire (library) of absolute and relative primitive skills, such basic motions need to be combined to confront any dual-arm task in any possible scenario. Each absolute task is defined by its coupling term $f$, which leads to a desired triplet $\{x_A, \dot{x}_A, \ddot{x}_A\}$ after rolling-out (2). Similarly, each relative task defines a desired triplet for each end-effector. This work considers weighting and merging the contribution of each primitive skill at the velocity level as:

$$\begin{bmatrix} \dot{x}_L \\ \dot{x}_R \end{bmatrix} = G^\top \sum_{i} w_{A_i} \, \dot{x}_{A_i} + \sum_{j} w_{R_j} \, \dot{x}_{R_j}, \qquad (6)$$

where $\dot{x}_L$ and $\dot{x}_R$ respectively are the velocity commands for the left and right end-effector which satisfy the set of activated primitive skills, $\dot{x}_{A_i}$ is the velocity of the $i$-th absolute primitive skill available in the library, and $\dot{x}_{R_j}$ is the velocity of the $j$-th relative primitive skill available in the library. Primitive skill selection according to a desired task and environment is conducted with the weights $w_{A_i}$ and $w_{R_j}$. Works such as the one in [Ardón et al.2018] propose addressing this problem according to the object’s affordances and an analysis of the environment.
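A minimal sketch of the blending in (6) might look as follows, assuming the skill velocities and weights are provided by a selection mechanism such as the one in [Ardón et al.2018]; the function and variable names are hypothetical:

```python
import numpy as np

def merge_skills(G_T, abs_vels, abs_weights, rel_vels, rel_weights):
    """Blend active primitive skills at the velocity level, as in eq. (6).

    abs_vels: object-frame velocities (R^6) produced by absolute skills.
    rel_vels: stacked end-effector velocities (R^12) produced by relative skills.
    Returns the stacked [left; right] end-effector velocity commands.
    """
    v_abs = sum(w * v for w, v in zip(abs_weights, abs_vels))
    v_rel = sum(w * v for w, v in zip(rel_weights, rel_vels))
    return G_T @ v_abs + v_rel  # G_T (R^{12x6}) projects {O}-frame motion to both arms
```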

The generality of the proposed framework is narrowed down to provide an application case. This work exploits such a framework to endow a dual-arm system with enhanced autonomy for the dual-arm task of pick-and-place of a parcel, even in the presence of unexpected obstacles. Figure 4 depicts the main idea: parcels (brown prisms) are meant to be moved from one side to another, adjusting the behaviour of the dual-arm system depending on whether an obstacle (grey prism) is present. Not requiring complex grasping capabilities is the main reason for choosing this application case. However, it is extremely challenging in terms of synchronisation: the manipulators must maintain a certain amount of contact force on the carried parcel at all times, as any variation would result in releasing the handled object or exposing it to stress. To this aim, the library of primitive skills is loaded with the underlying dynamics of a pick-and-place task, an obstacle avoidance behaviour and a grasp maintenance behaviour. Note that the former two skills are absolute, while the latter is relative.

Skill Dynamics

The non-linear dynamical behaviour of any task can be represented using dynamic movement primitives (DMP). This mathematical encoding support has proven to be a versatile tool for modelling and learning complex motions, since: (a) any movement can be efficiently learned and generated, (b) a unique demonstration is already generalisable, (c) convergence to the goal is guaranteed, and (d) the representation is translation- and time-invariant [Pastor et al.2009, Ijspeert et al.2013]. Some of these DMP-inherent generalisation capabilities are depicted in Figure 5.

The system modelisation in (2) can integrate a DMP as the coupling term $f$. This means that the perturbationless dynamics of the spring-damper system are modified according to the DMP coupled in each DoF. If the workspace is $\mathbb{R}^3$, three position-encoding DMP would describe the desired position of the manipulated object. Instead, if it is $\mathbb{R}^6$, four additional quaternion-based DMP would be required to also encode the object’s desired orientation [Ude et al.2014].

Formally, a position-encoding DMP is a weighted linear combination of non-linear radial basis functions (RBF) [Pastor et al.2009, Ijspeert et al.2013]. The value of such a non-linear function when evaluated at a specific entry $s$ is defined as:

$$f(s) = \frac{\sum_i \psi_i(s)\, w_i}{\sum_i \psi_i(s)}\, s\, (g - x_0), \qquad (7)$$

$$\psi_i(s) = \exp\left(-h_i\, (s - c_i)^2\right), \qquad (8)$$

where $c_i$ and $h_i$ are the centres and widths, respectively, of the RBF distributed along the trajectory. Each RBF is weighted by $w_i$. The phase variable $s$ is utilised to avoid direct dependency of $f$ on time. The dynamics of $s$ are defined as:

$$\dot{s} = -\alpha_s\, s, \qquad (9)$$

where the initial value of the canonical system is $s_0 = 1$, and $\alpha_s$ is a positive constant.

The learning of the DMP relies on adjusting the set of RBF, i.e. the weight vector $w$ composed of all weights $w_i$. To this aim, least mean squares (LMS) is used to compute the weight vector which makes the system (2) adjust to the proprioception information $\{x, \dot{x}, \ddot{x}\}$ of a recorded skill.
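The sketch below illustrates this fitting procedure for one DoF under the standard DMP formulation of [Ijspeert et al.2013]; the canonical gain, number of RBF and their placement are illustrative choices rather than those of the original work:

```python
import numpy as np

def canonical(n_steps, dt, alpha_s=4.0):
    """Phase variable s(t) with s(0) = 1, following eq. (9): s' = -alpha_s * s."""
    return np.exp(-alpha_s * dt * np.arange(n_steps))

def rbf_features(s, n_rbf=20, alpha_s=4.0):
    """Normalised RBF activations psi_i(s) from eq. (8), centred along the phase."""
    c = np.exp(-alpha_s * np.linspace(0.0, 1.0, n_rbf))  # centres in phase space
    h = n_rbf ** 2 / c ** 2                              # heuristic widths
    psi = np.exp(-h * (s[:, None] - c) ** 2)
    return psi / psi.sum(axis=1, keepdims=True)

def fit_weights(x, dt, K=100.0, alpha_s=4.0):
    """Least-squares fit of the weights w_i so that (2) reproduces a demo x(t)."""
    D = 2.0 * np.sqrt(K)                                 # critically damped
    xd = np.gradient(x, dt)
    xdd = np.gradient(xd, dt)
    g, x0 = x[-1], x[0]
    f_target = xdd - K * (g - x) + D * xd                # coupling term from eq. (2)
    s = canonical(len(x), dt, alpha_s)
    Phi = rbf_features(s) * (s * (g - x0))[:, None]      # regressors of eq. (7)
    w, *_ = np.linalg.lstsq(Phi, f_target, rcond=None)
    return w
```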

Figure 5: DMP generalisation capabilities. Given a demonstration (red trajectory), rolling-out (2) with the DMP coupling term defined in (7) lets the system generalise to new goal configurations (blue trajectories).

Obstacle Avoidance

Figure 6: Change of steering angle $\dot{\varphi}$ following the original formulation in (11) for fixed tuning constants $\gamma$ and $\beta$.

An analytical description of how humans steer around an obstacle was first presented in [Fajen and Warren2003]. Later on, such a biologically-inspired formulation was used in [Hoffmann et al.2009] for single-arm manipulation purposes. Let $x$, $\dot{x}$ and $\varphi$ be, respectively, the system’s position, velocity and orientation with respect to an obstacle located at $o$, all referenced to the workspace reference frame $\{W\}$. In order to avoid the obstacle, the system in (2) needs to change its acceleration according to:

$$\ddot{x} = K\,(g - x) - D\,\dot{x} + R\,\dot{x}\,\dot{\varphi}, \qquad (10)$$

where $R$ is a rotation matrix with respect to the vector $r = (o - x) \times \dot{x}$, and $\dot{\varphi}$ is the desired turning velocity:

$$\dot{\varphi} = \gamma\,\varphi\,\exp(-\beta\,|\varphi|), \qquad (11)$$

where $\gamma$ and $\beta$ are tuning constants. Their effect can be best understood in Figure 6: $\gamma$ sets the abruptness of the obstacle avoidance behaviour, and $\beta$ determines its sensitivity.

Within the framework, the parameters of the obstacle avoidance behaviour are learnt from human demonstrations, thus involving less parameter tuning. This is achieved using LMS after log-linearising (11) and arranging it as:

$$Y = \ln\!\left(\frac{\dot{\varphi}}{\varphi}\right) = \ln\gamma - \beta\,|\varphi| = X \begin{bmatrix} \ln\gamma \\ \beta \end{bmatrix}, \qquad (12)$$

where the training data $Y$ and $X = \left[\,1 \;\; -|\varphi|\,\right]$ contain the periodically sampled values of $\dot{\varphi}$ and $\varphi$ experienced during the obstacle avoidance demonstration. The change in steering angle is retrieved from (10) as $R\,\dot{x}\,\dot{\varphi} = \ddot{x} - K\,(g - x) + D\,\dot{x}$, i.e. the difference in dynamics between a perturbationless task and one with obstacles is only motivated by the presence of the obstacle.
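A sketch of this parameter recovery, assuming paired samples of $\varphi$ and $\dot{\varphi}$ with matching sign have already been extracted from the demonstration (function and variable names are our own):

```python
import numpy as np

def steering_rate(phi, gamma, beta):
    """Desired turning velocity of eq. (11)."""
    return gamma * phi * np.exp(-beta * np.abs(phi))

def fit_avoidance_params(phi, phi_dot):
    """LMS fit of (gamma, beta) from the log-linearisation of eq. (11):
    ln(phi_dot / phi) = ln(gamma) - beta * |phi|, as in eq. (12)."""
    Y = np.log(phi_dot / phi)  # valid where phi_dot / phi > 0
    X = np.column_stack([np.ones_like(phi), -np.abs(phi)])
    (ln_gamma, beta), *_ = np.linalg.lstsq(X, Y, rcond=None)
    return np.exp(ln_gamma), beta

# Sanity check: recover the parameters from synthetic, noiseless samples.
phi = np.linspace(0.05, 1.5, 50)
gamma_hat, beta_hat = fit_avoidance_params(phi, steering_rate(phi, 20.0, 4.0))
```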

Grasp Maintenance

Manipulation of a rigid object via a dual-arm system requires each end-effector to be in contact with the object. Moreover, when the interaction is by force contact (without grasping the object), it is also essential to apply sufficient forces to ensure grasp maintenance, i.e. prevention of contact separation and unwanted contact sliding. The complexity of this task usually requires modelling the required coupling forces as a state- and time-dependent function. For applications with low dynamical requirements, this dynamical function can be approximated as [Gams et al.2014]:

$$\dot{x}_{C_a} = k_c\,(F_d - F_a), \qquad (13)$$

where $k_c$ is an error-multiplying constant which transforms errors in force contact into velocity commands, $F_d$ is the desired coupling force, and $F_a$ is the current coupling force retrieved from the robot’s sensors. Thus, the learning of this primitive skill resides in learning from demonstrations the desired coupling force $F_d$ which ensures grasp maintenance.
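At its simplest, this force-regulating relative skill reduces to a proportional law, sketched below with an illustrative gain:

```python
def grasp_maintenance_cmd(force_desired, force_measured, k_c=0.005):
    """Velocity command along the contact normal from eq. (13): a positive
    error (too little squeeze) drives the end-effector towards the object."""
    return k_c * (force_desired - force_measured)
```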

Results and Evaluation

The work presented in this paper is a generic framework for any dual-arm manipulator. Experimental evaluation has been carried out on synthetic environments and a simulated iCub humanoid. This section first introduces the iCub robot and the execution of kinesthetic learning on this platform. It then describes the learning of the obstacle avoidance behaviour and analyses its integration in a synthetic pick-and-place task. Finally, this section depicts the potential of the proposed framework for being used on a humanoid robot.

Experimental Platform

iCub is an open source humanoid robot testbed for research into human cognition and artificial intelligence applications [Metta et al.2008]. The physical and software characteristics of this robot make it an ideal platform for the presented research. Among all this robot’s features, some of the most relevant to this work are the two 7-DoF manipulators equipped with a torque sensor on the shoulder, tactile sensors in the fingertips and palm, and integrated stereo vision. iCub operates under the YARP middleware.

Kinesthetic teaching on the iCub humanoid is conducted by setting all joints in gravity compensation allowing the teacher to physically manoeuvre the robot through a desired skill. During the demonstrations, proprioception information is retrieved via YARP ports.

Obstacle Avoidance Behaviour

Figure 7: iCub humanoid robot [Metta et al.2008] learning the primitive skill of obstacle avoidance with two different behaviours: reckless (first column) and conservative (second column). (a)-(b) Human demonstrations to avoid an obstacle (red sphere). (c)-(d) iCub’s proprioception data. (e)-(f) Processed iCub’s proprioception data (red trajectory) and learned obstacle avoidance behaviour (blue trajectory).

The primitive skill of obstacle avoidance has been taught to iCub with two different behaviours: reckless (see Figure 7a) and conservative (see Figure 7b). While the former steers closely around the obstacle (red sphere), the latter keeps a larger distance from it. The recorded raw proprioception data of these two kinesthetic demonstrations is portrayed in Figure 7c and Figure 7d, respectively. As can be observed, the retrieved trajectories are noisy and not smooth.

To learn from these demonstrations, the data has been preprocessed in two steps: (i) filtering to remove outliers and high-frequency noise, and (ii) projecting the filtered information onto the 2D space composed of the two principal components of the data. Figure 7e and Figure 7f show the preprocessed data (red trajectory), which has been used in (12) to learn the parameters defining the demonstrator’s obstacle avoidance behaviour. The encoded reckless and conservative styles are depicted in Figure 7e and Figure 7f, respectively (blue trajectory). Note that learning the parameters instead of the motion itself lets the robot generalise such behaviour under different conditions.
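A possible realisation of this two-step preprocessing, assuming a low-pass Butterworth filter stands in for the outlier and noise removal, and using an SVD for the principal-component projection (sampling rate and cutoff are illustrative):

```python
import numpy as np
from scipy.signal import butter, filtfilt

def preprocess_demo(raw, fs=100.0, cutoff=5.0):
    """Smooth raw proprioception data and project it onto its two
    principal components, as in the preprocessing described above.

    raw: (T, 3) array of recorded positions; fs: sampling rate in Hz."""
    b, a = butter(2, cutoff / (0.5 * fs))  # 2nd-order low-pass filter
    smooth = filtfilt(b, a, raw, axis=0)   # zero-phase filtering, no lag
    centred = smooth - smooth.mean(axis=0)
    _, _, Vt = np.linalg.svd(centred, full_matrices=False)
    return centred @ Vt[:2].T              # 2D projection onto PC1-PC2
```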

Overall, it can be concluded from Figure 7 that the obstacle avoidance encoding support and its learning process from human demonstrations are able to encapsulate the demonstrator’s style. The differences between the demonstrated skill and the learnt one are mainly attributed to the hypothesis that any steering around an obstacle follows the formulation in (10)-(11). Moreover, the noise in the proprioception data increases the variance in the learning. Alternatively, a high-precision tracking system such as the one used in [Rai et al.2014] can be considered. Because the proposed approach extracts the parameters of an obstacle avoidance behaviour, the resulting knowledge would still be independent of the demonstration frame.

Synthetic Pick-and-Place Task

The performance of the obstacle avoidance behaviour in a more realistic context has been validated using the pick-and-place setup depicted in Figure 8. In particular, an initial pick-and-place demonstration is given to the system (red trajectory), consisting of moving the parcel from left to right without the presence of any obstacle (grey prism). The underlying dynamics defining this primitive skill have been encoded as a DMP. Due to the inherent generalisation capabilities of the DMP, the system is already able to infer the pick-and-place dynamics from different starting and goal positions (blue trajectory), but it is not able to generalise to the presence of obstacles. Only after coupling the previously learnt pick-and-place dynamics and the obstacle avoidance behaviour together is the system able to generalise in real-time to the presence of unexpected obstacles (black trajectory).

Figure 8: Dual-arm pick-and-place of a parcel (brown prism) in the presence of obstacles (grey prism). Demonstration (red trajectory), inference to a new position (blue trajectory), inference with obstacle avoidance (black trajectory).

Framework on Humanoid Robot

Figure 9: iCub humanoid robot [Metta et al.2008] picking a parcel and raising it with specific dynamics (red trajectory) to a goal configuration (red star). (a) Following the task dynamics previously learnt from a human demonstrator. (b) Modifying the task dynamics in real-time to avoid an obstacle (blue cross). Grasp maintenance is successfully ensured in both cases by the corresponding primitive skill.

The applicability of the framework has been tested with a particular dual-arm task. The framework has been developed in YARP to deploy it on a simulated iCub humanoid. Due to the lack of realistic simulated force sensors and, thus, the lack of awareness of the force exerted on the carried object, the grasp maintenance primitive skill in (13) has been replaced according to the proposal in [Gams et al.2014]:

$$\dot{x}_{C_a} = k_c\,(d_{d_a} - d_a), \qquad (14)$$

where $d_{d_a}$ is the desired distance from the object’s frame $\{O\}$ to the contact point $c_a$, being $a \in \{L, R\}$, and $d_a$ is the current distance. Due to the symmetry of the task, $d_{d_L} = d_{d_R}$. Thus, the learning of this primitive skill is reduced to setting $d_{d_a}$ according to the characteristics of the manipulated parcel and the grasping points.
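Reusing the earlier sketches, a single control tick of this simulated task could combine the three skills as follows; all values are illustrative, and the contact geometry assumes the parcel is held along its y-axis:

```python
import numpy as np

# Velocities proposed by the active skills at this instant (illustrative values).
v_task = np.array([0.05, 0.0, 0.10, 0.0, 0.0, 0.0])   # pick-and-raise DMP, eq. (2)
v_avoid = np.array([0.0, 0.0, 0.04, 0.0, 0.0, 0.0])   # obstacle coupling, eq. (10)

# Relative skill: regulate the end-effector-to-object distance, eq. (14).
squeeze = 1.0 * (0.10 - 0.102)  # k_c * (d_desired - d_current)
v_rel = np.zeros(12)
v_rel[1], v_rel[7] = squeeze, -squeeze  # each arm moves along +/- y towards d_desired

G_T = np.vstack([grasp_matrix_T(np.array([0.0, 0.1, 0.0])),
                 grasp_matrix_T(np.array([0.0, -0.1, 0.0]))])  # 12x6 grasp map
cmd = merge_skills(G_T, [v_task, v_avoid], [1.0, 1.0], [v_rel], [1.0])  # eq. (6)
```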

After this arrangement, forced by the simulated nature of the experimentation, the pick-and-raise activity was conducted (see Figure 9). Such a task consists of picking a parcel from the table and raising it with certain dynamics (red trajectory), while avoiding obstacles and ensuring grasp maintenance. iCub performed the described dual-arm task in two different contexts: first, in the absence of obstacles, where the robot can move the parcel with the designated dynamics (see Figure 9a); and second, in the presence of unexpected obstacles (blue cross), where the robot had to replan the trajectory to achieve the goal configuration (see Figure 9b). Despite the simplicity of the primitive skill used to ensure grasp maintenance, the trials were successful: both end-effectors were accurately synchronised so that the handled parcel was neither released nor exposed to stress.

These results show that iCub has been able to perform the pick-and-place task, even in the presence of an unexpected obstacle, after learning three primitive skills individually from a human demonstrator. This fact raises expectations about the degree of similarity that iCub’s final behaviour might have with the demonstrator’s behaviour under the same conditions. Analysing this similarity is of interest to the human-robot interaction (HRI) community, since it can contribute to enhancing the acceptability and compatibility of robots in human workspaces [Ajoudani et al.2017]. An alternative for conducting this study consists of recording some samples of the robotic and human approaches and quantifying their deviation with the Kullback-Leibler (KL) divergence statistic. The lower this indicator, the higher the chances that the two agents have similar behaviours. Such a study is left for future work.

Final Remarks and Future Work

This work has presented a novel framework which endows a dual-arm system with real-time, robust and less task-specific manipulation capabilities. Such a framework is twofold: it (i) learns from human demonstrations to create a library of primitive skills, and (ii) combines such knowledge to confront challenging, unfamiliar scenarios with human-like manipulation capabilities. Unlike the framework of motion primitives in [Pastor et al.2009], the proposed approach handles primitive skills for dual-arm manipulation purposes while still being able to combine different primitives at the same time. This feature is what differentiates the current work from similar ones [Zöllner, Asfour, and Dillmann2004, Topp2017]. The evaluation conducted on the iCub humanoid suggests the proposal’s suitability for robust dual-arm manipulation, yet with some room for improvement.

The framework is restricted neither to the presented experimental evaluation nor to the platform. Any system able to retrieve proprioception information can benefit from this work. Moreover, any primitive skill that might be required for dual-arm manipulation can be included in the framework’s library. The application case reported in this manuscript exemplifies this fact by considering, among other primitive skills, an obstacle avoidance behaviour which steers around obstacles in real-time. The desired reactivity of this obstacle avoidance behaviour is learnt from human demonstrations.

Future work will significantly extend the library of primitive skills so that more scenarios involving challenging dual-arm manipulation tasks can be addressed within the framework. Action selection will be integrated to automatically select from the framework’s library the necessary set of skills to address a specific task. Apart from the task itself, the surrounding environment and the characteristics and constraints of the object to manipulate might need to be considered. Finally, imminent efforts will focus on learning force-dependent primitive skills, such as the grasp maintenance one, on the real iCub humanoid robot, as well as evaluating the entire framework on such a platform.

Acknowledgments

This work has been partially supported by ORCA Hub EPSRC (EP/R026173/1) and consortium partners.

References

  • [Ajoudani et al.2017] Ajoudani, A.; Zanchettin, A. M.; Ivaldi, S.; Albu-Schäffer, A.; Kosuge, K.; and Khatib, O. 2017. Progress and prospects of the human–robot collaboration. Autonomous Robots 1–19.
  • [Akgun et al.2012] Akgun, B.; Cakmak, M.; Jiang, K.; and Thomaz, A. L. 2012. Keyframe-based learning from demonstration. International Journal of Social Robotics 4(4):343–355.
  • [Ardón et al.2018] Ardón, P.; Pairet, È.; Ramamoorthy, S.; and Lohan, K. S. 2018. Towards robust grasps: Using the environment semantics for robotic object affordances. In Proceedings of the AAAI Fall Symposium on Reasoning and Learning in Real-World Systems for Long-Term Autonomy, 5–12. AAAI Press.
  • [Argall et al.2009] Argall, B. D.; Chernova, S.; Veloso, M.; and Browning, B. 2009. A survey of robot learning from demonstration. Robotics and autonomous systems 57(5):469–483.
  • [Bajcsy et al.2018] Bajcsy, A.; Losey, D. P.; O’Malley, M. K.; and Dragan, A. D. 2018. Learning from physical human corrections, one feature at a time. In Proceedings of the 2018 ACM/IEEE International Conference on Human-Robot Interaction, 141–149. ACM.
  • [Billard et al.2008] Billard, A.; Calinon, S.; Dillmann, R.; and Schaal, S. 2008. Robot programming by demonstration. In Springer handbook of robotics. Springer. 1371–1394.
  • [Dautenhahn and Nehaniv2002] Dautenhahn, K., and Nehaniv, C. L. 2002. The correspondence problem. In Imitation in Animals and Artifacts. MIT Press.
  • [Fajen and Warren2003] Fajen, B. R., and Warren, W. H. 2003. Behavioral dynamics of steering, obstacle avoidance, and route selection. Journal of Experimental Psychology: Human Perception and Performance 29(2):343.
  • [Gams et al.2014] Gams, A.; Nemec, B.; Ijspeert, A. J.; and Ude, A. 2014. Coupling movement primitives: Interaction with the environment and bimanual tasks. IEEE Transactions on Robotics 30(4):816–830.
  • [Guenter et al.2007] Guenter, F.; Hersch, M.; Calinon, S.; and Billard, A. 2007. Reinforcement learning for imitating constrained reaching movements. Advanced Robotics 21(13):1521–1544.
  • [Hoffmann et al.2009] Hoffmann, H.; Pastor, P.; Park, D.-H.; and Schaal, S. 2009. Biologically-inspired dynamical systems for movement generation: automatic real-time goal adaptation and obstacle avoidance. In Robotics and Automation, 2009. ICRA’09. IEEE International Conference on, 2587–2592. IEEE.
  • [Ijspeert et al.2013] Ijspeert, A. J.; Nakanishi, J.; Hoffmann, H.; Pastor, P.; and Schaal, S. 2013. Dynamical movement primitives: learning attractor models for motor behaviors. Neural computation 25(2):328–373.
  • [Metta et al.2008] Metta, G.; Sandini, G.; Vernon, D.; Natale, L.; and Nori, F. 2008. The icub humanoid robot: an open platform for research in embodied cognition. In Proceedings of the 8th workshop on performance metrics for intelligent systems, 50–56. ACM.
  • [Montesano et al.2008] Montesano, L.; Lopes, M.; Bernardino, A.; and Santos-Victor, J. 2008. Learning object affordances: from sensory-motor coordination to imitation. IEEE Transactions on Robotics 24(1):15–26.
  • [Nicolescu and Mataric2003] Nicolescu, M. N., and Mataric, M. J. 2003. Natural methods for robot task learning: Instructive demonstrations, generalization and practice. In Proceedings of the second international joint conference on Autonomous agents and multiagent systems, 241–248. ACM.
  • [Pairet et al.2018] Pairet, È.; Hernández, J. D.; Lahijanian, M.; and Carreras, M. 2018. Uncertainty-based online mapping and motion planning for marine robotics guidance. In 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2367–2374. IEEE.
  • [Pastor et al.2009] Pastor, P.; Hoffmann, H.; Asfour, T.; and Schaal, S. 2009. Learning and generalization of motor skills by learning from demonstration. In Robotics and Automation, 2009. ICRA’09. IEEE International Conference on, 763–768. IEEE.
  • [Rai et al.2014] Rai, A.; Meier, F.; Ijspeert, A.; and Schaal, S. 2014. Learning coupling terms for obstacle avoidance. In Humanoid Robots (Humanoids), 2014 14th IEEE-RAS International Conference on, 512–518. IEEE.
  • [Smith et al.2012] Smith, C.; Karayiannidis, Y.; Nalpantidis, L.; Gratal, X.; Qi, P.; Dimarogonas, D. V.; and Kragic, D. 2012. Dual arm manipulation: A survey. Robotics and Autonomous systems 60(10):1340–1353.
  • [Topp2017] Topp, E. A. 2017. Knowledge for synchronized dual-arm robot programming. In AAAI Fall Symposium Series 2017. AAAI Press.
  • [Ude et al.2014] Ude, A.; Nemec, B.; Petrić, T.; and Morimoto, J. 2014. Orientation in cartesian space dynamic movement primitives. In Robotics and Automation (ICRA), 2014 IEEE International Conference on, 2997–3004. IEEE.
  • [Zöllner, Asfour, and Dillmann2004] Zöllner, R.; Asfour, T.; and Dillmann, R. 2004. Programming by demonstration: dual-arm manipulation tasks for humanoid robots. In IROS, 479–484.