Skill Acquisition via Automated Multi-Coordinate Cost Balancing

We propose a learning framework, named Multi-Coordinate Cost Balancing (MCCB), to address the problem of acquiring point-to-point movement skills from demonstrations. MCCB encodes demonstrations simultaneously in multiple differential coordinates that specify local geometric properties. MCCB generates reproductions by solving a convex optimization problem with a multi-coordinate cost function and linear constraints on the reproductions, such as initial, target, and via points. Further, since the relative importance of each coordinate system in the cost function might be unknown for a given skill, MCCB learns optimal weighting factors that balance the cost function. We demonstrate the effectiveness of MCCB via detailed experiments conducted on one handwriting dataset and three complex skill datasets.



I Introduction

The next generation of robots, which must operate in and adapt to unstructured and dynamic environments, will require a diverse set of skills. However, it is implausible to pre-program robots with a library of all required skills. Learning from Demonstration (LfD) [1, 2] is a paradigm that aims to equip robots with the ability to learn efficiently from demonstrations provided by humans. Existing work in trajectory-based LfD has contributed a wide range of mathematical representations that encode skills from human demonstrations and then reproduce the learned skills at runtime. Proposed representations include spring-damper systems with forcing functions [3], Gaussian Mixture Models (GMMs) [4, 5, 6], Neural Networks (NNs) [7, 8], Gaussian Processes (GPs) [9, 10, 11], and geometric objects [12], among others. Each of these representations encodes the demonstrations in a predefined space or coordinate system (e.g., Cartesian coordinates). In other words, a single best coordinate system for any given skill is assumed to both exist and be known. However, as we show in this work, this assumption does not hold. Further, encoding in only a single coordinate system prevents the model from capturing some of the geometric features that underlie a demonstrated skill.

Fig. 1: A comparison of reproductions generated by considering different coordinates, illustrating the need for cost balancing.

In this work, we contribute a learning framework that encodes demonstrations simultaneously in multiple coordinates and balances the relative influences of the learned models in generating reproductions. The proposed framework, named Multi-Coordinate Cost Balancing (MCCB), encodes demonstrations in three differential coordinates: Cartesian, tangent, and Laplacian (Section III-A). Simultaneously learning in these three coordinates allows our method to capture all of the underlying geometric properties that are central to a given skill. MCCB encodes the joint density of the time index and the demonstrations in each differential coordinate using a separate statistical model. Thus, given any time instant, we can readily obtain the conditional mean and covariance in each coordinate system (Section III-B). MCCB generates reproductions by solving an optimization problem with a blended cost function that consists of one term per coordinate. Each term penalizes deviations from the norm, weighted by the inverse of the expected variance in the corresponding coordinate system (Section III-C). Further, we subject the optimization problem to linear constraints on the reproductions, such as initial, target, and via-point constraints. Our constrained optimization problem is convex with respect to the reproduction and hence can be solved efficiently.

A major hurdle in learning a wide variety of skills, without significant parameter tweaking, is that the relative importance of each differential coordinate (or the geometric feature) in encoding a given task is unknown ahead of time. For instance, consider the problem of encoding the demonstrations illustrated in Fig. 1. Using any one coordinate system in isolation, even when the most suitable one is known, does not yield good reproductions (the red, brown, and green dashed lines). To alleviate this problem, MCCB preferentially weights the costs defined in each coordinate (Fig. 2). Importantly, MCCB learns the optimal weights directly from the demonstrations without making task-dependent assumptions. To this end, MCCB solves a meta optimization problem that aims to minimize reproduction errors (Section III-D). As shown by the solid blue lines in Fig. 1, a cost function that optimally balances the costs in each coordinate yields better reproductions than any single-coordinate method.

In summary, we contribute a unified task-independent learning framework that (1) encodes demonstrations simultaneously in multiple differential coordinates, (2) defines a blended cost function that incentivizes conformance to the norm in each coordinate system while considering expected variance, and (3) learns optimal weights directly from the demonstrations to balance the relative influence of each differential coordinate in generating reproductions. Further, MCCB is compatible with and complementary to several existing LfD methods that utilize different statistical representations and coordinate systems [13, 14, 12, 11, 10, 15, 16].

Fig. 2: A flow diagram illustrating MCCB.

II Related Work

Learning from demonstration has attracted a lot of attention from researchers in the past few decades. While several categories of LfD methods exist [1], our work falls under the category of trajectory-based LfD. In this category, demonstrations take the form of trajectories and the methods aim to synthesize trajectories that accurately reproduce the demonstrations.

Dynamical systems-based trajectory learning methods, such as [5, 6, 7], encode demonstrations using statistical dynamical systems and generate reproductions by forward propagating the dynamics. While such deterministic methods exhibit several advantages, such as convergence guarantees and robustness to perturbations, they are restricted to learning in a single coordinate system and ignore inherent uncertainties in the demonstrations. They incentivize conformance to the norm even when demonstrations exhibit high variance.

Trajectory optimization methods, such as [17] and [18], focus on geometric features by minimizing costs specified using predefined norms. An optimization framework proposed in [19] attempts to adapt multiple demonstrations to new initial and target locations by minimizing the distance between the demonstrations and the reproduction according to a learned Hilbert space norm. Indeed, learning an appropriate Hilbert space norm is related to finding an appropriate coordinate system based on the demonstrations. However, similar to the dynamical systems-based methods, the methods in [17, 18, 19] are restricted to a single predefined or learned coordinate system and do not explicitly model and utilize the inherent time-dependent variations in the demonstrations.

Probabilistic trajectory-learning methods, such as [10, 11] and [14], on the other hand, capture and utilize the variation observed in the demonstrations. However, these methods are also restricted to encoding demonstrations in a single predefined coordinate system that is assumed to be known.

Our design of the costs in each differential coordinate is inspired by the minimal intervention principle [13], which takes variance into account. While the approach in [13] does encode demonstrations in different frames of reference, all the frames are restricted to Cartesian coordinates or orientation space. Furthermore, all the frames relevant to a given task are expected to be provided by the user.

The motion planning framework in [15], complementary to our approach, utilizes a blended cost function whose construction is guided by probability distributions learned from the demonstrations. This framework incentivizes factors such as smoothness, manipulability, and obstacle avoidance, but is restricted to the Cartesian coordinate system. MCCB, on the other hand, encodes demonstrations in multiple differential coordinates and learns to optimally balance their relative influences, but does not consider factors such as manipulability and obstacle avoidance.

Differential coordinates have been used extensively in the computer graphics community [20, 21]. Prior work in trajectory learning that incorporates differential coordinates includes the Laplacian trajectory editing (LTE) algorithm [16]. Using Laplacian coordinates, the LTE algorithm adapts a single demonstration to new initial, target, and via points while preserving its shape. However, the LTE algorithm does not reason about the relative importance of multiple coordinates.

III Methodology

This section describes the technical details of MCCB and its workflow, as illustrated in Fig. 2.

III-A Differential Coordinate Transformations

In this section, we define the differential coordinates and their corresponding transformations used in MCCB.

Cartesian: Let a discrete finite-length trajectory in $n$-dimensional Cartesian coordinates be denoted by $X = [x_1, \dots, x_T]$ and let $x_t \in \mathbb{R}^n$ denote a discrete sample at time index $t$. This trajectory can be represented using a graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, where $\mathcal{V}$ is the set of vertices representing the samples in the trajectory and $\mathcal{E}$ is the set of edges that represent the connections between the samples. The neighborhood of each vertex $v_t$ is defined by the set of adjacent vertices $\mathcal{N}(v_t)$. In the case of discrete-time trajectories, the edges between any given vertex and its two neighbors are assumed to carry unit weights, while all other edges carry zero weights.

Laplacian: It is known that the discrete Laplace-Beltrami operator for the trajectory provides the Laplacian coordinates as $\delta_t = x_t - \frac{1}{2}(x_{t-1} + x_{t+1})$ [20]. Note that the above relationship can be written as a linear differential operator in matrix form

$$\Delta = L X, \qquad (1)$$

where $\Delta = [\delta_1, \dots, \delta_T]$ is the trajectory in the Laplacian coordinates, and $L \in \mathbb{R}^{T \times T}$, called the graph Laplacian, is given by

$$L = \begin{bmatrix} 1 & -1 & & & \\ -\tfrac{1}{2} & 1 & -\tfrac{1}{2} & & \\ & \ddots & \ddots & \ddots & \\ & & -\tfrac{1}{2} & 1 & -\tfrac{1}{2} \\ & & & -1 & 1 \end{bmatrix}. \qquad (2)$$

As pointed out in [16], the Laplacian coordinates have meaningful geometric interpretations. Specifically, they can be seen as discrete approximations of the derivatives of the unit tangent vectors of an arc-length-parametrized continuous trajectory; equivalently, the Laplacian coordinates measure the deviation of each sample from the centroid of its neighbors.

Tangent: While the Laplacian coordinates are discrete approximations of a second-order differential transformation, a discrete approximation of the first-order differential transformation is also possible. Consider such a first-order transformation using first-order finite differences, defined as $\tau_t = x_{t+1} - x_t$, where $\tau_t$ is called the tangent coordinate. The matrix form of the above relationship results in a linear differential operator given by

$$\mathcal{T} = G X, \qquad (3)$$

where $\mathcal{T} = [\tau_1, \dots, \tau_{T-1}]$ is the trajectory in the tangent coordinates and $G \in \mathbb{R}^{(T-1) \times T}$, called the graph incidence matrix, is given by

$$G = \begin{bmatrix} -1 & 1 & & \\ & -1 & 1 & \\ & & \ddots & \ddots \\ & & & -1 \;\; 1 \end{bmatrix}. \qquad (4)$$

Similar to the Laplacian coordinates, the tangent coordinates have geometric interpretations. Specifically, they can be seen as discrete approximations of the un-normalized tangent vectors of an arc-length-parametrized continuous trajectory, i.e., the tangent coordinates measure the local direction of motion at each sample of the trajectory.

In our work, we assume that a set of $D$ demonstrations in the Cartesian coordinates is available. Let the $d$-th demonstration be denoted by $X^{(d)}$. Note that if the raw demonstrations are of varying duration, we first time-align them using dynamic time warping. MCCB transforms each demonstration into a trajectory in the tangent coordinates (denoted by $\mathcal{T}^{(d)}$) and a trajectory in the Laplacian coordinates (denoted by $\Delta^{(d)}$) using (3) and (1), respectively.
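The two linear operators above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation; the toy trajectory and the path-graph boundary handling (endpoints keep their single neighbor) are assumptions made for the example:

```python
import numpy as np

def incidence_matrix(T):
    """First-order difference operator G in R^{(T-1) x T}: (G X)_t = x_{t+1} - x_t."""
    G = np.zeros((T - 1, T))
    for t in range(T - 1):
        G[t, t] = -1.0
        G[t, t + 1] = 1.0
    return G

def graph_laplacian(T):
    """Graph Laplacian L in R^{T x T} for a path graph with unit edge weights,
    normalized so that (L X)_t = x_t minus the mean of its neighbors."""
    L = np.eye(T)
    L[0, 1] = -1.0       # endpoint rows have a single neighbor
    L[-1, -2] = -1.0
    for t in range(1, T - 1):
        L[t, t - 1] = -0.5
        L[t, t + 1] = -0.5
    return L

# Transform a short 2-D toy trajectory into both differential coordinates.
X = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 1.0], [3.0, 3.0]])  # T = 4 samples
tangent = incidence_matrix(len(X)) @ X     # local directions of motion
laplacian = graph_laplacian(len(X)) @ X    # deviation from neighbor centroid
```

Applying both operators to each demonstration yields the three coordinate views that MCCB encodes in parallel.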

III-B Encoding in Multiple Differential Coordinates

This section describes how MCCB encodes the demonstrations in each coordinate. With the demonstrations available in all three differential coordinates, we employ three independent Gaussian mixture models (GMMs) to approximate the joint probability densities of time and the samples in each coordinate system. Note that MCCB does not rely on the use of GMMs; any statistical representation that can provide the required conditional estimates will suffice.

The GMM associated with the Cartesian coordinates approximates the joint density of $t$ and $x_t$ using a finite number of Gaussian basis functions as $P(t, x_t \mid \theta_x) = \sum_{k=1}^{K} \pi_k\, \mathcal{N}(t, x_t \mid \mu_k, \Sigma_k)$, where $K$ is the number of Gaussian basis functions, $\pi_k$ is the prior associated with the $k$-th basis function, $\theta_x = \{\pi_k, \mu_k, \Sigma_k\}_{k=1}^{K}$ is the set of parameters of the GMM, and $\mathcal{N}(\cdot \mid \mu_k, \Sigma_k)$ is the Gaussian probability density with mean $\mu_k$ and covariance matrix $\Sigma_k$ of the $k$-th basis function.

We learn the parameters $\theta_x$ of the model using the Expectation-Maximization algorithm based on the demonstrations $\{X^{(d)}\}_{d=1}^{D}$. Given the learned model and a time instant $t$, the expected value of the conditional density is given by Gaussian mixture regression (GMR) [22] as

$$\hat{x}_t = \mathbb{E}[x_t \mid t] = \sum_{k=1}^{K} h_k(t)\, \big( \mu_k^{x} + \Sigma_k^{xt} (\Sigma_k^{tt})^{-1} (t - \mu_k^{t}) \big), \qquad (5)$$

where $\mu_k = [\mu_k^{t}; \mu_k^{x}]$, $\Sigma_k = \begin{bmatrix} \Sigma_k^{tt} & \Sigma_k^{tx} \\ \Sigma_k^{xt} & \Sigma_k^{xx} \end{bmatrix}$, $h_k(t) = \pi_k \mathcal{N}(t \mid \mu_k^{t}, \Sigma_k^{tt}) \big/ \sum_{j=1}^{K} \pi_j \mathcal{N}(t \mid \mu_j^{t}, \Sigma_j^{tt})$, and the conditional covariance is given by

$$\hat{\Sigma}_t^{x} = \sum_{k=1}^{K} h_k(t)\, \big( \Sigma_k^{xx} - \Sigma_k^{xt} (\Sigma_k^{tt})^{-1} \Sigma_k^{tx} \big). \qquad (6)$$

Similar to the GMM learned in the Cartesian coordinates, we learn a second GMM in the tangent coordinates based on the demonstrations $\{\mathcal{T}^{(d)}\}_{d=1}^{D}$, and a third GMM in the Laplacian coordinates based on the demonstrations $\{\Delta^{(d)}\}_{d=1}^{D}$. The expected values of the conditional densities are given by

$$\hat{\tau}_t = \mathbb{E}[\tau_t \mid t] = \sum_{k=1}^{K} h_k^{\tau}(t)\, \big( \mu_k^{\tau} + \Sigma_k^{\tau t} (\Sigma_k^{tt})^{-1} (t - \mu_k^{t}) \big), \qquad (7)$$

$$\hat{\delta}_t = \mathbb{E}[\delta_t \mid t] = \sum_{k=1}^{K} h_k^{\delta}(t)\, \big( \mu_k^{\delta} + \Sigma_k^{\delta t} (\Sigma_k^{tt})^{-1} (t - \mu_k^{t}) \big), \qquad (8)$$

and the corresponding conditional covariances are given by

$$\hat{\Sigma}_t^{\tau} = \sum_{k=1}^{K} h_k^{\tau}(t)\, \big( \Sigma_k^{\tau\tau} - \Sigma_k^{\tau t} (\Sigma_k^{tt})^{-1} \Sigma_k^{t\tau} \big), \qquad (9)$$

$$\hat{\Sigma}_t^{\delta} = \sum_{k=1}^{K} h_k^{\delta}(t)\, \big( \Sigma_k^{\delta\delta} - \Sigma_k^{\delta t} (\Sigma_k^{tt})^{-1} \Sigma_k^{t\delta} \big), \qquad (10)$$

where the variables in (7)-(10) with superscripts $\tau$ and $\delta$ are the parameters of the tangent- and Laplacian-coordinate GMMs, respectively, and are defined similarly to the ones in (5)-(6).
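For concreteness, the GMR conditioning step can be sketched as follows. This is a minimal single-input sketch, not the authors' code; it assumes time occupies the first dimension of each component's mean and covariance, and the example parameters are made up:

```python
import numpy as np

def gmr(t, priors, means, covs):
    """Gaussian mixture regression: conditional mean and covariance of the
    output given time t. Component k has mean means[k] and covariance covs[k],
    both partitioned with time as the first dimension."""
    K = len(priors)
    h = np.empty(K)
    cond_means, cond_covs = [], []
    for k in range(K):
        mu_t, mu_x = means[k][0], means[k][1:]
        s_tt = covs[k][0, 0]
        s_xt = covs[k][1:, 0]
        # responsibility of component k for this time instant (1-D Gaussian)
        h[k] = priors[k] * np.exp(-0.5 * (t - mu_t) ** 2 / s_tt) / np.sqrt(2 * np.pi * s_tt)
        cond_means.append(mu_x + s_xt * (t - mu_t) / s_tt)
        cond_covs.append(covs[k][1:, 1:] - np.outer(s_xt, s_xt) / s_tt)
    h /= h.sum()
    mean = sum(hk * m for hk, m in zip(h, cond_means))
    cov = sum(hk * c for hk, c in zip(h, cond_covs))
    return mean, cov

# Single-component toy model: at t equal to the component's time mean,
# the conditional mean is just the spatial part of the component mean.
mean, cov = gmr(0.5, priors=[1.0],
                means=[np.array([0.5, 1.0, 2.0])],
                covs=[np.eye(3)])
```

Evaluating this at every time index yields the per-step conditional means and covariances that feed the costs in the next section.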

III-C Imitation via Optimization

In this section, we explain the design of our multi-coordinate cost function. MCCB generates reproductions by solving the constrained optimization problem

$$\min_{Y} \;\; w_x C_x(Y) + w_\tau C_\tau(Y) + w_\delta C_\delta(Y) \qquad (11)$$

$$\text{s.t.} \;\; A\, \mathrm{vec}(Y) = b, \qquad (12)$$

where $Y = [y_1, \dots, y_T]$ is the reproduction; $w_x, w_\tau, w_\delta$ are positive weights; $C_x(\cdot), C_\tau(\cdot), C_\delta(\cdot)$ are cost functions in the Cartesian, tangent, and Laplacian coordinates, respectively; and $A$ and $b$ define linear constraints on $Y$. In practice, we use the linear constraints to enforce constraints on initial, target, and via points.

We define the cost function in each coordinate system as follows:

$$C_x(Y) = \big(\mathrm{vec}(Y) - \mathrm{vec}(\hat{X})\big)^{\top} (\Sigma^{x})^{-1} \big(\mathrm{vec}(Y) - \mathrm{vec}(\hat{X})\big), \qquad (13)$$

$$C_\tau(Y) = \big(\mathrm{vec}(GY) - \mathrm{vec}(\hat{\mathcal{T}})\big)^{\top} (\Sigma^{\tau})^{-1} \big(\mathrm{vec}(GY) - \mathrm{vec}(\hat{\mathcal{T}})\big), \qquad (14)$$

$$C_\delta(Y) = \big(\mathrm{vec}(LY) - \mathrm{vec}(\hat{\Delta})\big)^{\top} (\Sigma^{\delta})^{-1} \big(\mathrm{vec}(LY) - \mathrm{vec}(\hat{\Delta})\big), \qquad (15)$$

where $\Sigma^{x}, \Sigma^{\tau}, \Sigma^{\delta}$ denote the block-diagonal matrices formed from the conditional covariances $\hat{\Sigma}_t^{x}, \hat{\Sigma}_t^{\tau},$ and $\hat{\Sigma}_t^{\delta}$, respectively, for all values of $t$. Further, $\mathrm{vec}(\cdot)$ denotes vectorization; for instance, $\mathrm{vec}(\hat{X})$ and $\mathrm{vec}(\hat{\mathcal{T}})$ denote the vectorized trajectories formed by vertically stacking $\hat{x}_t$ and $\hat{\tau}_t$ for all values of $t$, respectively. Note that we construct the tangent and Laplacian trajectories in (14) and (15) from $Y$ via the linear operators defined in (3) and (1), respectively. MCCB thus penalizes deviations from the conditional mean in each coordinate system; however, deviations are penalized less (more) severely if high (low) variance is observed in the demonstrations at any given time.
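Because the problem in (11)-(12) is an equality-constrained quadratic program, a reproduction can be recovered in closed form from its KKT system. The following is a minimal sketch under illustrative assumptions (a 1-D toy trajectory, identity covariances, and pinned endpoints; not the authors' solver):

```python
import numpy as np

def solve_mccb_qp(terms, A, b):
    """Minimize sum_c w * (M y - m)^T S^{-1} (M y - m) subject to A y = b.
    terms is a list of (w, M, m, S) tuples, one per coordinate system."""
    n = A.shape[1]
    H = np.zeros((n, n))
    f = np.zeros(n)
    for w, M, m, S in terms:
        Si = np.linalg.inv(S)
        H += 2.0 * w * M.T @ Si @ M   # Hessian contribution of this coordinate
        f += 2.0 * w * M.T @ Si @ m   # linear term pulling toward the mean m
    k = A.shape[0]
    KKT = np.block([[H, A.T], [A, np.zeros((k, k))]])
    sol = np.linalg.solve(KKT, np.concatenate([f, b]))
    return sol[:n]                    # discard the Lagrange multipliers

# Toy 1-D example: pull a 4-sample reproduction toward the mean trajectory
# [0, 1, 2, 3] while pinning the initial and final points.
mean_traj = np.array([0.0, 1.0, 2.0, 3.0])
terms = [(1.0, np.eye(4), mean_traj, np.eye(4))]
A = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]])
b = np.array([0.0, 3.0])
y = solve_mccb_qp(terms, A, b)
```

Adding the tangent and Laplacian terms amounts to appending `(w, G, m_tau, S_tau)` and `(w, L, m_lap, S_lap)` tuples, since both operators are linear in the reproduction.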

III-D Automated Cost Balancing

In order to obtain reproductions that successfully imitate demonstrations of a wide variety of skills, the weights $w_x$, $w_\tau$, and $w_\delta$ have to be chosen with care. Indeed, they preferentially weight the costs defined in each differential coordinate and thereby control the relative incentive for successful imitation in each coordinate system.

We learn these weights directly from the available demonstrations. Note that, for known weights, the constrained optimization problem in (11) is convex in $Y$. We estimate the weights in the following form:

$$w_c = s_c\, \alpha_c, \qquad c \in \{x, \tau, \delta\}, \qquad (16)$$

where $s_c > 0$ are positive scaling factors used to correct for inherent differences in the magnitudes of the costs, and $\alpha_c > 0$, such that $\sum_{c} \alpha_c = 1$, are positive weights used to preferentially weight the cost defined in each coordinate system. MCCB estimates the scaling factors $s_c$ as the inverse of the average magnitude of the corresponding cost over the demonstrations:

$$s_c = \bigg( \frac{1}{D} \sum_{d=1}^{D} C_c\big(X^{(d)}\big) \bigg)^{-1}. \qquad (17)$$
With the scaling factors compensating for the inherent scale differences in the costs, we compute the preferential weighting factors $\alpha_c$ that minimize reproduction error. To this end, we formulate the following meta optimization problem:

$$\alpha^{*} = \operatorname*{arg\,min}_{\alpha_x + \alpha_\tau + \alpha_\delta = 1,\; \alpha_c > 0} \; \sum_{d=1}^{D} e\big(X^{(d)}, Y^{(d)*}(\alpha)\big), \qquad (18)$$

where $e(\cdot, \cdot)$ denotes the sum of squared errors computed over time, and $Y^{(d)*}(\alpha)$ is the solution to the following optimization problem:

$$Y^{(d)*}(\alpha) = \operatorname*{arg\,min}_{Y} \; \sum_{c \in \{x, \tau, \delta\}} s_c\, \alpha_c\, C_c(Y) \quad \text{s.t.} \quad A_d\, \mathrm{vec}(Y) = b_d, \qquad (19)$$

where $A_d$ and $b_d$ denote linear constraints specific to the demonstration $X^{(d)}$, such as initial, target, and via points. Solving the above meta optimization problem yields the preferential weights $\alpha_c^{*}$ that minimize the reproduction errors of the solutions generated by the original constrained optimization problem in (11)-(12).
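One simple way to approximate the meta optimization over the simplex-constrained weights is an exhaustive grid search that re-solves the inner problem for each candidate. A sketch under stated assumptions (the grid resolution and the toy `reproduce`/`error` functions are illustrative; the paper does not prescribe this particular solver):

```python
import numpy as np
from itertools import product

def simplex_grid(resolution):
    """Enumerate candidate weights (a_x, a_tau, a_lap), each positive,
    summing to 1, on a regular simplex grid."""
    pts = []
    for i, j in product(range(1, resolution), repeat=2):
        k = resolution - i - j
        if k >= 1:
            pts.append((i / resolution, j / resolution, k / resolution))
    return pts

def balance_weights(reproduce, error, demos, resolution=10):
    """Meta-optimization sketch: pick the simplex point minimizing total
    reproduction error. reproduce(alpha, demo) returns a reproduction;
    error(demo, repro) returns a scalar such as the SSE."""
    best_alpha, best_err = None, np.inf
    for alpha in simplex_grid(resolution):
        total = sum(error(d, reproduce(alpha, d)) for d in demos)
        if total < best_err:
            best_alpha, best_err = alpha, total
    return best_alpha

# Toy check: with an error measuring distance to a known target weight
# vector, the search recovers the nearest grid point.
target = np.array([0.5, 0.3, 0.2])
reproduce = lambda alpha, demo: np.array(alpha)
error = lambda demo, repro: float(np.sum((repro - target) ** 2))
alpha = balance_weights(reproduce, error, demos=[0])
```

Since each inner solve is a convex QP, the search is cheap for coarse grids; gradient-free optimizers over the simplex are a natural drop-in replacement.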

IV Experimental Evaluation

This section describes the design and discusses the results of four experiments conducted to evaluate MCCB. In each experiment, we compared the performances of the following approaches:

  1. Cartesian coordinates: $\alpha_x = 1$, $\alpha_\tau = 0$, $\alpha_\delta = 0$

  2. Tangent coordinates: $\alpha_x = 0$, $\alpha_\tau = 1$, $\alpha_\delta = 0$

  3. Laplacian coordinates: $\alpha_x = 0$, $\alpha_\tau = 0$, $\alpha_\delta = 1$

  4. Uniform weighting: $\alpha_x = \alpha_\tau = \alpha_\delta = 1/3$

  5. MCCB: $\alpha_x$, $\alpha_\tau$, $\alpha_\delta$ learned via (18)

Fig. 3: Qualitative performance of MCCB on the LASA handwriting dataset. Demonstration (gray), reproductions (blue), and expected mean position (dashed red) are shown.

We measured the performance of each approach using the following geometric and kinematic metrics: Swept Error Area (SEA) [23], Sum of Squared Errors (SSE), Dynamic Time Warping Distance (DTWD), and Fréchet Distance (FD) [24]. These metrics allow us to evaluate different aspects of each method's performance. The SEA and SSE metrics penalize both spatial and temporal misalignment, and thus evaluate kinematic performance. The DTWD and FD metrics, on the other hand, penalize spatial misalignment while disregarding temporal misalignment, and thus evaluate geometric performance. Further, the SEA, SSE, and DTWD metrics evaluate aggregate performance by summing over or averaging across all the samples of each reproduction. The FD metric, in contrast, computes the shortest cord length required to connect the demonstration and the reproduction in space while allowing time re-parametrization of either trajectory, and thus measures the maximal deviation in space. Note that the SEA metric is restricted to 2-dimensional data, so we report it only for one of our experiments.
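The FD metric, for instance, can be computed with the standard discrete Fréchet dynamic program over couplings of sample indices. A minimal sketch (the discrete variant approximates the continuous metric; the toy trajectories are illustrative):

```python
import numpy as np

def discrete_frechet(P, Q):
    """Discrete Frechet distance between two polylines (arrays of points):
    the smallest over couplings of the largest pointwise distance."""
    n, m = len(P), len(Q)
    d = lambda i, j: np.linalg.norm(P[i] - Q[j])
    D = np.full((n, m), np.inf)
    D[0, 0] = d(0, 0)
    for i in range(1, n):                       # first column
        D[i, 0] = max(D[i - 1, 0], d(i, 0))
    for j in range(1, m):                       # first row
        D[0, j] = max(D[0, j - 1], d(0, j))
    for i in range(1, n):
        for j in range(1, m):
            D[i, j] = max(min(D[i - 1, j], D[i - 1, j - 1], D[i, j - 1]), d(i, j))
    return D[-1, -1]

# Toy trajectories: Q is P shifted vertically by 1, so the distance is 1.
P = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
Q = P + np.array([0.0, 1.0])
```

Unlike the SSE, the same value is obtained regardless of how the samples are distributed in time along each trajectory.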

In all the experiments, we used the position constraints in (12) to enforce both initial and end point constraints uniformly across all the methods being compared. Further, we uniformly set the number of Gaussian basis functions to five across all the coordinates and all the experiments.

IV-A Handwriting Skill

This experiment evaluates MCCB on the publicly available LASA human handwriting library [5], which consists of handwriting motions collected from pen input on a Tablet PC. The library contains a total of 25 handwriting motions, each with 7 demonstrations.

Fig. 4: Box plots, with mean (brown star) and median (red line), illustrate the performance of each approach on the handwriting task.
Fig. 5: Snapshots illustrating the experimental setup for the picking (left), pressing (center), and pushing (right) skills.
Fig. 6: Qualitative performance of MCCB on the picking, pressing, and pushing datasets. Demonstration (gray), reproductions (blue), expected mean position (dashed red), initial (black squares), and target (black stars) are shown.

Fig. 3 shows that MCCB yields reproductions that are qualitatively similar to the demonstrations while satisfying the end-point constraints across all motions. As shown in Fig. 4, quantitative analysis indicates that MCCB (with weighting factors averaged over all 25 skills in the LASA dataset) and three of the four baselines performed comparably with respect to the SEA, FD, SSE, and DTWD metrics, while the Cartesian baseline performed poorly in comparison. This is consistent with the fact that the demonstrations within the LASA dataset exhibit strong similarities in shape.

IV-B Picking Skill

The second experiment evaluates the performance of MCCB in a picking task (Fig. 5). The data consists of six kinesthetic demonstrations, each a 3-dimensional robot end-effector position trajectory recorded as a human guided the robot in picking up two magnets atop two blocks. We enforced two via-point constraints (one at each picking point) in addition to the end-point constraints.

Fig. 7: Box plots, with mean (brown star) and median (red line), illustrate the performance of each approach on the picking dataset.

As shown in Fig. 6(a), MCCB generated reproductions that are qualitatively similar to the demonstrations while satisfying all the position constraints. Quantitative evaluations reveal that learning in the tangent coordinates yielded better reproductions than learning in the Cartesian or Laplacian coordinates (Fig. 7). This was expected, since the demonstrations of this task, much like the LASA dataset, emphasize shape similarity. Further, MCCB yielded the best performance with respect to all three metrics. In fact, uniform weighting yielded poorer results, with respect to all three metrics, than considering only the tangent coordinates. The results of this experiment show that while multi-coordinate methods can yield strong performance, it is critical that the weights be balanced appropriately.

IV-C Pressing Skill

In this experiment, we evaluated MCCB’s ability to learn pressing skills (Fig. 5). The data consists of six kinesthetic demonstrations, each a 3-dimensional robot end-effector position trajectory recorded as a human guided the robot in pressing two cylindrical pegs into their respective holes.

Fig. 8: Box plots, with mean (brown star) and median (red line), illustrate the performance of each approach on the pressing dataset.

As shown in Fig. 6(b), MCCB successfully reproduced the demonstrations. Note that MCCB automatically captured and reproduced the consistencies across the demonstrations in certain regions without any position constraints. Fig. 8 illustrates the performance of MCCB and the baselines with respect to three different metrics. Learning in the Cartesian coordinates resulted in better performance than learning in the tangent or Laplacian coordinates. Quantitative evaluations further demonstrate that MCCB consistently yielded the best performance with respect to all three metrics. The results of this experiment, in light of the results in Section IV-B, suggest that the relative importance of each differential coordinate varies across skills.

IV-D Pushing Skill

The final experiment evaluates the performance of MCCB in a pushing task (Fig. 5). The data consists of six kinesthetic demonstrations, each a 3-dimensional robot end-effector position trajectory recorded as a human guided the robot in sliding closed the lid of a wooden box.

Fig. 9: Box plots, with mean (brown star) and median (red line), illustrate the performance of each approach on the pushing dataset.

As shown in Fig. 6(c), MCCB successfully generated reproductions that are similar to the demonstrations. As evidenced by the quantitative evaluations in Fig. 9, encoding demonstrations in the Laplacian coordinates yielded better performance, with respect to all three metrics, than learning in either of the other two coordinates alone, while MCCB consistently outperformed all the other approaches. Note that learning in the Laplacian coordinates alone also resulted in better performance than uniform weighting of all the coordinates. These results are consistent with the results from the previous sections and indicate that MCCB yields consistently good performance. The results are summarized in Table I.

TABLE I: Summary of the experimental results. Columns span the single-coordinate baselines (Cartesian, Tangent, Laplacian) and the multi-coordinate methods (Uniform W., MCCB); orange check marks denote the most relevant coordinate for each skill and green check marks denote the best performing method.

V Discussion and Conclusion

We introduced MCCB, a learning framework that encodes demonstrations in multiple differential coordinates and automatically balances the costs defined in those coordinates. As shown in Table I, the relative effectiveness of each coordinate system is not consistent across tasks, since any given skill might be better suited for learning in one (or more) coordinate system(s). Furthermore, uniform weighting of the costs in different coordinates does not consistently yield the best results across different skills; indeed, in some cases, uniform weighting yielded poorer performance than using only one coordinate system. MCCB, on the other hand, learned to balance the costs and consistently yielded the best performance. Since the weights are learned directly from the demonstrations, MCCB makes no task-specific assumptions and does not require tedious parameter tuning. Note that although we used GMMs as the base representation in this work, MCCB is agnostic to the statistical model used to encode the demonstrations in each coordinate system, and thus can be combined with other techniques, such as [13, 14, 12, 11, 10, 15, 16]. Furthermore, MCCB can be extended to include additional coordinate systems that capture other trajectory features.


This research is supported in part by NSF NRI 1637758.


  • [1] B. D. Argall, S. Chernova, M. Veloso, and B. Browning, “A survey of robot learning from demonstration,” Robotics and autonomous systems, vol. 57, no. 5, pp. 469–483, 2009.
  • [2] A. Billard, S. Calinon, R. Dillmann, and S. Schaal, “Robot programming by demonstration,” in Springer handbook of robotics.   Springer, 2008, pp. 1371–1394.
  • [3] P. Pastor, H. Hoffmann, T. Asfour, and S. Schaal, “Learning and generalization of motor skills by learning from demonstration,” in IEEE International Conference on Robotics and Automation (ICRA), 2009, pp. 763–768.
  • [4] S. Calinon, F. Guenter, and A. Billard, “On learning, representing, and generalizing a task in a humanoid robot,” IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), vol. 37, no. 2, pp. 286–298, 2007.
  • [5] S. M. Khansari-Zadeh and A. Billard, “Learning stable nonlinear dynamical systems with gaussian mixture models,” IEEE Transactions on Robotics, vol. 27, no. 5, pp. 943–957, 2011.
  • [6] H. C. Ravichandar and A. Dani, “Learning position and orientation dynamics from demonstrations via contraction analysis,” Autonomous Robots, pp. 1–16, 2018.
  • [7] K. Neumann, A. Lemme, and J. J. Steil, “Neural learning of stable dynamical systems based on data-driven lyapunov candidates,” in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013, pp. 1216–1222.
  • [8] S. Levine and V. Koltun, “Learning complex neural network policies with trajectory optimization,” in International Conference on Machine Learning, 2014, pp. 829–837.
  • [9] M. Schneider and W. Ertel, “Robot learning by demonstration with local gaussian process regression,” in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2010, pp. 255–260.
  • [10] M. A. Rana, M. Mukadam, S. R. Ahmadzadeh, S. Chernova, and B. Boots, “Towards robust skill generalization: Unifying learning from demonstration and motion planning,” in Proceedings of the 2017 Conference on Robot Learning (CoRL), 2017.
  • [11] J. Umlauft and S. Hirche, “Learning stable stochastic nonlinear dynamical systems,” in International Conference on Machine Learning, 2017, pp. 3502–3510.
  • [12] S. R. Ahmadzadeh, M. A. Rana, and S. Chernova, “Generalized cylinders for learning, reproduction, generalization, and refinement of robot skills.” in Robotics: Science and Systems, 2017.
  • [13] S. Calinon, D. Bruno, and D. G. Caldwell, “A task-parameterized probabilistic model with minimal intervention control,” in IEEE International Conference on Robotics and Automation (ICRA), 2014, pp. 3339–3344.
  • [14] A. Paraschos, C. Daniel, J. R. Peters, and G. Neumann, “Probabilistic movement primitives,” in Advances in neural information processing systems, 2013, pp. 2616–2624.
  • [15] T. Osa, A. M. G. Esfahani, R. Stolkin, R. Lioutikov, J. Peters, and G. Neumann, “Guiding trajectory optimization by demonstrated distributions,” IEEE Robotics and Automation Letters, vol. 2, no. 2, pp. 819–826, 2017.
  • [16] T. Nierhoff, S. Hirche, and Y. Nakamura, “Spatial adaption of robot trajectories based on laplacian trajectory editing,” Autonomous Robots, vol. 40, no. 1, pp. 159–173, 2016.
  • [17] N. Ratliff, M. Zucker, J. A. Bagnell, and S. Srinivasa, “CHOMP: Gradient optimization techniques for efficient motion planning,” in IEEE International Conference on Robotics and Automation (ICRA).   IEEE, 2009, pp. 489–494.
  • [18] J. Schulman, Y. Duan, J. Ho, A. Lee, I. Awwal, H. Bradlow, J. Pan, S. Patil, K. Goldberg, and P. Abbeel, “Motion planning with sequential convex optimization and convex collision checking,” The International Journal of Robotics Research, vol. 33, no. 9, pp. 1251–1270, 2014.
  • [19] A. D. Dragan, K. Muelling, J. A. Bagnell, and S. S. Srinivasa, “Movement primitives via optimization,” in IEEE International Conference on Robotics and Automation (ICRA).   IEEE, 2015, pp. 2339–2346.
  • [20] Y. Lipman, O. Sorkine, D. Cohen-Or, D. Levin, C. Rossi, and H.-P. Seidel, “Differential coordinates for interactive mesh editing,” in Shape Modeling Applications.   IEEE, 2004, pp. 181–190.
  • [21] B. Lévy, “Laplace-Beltrami eigenfunctions towards an algorithm that ‘understands’ geometry,” in IEEE International Conference on Shape Modeling and Applications, 2006, pp. 13–13.
  • [22] D. A. Cohn, Z. Ghahramani, and M. I. Jordan, “Active learning with statistical models,” Journal of Artificial Intelligence Research, vol. 4, pp. 129–145, 1996.
  • [23] S. M. Khansari-Zadeh and A. Billard, “Learning control lyapunov function to ensure stability of dynamical system-based robot reaching motions,” Robotics and Autonomous Systems, vol. 62, no. 6, pp. 752–765, 2014.
  • [24] M. M. Fréchet, “Sur quelques points du calcul fonctionnel,” Rendiconti del Circolo Matematico di Palermo (1884-1940), vol. 22, no. 1, pp. 1–72, 1906.