I Introduction
In many complicated tasks (e.g., robot table tennis [2] and bimanual manipulation [3]), it is nontrivial to manually define proper trajectories for robots beforehand, hence imitation learning is suggested in order to facilitate the transfer of human skills to robots [4]. The basic idea of imitation learning is to model consistent or important motion patterns that underlie human skills and, subsequently, employ these patterns in new situations. A myriad of results on imitation learning have been reported in the past few years, such as dynamic movement primitives (DMP) [5], probabilistic movement primitives (ProMP) [6]
, taskparameterized Gaussian mixture model (TPGMM)
[7] and kernelized movement primitives (KMP) [8].While the aforementioned skill learning approaches have been proven effective in robot trajectory generation [9, 10] (i.e., Cartesian and joint positions), learning of orientation in task space still imposes great challenges. Unlike position operations in Euclidean space, orientation is accompanied by additional constraints, e.g., the unit norm of the quaternion representation or the orthogonal constraint of rotation matrices. In many previous work, quaternion trajectories are learned and adapted without considering the unit norm constraint (e.g., orientation TPGMM [3] and DMP [11]), leading to improper quaternions and hence requiring an additional renormalization.
Instead of learning quaternions in Euclidean space, a few approaches that comply with orientation constraints have been proposed. One recent approach is built on DMP [12, 13], where quaternions were used to represent orientation and a reformulation of DMP was developed to ensure proper quaternions over the course of orientation adaptation. However, [12, 13] can only adapt quaternions towards a desired target with zero angular velocity as a consequence of the springdamper dynamics inherited from the original DMP.
Another solution of learning orientation was proposed in [14], where GMM was employed to model the distribution of quaternion displacements so as to avoid the quaternion constraint. However, this approach only focuses on orientation reproduction without addressing the adaptation issue. In contrast to [14] that learns quaternion displacements, the Riemannian topology of the manifold was exploited in [15] to probabilistically encode and reproduce distributions of quaternions. Moreover, [15] provides an extension to taskparameterized movements, which allows for adapting orientation tasks to different initial and final orientations. However, adaptation to orientation viapoints and angular velocities is not provided.
In addition to the abovementioned issues, learning orientations associated with highdimensional inputs is important. For example, in a humanrobot collaboration scenario, the robot endeffector orientation is often required to react promptly and properly according to external inputs (e.g., the user hand poses). More specifically, the robot might need to adapt its orientation in accordance with the dynamic environment. The results [11, 12, 13] are built on timedriven DMP, and hence it is nonstraightforward to extend these works to deal with highdimensional inputs. In contrast, due to the employment of GMM, learning orientations with multiple inputs are feasible in [3, 14, 15]. However, these approaches fail to tackle the problem of adaptations comprising viapoint and angularvelocity requirements.
Probabilistic  Unit norm  Viaquaternion  Viaangular velocity  Endquaternion  Endangular velocity  Jerk constraints  Multiple inputs  

Silvério et al. [3]  TextRenderingMode=FillStroke, LineWidth=.5pt, ✓        TextRenderingMode=FillStroke, LineWidth=.5pt, ✓      TextRenderingMode=FillStroke, LineWidth=.5pt, ✓ 
Pastor et al. [11]          TextRenderingMode=FillStroke, LineWidth=.5pt, ✓       
Ude et al. [12]    TextRenderingMode=FillStroke, LineWidth=.5pt, ✓      TextRenderingMode=FillStroke, LineWidth=.5pt, ✓       
Abudakka et al. [13]    TextRenderingMode=FillStroke, LineWidth=.5pt, ✓      TextRenderingMode=FillStroke, LineWidth=.5pt, ✓       
Kim et al. [14]  TextRenderingMode=FillStroke, LineWidth=.5pt, ✓  TextRenderingMode=FillStroke, LineWidth=.5pt, ✓            TextRenderingMode=FillStroke, LineWidth=.5pt, ✓ 
Zeestraten et al. [15]  TextRenderingMode=FillStroke, LineWidth=.5pt, ✓  TextRenderingMode=FillStroke, LineWidth=.5pt, ✓      TextRenderingMode=FillStroke, LineWidth=.5pt, ✓      TextRenderingMode=FillStroke, LineWidth=.5pt, ✓ 
Our approach  TextRenderingMode=FillStroke, LineWidth=.5pt, ✓  TextRenderingMode=FillStroke, LineWidth=.5pt, ✓  TextRenderingMode=FillStroke, LineWidth=.5pt, ✓  TextRenderingMode=FillStroke, LineWidth=.5pt, ✓  TextRenderingMode=FillStroke, LineWidth=.5pt, ✓  TextRenderingMode=FillStroke, LineWidth=.5pt, ✓  TextRenderingMode=FillStroke, LineWidth=.5pt, ✓  TextRenderingMode=FillStroke, LineWidth=.5pt, ✓ 

* In these works, primitives end with zero angular velocity, i.e., one can not set a desired nonzero velocity.
It is worthy pointing out that many imitation learning approaches focus on mimicking human demonstrations, whereas constrained skill learning is often overlooked. As discussed in [16, 17], trajectory smoothness (e.g., acceleration and jerk) will influence robot performance, particularly in timecontact systems (e.g., striking movement in robot table tennis). Thus, it is desirable to incorporate smoothness constraints into the process of learning orientations.
In summary, if we consider the problem of adapting quaternions and angular velocities to pass through arbitrary desired points (e.g., viapoint and endpoint) while taking into account highdimensional inputs and smoothness constraints, no previous work in the scope of imitation learning provides an allencompassing solution.
In this paper, we aim at providing an analytical solution that is capable of

learning multiple quaternion trajectories,

allowing for orientation adaptations towards arbitrary desired points that consist of both quaternions and angular velocities,

coping with orientation learning and adaptations associated with highdimensional inputs,

accounting for smoothness constraints.
For the purpose of clear comparison, the main contributions of the stateoftheart approaches and our approach are summarized in Table I.
This paper is structured as follows. We first illustrate the probabilistic learning of multiple quaternion trajectories and derive our main results in Section II. Subsequently, we extend the obtained results to quaternion adaptations in Section III, as well as quaternion learning and adaptation with jerk constraints in Section IV. After that, we take a typical humanrobot collaboration case as an example to show how our approach can be applied to the learning of quaternions along with multiple inputs in Section V. We evaluate our method through several simulated examples (including discrete and rhythmic quaternion profiles) and real experiments (a painting task with time input on Barrett WAM robot and a handover task with multiple inputs on KUKA robot) in Section VI. Finally, our work is concluded in Section VII.
Ii Probabilistic Learning of Quaternion Trajectories
, the probability distribution of multiple demonstrations often encapsulates important motion features and further facilitates the design of optimal controllers
[19, 20]. Nonetheless, the direct probabilistic modeling of quaternion trajectories is intractable as a result of the unit norm constraint. Similarly to [12, 14, 15], we propose to transform quaternions into Euclidean space, which hence enables the probabilistic modeling of transformed trajectories (Section IIA). Then, we exploit the distribution of transformed trajectories using a kernelized approach, whose predictions allow for the retrieval of proper quaternions (Section IIB). Since many notations will be introduced in the rest of this paper, we summarize the key ones in Table II.Iia Probabilistic Modeling of Quaternion Trajectories
A straightforward solution of modeling quaternions is to transform them into Euclidean space [12, 14, 15]. For the sake of clarity, let us define quaternions and , where , and , . Besides, we write as the conjugation of and, as the quaternion product^{1}^{1}1Quaternion product is defined as: . of and . The function
that can be used to determine the difference vector between
and is defined as [12](1) 
where denotes norm. By using this function, demonstrated quaternions can be projected into Euclidean space.
,  quaternion and its conjugation  
auxiliary quaternion  
transformed state of quaternion  
angular velocity  
Cartesian position  
number of Gaussian components in GMM  
parameters of –th Gaussian component in GMM, see (3)  
unknown parametric vector  
,  dimensional basis function vector and its corresponding expanded matrix, see (9)  
,  expanded matrices, see (26) and (28)  
kernel function  
demonstrations in terms of time and quaternion, where each demonstration has datapoints  
transformed data obtained from , where = , see (2)  
compact form of , where  
probabilistic reference trajectory extracted from  
expanded matrices/vectors defined on , see (13)  
expanded kernel matrix, see (15) or (30)  
desired quaternion states  
transformed states obtained from , see (18) and (22)  
compact form of , where  
additional reference trajectory to indicate the transformed desired points  
extended reference trajectory, see (25)  
demonstration database with highdimensional input and output  
transformed data obtained from with  
probabilistic reference trajectory extracted from  
,  basis function vector with highdimensional inputs and its corresponding expanded matrix, see (34)  
expanded kernel matrix, see (36)  
desired points associated with highdimensional inputs  
transformed desired data from  
additional reference trajectory for highdimensional inputs 
Let us assume that we can access a set of demonstrations with being the time length and as the number of demonstrations, where denotes a quaternion at the th timestep from the th demonstration. Note that two quaternions are needed in (1) in order to carry out the difference operation. So, we introduce an auxiliary quaternion ^{2}^{2}2 could be set as the initial state of demonstrations or simply as ., which is subsequently used for transforming demonstrated quaternions into Euclidean space, yielding new trajectories as with
(2) 
and being the derivative of . Please note that the new trajectories can be utilized to recover quaternion trajectories (which shall be seen in Section IIB). For the purpose of simplicity, we denote and accordingly becomes .
From now on, we can apply probabilistic modeling approaches to new trajectories . To take GMM as an example [7], the joint probability distribution can be estimated through expectationmaximization, leading to
(3) 
where
denotes prior probability of the
th Gaussian component whose mean and covariance are, respectively, and ^{3}^{3}3In order to keep notations consistent, we still use vector notations and to represent scalars. . Furthermore, Gaussian mixture regression (GMR) [7, 21]is employed to retrieve the conditional probability distribution, i.e.,
(4) 
(5) 
(6) 
(7) 
Note that the result in (4) can be approximated by a single Gaussian, i.e.,
(8) 
with and Please refer to [7, 8, 21] for more details. Therefore, for a given time sequence that spans the input space, a probabilistic reference trajectory can be obtained. Here, we can view as a representative of since it encapsulates the distribution of trajectories in in terms of mean and covariance. Therefore, we exploit instead of the original demonstrations in the next subsection.
IiB Learning Quaternions Using A Kernelized Approach
Following the treatment in KMP [8], we first write in a parameterized way^{4}^{4}4Similar parametric strategies were used in DMP [5] and ProMP [6]., i.e.,
(9) 
where represents a dimensional basis function vector. Note that the parameter vector is unknown. In order to learn the probabilistic reference trajectories
, we consider the problem of maximizing the posterior probability
(10) 
It can be proved that the optimal solution to (10) can be computed as
(11)  
where the objective to be minimized can be viewed as the sum of covarianceweighted squared errors^{5}^{5}5
Similar varianceweighted scheme has also been exploited in trajectoryGMM
[7], motion similarity estimation [18] and optimal control [20].. Note that a regularized term with is introduced in (11) so as to mitigate the overfitting.Similarly to the derivations of kernel ridge regression
[22, 23, 24], the optimal solution of (11) can be computed. Thus, for an inquiry point , its corresponding output can be predicted as(12) 
where
(13)  
Furthermore, (12) can be kernelized as
(14) 
with and , , where denotes the blockcomponent at the th column of , denotes the blockcomponent at the th row and the th column of , is defined by
(15) 
with^{6}^{6}6Note that is approximated by in order to facilitate the following kernelized operations.
where is a small constant and represents the kernel function.
Let us recall that quaternion trajectories have been transformed into Euclidean space by using (1) (as explained in Section IIA). Thus, once we have determined at a query point via (14), we can use its component to recover the corresponding quaternion . Specifically, is determined by
(16) 
where the function is [12, 13]
(17) 
An overview of learning quaternions is depicted in the top row of Fig. 1. So far, the developed approach is limited for orientation reproduction, we will show orientation adaptation in the next section, where both quaternion and angularvelocity profiles can be modulated towards passing through any desired points (e.g., via/end points).
Iii Adaptation of Quaternion Trajectories
Similarly to trajectory adaptation in terms of Cartesian and joint positions (and/or velocities) [5, 6, 7, 8], the capability of adapting orientation in Cartesian space is also important for robots in many cases (e.g., bimanual operations and pouring tasks). To take a pouring task as an example, the orientation of the bottle should be adapted according to the height of the cup. In this section, we consider the problem of adapting orientation trajectory in terms of desired quaternions and angular velocities. To do so, we propose to transform desired orientation states into Euclidean space (Section IIIA), and subsequently we reformulate the kernelized learning approach to incorporate these transformed desired points (Section IIIB). Finally, the adapted trajectory in Euclidean space is used to retrieve its corresponding adapted quaternion trajectory. An illustration of adapting quaternions is depicted in the bottom row of Fig. 1.
Iiia Transform Desired Quaternion States
Let us denote desired quaternion states as , where and represent desired quaternion and angular velocity at time , respectively. Since both the modeling operation (3)(8) and the prediction operation (14) are carried out in Euclidean space, we need to transform into Euclidean space so as to facilitate adaptations of quaternion trajectories. Similarly to (2), the desired quaternion can be transformed as
(18) 
In order to incorporate the desired angular velocity , we resort to the relationship between derivatives of quaternions and angular velocities, i.e., [12, 13]
(19) 
where denotes a small constant. By using (19), we can compute the desired quaternion at time as
(20) 
which is subsequently transformed into Euclidean space via (2), resulting in
(21)  
Thus, we can approximate the derivative of as
(22)  
Now, can be transformed into via (18) and (22), which can be further rewritten in a compact way as with . In addition, we can design a covariance for each desired point to control the precision of adaptations. Namely, a high or low precision can be enforced by a small or large covariance, respectively. Thus, we can obtain an additional probabilistic reference trajectory to indicate the transformed desired quaternion states.
IiiB Adaptation of Quaternion Trajectories
According to the adaptation strategy in KMP [8], we reformulate the objective in (11) so that the additional reference trajectory is incorporated, leading to a new objective
(23)  
whose compact representation is
(24)  
with
(25) 
It can be observed that the new objective (24) shares the same form with (11), except that the reference trajectory in (24) is longer than that in (11), thus the solution of (24) can be determined in a similar way. Finally, can be computed via (14) and, subsequently, is recovered from (16) by using . In this case, is capable of passing through the desired quaternions with desired angular velocities at time . The entire approach of quaternion adaptations is summarized in Algorithm 1.
Iv Adaptation of Quaternions with Jerk Constraints
It is well known that robot trajectories should be smooth in order to facilitate the design of controllers as well as the execution of motor commands [16, 17]. For instance, in a table tennis scenario that needs fast striking motions, extremely high accelerations or jerks may degrade the final striking performance, given the physical limits of motors. It is possible to formulate this constraint as an optimization problem and search for the optimal trajectory via an iterative scheme, as done in [16]. In this section, we consider the problem of learning and adapting quaternion trajectories while taking into account jerk constraints. Specifically, we aim to provide an analytical solution to the issue. We explain the learning of multiple quaternions with jerk constraints in Section IVA. Subsequently, we show quaternion adaptations with jerk constraints in Section IVB.
Iva Learning Quaternions with Jerk Constraints
By observing (19), we can find that is proportional to . Moreover, we have . Therefore, when we revisit the transformation described in (2), we can come to the relationship . To this end, we have , implying that the quaternion jerk constraint can be approximated by the acceleration constraint in .
We have formulated imitation learning in Euclidean space as an optimization problem in terms of (11), which in fact can be used to incorporate acceleration constraints directly. Following the parameterization form in (9), we have
(26) 
Thus, the problem of learning orientations with jerk constraints becomes
(27)  
where acts as a tradeoff regulator between orientation learning and jerk minimization^{7}^{7}7Note that the quaternion jerk constraints have been approximated by the acceleration constraints in . .
Let us rearrange (27) into a compact form, resulting in
(28)  
which is equivalent to
(29)  
It can be observed that (29) shares the same formula as (11), and hence we can follow (12)–(17) to derive a kernelized solution for quaternion reproduction with jerk constraints. It is worthy pointing out that comprises the secondorder derivative of , thus a new kernel
(30) 
is required instead of (15). Please see the detailed derivations of kernelizing (30) in Appendix A.
IvB Adapting Quaternions with Jerk Constraints
We now move one step further and consider the problem of quaternion adaptations with jerk constraints. Specifically, we propose to impose jerk constraints into the adaptation problem studied in Section III. By analogy with (11) and (27), we reformulate (23) to include the jerk constraints, yielding a new minimization problem as
(31)  
Comments
There are no comments yet.