In many complicated tasks (e.g., robot table tennis and bimanual manipulation), it is non-trivial to manually define proper robot trajectories beforehand; hence, imitation learning has been suggested to facilitate the transfer of human skills to robots. The basic idea of imitation learning is to model the consistent or important motion patterns that underlie human skills and, subsequently, to employ these patterns in new situations. A myriad of results on imitation learning have been reported in the past few years, such as dynamic movement primitives (DMP), probabilistic movement primitives (ProMP), the task-parameterized Gaussian mixture model (TP-GMM), and kernelized movement primitives (KMP).
While the aforementioned skill learning approaches have proven effective in robot trajectory generation [9, 10] (i.e., Cartesian and joint positions), learning orientation in task space still poses great challenges. Unlike position operations in Euclidean space, orientation is subject to additional constraints, e.g., the unit norm of the quaternion representation or the orthogonality constraint of rotation matrices. In many previous works, quaternion trajectories are learned and adapted without considering the unit-norm constraint (e.g., orientation TP-GMM and DMP), leading to improper quaternions and hence requiring an additional renormalization.
Instead of learning quaternions in Euclidean space, a few approaches that comply with orientation constraints have been proposed. One recent approach is built on DMP [12, 13], where quaternions were used to represent orientation and a reformulation of DMP was developed to ensure proper quaternions over the course of orientation adaptation. However, [12, 13] can only adapt quaternions towards a desired target with zero angular velocity as a consequence of the spring-damper dynamics inherited from the original DMP.
Another solution for learning orientation was proposed in , where GMM was employed to model the distribution of quaternion displacements so as to avoid the quaternion constraint. However, this approach only focuses on orientation reproduction without addressing the adaptation issue. In contrast to , which learns quaternion displacements, the Riemannian structure of the quaternion manifold was exploited in  to probabilistically encode and reproduce distributions of quaternions. Moreover,  provides an extension to task-parameterized movements, which allows orientation tasks to be adapted to different initial and final orientations. However, adaptation to orientation via-points and angular velocities is not provided.
In addition to the above-mentioned issues, learning orientations associated with high-dimensional inputs is important. For example, in a human-robot collaboration scenario, the robot end-effector orientation is often required to react promptly and properly to external inputs (e.g., the user's hand poses). More specifically, the robot might need to adapt its orientation in accordance with the dynamic environment. The results [11, 12, 13] are built on time-driven DMP, and hence it is not straightforward to extend these works to deal with high-dimensional inputs. In contrast, due to the employment of GMM, learning orientations with multiple inputs is feasible in [3, 14, 15]. However, these approaches fail to tackle the problem of adaptations comprising via-point and angular-velocity requirements.
| | Probabilistic | Unit norm | Via-quaternion | Via-angular velocity | End-quaternion | End-angular velocity | Jerk constraints | Multiple inputs |
| Silvério et al. | ✓ | - | - | - | ✓ | - | - | ✓ |
| Pastor et al. | - | - | - | - | ✓ | - | - | - |
| Ude et al. | - | ✓ | - | - | ✓ | - | - | - |
| Abu-Dakka et al. | - | ✓ | - | - | ✓ | - | - | - |
| Kim et al. | ✓ | ✓ | - | - | - | - | - | ✓ |
| Zeestraten et al. | ✓ | ✓ | - | - | ✓ | - | - | ✓ |
| Our approach | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
* In these works, primitives end with zero angular velocity, i.e., one cannot set a desired non-zero final velocity.
It is worth pointing out that many imitation learning approaches focus on mimicking human demonstrations, whereas constrained skill learning is often overlooked. As discussed in [16, 17], trajectory smoothness (e.g., in terms of acceleration and jerk) influences robot performance, particularly in systems with timed contacts (e.g., striking movements in robot table tennis). Thus, it is desirable to incorporate smoothness constraints into the process of learning orientations.
In summary, if we consider the problem of adapting quaternions and angular velocities to pass through arbitrary desired points (e.g., via-point and end-point) while taking into account high-dimensional inputs and smoothness constraints, no previous work in the scope of imitation learning provides an all-encompassing solution.
In this paper, we aim at providing an analytical solution that is capable of
(i) learning multiple quaternion trajectories,
(ii) allowing for orientation adaptation towards arbitrary desired points that consist of both quaternions and angular velocities,
(iii) coping with orientation learning and adaptation associated with high-dimensional inputs, and
(iv) accounting for smoothness constraints.
For the purpose of clear comparison, the main contributions of the state-of-the-art approaches and our approach are summarized in Table I.
This paper is structured as follows. We first illustrate the probabilistic learning of multiple quaternion trajectories and derive our main results in Section II. Subsequently, we extend the obtained results to quaternion adaptation in Section III, as well as to quaternion learning and adaptation with jerk constraints in Section IV. After that, we take a typical human-robot collaboration case as an example to show how our approach can be applied to the learning of quaternions with multiple inputs in Section V. We evaluate our method through several simulated examples (including discrete and rhythmic quaternion profiles) and real experiments (a painting task with time input on a Barrett WAM robot and a handover task with multiple inputs on a KUKA robot) in Section VI. Finally, our work is concluded in Section VII.
II Probabilistic Learning of Quaternion Trajectories
As shown in the imitation learning literature, the probability distribution of multiple demonstrations often encapsulates important motion features and further facilitates the design of optimal controllers [19, 20]. Nonetheless, direct probabilistic modeling of quaternion trajectories is intractable as a result of the unit-norm constraint. Similarly to [12, 14, 15], we propose to transform quaternions into Euclidean space, which enables the probabilistic modeling of the transformed trajectories (Section II-A). Then, we exploit the distribution of the transformed trajectories using a kernelized approach, whose predictions allow for the retrieval of proper quaternions (Section II-B). Since many notations will be introduced in the rest of this paper, we summarize the key ones in Table II.
II-A Probabilistic Modeling of Quaternion Trajectories
A straightforward solution for modeling quaternions is to transform them into Euclidean space [12, 14, 15]. For the sake of clarity, let us consider quaternions composed of a scalar part and a three-dimensional vector part, and write the conjugate of a quaternion as well as the quaternion product of two quaternions. The function (1) that determines the difference vector between two quaternions is defined through the logarithmic map, where the norm of the quaternion's vector part is used. By using this function, demonstrated quaternions can be projected into Euclidean space.
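To make this projection concrete, a minimal sketch of the quaternion operations is given below, assuming the Hamilton product and the standard logarithmic map; the function names are illustrative, and the paper's exact definition in (1) may differ by convention (e.g., a factor of two):

```python
import numpy as np

def quat_conj(q):
    """Conjugate of a quaternion q = (v, u): (v, -u)."""
    return np.array([q[0], -q[1], -q[2], -q[3]])

def quat_prod(q1, q2):
    """Quaternion product (Hamilton convention)."""
    v1, u1 = q1[0], np.asarray(q1[1:], dtype=float)
    v2, u2 = q2[0], np.asarray(q2[1:], dtype=float)
    return np.concatenate(([v1 * v2 - u1 @ u2],
                           v1 * u2 + v2 * u1 + np.cross(u1, u2)))

def quat_log(q):
    """Logarithmic map of a unit quaternion to R^3; returns the zero
    vector when the vector part vanishes."""
    v, u = q[0], np.asarray(q[1:], dtype=float)
    norm_u = np.linalg.norm(u)
    if norm_u < 1e-12:
        return np.zeros(3)
    return np.arccos(np.clip(v, -1.0, 1.0)) * u / norm_u

def quat_diff(q1, q2):
    """Difference vector between q1 and q2: log(q1 * conj(q2))."""
    return quat_log(quat_prod(q1, quat_conj(q2)))
```

With this convention, the difference between a quaternion and the identity quaternion recovers half the rotation angle times the rotation axis.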
|Notation|Description|
| |quaternion and its conjugation|
| |transformed state of a quaternion|
| |number of Gaussian components in the GMM|
| |parameters of each Gaussian component in the GMM, see (3)|
| |unknown parameter vector|
| |basis function vector and its corresponding expanded matrix, see (9)|
| |expanded matrices, see (26) and (28)|
| |demonstrations in terms of time and quaternion, where each demonstration comprises a sequence of datapoints|
| |transformed data obtained from the demonstrations, see (2)|
| |compact form of the transformed data|
| |probabilistic reference trajectory extracted from the transformed data|
| |expanded matrices/vectors defined on the reference trajectory, see (13)|
| |expanded kernel matrix, see (15) or (30)|
| |desired quaternion states|
| |transformed states obtained from the desired quaternion states, see (18) and (22)|
| |compact form of the transformed desired states|
| |additional reference trajectory indicating the transformed desired points|
| |extended reference trajectory, see (25)|
| |demonstration database with high-dimensional inputs and outputs|
| |transformed data obtained from the high-dimensional database|
| |probabilistic reference trajectory extracted from the transformed high-dimensional data|
| |basis function vector with high-dimensional inputs and its corresponding expanded matrix, see (34)|
| |expanded kernel matrix, see (36)|
| |desired points associated with high-dimensional inputs|
| |transformed desired data for high-dimensional inputs|
| |additional reference trajectory for high-dimensional inputs|
Let us assume that we have access to a set of demonstrations, where each demonstration comprises a sequence of quaternions over a given time length. Note that two quaternions are needed in (1) in order to carry out the difference operation, so we introduce an auxiliary quaternion (which could be set as the initial state of the demonstrations or simply as the identity quaternion), which is subsequently used for transforming the demonstrated quaternions into Euclidean space, yielding new trajectories comprising the transformed states in (2) and their time derivatives. Please note that the new trajectories can be used to recover quaternion trajectories (as shall be seen in Section II-B). For the purpose of simplicity, we use a compact notation for the transformed trajectories in the sequel.
From now on, we can apply probabilistic modeling approaches to the new trajectories. Taking GMM as an example, the joint probability distribution can be estimated through expectation-maximization, leading to (3), where each Gaussian component is described by a prior probability, a mean, and a covariance (in order to keep notations consistent, we still use vector notation to represent scalars). Furthermore, Gaussian mixture regression (GMR) [7, 21] is employed to retrieve the conditional probability distribution, i.e., (4). Note that the result in (4) can be approximated by a single Gaussian; please refer to [7, 8, 21] for more details. Therefore, for a given time sequence that spans the input space, a probabilistic reference trajectory can be obtained. This reference trajectory can be viewed as a representative of the demonstrations, since it encapsulates their distribution in terms of means and covariances. Therefore, we exploit it instead of the original demonstrations in the next subsection.
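For concreteness, the GMR conditioning step with the single-Gaussian approximation might be sketched as follows (illustrative names; the GMM parameters are assumed to be already fitted, e.g., by expectation-maximization):

```python
import numpy as np

def gmr(priors, means, covs, t, in_dim=1):
    """Gaussian mixture regression sketch: condition a fitted GMM on an
    input t and approximate the conditional by a single Gaussian via
    moment matching. Each component's mean/covariance is partitioned
    into input (first in_dim entries) and output blocks."""
    t = np.atleast_1d(np.asarray(t, dtype=float))
    K = len(priors)
    h = np.zeros(K)
    cond_means, cond_covs = [], []
    for k in range(K):
        mu_t, mu_z = means[k][:in_dim], means[k][in_dim:]
        S_tt = covs[k][:in_dim, :in_dim]
        S_tz = covs[k][:in_dim, in_dim:]
        S_zt = covs[k][in_dim:, :in_dim]
        S_zz = covs[k][in_dim:, in_dim:]
        diff = t - mu_t
        inv_S_tt = np.linalg.inv(S_tt)
        # responsibility of component k for the input t
        h[k] = priors[k] * np.exp(-0.5 * diff @ inv_S_tt @ diff) / \
            np.sqrt((2 * np.pi) ** in_dim * np.linalg.det(S_tt))
        cond_means.append(mu_z + S_zt @ inv_S_tt @ diff)
        cond_covs.append(S_zz - S_zt @ inv_S_tt @ S_tz)
    h /= h.sum()
    # single-Gaussian (moment-matched) approximation of the mixture
    mu = sum(h[k] * cond_means[k] for k in range(K))
    cov = sum(h[k] * (cond_covs[k]
                      + np.outer(cond_means[k] - mu, cond_means[k] - mu))
              for k in range(K))
    return mu, cov
```

Evaluating this at every point of a time sequence yields the mean/covariance pairs that form the probabilistic reference trajectory.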
II-B Learning Quaternions Using a Kernelized Approach
We parameterize the transformed trajectory with a basis function vector and an unknown parameter vector, as in (9). In order to learn the probabilistic reference trajectory, we consider the problem of maximizing the posterior probability in (10).
It can be proved that the optimal solution to (10) can be computed by minimizing (11), where the objective can be viewed as a sum of covariance-weighted squared errors (a similar variance-weighted scheme has also been exploited in trajectory-GMM, motion similarity estimation, and optimal control). Note that a regularization term is introduced in (11) so as to mitigate over-fitting.
Similarly to the derivations of kernel ridge regression [22, 23, 24], the optimal solution of (11) can be computed in closed form. Thus, for a query point, its corresponding output can be predicted as in (12). Furthermore, (12) can be kernelized as in (14), where the expanded kernel matrix (15) is built block-wise from the kernel function evaluated at pairs of reference inputs (the inner product of basis functions is approximated by a kernel function in order to facilitate the kernelized operations), and a small regularization constant is added.
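A sketch of this kernelized prediction is given below, assuming an RBF kernel and the covariance-weighted ridge form discussed above; all names and the block layout are illustrative, not the paper's exact notation:

```python
import numpy as np

def kmp_predict(S, Mu, Sigma, s_query, lam=1.0, ell=0.01):
    """Kernelized prediction in the spirit of (12)-(15): reference means
    are blended through a block kernel matrix whose diagonal blocks are
    regularized by the (scaled) reference covariances.

    S:     (N,) inputs of the probabilistic reference trajectory
    Mu:    (N, D) reference means
    Sigma: (N, D, D) reference covariances
    """
    N, D = Mu.shape
    kern = lambda a, b: np.exp(-(a - b) ** 2 / ell)  # assumed RBF kernel
    K = np.zeros((N * D, N * D))
    for i in range(N):
        for j in range(N):
            K[i*D:(i+1)*D, j*D:(j+1)*D] = kern(S[i], S[j]) * np.eye(D)
        # covariance-weighted regularization on the block diagonal
        K[i*D:(i+1)*D, i*D:(i+1)*D] += lam * Sigma[i]
    k_star = np.hstack([kern(s_query, S[j]) * np.eye(D) for j in range(N)])
    return k_star @ np.linalg.solve(K, Mu.reshape(-1))
```

A reference point with small covariance contributes a small regularization block, so the prediction at its input stays close to its mean.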
Let us recall that quaternion trajectories have been transformed into Euclidean space by using (1) (as explained in Section II-A). Thus, once we have determined the output at a query point via (14), we can use its quaternion-related component to recover the corresponding quaternion through the inverse transformation in (16).
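The recovery of a quaternion from a predicted Euclidean state can be sketched with the standard exponential map; this is an assumed convention, and the paper's formula (16) may differ (e.g., by a factor of two). The function names and the auxiliary-quaternion argument are illustrative:

```python
import numpy as np

def quat_exp(zeta):
    """Exponential map from R^3 to a unit quaternion (inverse of the
    logarithmic map)."""
    n = np.linalg.norm(zeta)
    if n < 1e-12:
        return np.array([1.0, 0.0, 0.0, 0.0])
    return np.concatenate(([np.cos(n)], np.sin(n) * np.asarray(zeta) / n))

def quat_prod(q1, q2):
    """Quaternion product (Hamilton convention)."""
    v1, u1 = q1[0], np.asarray(q1[1:], dtype=float)
    v2, u2 = q2[0], np.asarray(q2[1:], dtype=float)
    return np.concatenate(([v1 * v2 - u1 @ u2],
                           v1 * u2 + v2 * u1 + np.cross(u1, u2)))

def recover_quat(zeta, q_aux):
    """Map a predicted Euclidean state back to a unit quaternion by
    undoing the auxiliary-quaternion transformation."""
    return quat_prod(quat_exp(zeta), q_aux)
```

By construction, the recovered quaternion has unit norm, so no renormalization step is needed.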
An overview of learning quaternions is depicted in the top row of Fig. 1. So far, the developed approach is limited to orientation reproduction; we will address orientation adaptation in the next section, where both quaternion and angular-velocity profiles can be modulated so as to pass through arbitrary desired points (e.g., via-points and end-points).
III Adaptation of Quaternion Trajectories
Similarly to trajectory adaptation in terms of Cartesian and joint positions (and/or velocities) [5, 6, 7, 8], the capability of adapting orientation in Cartesian space is also important for robots in many cases (e.g., bimanual operations and pouring tasks). Taking a pouring task as an example, the orientation of the bottle should be adapted according to the height of the cup. In this section, we consider the problem of adapting orientation trajectories in terms of desired quaternions and angular velocities. To do so, we propose to transform the desired orientation states into Euclidean space (Section III-A), and subsequently we reformulate the kernelized learning approach to incorporate these transformed desired points (Section III-B). Finally, the adapted trajectory in Euclidean space is used to retrieve its corresponding adapted quaternion trajectory. An illustration of adapting quaternions is depicted in the bottom row of Fig. 1.
III-A Transform Desired Quaternion States
Let us denote the desired quaternion states as pairs of a desired quaternion and a desired angular velocity at given time instants. Since both the modeling operations (3)–(8) and the prediction operation (14) are carried out in Euclidean space, we need to transform the desired states into Euclidean space so as to facilitate the adaptation of quaternion trajectories. Similarly to (2), the desired quaternion can be transformed as in (18), where a small constant is involved. By using (19), we can compute the desired quaternion at a neighboring time instant, as in (20), which is subsequently transformed into Euclidean space via (2), resulting in (21). Thus, we can approximate the derivative of the transformed desired state as in (22).
Now, the desired states can be transformed via (18) and (22) and further rewritten in a compact form. In addition, we can design a covariance for each desired point to control the precision of adaptation; namely, a high or low precision can be enforced by a small or large covariance, respectively. Thus, we obtain an additional probabilistic reference trajectory that encodes the transformed desired quaternion states.
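The construction of the additional reference trajectory can be sketched as follows; the function name, the array layout, and the `precision` parameter are illustrative assumptions:

```python
import numpy as np

def extend_reference(S, Mu, Sigma, S_des, Mu_des, precision=1e6):
    """Append transformed desired points to the probabilistic reference
    trajectory. Each desired point receives a small covariance (high
    precision), so that a prediction scheme weighting errors by inverse
    covariances is pulled tightly through it."""
    D = Mu.shape[1]
    Sigma_des = np.stack([np.eye(D) / precision for _ in S_des])
    return (np.concatenate([S, S_des]),
            np.vstack([Mu, Mu_des]),
            np.concatenate([Sigma, Sigma_des], axis=0))
```

Lowering `precision` (i.e., enlarging the covariance) relaxes the desired point into a soft preference rather than a hard constraint.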
III-B Adaptation of Quaternion Trajectories
The adaptation problem is then formulated over the extended reference trajectory (25), leading to a new objective (23), whose compact representation is given in (24). It can be observed that the new objective (24) shares the same form as (11), except that the reference trajectory in (24) is longer than that in (11); thus, the solution of (24) can be determined in a similar way. Finally, the adapted output can be computed via (14) and, subsequently, the corresponding quaternion is recovered from (16). In this case, the adapted trajectory is capable of passing through the desired quaternions with the desired angular velocities at the specified time instants. The entire approach of quaternion adaptation is summarized in Algorithm 1.
IV Adaptation of Quaternions with Jerk Constraints
It is well known that robot trajectories should be smooth in order to facilitate the design of controllers as well as the execution of motor commands [16, 17]. For instance, in a table tennis scenario that needs fast striking motions, extremely high accelerations or jerks may degrade the final striking performance, given the physical limits of the motors. It is possible to formulate this constraint as an optimization problem and search for the optimal trajectory via an iterative scheme, as done in . In this section, we consider the problem of learning and adapting quaternion trajectories while taking jerk constraints into account; specifically, we aim to provide an analytical solution to this issue. We explain the learning of multiple quaternions with jerk constraints in Section IV-A. Subsequently, we show quaternion adaptation with jerk constraints in Section IV-B.
IV-A Learning Quaternions with Jerk Constraints
By observing (19), we can find a proportional relationship between the quaternion increment and the angular velocity. Moreover, when we revisit the transformation described in (2), we arrive at a correspondence between the derivatives of the quaternion trajectory and the derivatives of its transformed counterpart, implying that the quaternion jerk constraint can be approximated by an acceleration constraint on the transformed trajectory.
We have formulated imitation learning in Euclidean space as an optimization problem in terms of (11), which in fact can be used to incorporate acceleration constraints directly. Following the parameterization form in (9), the acceleration of the transformed trajectory can be written as in (26). Thus, the problem of learning orientations with jerk constraints becomes (27), where a weighting coefficient acts as a trade-off between orientation learning and jerk minimization (note that the quaternion jerk constraints have been approximated by acceleration constraints in the transformed space).
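In weight space, this penalized objective admits a closed-form solution analogous to ridge regression. The sketch below uses explicit basis functions rather than kernels; the matrix names, the basis choice, and the trade-off parameters are illustrative assumptions consistent with the form of (27):

```python
import numpy as np

def fit_with_jerk_penalty(Phi, ddPhi, Mu, Sigma_inv, lam=1e-3, lam_j=1e-2):
    """Weight-space sketch of a jerk-constrained objective: a
    covariance-weighted fit to the reference means plus a penalty on
    second derivatives of the parameterized trajectory.

    Phi:       (B, N) basis functions at the N reference inputs
    ddPhi:     (B, N) their second time derivatives
    Mu:        (N,) reference means
    Sigma_inv: (N,) inverse variances (precision weights)
    """
    A = Phi @ np.diag(Sigma_inv) @ Phi.T \
        + lam * np.eye(Phi.shape[0]) \
        + lam_j * (ddPhi @ ddPhi.T)
    b = Phi @ (Sigma_inv * Mu)
    return np.linalg.solve(A, b)  # optimal weight vector
```

Increasing `lam_j` suppresses the weights responsible for curvature, trading tracking accuracy for smoothness.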
Let us re-arrange (27) into a compact form, resulting in (28), which is equivalent to (29). It can be observed that (29) shares the same form as (11), and hence we can follow (12)–(17) to derive a kernelized solution for quaternion reproduction with jerk constraints. It is worth pointing out that the expanded matrix comprises the second-order derivatives of the basis functions; thus a new expanded kernel matrix (30) is needed.
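When the expanded kernel matrix involves second-order derivatives of the basis functions, cross-derivatives of the kernel function are required. For an assumed RBF kernel (an illustrative choice, with a hypothetical length parameter `ELL`) these are available in closed form:

```python
import numpy as np

ELL = 0.5  # illustrative kernel length parameter

def kern(s1, s2):
    """Assumed RBF kernel: k(s1, s2) = exp(-(s1 - s2)^2 / ELL)."""
    return np.exp(-(s1 - s2) ** 2 / ELL)

def kern_d1d2(s1, s2):
    """Cross second derivative d^2 k / (ds1 ds2), which would populate
    the blocks of the expanded kernel matrix where the basis is
    differentiated twice on both sides."""
    r = s1 - s2
    return (2.0 / ELL - 4.0 * r ** 2 / ELL ** 2) * kern(s1, s2)
```

A quick finite-difference check confirms the analytic expression, which is useful when assembling the expanded kernel matrix numerically.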
IV-B Adapting Quaternions with Jerk Constraints
We now move one step further and consider the problem of quaternion adaptation with jerk constraints. Specifically, we propose to impose jerk constraints on the adaptation problem studied in Section III. By analogy with (11) and (27), we reformulate (23) to include the jerk constraints, yielding a new minimization problem as