1. Introduction
We assume that the reader is familiar with the notion of pose and rigid motion, and the use of quaternions to represent rotations. We use the terms pose and rigid motion interchangeably, since a pose can be considered as a rigid motion relative to a fixed reference frame, and the mathematical operations to manipulate them are identical. Thus we also use the terms rotation and orientation interchangeably, and similarly with translation and position. We refer the reader to [5, 10].
The use of dual quaternions to represent poses is well established, especially in the graphic card industry: [1, 12, 6, 16, 17, 19, 26]
. However, in the robotics industry, it is more common to represent a pose using a quaternion (for the rotation) and a three dimensional vector (for the translation). In this paper, we advocate for the use of dual quaternions.
Let us start with a general discussion of why people might want to represent rotations by unit quaternions instead of three by three orthogonal matrices with determinant . One reason given is that quaternions can be represented with only four numbers, whereas the matrix representation requires nine numbers. Another representation commonly used are Euler angles, and these only require three numbers. However, with Euler angles, the formula for composition of rotations is difficult to calculate. Also, Euler angles suffer from gimbal lock, in which if the rotation takes certain values, the choice of Euler angles isn’t unique, and discontinuities and singularities arise. Quaternions suffer from no such problems.
But in modern times, computer memory is very cost effective, and communication is fast, thus the aformentioned advantages might seem trivial at first. This begs the question: what other advantages do quaternions offer over matrices? Quaternions represent a quantity with three degrees of freedom by a four dimensional vector, whereas matrices use a nine dimensional vector. Suppose one obtains an approximation to a matrix or quaternion? For example, this would be common when using NewtonRaphson to find a rotation satisfying some formula, or to interpolate between rotations. So then one has to find a projection which takes the vector to an admissible vector, that is, a vector that actually represents a rotation. And projecting from a nine dimensional quantity to a three dimensional quantity is going to be much harder than the similarly projection from the four dimensional vector. (The projection from quaternions to unit quaternions is called the
normalization, and is defined in Section2. A suggestion of what such a projection would be for matrices is given at the end of Section 16.)For representing poses, the usual representation is by a three by three orthogonal matrix with determinant
for the rotation, and a three dimensional vector for the translation, twelve dimensional in all. This has a natural representation as a four by four matrix, as we explain Section 4. A great advantage of this representation is the relationship between pose and twist:(1) 
where the multiplication is matrix multiplication, and a twist is a combination of angular velocity and translational velocity measured with respect to the moving frame.
Another natural representation used a lot is to use a quaternion and a three dimensional vector. But this new representation has problems in that it is not clear how the above equation would work in this situation. However, if one introduces the notion of the dual quaternion, which is an eight dimensional vector, equation (1) is still valid if one uses dual quaternion multiplication.
A challenge with dual quaternions is that they are abstract, which can make them harder to understand. Mathematicians find value behind abstract computational ideas, but Engineers usually find the end result, ‘black box approach,’ more useful.
A useful approach to developing intuition for dual quaternions is to understand the rules with their structure, and the mathematical operations needed to manipulate them, (that is, normalization, basis vector multiplication, pose differentiation). Once one becomes comfortable with their form and function, the advantage of duel quaternions over other methods becomes more apparant. For example, one realizes that computing the projection of an eight dimensional vector to the six degrees of freedom of a pose, is also quite simple. (This projection is also called the normalization, and is defined in Section 3.)
This makes it very useful if one is using the NewtonRaphson Method to find a pose satisfying certain properties. We show how the dual quaternion representation efficiently solves the forward kinematics problem.
Finally we consider the dynamics of poses, that is, the equations of motion of a rigid spinning body. In any representation this can be quite difficult, because of both centripetal and precessional effects. But from a theoretical perspective, using the dual quaternion representation is quite slick.
In summary, while the usual representation using Euler angles and translations is conceptually easy to understand, performing calculations can be difficult. Whereas with dual quaternions, while the are initially conceptually hard to understand, when it comes to calculations they are much easier to work with.
2. Motivation
A pose, or rigid motion, is a rotation followed by a translation. (Semantically, a pose is the position of the end effector of the robot, and thus different from a rigid motion, but mathematically they are the same if we consider a pose to be a rigid motion with respect to the stationary reference frame.) The rotation is represented by a orthogonal matrix with determinant . A translation is a vector in . These represent the pose :
(2) 
Composition of poses is written from right to left. Thus
(3) 
One way to represent a pose is using a four by four matrix
(4) 
Then composition of poses may be computed simply by multiplying matrices of the form (4). In the literature, the Lie group of rotations is often denoted . The Lie group of poses is often denoted .
The problem with this representation is that twelve numbers are required to represent a pose. Other problems, which we discuss below, is that of interpolating a sequence of poses, and of computing forward kinematics.
One common way to solve both of these problems is to represent the rotation by a unit quaternion [2, 23], which we briefly describe here. A quaternion is a quadruple of real numbers, written as , with the algebraic operations . Its conjugate is , its norm is , its normalization is , its real part is , and its imaginary part is . It is called a unit quaternion if , a real quaternion if , and a pure quaternion if . Note the multiplicative inverse is given by .
We identify three dimensional vectors with pure quaternions, by identifying , , and with the three standard unit vectors. A unit quaternion represents the rotation . A rotation by angle about an axis , where , has two unit quaternion representations: . Composition of rotations corresponds to multiplication of unit quaternions. With some practice, it becomes easier to read a rotation from a quaternion than it does from Euler angles, the standard method of representing rotations.
We can represent quaternions as four dimensional vectors, and give it the inner product
(5) 
Then a way to represent poses is as a pair . This is the way poses are represented internally in the Robotic Operating System [25]. Then the composition rule is
(6) 
A difficulty with this is that the composition is not a bilinear operation as it was with the representation (4). (This lack of bilinear representation becomes particularly problematic when representing twists, which we describe below.) However there is a way to resolve this as follows. Write and . Then equation (6) can be rewritten as
(7)  
This leads to representing the pose by a unit dual quaternion, which we now describe.
3. Dual quaternions to represent poses
A dual quaternion is a pair of quaternions, written as , with the extra algebraic operation .
The conjugate dual quaternion of is . Conjugation reverses the order of multiplication:
(8) 
There is another conjugation for dual quaternions: , but we have no cause to use it in this paper, except in equation (22) below.
A unit dual quaternion is a dual quaternion such that , equivalently, that is a unit quaternion and . A pure dual quaternion is a dual quaternion such that both and are pure quaternions.
If is a dual quaternion with , then its multiplicative inverse can be calculated using the formula
(9) 
If is a unit dual quaternion, then there is a computationally much faster formula:
(10) 
For a dual quaternion , it is not really possible to mix and additively. The quaternion is unitless, whereas the quaternion has units of length. For this reason, when measuring how large a dual quaternion is, everything must be with respect to , the characteristic length scale. The size of a dual quaternion is defined to be
(11) 
A dual number is anything of the form , where and are real numbers. The norm of a dual quaternion is the dual number defined by the two steps:
(12)  
(13) 
The norm preserves multiplication, that is, if and are two dual quaternions, then
(14) 
If is any dual quaternion with , then we define its normalization to be the unit dual quaternion
(15) 
(We remark that the normalization of a dual quaternion is used in the computer graphics industry [16, 17].) While this normalization formula might seem initially quite complicated, after thinking about it one can see that it is the simplest projection that enforces and .
The normalization also satisfies the following properties.

If is a unit dual quaternion, then .

Normalization preserves multiplication, that is, if and are two dual quaternions, then
(16)
Note that
(18) 
and this could have been the definition of the multiplicative inverse of a dual quaternion, except that this definition is circular.
Let us clarify the bigOh notation. We say that
(19) 
if there exists a constant such that if is sufficiently small, then
(20) 
We set for the set of dual quaternions, for the set of unit dual quaternions, and for the set of pure dual quaternions.
If we are given a pose represented by , then the pose is also represented by the unit dual quaternion
(21) 
As we have shown above, composition of poses corresponds to multiplication of unit dual quaternions. If is a 3vector, and is the image of under the action of the pose , then
(22) 
but generally it is easier to use the formula
(23) 
The introduction of the factors and is unnecessary, but it slightly improves formulas for the twist, as we see below.
It is difficult for a human to look at the second part of a unit dual quaternion, and from that read what the translational part of the pose should be. So dual quaternions work better for an internal representations of poses rather than human readable representations.
4. Kinematics: dual quaternions to represent twists
The translational velocity of a translating reference frame is given by . The angular velocity of a rotating frame is given by the differential equation
(26) 
or , where is the Hodge star operator of [27]:
(27) 
(Equation (26) gives the angular velocity with respect to the rotating reference frame. If one wants the angular velocity with respect to the stationary reference frame, use instead .)
Note that
(28) 
where denotes the cross product.
A twist is the pair of vectors that describe the change of pose in the moving reference frame, that is:
(29) 
(Note in particular that , so that a continual application of a twist can result in a motion following a straight line, an arc of a circle, or a spiral.) The set of angular velocities, and the set of twists are both Lie algebras, often denoted by and in the literature.
The quaterniontranslation formulation doesn’t have a representation like equation (1):
(30) 
But with the unit dual representation, because the composition of poses is given by a bilinear multiplication, it follows that the twist is represented by the dual quaternion satisfying the differential equation.
(31) 
It will be shown in Section 15 (see also [12]) that is the pure dual quaternion
(32) 
Note that if is a unit dual quaternion that is a function of time , then is always a pure dual quaternion. (Note that is a pure dual quaternion equal to the twist relative to the fixed frame of reference. Note also that the factor attached to in equation (32) is a consequence of the introduction of the factor in equation (21), and thus is essentially arbitrary.)
5. Dual quaternions to represent wrenches
Let the pose represent the reference frame that moves with the end effector. It is not necessary (although it can simplify things) that the center of mass of the end effector coincides with the origin of the moving frame.
The wrench dual quaternion is defined to be
(33) 
where and are the torque and force, respectively, applied to the end effector at the origin of the moving frame, measured with respect to the moving frame.
If is the center of mass of the end effector in the moving frame, then the twist about the center of mass is given by
(34) 
where , and the wrench applied about the center of mass is
(35) 
The reason for introducing the factor in definition (33) is so that the rate of change of work done to the end effector is given by
(36) 
See [4] for the origins of the term twist and wrench as pairs of 3vectors, which are examples of screws.
6. Interpolation of poses
Suppose that we are given a sequence of times and rotations, and . We would like to find a function from to rotations such that the third derivative of is bounded, and . If the quantities were merely vectors, we could use cubic splines. But if we perform a similar calculation on the matrices, we cannot guarantee that the matrix is a rotation matrix.
So what we want is to find a map that projects general matrices onto rotation matrices. But it is not obvious how to find such a map that is computationally efficient. It is here that using quaternions really has great advantages over matrices, because an obvious map from general quaternions to unit quaternions is to normalize.
Let be the quaternion representation of respectively. Let be the cubic spline interpolation such that . Then
(37) 
provides a computationally fast, low jerk, method of interpolating the rotations.
This can be extended to interpolate poses. Given times and unit dual quaternions , first interpolate to get a function such that , and then normalize it to get a function . This gives a computationally fast, low jerk, method for interpolating poses.
It is perhaps better to interpolate the rotation quaternions and the translations separately. This is because the center of gravity typically travels in a straight line rather than the arc of a curve, but naive interpolation of dual quaternions follows arcs of curves if the angular rotation is nonzero. (Interpolation of dual quaternions is more appropriate for ‘skinning’ [16, 17].)
If one wants to find a function that passes through these points, and has low jerk even at the end points and , this can be done by adding a ‘ramp up’ and ‘ramp down’ at the ends of the trajectory. Create a free cubic spline on and . Then define a function
(38) 
where
(39) 
Since , , and , it follows that has bounded third derivative, for , for , and for . Thus
(40) 
provides a low jerk function, even at the end points, that passes through the appropriate points. it is important that and be large enough to allow the trajectory to ‘ramp up’ and ‘ramp down’ smoothly.
7. Perturbations of poses
Suppose one has a pose that varies in time, and a fixed time . Then we would like to find a good representation of the perturbation of from a reference pose when is close to . One way to do this is to choose a new reference frame in which , the identity pose, or rather, to think about how behaves.
It can be seen that for close to that there is a pure dual quaternion such that
(41) 
Thus
(42) 
gives a great way to measure the perturbation.
We prefer instead to use what we shall call the Lie difference, which we denote using , which is close to (42), but is always a pure dual quaternion:
(43) 
Note that if and are pure dual quaternions, and is an invertible dual quaternion, then
(44)  
Thus these give a great way to map perturbations of nonlinear poses to linear pure dual quaternions. These can be very effective in control theory.
8. Dual quaternions as vectors
We make the identifications
(45) 
using the basis
(46) 
and similarly, we make the identification
(47) 
using the basis , , , , , . With these identifications, we can define the dot product between two dual quaternions by transferring the usual definition of dot product on , that is
(48) 
In this way, every dual quaternion can be written in component form as
(49) 
and every pure dual quaternion as
(50) 
9. Lie derivatives
The notion of the Lie derivative, sometimes in our context called the directional derivative, is a combination of two ideas that may be found in the literature. First is the concept of a Lie derivative with respect to a vector field [28, 30]. Secondly, the definition of the Lie algebra is that it is the vector space of vector fields that are invariant under left multiplication by elements of the Lie group [20, 29]. In this way, we can define the Lie derivative of a function with respect to an element of the Lie algebra. One place in the literature where they are combined is in [13, equation (5), Chapter II].
These standard abstract definitions can be made more concrete in our special case where the Lie group is the set of unit dual quaternions, and the Lie algebra is the set of pure dual quaternions.
If one has a quantity that is a function of pose , then we usually think of its derivative as the Jacobian with respect to the components of . But it really makes more sense to compute the derivative with respect to the components of the perturbation of . The latter is the Lie derivative.
The definition is this. Given a differentiable function whose domain is the unit dual quaternions, , we can extend it arbitrarily to a differentiable function whose domain is an open neighborhood of in . Given a unit dual quaternion and a pure dual quaternion , we define the Lie derivative of in the direction of to be
(51) 
Since isn’t necessarily a unit dual quaternion, it is not obvious that the definition of the directional Lie derivative doesn’t depend upon how the domain of was extended from , but it is, as is shown in Lemma 1 below.
Given a generic function whose domain is the dual quaternions, , we define its Jacobian to be the dual quaternion
(52) 
If its domain is the pure dual quaternions, , we have the same formula except with replaced by .
Using the chain rule for partial derivatives, we obtain the following formula, which is useful for explicitly calculating the Lie derivative if the function
is known.(53) 
We define the partial Lie derivatives to be
(54) 
and its full Lie derivative to be the pure dual quaternion
(55) 
so that for all pure dual quaternions it satisfies:
(56) 
To gain some intuition, write
(57) 
so that
(58) 
Then we see that represents a change in pose by an infinitesimal translation and an infinitesimal rotation , measured in the moving frame of reference. Thus is a pure dual quaternion giving twice the change in with respect to an infinitesimal rotation, plus times twice the change of with respect to an infinitesimal translation.
One important property of the Lie derivative is that if represents a pose, with twist , then
(59) 
The Lie derivative satisfies various rules, which are also useful for explicitly calculating the Lie derivative when is known.

If is linear in , then
(60) 
The product rule: if is any product, such as inner product or dual quaternion product, then
(61) 
The chain rule:
(62)
10. Applications to parallel robots
Suppose that the position of the end effector of a parallel robot is given by actuators, described by quantities
(63) 
For example, for a cabledriven parallel robot, these represent the lengths of the cables, and typically . For the Gough or Stewart Platforms [10], we have .
Let us also denote the force exerted by the actuators by
(64) 
defined so that the rate of change of work performed through the actuators is given by
(65) 
Suppose we have a function , which calculates the required actuator values, , from the pose of the moving frame. This is the inverse kinematics function. If it is a simple cabledriven parallel robot, this can be computed using Pythagoras’s Theorem, from the positions of the actuators on the fixed frame, and the positions of where the cables are attached to end effector. But this formula could be more complicated in other situations.
There is also a () matrix that maps the actuator forces to the wrench dual quaternion:
(68) 
This can be computed by balancing the force and torque exerted upon the endeffector. But it can also be computed with the following important identity:
(69) 
This is because the rate of change of work done on the parallel robot can be computed in two different ways, either using equation (36), or (65). Substituting in equations (67) and (68), we obtain
(70) 
Since this is true for arbitrary actuator forces and end effector twists , the result follows.
11. Second lie derivatives
If is a function of dual quaternions, we define its Hessian to be the matrix
(71) 
Thus the expression should be interpreted with treated as an eight dimensional vector.
We also need to compute second Lie derivatives. Calculations show that
(72) 
As a corollary, we obtain the well known identity:
(73) 
which implies that Lie derivatives do not necessary commute.
Another way one might try to define the second derivative is to use the formula . Unfortunately, this definition doesn’t work, as it depends upon the choice of how to extend the domain of to all dual quaternions. The obvious choice of extension is to use the normalization:
(74) 
We have the following formula for the Hessian of :
(75) 
This equation is proved in Section 15.
12. Forward kinematics for parallel robots
Let the set of admissible actuator values, , be the range of the function . Then the forward kinematics function is
(76) 
which is a left inverse to . Because of possible measurement errors, should produce decent answers even if the actuator values are merely close to .
We will focus on the overconstrained problem, that is, when the number of actuators is greater than . If , a much simpler approach is possible by simply applying the NewtonRaphson method directly to . In that case, only first derivatives of are required.
This problem has been solved by many others, for example, [22, 31]. But we feel that this is much more easily solved using dual quaternions. The function takes actuator values , and seeks to find the pose so that is close as possible to . We do this by seeking to minimize the loss function
(77) 
We use the NewtonRaphson Method, which given close to a minimum of the loss function, finds which is much closer to the same minimum of the loss function. Because this method as usually stated works only on linear vector spaces, we first have to define a map , with , from the linear vector space of pure dual quaternions to the manifold of unit dual quaternions [15].
Most papers on the NewtonRaphson Method on manifolds construct this map using the so called exp function [7, 8, 9]. So the map is
(78) 
The exp map in these papers is following the path of a geodesic on the manifold, and this is equivalent to using the equations of motion of the end effector as described in Section 13. Another exp map is to follow a oneparameter subgroup, or equivalently, equation (24). We do not take these approaches, as they can be computationally expensive. Our approach is to normalize:
(79) 
This is numerically close to the second approach, as is shown by equation (25).
In order to apply the NewtonRaphson, we need to compute the Jacobian
(80) 
and its Hessian , which by equation (75) is
(81) 
The NewtonRaphson method we use is to iterate:
(82) 
For the criterion of when to terminate the NewtonRaphson Method, we measure .
This method works well in practice, because typically we measure the pose frequently, and thus the current pose won’t be much different than the previously measured pose. Thus the previously measured pose is a good starting value for using the NewtonRaphson method for finding the current pose. We found that only about two or three iterations of the NewtonRaphson method were generally required, and that it easily ran at a thousand times per second on a modern computer.
13. Dynamics of the end effector
We define the noload forces to be the actuator forces if there is no end effector present:
(83) 
where is a positive definite matrix denoting what we shall call the effective noload mass of the actuators.
Let us suppose that the kinetic energy of the parallel robot is given by
(84) 
where is a positive definite matrix, which depends only upon , and which we call the effective mass of the parallel robot. If is the mass of the end effector,
is the moment of inertia tensor of the end effector about its center of mass, and
is the center of mass of the end effectors, all measured with respect to the moving frame, then(85) 
that is
(86) 
where is the identity matrix.
The next result is proved in Section 15.
Theorem 1.
The various terms in equation (88) can be interpreted as follows.

and are inertial resistance to change of angular and translational velocities.

is the centripetal force required to rotate and move at the same time.

is the precession torque (so that if the moment of inertia is not isotropic, then the body spins in a counterintuitive manner, see, for example, [18]).

(91) is the wrench required to move the actuators, where the noload forces may be computed using
(92) 
is the force due to gravity, translated into the moving frame of reference.
14. Spherical linear interpolation of dual quaternions
Suppose we are given two unit dual quaternions, and ? The problem is to find the dual quaternion valued function of time, denoted , such that at , it takes the value , that at , it takes the value , and such that its twist, , is a constant pure dual quaternion. The name of the function, ‘slerp’, comes from the equivalent problem for orientations [24].
This requires two functions. The first is the exponential function, defined in equation (24). The function as function of time is the pose traveled if it is the identity pose at , and maintains a constant twist . The second is a logarithm function, which given a unit dual quaternion , finds such that . There are infinitely many possibilities for , so we will just describe the principal value, that is, the one for which the norm of the angular velocity is minimized. Then
(93)  
If one is merely interested in interpolating between the poses represented by these unit dual quaternions, multiply one of them by so that nondual part of has nonnegative real part.
The results of this section may be found in [21]. We have seen similar results in [26], but we believe our formulas to be more explicit.
Given any dual quaternion
(94) 
by decomposing into a vector parallel to and perpendicular to
, we can suppose without loss of generality that there exists orthonormal vectors
and such that is a linear combination of , , , , and .Theorem 2.
If and are orthonormal vectors, then if
(95)  
and if
(96) 
If , this represents a pose that rotates counterclockwise with angular velocity around , and translates by
that is, the sum of a point on a line in the direction of moving with speed , and a point on a circle in the plane perpendicular to of radius which is traversed once every time units. If , this represents a pose with the identity rotation and translating in a straight line with velocity .
Theorem 3.
Suppose and are orthonormal vectors, and that
(97) 
is a unit dual quaternion, with . Let
(98) 
that is, the angle part of the polar coordinates of , with . Then if then
(99) 
where if we assume , and we set .
It is not possible to assign a canonical principal value for , because it could be for any unit vector .
15. Proofs
Proof of Equation (17).
Since , we have
(100) 
Hence using Taylor’s series
(101) 
from which it follows that
(102) 
∎
Proof of Equation (32).
First, if we differentiate , we obtain , that is, is pure.
Suppose the position is the image of the constant position under the pose represented by and also represented by . Differentiating equation (2) we obtain
(103) 
Differentiating equation (22) we obtain
(104) 
After some manipulation, and setting , we obtain
(105) 
Comparing equations (103) and (105), we obtain
(106) 
and since this holds for all , the result follows. ∎
Lemma 1.
The definition of in equation (51) does not depend upon the extension of from to a neighborhood of in .
Proof.
Let and be two extensions of from to a neighborhood of in