In recent years, interest in Modular Robotics has grown, and reconfigurable, adaptable robots have started to be designed [1, 2, 3]. In particular, reconfigurability and modularity can be exploited to build robots that adapt to several different environments and accomplish different tasks, with significant cost and time reductions.
On the other hand, Modular Robotics introduces new challenges. One of these is the need to handle robots with variable kinematic structures, where this variability might result in a partial or complete lack of knowledge about the kinematic structure. It is worth stressing that forward and inverse kinematics are fundamental in robotics applications; notable examples are motion planning, robot modeling and control.
When dealing with standard robots, the prior knowledge about the robot geometry is extremely precise, since CAD models are usually available and direct measurements of the robot parameters are possible. Kinematic models relate joint input signals to robot configurations. Typically, they are computed as a sequence of relative transformations between properly assigned reference frames. Unavoidable inaccuracies in the geometrical parameters, wear, and systematic measurement errors sometimes make calibration procedures necessary.
In the Modular Robotics context, where the uncertainty about the robot geometry might be particularly high, the development of algorithms able to estimate the kinematic structure starting from a time series of visual observations is crucial. The kinematic structure identification problem is defined at different levels, depending on the sensors used and the amount of prior information available. First, when there is no prior knowledge, it is necessary to identify the rigid bodies composing the robot and to extract information about their poses. Second, starting from this information, the robot kinematics is learned by identifying the ordered sequence of links, the type of joint connecting each pair of consecutive links, and the corresponding input signal.
The first level is strictly related to the kind of sensors used in the data collection. For example, in  point cloud data have been considered. More precisely, the authors have proposed to identify the links by clustering points based on their relative distances, and by assuming that each cluster corresponds to one link. The task is much more complicated when observations come from a standard 2D camera. Indeed in this case the clustering phase should be preceded by the identification of features which are constant over the observations.
In this paper, we consider the setup introduced in [8, 9] and , where a distinct fiducial marker is attached to each link. This assumption, which considerably simplifies the data acquisition, is particularly reasonable in the Modular Robotics context.
The focus of this paper is on the second level of the kinematic structure reconstruction. Specifically, starting from a time series containing the marker poses and the joint signals, we propose an algorithm able to reconstruct the ordered sequence of markers associated with the robot kinematic chain, together with the types of the joints connecting subsequent links and the corresponding joint input signals.
Similar problems have been treated in  and . In , the authors have restricted the scope of their work to the case of revolute joints only, and the kinematic parameters are learned by a gradient-based minimization procedure. Moreover, numerical results highlight that convergence is guaranteed only when each link has a marker attached to it. In  the authors have proposed the use of Gaussian Process (GP) methodologies. However, the focus is limited to learning the sequence of markers, and no joint type identification has been considered. Moreover, even if the authors have proposed a strategy to simplify the standard GP procedure, the algorithm they developed might be quite expensive from a computational point of view, in particular when dealing with manipulators with a large number of degrees of freedom.
The kinematic structure identification algorithm we introduce is based on checking the feasibility of three systems of equations, which are obtained starting from elementary kinematic relations between pairs of subsequent links and using information extracted from time series of visual data. More precisely, given a pair of markers attached to subsequent links and the corresponding joint input signal, a linear system of equations holds true if the three elements define a prismatic transformation, whereas a linear system and a non linear system are both satisfied if the transformation is revolute.
In general, it is possible to exhibit sets of observations for which the systems of equations hold true even though the pair of markers and the joint signal considered are not related to one another. However, by extensive Monte Carlo simulations we show that such false positives are very unlikely to occur. To ensure that the feasibility of the introduced systems of equations is a necessary and sufficient condition to verify whether two markers are attached to consecutive links, we need to apply our strategy to data obtained from fully informative sets of observations; in the paper we exhibit a class of trajectories from which it is possible to properly select observation sets which are fully informative.
Compared to the state of the art, the proposed approach is computationally less expensive, since it is based on the solution of linear and non linear systems of equations with low dimensionality. Moreover, differently from , the proposed algorithm fully reconstructs the kinematic structure, including the joint type sequence.
The paper is organized as follows. In the next Section we describe the considered setup, and in Section III we derive the three systems of equations used in the kinematic identification. The fully informative trajectories are reported in Section IV, while the proposed algorithm and the Monte Carlo simulations are in Section V.
II Setup Description and Problem Formulation
In this section we formally describe the setup and problem considered in this work.
The framework is the same one adopted in , and it consists of a camera and a robotic arm, composed of links and joints, forming an open kinematic chain. As far as the camera placement is concerned, we adopted an eye-to-hand configuration, namely the camera observes the robot and its pose is fixed with respect to the world reference frame (RF). For a pictorial description of the overall setup see Figure 1.
We assume that each robot link , , has a fiducial marker , , attached to it, see for instance, [11, 12] and . The subscript denotes the link position in the kinematic chain, in particular and correspond, respectively, to the base link and the last link.
The marker-link relations are unknown and, for this reason, we use different subscripts to enumerate markers and links. More formally, referring to the notation introduced above, given the -th marker , the link to which is attached is unknown.
The relative poses between the world RF and each marker are obtained by processing each single frame coming from the camera [11, 12, 13]. Specifically, let be the pose of marker at time , expressed with respect to the camera RF (hereafter indicated by the superscript ); then
denotes the position vector, and it is composed by the three Cartesian coordinates, and , while, as far as the orientation is concerned, the yaw-pitch-roll convention is adopted, where , and are respectively the yaw, pitch and roll angles. For computational reasons, it is convenient to express the relative orientation between and the camera using the rotation matrix , that is related to the yaw, pitch and roll angles by the standard expression . Similar definitions hold for , which denotes the pose of the -th link.
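The conversion from yaw, pitch and roll angles to a rotation matrix can be sketched as follows. This is a minimal illustration that assumes the common Z-Y-X composition (yaw about z, then pitch about y, then roll about x); the exact "standard expression" used in the paper is not reproduced here, and the function names are hypothetical.

```python
import numpy as np


def rot_x(a):
    """Elementary rotation about the x-axis by angle a (radians)."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])


def rot_y(a):
    """Elementary rotation about the y-axis by angle a (radians)."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])


def rot_z(a):
    """Elementary rotation about the z-axis by angle a (radians)."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def ypr_to_matrix(yaw, pitch, roll):
    """Rotation matrix from yaw-pitch-roll angles.

    Assumption: Z-Y-X composition, R = Rz(yaw) @ Ry(pitch) @ Rx(roll).
    """
    return rot_z(yaw) @ rot_y(pitch) @ rot_x(roll)
```

If the marker detection library returns a different Euler convention, only `ypr_to_matrix` would need to change; the rest of the pipeline works on rotation matrices.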
Then, by processing the frame taken by the camera at time instant , it is possible to reconstruct the set of poses .
In addition, we assume that the vector of the joint configurations is available at time ; specifically, parametrizes the relative displacement between the two links connected by the -th joint. However, we assume that the joint-link relations are also unknown, namely, for the -th joint we do not know which pair of links and is connected by joint . For this reason, similarly to what was done for the markers, we use different subscripts to denote joints and links.
The main goal of this paper is to identify the robot kinematic structure starting from a time series of measurements , composed of the joint configurations and the marker poses.
From now on, to keep the notation compact, we point out explicitly the dependencies on time only when it is necessary.
The identification of the robot kinematic structure can be decomposed into three subtasks:
Identifying , i.e., the sequence of markers associated to the kinematic chain ;
Identifying , i.e., the sequence of joint types connecting consecutive links along the kinematic chain, starting from the pair , up to the pair ; more precisely, is a binary variable assuming value (resp. value ) when the joint connecting and is prismatic (resp. revolute);
Identifying , i.e., the sequence of joint signals that parametrize the corresponding transformations in .
III Relations Between Pairs of Subsequent Markers
In this section we provide useful expressions that describe the relative motion between markers which are attached to consecutive links; in particular we will distinguish the case where the joint connecting the links is prismatic from the case where it is revolute.
More formally, let and be the two markers, and let and be the links they are attached to. Supposing (similar considerations hold for ), we will assign a RF to each marker and to each link and, based on these RFs, we will provide a mathematical description of the relative motion between and . In particular, given a set of observations , we will identify three systems of equations, that will be exploited in the next sections, to discriminate if the links associated to and are directly connected or not. Additionally, if and are connected, the systems will uniquely identify the joint type connecting the two links and also the joint variable describing the relative displacement.
III-A Reference frame conventions
The definition of a RF for each link and for each marker is required to provide a mathematical description of the transformations occurring along the kinematic chain. As far as the links are concerned, we adopt the Denavit-Hartenberg (DH) convention; for details we refer the interested reader to , chapter . Once the RFs of the links have been assigned, the expression of , i.e., the relative orientation between the consecutive links and , is given by
with and being the elementary rotation matrices around the -axis and the -axis, respectively, and a constant parameter (see ). If the joint connecting and is prismatic, then is constant and equal to , while, if the joint is revolute and controlled by , it holds . The relation between the relative positions of and is described by , i.e., the origin of expressed with respect to , and is given by
where is defined as before and is a constant parameter of the kinematic (see ). If the joint connecting and is revolute then is constant and equal to , while, if it is prismatic and parametrized by , then it holds .
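To make the role of the joint variable concrete, the following sketch computes the relative orientation and position between two consecutive links under the textbook DH convention. The specific form used here, R = Rz(theta) Rx(alpha) and p = [a cos(theta), a sin(theta), d], is an assumption taken from the standard convention rather than from the equations of this paper, and the parameter names `theta0`, `d0`, `a`, `alpha` are hypothetical.

```python
import numpy as np


def rot_x(a):
    """Elementary rotation about the x-axis."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])


def rot_z(a):
    """Elementary rotation about the z-axis."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def dh_relative_pose(q, joint_type, theta0, d0, a, alpha):
    """Relative pose of link h+1 w.r.t. link h (assumed standard DH form).

    Revolute joint:  theta = theta0 + q, d constant.
    Prismatic joint: theta constant,     d = d0 + q.
    """
    if joint_type == "revolute":
        theta, d = theta0 + q, d0
    elif joint_type == "prismatic":
        theta, d = theta0, d0 + q
    else:
        raise ValueError("joint_type must be 'revolute' or 'prismatic'")
    R = rot_z(theta) @ rot_x(alpha)
    p = np.array([a * np.cos(theta), a * np.sin(theta), d])
    return R, p
```

Under these assumptions a prismatic joint leaves the relative orientation unchanged and moves the origin along a fixed direction, which is the property the prismatic test relies on.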
Additionally, we need to define position and orientation of the reference frame of each marker with the respect to the reference frame of the link they are attached to. For example, suppose that is attached to , then position and orientation of w.r.t. are described, respectively, by and . For later use it is convenient to introduce also and . Similar definitions hold for and .
Since the marker-link transformations are fixed, and are independent of the joint values and constant over time.
It is worth stressing that and are unknown, and we did not introduce any limitation on the way the markers are attached to the links. From a practical point of view, this fact is very interesting, since it allows the proposed algorithm to be adopted even in setups different from the one we described in Section II. Indeed, it might happen that the markers are not available and the use of ad-hoc computer-vision algorithms is required to get information about the robot displacements [14, 15, 16]. In this context the marker placement is not controllable, but it still holds that and are constant.
Let and be, respectively, the Cartesian coordinates of the origin of w.r.t. , and the relative orientation between and . Assuming that and are subsequent, i.e., , is given by
Moreover, exploiting standard properties of rotation matrices, the following equation holds
It is worth remarking that, in the described setup, we have knowledge of , since , where , and are obtained processing the information coming from the camera. Moreover, also is known, since by definition it is a function of the camera observations and .
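The relative pose between two markers can be computed from their camera-frame poses with standard rigid-body algebra. The sketch below is a minimal illustration; the function and argument names are hypothetical, and it only assumes that each marker pose is available as a rotation matrix and a position vector expressed in the camera RF.

```python
import numpy as np


def relative_pose(R_ci, p_ci, R_cj, p_cj):
    """Pose of marker j expressed in the RF of marker i.

    R_ci, p_ci: orientation and position of marker i in the camera RF.
    R_cj, p_cj: orientation and position of marker j in the camera RF.
    Returns (R_ij, p_ij) with R_ij = R_ci^T R_cj and p_ij = R_ci^T (p_cj - p_ci).
    """
    R_ci = np.asarray(R_ci, dtype=float)
    R_cj = np.asarray(R_cj, dtype=float)
    p_ci = np.asarray(p_ci, dtype=float)
    p_cj = np.asarray(p_cj, dtype=float)
    R_ij = R_ci.T @ R_cj
    p_ij = R_ci.T @ (p_cj - p_ci)
    return R_ij, p_ij
```

Both outputs depend only on camera observations, so they can be evaluated for every pair of markers and every frame before any hypothesis on the kinematic chain is made.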
In the remaining part of this Section we further investigate the above relations, distinguishing the case where the joint connecting two successive links is prismatic from the case where the joint is revolute.
III-B Prismatic joint
Assume that the joint connecting and is prismatic. Since in this case the angle is constant, it follows that the relative orientation between and is not affected by variations of the joint variable, that is, the matrix is also constant over time. In addition, assume that is the joint variable associated with the prismatic joint connecting and . By substituting the expression of given in (2) into (3), the following equation holds
where the first three terms of the last equation are constant and they can be compacted in the vector , while the last term depends on the joint coordinate .
Observe that the last equation defines a system of equations which are linear w.r.t. and the third column of that we denote hereafter as . Specifically we can write
where and subject to the constraint
We have the following Proposition.
Proposition 1. Consider two markers and , attached to consecutive links connected through a prismatic joint. Let be the joint signal influencing the relative motion between the two links. Then, given a set of observations , the rotation matrix is constant and the linear system of equations in (8) has a solution satisfying the constraint in (7).
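A feasibility check in the spirit of the prismatic test can be sketched as follows. Since equations (7) and (8) are not reproduced here, the sketch assumes the form suggested by the text: the relative position equals a constant vector plus the joint coordinate times a direction vector, and the direction has unit norm because it is a column of a rotation matrix. The function name, the tolerance and the least-squares formulation are illustrative choices, not the authors' implementation.

```python
import numpy as np


def prismatic_test(R_list, p_list, q, tol=1e-8):
    """Check whether observations are compatible with a prismatic joint.

    R_list: (N, 3, 3) relative orientations between the two markers.
    p_list: (N, 3) relative positions between the two markers.
    q:      (N,) values of the candidate joint signal.
    Assumed model: R constant, p_k = c + q_k * v, with ||v|| = 1.
    Returns (feasible, c, v); c and v are None when R is not constant.
    """
    R = np.asarray(R_list, dtype=float)
    p = np.asarray(p_list, dtype=float)
    q = np.asarray(q, dtype=float)

    # Condition 1: the relative orientation must not change over the set.
    if not np.allclose(R, R[0], rtol=0.0, atol=tol):
        return False, None, None

    # Condition 2: linear system [I, q_k I] [c; v] = p_k for all k.
    n = len(q)
    A = np.zeros((3 * n, 6))
    for k in range(n):
        A[3 * k:3 * k + 3, :3] = np.eye(3)
        A[3 * k:3 * k + 3, 3:] = q[k] * np.eye(3)
    b = p.reshape(-1)
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    c, v = x[:3], x[3:]

    residual_ok = np.linalg.norm(A @ x - b) <= tol
    norm_ok = abs(np.linalg.norm(v) - 1.0) <= tol  # assumed constraint
    return bool(residual_ok and norm_ok), c, v
```

With noisy measurements the exact tolerance would have to be replaced by a threshold on the least-squares residual, tuned to the noise level.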
III-C Revolute joint
which is linear w.r.t. the vector of variables
since Equation (III-C) can be rearranged as
When considering a set of observations with cardinality we obtain the following linear equations
We have the following Proposition.
Proposition 2. Consider two markers and , attached to consecutive links connected through a revolute joint. Then, given a set of observations , the linear system of equations in (10) has a solution.
It is worth observing that the dependence of (10) on the revolute joint signal is not explicit, since it is incorporated into the evolution of the matrix . To directly consider the effects of varying the joint signal on the relative motion between and , we analyze their relative orientation. Unlike in the prismatic case, is not constrained to be constant. Let be the actuation signal of the revolute joint between and .
where . Analyzing (11) element-wise, we can identify a system of nine equations where the unknowns are the elements of and , while the output is given by the elements of . Now, let be the operator that maps a -dimensional matrix into the - dimensional column vector obtained by stacking the columns of the matrix on top of one another, then we have that . In addition observe that the unknown variables must satisfy the following orthogonality constraints
It is worth stressing that the aforementioned equations are non linear w.r.t the variables and .
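The column-stacking operator and the orthogonality constraints mentioned above can be illustrated with a short sketch. The helper names are hypothetical; the code only shows how a matrix is vectorized column by column and how a candidate solution could be checked for being a rotation matrix.

```python
import numpy as np


def vec(M):
    """Stack the columns of M on top of one another (column-major order)."""
    return np.asarray(M).reshape(-1, order="F")


def is_rotation(R, tol=1e-8):
    """Check the orthogonality constraints R^T R = I and det(R) = +1."""
    R = np.asarray(R, dtype=float)
    orthonormal = np.allclose(R.T @ R, np.eye(R.shape[0]), rtol=0.0, atol=tol)
    return bool(orthonormal and abs(np.linalg.det(R) - 1.0) <= tol)
```

For a 3 x 3 matrix `vec` returns a 9-dimensional vector, matching the nine scalar equations obtained by reading the orientation relation element-wise.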
If, instead of considering a single observation we consider a set of observations , the equations describing the relative orientations between and are
We have the following proposition.
IV Fully Informative Trajectories
In the previous Section we have stated three propositions defining conditions that are verified when and are attached to consecutive links. In general the reverse relations are not true, since it is possible to exhibit sets of observations such that the conditions of the previous Propositions are satisfied though the markers are not attached to subsequent links; this fact is strictly related to the sequence of joint configurations which have generated the taken observations. Unfortunately, due to space constraints, we do not include in this paper examples of such false positive observation sets.
Instead, in this section we introduce a class of trajectories from which it is possible to properly select an observation set for which the conditions defined in Proposition 1 for the prismatic joint, and in Propositions 2 and 3 for the revolute joint, are not only necessary but also sufficient to verify whether two markers are attached to consecutive links. An observation set with this property is said to be fully informative. A class of fully informative observation sets can be formally defined as follows.
Consider a collection of trajectories, where each trajectory is obtained by moving only one joint and keeping all the others fixed. For , without loss of generality, assume the -th trajectory is obtained by varying the joint signal and let and be two time instants such that and , , where is the modulo operator. Then let us define the observation set as the collection of the pairs
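The selection of samples from single-joint trajectories can be sketched as follows. The sketch assumes that, for each trajectory, two samples are kept whose moved-joint values differ modulo 2π; keeping exactly two samples per trajectory, the tolerance, and the function names are assumptions made for illustration.

```python
import math


def pick_informative_pair(q_samples, tol=1e-9):
    """Indices (0, b) of two samples whose values differ modulo 2*pi.

    Returns None if every sample equals the first one modulo 2*pi.
    """
    two_pi = 2.0 * math.pi
    for b in range(1, len(q_samples)):
        d = abs(math.fmod(q_samples[b] - q_samples[0], two_pi))
        d = min(d, two_pi - d)  # distance from the nearest multiple of 2*pi
        if d > tol:
            return 0, b
    return None


def build_observation_set(single_joint_trajectories):
    """Select sample indices from trajectories that each move a single joint.

    single_joint_trajectories[k] holds the values taken by the only moving
    joint along trajectory k. Returns a list of (trajectory, sample) indices.
    """
    selected = []
    for k, q_samples in enumerate(single_joint_trajectories):
        pair = pick_informative_pair(q_samples)
        if pair is None:
            raise ValueError(f"trajectory {k} has no two distinct values mod 2*pi")
        selected.extend([(k, pair[0]), (k, pair[1])])
    return selected
```

The returned indices identify which frames (marker poses and joint values) enter the observation set; the poses themselves are not needed to make the selection.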
Observe that has observations, which for a typical robot (i.e., ) represents a limited number of observations. It is possible to show that is a fully informative set and in particular we have the following results.
The proofs of the above Propositions are reported in Section VII.
A couple of remarks are now in order.
As noted above, there are examples of observation sets which are not fully informative and that might lead to false positives. However, in the numerical Section we will show, by Monte Carlo simulations, that selecting such false positive observation sets from generic trajectories appears to be a very unlikely event.
It is worth noting that the condition might not be verified with general input trajectories. A straightforward example occurs when the actuation signal is periodic, with period and . To avoid these situations we assume that the obtained trajectories are post-processed by simply removing the redundant values.
V Proposed Approach
In this Section we describe the algorithm we propose to deal with the robot kinematic structure identification problem. Our strategy consists in iterating over all the possible triplets, composed by a pair of markers and a joint signal, the procedure described by the flow chart in Figure 2.
Specifically, given a pair of markers and a joint signal, the algorithm first evaluates if the markers are connected through a prismatic joint and, in case this test is negative, it secondly evaluates if they are connected through a revolute joint.
The first test consists in checking the feasibility of the linear equations defined in Proposition 1. The second test, instead, is composed of two steps: the first step verifies whether the linear equations of Proposition 2 admit a solution and, if so, the second step is performed, which consists in solving the system of non-linear equations of Proposition 3. Observe that, in this way, the last step is performed only when necessary, thus minimizing the number of its executions. This fact is particularly relevant from the computational point of view, since the non linear test is the most expensive.
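The control flow described above can be sketched as a loop over triplets with three pluggable tests. The three callables stand for the feasibility checks of Propositions 1, 2 and 3 and are placeholders supplied by the caller; iterating over unordered marker pairs is an assumption of this sketch, as are all names.

```python
from itertools import combinations


def classify_triplets(n_markers, n_joints, is_prismatic,
                      revolute_linear_ok, revolute_nonlinear_ok):
    """Classify every (marker i, marker j, joint k) triplet.

    Each test is a callable taking (i, j, k) and returning a bool.
    The non-linear test is evaluated only when the prismatic test fails
    and the linear revolute test succeeds, as in the described procedure.
    Returns a dict mapping accepted triplets to 'prismatic' or 'revolute'.
    """
    result = {}
    for i, j in combinations(range(n_markers), 2):
        for k in range(n_joints):
            if is_prismatic(i, j, k):
                result[(i, j, k)] = "prismatic"
            elif revolute_linear_ok(i, j, k) and revolute_nonlinear_ok(i, j, k):
                # 'and' short-circuits, so the expensive test runs last.
                result[(i, j, k)] = "revolute"
    return result
```

Triplets that pass none of the tests are simply left out of the result, which corresponds to markers that are not attached to consecutive links or to a joint signal that does not drive them.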
V-A Empirical results for general trajectories
In this section we investigate the effectiveness of the proposed approach for general trajectories by running Monte Carlo simulations. In particular, simulating different robot kinematics (with ), we obtained a dataset composed of time-series, each one accounting for 50 observations. Across the simulations we varied several parameters, such as the joint type order , the DH parameters and the marker positioning, namely and . As far as the input trajectories are concerned, we simulated a sinusoid for each joint signal, with amplitude and frequency randomly selected in each simulation.
For each time-series we have considered all the possible triplets, i.e., all combinations of a pair of markers and a joint signal, for a total number of triplets, and we computed the systems of equations defined by Propositions 1, 2 and 3 to verify whether the systems have a solution.
Results are reported in Figure 3 in the form of confusion matrices. As usual, the elements along the diagonal quantify the correctly classified triplets, while the off-diagonal elements quantify the misclassified ones. For example, when considering Proposition 1, the entry quantifies the cases in which the system defined by Proposition 1 holds true and the elements of the considered triplet identify a prismatic transformation, while the entry quantifies the cases in which the system of equations is not verified and the relation between the elements of the triplet is not prismatic. The entry , instead, represents the cases in which the system of equations holds even if the elements of the triplet are not connected, while the element represents the opposite situation.
Results confirm that the equations defined by Propositions 2 and 3 identify only a necessary condition when considered alone, since a significant number of false positives occurred. Indeed, the corresponding systems of equations may have solutions even if the two markers are not connected. On the other hand, when they are considered together, as done in Proposition 5, the number of false positives goes to zero, allowing a perfect classification. In short, empirical evidence suggests that the probability that the collected dataset contains only observations leading to a false positive is close to zero.
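A confusion matrix of the kind discussed above can be tallied with a few lines of code. The sketch assumes one boolean per triplet for the test outcome and one for the ground truth; the row and column layout chosen here is an illustrative convention and need not match the one used in Figure 3.

```python
def confusion_matrix(predicted, actual):
    """2 x 2 confusion matrix for a feasibility test over a set of triplets.

    predicted[k]: True if the system of equations holds for triplet k.
    actual[k]:    True if triplet k really describes the tested relation.
    Layout (an assumption): rows = actual (True, False),
                            columns = predicted (True, False).
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    tn = sum(1 for p, a in zip(predicted, actual) if not p and not a)
    return [[tp, fn], [fp, tn]]
```

With this layout, a test that is only a necessary condition shows up as a non-zero lower-left entry (false positives) together with a zero upper-right entry (no false negatives).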
In this paper we introduced a novel and computationally efficient algorithm able to learn the robot kinematic structure from visual observations. We showed the effectiveness of our approach in several simulated environments, testing different types of kinematics. As future work we plan to extend the proposed approach to the case of noisy measurements, and to test the algorithm on a real robot. In particular, we expect to replace the two linear systems with two linear least squares problems, and the non linear system with a non linear least squares problem, and to define a threshold on the error to discriminate between the cases in which the links are connected or not.
To prove Proposition 4 and 5, we need to introduce the relative transformations between two links and not subsequent in the kinematic chain. , the relative orientation between and is obtained iterating equation (1) along the kinematic chain, i.e. considering all transformations induced by the joints between and :
where is a function of if the joint is revolute.
As regards the position the following general equation holds
VII-A Proof of Proposition 4
Proposition 4 states that, when considering the excitation trajectories in Definition 1 and the relative minimal set of observations , if , and verify equations of Proposition 1, then and are subsequent in the kinematic chain, and the joint between them is prismatic with as actuation signal.
We prove this proposition by contradiction, showing that, if the system of equations defined by Proposition 1 holds, then the triple , and describes a prismatic transformation.
First of all we exclude the case in which there is at least one revolute joint between and . Without loss of generality let be the index of the link coming after the revolute joint. The relative orientation between the markers is , that, exploiting (14), becomes
If Equations of Proposition 1 are verified, the relative orientation between and is constant, and consequently, considering two different time instants and , , that implies
However, the last condition is not verified when the input trajectories are in . Indeed, when considering two input locations belonging to the subset of trajectories with and if , and , since they are functions of the signals, which assume the same values in and . This implies , and, by the uniqueness of the Euler angles, , which contradicts Definition 1.
The last observation proves that, when the joint signals are chosen in accordance with Definition 1, cannot be constant over if there is at least one revolute joint between the links. Consequently, to conclude the proof, we consider a configuration in which there are one or more prismatic joints between and . In these configurations the RF origin in the RF is
where the terms before the last sum are constant and equal to , while the ones in the last sum depend on the joint signals. As in (5), the last equation defines a linear system with coefficient matrix and vector of variables