Hybrid Data-Driven and Analytical Model for Kinematic Control of a Surgical Robotic Tool

06/04/2020 ∙ by Francesco Cursi, et al. ∙ Shanghai Jiao Tong University

Accurate kinematic models are essential for effective control of surgical robots. For tendon-driven robots, which are common in minimally invasive surgery, intrinsic nonlinearities are important to consider. Traditional analytical methods build the kinematic model of the system by making certain assumptions and simplifications about the nonlinearities. Machine learning techniques, instead, recover a more complex model from the acquired data. Analytical models are more generalisable but can be over-simplified; data-driven models, on the other hand, can capture more complex behaviour but are less generalisable, and the result is highly affected by the training dataset. In this paper, we present a novel approach that combines analytical and data-driven approaches to model the kinematics of nonlinear tendon-driven surgical robots. Gaussian Process Regression (GPR) is used for learning the data-driven model, and the proposed method is tested on both simulated data and real experimental data.




I Introduction

Robot kinematic modelling is the prerequisite of effective robot control. A good kinematic model makes it possible to control the robotic system properly, without requiring complex compensation strategies. The more accurate the robot model is, the more precise the control will be. Moreover, in cases where it is not possible to rely on external sensors to compensate for positioning errors (for example, when the camera is obstructed), accurate kinematic models are essential.

There exists a large variety of robotic structures, such as rigid-link articulated robots, flexible-link robots, continuum robots, and soft robots. Depending on the structure, different modelling techniques exist. Articulated robots with rigid links are usually modelled by using the Denavit-Hartenberg convention [Siciliano2009RoboticsControl]. Flexible-link robots [Cheong2004InverseApplications] are usually modelled by using Euler-Bernoulli beam theory and a set of generalized coordinates to describe the rigid and flexible motion. The models of continuum robots and soft robots are generally derived by using constant-curvature, variable-curvature, and Cosserat rod models [Burgner-Kahrs2015].

These models are often computed analytically, by means of mathematical formulations. However, analytical models are usually based on some assumptions and simplifications. Even though these simplifications make the modelling easier and more understandable, they lead to modelling errors that need to be properly compensated by the control strategy [Reinhart2017Hybrid].

Figure 1: The Micro-IGES robotic surgical tool.

Recently, there has been growing interest in the use of data-driven machine learning techniques for robot modelling. In robotics, machine learning has been widely used to accurately approximate models of robots without the need for analytical models, which may be hard to obtain due to the complexity of the system [Nguyen-Tuong2011]. Despite proving very powerful, data-driven approaches depend on the algorithm chosen and, most of all, on the training dataset [Nguyen-Tuong2011]. For the model to be accurate and generalizable, the training data must be gathered properly and should cover the input and output spaces as fully as possible. Moreover, bad data points such as outliers should be rejected in order to avoid improper modelling.


Robots for minimally invasive surgery usually have complex, highly articulated structures, so modelling their kinematics may be very challenging. Moreover, due to miniaturization, flexibility, and sterilization requirements, these robots are usually tendon-driven. Tendon transmission is a source of strong nonlinearities due to hysteresis, tendon elongation and slack, and friction. These nonlinearities are very hard to model, even though many studies have focused on building analytical models [Do2015a, Ismail2009, Do2014c, Do2017, varghese2020nonlinearity, Tjahjowidodo2016, Palli2012, Do2013]. On the other hand, other works have focused on modelling robotic systems by using data-driven approaches. Yu et al. [YuProbabilisticControl] used a Gaussian Mixture Model to build the kinematic model of a robotic catheter. A comparison of different approaches (Gaussian mixture models, k-nearest neighbour regression, and extreme learning machines) for modelling the inverse kinematics of a cable robot has also been presented.


However, combining analytical and data-driven models may leverage the advantages of both methods and thus improve the modelling. Thus far, little work has been focused on hybrid data-driven and analytical approaches.

Reinhart et al. [Reinhart2017Hybrid] utilized three different data-driven approaches to be mixed with the analytical model of a soft robot based on the constant curvature assumption. The hybrid model is built by using the data-driven approaches to learn the errors between the analytical model and the acquired data.

Nguyen-Tuong et al. [DuyNguyen-Tuong2010UsingDynamics] incorporated the known dynamic model of a robot into the prior of a Gaussian Process, either by using it in the process mean function or in the kernel function. In both works, the hybridization yields better modelling results and also improves the generalization of the model.

To the best of the authors' knowledge, these hybridization methods have rarely been applied in the field of minimally invasive surgery, where high accuracy is required. For instance, [Porto2019PositionAnalysis] exploited the knowledge of the kinematic model of a continuum robot to improve the learning of the inverse kinematics, compensating for hysteresis. This, however, required learning different forward and inverse models, and the robot was assumed to bend following the constant curvature assumption.

In this work we present a novel approach for combining analytical and data-driven approaches for modelling the forward kinematics of robots, with particular emphasis on the Micro-IGES [Shang2017] (Figure 1), a tendon-driven robotic surgical tool. The method utilizes Gaussian Process Regression (GPR) for the computation of the data-driven model. This regression method has been chosen for its ability to provide a confidence interval, indicating the uncertainty in the model. The contribution of the paper is therefore two-fold:

  • compare the results for kinematic modelling by using different approaches based on GPR;

  • present a novel method for kinematic modelling based on mixing data-driven and analytical methods.

The paper is structured as follows. Section II presents the robot under examination and describes the computation of the analytical, data-driven, and hybrid models, along with a brief introduction to Gaussian Process Regression. Section III presents the results for the forward kinematic modelling of the system. Different data-driven approaches are used for the modelling and their results are compared to those of the proposed hybrid approach. The methods are tested on both simulated and real data. Finally, conclusions are drawn in Section IV.

II Robot Forward Kinematic Models

The kinematic model of a robot is used to estimate the end-effector Cartesian pose (position and orientation), given some input joint values. In tendon-driven systems, however, the tendon transmission leads to nonlinearities in the motor-to-joint mapping, which are difficult to estimate analytically, thus requiring data-driven approaches to estimate the robot pose. In this Section a brief overview of the robot under consideration is presented. Then, the computation of the data-driven and analytical kinematic models is described. Finally, the approach to combine the two models to obtain a hybrid one is presented.

II-A Gaussian Process Regression

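Gaussian Process Regression is a nonparametric regression method that can approximate any nonlinear function $f(\cdot)$, where $y = f(x) + \epsilon$, $y$ is the vector of measured outputs, $x$ is the input vector, and $\epsilon$ is Gaussian noise with zero mean and variance equal to $\sigma_n^2$ [Murphy2012a, Rasmussen2006].

A Gaussian Process is completely defined by its mean function $m(x)$ and covariance function $k(x, x')$, and it defines the distribution of the data. Given the knowledge of the prior on the training set $(X, y)$ defined by $m(\cdot)$ and $k(\cdot, \cdot)$, it is possible to estimate the distribution on a test input set $X_*$, which is (1)

\[ f_* \mid X, y, X_* \sim \mathcal{N}\Big( m(X_*) + K(X_*, X)\,[K(X, X) + \sigma_n^2 I]^{-1}\,(y - m(X)),\; K(X_*, X_*) - K(X_*, X)\,[K(X, X) + \sigma_n^2 I]^{-1}\,K(X, X_*) \Big) \]

where $K(A, B)$ is the covariance matrix between the input values $A$ and $B$.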

The prior knowledge of the model can therefore be incorporated in the regression by defining the mean function of the Gaussian process [Rasmussen2006].
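As an illustration of how a prior mean enters the regression, the following is a minimal sketch of the standard GP posterior with a user-supplied mean function, written for scalar inputs and outputs. The squared-exponential kernel, its hyperparameters, the toy data, and the names `gp_predict` and `prior_mean` are assumptions made for this example; they are not taken from the paper or from the LimboGP library.

```python
import math

def sq_exp_kernel(a, b, length=0.5, signal_var=1.0):
    """Squared-exponential covariance between two scalar inputs (assumed kernel)."""
    return signal_var * math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z

def gp_predict(x_train, y_train, x_test, mean_fn, noise_var=1e-4):
    """Posterior mean and variance at one test input, with prior mean mean_fn."""
    n = len(x_train)
    # K(X, X) + sigma_n^2 I
    K = [[sq_exp_kernel(x_train[i], x_train[j]) + (noise_var if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    k_star = [sq_exp_kernel(x_test, xi) for xi in x_train]
    # Residual of the data with respect to the prior mean
    residual = [y - mean_fn(x) for x, y in zip(x_train, y_train)]
    alpha = solve(K, residual)
    mean = mean_fn(x_test) + sum(k * a for k, a in zip(k_star, alpha))
    v = solve(K, k_star)
    var = sq_exp_kernel(x_test, x_test) - sum(k * vi for k, vi in zip(k_star, v))
    return mean, max(var, 0.0)

def prior_mean(x):
    """Hypothetical simplified (analytical) model used as prior mean."""
    return 0.5 * x

# Toy data: the "true" system is the prior mean plus an unmodelled nonlinearity.
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [0.5 * x + 0.1 * math.sin(3.0 * x) for x in xs]

mean_near, var_near = gp_predict(xs, ys, 1.0, prior_mean)   # close to the data
mean_far, var_far = gp_predict(xs, ys, 10.0, prior_mean)    # far from the data
```

Near the training inputs the prediction follows the data with a small variance; far from them it falls back to the prior mean and the variance returns to the prior signal variance. This is the behaviour that makes the posterior variance usable as a confidence measure.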

II-B The Micro-IGES Robotic Surgical Tool

The Micro-IGES [Shang2017, Leibrandt2017] (Fig. 1) is a surgical robotic tool composed of a rigid shaft (27 cm) and a flexible section (39 mm at zero configuration). The shaft is responsible for the roll and translation DOFs. The articulated end, instead, consists of 2 elbows for pitch and yaw, with each elbow made of a pair of coupled joints, a 1-DOF revolute joint for the wrist pitch, and the jaws. The jaws provide two more DOFs: one for the wrist yaw and one for the gripper's opening/closing. Each of the joints of the articulated part is driven by an antagonistic pair of tendons, with each pair being connected to the corresponding driving capstan at the proximal drive unit. The coupling of the two pairs of joints of the elbows occurs at the driving unit: the two capstans that drive the two serial joints for each DOF of the elbow (pitch and yaw) are coupled by a series of gears with 1:2 ratio.

II-C Analytical Kinematic Model

In many applications, models of robot kinematics have been developed analytically. In order to control a robot, desired motor input values must be provided. These motor values are converted, by means of the motor transmission, to joint values, which in turn are mapped onto the end-effector Cartesian pose by the kinematic model of the robot. The analytical approach consists in finding mathematical relationships between the joint values and the Cartesian pose, and between the motor values and the joint values.

The joint-to-Cartesian mapping depends on the geometry of the robot. For articulated robots, the Denavit-Hartenberg convention can be used [Siciliano2008]. For continuum robots, instead, constant-curvature or variable-curvature models have been developed [Burgner-Kahrs2015, Camarillo2008, Bajo2012]. The motor-to-joint mapping, instead, depends on the type of motor transmission. In tendon-driven systems, analytical models of hysteresis, friction, and tendon elongation need to be formulated [Do2015a, Ismail2009, Do2014c, Do2017, varghese2020nonlinearity, Tjahjowidodo2016, Palli2012, Do2013].
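For reference, the joint-to-Cartesian mapping of an articulated robot under the classic Denavit-Hartenberg convention can be sketched as a product of per-joint homogeneous transforms. The two-link planar arm and its parameters below are hypothetical and serve only to exercise the formula; they are not the Micro-IGES parameters.

```python
import math

def dh_transform(theta, d, a, alpha):
    """Homogeneous transform of one link (classic DH convention)."""
    ct, st = math.cos(theta), math.sin(theta)
    ca, sa = math.cos(alpha), math.sin(alpha)
    return [
        [ct, -st * ca,  st * sa, a * ct],
        [st,  ct * ca, -ct * sa, a * st],
        [0.0,      sa,       ca,      d],
        [0.0,     0.0,      0.0,    1.0],
    ]

def matmul4(A, B):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def forward_kinematics(joint_angles, dh_rows):
    """Tip position for revolute joints; dh_rows holds (d, a, alpha) per joint."""
    T = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    for q, (d, a, alpha) in zip(joint_angles, dh_rows):
        T = matmul4(T, dh_transform(q, d, a, alpha))
    return [T[0][3], T[1][3], T[2][3]]

# Hypothetical planar arm with two unit-length links.
planar_arm = [(0.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
tip = forward_kinematics([math.pi / 2.0, 0.0], planar_arm)
```

With the first joint at 90 degrees and the second at zero, both links point along the y axis, so the tip lies two link lengths from the base.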

Since the Micro-IGES is an articulated robot, the joint to tip-pose mapping is computed by means of the Denavit-Hartenberg convention. For the motor-to-joint mapping, a hysteresis model is included as described in [Leibrandt2017]. The analytical model therefore maps the current motor state and the motor velocity at the previous state to the end-effector Cartesian position. Both the current and the previous motor velocities are needed to compensate for the hysteretic behaviour.
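To see why the direction of motion matters in the motor-to-joint mapping, consider a simple backlash (play) element. The sketch below is a generic textbook play operator with an assumed dead-band half-width; it is not the hysteresis model of [Leibrandt2017].

```python
def backlash_step(u, y_prev, half_width):
    """One update of a play operator: the output moves only once the input
    has crossed the dead band around the previous output."""
    return max(u - half_width, min(u + half_width, y_prev))

def simulate_backlash(inputs, half_width, y0=0.0):
    """Run the play operator over a sequence of motor positions."""
    outputs, y = [], y0
    for u in inputs:
        y = backlash_step(u, y, half_width)
        outputs.append(y)
    return outputs

# Motor sweeps up to 1.0, reverses slightly, then comes back to 0.5.
motor = [0.5, 1.0, 0.95, 0.5]
joint = simulate_backlash(motor, half_width=0.1)
```

The same motor position 0.5 yields a joint value of 0.4 on the way up and 0.6 on the way down, and the small reversal from 1.0 to 0.95 does not move the joint at all. A static map from motor position alone cannot represent this, which is why velocity information is included in the model inputs.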

II-D Data-Driven Kinematic Model

Despite being very generalizable, analytical models are usually based on some approximations and assumptions. The errors in the modelling may lead to wrong or poor control. In tendon-driven systems, especially, the nonlinearities in the motor-to-joint mapping are very difficult to model analytically, and the analytical approximations may not be sufficiently accurate.

Data-driven approaches, on the other hand, build models of the system from the acquired data. For the computation of the data-driven model, Gaussian process regression is used for its ability to provide a confidence interval of the model.

In order to compensate for the nonlinearities in the Micro-IGES system, the data-driven model maps an input vector to the tip position. The input vector includes the motor velocities and accelerations in order to better compensate for hysteretic effects and friction. Since the position has three Cartesian components, three independent Gaussian Processes need to be used, and each component of the data-driven model is the posterior mean value obtained from the corresponding Gaussian Process. Each predicted position component is also associated with a variance. The LimboGP library [Cully2016Limbo:Optimization] for C++ has been used for the Gaussian Process Regression.

II-E Hybrid Kinematic Model

Both analytical and data-driven methods have their advantages and disadvantages. Analytic models are typically more generalizable and can be applied to different scenarios. Nevertheless, they are often based on simplifications. Data-driven models, instead, allow for more complex modelling, but they rely on the acquired data. This makes the generalization more difficult, due to poor dataset exploration, and may lead to wrong models if data contains outliers [Cursi2019ALearning]. Therefore, to improve the accuracy of the model, a combination of the two may be necessary.
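Let $x$ be a single input vector, $\hat{p}_{dd}(x)$ the data-driven model associated with $x$, and $p_{an}(x)$ the analytical model. The hybrid model is then computed as (4)

\[ p_{hyb}(x) = W\,p_{an}(x) + (I - W)\,\hat{p}_{dd}(x) \]

where $W = \mathrm{diag}(w_1, w_2, w_3)$ is a weighting diagonal matrix, with $w_i \in [0, 1]$. The input values required by the analytical model can be computed from $x$.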

Each diagonal component of the weighting matrix is computed as (5)


Equation (5) shows that in regions where the data-driven model is more uncertain (high variance) the weight tends to 1, thus favouring the analytical model. On the contrary, if the uncertainty of the data-driven model is low, the data-driven model is preferred. Moreover, if the error between the two models is high, then the data-driven model is preferred.

The value is defined as , where is a desired threshold. This threshold can have different values for each Cartesian component. This regularization is needed because, if the data-driven model is very uncertain, the analytical approach may be favoured, even if the error is high. Therefore, large values of will tend to give more importance to the data-driven model.
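The qualitative behaviour described above can be sketched for a single Cartesian component. The hybrid value is taken as a convex combination `w * p_an + (1 - w) * p_dd`, which is what the description of the weights implies. The weight itself is an assumed stand-in chosen only to reproduce the stated trends (it tends to 1 for high variance, decreases when the error between the models is large, and decreases for larger thresholds); it is not the weighting function of (5), and the names `hybrid_component`, `var_dd`, and `threshold` are hypothetical.

```python
def hybrid_component(p_an, p_dd, var_dd, threshold):
    """Blend the analytical and data-driven predictions of one component.

    ASSUMPTION: w = var / (var + lam), with lam = max(|error|, threshold)^2.
    This is an illustrative weight, not the one defined in the paper.
    """
    error = abs(p_dd - p_an)
    lam = max(error, threshold) ** 2
    w = var_dd / (var_dd + lam)
    return w * p_an + (1.0 - w) * p_dd, w

p_an, p_dd = 0.100, 0.120  # hypothetical predictions, in metres

# Confident GP: the hybrid follows the data-driven model.
confident, w_confident = hybrid_component(p_an, p_dd, var_dd=1e-12, threshold=0.005)

# Very uncertain GP: the hybrid falls back to the analytical model.
uncertain, w_uncertain = hybrid_component(p_an, p_dd, var_dd=1e6, threshold=0.005)

# Same variance, larger threshold: more importance to the data-driven model.
_, w_small_thr = hybrid_component(p_an, p_dd, var_dd=1e-4, threshold=0.005)
_, w_large_thr = hybrid_component(p_an, p_dd, var_dd=1e-4, threshold=0.05)
```

Any monotone weight with the same limits would show the same switching behaviour; the choice above merely keeps the arithmetic easy to follow.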

III Results

Figure 2: Window showing the noisy data used for the learning process in simulation. The red line is the actual data; the blue line is the data with added noise used for the learning.
Figure 3: Model comparison on a subset of the simulated learning data with both a precise and a wrong analytical model. For each case, the three different GPR and hybridization approaches are shown. The shaded regions indicate a 95% confidence interval of the GP models.

In this Section the results for modelling the kinematics of the Micro-IGES robotic surgical tool are presented. Different approaches are compared for the computation of the data-driven models. Moreover, two different analytical models (a wrong model and a more precise one) are used to test the results from the hybrid models.

III-A Simulated Forward Kinematic Model

The tendon transmission in the Micro-IGES robot causes many nonlinearities due to friction, hysteresis, tendon elongation, etc. To validate the proposed method, a simulated environment in VREP [VREP] is used. The robot is assumed to exhibit linear backlash at the motor side. Data for the learning process are acquired by commanding excitation trajectories to the robot and retrieving the tip position from the simulator. Each motor is excited as (6)


where , . The parameters , are computed by solving an optimization problem that maximizes the amplitude of the wave while satisfying the position, velocity, and acceleration limits of each motor during the whole motion. For each motor is either 0 or 1, depending on whether the motor is moving or not. In total combinations of motions are performed.

Some noise is then added to each Cartesian component. The noise is Gaussian with zero mean and standard deviation of . Figure 2 shows a window of the data used for the learning. In total 19499 samples were collected. For the learning, however, only 1000 random samples are used because of the limitations of GPR.
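The data preparation described here (additive zero-mean Gaussian noise followed by a random draw of 1000 samples) can be sketched as follows. The noise standard deviation, the synthetic data, and the function names are assumptions for illustration; the value used in the experiments is not reproduced here.

```python
import random

def add_gaussian_noise(positions, std, rng):
    """Add independent zero-mean Gaussian noise to each Cartesian component."""
    return [[c + rng.gauss(0.0, std) for c in p] for p in positions]

def subsample(inputs, outputs, n_samples, rng):
    """Draw n_samples input/output pairs without replacement, keeping pairs aligned."""
    idx = rng.sample(range(len(inputs)), n_samples)
    return [inputs[i] for i in idx], [outputs[i] for i in idx]

rng = random.Random(0)  # fixed seed for repeatability

# Synthetic stand-in for the collected dataset (19499 samples).
all_inputs = [[float(i)] for i in range(19499)]
clean_tip = [[0.001 * i, 0.0, 0.0] for i in range(19499)]

noisy_tip = add_gaussian_noise(clean_tip, std=1e-3, rng=rng)  # std is hypothetical
train_in, train_out = subsample(all_inputs, noisy_tip, 1000, rng)

# Alignment check on the clean data, where the pairing is easy to verify.
check_in, check_out = subsample(all_inputs, clean_tip, 1000, rng)
```

Exact GPR requires factorizing a covariance matrix whose size equals the number of training points, with a cost that grows cubically, which is a common reason for training on a subset.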

In order to learn the kinematic model of the robot three different approaches are considered for the data-driven models. Consequently, three different hybrid models are also computed:

  • Error Learning: the error between the analytical model and the data is learned by using GPR.

    In this case the data-driven model can be written as $\hat{p}_{dd} = p_{an} + \hat{e}$, where $p_{an}$ is the analytical model and $\hat{e}$ is the error between the data and the analytical model learned through GPR. The hybrid model, according to (4), is $p_{hyb} = p_{an} + (I - W)\,\hat{e}$, with $W$ the diagonal weighting matrix. This means that the error compensation is activated or deactivated based on the level of uncertainty.

  • GP with Prior: the analytical model is used as the mean function of the prior distribution. From (1), the data-driven model is computed as $\hat{p}_{dd} = p_{an} + K_*\,\alpha$ [DuyNguyen-Tuong2010UsingDynamics], where $\alpha = [K + \sigma_n^2 I]^{-1}(y - p_{an}(X))$, $K_*$ and $K$ are the covariances of the training inputs $X$ with the test input and with themselves, and $p_{an}(X)$ is the value of the analytical model on the training set. The resulting hybrid model is then $p_{hyb} = p_{an} + (I - W)\,K_*\,\alpha$. In this case, the contribution of the prior distribution is affected not only by the values in the input space (by means of the covariance matrices), but also by the output values.

  • GP without Prior: the input-output mapping is computed without any prior knowledge. The hybrid model is computed as $p_{hyb} = W\,p_{an} + (I - W)\,\hat{p}_{dd}$.
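The simplification in the error-learning case can be checked with a short derivation. The notation ($p_{an}$ for the analytical model, $\hat{p}_{dd}$ for the data-driven model, $\hat{e}$ for the learned error, $W$ for the diagonal weighting matrix with entries $w_i$) is assumed here for illustration, and the hybrid model is taken to be the weighted combination implied by the description of the weights.

```latex
\begin{align}
p_{hyb} &= W\,p_{an} + (I - W)\,\hat{p}_{dd} \\
        &= W\,p_{an} + (I - W)\,(p_{an} + \hat{e}) \\
        &= p_{an} + (I - W)\,\hat{e}
\end{align}
% Limit cases, per Cartesian component i:
%   w_i -> 1 (uncertain GP):  p_{hyb,i} -> p_{an,i}              (compensation off)
%   w_i -> 0 (confident GP):  p_{hyb,i} -> p_{an,i} + \hat{e}_i  (compensation on)
```

The weight therefore acts as a per-component gate on the learned correction, which matches the statement that the error compensation is activated or deactivated by the level of uncertainty.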

Wrong Analytical
            GP (data-driven)                                Hybrid
            Error learning   With prior   Without prior     Error learning   With prior   Without prior
RMSE wrt ground truth
x           0.0022           0.0024       0.0041            0.0025           0.0026       0.0041
y           0.0024           0.0027       0.0045            0.0026           0.0029       0.0044
z           0.0016           0.0018       0.0037            0.0022           0.0023       0.0035
RMSE wrt analytical model
x           0.0039           0.0038       0.0055            0.0032           0.0032       0.0052
y           0.0037           0.0035       0.0057            0.0031           0.0029       0.0054
z           0.0023           0.0021       0.0043            0.0008           0.0007       0.0038

Correct Analytical
            GP (data-driven)                                Hybrid
            Error learning   With prior   Without prior     Error learning   With prior   Without prior
RMSE wrt ground truth
x           0.0021           0.0021       0.0041            0.0014           0.0014       0.0037
y           0.0019           0.0027       0.0045            0.0010           0.0017       0.0041
z           0.0008           0.0018       0.0037            0.0007           0.0012       0.0032
RMSE wrt analytical model
x           0.0020           0.0018       0.0041            0.0012           0.0010       0.0037
y           0.0018           0.0022       0.0045            0.0007           0.0012       0.0042
z           0.0007           0.0016       0.0037            0.00005          0.0009       0.0032
Table I: RMSE (in m) between the computed models and the ground truth of the simulated model on the learning dataset.
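
Wrong Analytical
            GP (data-driven)                                Hybrid
            Error learning   With prior   Without prior     Error learning   With prior   Without prior
x           0.0085           0.0092       0.0201            0.0073           0.0089       0.0201
y           0.0096           0.0098       0.0236            0.0083           0.0087       0.0236
z           0.0056           0.0060       0.0155            0.0056           0.0069       0.0155

Correct Analytical
            GP (data-driven)                                Hybrid
            Error learning   With prior   Without prior     Error learning   With prior   Without prior
x           0.0072           0.0011       0.0201            0.0066           0.0011       0.0201
y           0.0062           0.0096       0.0236            0.0052           0.0096       0.0236
z           0.0028           0.0054       0.0155            0.0026           0.0047       0.0155
Table II: Maximum absolute error (in m) between the computed models and the simulated model on the learning dataset.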
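For clarity, the two metrics reported in the tables can be computed per Cartesian component as in the sketch below. The sample values are made up for the example and are not taken from the experiments.

```python
import math

def rmse(predicted, reference):
    """Root-mean-square error between two equally long sequences."""
    n = len(predicted)
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(predicted, reference)) / n)

def max_abs_error(predicted, reference):
    """Largest absolute deviation between two equally long sequences."""
    return max(abs(p - r) for p, r in zip(predicted, reference))

# Hypothetical x-component predictions against a reference (in metres).
reference_x = [0.000, 0.000, 0.000]
predicted_x = [0.000, 0.003, -0.004]

rmse_x = rmse(predicted_x, reference_x)
max_err_x = max_abs_error(predicted_x, reference_x)
```

The RMSE summarizes the average fit, while the maximum absolute error exposes the worst single prediction, so a model can improve in one metric and not in the other.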

The results of the proposed method are then tested on two different cases: with a wrong analytical model, and with a more precise analytical model. Figure 3 and Table I show the results for the different approaches on the learning dataset.

In the case of the wrong analytical model, no backlash compensation is used and incorrect link lengths are assumed. In the second case, instead, correct measures and the same backlash compensation used for gathering the data are considered. In both cases, the threshold value in the weighting function (5) has been set to for each Cartesian component.

Comparing the three data-driven methods, results show that learning the error yields better results, with smaller RMSE on both datasets. The pure data-driven model (GP without prior), instead, always performs the worst.

When hybridization is used, adding the analytical model as in (4), results vary depending on the accuracy of the model provided. If a wrong analytical model is used, the RMSE between the computed models and the real one is slightly higher than in the case without hybridization. As a matter of fact, the hybridization with the chosen parameters (weighting function, threshold, process confidence interval) makes the model tend more toward the provided analytical model. In this case too, though, the hybrid model built from error learning performs the best, whereas the one built from the GP without prior performs the worst. However, as shown in Table II, the maximum absolute error is reduced (or at least not increased) for most of the models by adding the hybridization.

If, on the other hand, a more accurate model is used, the hybridization leads to much better results. The RMSE values are much smaller than in the case without hybridization. As expected, the maximum absolute errors also decrease.

In order to further validate the proposed method, the robot was required to perform an additional testing motion. The motion is described by:


with and being the maximum and minimum motor position values. The end-effector tip position was collected during the motion. In this case, the hybridization was performed with the more precise analytical model. Results in Table III show again that learning the error has better results than the other data-driven approaches. Moreover, the hybridization leads to improved results for all cases.

            GP (data-driven)                                Hybrid
            Error learning   With prior   Without prior     Error learning   With prior   Without prior
RMSE wrt ground truth
x           0.0025           0.0024       0.0040            0.0018           0.0015       0.0034
y           0.0018           0.0032       0.0056            0.0012           0.0025       0.0053
z           0.0005           0.0011       0.0036            0.0005           0.0008       0.0029
Maximum Absolute Error wrt ground truth
x           0.0071           0.0051       0.0090            0.0056           0.0037       0.0088
y           0.0047           0.0085       0.0124            0.0041           0.0084       0.0124
z           0.0012           0.0022       0.0076            0.0012           0.0012       0.0076
Table III: RMSE and maximum absolute error (both in m) between the computed models and the ground truth of the simulated model on the testing motion.

III-B Real Data Acquisition

In order to build the kinematic model of the real robot, the tool tip position needs to be acquired. For this purpose, an Intel RealSense stereo camera has been used. The excitation trajectories commanded to the motors for the data acquisition are similar to those of the simulated experiments in (6), but with a maximum frequency of . Also, only 4 degrees of freedom are considered (Roll, Elbow1, Elbow2, Wrist Pitch). For learning the data-driven model, 1000 samples are used, randomly taken from the whole collected dataset.

We employ the CSRT Tracker [Lukezic2016DiscriminativeReliability] to track the tip position of the robot from 2D images in real time. This 2D position is then projected into 3D using the associated depth images and the focal length of the depth camera as follows:

\[ X = \frac{(u - c_x)\,d}{f_x}, \qquad Y = \frac{(v - c_y)\,d}{f_y}, \qquad Z = d \]

where $(u, v)$ is the tip position in the 2D image; $d$ is the depth value from the associated depth image; and $f_x$, $f_y$ and $c_x$, $c_y$ are the intrinsic parameters of the camera. Figure 4 shows a snapshot of the motion during data collection.
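The 2D-to-3D projection can be sketched with the standard pinhole camera model, assuming an undistorted image. The intrinsic values and pixel coordinates below are hypothetical and do not correspond to the camera used in the experiments.

```python
def back_project(u, v, depth, fx, fy, cx, cy):
    """Map a pixel (u, v) with known depth to a 3D point in the camera frame,
    using the pinhole model (no lens distortion assumed)."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)

# Hypothetical intrinsics: focal lengths and principal point in pixels.
FX, FY, CX, CY = 600.0, 600.0, 320.0, 240.0

# Tracked tip 100 px to the right of the principal point, 0.5 m from the camera.
tip_cam = back_project(420.0, 240.0, 0.5, FX, FY, CX, CY)
```

A pixel at the principal point maps onto the optical axis, and the lateral offset scales linearly with depth, so depth noise directly affects all three coordinates of the reconstructed tip.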

Figure 4: Snapshot of the motion for the data collection tracking the tip position.

The collected tip positions, however, are expressed in the camera frame. Since all the measurements need to be expressed in the robot base frame, camera calibration is used to map the collected tip position from the camera frame to the robot frame.
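Once the calibration provides a rotation and a translation between the two frames, mapping a camera-frame point into the robot base frame is a single rigid transform. The rotation `R_robot_cam` and translation `t_robot_cam` below are hypothetical calibration results used only to illustrate the operation.

```python
def to_robot_frame(p_cam, rotation, translation):
    """Apply p_robot = R * p_cam + t to a 3D point (nested-list matrices)."""
    return [sum(rotation[i][j] * p_cam[j] for j in range(3)) + translation[i]
            for i in range(3)]

# Hypothetical calibration: camera rotated 90 degrees about the robot z axis
# and offset from the robot base (metres).
R_robot_cam = [[0.0, -1.0, 0.0],
               [1.0,  0.0, 0.0],
               [0.0,  0.0, 1.0]]
t_robot_cam = [0.1, 0.0, 0.2]

tip_robot = to_robot_frame([1.0, 0.0, 0.0], R_robot_cam, t_robot_cam)
```

Any error in the estimated rotation or translation shifts every measured tip position in the same systematic way, which is one reason the learned model can inherit calibration errors.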

III-C Real Robot Kinematic Model

Figure 5: Comparison of the different models on a subset of the dataset from the real experiment, both for the learning (upper row) and the test motion (bottom row). All units are in m. The shaded regions indicate a 95% confidence interval of the GP models.

Because learning the error between the analytical model and the collected data gave the best results, this approach is employed for retrieving the data-driven model of the real robot. For the hybridization, the precise analytical model (with backlash compensation and correct link lengths) is employed and the thresholds in (5) are chosen as , respectively for . The upper row of Figure 5 shows the results comparing the different models (analytical, GP, hybrid) on a subset of the collected learning dataset.

As expected, due to unmodelled nonlinearities, the analytical model does not always behave properly. When some complex motion is commanded, for instance when several joints are moving together, the analytical model differs largely from the collected data. Under some simpler motions, instead, the analytical model is satisfactory. The data-driven model explains the collected data well, with an RMSE for each component of and a maximum absolute error of .

Nevertheless, relying solely on the data-driven model may lead to wrong behaviours, due to errors in the camera calibration or noise in the collected data. When hybridization is added, the RMSE of the hybrid model with respect to the collected data increases to . However, the maximum absolute error remains unchanged. This is because in regions where the analytical model is good, it is preferred or, at least, it influences the hybrid model. Elsewhere, the hybrid model stays closer to the data-driven model, because the analytical model cannot describe the system appropriately.

To further validate the models, a test motion is also performed. The commanded motor values are computed from the analytical model by imposing a Cartesian tip trajectory that follows a lemniscate of Bernoulli, described as


The initial and final configurations are set to , , , and the execution time to . For the hybridization, the same thresholds as in the learning experiments are used. The lower row of Figure 5 shows the results on a subset of the data. In this case too, the analytical model does not provide accurate results, especially for the direction. Its RMSE and maximum absolute errors are and . However, the GP model is not very accurate either, with RMSE and maximum error of and . The larger error values of the learned model indicate the limited ability of the data-driven approach to generalize with the provided learning dataset. When the hybridization is included, the resulting model is slightly smoother than the data-driven model, yielding RMSE and maximum error values of and . This shows that where the analytical model behaves well, the hybridization improves the performance of the data-driven approach, as in the direction. Conversely, where it behaves poorly, the performance of the data-driven approach is worsened. Nevertheless, given the large confidence intervals, the hybrid model appears to be a reasonable compromise between the analytical and the GP model.
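For readers unfamiliar with the test path, a lemniscate of Bernoulli can be sampled with its standard parametric form, as sketched below. The plane, size `a`, centre, and number of points are assumptions for illustration and are not the values used in the experiment.

```python
import math

def lemniscate_point(t, a, center=(0.0, 0.0)):
    """Point of a lemniscate of Bernoulli with half-width a at parameter t."""
    denom = 1.0 + math.sin(t) ** 2
    x = center[0] + a * math.cos(t) / denom
    y = center[1] + a * math.sin(t) * math.cos(t) / denom
    return x, y

def lemniscate_path(n_points, a, center=(0.0, 0.0)):
    """Closed figure-eight path sampled at n_points + 1 parameter values."""
    return [lemniscate_point(2.0 * math.pi * k / n_points, a, center)
            for k in range(n_points + 1)]

A = 0.01  # hypothetical half-width of the figure eight, in metres
path = lemniscate_path(100, A)
```

Every sampled point satisfies the implicit equation $(x^2 + y^2)^2 = a^2 (x^2 - y^2)$, and the path crosses its centre twice per cycle, which forces repeated direction reversals and therefore excites the hysteretic behaviour of the transmission.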

IV Conclusions

In order to have accurate control, a precise robot kinematic model is needed. The design and the actuation, however, may lead to complex systems with many nonlinearities. Surgical robots, for example, are usually highly articulated (possibly continuum) and tendon-driven, which gives rise to many nonlinearities in the kinematic model.

Analytical methods describe the robot kinematic model based on some assumptions and simplifications. Being based on mathematical formulations, they are usually very generalizable. On the other hand, data-driven approaches build much more complex models from the acquired data. Nevertheless, they are less generalizable and the results are highly affected by the acquired dataset. In this work, GPR has been used for the data-driven modelling for its ability to provide a confidence interval for the accuracy of the results.

Hybrid approaches may leverage the advantages of both methods, thus providing better modelling. The method presented in this work mixes the analytical and data-driven models, giving more importance to one or the other depending on the level of confidence of the GPR model: where the data-driven model is uncertain, the analytical model is preferred. Different methods for building the data-driven models have been used (GP for error learning, GP with prior, GP without prior). Results show that learning the error between the acquired data and the provided analytical model provides better results. The poorest model is the one obtained from GPR without any prior knowledge. When using the proposed hybridization method, results show that the final model is affected by the provided analytical model. Wrong analytical models may result in poorer models. However, when a more precise analytical model is employed, the modelling errors are greatly reduced.

As noticed from the experiments on the real robot, the proposed method is capable of giving more importance to the analytical model or to the data-driven one, depending on their accuracy with respect to the collected data. The effect of the hybridization, however, depends on the parameters of the weighting function.

Future work will focus on improving the hybridization with a better way to set the parameters of the weighting function and, subsequently, on employing the hybrid model to control the robot in trajectory tracking and surgical task automation.