Mobile robots deployed in real-world applications such as autonomous delivery must perform robust and accurate real-time localization. While visual localization and SLAM have made significant progress in recent decades, most current algorithms are general-purpose and not tailored to particular robotic systems; that is, their design is often independent of the robot. However, a robotic system can provide informative state constraints through its dynamics and/or kinematics, which should be exploited when designing localization algorithms for the robot at hand. With this in mind, in this paper we develop a kinematics-constrained visual-inertial localization algorithm for skid-steering robots, which tightly fuses low-cost camera, IMU, and odometer sensors to provide high-precision real-time localization in 3D.
Visual-inertial sensors are becoming ubiquitous, and many general-purpose visual-inertial navigation algorithms have been developed in recent years (e.g., see [Li2013high, li2014high, qin2018vins, xing2017photometric, Eckenhoff2019IJRR]), which has motivated an increasing number of deployments of such sensor suites on real robotic systems [Hartley2018IROS]. Due to their low cost and complementary sensing capabilities, we also employ them in our proposed skid-steering robotic system (see Fig. 1). Note that, instead of having an explicit steering mechanism, skid-steering robots turn by adjusting the speeds of the left and right tracks. The simplicity of the mechanical design and the ability to turn with zero radius have made such robots popular in scientific research and development.
Due to the popularity of skid-steering robots, substantial research efforts have focused on motion dynamics modeling, control, and planning [martinez2005approximating, huskic2017path, pentzer2014use]. In particular, yi2009kinematic introduced a simple dead-reckoning (DR) method for skid-steering robots, while wang2018terrain relied on accurate GPS for localization (which clearly is not applicable if GPS is unavailable or unreliable). Closest to this work, wu2017vins recently proposed a visual-inertial localization method for wheeled vehicles that directly uses an odometer’s 2D linear/angular velocity measurements. While this approach is well suited to a standard differential-drive robot, significant effort on kinematic modeling and fusion may be required to deploy it on skid-steering robots; if the skid-steering kinematics are ignored, localization performance degrades.
To address these issues and promote visual-inertial localization for skid-steering robots, in this paper we design, for the first time, a tightly-coupled visual-inertial estimation algorithm that fully exploits the robot’s kinematic constraints, based on the instantaneous center of rotation (ICR) martinez2005approximating, and efficiently offers 3D localization solutions. In particular, to compensate for the time-varying ICR model parameters (e.g., due to slippage and terrain roughness), we explicitly model and estimate online the kinematic parameters of a skid-steering robot. To this end, leveraging our significant prior work on visual-inertial odometry li2014online; Li2013high, we develop an efficient sliding-window bundle adjustment (BA)-based estimator to optimally fuse measurements from a camera, an IMU, and wheel encoders. Moreover, we perform a detailed observability analysis, showing that the kinematic parameters are all observable under general motion, whereas observability does not hold when the IMU is not used; this is important for estimator design.
2 Related Work
As there is a rich literature on mobile robot localization Cadena2016TRO, we by no means intend to provide a comprehensive review of this topic and instead focus on wheeled robots here. For example, censi2013simultaneous performed pose estimation with online calibration of the wheel odometry parameters (the radii of the left and right wheels as well as the distance between them) for a two-wheeled differential-drive robot, while scaramuzza20111 introduced a camera-based localization algorithm for Ackermann model-based wheeled robots. As mentioned earlier, wu2017vins developed a sliding-window EKF to probabilistically fuse the measurements from wheel encoders, an IMU, and a monocular camera to provide 6DOF motion estimates. yap2011particle solved a similar problem with a particle filter-based method. However, all of these methods assume that the linear and angular velocities of a robot can be directly computed from wheel encoder readings, which is not the case for skid-steering robots.
A skid-steering robot often uses the ICR positions of the treads to model its motion dynamics martinez2005approximating. Since it was found empirically that the ICR parameters vary slightly under the same terrain conditions martinez2005approximating, additional parameters were introduced for better modeling. For example, huskic2017path used additional scale variables to allow accurate path following, and martinez2017inertia additionally modeled sliding, eccentricity, and steering efficiency. Note that ICR is not the only model for skid-steering robots; there are many others. For instance, reina2016slip modeled the distance between the left and right treads and integrated it into terrain classification. sutoh2018motion modeled the ratio of the velocities of the left and right wheels as an exponential function of the ratio of the left and right wheel encoder readings, and estimated these exponential parameters during terrain navigation.
Depending on the sensors used and the application scenarios, different localization algorithms for skid-steering robots have been developed in the recent literature. In particular, lv2019fvo proposed a method that uses images to correct the heading of a skid-steering robot, but it requires parallel and perpendicular lines, which are mainly found in human-made environments, and it did not provide a detailed description of how the wheel encoder measurements were integrated. IMU measurements are typically used together with wheel encoder readings to provide motion tracking of skid-steering robots. For example, yi2009kinematic used an IMU on the skid-steering robot to perform both trajectory tracking and slip estimation, and lv2017indoor fused measurements from wheel encoders, a gyroscope, and a magnetometer to localize the skid-steering robot. GPS measurements, if available, have also been fused with an EKF pentzer2014model, in which the ICR locations were modeled as part of the state vector and estimated online. Specifically, the wheel encoder measurements were used for the EKF pose prediction and the GPS measurements were used for the EKF update. wang2018terrain combined both GPS and IMU measurements, using GPS to perform high-precision navigation and relying on accelerometer measurements for terrain classification. In contrast, in this paper we focus on skid-steering robot localization with low-cost multi-modal sensors while integrating kinematic constraints.
3 ICR-based Kinematics of Skid-Steering Robots
In this work, we employ the ICR parameters martinez2005approximating to approximately model the kinematics of a skid-steering robot. Specifically, as shown in Fig. 1, the model comprises the ICR position of the robot frame and the ICR positions of the left and right wheels.
Throughout this paper (footnote 1), the robot is equipped with a camera, an IMU, and wheel odometers, whose frames are denoted by , , and , respectively, while refers to the global frame of reference. and denote the 3DOF position and rotation of frame with respect to . We use and to represent the estimate of a random variable and its error state. The symbol is used to denote the inferred measurement mean value of .
The relation between the wheel odometer measurements and the ICR parameters can be derived as follows:
where and are the linear velocities of the left and right wheels, and are the robot’s local linear velocities along the and axes defined in Fig. 1, and denotes the local rotational speed. Moreover, we introduce two additional scale factors, , to compensate for possible effects such as those due to tire inflation and interface roughness. With the scale factors and Eq. 1, we can express the motion variables as:
where , and is the entire set of kinematic parameters.
Interestingly, as a special configuration when , with being the distance between left and right wheels, Eq. 2 can be simplified as:
This is the kinematic model for a wheeled robot moving without slippage (e.g., a differential-drive robot), and it is used by most existing work on localizing wheeled robots wu2017vins; quan2018tightly. However, for the skid-steering robots under consideration, directly applying Eq. 3 would significantly degrade the localization accuracy (see Section 6). It is important to point out that, as cannot remain constant across different motions and terrains martinez2005approximating; huskic2017path, we perform online “calibration” to estimate these kinematic parameters along with the navigation states, as in li2014online; censi2013simultaneous; Li2013high (see Section 4.2).
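To make the ICR model and its differential-drive special case concrete, the following Python sketch maps wheel speeds to body velocities. It is an illustration under stated assumptions, not the paper's exact formulation: the names `x_v`, `y_l`, `y_r`, `alpha_l`, `alpha_r`, `b`, and both function names are hypothetical, and the sign convention (ICR abscissa equal to minus the lateral velocity divided by the yaw rate) follows the common ICR definition and may differ from the one used in Eq. 1 and Eq. 2.

```python
def icr_body_velocity(o_l, o_r, x_v, y_l, y_r, alpha_l=1.0, alpha_r=1.0):
    """Map left/right wheel speeds to body-frame (v_x, v_y, omega).

    Assumed (hypothetical) notation: x_v is the x-coordinate of the robot
    ICR, y_l and y_r are the y-coordinates of the left and right tread
    ICRs, and alpha_l, alpha_r are the wheel scale factors.
    """
    delta_y = y_l - y_r
    if delta_y == 0.0:
        raise ValueError("y_l and y_r must differ")
    s_l = alpha_l * o_l  # scaled left wheel speed
    s_r = alpha_r * o_r  # scaled right wheel speed
    omega = (s_r - s_l) / delta_y
    v_x = (y_l * s_r - y_r * s_l) / delta_y
    v_y = -x_v * omega  # assumed sign convention, see the note above
    return v_x, v_y, omega


def diff_drive_body_velocity(o_l, o_r, b):
    """Special case: ICRs at +/- b/2, no lateral offset, unit scale factors."""
    return icr_body_velocity(o_l, o_r, x_v=0.0, y_l=b / 2.0, y_r=-b / 2.0)
```

With equal wheel speeds the special case gives pure forward motion, and with opposite wheel speeds it gives a turn in place, which is the behavior expected of an ideal differential-drive model.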
4 Kinematics-Constrained Visual-Inertial Localization
We develop a sliding-window BA estimator for the proposed kinematics-constrained visual-inertial localization of a skid-steering robot equipped with a camera, an IMU, and wheel encoders. For simplicity, although it is not necessary, we assume known extrinsic transformations between the sensors. At each time step, we optimize the following window of states, the oldest of which is typically marginalized out when moving to the next window in order to bound the computational cost:
In the above expression, denotes the cloned poses in the sliding window at time . represents the 6DOF pose of the robot at time . We choose the odometer frame as the base sensor frame, and the system is initialized at the initial position of the odometer, with the direction of aligned with gravity. contains all the 3D global positions of visual features. are the IMU velocity in the global frame, the acceleration bias, and the angular velocity bias, respectively. Note that we estimate the ICR kinematic parameters online and thus include them in the state as well. Lastly, denotes the parameters related to the motion manifold constraints, which enforce locally smooth, planar ground motion. As illustrated in Fig. 2, the sliding-window BA is our estimation engine, whose cost function includes the following constraints:
which includes the prior of the states remaining in the current sliding window after marginalization Eckenhoff2019IJRR, the projection error of visual features, the IMU integration constraints Eckenhoff2019IJRR; Li2013high, the odometer-induced kinematic constraints, and the motion manifold constraints.
4.1 Visual-Inertial Constraints
In the sliding-window BA, only keyframes are optimized, which are selected based on a simple heuristic: the odometer prediction has a translation or rotation over a certain threshold (e.g., 0.2 meter and 3 degrees in our experiments). For computational savings, non-keyframes are discarded, unlike existing methods qin2018vins; leutenegger2015keyframe, which first extract features and then analyze the distribution of the features for keyframe selection. Among the keyframes in the window, corner feature points are extracted rosten2006machine and tracked by the KLT optical flow algorithm lucas1981iterative. The standard reprojection errors of the tracked features comprise the visual cost in (5) as in Eckenhoff2019IJRR; Li2013high.
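The keyframe heuristic described above can be sketched in a few lines of Python. This is a minimal illustration that assumes the odometer prediction is already summarized as a translation magnitude and a rotation angle since the last keyframe; the function name `is_keyframe` and its parameter names are hypothetical, and only the default thresholds (0.2 m, 3 degrees) come from the text.

```python
import math


def is_keyframe(delta_translation_m, delta_rotation_rad,
                trans_thresh_m=0.2, rot_thresh_deg=3.0):
    """Return True if the odometer-predicted motion since the last keyframe
    exceeds either the translation or the rotation threshold."""
    moved_enough = delta_translation_m > trans_thresh_m
    turned_enough = abs(delta_rotation_rad) > math.radians(rot_thresh_deg)
    return moved_enough or turned_enough
```

Because the decision uses only the odometer prediction, no image features need to be extracted for frames that end up being discarded.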
On the other hand, the IMU measurements between any two consecutive keyframes are integrated and form the inertial constraints across the sliding window Eckenhoff2019IJRR; Li2013high:
where is the IMU state at time , and denote the IMU acceleration and angular velocity measurements between and , respectively. represents the inverse covariance (information) of the IMU prediction .
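As a small illustration of how an information matrix weights a constraint, the sketch below evaluates an information-weighted squared residual. It assumes the usual Mahalanobis form (residual transposed, times information matrix, times residual); the exact cost in (5) and (6) is not reproduced here, and the function name `weighted_squared_error` is hypothetical.

```python
def weighted_squared_error(residual, information):
    """Compute r^T * Lambda * r for a residual given as a list of floats and
    an information (inverse covariance) matrix given as a nested list."""
    n = len(residual)
    if len(information) != n or any(len(row) != n for row in information):
        raise ValueError("information must be an n x n matrix")
    return sum(residual[i] * information[i][j] * residual[j]
               for i in range(n) for j in range(n))
```

A larger information entry (smaller covariance) makes the same residual component cost more, which is how confident predictions dominate the optimization.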
4.2 ICR-based Kinematic Constraints
We now derive the ICR-based kinematic constraints based on the wheel encoders’ measurements of the skid-steering robot. Specifically, by assuming the supporting manifold of the robot is locally planar between and , the local linear and angular velocities, and , are a function of the wheel encoders’ measurements of the left and right wheels and as well as the ICR kinematic parameters [see (2)]:
where is the selection matrix with being a unit vector with the th element of 1, and are the odometry noise modeled as zero-mean white Gaussian. Once the instantaneous local velocities of the robot are available, with the initial conditions and , we can integrate the following differential equations in the time interval :
This integration will result in the relative pose , which is then used to propagate the global pose from to :
Additionally, we model the ICR kinematic parameter as a random walk to capture its time-varying characteristics:
where is zero-mean white Gaussian noise.
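The integration of local velocities into a relative pose can be sketched as follows, under the locally planar assumption stated above. This is a simplified illustration only: it uses a midpoint-heading discretization with a fixed time step, which is a choice made here and not necessarily the integration scheme of the paper, and the function name `integrate_planar_pose` is hypothetical.

```python
import math


def integrate_planar_pose(velocities, dt):
    """Integrate body-frame (v_x, v_y, omega) samples into a relative planar
    pose (x, y, theta), starting from the identity pose.

    Each sample is assumed constant over one step of length dt.
    """
    x = y = theta = 0.0
    for v_x, v_y, omega in velocities:
        # Evaluate the heading at the middle of the step.
        theta_mid = theta + 0.5 * omega * dt
        c, s = math.cos(theta_mid), math.sin(theta_mid)
        # Rotate the body-frame velocity into the start frame and step.
        x += (c * v_x - s * v_y) * dt
        y += (s * v_x + c * v_y) * dt
        theta += omega * dt
    return x, y, theta
```

For example, driving forward at 1 m/s for one second yields a displacement of about one meter along x, and a full circle at constant speed and yaw rate returns to the starting position.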
Based on the ICR-based kinematic model (9) and (10), we predict the pose and kinematic parameters at the newest keyframe time , , by integrating all the intermediate odometer measurements . As a result, the odometer-induced kinematic constraint can be generically written in the following form:
where represents the inverse covariance (information) obtained via covariance propagation. Specifically, the discrete-time linearized kinematic model of the error state at , , corresponding to (9) and (10) at time can be found as follows:
where , and is the error-state transition matrix which is given by:
where are non-zero blocks, corresponding to positional and rotational elements with respect to and scale factor , and is the noise Jacobian matrix. Due to space limitations, the detailed derivations of these matrices can be found in our companion technical report tr_icr.
Additionally, as the skid-steering robot navigates on the ground surface, its positions within a short period of time should be well modeled by a quadratic polynomial zhang2019large:
where are the manifold parameters. Note also that the roll and pitch of the ground robot should be consistent with the normal of the motion manifold (ground surface), which can be expressed as follows:
where denotes the first and second rows of the skew-symmetric matrix of the 3D vector . At this point, the motion manifold constraint for all the poses in the current sliding window can be written as:
5 Observability Analysis
An important prerequisite for the proposed localization algorithm to work properly is that the skid-steering kinematic parameter vector, , is locally observable (or identifiable) Bar-Shalom1988. (Footnote 2: Since the derivative of is modeled as zero-mean Gaussian noise, we use observability and identifiability interchangeably here.) Therefore, in this section, we provide a detailed observability analysis. It is also interesting to investigate the observability properties when the proposed method is applied with a monocular camera and odometer only (without an IMU). This examines whether skid-steering robots can be localized with a reduced number of sensors, and emphasizes the importance of our choice to add the IMU.
5.1 Observability of the kinematic parameters with a monocular camera and odometer
To conduct our analysis, we follow the idea of li2014online, in which the information provided by each sensor is first investigated and then combined to derive the final results. In this way, ‘abstract’ measurements instead of ‘raw’ measurements are used for analysis, which greatly simplifies our derivation. A moving monocular camera is able to provide information on rotation and up-to-scale position with respect to the initial camera frame hartley2003multiple; li2014online. Equivalently, a moving camera provides the following two types of measurements: (i) the camera’s angular velocity and (ii) its up-to-scale linear velocity:
Here, are white noises, and and are the true angular and linear velocities of the camera with respect to the global frame, expressed in the camera frame, respectively. Finally, is an unknown scale factor. We also note that, since the camera-to-odometer extrinsic parameters are precisely known in advance, the camera measurements can be further denoted as:
We will later show that this will simplify the analysis.
On the other hand, as mentioned in Sec. 3, the odometer provides observations of the speeds of the left and right wheels, i.e., and , respectively. By linking , , , , and the kinematic parameter vector together, the observability properties can be analyzed in detail. We also note that, during the observability analysis, the zero-mean noise terms are ignored, since they do not change our conclusions.
By ignoring the noise terms, the following equation holds:
where are the first and second elements of , and are the first and second elements of . For brevity, we use to denote the third element of . By defining , and , we can write
Note that, this equation only contains 1) sensor measurements, and 2) a combination of vision scale factors and skid-steering kinematics:
The identifiability of can be described as follows:
Lemma. By using measurements from a monocular camera and wheel odometers, is not locally identifiable.
Proof. is locally identifiable if and only if is locally identifiable:
By expanding Eq. 22, we can write the following constraints:
A necessary and sufficient condition for to be locally identifiable is that the following observability matrix has full column rank van2009identifiability:
By defining the th block columns of , the following equation holds:
which demonstrates that is not of full column rank. This completes the proof. ∎
5.2 Observability of the kinematic parameters with a monocular camera, an IMU, and odometer
When an IMU is added, the ‘abstract’ measurement of visual-inertial estimation can also be derived. Visual-inertial estimation provides the camera’s local (i) angular velocity and (ii) linear velocity, as in the vision-only case (Eq. 19) but without the scale ambiguity li2014online. As in Sec. 5.1, to simplify the analysis, we prove identifiability of instead of :
Lemma. By using measurements from a monocular camera, an IMU, and wheel odometers, is locally identifiable, except for the following degenerate cases: (i) the velocity of one of the wheels, or , remains zero; (ii) remains zero; (iii) , , and are all constant; (iv) is always proportional to .
Proof. Similarly to Eq. 23, by removing the scale factor, the constraints become:
The observability matrix for then becomes:
which can be simplified by linear operations:
There are four special cases in which is not of full column rank: (i) the velocity of one of the wheels, or , remains zero; (ii) remains zero; (iii) , , and are all constant; (iv) is always proportional to . If none of these conditions holds, is of full column rank. This reveals that, under general motion, is locally identifiable. This completes the proof. ∎
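The role of motion excitation can be illustrated numerically with a simplified, discrete-sample analogue of the rank test. This is not the observability matrix of the proof (which is not reproduced here). The sketch assumes an ICR forward model in which, for the left tread, the forward velocity equals the tread ICR coordinate times the yaw rate plus the scale factor times the wheel speed, so the two unknowns enter linearly and are recoverable from samples only if the stacked regressor matrix of yaw rates and wheel speeds has rank two. All names below are hypothetical.

```python
def gram_determinant(omegas, wheel_speeds):
    """Determinant of A^T A, where row k of A is [omega_k, wheel_speed_k]."""
    s_ww = sum(w * w for w in omegas)
    s_oo = sum(o * o for o in wheel_speeds)
    s_wo = sum(w * o for w, o in zip(omegas, wheel_speeds))
    return s_ww * s_oo - s_wo * s_wo


def tread_parameters_identifiable(omegas, wheel_speeds, tol=1e-9):
    """True if the two-column regressor matrix has (numerically) full rank,
    i.e., the tread ICR coordinate and scale factor can both be recovered."""
    return gram_determinant(omegas, wheel_speeds) > tol
```

Under this simplified model, the test fails when the wheel speed stays at zero, when the yaw rate stays at zero, when both signals are constant, or when one is always proportional to the other, which is loosely analogous to the kind of degenerate motions listed above.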
6 Experimental Results
As shown in Fig. 1, our experiments were conducted with two skid-steering robots equipped with both ‘localization’ sensors and ‘ground-truth’ sensors. For the ‘localization’ sensors, we used a Hz monocular global shutter camera at a resolution of , a Hz Bosch BMI160 IMU, and Hz wheel odometers. The ‘ground-truth’ sensor is mainly RTK-GPS, which reports Hz data when the signal is reliable. The accuracy of RTK-GPS is at the centimeter level.
The first experiment demonstrates the improvement in localization accuracy obtained by estimating (Eq. 2) online. As shown in Fig. 3, we conducted experiments in different environments: (a) lawn, (b) cement brick, (c) wooden bridge, (d) muddy road, (e) asphalt road, (f) ceramic tiles, (g) carpet, and (h) wooden floor. Fig. 4 shows the trajectory and visual features estimated by the proposed method on sequence "CP01-2019-05-27-14-50-49", in which the robot traversed outdoors and indoors. Since the GPS signal is not available in all tests (e.g., indoor tests), we used the final drift as the first error metric. To make this possible, we started and terminated each experiment at the same position. Two algorithms were implemented in this test: 1) the proposed one, which explicitly estimates , and 2) one using Eq. 3 without modeling . (Footnote 3: In fact, Eq. 3 can be considered a one-parameter approximation of skid-steering kinematics if is probabilistically estimated.)
In Table 1, we show the final drift values on 20 representative sequences. Since we used two robots, we use the prefixes “CP01” and “CP02” to denote the robots. Table 1 clearly demonstrates that when the skid-steering kinematic parameters are estimated online, the localization accuracy is significantly improved. In fact, in almost half of the tests, the errors are reduced by approximately an order of magnitude. This validates our claim that, to use odometer measurements of skid-steering robots, the complicated mechanism must be explicitly modeled to avoid accuracy loss.
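Because each run starts and ends at the same physical position, the final drift can be computed from the estimated trajectory alone. The sketch below assumes the trajectory is a list of estimated 3D positions in a common frame; the function name `final_drift` is hypothetical.

```python
import math


def final_drift(positions):
    """Distance between the first and last estimated positions of a run that
    physically starts and ends at the same place."""
    if len(positions) < 2:
        raise ValueError("need at least two positions")
    return math.dist(positions[0], positions[-1])
```

This metric needs no external reference, which is why it remains usable in indoor tests where GPS is unavailable.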
In some sequences where GPS signals were available, we also evaluated the positional root mean square error (RMSE) Bar-Shalom1988. To compute it, we interpolated the estimated poses to obtain those corresponding to the timestamps of the GPS measurements. The RMSEs are shown in Table 2, which demonstrates that estimating is beneficial for trajectory tracking. Trajectory estimates on representative sequences are shown in Fig. 5.
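The interpolation and RMSE computation can be sketched as follows. This is a minimal illustration that assumes linear interpolation of positions, sorted and distinct estimate timestamps, GPS timestamps that fall within the estimated time span, and GPS and estimated positions already expressed in the same frame; the function names are hypothetical and the alignment procedure actually used in the experiments is not specified above.

```python
import bisect
import math


def interpolate_position(times, positions, t):
    """Linearly interpolate a position at time t from sorted timestamps."""
    if not times[0] <= t <= times[-1]:
        raise ValueError("t is outside the estimated time span")
    i = bisect.bisect_right(times, t)
    if i == len(times):  # t coincides with the last timestamp
        return tuple(positions[-1])
    t0, t1 = times[i - 1], times[i]
    w = (t - t0) / (t1 - t0)
    return tuple(a + w * (b - a) for a, b in zip(positions[i - 1], positions[i]))


def position_rmse(est_times, est_positions, gps_times, gps_positions):
    """Positional RMSE between interpolated estimates and GPS positions."""
    squared = [
        math.dist(interpolate_position(est_times, est_positions, t), g) ** 2
        for t, g in zip(gps_times, gps_positions)
    ]
    return math.sqrt(sum(squared) / len(squared))
```

Interpolating the estimate, rather than the GPS track, keeps the reference measurements untouched and only evaluates the estimator at instants where ground truth exists.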
6.1 Convergence of Kinematic Parameters
In this section, we present tests that demonstrate the convergence of under general motion. Unlike the experiments in the previous section, where relatively good initial values of the kinematic parameters were used, here we manually set ‘bad’ initial values. Specifically, we added the following error terms to the initial kinematic parameters used in the previous section (the good values): . We carried out tests on the outdoor sequence "CP01-2019-05-08-17-50-51" and the indoor sequence "CP01-2019-05-27-14-41-33", neither of which involved changes of terrain type on the fly. Fig. 6 shows the estimates of the kinematic parameters along with the corresponding uncertainty envelopes. The results demonstrate that the kinematic parameters quickly converge to their correct values and change only slowly for the rest of the trajectory. The uncertainty envelopes also shrink quickly. These results match our theoretical expectation that is locally identifiable under general motion.
In this paper, we have developed a novel kinematics-constrained visual-inertial localization method specialized for skid-steering robots, in which a tightly-coupled sliding-window BA serves as the estimation engine for fusing multi-modal measurements. In particular, we have explicitly modeled the kinematics of skid-steering robots using both track ICRs and scale factors, in order to compensate for complex track-to-terrain interactions, imperfections in the mechanical design, and variations in terrain smoothness. Moreover, we have carried out a careful observability analysis, showing that the kinematic parameters are observable under general motion. Extensive real-world validations confirm that online estimation of the kinematic parameters significantly improves localization.