Sensor Aware Lidar Odometry

by   Dmitri Kovalenko, et al.

A lidar odometry method, integrating into the computation the knowledge about the physics of the sensor, is proposed. A model of measurement error enables higher precision in estimation of the point normal covariance. Adjacent laser beams are used in an outlier correspondence rejection scheme. The method is ranked in the KITTI's leaderboard with 1.37 achieved in comparison with the LOAM method on the internal dataset.



There are no comments yet.


page 1


LLOL: Low-Latency Odometry for Spinning Lidars

In this paper, we present a low-latency odometry system designed for spi...

Recalibrating the KITTI Dataset Camera Setup for Improved Odometry Accuracy

Over the last decade, one of the most relevant public datasets for evalu...

Fail-Aware LIDAR-Based Odometry for Autonomous Vehicles

Autonomous driving systems are set to become a reality in transport syst...

LiDAR Odometry Methodologies for Autonomous Driving: A Survey

Vehicle odometry is an essential component of an automated driving syste...

LIMO: Lidar-Monocular Visual Odometry

Higher level functionality in autonomous driving depends strongly on a p...

Unsupervised Learning of Lidar Features for Use in a Probabilistic Trajectory Estimator

We present unsupervised parameter learning in a Gaussian variational inf...

Scene-Aware Error Modeling of LiDAR/Visual Odometry for Fusion-based Vehicle Localization

Localization is an essential technique in mobile robotics. In a complex ...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

The driverless transportation is becoming a technological dream on a verge of turning a day-to-day reality, as more competitors are joining to [yandex_in_vegas] the pursuit. Early prototypes appeared decades ago [leonard1991simultaneous], while DARPA Urban Challenge could be considered an important milestone. AV (Autonomous Vehicle), hardware and models of the victorious team are highlighted in [urmson2007tartan] with an emphasis on the crucial role that a lidar plays in providing that level of autonomy. The comprehensive analysis [fletcher2008cornell] of a low-speed collision between MIT’s AV "Talos" and Cornell’s AV "Skynet" occured during the challenge puts the positioning error as a one of causes. The lidar odometry is a subject of this work and our main contributions are:

  • point cloud filtering by the covariance of point normal, informed of the measurement error, specific for a lidar sensor

  • scheme for the false matches rejection between consecutive point clouds, based on a neighbor beam flight distance metric

  • reliable performance in the realistic environment is demonstrated on KITTI [geiger2012we] and on the internal dataset, collected from the autonomous fleet facing intense modes of operation every day (Fig. 1) in comparison with LOAM, state-of-the-art competitor [zhang2014loam]

Fig. 1: Environments included in the internal dataset. Camera images are used for visualization purposes, the method operates on lidar scans only

2 Related work

Driver-assist systems (autonomy level 3 [SAE_ADAS]) tend to avoid 3D SLAM (Simultaneous Localization and Mapping) by reliance on GNSS, odometry and the lane detection. However, level 4 AVs often benefit from data stored in a HD map. This study is concerned with a level 4 setup, which dictates the separation of localization and mapping because doing the latter repetitively on an every car in a fleet in a same location is wasteful.

A substantial body of reseach exists on a problem of egomotion estimation and SLAM, yet the application to AVs imposes an unique set of challenges, as the environment is:

  • unprepared for autonomous navigation and is unfeasible to be retrofitted with markers

  • feature-poor: highways, tunnels, wastelands

  • dynamic: vegetation growth, snowbanks accumulation, construction works

  • hard to perceive: the weather, time of a day and an annual season change influence

  • rich with occlusions of background features by other vehicles

The degenerative effect of dense urbanisation on the SLAM performance was shown in [wen2018performance]. This challenge is addressed by graph optimization [grisetti2010tutorial], achieving the map global consistency. However, to reduce the computation time and decrease the chance of convergence to a suboptimal local minima, the better maps for optimization are called for. The input maps are built with odometry, which is the objective of our study.

The comparison [sokolov2017analysis] of lidar SLAM with the visual counterparts favours the former. The top of the KITTI’s leaderboard is also occupied by lidar methods. Laser range measurements tend to show a higher reliability and lower dependence on luminance than the depth measurements produced by cameras, which drives our focus on a lidar.

The seminal contribution is Iterative Closest Point [besl1992method]

, a method to register a relative 3D transformation between point clouds of arbitrary shape, based on the least squares. The closest point correspondence estimation is replaced by a grid lookup, and Euclidian loss is replaced by Mahalanobis in Normal Distributions Transform 

[biber2003normal]. Novel learning-based methods are proposed [velas2018cnn][cho2019deeplo], but do not perform on par yet. Reflectance intensity images with depth are reconstructed from range scans in [dong2014lighting] and then used as an input for the visual odometry algorithm. Contrary to that, many works [deschaud2018imls][neuhaus2018mc2slam],  [ji2018cpfg],  [behley2018efficient] rely solely on the geometric properties of registered laser point clouds.

The continuous time was found benefitial by H. Alismail et al in [alismail2014continuous] as it allows to solve simultaneously and explicitely for the pose and the point cloud drift, which happens due to an egomotion during the sensor spin time. The proposed method does not estimate the point cloud drift correction, as the wheel odometry data could be employed for that.

Optimal registration algorithms [cai2019practical] guarantee convergence without initialization, but are prohibitively expensive to compute. The landmark-based registration is advocated in [li2010extracting] on grounds of decreasing a SLAM graph density. In spite of this valid concern our method operates directly on point clouds, because the dense approach provides a greater robustness in feature-poor environments.

Experimental results are evaluated against an another feature-based approach, LOAM [zhang2014loam], which holds the top result on KITTI’s leaderboard111 at the time of writing this paper. LOAM makes use of two types of features (corner-like and plane-like) associated with dedicated losses (point-to-line, point-to-plane). These point-to-plane losses may include outlier entries because plane normal estimates are computed with three points each and could contain noise, while the proposed method estimates normals robutly and filters point clouds by normal covariances.

The rest of the paper is organized as follows: Sec. 3 introduces the mathematical model of the proposed method, experimental results are layed down in Sec. 4, then Sec. 5 concludes with the summary and further work.

3 Lidar Odometry

The odometry task could be stated formally as the problem of determining a relative rigid-body 3D transformation , which brings the sensor coordinate frame at a time of the current range scan (source point cloud ) registration to the sensor coordinate frame at a time of the previous range scan (target point cloud ) registration.

The proposed method is built on the ICP framework [besl1992method]. We proceed with detailing specific choices made for every subroutine mentioned in Alg. 1 without a dive into the discussion of available alternatives, which are covered by the excellent review [rusinkiewicz2001efficient] and the best-practices for choosing among them [pomerleau2013comparing].

Initialization of the transformation (initial guess) practically is derived from the independent sensors available: IMU, wheel odometry. However, this work considers the lidar-only odometry and uses the linear extrapolation from past states for initialization.

Filtration of Points is crucial as the data rate of a modern lidar would hinder the computational performance. Depending on the lidar model and the configuration of the environment 30K-150K points are registered within one range scan , while only are passed on through the filter (Alg. 1, line 1) to the model as the point cloud . The filtration steps are:

  • voxel grid filter, passing on centroids of occupied voxels

  • normal estimation and normal covariance filtration, covered in detail in Sec. 3.1

  • local curvature filter: points with a higher curvature are discarded (estimated as in [pcl_curvature])

Matching of Points is done with a k-d tree, selecting a closest point in the target cloud. These preliminary correspondences undergo the rejection of outliers:

  • geometric correspondence rejector, highlighted in Sec. 3.2

  • remove of matches having the largest distance (introduced by D. Chetverikov et al in [chetverikov2002trimmed])

Transformation Estimation is performed by the uniformly weighted linearized least squares optimization with the point-to-plane metric, which was shown to be generally superior to point-to-point by F. Pomerleau et al in [pomerleau2013comparing].

Termination Criteria is a combination of thresholds on the absolute and relative magnitude of transformation increment with the threshold on an iterations count.

0:  , previous and current lidar point clouds, transformation initialization
0:  , estimated transformation
4:  while not  do
10:  end while
11:  return  
Algorithm 1 Estimate the transformation between consecutive range scans with ICP

With that the ICP framework is defined and the proposed models for point filtration and matching follow.

3.1 Normal Covariance Filtration

The filter introduced in Alg. 1, line 1, takes a raw range scan and outputs , where each point has an estimated normal associated with it, as well as the normal covariance:

, as the points having high-uncertainty normals are rejected. The reason is miscalculated normals impair transformation estimation precision, as individual losses would be off.

The normal

is estimated using SVD (Singular Value Decomposition) in a

-neighborhood of the point

, so when a substantial fraction of points is sampled from an another surface, which could happen in many boundary cases, SVD would yeild a normal vector, that is not describing a true plane well. The higher values of the last singular value are indicative of the lower confidence in the normal orientation. However, such a model discounts the fact that the point coordinates are not certain and are subject to the measurement error.

We propose a normal covariance filter (NCF), which accounts for the uncertainty in point coordinates, treating it as Gaussian random variables. The measurement error is approximated by a sphere with the standard deviation


For the notation clarity, until the rest of the section the lower indices are used (as in ) to access a matrix entry and a single lower index to denote a matrix column , counting indices from .

The SVD input is an array of unbiased local points, sampled around from , i.e. the data matrix . The SVD output is , where , , , and , are orthogonal.

When entries of are random variables, is as well. The covariance of the plane normal is estimated by the propagation of uncertainty technique:


The data matrix covariance was informally introduced above and is . The function is a sequence of computations, yielding from . To avoid the appearance of 3-dimensional Jacobian matrix, is reformulated w.r.t. , simplified further, given the form of :


Then the intermediate matrix is defined:


where , and defines a matrix, having zeros everywhere except -th element. With Eq. 3 the Jacobian w.r.t. to is obtained:


For the full derivation of the SVD Jacobian please refer to [candes2013unbiased],  [papadopoulo2000estimating].

Recovered in Eq. 1, lies in the sensor coordinate frame, hence it should be rotated into an alignment with before the threshold may be applied:


The matrix is a covariance of in a normal-aligned coordinate frame computed in Eq. 5 through eigendecomposition. Having for an each point in a range scan , the points associated with the higher normal uncertainty are discarded: . The filtration result is a range scan with precise normals, utilized on later stages of the ICP framework.

3.2 Geometric Correspondence Rejector

For a current range scan the set of correspondences is found by a search in the previous range scan (Alg. 1, line 5). The proposed model yields filtered correspondences , where is a point in the current range scan, participating in the correspondence with a point from the previous range scan.

The test isInlier is conducted independently for an every correspondence , characterized by the Euclidian distance between matched points: . The test description follows:


The matching distance in compared to an upper bound of neighbor beam flight distances, or, shortly, , highlighted red in Fig. 2.

The neighbors of , , are four hypothetical points, which might have been registered by lasers, vertically adjacent to the one registering . Conventionally for spinning lidars, these laser levels are called rings, so if is a part of the ring , then it’s neighbors are on the adjacent rings , , shown in purple and orange in Fig. 2.

The model introduces two neighbors on an each of the adjacent rings assuming that "left" neighbors were registered at a previous increment of lidar azimutal rotation, at a current, and "right" neighbors at a next one.

For the calibrated lidar, all angular distances between and it’s neighbors are easily found. The calibration parameters are:  (an angular increment of lidar azimutal rotation), , are angular distances in pitch between rings.

Fig. 2: A laser beam, producing point (black), it’s neighbor beams from the upper ring (purple) and the lower ring (orange) together define a plane (gray), intersected by the green surface orthogonal to normal at . The highest among distances between planes along neighbor laser beams (red) is used as the rejection threshold.

Hence, with it’s neighbors define a segment of the unit-sphere. This sphere is scaled to and the segment is approximated by a gray plane on Fig. 2, in turn assuming neighbor lidar beams parallel. This assumption was found admissable for lidars in our use, for the hardware parameters are close to : , .

The diagonal vector between and on the approximated plane (shown in cyan in Fig. 2) is obtained as follows:


The vectors , define absciss and ordinate axes of coordinate frame. With proceed to the neighbor beam flight distance metric:


The intuition behind the metric is a distance between the -scaled laser beam sphere surface and a local reflecting surface along the laser beam (grey and green surfaces in Fig. 2). Matches having

higher than that are probably made with points laying even further away from

than and should be discarded.

4 Experiments

The thorough evaluation of the proposed method is provided. Two parts of this exist: the dedicated study of the geometric correspondence rejector and the integrated test of the proposed lidar odometry. The former is a comparison of the proposed rejector with the Euclidian distance rejector while the latter is a test with three ICP odometry variants and the competitor apporach, LOAM, on two real-world AV benchmarks, renown KITTI and the internal dataset, taken during Yandex own AV fleet operation.

Fig. 3: Translation error of ICP point cloud alignment with different outlier rejectors. The proposed in Sec. 3.2 GCR scheme (geom) outperforms the baseline – distance-based rejection (dst) on every noise level. Noise parameters are in meters and degrees.

4.1 Outlier Correspondence Rejection

The geometric correspondence rejector (GCR) is evaluated against the baseline, which is the Euclidian distance rejector. The test is run on point cloud pairs from the internal dataset. The ground truth for the test is relative transformations between pairs, refined by the graph optimization.

In an every test instance the initial guess is the composition of the ground truth transformation with a random noise. The method’s ability to recover the correct transformation is evaluated for several noise levels. The proposed method have shown the performance, superior to the baseline, as in Fig. 3. Only distributions of translation estimation errors are included as the rotation errors were found to be very correlated with it.

The noise transformation have two parameters ,  (magnitude limits in rotation and translation) and is sampled as follows: generate two random unit vectors on a sphere, where the former is the rotation axis and the latter is the translation axis , rotate for about and translate for along .

4.2 Lidar Odometry Evaluation

The study of the fully integrated odometry method follows with the baseline (BL), where the Euclidian distance rejector is used within the ICP framework, described in Alg. 1 along with the common approach for points filtration, following Sec. 3.

The baseline method is used as a control in a performance evaluation of LOAM, the proposed method (SALO) and an intermediate variant, integrating NCF into BL. GCR was not evaluated separately from NCF, because it relies on the quality of point cloud normals, which is ensured by NCF.

The internal Yandex dataset of total length of 16 minutes, 12 km was compiled for this study. It was taken in different cities and seasons of the year with a 32-beam lidar, mid-grade single-band GNSS with RTK. GNSS signal frequency was increased by the Kalman filter interpolation with IMU and wheel odometry to be used as the ground truth in this evaluation. The removal of dynamic objects from lidar scans was not performed.

Detailed results for all three methods and the competitor are showcased in Tab. I with trajectory visualizations in Fig. 4.

A. Geiger argues in [geiger2012we] that the absolute metric accounts errors non-uniformly based on a time passed since the start of the sequence. We adhere to KITTI’s standard and use the same metric and the official devkit in all evaluations, with a minor change: the reference pose is set metres away instead of averaging across , to foster the results interpretability. The other motivation for that is space constraints and we may suggest readers to refer to our submission on the benchmark website [geiger2012we] for more results.

Along with error averages, standard deviations are listed and it may be noted that the lower mean error always coincides with the lower deviation.

It was shown before [sun2018dlo] and is confirmed now on the internal dataset and KITTI (Fig. 4.d) that the community-supported implementation of LOAM [community_loam_velodyne] does not perform accordingly to the version, competing in the KITTI’s leaderboard and tends to degrade in many cases. Contrary to that, the proposed method produces a trajectory close to the ground truth, accumulating little drift over 7 minutes (Fig. 4.d).

The proposed method achieves 27% and 45% drift reduction on KITTI and internal dataset respectively along with an even stronger error variance reduction.

The avarage processing time of a point cloud is  s. Studied variants do not vary substantially in computational cost, which mostly comes from the normal estimation.

a) internal #02 b) internal #06 c) internal #07 d) KITTI #00
Fig. 4: SALO (green) is compared to the ground truth (red), baselines and LOAM on the internal dataset (a-c) and KITTI (d). SALO shows lower drift in all scenarios except b), while the community-supported implementation of LOAM[community_loam_velodyne] does not perform on a level reported in literature.
Training set of KITTI
Method 00 01 02 03 04 05 06 07 08 09 10 Overall

Internal Dataset
Method 00 01 02 03 04 05 06 07 08 Overall

BL is a baseline odometry, as described in Alg. 1, NCF – the baseline enhanced with the normal covariance filtration, as in Sec. 3.1, SALO is the proposed method, i.e. BL+NCF+GCR.

TABLE I: Mean Relative Positioning Error per 100 m

5 Conclusions

The MIT/Cornell collision during DARPA Grand Urban Challenge is the first AV road incident, but regretfully would not be the last. We share our experience in designing odometry solution for self-driving car hoping to foster safety and reliabilty of the autonomous fleets across the globe. In this work ICP lidar odometry performance boosted by the normal covariance point cloud filter and by the geometric outlier correspondence rejector. These routines jointly enable for cleaner point clouds with better correspondences yielding better convergence properties, which is demonstrated on KITTI and internal dataset with relative positioning errors of and respectively. Further pursuit is an implementation of a local map and fast map query mechanisms to allow for implicit loop closures. Another conceived improvement is mixing point-to-point and point-to-plane losses depending on the local structure, as well as the preparation of internal dataset for public release.

We are grateful to our colleague Sergey Krutikov for his invaluable contribution into this project.