LiDAR Odometry Methodologies for Autonomous Driving: A Survey

Vehicle odometry is an essential component of an automated driving system, as it computes the vehicle's position and orientation. The odometry module is in higher demand and has greater impact in urban areas, where the global navigation satellite system (GNSS) signal is weak and noisy. Traditional visual odometry methods suffer under diverse illumination conditions, and the small errors made during pose estimation accumulate into significant drift. Odometry using light detection and ranging (LiDAR) devices has attracted increasing research interest, as LiDAR devices are robust to illumination variations. In this survey, we examine the existing LiDAR odometry methods, summarize their common pipeline, and delineate its intermediate steps. Additionally, we categorize the existing methods by their correspondence type and analyze their advantages, disadvantages, and correlations both across and within categories at each step. Finally, we compare the accuracy and running speed of these methodologies on the KITTI odometry dataset and outline promising future research directions.

I Introduction

Vehicle odometry is a crucial component of the vehicle localization module in an automated driving system. In contrast to GNSS/INS [71] based localization, which requires external signals (e.g., the GNSS signal), vehicle odometry takes advantage of local sensors' readings to track the vehicle's movement, yielding a reliable measurement in scenarios where the external signals are blocked or highly noisy. For instance, in urban canyons, tunnels, and valleys, the GNSS signals are highly noisy due to multi-path errors, whereas the odometry modules are not affected. Additionally, odometry methods localize the vehicle in three dimensions, which facilitates vehicle localization on multi-level roads, where 2D GNSS/INS systems are easily confused.

Traditionally, vehicle odometry algorithms [53], [35], [73], [59], [31], [63], [58], [51], [74], [62], [3] are based on camera frames. However, visual odometry algorithms have several drawbacks: (1) their performance is subject to variations in illumination, and (2) the odometry is estimated in image coordinates, which are not homogeneous to the world coordinates. By contrast, light detection and ranging (LiDAR) devices [46] actively emit laser beams, which avoids the effects of illumination changes, and measure range directly in the world coordinates, making them ideal for vehicle odometry tasks.

In recent decades, LiDAR-based vehicle odometry has attracted increasing research interest. In this task, a LiDAR frame is formatted as a point cloud in the LiDAR coordinates, where each point represents a scan point on an object. The LiDAR frames are therefore point clouds in different coordinate systems, and the goal of LiDAR odometry is to estimate the transformation between consecutive LiDAR frames. With those transformations estimated, the pose of any LiDAR frame can be obtained via homogeneous transformation among the coordinate frames.

In this paper, we survey the existing works in the LiDAR odometry domain. Existing works on point cloud registration are also introduced, as they can be adapted to transformation estimation between LiDAR frame pairs, which is a crucial step in LiDAR odometry. The rest of this paper is organized as follows: In Section II, the pipeline of LiDAR odometry is summarized and divided into five steps: (1) pre-processing, (2) feature extraction, (3) correspondence searching, (4) transformation estimation, and (5) post-processing; the current works are also categorized into three major approaches based on the type of correspondences: (1) point correspondence, (2) distribution correspondence, and (3) network feature correspondence. Section III introduces the existing algorithms and compares them at each step by their approaches. In Section IV, we evaluate those algorithms on the KITTI odometry dataset and compare their precision and running speed. Section V concludes the paper and outlines promising future research directions.

II Pipeline of LiDAR odometry

II-A Problem setting

A LiDAR frame is formatted as a point cloud, represented as a set of 3D points. Given two LiDAR frames $P_1$ and $P_2$, and their poses $T_1$ and $T_2$ in the world coordinates, a point cloud registration algorithm estimates the transformation $T_{1,2}$. As shown in Eqn. 1, the transformation can be formatted as a 4-by-4 matrix, where $R \in \mathbb{R}^{3 \times 3}$ denotes the 3D rotation matrix and $t \in \mathbb{R}^{3}$ denotes the 3D translation vector.

$$T_{1,2} = \begin{bmatrix} R & t \\ \mathbf{0}^\top & 1 \end{bmatrix} \tag{1}$$

In extension, given the transformations between consecutive LiDAR frame pairs, the pose of any LiDAR frame $k$ can be calculated using Equation 2.

$$T_k = T_1 \prod_{i=1}^{k-1} T_{i,i+1} \tag{2}$$

Therefore, the key step of the LiDAR odometry task is to estimate the transformation $T_{i,i+1}$ between each consecutive LiDAR frame pair $(P_i, P_{i+1})$.
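To make Equation 2 concrete, the following sketch chains per-pair relative transformations into absolute poses using homogeneous 4-by-4 matrices. It is a minimal illustration; the function and variable names are ours, not from any surveyed implementation.

```python
import numpy as np

def chain_poses(T_first, relative_transforms):
    """Accumulate absolute poses from consecutive relative transformations.

    T_first:             4x4 pose of the first frame in world coordinates.
    relative_transforms: list of 4x4 matrices; element i maps frame i+1
                         into frame i (the T_{i,i+1} of Equation 2).
    Returns one 4x4 absolute pose per frame.
    """
    poses = [T_first]
    for T_rel in relative_transforms:
        # Composing the previous absolute pose with the next relative
        # transformation yields the next absolute pose.
        poses.append(poses[-1] @ T_rel)
    return poses

# Example: identity start pose, five steps of 1 m forward motion.
T_step = np.eye(4)
T_step[0, 3] = 1.0
trajectory = chain_poses(np.eye(4), [T_step] * 5)
print(trajectory[-1][:3, 3])  # -> [5. 0. 0.]
```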

II-B Pipeline

According to the existing works [30], [50], [68], the pipeline of LiDAR odometry can be divided into five stages: (1) pre-processing, (2) feature extraction, (3) correspondence searching, (4) transformation estimation, and (5) post-processing.

In the pre-processing step, LiDAR point clouds are re-organized, segmented, and filtered for better feature extraction and matching. Typical pre-processing methods are 3D-to-2D projection, semantic segmentation, moving object removal and ground removal.

Fig. 1: An illustration of example tasks in LiDAR frame pre-processing.
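As an illustrative sketch of one such pre-processing task, ground removal can be approximated by a RANSAC plane fit over the full cloud. The iteration count and distance threshold below are assumptions for illustration, not values taken from any surveyed system, and production pipelines typically use more elaborate schemes.

```python
import numpy as np

def remove_ground_ransac(points, n_iters=100, dist_thresh=0.2, seed=0):
    """Split an (N, 3) LiDAR cloud into non-ground and ground points
    by fitting the dominant plane with RANSAC."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(points), dtype=bool)
    for _ in range(n_iters):
        # Hypothesize a plane through three random points.
        sample = points[rng.choice(len(points), 3, replace=False)]
        normal = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        norm = np.linalg.norm(normal)
        if norm < 1e-8:
            continue  # degenerate (collinear) sample, try again
        normal /= norm
        # Points within dist_thresh metres of the plane count as inliers.
        dist = np.abs((points - sample[0]) @ normal)
        inliers = dist < dist_thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return points[~best_inliers], points[best_inliers]
```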

In the feature extraction step, key points or feature vectors are extracted from points or point clusters, which serve as candidates for feature correspondences. The feature extraction can be achieved by traditional image feature descriptors such as SIFT [32], SURF [4], ORB [47], or even networks like DCP [61]. Other than key points, other features are also utilized, such as the normal distributions of NDT [6] and network embeddings as in OverlapNet [10].

Fig. 2: An illustration of example tasks in LiDAR frame feature extraction.

In the correspondence searching step, candidates are matched to generate correspondences. Existing works adopt three types of correspondences: (1) point correspondence, (2) distribution correspondence, and (3) network correspondence. Point correspondence searching is the most widely applied, since it is straightforward and compatible with existing keypoint matching algorithms; ICP [67], RANSAC [19], and neural networks are commonly used point correspondence searching algorithms. Distribution based algorithms do not need a correspondence search, as the pair of distributions is naturally a correspondence. Similarly, a network feature based algorithm has only one correspondence.
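A minimal point correspondence search can be realized as a gated nearest-neighbour query over a KD-tree, as sketched below; the distance gate max_dist is an illustrative assumption (real systems tune or anneal it).

```python
import numpy as np
from scipy.spatial import cKDTree

def find_correspondences(source, target, max_dist=1.0):
    """Match each source point to its nearest target point.

    source, target: (N, 3) and (M, 3) point arrays.
    Returns index pairs (src_idx, tgt_idx) of accepted matches;
    matches farther than max_dist are rejected as outliers.
    """
    tree = cKDTree(target)
    dists, nn_idx = tree.query(source)  # 1-nearest-neighbour per source point
    keep = dists < max_dist
    return np.nonzero(keep)[0], nn_idx[keep]
```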

In the transformation estimation step, the methods are highly dependent on the type of correspondences. In general, given a correspondence set $C = \{(a_i, b_i)\}$ and a transformation function $T(\cdot)$, the goal is to minimize the loss function as follows, where $d(\cdot,\cdot)$ is a distance measure between matched candidates:

$$\mathcal{L} = \sum_{(a_i, b_i) \in C} d\big(a_i, T(b_i)\big) \tag{3}$$
Fig. 3: An illustration of loop closure, an example task in LiDAR odometry post processing.

In the post-processing step, most existing works calculate the pose of each LiDAR frame using Equation 2. Several works take advantage of loop closure [10], [60], [22] to refine the pose chain [66], [12].

Fig. 4: An illustration of example tasks in LiDAR frame transformation estimation.

II-C Loss function for point cloud registration

For point correspondences, the loss function is

$$\mathcal{L}(R, t) = \sum_{(a_i, b_i) \in C} \big\| a_i - (R\, b_i + t) \big\|^2 \tag{4}$$

where $a_i$ and $b_i$ are point pairs from the point correspondence set $C$, and $R$ and $t$ are the target transformation parameters. In the existing works, singular value decomposition (SVD) [26] is the most widely used method to solve this equation.
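A minimal sketch of this standard closed-form solution (often called the Kabsch or Umeyama procedure) follows; the determinant check keeps R a proper rotation rather than a reflection.

```python
import numpy as np

def estimate_rigid_transform(src, tgt):
    """Solve Equation 4 in closed form: find R, t minimizing
    sum_i || tgt_i - (R @ src_i + t) ||^2 over matched pairs."""
    mu_src, mu_tgt = src.mean(axis=0), tgt.mean(axis=0)
    # Cross-covariance of the centred point sets.
    H = (src - mu_src).T @ (tgt - mu_tgt)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:  # repair an improper rotation (reflection)
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = mu_tgt - R @ mu_src
    return R, t
```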

Fig. 5: Summary of LiDAR odometry methods according to the correspondence type.

For distribution correspondences, the normal distributions transform (NDT) [6] is the most popular solver. NDT computes the extent of the match with normal distributions instead of with the points directly. The normal distribution is obtained by calculating the mean ($\mu$) and covariance matrix ($\Sigma$). As mentioned in [28], under the assumption that the set of points contained in a voxel is $X = \{x_1, \dots, x_N\}$, the mean and covariance matrix are calculated as follows:

$$\mu = \frac{1}{N} \sum_{i=1}^{N} x_i, \qquad \Sigma = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \mu)(x_i - \mu)^\top \tag{5}$$

The normal distribution of dimension $D$ is computed as follows, based on $\mu$ and $\Sigma$ from Equation 5:

$$p(x) = \frac{1}{(2\pi)^{D/2} \sqrt{|\Sigma|}} \exp\!\left( -\frac{(x - \mu)^\top \Sigma^{-1} (x - \mu)}{2} \right) \tag{6}$$

Then, from Equation 6, the score of each individual source point is calculated. The target is to maximize the product of the scores of all points. The problem is commonly modified to calculate the sum of all scores by taking the log-likelihood of the whole expression rather than the product of scores:

$$s(\mathbf{p}) = \sum_{i=1}^{N} p\big( T(\mathbf{p}, x_i) \big) \tag{7}$$

Here, in Equation 7, $T(\mathbf{p}, x_i)$ denotes the transformation of point $x_i$ by the parameter vector $\mathbf{p}$. The sum of the scores of all the source points transformed by $\mathbf{p}$ is the final score of $\mathbf{p}$. An optimization technique based on Newton's method is employed to find the vector $\mathbf{p}$ that optimizes the score:

$$H \, \Delta\mathbf{p} = -g \tag{8}$$

Newton's method is expressed as given in Equation 8, where $H$ and $g$ represent the Hessian matrix and the gradient vector, respectively. Writing $q_i = T(\mathbf{p}, x_i) - \mu$, they can be obtained as follows:

$$g_j = \frac{\partial s}{\partial p_j} = \sum_{i=1}^{N} q_i^\top \Sigma^{-1} \frac{\partial q_i}{\partial p_j} \exp\!\left( \frac{-q_i^\top \Sigma^{-1} q_i}{2} \right) \tag{9}$$

$$H_{jk} = \sum_{i=1}^{N} \exp\!\left( \frac{-q_i^\top \Sigma^{-1} q_i}{2} \right) \left[ -\left( q_i^\top \Sigma^{-1} \frac{\partial q_i}{\partial p_j} \right) \left( q_i^\top \Sigma^{-1} \frac{\partial q_i}{\partial p_k} \right) - q_i^\top \Sigma^{-1} \frac{\partial^2 q_i}{\partial p_j \partial p_k} - \frac{\partial q_i^\top}{\partial p_k} \Sigma^{-1} \frac{\partial q_i}{\partial p_j} \right] \tag{10}$$

The NDT approach computes the Hessian matrix and the gradient vector at each point, and obtains the update $\Delta\mathbf{p}$ that optimizes the score by accumulating all the values. The calculated $\Delta\mathbf{p}$ is added to the $\mathbf{p}$ calculated in the previous step to yield a new $\mathbf{p}$. Finally, the optimized transformation vector between the two point clouds is obtained by repeating this process.
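To make Equations 5-7 concrete, the sketch below builds per-voxel Gaussian statistics for a target cloud and evaluates the NDT score of an already-transformed source cloud. The voxel size, minimum point count, and covariance regularization are illustrative assumptions, not values from the cited works.

```python
import numpy as np

def build_ndt_map(target, voxel_size=1.0):
    """Per-voxel mean and inverse covariance (Equation 5)."""
    cells = {}
    keys = np.floor(target / voxel_size).astype(int)
    for key, pt in zip(map(tuple, keys), target):
        cells.setdefault(key, []).append(pt)
    stats = {}
    for key, pts in cells.items():
        pts = np.asarray(pts)
        if len(pts) < 5:
            continue  # too few points for a stable covariance
        mu = pts.mean(axis=0)
        cov = np.cov(pts.T) + 1e-6 * np.eye(3)  # regularize near-singular cells
        stats[key] = (mu, np.linalg.inv(cov))
    return stats

def ndt_score(transformed_source, stats, voxel_size=1.0):
    """Sum of per-point Gaussian scores (Equations 6-7); higher is better."""
    score = 0.0
    for pt in transformed_source:
        key = tuple(np.floor(pt / voxel_size).astype(int))
        if key not in stats:
            continue  # point falls in an empty cell
        mu, cov_inv = stats[key]
        q = pt - mu
        score += np.exp(-0.5 * q @ cov_inv @ q)
    return score
```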

Compared to point correspondence based and distribution correspondence based methods, network correspondence based methods do not use a predetermined feature extractor to search for pairs. Instead, they use neural networks with millions of parameters to embed a pair of point clouds and then estimate the transformation between them. By training on a huge number of data samples, methods in this category learn the point cloud patterns of certain scenarios and precisely estimate the transformation between the LiDAR frames.

III Methods and Algorithms

Over the past few decades, several methods have been published to solve the LiDAR odometry problem. Based on the type of correspondence used during the point cloud registration step, the works mainly follow three branches: (1) point correspondence based methods ([33], [69], [54], [30]), (2) distribution correspondence based methods ([39], [66], [50]), and (3) network correspondence based methods ([29], [70], [68], [44]). In Section II, we introduced the pipeline of a general LiDAR odometry solution, within which the major branches are compared at each step; this is summarized in Figure 5. In this section, the methods in each branch are analyzed and compared at each step.

III-A Point correspondence based methods

One of the earliest approaches, introduced in [67], that is still in use for the LiDAR odometry application is the Iterative Closest Point (ICP) method. The working of the algorithm is delineated in Section II and Algorithm 1. The last three decades have seen a rise in ICP based methods such as TrimmedICP [14], GICP [49], SparseICP [7], AA-ICP [43], SemanticICP [42], SuMa++ [12], DGR [16], and ELO [69].

The point correspondence based methods extract key points from the LiDAR frames, establish key point pairs, and estimate the transformation according to those pairs. A primary disadvantage of the ICP method is that it is prone to local minima when the point clouds start far apart. TrimmedICP [14] is an extension of ICP that incorporates the Least Trimmed Squares (LTS) approach, enabling the algorithm to function for overlaps under 50%.

Input:
  Two point clouds: $A = \{a_i\}$, $B = \{b_i\}$
  An initial transformation: $T_0$
Output:
  The correct transformation, $T$, which aligns $A$ and $B$
1:  $T \leftarrow T_0$
2:  while not converged do
3:    for $i \leftarrow 1$ to $N$ do
4:      $m_i \leftarrow$ FindClosestPointInA($T \cdot b_i$);
5:      if $\|m_i - T \cdot b_i\| \le d_{\max}$ then
6:        $w_i \leftarrow 1$;
7:      else
8:        $w_i \leftarrow 0$;
9:      end if
10:   end for
11:   $T \leftarrow \arg\min_{T} \sum_i w_i \|T \cdot b_i - m_i\|^2$;
12: end while
Algorithm 1 ICP algorithm [49]
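For illustration, a minimal point-to-point ICP loop in the spirit of Algorithm 1 is sketched below, reusing the estimate_rigid_transform helper from the SVD sketch in Section II-C. The gate d_max and the convergence tolerance are illustrative assumptions.

```python
import numpy as np
from scipy.spatial import cKDTree

def icp(source, target, T_init=np.eye(4), max_iters=50, d_max=1.0, tol=1e-6):
    """Point-to-point ICP: alternate nearest-neighbour matching (steps 3-10)
    and the closed-form SVD alignment (step 11) until the update is tiny."""
    T = T_init.copy()
    tree = cKDTree(target)
    for _ in range(max_iters):
        src = source @ T[:3, :3].T + T[:3, 3]  # apply current estimate
        dists, nn = tree.query(src)
        keep = dists < d_max                   # the weights w_i of Algorithm 1
        if keep.sum() < 3:
            break  # not enough correspondences to fit a pose
        R, t = estimate_rigid_transform(src[keep], target[nn[keep]])
        T_delta = np.eye(4)
        T_delta[:3, :3], T_delta[:3, 3] = R, t
        T = T_delta @ T                        # left-compose the update
        if np.linalg.norm(T_delta - np.eye(4)) < tol:
            break  # converged
    return T
```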

The point-to-plane [21] variant of ICP improves performance by taking advantage of surface normal information [13]. Instead of minimizing the point-to-point error in step 11 of Algorithm 1, the point-to-plane algorithm minimizes the error along the surface normal (i.e., the projection of the residual onto the subspace spanned by the surface normal). Generalized ICP [49] combines the ICP and point-to-plane ICP algorithms into a single probabilistic framework. This approach outperforms ICP and the point-to-plane method and is, in comparison, more robust to incorrect correspondences. GICP, apart from having speed and simplicity similar to ICP, facilitates the addition of measurement noise, outlier terms, and probabilistic techniques to increase robustness.

Later, approaches like SemanticICP [42] and IMLS-SLAM [17] provided a new direction for tackling the LiDAR odometry problem. IMLS-SLAM pre-processes the dynamic objects and employs a sampling strategy on LiDAR scans to define a model from prior LiDAR sweeps using an Implicit Moving Least Squares (IMLS) surface representation. SemanticICP, in contrast, conducts joint geometric and semantic inference to improve the registration task by incorporating pixelated semantic measurements into the estimation of the relative transformation between two point clouds. Here, point associations are treated as latent random variables, which leads to an Expectation-Maximization style solution. SemanticICP outperforms GICP [49] due to the combined use of semantic labels and the EM data associations.

Papers like SALO [27], DCP [61], and DeepVCP [33] were published alongside the surfel based approaches SuMa and SuMa++. SALO utilizes the LiDAR sensor hardware and improves the ICP algorithm with novel downsampling and point matching rejection methods. Its advantage is the integration of the physics of the sensor for increased-precision LiDAR odometry. The DCP paper [61] discusses the spherical projection of point cloud data to reduce the dimensionality of the input data.

Approaches like SuMa++ [12] have a semantic step, with point-wise labels provided by RangeNet++ [37], to filter out the pixels of dynamic objects and add semantic constraints to the scan registration. In doing so, it outperforms SuMa [5]. DGR [16], DMLO [30], and SROM [48] are a few recent methods; among them, the three-module DGR approach incorporates a Procrustes error for odometry estimation.

When a spherical projection is employed, points belonging to different surfaces can be adjacent in the range image. Some of the most recent approaches, ELO [69] and Zhu et al. [72], tackle this problem: Zhu et al. propose handling point cloud sparseness with a cylindrical instead of a spherical projection, while ELO additionally utilizes a bird's-eye view. ELO has the smallest runtime among the methods, even outperforming the long-term performer LOAM.
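A spherical (range-image) projection of the kind these methods employ can be sketched as follows. The image size and vertical field of view are illustrative values roughly matching a 64-beam spinning LiDAR, not the settings of any particular method.

```python
import numpy as np

def spherical_projection(points, H=64, W=1024, fov_up=3.0, fov_down=-25.0):
    """Project an (N, 3) point cloud onto an H x W range image."""
    fov_up, fov_down = np.radians(fov_up), np.radians(fov_down)
    r = np.linalg.norm(points, axis=1)
    yaw = np.arctan2(points[:, 1], points[:, 0])             # azimuth angle
    pitch = np.arcsin(points[:, 2] / np.maximum(r, 1e-8))    # elevation angle

    u = 0.5 * (1.0 - yaw / np.pi) * W                        # column from azimuth
    v = (1.0 - (pitch - fov_down) / (fov_up - fov_down)) * H # row from elevation
    u = np.clip(np.floor(u), 0, W - 1).astype(int)
    v = np.clip(np.floor(v), 0, H - 1).astype(int)

    image = np.full((H, W), -1.0)  # -1 marks pixels with no return
    image[v, u] = r                # later points overwrite earlier ones here;
                                   # real pipelines keep the closest range
    return image
```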

III-B Distribution correspondence based methods

The Normal Distribution Transform method, introduced in [6], has been a defining type of approach to tackle the registration problem for LiDAR odometry applications. The relevant computations and theory for NDT approaches have been discussed in Section II. Over the last two decades, multiple versions of NDT such as 3DNDT [36], PNDT-LO [23], AugmentedNDT [2] and weightedNDT [28] have been proposed.

Input:
  The source point cloud $X = \{x_1, \dots, x_N\}$
  The target point cloud $Y = \{y_1, \dots, y_M\}$
  Initial guess of transformation $\mathbf{p}_0$
Output:
  Final transformation $\mathbf{p}$ between $X$ and $Y$
1:  $\mathbf{p} \leftarrow \mathbf{p}_0$
2:  for all points $y_j \in Y$ do
3:    find the cell $c_k$ that contains $y_j$
4:    classify $y_j$ to the cell $c_k$
5:  end for
6:  for all cells $c_k$ do
7:    $\mu_k \leftarrow$ mean of the points in $c_k$ (Eqn. 5)
8:    $\Sigma_k \leftarrow$ covariance of the points in $c_k$ (Eqn. 5)
9:  end for
10: while not converged do
11:   score $\leftarrow 0$, $g \leftarrow 0$, $H \leftarrow 0$
12:   for all points $x_i \in X$ do
13:     find the cell $c_k$ that contains $T(\mathbf{p}, x_i)$
14:     update score, $g$, $H$ (Eqns. 7, 9, 10)
15:   end for
16:   solve $H \, \Delta\mathbf{p} = -g$
17:   $\mathbf{p} \leftarrow \mathbf{p} + \Delta\mathbf{p}$
18: end while
Algorithm 2 NDT algorithm [28]

The key step in these approaches is to partition the input point clouds into cells of equal size, over which normal distributions are computed. These are then compared with the normal distributions from other point clouds and assigned a score. The algorithm finds the rotation and translation that increase the score.

The initial NDT method was utilized primarily for 2D scan registration. 3D-NDT, apart from extending this approach to three dimensions, is advantageous as it forms a smooth piece-wise spatial representation that in turn facilitates complete 3D-NDT map generation after registration. This approach also outperforms initial approaches like ICP for the point cloud registration task [36].

Next, one of the significant NDT-based approaches for odometry estimation, LOAM [66], was published; it still ranks highly on the KITTI odometry test set (Table II). It requires no pre-processing for odometry. In LOAM, feature points on sharp edges and planar surface patches are selected, smoothness is evaluated [66] to identify edge and plane points, and point-to-edge and point-to-plane scan matching is employed to arrive at the transformation between two scans. In PNDT [23], unlike classical NDT, the probability distribution function of each point is computed while calculating the mean and the covariance, resulting in improved translational and rotational accuracy. The advantage is that distributions are generated in all the occupied cells irrespective of the resolution. A variant of LOAM called LeGO-LOAM, introduced in [50], additionally employs label matching to increase the likelihood of finding matches corresponding to the same object between two scans. Also, a two-step LM optimization is incorporated that achieves accuracy similar to LOAM while reducing runtime by 35%.

DeLiO [54], for the first time in LiDAR odometry, introduces decoupled translation and rotation modules; however, DeLiO does not deal with dynamic objects. The weightedNDT [28] approach tackles point cloud matching in dynamic environments by assigning greater or lesser weights to points that have greater or lesser static probabilities, respectively [74]. A LOAM variant called SLOAM [9] introduced the idea that semantic features have greater reliability than texture-based lines and planes, which makes SLOAM more robust in unstructured, noisy environments. LiDAR based loop closure detection typically ignores the intensity reading and uses geometric-only descriptors, but ISC-LOAM [57] leverages the intensity readings to facilitate effective place recognition. The most recent approaches to LiDAR odometry are F-LOAM [56] and R-LOAM [39]. F-LOAM outperforms LOAM and LeGO-LOAM in terms of runtime, whereas R-LOAM improves upon LOAM by incorporating an additional cost for mesh features, which results in a reduction of the median APE compared to LOAM.

In terms of performance on the KITTI odometry dataset, LOAM has consistently performed well. Among the available LiDAR-only methods, LOAM has one of the lowest rotational and translational errors.

Methods 00 01 02 03 04 05 06 07 08 09 10 Avg
LOAM [66] 0.78/0.53 1.43/0.55 0.92/0.55 0.86/0.65 0.71/0.50 0.57/0.38 0.65/0.39 0.63/0.50 1.12/0.44 0.77/0.48 0.79/0.57 0.85/0.51
ELO [69] 0.54/0.20 0.61/0.13 0.54/0.18 0.65/0.27 0.32/0.15 0.33/0.17 0.30/0.13 0.31/0.16 0.79/0.21 0.48/0.14 0.59/0.19 0.50/0.18
IMLS-SLAM [17] -/0.50 -/0.82 -/0.53 -/0.68 -/0.33 -/0.32 -/0.33 -/0.33 -/0.80 -/0.55 -/0.53 -/0.55
SuMa++ [12] 0.22/0.64 0.46/1.60 0.37/1.00 0.46/0.67 0.26/0.37 0.20/0.40 0.21/0.46 0.19/0.34 0.35/1.10 0.23/0.47 0.28/0.66 0.29/0.70
SALO [27] 0.91/0.72 1.13/0.37 0.98/0.45 1.76/0.50 0.51/0.17 0.56/0.29 0.48/0.13 0.83/0.51 1.33/1.43 0.64/0.30 0.97/0.41 0.95/0.80
SuMa [5] 0.3/0.7 0.5/1.7 0.4/1.1 0.5/0.7 0.3/0.4 0.2/0.5 0.2/0.4 0.3/0.4 0.4/1.0 0.3/0.5 0.3/0.7 0.3/0.7
GICP [49] 1.29/0.64 4.39/0.91 2.53/0.77 1.68/1.08 3.76/1.07 1.02/0.54 0.92/0.46 0.64/0.45 1.58/0.75 1.97/0.77 1.31/0.62 1.91/0.73
LO-Net [29] 1.47/0.72 1.36/0.47 1.52/0.71 1.03/0.66 0.51/0.65 1.04/0.69 0.71/0.50 1.70/0.89 2.12/0.77 1.37/0.58 1.80/0.93 1.09/0.63
DeepLO [15] 0.32/0.12 0.16/0.05 0.15/0.05 0.04/0.01 0.01/0.01 0.11/0.07 0.03/0.07 0.08/0.05 0.09/0.04 13.35/4.45 5.83/3.53 1.83/0.76
DeepVCP [33] -/- -/- -/- -/- -/- -/- -/- -/- -/- -/- -/- 0.071/0.164
TABLE I: Comparison of the performance of published methods on LiDAR odometry over the KITTI odometry training data
Published Methods Translation Rotation Runtime
LOAM [66] 0.55 % 0.0013 [deg/m] 0.1 s
MULLS [41] 0.65 % 0.0019 [deg/m] 0.08 s
ELO [69] 0.68 % 0.0021 [deg/m] 0.005 s
IMLS-SLAM [17] 0.69 % 0.0018 [deg/m] 1.25 s
MC2SLAM [38] 0.69 % 0.0016 [deg/m] 0.1 s
F-LOAM [56] 0.71 % 0.0022 [deg/m] 0.05 s
ISC-LOAM [57] 0.72 % 0.0022 [deg/m] 0.1 s
PSF-LO [8] 0.82 % 0.0032 [deg/m] 0.2 s
CAE-LO [64] 0.86 % 0.0025 [deg/m] 2 s
CPFG-slam [25] 0.87 % 0.0025 [deg/m] 0.03 s
PNDT LO [23] 0.89 % 0.0030 [deg/m] 0.2 s
SuMa-MOS [11] 0.99 % 0.0033 [deg/m] 0.1 s
SuMa++ [12] 1.06 % 0.0034 [deg/m] 0.1 s
ULF-ESGVI [65] 1.07 % 0.0036 [deg/m] 0.3 s
STEAM-L [52] 1.22 % 0.0058 [deg/m] 0.2 s
SALO [27] 1.37 % 0.0051 [deg/m] 0.6 s
SuMa [5] 1.39 % 0.0034 [deg/m] 0.1 s
3DOF-SLAM [18] 1.89 % 0.0083 [deg/m] 0.02 s
DeepCLR [24] 3.83 % 0.0104 [deg/m] 0.05 s
D3DLO [1] 5.40 % 0.0154 [deg/m] 0.1 s
TABLE II: Comparison of the performance of published methods on LiDAR odometry over the KITTI odometry test data

III-C Network correspondence based methods

The network-based approaches employ neural networks in the pose estimation module of the LiDAR odometry pipeline. Approaches using deep learning for the point cloud registration step are relatively recent compared to the ICP and NDT-based approaches. Nevertheless, there has been a considerable number of works in deep learning based visual odometry, such as [58], [74], [62].

PointNet [44] (and its successor PointNet++ [45]), a popular architecture proposed for point cloud object classification and part segmentation tasks, introduced a network that consumes point clouds directly, instead of requiring voxelization as in other approaches.
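The core PointNet idea, a per-point MLP shared across all points followed by an order-invariant max-pool, can be sketched in a few lines of PyTorch. This is a minimal sketch with illustrative layer widths, not the full architecture (it omits the input and feature transform networks of the original paper).

```python
import torch
import torch.nn as nn

class PointNetEncoder(nn.Module):
    """Minimal PointNet-style global feature extractor."""

    def __init__(self, feat_dim=1024):
        super().__init__()
        # 1x1 convolutions act as an MLP shared across points.
        self.mlp = nn.Sequential(
            nn.Conv1d(3, 64, 1), nn.ReLU(),
            nn.Conv1d(64, 128, 1), nn.ReLU(),
            nn.Conv1d(128, feat_dim, 1),
        )

    def forward(self, points):
        # points: (B, N, 3) -> (B, 3, N) as expected by Conv1d.
        x = self.mlp(points.transpose(1, 2))
        # Max over the point dimension makes the output order-invariant.
        return x.max(dim=2).values  # (B, feat_dim) global descriptor

# Usage: one 1024-D descriptor per cloud for two 4096-point clouds.
feats = PointNetEncoder()(torch.randn(2, 4096, 3))
```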

Next, an end-to-end method called LO-Net [29] was introduced that takes in LiDAR point cloud data and computes the inter-scan 6-DoF relative pose. Being end-to-end trainable, LO-Net learns an effective feature representation, facilitated by a new mask-weighted geometric constraint loss that helps the algorithm exploit the data's sequential dependencies and dynamics. Here, the position and the orientation are estimated simultaneously. L3-Net [34] blends multiple approaches, with mini-PointNet used for feature extraction and 3D CNNs used for regularization. Most approaches are supervised, but DeepLO [15] introduces, for the first time, both supervised and unsupervised frameworks for geometry-aware LiDAR odometry. DeepLO also incorporates vertex and normal maps as network inputs without precision loss.

LodoNet [68] adapted the PointNet classification architecture into its rotation and translation estimation modules. Unlike LO-Net, however, the translation and rotation modules are separate, resulting in two 3-DoF predictions at the end of odometry. Similar to DCP [61] and LeGO-LOAM [50], approaches like LodoNet spherically project the input LiDAR point clouds to represent the 3D data in 2D, which requires relatively fewer compute resources. The main constraint of the standard ICP approach to the LiDAR odometry problem is that, due to the coupled nature of the odometry regression and keypoint matching functions, there is a potential issue with training loss non-convergence, as delineated in [29]. In LodoNet, in order to increase the robustness and effectiveness of the network, an MKP selection module based on PointNet [44] is employed to solve the segmentation problem. Handling multi-view scenarios was a challenge yet to be solved for LiDAR odometry applications. An approach along these lines is 3DRegNet [40], which extends to scenarios involving multiple views, not just two as with classical methods. In this approach, convolutional layers and deep residual layers are incorporated into the neural network to classify the point correspondences and regress the motion parameters for scan alignment in a common reference frame.

The most recent works are D3DLO [1], PWCLO [55], and OverlapNet [10]. Even though D3DLO and DeepCLR [24] have similar network architectures, D3DLO utilizes only 3.56% of DeepCLR's network parameters. In doing so, it slightly underperforms DeepCLR but manages to reduce the point cloud size by up to 40-50%. On the other hand, PWCLO, with hierarchical embedding mask optimization, outperforms LodoNet, LOAM, DMLO, and LO-Net in terms of translational and rotational errors on the KITTI odometry sequences.

The network based approaches have improved drastically over the years. Methods like D3DLO [1] have matched the runtime of LOAM [66], and DeepCLR [24] even runs twice as fast.

IV Performance Comparison

All the LiDAR-only odometry methods shown in Table I and Table II have been compared on the public KITTI odometry dataset [20]. In the training performance on the KITTI odometry data (Table I), DeepLO [15] performs well in most of the sequences compared to the other methods, while DeepVCP [33] gives the best average performance over all 11 sequences. From Table II, LOAM is the best performing method for LiDAR odometry, with the lowest rotational and translational errors, but many other methods match or even beat its runtime. ELO [69] has the best runtime among all the listed methods, at 0.005 s. Runtime is a very important metric for utilization and deployment in automated driving systems. MULLS [41], CPFG-slam [25], 3DOF-SLAM [18], DeepCLR [24], F-LOAM [56], and ELO [69] outperform LOAM in terms of runtime.
ELO (Efficient LiDAR Odometry) [69], with error performance comparable to LOAM and a much superior runtime on the test set, also outperforms LOAM in average performance on the training set. ELO therefore appears to be the best approach for real-time LiDAR odometry in autonomous driving.
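For reference, the KITTI relative errors reported above average translational drift (in percent) and rotational drift (in deg/m) over trajectory segments of 100 m to 800 m. A simplified single-segment-length version of that computation, assuming 4-by-4 ground-truth and estimated poses, might look like the sketch below.

```python
import numpy as np

def relative_errors(gt_poses, est_poses, seg_len=100.0):
    """Simplified KITTI-style metric: for each start frame, find the frame
    roughly seg_len metres ahead and compare relative GT/estimated motion."""
    # Cumulative distance travelled along the ground-truth trajectory.
    steps = np.linalg.norm(np.diff([T[:3, 3] for T in gt_poses], axis=0), axis=1)
    dist = np.concatenate([[0.0], np.cumsum(steps)])

    t_errs, r_errs = [], []
    for i in range(len(gt_poses)):
        j = int(np.searchsorted(dist, dist[i] + seg_len))
        if j >= len(gt_poses):
            break
        rel_gt = np.linalg.inv(gt_poses[i]) @ gt_poses[j]
        rel_est = np.linalg.inv(est_poses[i]) @ est_poses[j]
        err = np.linalg.inv(rel_gt) @ rel_est   # residual motion error
        t_errs.append(np.linalg.norm(err[:3, 3]) / seg_len * 100.0)  # percent
        angle = np.arccos(np.clip((np.trace(err[:3, :3]) - 1) / 2, -1, 1))
        r_errs.append(np.degrees(angle) / seg_len)                   # deg/m
    return np.mean(t_errs), np.mean(r_errs)
```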

V Conclusion

In this paper, the existing works on LiDAR odometry are surveyed and categorized into point correspondence, distribution correspondence, and network correspondence based methodologies. We also show the evaluations on the KITTI odometry dataset. In the survey, we found that each approach has its advantages and disadvantages, and that several works explore ways to fuse different approaches for better odometry estimation; e.g., DCP [61] uses deep neural networks to generate point correspondences and achieves promising results. Regarding future directions of research, fusion-based approaches are suggested for precise LiDAR odometry.

References

  • [1] P. Adis, N. Horst, and M. Wien (2021) D3DLO: deep 3d lidar odometry. arXiv preprint arXiv:2101.12242. Cited by: §III-C, §III-C, TABLE II.
  • [2] N. Akai, L. Y. Morales, E. Takeuchi, Y. Yoshihara, and Y. Ninomiya (2017) Robust localization using 3d ndt scan matching with experimentally determined uncertainty and road marker matching. In 2017 IEEE Intelligent Vehicles Symposium (IV), pp. 1356–1363. Cited by: §III-B.
  • [3] M. O. Aqel, M. H. Marhaban, M. I. Saripan, and N. B. Ismail (2016) Review of visual odometry: types, approaches, challenges, and applications. SpringerPlus 5 (1), pp. 1–26. Cited by: §I.
  • [4] H. Bay, T. Tuytelaars, and L. Van Gool (2006) Surf: speeded up robust features. In European conference on computer vision, pp. 404–417. Cited by: §II-B.
  • [5] J. Behley and C. Stachniss (2018) Efficient surfel-based slam using 3d laser range data in urban environments.. In Robotics: Science and Systems, Vol. 2018. Cited by: §III-A, TABLE I, TABLE II.
  • [6] P. Biber and W. Straßer (2003) The normal distributions transform: a new approach to laser scan matching. In Proceedings 2003 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2003)(Cat. No. 03CH37453), Vol. 3, pp. 2743–2748. Cited by: §II-B, §II-C, §III-B.
  • [7] S. Bouaziz, A. Tagliasacchi, and M. Pauly (2013) Sparse iterative closest point. In Computer graphics forum, Vol. 32, pp. 113–123. Cited by: §III-A.
  • [8] G. Chen, B. Wang, X. Wang, H. Deng, B. Wang, and S. Zhang (2020) PSF-lo: parameterized semantic features based lidar odometry. arXiv preprint arXiv:2010.13355. Cited by: TABLE II.
  • [9] S. W. Chen, G. V. Nardari, E. S. Lee, C. Qu, X. Liu, R. A. F. Romero, and V. Kumar (2020) Sloam: semantic lidar odometry and mapping for forest inventory. IEEE Robotics and Automation Letters 5 (2), pp. 612–619. Cited by: §III-B.
  • [10] X. Chen, T. Läbe, A. Milioto, T. Röhling, O. Vysotska, A. Haag, J. Behley, and C. Stachniss (2021) OverlapNet: loop closing for lidar-based slam. arXiv preprint arXiv:2105.11344. Cited by: §II-B, §II-B, §III-C.
  • [11] X. Chen, S. Li, B. Mersch, L. Wiesmann, J. Gall, J. Behley, and C. Stachniss (2021) Moving object segmentation in 3d lidar data: a learning-based approach exploiting sequential data. arXiv preprint arXiv:2105.08971. Cited by: TABLE II.
  • [12] X. Chen, A. Milioto, E. Palazzolo, P. Giguere, J. Behley, and C. Stachniss (2019) Suma++: efficient lidar-based semantic slam. In 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 4530–4537. Cited by: §II-B, §III-A, §III-A, TABLE I, TABLE II.
  • [13] Y. Chen and G. Medioni (1992) Object modelling by registration of multiple range images. Image and vision computing 10 (3), pp. 145–155. Cited by: §III-A.
  • [14] D. Chetverikov, D. Svirko, D. Stepanov, and P. Krsek (2002) The trimmed iterative closest point algorithm. In Object recognition supported by user interaction for service robots, Vol. 3, pp. 545–548. Cited by: §III-A, §III-A.
  • [15] Y. Cho, G. Kim, and A. Kim (2019) Deeplo: geometry-aware deep lidar odometry. arXiv preprint arXiv:1902.10562. Cited by: §III-C, TABLE I, §IV.
  • [16] C. Choy, W. Dong, and V. Koltun (2020) Deep global registration. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 2514–2523. Cited by: §III-A, §III-A.
  • [17] J. Deschaud (2018) IMLS-slam: scan-to-model matching based on 3d data. In 2018 IEEE International Conference on Robotics and Automation (ICRA), pp. 2480–2485. Cited by: §III-A, TABLE I, TABLE II.
  • [18] M. Dimitrievski, D. Van Hamme, P. Veelaert, and W. Philips (2016) Robust matching of occupancy maps for odometry in autonomous vehicles. In 11th Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2016), Vol. 3, pp. 626–633. Cited by: TABLE II, §IV.
  • [19] M. A. Fischler and R. C. Bolles (1981) Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM 24 (6), pp. 381–395. Cited by: §II-B.
  • [20] A. Geiger, P. Lenz, and R. Urtasun (2012) Are we ready for autonomous driving? the kitti vision benchmark suite. In 2012 IEEE conference on computer vision and pattern recognition, pp. 3354–3361. Cited by: §IV.
  • [21] D. Grant, J. Bethel, and M. Crawford (2012) Point-to-plane registration of terrestrial laser scans. ISPRS Journal of Photogrammetry and Remote Sensing 72, pp. 16–26. Cited by: §III-A.
  • [22] W. Hess, D. Kohler, H. Rapp, and D. Andor (2016) Real-time loop closure in 2d lidar slam. In 2016 IEEE International Conference on Robotics and Automation (ICRA), pp. 1271–1278. Cited by: §II-B.
  • [23] H. Hong and B. H. Lee (2017) Probabilistic normal distributions transform representation for accurate 3d point cloud registration. In 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 3333–3338. Cited by: §III-B, §III-B, TABLE II.
  • [24] M. Horn, N. Engel, V. Belagiannis, M. Buchholz, and K. Dietmayer (2020) DeepCLR: correspondence-less architecture for deep end-to-end point cloud registration. In 2020 IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC), pp. 1–7. Cited by: §III-C, §III-C, TABLE II, §IV.
  • [25] K. Ji, H. Chen, H. Di, J. Gong, G. Xiong, J. Qi, and T. Yi (2018) CPFG-slam: a robust simultaneous localization and mapping based on lidar in off-road environment. In 2018 IEEE Intelligent Vehicles Symposium (IV), pp. 650–655. Cited by: TABLE II, §IV.
  • [26] K. Kanatani (1994) Analysis of 3-d rotation fitting. IEEE Transactions on pattern analysis and machine intelligence 16 (5), pp. 543–549. Cited by: §II-C.
  • [27] D. Kovalenko, M. Korobkin, and A. Minin (2019) Sensor aware lidar odometry. In 2019 European Conference on Mobile Robots (ECMR), pp. 1–6. Cited by: §III-A, TABLE I, TABLE II.
  • [28] S. Lee, C. Kim, S. Cho, S. Myoungho, and K. Jo (2020) Robust 3-dimension point cloud mapping in dynamic environment using point-wise static probability-based ndt scan-matching. IEEE Access 8, pp. 175563–175575. Cited by: §II-C, §III-B, §III-B, 2.
  • [29] Q. Li, S. Chen, C. Wang, X. Li, C. Wen, M. Cheng, and J. Li (2019) Lo-net: deep real-time lidar odometry. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8473–8482. Cited by: §III-C, §III-C, TABLE I, §III.
  • [30] Z. Li and N. Wang (2020) Dmlo: deep matching lidar odometry. In 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 6010–6017. Cited by: §II-B, §III-A, §III.
  • [31] K. Lianos, J. L. Schonberger, M. Pollefeys, and T. Sattler (2018) Vso: visual semantic odometry. In Proceedings of the European conference on computer vision (ECCV), pp. 234–250. Cited by: §I.
  • [32] D. G. Lowe (2004) Distinctive image features from scale-invariant keypoints. International journal of computer vision 60 (2), pp. 91–110. Cited by: §II-B.
  • [33] W. Lu, G. Wan, Y. Zhou, X. Fu, P. Yuan, and S. Song (2019) Deepvcp: an end-to-end deep neural network for point cloud registration. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 12–21. Cited by: §III-A, TABLE I, §III, §IV.
  • [34] W. Lu, Y. Zhou, G. Wan, S. Hou, and S. Song (2019) L3-net: towards learning based lidar localization for autonomous driving. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6389–6398. Cited by: §III-C.
  • [35] Y. Lu and D. Song (2015) Robust rgb-d odometry using point and line features. In Proceedings of the IEEE International Conference on Computer Vision, pp. 3934–3942. Cited by: §I.
  • [36] M. Magnusson, A. Nüchter, C. Lörken, A. J. Lilienthal, and J. Hertzberg (2008) 3D mapping the kvarntorp mine: a field experiment for evaluation of 3d scan matching algorithms. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Workshop “3D Mapping”, Nice, France, September 2008, Cited by: §III-B, §III-B.
  • [37] A. Milioto, I. Vizzo, J. Behley, and C. Stachniss (2019) Rangenet++: fast and accurate lidar semantic segmentation. In 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 4213–4220. Cited by: §III-A.
  • [38] F. Neuhaus, T. Koß, R. Kohnen, and D. Paulus (2018) Mc2slam: real-time inertial lidar odometry using two-scan motion compensation. In German Conference on Pattern Recognition, pp. 60–72. Cited by: TABLE II.
  • [39] M. Oelsch, M. Karimi, and E. Steinbach (2021) R-loam: improving lidar odometry and mapping with point-to-mesh features of a known 3d reference object. IEEE Robotics and Automation Letters 6 (2), pp. 2068–2075. Cited by: §III-B, TABLE II, §III, §IV.
  • [40] G. D. Pais, S. Ramalingam, V. M. Govindu, J. C. Nascimento, R. Chellappa, and P. Miraldo (2020) 3dregnet: a deep neural network for 3d point registration. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 7193–7203. Cited by: §III-C.
  • [41] Y. Pan, P. Xiao, Y. He, Z. Shao, and Z. Li (2021) MULLS: versatile lidar slam via multi-metric linear least square. arXiv preprint arXiv:2102.03771. Cited by: TABLE II, §IV.
  • [42] S. A. Parkison, L. Gan, M. G. Jadidi, and R. M. Eustice (2018) Semantic iterative closest point through expectation-maximization.. In BMVC, pp. 280. Cited by: §III-A, §III-A.
  • [43] A. L. Pavlov, G. W. Ovchinnikov, D. Y. Derbyshev, D. Tsetserukou, and I. V. Oseledets (2018) AA-icp: iterative closest point with anderson acceleration. In 2018 IEEE International Conference on Robotics and Automation (ICRA), pp. 3407–3412. Cited by: §III-A.
  • [44] C. R. Qi, H. Su, K. Mo, and L. J. Guibas (2017) Pointnet: deep learning on point sets for 3d classification and segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 652–660. Cited by: §III-C, §III-C, §III.
  • [45] C. R. Qi, L. Yi, H. Su, and L. J. Guibas (2017) Pointnet++: deep hierarchical feature learning on point sets in a metric space. arXiv preprint arXiv:1706.02413. Cited by: §III-C.
  • [46] R. Roriz, J. Cabral, and T. Gomes (2021) Automotive lidar technology: a survey. IEEE Transactions on Intelligent Transportation Systems. Cited by: §I.
  • [47] E. Rublee, V. Rabaud, K. Konolige, and G. Bradski (2011) ORB: an efficient alternative to sift or surf. In 2011 International conference on computer vision, pp. 2564–2571. Cited by: §II-B.
  • [48] N. Rufus, U. K. R. Nair, A. S. B. Kumar, V. Madiraju, and K. M. Krishna (2020) SROM: simple real-time odometry and mapping using lidar data for autonomous vehicles. In 2020 IEEE Intelligent Vehicles Symposium (IV), pp. 1867–1872. Cited by: §III-A.
  • [49] A. Segal, D. Haehnel, and S. Thrun (2009) Generalized-icp.. In Robotics: science and systems, Vol. 2, pp. 435. Cited by: §III-A, §III-A, §III-A, TABLE I, 1.
  • [50] T. Shan and B. Englot (2018) Lego-loam: lightweight and ground-optimized lidar odometry and mapping on variable terrain. In 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 4758–4765. Cited by: §II-B, §III-B, §III-C, §III.
  • [51] L. Sheng, D. Xu, W. Ouyang, and X. Wang (2019) Unsupervised collaborative learning of keyframe detection and visual odometry towards monocular deep slam. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 4302–4311. Cited by: §I.
  • [52] T. Y. Tang, D. J. Yoon, and T. D. Barfoot (2019) A white-noise-on-jerk motion prior for continuous-time trajectory estimation on SE(3). IEEE Robotics and Automation Letters 4 (2), pp. 594–601. Cited by: TABLE II.
  • [53] J. J. Tarrio and S. Pedre (2015) Realtime edge-based visual odometry for a monocular camera. In Proceedings of the IEEE International Conference on Computer Vision, pp. 702–710. Cited by: §I.
  • [54] Q. M. Thomas, O. Wasenmüller, and D. Stricker (2019) Delio: decoupled lidar odometry. In 2019 IEEE Intelligent Vehicles Symposium (IV), pp. 1549–1556. Cited by: §III-B, §III.
  • [55] G. Wang, X. Wu, Z. Liu, and H. Wang (2021) PWCLO-net: deep lidar odometry in 3d point clouds using hierarchical embedding mask optimization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 15910–15919. Cited by: §III-C.
  • [56] H. Wang, C. Wang, C. Chen, and L. Xie (2021) F-loam: fast lidar odometry and mapping. arXiv preprint arXiv:2107.00822. Cited by: §III-B.
  • [57] H. Wang, C. Wang, and L. Xie (2020) Intensity scan context: coding intensity and geometry relations for loop closure detection. In 2020 IEEE International Conference on Robotics and Automation (ICRA), pp. 2095–2101. Cited by: §III-B, TABLE II.
  • [58] R. Wang, S. M. Pizer, and J. Frahm (2019) Recurrent neural network for (un-) supervised learning of monocular video visual odometry and depth. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5555–5564. Cited by: §I, §III-C.
  • [59] R. Wang, M. Schworer, and D. Cremers (2017) Stereo dso: large-scale direct sparse visual odometry with stereo cameras. In Proceedings of the IEEE International Conference on Computer Vision, pp. 3903–3911. Cited by: §I.
  • [60] Y. Wang, Z. Sun, C. Xu, S. E. Sarma, J. Yang, and H. Kong (2020) Lidar iris for loop-closure detection. In 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 5769–5775. Cited by: §II-B.
  • [61] Y. Wang and J. M. Solomon (2019) Deep closest point: learning representations for point cloud registration. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 3523–3532. Cited by: §II-B, §III-A, §III-C, §V.
  • [62] N. Yang, L. v. Stumberg, R. Wang, and D. Cremers (2020) D3vo: deep depth, deep pose and deep uncertainty for monocular visual odometry. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1281–1292. Cited by: §I, §III-C.
  • [63] N. Yang, R. Wang, J. Stuckler, and D. Cremers (2018) Deep virtual stereo odometry: leveraging deep depth prediction for monocular direct sparse odometry. In Proceedings of the European Conference on Computer Vision (ECCV), pp. 817–833. Cited by: §I.
  • [64] D. Yin, Q. Zhang, J. Liu, X. Liang, Y. Wang, J. Maanpää, H. Ma, J. Hyyppä, and R. Chen (2020) Cae-lo: lidar odometry leveraging fully unsupervised convolutional auto-encoder for interest point detection and feature description. arXiv preprint arXiv:2001.01354. Cited by: TABLE II.
  • [65] D. J. Yoon, H. Zhang, M. Gridseth, H. Thomas, and T. D. Barfoot (2021) Unsupervised learning of lidar features for use in a probabilistic trajectory estimator. IEEE Robotics and Automation Letters 6 (2), pp. 2130–2138. Cited by: TABLE II.
  • [66] J. Zhang and S. Singh (2014) LOAM: lidar odometry and mapping in real-time.. In Robotics: Science and Systems, Vol. 2. Cited by: §II-B, §III-B, §III-C, TABLE I, TABLE II, §III.
  • [67] Z. Zhang (1992) Iterative point matching for registration of free-form curves. Ph.D. Thesis, Inria. Cited by: §II-B, §III-A.
  • [68] C. Zheng, Y. Lyu, M. Li, and Z. Zhang (2020) Lodonet: a deep neural network with 2d keypoint matching for 3d lidar odometry estimation. In Proceedings of the 28th ACM International Conference on Multimedia, pp. 2391–2399. Cited by: §II-B, §III-C, §III.
  • [69] X. Zheng and J. Zhu (2021) Efficient lidar odometry for autonomous driving. arXiv preprint arXiv:2104.10879. Cited by: §III-A, §III-A, TABLE I, TABLE II, §III, §IV.
  • [70] Y. Zhou and O. Tuzel (2018) Voxelnet: end-to-end learning for point cloud based 3d object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 4490–4499. Cited by: §III.
  • [71] N. Zhu, J. Marais, D. Bétaille, and M. Berbineau (2018) GNSS position integrity in urban environments: a review of literature. IEEE Transactions on Intelligent Transportation Systems 19 (9), pp. 2762–2778. Cited by: §I.
  • [72] X. Zhu, H. Zhou, T. Wang, F. Hong, Y. Ma, W. Li, H. Li, and D. Lin (2021) Cylindrical and asymmetrical 3d convolution networks for lidar segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 9939–9948. Cited by: §III-A.
  • [73] A. Zihao Zhu, N. Atanasov, and K. Daniilidis (2017) Event-based visual inertial odometry. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5391–5399. Cited by: §I.
  • [74] Y. Zou, P. Ji, Q. Tran, J. Huang, and M. Chandraker (2020) Learning monocular visual odometry via self-supervised long-term modeling. In Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XIV 16, pp. 710–727. Cited by: §I, §III-B, §III-C.