LSG-CPD: Coherent Point Drift with Local Surface Geometry for Point Cloud Registration

03/28/2021 ∙ by Weixiao Liu, et al. ∙ National University of Singapore 0

Probabilistic point cloud registration methods are becoming more popular because of their robustness. However, unlike point-to-plane variants of iterative closest point (ICP) which incorporate local surface geometric information such as surface normals, most probabilistic methods (e.g., coherent point drift (CPD)) ignore such information and build Gaussian mixture models (GMMs) with isotropic Gaussian covariances. This results in sphere-like GMM components which only penalize the point-to-point distance between the two point clouds. In this paper, we propose a novel method called CPD with Local Surface Geometry (LSG-CPD) for rigid point cloud registration. Our method adaptively adds different levels of point-to-plane penalization on top of the point-to-point penalization based on the flatness of the local surface. This results in GMM components with anisotropic covariances. We formulate point cloud registration as a maximum likelihood estimation (MLE) problem and solve it with the Expectation-Maximization (EM) algorithm. In the E step, we demonstrate that the computation can be recast into simple matrix manipulations and efficiently computed on a GPU. In the M step, we perform an unconstrained optimization on a matrix Lie group to efficiently update the rigid transformation of the registration. The proposed method outperforms state-of-the-art algorithms in terms of accuracy and robustness on various datasets captured with range scanners, RGBD cameras, and LiDARs. Also, it is significantly faster than modern implementations of CPD. The code will be released.



There are no comments yet.


page 8

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

Figure 1: CPD with Local Surface Geometry (LSG-CPD). Green denotes the source set; purple denotes the target set. Unlike CPD [myronenko2010point] which constructs a sphere-like GMM component (c) for each point in the source set, the proposed method constructs GMM components on the target set. In the proposed method, plane-like GMM components (a) are created for points with small surface variation [surface_variance], i.e., flat local surface; sphere-like GMM components (b) are created for points with large surface variation.

Point cloud registration plays an important role in various computer vision tasks

[nuchter20076d, whelan2015elasticfusion, chen1992object, schmidt2014dart]. It is a task of aligning two point clouds and estimating the relative transformation between them. The most popular method, iterative closest point (ICP) [besl1992ICP], iteratively computes the alignment from point-to-point correspondence obtained via the closest point. Various point-to-plane ICP variants [chen1992object, Segal-RSS-09, chetverikov2002trimmed, NICP]

make use of surface normals and take advantage of the tendency that most point cloud data are locally planar. However, ICP and its variants require good initialization and are susceptible to outliers, noise, and missing points.

Probabilistic methods formulate point cloud registration as a probability density estimation problem

[myronenko2010point, gao2019filterreg, jian2010robust, stoyanov2012fast]. In particular, methods such as CPD [myronenko2010point] and FilterReg [gao2019filterreg]

describe the geometry of the first point cloud via a probability distribution over Euclidean space with a Gaussian mixture model (GMM). The GMM is constructed by placing a Gaussian component with isotropic covariance at every point in the set. The second set is fitted to it via maximum likelihood estimation (MLE). The introduction of the outlier component in the GMM and the sphere-like isotropic covariance makes these methods robust. However, unlike point-to-plane ICP variants

[chen1992object, chetverikov2002trimmed, Segal-RSS-09, NICP] which take advantage of the local planar geometry in point cloud data, these methods ignore such information. While these methods are robust, the isotropic covariance does not faithfully capture the local planar geometry of the point clouds, leading to inaccuracy in registration.

In this paper, we propose a novel method, CPD with Local Surface Geometry (LSG-CPD), which takes into account local surface geometry: surface normal and surface variation. Surface variation [surface_variance] is closely related to curvature and surface flatness. Intuitively, the flatter the local surface, the smaller the curvature, the smaller the surface variation. We construct a GMM on the target set. The source set is fitted to it via MLE (Fig. 1). However, unlike previous methods [myronenko2010point, gao2019filterreg] which restrict covariances to be isotropic, we adaptively impose different levels of point-to-plane penalization on top of the original point-to-point penalization, resulting in anisotropic covariances (Sec. 3.1). If the local surface is flat, i.e., the surface variation is small, a large point-to-plane penalization is added. The GMM component will be plane-like and points will be mainly attracted towards the local plane instead of the centroid (Fig. 1(a)). If the local surface has large curvature or the normal estimation is noisy, i.e., the surface variation is large, the point-to-plane penalization will be small (Fig. 1(b)). In this case, the GMM component will be sphere-like and robustness is maintained – the registration will not be misled by those ineffectively estimated surface normals. In addition, we propose a method to estimate the weight placed on the outlier component of the GMM (Sec. 3.2). Furthermore, we introduce the idea of confidence filtering (CF) which allows flexibility in reallocating weights of GMM components and outlier weights based on measurement confidence of points (Sec. 3.3). The Expectation-Maximization (EM) algorithm [bishop2006pattern, DEMP1977] is used to solve the MLE problem. To make our method efficient, we reformulate the computation in the E step into parallelizable matrix manipulations which can be efficiently computed on a GPU (Sec. 3.4). In the M step, we propose an unconstrained optimization method on a matrix Lie group and find the optimum with the efficient Newton’s method (Sec. 3.5).

Our motivation is to make the GMM-based description of the point cloud geometry richer and more faithful to the “true” probabilistic distribution by incorporating local surface geometry, afforded by the surface normal and variation, with the original point position. The optimum of the objective function in the EM process will be closer to the “true” transformation compared to that using the isotropic covariances. Thus, our method is able to improve the registration accuracy while maintaining the robustness of probabilistic methods. Results show that the proposed method outperforms state-of-the-art algorithms in terms of accuracy and robustness on a wide variety of dataset captured by range scanners, RGBD cameras, and LiDARs. Thanks to the parallelizable computation and the efficient optimization, our method is about 20 times faster than modern implementations of CPD [myronenko2010point].

2 Related Works

Point cloud registration has been extensively studied in the literature. In this review, we limit the scope to GMM-based probabilistic methods.

One popular formulation of GMM-based methods builds a GMM distribution on one point set and softly assigns correspondence between each pair of points in the two sets probabilistically [rangarajan1997robust, luo2003unified, mcneill2006probabilistic, granger2002multi, myronenko2010point, gao2019filterreg, horaud2010rigid, eckart2018hgmr, min2019robust, ravikumar2017generalised]. Our method falls into this category. The most related works are CPD [myronenko2010point] and FilterReg [gao2019filterreg]. CPD [myronenko2010point] derives the EM procedure and proposes a close-form solution in the M step. Recently, FilterReg [gao2019filterreg] introduces Gaussian filter and twist parameterization to improve the efficiency of the CPD framework. Point-to-point FilterReg [gao2019filterreg] is similar to CPD [myronenko2010point]. However, both methods ignore local surface geometry and place an isotropic covariance around every point in the first point cloud to construct the GMM. Point-to-plane FilterReg [gao2019filterreg] tries to account for the surface normal by substituting the point-to-point penalization with point-to-plane penalization in the M step. This leads to various problems including rank deficiency of the covariance and inconsistency between the probabilistic model in the E and M step. A more technical analysis can be found in Sec. 3.1. ECMPR [horaud2010rigid] updates the whole covariance matrix in the M step. Although it allows general anisotropic covariances, no local geometric information is considered. GMM-Tree [eckart2018hgmr] partitions points into clusters and builds a hierarchy of Gaussian mixtures for the clusters. The covariances, estimated directly by all the points in the cluster instead of local surface geometry, are anisotropic. But it does not account for outliers in its probabilistic modeling, making it not robust to outliers and missing points. In [min2019robust, ravikumar2017generalised], hybrid mixture models (e.g., Gaussian von-Mises-Fisher) are used to model position and orientation uncertainty. The complicated mixture models introduce heavy computation. In contrast, our method incorporates surface normals in a simple GMM adaptively, making it robust to ineffective surface normal estimation.

Another popular line of GMM-based methods build GMM probability distribution on both the source and the target sets [jian2010robust, tabib2018manifold, stoyanov2012fast, campbell2015adaptive]. Point cloud registration is formulated as a problem of minimizing the difference between the two distributions. [jian2010robust] uses KL-divergence to parameterize the distribution difference while [tabib2018manifold] uses squared L2 norm. To improve the efficiency, voxelization [stoyanov2012fast]

and Support Vector Machine

[campbell2015adaptive] are used to represent the probability representation in a compact and efficient way.

3 Methods

3.1 Formulation of Probabilistic Model

The alignment of two point clouds is formulated as an MLE problem. We fix one point cloud (target set) and align the other one (source set) to it. Unlike CPD [myronenko2010point] which places the GMM on the source set, the GMM in our method is built on the target set, which is similar to FilterReg[gao2019filterreg] (Fig. 1). Rigid transformation is applied on the source set and the transformed source set is regarded as the observation of the GMM. The goal is to find the optimal transformation which makes the observation best fit the GMM. Through out the paper, we adopt the following notation:

  • - number of points in the source set

  • - number of points in the target set

  • - source set where the -th row is the position of the -th point in the set

  • - target set where the -th row is the position of the -th point in the set

  • - surface normal matrix where the -th row

    is the unit normal vector of the corresponding

    -th point in ,

Following [myronenko2010point, gao2019filterreg, eckart2018hgmr]

, the probability density function of the GMM is given as:



is the prior probability of

being assigned to the -th component; is the corresponding component density function. Besides the Gaussian components centered at each point in the target set, an outlier component is added to the GMM to make the algorithm robust to outliers and missing points. is the outlier weight of and is the outlier probability density.

In [myronenko2010point, gao2019filterreg], the outlier probability density . As pointed out in [hirose2020bayesian], it does not satisfy a property of probability density function: the integral of over should equal 1. In this paper, we follow [hirose2020bayesian] and define a working space whose volume equals . represents a volume which encapsulates the target set. if and otherwise. In [myronenko2010point, gao2019filterreg], for ; for . Both and are uniform across all points in the source and target set, respectively. In Sec. 3.2, we introduce a method to estimate from the outlier ratio of the point cloud. In Sec. 3.3, we propose a method which allows flexibility to assign different and different to different and respectively according to their measurement confidence.

The key novelty of our method lies in our design of which incorporates both point-to-point and point-to-plane penalization into the probabilistic modeling. Surface normals are easily obtainable features of point cloud data [hoppe1992surface, surface_variance] and have been successfully utilized in many ICP variants [chen1992object, Segal-RSS-09, 10.1007/978-3-319-10404-1_23, NICP] to improve the registration performance. Point-to-plane extension of FilterReg [gao2019filterreg] tries to extend the CPD [myronenko2010point] framework by substituting the point-to-point distance in the original objective function with the point-to-plane distance in the M step without further modifying the E step. This will lead to rank deficiency of the covariances, thus resulting in an ill-posed optimization in the M step. More importantly, the underlying probabilistic model in the E and M step are inconsistent. Neither convergence nor optimality is guaranteed for the EM process [wu1983EMConverge]. In this paper, we propose a component density function as follows:


is a normalizing constant. is a full-rank anisotropic covariance matrix of which the inverse encodes information of the local surface geometry. is the covariance multiplier. It is initialized autonomously similar to CPD [myronenko2010point]. The identity matrix in penalizes the point-to-point distance between and similar to [myronenko2010point, gao2019filterreg]. The rank one projection matrix penalizes the point-to-plane distance between and the local plane defined by at . These two matrices mutually benefit each other. enhances the accuracy provided by the point-to-point penalization. Meanwhile, serves as a regularizer, pulling the problem away from ill-posedness. The penalization coefficient is set adaptively to add different levels of point-to-plane penalization depending on how flat the local surface is. This is realized by evaluating the surface variation [surface_variance] within the neighborhood of . Surface variation [surface_variance]

is the ratio of the smallest eigenvalue to the trace of the covariance matrix of a local neighborhood (note that this covariance matrix is not

). It is computed via eigenanalysis. It describes the degree of deviation of points from the tangent plane. Intuitively, the flatter the local surface, the smaller the curvature, the smaller the surface variation.

is given by a modified logistic sigmoid function:


where is the upper bound; is a coefficient governing the sensitivity of to . when (large surface variation, non-flat local surface geometry); when (small surface variation, flat local surface geometry). By making adaptive to local surface geometry, our algorithm obtains several desirable properties: (1) large point-to-plane penalization is only applied to points around which the local surface can be well approximated by a plane; (2) little directional preference is imposed when the curvature is large or effective normal estimation is unavailable, e.g., corrupted by noise.

We denote as the image of the point transformed by the rigid transformation . Given the GMM and the observation, the negative log-likelihood function is:


Our goal is to find the optimal rigid transformation which minimizes . However, directly solving this optimization problem is intractable. Thus, the EM algorithm [bishop2006pattern, DEMP1977] is applied to find the solution. The correspondence between the points in and is treated as a latent variable.

E step: We use the transformation from the previous iteration to compute the posterior correspondence probability between each pair of point in and via Bayes’ rule. The correspondence probability is stored in a matrix where the -th entry

represents the posterior probability of the point

being assigned to the -th component:


The computation of can be reformulated into efficient and parallelizable matrix manipulations (Sec. 3.4).

M step: The rigid transformation and covariance multiplier are inferred by minimizing the following [myronenko2010point, bishop2006pattern] (see Supplementary Material for details):


In the case where the covariance matrix is isotropic, a closed form solution of Eq. (6) can be found [myronenko2010point]. However, this is not the case when is anisotropic[horaud2010rigid]. Thus, we propose an unconstrained optimization method on a matrix Lie group to solve the optimization in the M step in Sec. 3.5.

3.2 Outlier Weight Estimation

In [myronenko2010point, gao2019filterreg], the uniform outlier weight is manually tuned by the user. However,

is merely a hyperparameter which is different from the outlier ratio

- the ratio of outliers to the number of points in the source set. This makes it hard to choose in different tasks intuitively. Here, we introduce a method to determine from using the fact that and are probabilistically related. In this way, the user only needs to set the outlier ratio , which is more intuitive than setting . When the point clouds are well aligned by the “true” transformation , the expectation of the outlier number should equal the true number of outliers :



is the posterior probability of the point being an outlier. can then be solved from Eq. (7). However, is not available a priori. Thus, from Eq. (7), we can only derive the upper bound of , i.e., , by setting as its upper bound (Eq. (2)). To make the algorithm robust, generally a large is desired. However, if , probabilistically, more points are regarded as outliers than the truth. The inlier density has the risk of being submerged by the outlier density. And as , the density of the GMM will be primarily dominated by the outlier component. The algorithm tends to get stuck at a local minimum. Therefore, we set in practice to strike a balance.

3.3 Confidence Filtering

Most GMM-based methods [gao2019filterreg, horaud2010rigid, jian2010robust, myronenko2010point, eckart2018hgmr, min2019robust] weigh all the GMM components uniformly with . In addition, the outlier weight is uniform across all points in the source set. This setting is reasonable when the measurement error induced by the sensor is uniform within its range, e.g., range scanners. And thus the measurement confidence of all the points is equal in general. However, point clouds acquired from consumer-grade RGBD cameras are subject to bias and noise which grows with the increment of depth [basso2018robust, depth_camera_error_model]. The larger the depth of the point is, the less confident (or more uncertain) the measurement is. The algorithm should pay more attention to the alignment of points measured with high confidence. In this paper, we define as the measurement confidence of a point , where is the error model which provides the statistical measurement error of a point measured at [depth_camera_error_model]; is the minimum measurement error within the sensor range. Error analysis and modeling of different sensors can be found in [depth_camera_error_model, horaud2016overview, christian2013survey]. The confidence filtering (CF) modifies and as:


The component prior probability decreases as decreases. The reallocation of reduces the attraction force of a component generated by a point with less confidence and vice versa. The outlier weight increases as decreases. is the baseline outlier weight determined in Sec. 3.2. As the measurement of becomes less confident, we would rather categorize it as an “outlier” to lower its contribution to the registration. Furthermore, points with confidence lower than a threshold can be truncated. This not only improves the accuracy but also reduces the runtime as shown in Sec. 4.3. Note that the uniformly weighing setting is a special case where for any .

3.4 E Step: Efficient Computation of Correspondence Matrix

The introduction of the correspondence matrix provides GMM-based methods with an inherent advantage on robustness. However, computing is the most computationally expensive part of the algorithm: the correspondence probability of every pair of points in and needs to be computed. We reformulate this process into simple matrix manipulations which can be efficiently computed on a GPU.

Plugging and defined in the previous sections and Eq. (2) into Eq. (5), the -th entry of the correspondence matrix becomes:


where and is the -th entry of a matrix :

Exploiting the special formulation of , the computation of can be recast into fundamental matrix manipulations:


in which is the element-wise product of matrices; is the element-wise exponential; the -th entry of

in which is the column vector with all ones,

are terms independent of transformation , and thus can be pre-computed.

3.5 M Step: Optimization On Matrix Lie Group

In the M step, our goal is to infer the optimal transformation and variance multiplier

by minimizing given by Eq. (6). We propose an unconstrained optimization method to solve this optimization problem. Instead of parameterizing the rigid transformation, our method optimizes on the matrix Lie group via Newton’s method. The gradient and the Hessian matrix of a function on matrix Lie group are defined with respect to the basis in Lie algebra [stochastic]. We denote as the basis matrices of Lie algebra. The gradient . The Hessian is asymmetric in which the -th entry . is the right derivative operators with respect to the -th basis matrix . Both the gradient and the Hessian can be derived in closed-form:


, , and are the homogeneous form of , , and , respectively. is the augmented inverse covariance matrix where denotes the direct sum. Detailed derivations can be found in the Supplementary Material. It is worth noting that this method can be easily extended to other types of registration of which the transformation can be represented by a matrix Lie group, e.g., affine transformation, by replacing the basis matrices in Lie algebra. The transformation is updated by Newton’s method until convergence:


where is the operation which maps the vector to the corresponding matrix Lie algebra [murray1994mathematical]. After gets updated, is inferred by setting the corresponding partial derivative of Eq. (6) to zero. Unlike ECMPR[horaud2010rigid] in which the whole covariance matrix is updated, only the covariance multiplier gets updated in our approach. As a result, while the volume of the covariance is updated in every iteration, the local geometric structure is preserved.

4 Results

We conduct a wide variety of experiments on public 3D point cloud datasets, including both object and environment dataset, to show the robustness and accuracy of our method. We implement our method on Matlab. We evaluate on a computer running Intel Core i9-9900K with an NVIDIA GeForce RTX 2080Ti GPU. In all the experiments, the code of the baseline methods are either provided directly by the authors or taken from the popular open source library with various performance optimization. The parameters for all the baseline methods are either provided by the authors or the softwares or well-tuned by ourselves if not provided. CF is only used in the experiment on the RGBD dataset (Sec.

4.3). The error model of the sensor (Asus Xtion Pro), which is a quadratic function of the depth of , is sourced from [depth_camera_error_model]. More details on the experiment settings can be found in the Supplementary Material.

4.1 Registration on Data with Outliers and Noise

We follow [myronenko2010point, gao2019filterreg] and test on laser range data by adding outliers and noise, respectively. For both experiments, we use the Bunny dataset in Stanford 3D scanning repository [curless1996volumetric]. We downsample the original point cloud to about 3500 points and rotate it about a random axis by 50 degrees. We compare with 4 baseline method: CPD [myronenko2010point] , FilterReg [gao2019filterreg], ECMPR [horaud2010rigid], and TrICP [chetverikov2002trimmed]. CPD and FilterReg represent GMM-based algorithm with isotropic covariances; ECMPR represents GMM-based algorithm with anisotropic covariances; TrICP, a widely used robust ICP variant, represents the ICP methods. Following [gao2019filterreg], we use the corresponding point-to-point distance to measure the registration error:


where is the i-th point in the source set; and are the transformations of the registration and the ground truth, respectively. All methods are exhaustively run with 100 iterations to ensure convergence. All the results are statistical: errors are averaged values of 30 independent runs.

Figure 2: Outlier results. (a) Errors (Eq. (14)) of different methods. (b) Enlarged image showing errors of CPD, FilterReg, and LSG-CPD. (c) Initialization and results of different methods on data with 0.5 outlier ratio. TrICP point-to-plane result is shown as the result for TrICP. The images are cropped for better visualization. (d) Average runtime of CPD and LSG-CPD on data with different outlier ratios. (e) Averaged iterations of CPD and LSG-CPD on data with different outlier ratios.
Figure 3: Noise results. (a) Errors (Eq. (14)) of different methods. (b) Enlarged image showing errors of CPD, FilterReg, and LSG-CPD. (c) Initialization and results of different methods on data with 0.03m noise level. TrICP point-to-plane result is shown as the result for TrICP. Denoised point clouds are used for better visualization of registration result. (d) Averaged penalization coefficient (Eq. (2)) of all points in the target set with respect to the noise level.

In the outlier experiment, we add different ratios of Gaussian outliers. Results are shown in Fig. 2. All the GMM-based methods significantly outperform TrICP when there are outliers presented, where CPD, FilterReg, and the proposed method are the best. However, with the increase of the outlier ratio, the error of CPD and FilterReg increases while the performance of the proposed method maintains and outperforms CPD and FilterReg (Fig. 2(b)).

Figure 4: Multiview cross-section results where the corresponding regions are indicated in (a). Thinner curves indicate better registration. (a) shows the whole object modeling results of the proposed method; (b) and (c) show the initial and ground truth; the rest show the cross-section results of different methods.

In the noise experiment, we corrupt each point in the source and target set by adding Gaussian noise with different std. Results are shown in Fig. 3. All the GMM-based methods significantly outperform TrICP when there is noise. CPD, FilterReg, and the proposed methods have similar performance among which the proposed method has a small advantage (Fig. 3(b)). We also show the averaged penalization coefficient at different noise levels (Fig. 3(d)). As the noise level increases, decreases. This is because without noise, the local surface will look like a plane and thus will be relatively large; when the points are corrupted, the local plane will also be corrupted and normal estimation becomes less effective, resulting in small as we desire.

We compare the speed with CPD111Matlab official implementation of CPD is used for comparison: on data with different outlier ratios. In this experiment, we terminate the algorithm when the rotation error is less than and the translation error is less than 1cm. Fig. 2(d) and (e) summarize the result. Our method is 5 times faster than CPD (89ms vs 450ms) when there are no outliers and 29 times faster (346ms vs 9940ms) when the outlier ratio is 1. Generally, our method takes less iterations to converge. And the time for each iteration is less than that of CPD. Averaging across all the experiment, our method is 20 times faster than CPD.

For FilterReg, we use the point-to-point version. We tested the point-to-plane version extensively but find it doesn’t give a good performance. We think the reasons could be the singular covariance and the inconsistency of the probabilistic model in the E and M step, as mentioned in Sec. 3.1, making it not robust to bad initialization and/or outliers and noise. In the following experiments, if not mentioned, FilterReg is tested with the point-to-point version.

Method Dragon Happy Armadillo
Initial 5.12 5.41 5.23
TrICP 1.87 1.16 0.66
FilterReg 1.13 2.13 0.63
MATrICP 1.18 1.29 0.53
EMPMR 1.39 1.07 0.50
LSG-CPD (ours) 1.00 1.10 0.44
Table 1: Object Modeling Errors (Eq. (14), unit mm).

4.2 Object Modeling on Range Datasets

In this experiment, we test on object modeling from laser range data of objects captured from different viewpoints. We evaluate on three datasets from the Stanford 3D scanning repository [curless1996volumetric], i.e., Dragon Stand, Happy Stand, and Armadillo. Each consisting of 12 or 15 sequential scans of an object with evenly spaced view angles. We compare our method against two pairwise methods, i.e., FilterReg [gao2019filterreg] and TrICP [chetverikov2002trimmed], and two multi-view methods, i.e., MATrICP [li2014improved] and EMPMR [zhu2020registration]. Object modeling for pairwise methods makes use of the spatial information provided by the scanning sequence. Point clouds that are spatially adjacent to each other are first pairwisely registered. Then, all the point clouds are aligned to the first one for whole object modeling. Following [zhu2020registration], we perturb the point clouds from its ground truth pose with a random pose (rad in rotation and m in translation). We downsample the point cloud of each scan with voxelized grids, resulting in about 3500 points for each view. For each dataset, we carry out 20 independent experiments. We use the averaged corresponding point-to-point distance (Eq. (14

)) as the evaluation metric.

Qualitative results are shown in Table 1. Fig. 4 shows the cross-section results of different methods. The proposed method outperforms the two pairwise baseline methods. The chain-like reconstruction method used for pairwise methods has a disadvantage of error accumulation. The error in each pairwise registration will propagate and accumulate along the chain when aligned to the base point cloud. On the other hand, multi-view methods can simultaneously process all the point clouds and take advantage of techniques such as motion averaging [li2014improved]. Even with such a significant disadvantage, compared with state-of-the-art multi-view methods, our method outperforms them on two datasets and is on par with them on the third.

Figure 5: Stanford Lounge Result. (a) Examples of registration result. (b) Angle error and speed versus truncating threshold. (c) Registration results of various methods. The results of the baseline methods with a star (*) are sourced from [gao2019filterreg]; the results of other baseline methods are sourced from [eckart2018hgmr].

4.3 Registration on RGBD Dataset

In this section, we follow the setting in [eckart2018hgmr] and test on the Stanford Lounge dataset [zhou2013dense] which contains range data of an indoor scene captured with an RGBD camera. We register every 5-th frame for the first 400 frames. Each frame is downsampled to about 5000 points. Following [eckart2018hgmr], we use the average Euler angle deviation from the ground truth to measure the rotation error between each pair.

The results are shown in Fig. 5. Our method achieves the state-of-the-art without CF. With CF, the accuracy improves greatly and outperforms the state-of-the-art. The angle error is more than 2 times lower than the state-of-the-art. Additionally, CF greatly accelerates the registration by truncating points with lower accuracy confidence. The registration frame rate increases from 3 FPS to 19 FPS.

Figure 6: Kitti Odometry Benchmark 07 Sequence Result Visualization. Red indicates the ground truth; blue indicates the registration result.
Method Rel. Rot. (°) Rel. Tran. (m) Abs. Rot. (°) Abs. Tran. (m) Last Rot. (°) Last Tran. (m)
ICP-pt2pt 0.283 (3.690) 0.214 (5.966) 9.476 (17.641) 17.296 (36.379) 7.100 16.131
ICP-pt2pl 0.475 (8.683) 0.187 (14.320) 7.986 (16.496) 48.422 (70.712) 6.526 58.902
TrICP-pt2pt 0.123 (0.689) 0.117 (2.324) 1.606 (2.476) 10.446 (24.881) 1.587 24.449
TrICP-pt2pl 0.071 (0.607) 0.080 (2.604) 1.365 (3.150) 8.914 (33.113) 3.073 33.094
GICP 0.070 (0.787) 0.035 (2.108) 1.181 (1.799) 3.052 (6.939) 1.772 6.593
CPD 0.167 (1.100) 0.190 (2.701) 5.091 (7.900) 21.696 (56.690) 6.353 47.359
FilterReg 0.091 (0.615) 0.029 (0.099) 1.995 (7.016) 3.812 (5.666) 6.984 5.373
LSG-CPD (ours) 0.062 (0.599) 0.021 (0.091) 0.795 (1.363) 1.069 (1.782) 1.320 0.297
Table 2: Kitti Odometry Benchmark 07 Sequence Result. We show the relative rotation error (Rel. Rot.), relative translation error (Rel. Tran.), absolute rotation error (Abs. Rot.), absolute translation error (Abs. Tran.), last frame rotation error (Last Rot.), and last frame translation error (Last Tran.) of different methods. In the first four columns of result, the value outside the parenthesis is the averaged error of the 550 pairs; the value inside is the maximum.

We also experiment on truncating points at different confidence thresholds (Fig. 5(b)). The registration error decreases from to by just adding CF without any truncation. As expected, the registration speed increases as the truncating threshold increases. However, if the threshold is too large, too many points, including the accurately measured points, are truncated. The point cloud will contain insufficient information for registration. If the threshold is too small, i.e., too few points are truncated, the point cloud contains many noisy data.

4.4 Registration on LiDAR Dataset

In this experiment, we evaluate on the Kitti dataset [geiger2012we]. We test on the 07 sequence of the odometry benchmark, which contains a drive for about 700 meters. We downsample the point clouds to about 5000 points with voxelized grids. We register together every other frame pairwisely for the whole sequence, resulting in 550 pairs of point clouds for registration. All the frames are aligned to the first frame to reconstruct the whole sequence. We evaluate the averaged relative and absolute rotation and translation errors. We also compute the rotation and translation errors of the last frame. The translation error is the Euclidean distance between the registration result and the ground truth. The rotation error is the angle of the axis-angle parameterization of the relative rotation between the ground truth and the registration result. The results are shown in Fig. 6 and Table 2. Our method outperforms all baseline methods on every evaluation metric. Remarkably, even without loop closure, our proposed method is able to achieve very small translation and rotation errors in the last frame of the whole sequence. All the ICP-based methods except GICP [Segal-RSS-09] have relatively large errors. For FilterReg[gao2019filterreg], we use the point-to-plane version as the point-to-point version does not give a good result. FilterReg [gao2019filterreg] and the proposed method outperform CPD [myronenko2010point].

5 Discussion & Future Works

Extensive experiments on various kinds of data show that our method is accurate and robust. As shown in Sec. 4.1, ICP-based methods require good initialization and are not robust to outliers and noise. CPD and point-to-point FilterReg, although robust to outliers, noise, and missing points, does not perform well on the Kitti dataset. We hypothesize this is because LiDAR data contain many large planes. The isotropic covariance in CPD and point-to-point FilterReg are not able to accurately represent the geometry of the data, thus leading to bad performance. Point-to-plane FilterReg performs well on the Stanford Lounge and Kitti dataset in which the relative pose between adjacent frames for registration are close to each other. However, it is sensitive to bad initialization and are not robust to outliers and noise as discussed in Sec. 3.1 and 4.1. Our method is robust to outliers, noise, and missing points and accurate. The reason for the robustness is twofold. The probabilistic modeling which includes the outlier component make it robust to outliers and missing points like other GMM-based methods [myronenko2010point, gao2019filterreg]. The covariance adaptively adjusts its shape according to the local surface geometry, making it insusceptible to ineffectively estimated surface normals. The reason for the accuracy is also credit to the local surface geometry-based covariance. By using the surface variation [surface_variance] to adjust the point-to-plane penalization in the covariance, our method can accurately represents the geometry of a wide variety of data, including data of different scales, with/without large planes. Future works include extending the current framework to non-rigid registration and applying the method to more computer vision tasks, e.g.

, pose estimation and SLAM.

6 Conclusions

We present a novel GMM-based probabilistic registration method which outperforms state-of-the-art algorithms in terms of accuracy and robustness on various datasets. Our method makes use of the surface normal and surface variation [surface_variance] to create anisotropic covariances in the GMM which faithfully represents the geometry of the point cloud. We show that the computation can be reformulated into simple matrix manipulations which can be efficiently computed on a GPU. In addition, we propose an efficient optimization method to solve for the optimal transformation on a matrix Lie group in the EM process. Thanks to the efficient computation and optimization, the proposed method is about 20 times faster than modern implementations of CPD [myronenko2010point].

7 Acknowledgements

This work was supported by NUS Startup grants R-265-000-665-133, R-265-000-665-731, Faculty Board account C-265-000-071-001, and JHU internal funds.