## 1 Introduction

With the increasing availability of 3D data acquisition devices such as Kinect [1], [2], researchers are increasingly exploiting 3D information for a variety of tasks. Registration of 3D point-clouds is a necessary initial step for many of them; examples include change detection in a scene, industrial quality control and pose tracking [3]. Generally, two point-clouds (called the reference and the template, respectively) of an object or scene are captured at two time instances, possibly from different camera positions, and the scene may have undergone changes in between. Registration refers to the process of translating, rotating and scaling the template such that it optimally aligns with the reference. In this paper, we focus on the automatic rigid registration of 3D point-clouds. The proposed algorithm does not require any color, texture, point-correspondence or point-topology information. Moreover, the presence of noise, outliers and missing parts makes the problem even more challenging. We now briefly review the related literature and state our contributions (Fig. 1).

One of the earliest and most popular methods for rigid point-set registration is Iterative Closest Point (ICP) [5], [6]. Using a non-linear optimization algorithm, ICP minimizes the mean squared distance between the two point sets. Though this method is simple to implement, it is sensitive to outliers and its performance depends on the initial alignment [4]. Several variants of ICP also exist [7], [8].

While ICP [5] assigns discrete correspondences between the points of the two sets, later methods such as Robust Point Matching (RPM) [9] and [10], [11], [12] assign soft correspondences. Another family of methods [13], [14] treats registration as a maximum likelihood estimation problem in which the template points are the centroids of a Gaussian Mixture Model (GMM) and the reference points are the data. The Expectation-Maximization (EM) algorithm is generally used to optimize the likelihood function. A closed-form solution to the M-step of the EM algorithm for the multidimensional case is proposed in [15]; the method is called Coherent Point Drift (CPD). The GMM-based methods perform better in the presence of noise. CPD was refined in [16] and [17] for structured outliers. Other notable recent approaches include [18], [19] and [20]. In 2016, a new Gravitational Approach (GA) based registration algorithm was proposed [4]. Under GA, the template moves towards the reference under the influence of the gravitational force induced by the reference. GA is shown to outperform CPD under more than 50% uniform noise, but its performance degrades with increasing rotation and with change (extra or missing parts) [4]. We propose substantial modifications to the GA algorithm [4] to overcome these disadvantages. Our contributions are as follows. (1) GA minimizes the distance between the centers of mass of the two point-clouds but does not ensure proper shape alignment (see Fig. 1). We constrain the force of attraction with a measure of local shape similarity for better alignment of the shapes (i.e., better handling of rotation and outliers). (2) In GA, the force of attraction is inversely proportional to the distance between the two points involved. As a result, the increased speed near local minima (where the distance is small) prevents the algorithm from converging (see Fig. 2), and GA introduces a number of free parameters to handle this situation. In our algorithm, the force of attraction is proportional to a monotonically increasing function of distance. This modification allows for better convergence without the need for extra free parameters. (3) GA has limited capability to handle scale change [4]. To handle scale change, we propose an orientation- and translation-invariant model that uses the spatial point distribution of the two point-clouds. (4) The new algorithm is evaluated extensively in a fully automatic setting, as opposed to the seven free, manually adjusted parameters of the GA method [4].

## 2 Gravitational Approach

In Gravitational Approach (GA) based registration, each point in one point set (called the reference) attracts each point in the other point set (called the template). The force of attraction is governed by the following formula.

(1)

In (1), G is the gravitational constant and  represents the total force applied on a point of the template. The symbols , ,  and  represent the th point in the reference, the mass of , the absolute coordinates of  and the number of points in the reference, respectively. The symbols ,  and  represent a unit vector in the direction of the force, the velocity of  in the previous iteration and a constant, respectively. The template, as a rigid body, is displaced iteratively under the influence of the cumulative force induced by all the points in the reference. The  term prevents the force from growing without bound when the distance between the reference and the template becomes small. The  term in (1) acts as friction that controls the velocity near local minima. Given the force, the displacement of the template is estimated following Newton's second law of motion. The scale and rotation required for registration are estimated using the new positions of the template points obtained from (1).

## 3 The Proposed Registration Approach

The objective of the gravitational force is to minimize the distance between the centers of mass of the two objects. The objective of registration, on the other hand, is to align the two objects (here, point-clouds) such that the parts of the objects having similar shapes map to each other. Therefore, we need to modify the principle of gravitation to accommodate this constraint. We explain these modifications next.
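For context, one iteration of the baseline gravitational update of Section 2 can be sketched in a few lines. This is an illustrative sketch only: the parameter names (`G`, `damping`, `eps`) and the per-point force accumulation are our assumptions, not the authors' implementation.

```python
import numpy as np

def ga_step(template, reference, velocity, G=1e-3, damping=0.9, eps=1e-2, dt=1.0):
    """One illustrative GA-style update (a sketch, not the authors' code).

    Each reference point attracts each template point; the accumulated
    force displaces the template (rotation and scale are estimated
    separately from the displaced points, as in Section 3.2).
    """
    masses = np.ones(len(reference))          # unit mass per point
    new_velocity = np.empty_like(template)
    for i, t in enumerate(template):
        diff = reference - t                  # vectors toward each reference point
        dist2 = np.sum(diff**2, axis=1) + eps # eps caps the force at small distances
        force = G * np.sum((masses / dist2)[:, None] * diff, axis=0)
        # friction-like damping of the previous velocity near local minima
        new_velocity[i] = damping * velocity[i] + force * dt
    return template + new_velocity * dt, new_velocity
```

Note that the attraction here depends only on distance and mass, which is exactly the behavior the shape-similarity constraint below is designed to correct.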

### 3.1 Modified Gravitational Approach

In GA, . As a result, in registration following (1), the heavier parts of the point-clouds attract each other more strongly irrespective of shape, resulting in misalignment (e.g., Fig. 1). Therefore, we modify (1) such that

(2)

where  represents a measure of shape similarity of the local neighborhoods centered at  and , respectively, and  is a monotonically increasing function of . We have evaluated different features that can represent shape in a spatially local domain (e.g., histogram of normals, coefficients of a polynomial approximating the local surface, curvature). The curvature [21] proved to be the best representative of local shape. As a similarity measure of shape, any radial basis function (RBF) whose output decreases monotonically with increasing dissimilarity can be used. In our implementation, we used

(3)

where  denotes absolute value, and  and  represent the spread of the RBF and the curvature of the neighborhood centered at , respectively. In the proposed registration approach, we want the template to make large movements when the distance between the template and the reference is large, and to take tiny steps for fine-tuning when close to alignment. This is ensured by the following formula.

(4)

The value of the function in (2) or the function in (4) can be the argument itself or some suitable monotonically increasing function of it, as we shall discuss in Section 4. Therefore, we propose the following formula to estimate the total force applied on a point of the template.

(5)

Note that in (5) we are able to drop two free parameters,  and , used in (1) of GA based registration [4].
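A minimal sketch of the shape-weighted force may help fix ideas. The Gaussian RBF on the absolute curvature difference and the choice f(d) = d are assumptions for illustration (the paper leaves both the RBF and f open); all parameter names are ours.

```python
import numpy as np

def modified_force(template, reference, curv_t, curv_r, sigma=0.1):
    """Sketch of a shape-weighted force in the spirit of (5).

    Attraction grows with distance (monotonically increasing f, here
    f(d) = d) and is gated by curvature similarity, so similarly shaped
    regions pull on each other hardest.
    """
    forces = np.zeros_like(template)
    for i, t in enumerate(template):
        diff = reference - t                               # vectors toward reference points
        # RBF shape similarity in the spirit of (3): near 1 for similar
        # curvatures, decaying toward 0 as curvatures differ
        w = np.exp(-np.abs(curv_r - curv_t[i]) / sigma)
        # magnitude w * d along the unit vector diff / d simplifies to w * diff
        forces[i] = np.sum(w[:, None] * diff, axis=0)
    return forces
```

With this weighting, a heavy but dissimilarly shaped outlier contributes almost nothing to the force, which is the behavior exploited in the structured-outlier experiments of Section 5.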

### 3.2 Estimating Translation, Rotation and Scale

We estimate the displacement (say ) of the template following Newton's second law of motion.

(6)

In (6), ,  and  represent the total force () applied on the template , the mass of  and the velocity of  in the previous iteration, respectively.

The rotation matrix is estimated following the Kabsch algorithm [22]. Let  and  represent the mean-subtracted matrices of the template coordinates before and after the translation of (6), and let  represent the cross-covariance matrix. The rotation matrix  is estimated using the Singular Value Decomposition (SVD) of . After the SVD, let . Then,

(7)
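The Kabsch step can be sketched as follows. This is a standard SVD-based implementation of [22] with illustrative names, not the authors' code; the determinant correction guards against reflections.

```python
import numpy as np

def kabsch_rotation(P, Q):
    """Optimal rotation aligning mean-centred point set P onto Q (Kabsch, 1976).

    P, Q: (n, 3) arrays of corresponding points, already mean-subtracted.
    """
    H = P.T @ Q                                  # cross-covariance matrix
    U, S, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))       # +1 or -1
    D = np.diag([1.0, 1.0, d])                   # flip last axis if improper
    return Vt.T @ D @ U.T                        # proper rotation matrix
```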

In GA, the scale is estimated as the ratio of the positions estimated using (1) to the previous positions of the template points. Any error in estimating the translation using (1) therefore propagates to the scale estimate at each iteration of GA based registration. Our model does not depend on the estimation of translation. We find the eigenvalues of the covariance matrix (say ) of each of the two point-clouds. The largest eigenvalue is a measure of the length of the point-cloud distribution in the direction of the largest variance. This measurement is independent of the orientation or relative position of the two point-clouds. Using SVD we have  where the diagonal elements of  represent the eigenvalues and  and  are orthogonal matrices. We estimate the scale as  where  and  are the largest eigenvalues of the reference and the template, respectively (see Supplementary Fig. S1). We scale the template () as

(8)

In (8),  stands for the iteration number and  represents a square matrix with  on the diagonal and zeros elsewhere. In each iteration of our algorithm, the template is updated using

(9) |

In (9),  is a matrix in which each row represents the mean position of the template.
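The eigenvalue-based scale estimate of Section 3.2 can be sketched as below. Taking the square root of the eigenvalue ratio is our assumption (covariance eigenvalues grow with the square of the scale); the exact form of the ratio is not spelled out in the text.

```python
import numpy as np

def estimate_scale(reference, template):
    """Orientation- and translation-invariant scale estimate (a sketch).

    Uses the largest eigenvalue of each cloud's 3x3 covariance matrix,
    i.e. the variance along the direction of largest spread, which does
    not depend on the clouds' relative position or orientation.
    """
    lam_r = np.linalg.eigvalsh(np.cov(reference.T)).max()  # reference spread
    lam_t = np.linalg.eigvalsh(np.cov(template.T)).max()   # template spread
    return np.sqrt(lam_r / lam_t)
```

Because no translation estimate enters this computation, translation errors cannot propagate into the scale, unlike in GA [4].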

## 4 Implementation Details and Discussion

The value of shape similarity is the main factor controlling the translation and rotation of the template. If we give equal weight to all the local shapes, then we set . In indoor and some outdoor scenes, non-planar local shapes play a more vital role in registration than planar shapes. Therefore, we can set  as follows to give more weight to non-planar local shapes.

(10)

For computing the curvature (), setting the radii of the local neighborhoods to  and , respectively, yields good results. We design  of (4) as follows.

(11)

A higher value of  results in greater speed but causes oscillation about the local minima. A lower value results in better registration but makes the process slow. The mass of every point is assumed to be 1 and  is set to 0; both can be modified depending on prior information, if any. We also suggest that  of (5) take a larger value near convergence, when the algorithm takes smaller steps. We modify  as follows.

(12)

where we set  and  is the number of iterations. For noisy real scenarios, a pyramidal approach can be taken in which the size of the local neighborhood and  of (3) decrease with iteration (Supplementary Fig. S2).

## 5 Evaluation

On Synthetic Data: We evaluate the performance of the proposed method on data with different amounts and types of noise (structured and unstructured outliers, missing data). In Fig. 3 we compare our method with GA [4] in the presence of structured outliers. When the mass (number of points) of the outlier is less than that of the reference (a human scan), both methods work well (Fig. 3, col 1). As the mass of the outlier increases, under GA the outlier rather than the human shape moves towards the reference (Fig. 3, cols 2, 3). Under our method, regardless of the mass of the outlier, the human shape (downloaded from [23], [24]) in the template registers with the human shape in the reference. This highlights the advantage of our method over GA [4]: unlike GA, where the movement of the template depends on the mass, in our method the movement depends on the shape.

Fig. 4 shows the performance of the proposed method on data with missing points. We delete 18% consecutive points from the head of the Armadillo [25] data. In both cases, whether the full Armadillo or the one with missing points is used as the template (with the other acting as the reference), our method gives the desired result (Fig. 4), whereas the competing methods give less accurate results (Supplementary Fig. S3).

Fig. 4c and d show the results when 50% Gaussian and uniform outliers, respectively, are added to the data. To quantitatively compare the proposed approach with GA [4], CPD [15] and ICP [5], we follow the evaluation protocol of [4] and use the same bunny point-cloud from the Stanford 3D Scanning Repository [25] as used in [4] and [15]. We add 5%, 10%, 20%, 40% and 50% Gaussian and uniform noise to the template. For each of the 10 noise type-percentage combinations, 500 random transformations are applied to the point-cloud to form the template; the non-transformed point-cloud acts as the reference. Fig. 5 plots the amount of noise vs. Root Mean Square Error (RMSE) for the four competing methods. All methods perform better under Gaussian than under uniform noise, and overall CPD [15] performs better than ICP [5]. Our method outperforms all three competing methods.
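The RMSE used in this protocol is the standard root-mean-square distance between corresponding points, e.g.:

```python
import numpy as np

def rmse(aligned, reference):
    """Root Mean Square Error between corresponding points of two
    (n, 3) point sets, as in the evaluation protocol of [4]."""
    return np.sqrt(np.mean(np.sum((aligned - reference) ** 2, axis=1)))
```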

On Real Data: To evaluate the proposed method in real scenarios, we capture 3D point-cloud scans of a number of scenes using Kinect, with two captures per scene. Fig. 6 demonstrates one example. In the second capture, the position of the remote-controller has been changed and the front box has been opened. To make the problem even more difficult, we translate the template by 0.2 meters along the x-axis. The registration result shows proper alignment of the front box, cup and background box. Notice that the cover of the front box and the remote-controller have not been aligned with anything, as the positions of these items were changed in the second capture. Thus, the proposed registration approach can be used for change detection in real scenes.

## 6 Conclusions

We have proposed a novel rigid point-set registration algorithm with special characteristics (e.g., a shape constraint, translation proportional to distance, and a spatial point-distribution model for handling scale) that outperforms the competing approaches [4], [15] and [5]. The proposed approach registers better in difficult conditions such as missing object parts, large rotations and different amounts of structured and unstructured outliers. We plan to employ the proposed registration approach for efficient change detection.

## References

- [1] Zhengyou Zhang, “Microsoft kinect sensor and its effect,” IEEE multimedia, vol. 19, no. 2, pp. 4–10, 2012.
- [2] “Microsoft kinect,” https://developer.microsoft.com/en-us/windows/kinect, Accessed: 4th Jan, 2017.
- [3] Song Ge and Guoliang Fan, “Sequential non-rigid point registration for 3d human pose tracking,” in Int. Conf. on Image Processing. IEEE, 2015, pp. 1105–1109.
- [4] Vladislav Golyanik, Sk Aziz Ali, and Didier Stricker, “Gravitational approach for point set registration,” in Computer Vision and Pattern Recognition. IEEE, 2016, pp. 5802–5810.
- [5] Paul J Besl and Neil D McKay, “Method for registration of 3-d shapes,” in Robotics-DL tentative. International Society for Optics and Photonics, 1992, pp. 586–606.
- [6] Zhengyou Zhang, “Iterative point matching for registration of free-form curves and surfaces,” International journal of computer vision, vol. 13, no. 2, pp. 119–152, 1994.
- [7] Szymon Rusinkiewicz and Marc Levoy, “Efficient variants of the icp algorithm,” in Int. Conf. on 3-D Digital Imaging and Modeling. IEEE, 2001, pp. 145–152.
- [8] Andrew W Fitzgibbon, “Robust registration of 2d and 3d point sets,” Image and Vision Computing, vol. 21, no. 13, pp. 1145–1153, 2003.
- [9] Steven Gold, Anand Rangarajan, Chien-Ping Lu, Suguna Pappu, and Eric Mjolsness, “New algorithms for 2d and 3d point matching: Pose estimation and correspondence,” Pattern recognition, vol. 31, no. 8, pp. 1019–1031, 1998.
- [10] Bin Luo and Edwin R Hancock, “A unified framework for alignment and correspondence,” Computer Vision and Image Understanding, vol. 92, no. 1, pp. 26–55, 2003.
- [11] Haili Chui and Anand Rangarajan, “A new algorithm for non-rigid point matching,” in Computer Vision and Pattern Recognition. IEEE, 2000, vol. 2, pp. 44–51.
- [12] Haili Chui and Anand Rangarajan, “A new point matching algorithm for non-rigid registration,” Computer Vision and Image Understanding, vol. 89, no. 2, pp. 114–141, 2003.
- [13] Graham McNeill and Sethu Vijayakumar, “A probabilistic approach to robust shape matching,” in Int. Conf. on Image Processing. IEEE, 2006, pp. 937–940.
- [14] Michal Sofka, Gehua Yang, and Charles V Stewart, “Simultaneous covariance driven correspondence (cdc) and transformation estimation in the expectation maximization framework,” in Computer Vision and Pattern Recognition. IEEE, 2007, pp. 1–8.
- [15] Andriy Myronenko and Xubo Song, “Point set registration: Coherent point drift,” IEEE transactions on pattern analysis and machine intelligence, vol. 32, no. 12, pp. 2262–2275, 2010.
- [16] Peng Wang, Ping Wang, ZhiGuo Qu, YingHui Gao, and ZhenKang Shen, “A refined coherent point drift (cpd) algorithm for point set registration,” Science China Information Sciences, vol. 54, no. 12, pp. 2639–2646, 2011.
- [17] Vladislav Golyanik, Bertram Taetz, Gerd Reis, and Didier Stricker, “Extended coherent point drift algorithm with correspondence priors and optimal subsampling,” in Winter Conference on Applications of Computer Vision. IEEE, 2016, pp. 1–9.
- [18] Soma Biswas, Gaurav Aggarwal, and Rama Chellappa, “Invariant geometric representation of 3d point clouds for registration and matching,” in Int. Conf. on Image Processing. IEEE, 2006, pp. 1209–1212.
- [19] Manuel Marques and Joao Costeira, “Guided search consensus: Large scale point cloud registration by convex optimization,” in Int. Conf. on Image Processing. IEEE, 2013, pp. 156–160.
- [20] Furong Peng, Qiang Wu, Lixin Fan, Jian Zhang, Yu You, Jianfeng Lu, and Jing-Yu Yang, “Street view cross-sourced point cloud matching and registration,” in Int. Conf. on Image Processing. IEEE, 2014, pp. 2026–2030.
- [21] “Point cloud library,” http://pointclouds.org/documentation/tutorials/normal_estimation.php, Accessed: 4th Jan, 2017.
- [22] Wolfgang Kabsch, “A solution for the best rotation to relate two sets of vectors,” Acta Crystallographica Section A: Crystal Physics, Diffraction, Theoretical and General Crystallography, vol. 32, no. 5, pp. 922–923, 1976.
- [23] “The canonically posed 3d objects dataset (CAPOD),” https://sites.google.com/site/pgpapadakis/home/CAPOD, Accessed: 11th Jan, 2017.
- [24] Panagiotis Papadakis, “The canonically posed 3d objects dataset,” in Eurographics Workshop on 3D Object Retrieval, 2014, pp. 33–36.
- [25] “Stanford university computer graphics laboratory,” https://graphics.stanford.edu/data/3Dscanrep/, Accessed: 11th Dec, 2016.
