With the rapid development of 3D data acquisition technologies, point clouds have become an effective way to represent the surfaces of 3D objects and scenes. Because 3D point clouds have several advantages over 2D images, for example making it easy to extract spatial geometry, shape and pose information, they are now used in many fields and applications. Some current vision systems even contain more than one type of sensor, such as the LiDAR and stereo camera in an autonomous-driving vision system. Existing applications show improved performance when point clouds from different types of sensors are fused. However, fusing cross-source point clouds is very challenging because they exhibit a mixture of differences, such as different densities, noise, outliers, viewpoint changes and missing data. These differences are explained in detail in [CSGM][huang2018coarse]. In this paper, a registration algorithm is proposed to address the cross-source point cloud fusion problem by considering weak region affinity and pixel-wise refinement.
Same-source point clouds are homogeneous point clouds captured by the same type of sensor. Cross-source point clouds are heterogeneous point clouds captured by different types of sensors.
There have been several attempts to solve cross-source point cloud registration. CSGM [CSGM] converts the registration problem into a graph matching problem; ICP refinement (local) is then performed on the graph matching results (global). [peng2014street] proposes a coarse-to-fine (global-to-local) strategy to register cross-source point clouds. [tvcg] uses RANSAC and ICP for global alignment and local refinement, respectively. However, the existing methods either cannot solve cross-source point cloud registration accurately or cannot solve it efficiently. They all assume strong structural consistency between the two point sets being registered. This assumption does not hold for cross-source point clouds, where structural consistency is weak: the reliable correspondences that can be located between the two sets are sparse, and the large inconsistency caused by different point cloud densities, noise models and various outliers significantly degrades the estimation of the rigid transformation.
According to our observation (see Figure 1 for an example), any two given cross-source point clouds still share an intrinsic structural similarity, although it is weak and degraded by noise, inconsistent point cloud density and outliers. The proposed method discovers this salient but weak geometric structure affinity (not statistical feature affinity) by combining a sufficient number of local matches, i.e. weak regional affinity. To correct mismatches arising in the weak regional affinity search, a pixel-wise refinement process is proposed. This resembles the global-and-local strategy of previous methods. Unlike those methods, which run the two stages as separate streams, the proposed method unifies the two processes, i.e. co-affinity search and pixel-wise refinement, so that they can be solved in a single optimization process. The advantage over separate streams is that local and global information are considered jointly, which lets the local and global components interact during optimization.
To formalize this motivation mathematically, we need a tool that generalizes vectors and matrices. We choose tensors, because they provide an elegant and unified mathematical format for assembling the global weak regional affinity and the local pixel-wise refinement. The weak regional affinity is a triplet constraint stored in a third-order tensor. The pixel-wise refinement is a point-to-point or point-to-plane residual error stored in a first-order tensor. The correspondence search can then be optimized in a single unified tensor space. Compared with the previous separate global and local strategies, these components interact throughout the optimization thanks to the tensor formulation. To solve the final registration problem, instead of performing tensor optimization once, we propose an iterative tensor optimization solution and formulate a new energy function to obtain the optimal geometric transformation. During the iterations, so that transformation estimation and correspondence estimation interact, the geometric transformation is integrated into the tensor optimization and the tensor space is updated whenever the transformation is updated. When the energy function converges, both the correspondence and the geometric transformation are optimized.
This paper makes three contributions:
(1) Two components, weak regional affinity and pixel-wise refinement, are proposed to keep global and local information in cross-source point clouds, where structures are usually weak. We assemble them into tensors so that interaction between global and local information becomes possible.
(2) A registration algorithm is proposed that integrates these two components to solve cross-source point cloud registration. We iteratively update the tensor space so that the interaction between correspondence estimation and transformation estimation is taken into account.
(3) Comprehensive comparison experiments are conducted on the cross-source point cloud registration problem.
2 Proposed algorithm
Inspired by [chui2003new], we accept a point-to-point match as correct only if the triplet of points it belongs to is also matched. We regard this triplet-matching constraint as the weak region affinity constraint, because selecting triplet points is an elegant way to preserve the salient structure under a rigid-plus-scale geometric transformation. We assemble the weak region affinity into a third-order tensor. The point-to-point match is regarded as pixel-wise refinement and is assembled into a first-order tensor. In our method, the third-order tensor and the first-order tensor are integrated into a tensor optimization framework, and an efficient power iteration solution is proposed to solve it. To be more robust to geometric deformations, we simultaneously solve for the optimal correspondence and the optimal transformation. Because the proposed method considers geometric constraints and tensor optimization, we call it geometric constraint tensor-based registration (GCTR).
2.1 Definition of pixel-wise refinement and weak region affinity
In the following, we describe how to formulate the pixel-wise refinement and weak regional affinity as tensors. A first-order tensor and a third-order tensor are used to store the pixel-wise refinement and the weak regional affinity, respectively. In the following, we suppose point cloud has points and point cloud has points.
Pixel-wise Refinement: Pixel-wise refinement is the potential point-to-point correspondence. In our algorithm, a first-order tensor stores the pixel-wise refinement, which represents correspondence similarity. The correspondence similarity is computed from the Euclidean distance between the two points of each candidate pair:
where is a correspondent point similarity matrix, and . Each element of stores the similarity of point of point cloud and point of point cloud .
The first-order tensor is a vector obtained by concatenating the rows of the similarity matrix . The index conversion between to is (see Figure 2). is the feature vector of point in point cloud and is the feature vector of point in point cloud . As the feature vector, we use the 3D point coordinates. The first-order tensor stores local information.
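The construction of the first-order tensor can be illustrated with a minimal sketch. Since the exact similarity formula is not recoverable from the text, the sketch assumes a Gaussian kernel on the Euclidean distance; the function name `pointwise_similarity` and the bandwidth `sigma` are hypothetical and not part of the original method description.

```python
import numpy as np

def pointwise_similarity(points_a, points_b, sigma=1.0):
    """Return the point-to-point similarity matrix and its row-wise flattening.

    points_a: (n_a, 3) array, points_b: (n_b, 3) array of 3D coordinates.
    The Gaussian kernel and `sigma` are assumptions of this sketch.
    """
    a = np.asarray(points_a, dtype=float)
    b = np.asarray(points_b, dtype=float)
    # Pairwise Euclidean distances, shape (n_a, n_b).
    dist = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    # Assumed mapping from distance to similarity: closer points score higher.
    sim = np.exp(-(dist ** 2) / (sigma ** 2))
    # Concatenate the rows: entry (i, j) lands at index i * n_b + j.
    return sim, sim.reshape(-1)
```

With this layout, the candidate pair (i, j) is found at position `i * n_b + j` of the flattened vector, which matches the row-concatenation described above.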
Weak Regional Affinity: Weak regional affinity is the potential triplet-to-triplet correspondence (a triplet of points is a simple region). In our algorithm, a third-order tensor is used to store the weak regional affinity. In particular, triplets of points are selected to represent the weak salient structure of the cross-source point clouds (triplet selection is detailed in Section 3). To estimate the correspondence between triplets, we need to compute their similarity. The similarity is computed by
where is a 6D supersymmetric tensor, i.e. it is invariant under permutations of indices in or . Each element of the 6D tensor stores the similarity of two triplets. Points () and () are two triplets whose correspondences follow their order. In particular, this order means that point of triplet 1 corresponds to point of triplet 2, and likewise for the point correspondences of and . is a third-order tensor of size that is rewritten from tensor . Because the triplets are selected as large triangles, the third-order tensor stores global information.
Triplet similarity: For each triplet, we compute the cosines of the three inner angles of the triangle formed by the triplet. These values form a descriptor of the triplet, and the similarity between two triplets is computed from their descriptors. Using this strategy, all the elements of the above third-order tensor can be computed. In the third-order tensor, each dimension indexes the candidate point-to-point correspondences. The three dimensions are therefore identical, each ranging over all point pairs between the two point clouds, and the tensor is symmetric.
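The triplet descriptor can be sketched as follows. The cosine descriptor follows the description above; how two descriptors are compared is not recoverable from the text, so the Gaussian kernel, the bandwidth `sigma` and both function names are assumptions made for illustration only.

```python
import numpy as np

def triangle_cosines(p0, p1, p2):
    """Cosines of the inner angles at p0, p1 and p2, in that order."""
    pts = [np.asarray(p, dtype=float) for p in (p0, p1, p2)]
    cosines = []
    for k in range(3):
        # Edge vectors leaving vertex k towards the other two vertices.
        u = pts[(k + 1) % 3] - pts[k]
        v = pts[(k + 2) % 3] - pts[k]
        cosines.append(float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v))))
    return np.array(cosines)

def triplet_similarity(tri_a, tri_b, sigma=0.5):
    """Similarity of two ordered triplets; the Gaussian kernel is an assumption."""
    diff = triangle_cosines(*tri_a) - triangle_cosines(*tri_b)
    return float(np.exp(-(diff @ diff) / sigma ** 2))
```

Because inner angles do not change under rotation, translation or uniform scaling, a descriptor of this kind is unaffected by the rigid-plus-scale transformation being estimated. Degenerate triplets with coincident points would cause a division by zero and are assumed to be filtered out beforehand.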
2.2 Geometric Constraint Tensor-based registration method (GCTR)
The goal of our method is to find the optimal transformation matrix between cross-source point clouds. One key step is finding point-to-point matches under the triplet constraint, which is equivalent to finding the best matching solution in the whole tensor space. In this section, we integrate the above two components into a unified framework and solve the registration problem by maximizing the following objective function:
where (correspondence matrix) and (transformation matrix) are the two parameters to be estimated. is a 6D supersymmetric tensor. Each node pair’s (i.e. and ) similarity contributes a 2D matrix to , i.e. and can form a 2D matrix, and and can form another 2D matrix. is a 2D matrix to describe the pixel-wise similarity. is the assignment matrix where means two points are matched and otherwise. obtains vector form of by concatenating the columns of . is a third-order tensor of size where each element represents the similarity of two triplets. It is a rewritten form of tensor . is vector form of by concatenating the columns of , where each element represents the point-point similarity. With a geometric transformation given, and are the two specific tensors and correspondence matrix can be estimated by tensor optimization. With an optimized , transformation matrix can be estimated by singular-value decomposition (SVD).
For the scale computation, we compare the corresponding edges of matched triplets and take the mean ratio of their lengths as the final scale:
where is the length of point and point in point set , is the correspondence of and is the correspondence of , is the length of and point . is the number of correspondent pairs, identifies the length in point set A and point set B.
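The mean-ratio scale estimate can be sketched as below. The sketch assumes that matched points are stored row-aligned in two arrays and that the scale maps point set A onto point set B, so each ratio is a length in B divided by the corresponding length in A; the direction of the ratio and the function name `estimate_scale` are assumptions.

```python
import numpy as np

def estimate_scale(points_a, points_b, edges):
    """Mean ratio of corresponding edge lengths (length in B over length in A).

    points_a[k] is assumed to correspond to points_b[k];
    `edges` lists index pairs (i, j) taken from the matched triplets.
    """
    a = np.asarray(points_a, dtype=float)
    b = np.asarray(points_b, dtype=float)
    idx = np.asarray(edges)
    len_a = np.linalg.norm(a[idx[:, 0]] - a[idx[:, 1]], axis=1)
    len_b = np.linalg.norm(b[idx[:, 0]] - b[idx[:, 1]], axis=1)
    return float(np.mean(len_b / len_a))
```

Averaging over many edges reduces the influence of a single noisy correspondence, although a mean remains sensitive to gross mismatches.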
2.3 Power iteration solution
The power iteration solution optimizes objective function 3 by alternating two steps: the correspondence is optimized with the geometric transformation fixed, and the geometric transformation is optimized with the new correspondence fixed. Algorithm 1 shows the whole process.
2.3.1 Optimization for the correspondence
With a specific geometric transformation given, the optimization of formulation 3 is a tensor optimization problem. According to [shi2013multi], the two terms in objective function 3 are rank-1 tensors, so the optimization can be formulated as a rank-1 tensor approximation (R1TA) problem. Inspired by [shi2013multi, duchenne2011tensor], we use tensor power iteration to solve this R1TA problem. Lines 6-10 in Algorithm 1 show the correspondence optimization procedure.
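A generic tensor power iteration of the kind referred to here can be sketched as follows. This is an illustration under stated assumptions, not a reproduction of Algorithm 1: it uses a small dense tensor, adds the first-order term with unit weight, normalizes to unit length at every step, and the names `tensor_power_iteration`, `h3` and `h1` are hypothetical.

```python
import numpy as np

def tensor_power_iteration(h3, h1, n_iter=50, tol=1e-9):
    """Iterate v <- normalise(h3 x v x v + h1) from a uniform start.

    h3: (m, m, m) symmetric, non-negative affinity tensor.
    h1: (m,) non-negative point-to-point similarity vector.
    Returns a unit-norm vector of soft assignment scores.
    """
    h3 = np.asarray(h3, dtype=float)
    h1 = np.asarray(h1, dtype=float)
    m = h1.shape[0]
    v = np.full(m, 1.0 / np.sqrt(m))
    for _ in range(n_iter):
        # Contract the third-order tensor with v twice, then add the first-order term.
        new_v = np.einsum("ijk,j,k->i", h3, v, v) + h1
        norm = np.linalg.norm(new_v)
        if norm == 0.0:
            break  # no affinity information at all; keep the current v
        new_v /= norm
        converged = np.linalg.norm(new_v - v) < tol
        v = new_v
        if converged:
            break
    return v
```

If the scores are reshaped to one row per point of the first cloud, a hard assignment can be read off with a row-wise `argmax`. In practice the affinity tensor would be stored sparsely, since only the sampled triplets contribute non-zero entries.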
2.3.2 Optimization for the geometric transformation
With the correspondences given, the estimation of the geometric transformation is similar to ICP and can be solved in closed form. Suppose and are the matched pairs from correspondence matrix , is the points from point set and is the points from point set . If is the mean point of and is the mean point of , , the geometrical transformation is computed by . In this step, the key elements are scale estimation and tensor update. For scale estimation, we use formulation 4 defined above. For the tensor update, we use the formulation in Line 9 of Algorithm 1.
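The closed-form step can be illustrated with the standard SVD-based solution for matched point pairs. The sketch assumes the scale has already been estimated separately and is passed in; the function name `estimate_rigid_with_scale` and the reflection guard are additions of this sketch rather than details taken from the text.

```python
import numpy as np

def estimate_rigid_with_scale(src, dst, scale=1.0):
    """Find R, t minimising sum_k ||scale * R @ src[k] + t - dst[k]||^2.

    src, dst: (n, 3) arrays of matched points (row k matches row k).
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    mu_src, mu_dst = src.mean(axis=0), dst.mean(axis=0)
    # 3x3 cross-covariance of the centred point pairs.
    cov = (dst - mu_dst).T @ (src - mu_src)
    u, _, vt = np.linalg.svd(cov)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = np.sign(np.linalg.det(u @ vt))
    rot = u @ np.diag([1.0, 1.0, d]) @ vt
    trans = mu_dst - scale * rot @ mu_src
    return rot, trans
```

At least three non-collinear matched pairs are needed for the rotation to be determined.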
3 Implementation details
We first describe line 1 of Algorithm 1. Following [CSGM], we use the supervoxel segmentation method of [papon2013voxel] to segment the point clouds and use the central points of the segments as the salient structures.
Triplet point selection: We then describe lines 3-4 of Algorithm 1. Inspired by [super4pcs], we randomly select triangles that satisfy a wide-baseline strategy and use the three vertices of each triangle as a triplet, so that the triplets are more likely to be globally aligned. In this algorithm, the wide-baseline strategy is used to randomly select large triangles in point cloud 1. We define a large triangle as one whose three edges are all longer than 50% of the diameter of the overlapping 3D bounding voxel. This guarantees that the selected triangles are large and makes the final registration more likely to be globally consistent. If the overlap ratio is unknown, we search automatically over the ratios 0.25, 0.5, 0.75 and 1.0.
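The wide-baseline selection can be sketched as rejection sampling. The sketch interprets the diameter of the overlapping bounding voxel as the diagonal of the axis-aligned bounding box multiplied by the overlap ratio; that interpretation, the retry limit `max_tries` and the function name `sample_wide_triangles` are assumptions.

```python
import numpy as np

def sample_wide_triangles(points, n_triangles, overlap=1.0,
                          min_edge_frac=0.5, seed=0, max_tries=10000):
    """Randomly pick index triplets whose three edges all exceed a length threshold."""
    pts = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    # Assumed diameter: bounding-box diagonal scaled by the overlap ratio.
    diameter = np.linalg.norm(pts.max(axis=0) - pts.min(axis=0)) * overlap
    min_edge = min_edge_frac * diameter
    triangles = []
    for _ in range(max_tries):
        if len(triangles) == n_triangles:
            break
        i, j, k = rng.choice(len(pts), size=3, replace=False)
        edges = (np.linalg.norm(pts[i] - pts[j]),
                 np.linalg.norm(pts[j] - pts[k]),
                 np.linalg.norm(pts[i] - pts[k]))
        # Keep the triangle only if its shortest edge is long enough.
        if min(edges) > min_edge:
            triangles.append((int(i), int(j), int(k)))
    return triangles
```

When the overlap ratio is unknown, the same routine could be called once for each candidate value 0.25, 0.5, 0.75 and 1.0; fewer than `n_triangles` triplets may be returned if the threshold is hard to satisfy.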
4 Experimental results
In this section, we conduct thorough comparison experiments on synthetic and real datasets. First, we compare on a synthetic cross-source benchmark dataset, on which several state-of-the-art registration methods are run and compared with the proposed method. Second, we compare performance on real cross-source point clouds.
4.1 Experiments setup
The proposed algorithm is implemented in standard C and Matlab. All comparison experiments are executed on a computer with an i5 CPU and 8 GB of memory. We select ICP [icp], Go-ICP [goicp], Super-4PCS [super4pcs], CPD [cpd], JR-MPC [jrmpc] and CSGM [CSGM] as the comparison methods. Most existing state-of-the-art registration methods focus on same-source data and are designed to solve for an SE(3) transformation; they are not designed for scale variation. For a fair comparison, we apply automatic scale normalization for all the other methods following [huang2016coarse], which assumes that the 3D bounding voxels of the two point clouds have the same size. Because JR-MPC becomes impractical when the number of points increases significantly, we uniformly down-sample the point clouds to approximately 2000 points for JR-MPC.
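A scale normalization of this kind can be sketched as below. The sketch measures the size of the bounding voxel by the diagonal of the axis-aligned bounding box, which is an assumption; the procedure of [huang2016coarse] may differ in detail, and the function name `normalise_scale` is hypothetical.

```python
import numpy as np

def normalise_scale(source, target):
    """Rescale `source` so that its bounding-box diagonal matches that of `target`."""
    src = np.asarray(source, dtype=float)
    tgt = np.asarray(target, dtype=float)
    diag_src = np.linalg.norm(src.max(axis=0) - src.min(axis=0))
    diag_tgt = np.linalg.norm(tgt.max(axis=0) - tgt.min(axis=0))
    factor = diag_tgt / diag_src
    return src * factor, float(factor)
```

Such a normalization is only as good as the equal-size assumption: missing data or outliers in either cloud change the bounding box and therefore the estimated factor.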
For the evaluation, we compute the Frobenius norm of the difference between the ground-truth and estimated transformation matrices (TM). The lower the value, the higher the accuracy of the method.
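The TM metric amounts to a single matrix norm. The sketch below assumes both transformations are given as 4x4 homogeneous matrices; the function name `transformation_error` is hypothetical.

```python
import numpy as np

def transformation_error(t_est, t_gt):
    """Frobenius norm of the difference between two transformation matrices."""
    diff = np.asarray(t_est, dtype=float) - np.asarray(t_gt, dtype=float)
    return float(np.linalg.norm(diff, ord="fro"))
```

For example, an estimate that differs from the ground truth only by a translation offset of (3, 4, 0) yields an error of 5.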
4.2 Synthetic cross-source benchmark dataset
This section demonstrates the ability of the proposed method on the cross-source benchmark dataset. To construct the benchmark, following [CSGM], we start from the Stanford 3D Scanning Models (http://graphics.stanford.edu/data/3Dscanrep/) and simulate ten cross-source benchmark sets. Each set contains a source A and a source B that simulate the cross-source problems discussed in Section 1.
We select Angle as an example for detailed quantitative evaluation on the cross-source benchmark datasets. We evaluate the translation (T), rotation (R) and scale (S) errors separately, and compare the RMSE (log(TM)) error. We also compare the runtime on this dataset. Table 1 shows the evaluation results: GCTR obtains accuracy comparable to CSGM while running much faster. Compared with the other methods, the proposed method achieves higher accuracy and efficiency.
Figure 4 shows the comparison results on all datasets. Our method obtains the highest accuracy and robust results on all datasets. CSGM obtains accuracy comparable to the proposed method. Among the other comparison methods, Super-4PCS achieves the second-best performance on most datasets. Interestingly, ICP ranks second in accuracy on the Horse dataset, which covers part of a horse whose points lie on a smooth surface. Our interpretation is that ICP has some ability to align cross-source point clouds when there is no scale variation, the initialization is very good and there are no large disorganized outliers. This is why ICP is used as the final refinement step in many same-source registration applications.
To compare efficiency, we compute the average runtime over the 10 cross-source datasets. Table 2 shows that the proposed method is much faster than the other compared methods. Although CSGM obtains accuracy comparable to the proposed method, our method is much more efficient (about 16 times faster).
4.3 Real cross-source point clouds
We capture the cross-source point cloud dataset using KinectFusion and VSFM. KinectFusion outputs a point cloud directly, while VSFM reconstructs a 3D point cloud from images taken by a mobile phone camera. We captured more than 30 datasets of different indoor scenes, and the proposed method obtains promising results on all of them.
Figure 5 shows selected datasets and a comparison with other methods. The results illustrate that our method obtains robust and visually correct registration results in real applications. For all these datasets, the proposed method completes the alignment in approximately 120 seconds, which is much faster than the other methods. Although JR-MPC and Go-ICP sometimes obtain registration results similar to ours, their computation and memory complexity are very high. Our method can align real cross-source point clouds accurately and quickly.
In this paper, we propose a fast registration method for cross-source point clouds. Our work has two main parts: first, weak regional affinity and pixel-wise refinement are proposed to keep global and local information in cross-source point clouds, where structures are usually weak; second, a unified algorithm is proposed to integrate these two components to solve cross-source point cloud registration. The experimental results show that the proposed method aligns challenging cross-source point clouds quickly and accurately.