Registration of multi-view point sets under the perspective of expectation-maximization

02/18/2020 ∙ by Jihua Zhu, et al. ∙ Xi'an Jiaotong University

Registration of multi-view point sets is a prerequisite for 3D model reconstruction. Most previous approaches either explore the available information only partially or blindly use unnecessary information to align each point set, which may lead to undesired results or introduce extra computational complexity. To this end, this paper considers the multi-view registration problem as a maximum likelihood estimation problem and proposes a novel multi-view registration approach under the perspective of Expectation-Maximization (EM). The basic idea of our approach is that different data points are generated by the same number of Gaussian mixture models (GMMs). For each data point in one well-aligned point set, its nearest neighbors can be searched from the other well-aligned point sets to explore more available information. We can then suppose that this data point is generated by a special GMM, which is composed of one Gaussian distribution centered at each of its nearest neighbors. Based on this assumption, it is reasonable to define a likelihood function that contains all rigid transformations to be estimated for multi-view registration. Subsequently, the EM algorithm is used to maximize the likelihood function so as to estimate all rigid transformations. Finally, the proposed approach is tested on several benchmark data sets and compared with some state-of-the-art algorithms. Experimental results illustrate its superior performance in accuracy and efficiency for the registration of multi-view point sets.


1 Introduction

Point set registration is a fundamental methodology in many domains, such as computer vision [30, 16], robotics [13, 31], and computer graphics [1, 7]. The development of point scanning devices makes it possible to reconstruct 3D object or scene models. Due to their limited field of view, most scanning devices can only scan a part of the object or scene from one viewpoint. For 3D model reconstruction, multiple point sets should be acquired from different viewpoints to cover the entire object or scene surface, and then unified into a common reference frame. Accordingly, multi-view registration is a prerequisite for 3D model reconstruction. Given multiple point sets, the goal of multi-view registration is to estimate the optimal rigid transformation for each point set and transform them from their set-centered frames to the same coordinate frame.

The point set registration problem has attracted immense attention, and many effective approaches have been proposed to solve it. Among these approaches, one of the most popular solutions is the ICP algorithm [4, 5], which can achieve pair-wise registration with good efficiency and accuracy. However, most of them, including the ICP algorithm, cannot directly solve the multi-view registration problem. Compared with pair-wise registration, multi-view registration is a more difficult problem and has attracted comparatively less attention. Although some approaches have been proposed to solve this difficult problem, most existing approaches are unable to appropriately explore the available information for accurate registration. Some approaches only establish point correspondences between one point set and some of the other point sets, which cannot fully explore the available information. Other approaches blindly establish point correspondences between one point set and all other point sets, which may introduce a large amount of extra computation.

To this end, this paper considers the alignment of multiple point sets as a maximum likelihood estimation problem and proposes a novel multi-view registration approach under the perspective of EM. The basic idea of our approach is that different data points are generated from the same number of GMMs. More specifically, each point set in turn represents the data points and the other point sets are utilized to define the GMMs. To define the GMM for each data point, its nearest neighbors are searched from the other point sets and are viewed as the centroids of the corresponding GMM that generates the data point itself. It is therefore reasonable to define a likelihood function that includes all rigid transformations for the multi-view registration. The EM algorithm is then utilized to maximize the likelihood function so as to estimate all rigid transformations.

The remainder of this paper is organized as follows. Section 2 surveys related work on the registration of point sets. Section 3 formulates the multi-view registration problem under the perspective of EM. Section 4 derives the proposed method to solve the multi-view registration problem. In Section 5, our method is tested and evaluated on six benchmark data sets. In Section 6, the proposed method is applied to scene reconstruction. Finally, conclusions are presented in Section 7.

2 Related work

This section only surveys existing works related to our proposed approach for multi-view registration. For convenience, we will use the terms point set and range scan interchangeably throughout this paper.

Depending on the number of involved point sets, the registration problem can be divided into two sub-problems: pair-wise registration and multi-view registration. For pair-wise registration, one of the most popular methods is the iterative closest point (ICP) algorithm, which can achieve pair-wise registration with good efficiency. However, it cannot deal with point sets that contain non-overlapping regions, and it only converges locally. To improve its performance, many ICP variants have been proposed for pair-wise registration [24]. For point sets with non-overlapping regions, Chetverikov et al. [6] proposed the trimmed ICP (TrICP) algorithm, which introduces the overlap percentage to automatically trim non-overlapping regions for accurate registration. To address local convergence, the genetic algorithm [17] or the particle filter [26] is integrated with the TrICP algorithm to search for the desired results. For efficiency, some point feature methods [16, 25] have been proposed to provide good initial parameters for the TrICP algorithm or its variants. Recently, some GMM-based approaches, such as CPD [21], GMMReg [12], and FilterReg [9], were also proposed to solve pair-wise registration. Both CPD and FilterReg represent one point set as a GMM and then cast the pair-wise registration problem as a maximum likelihood estimation problem. GMMReg, in contrast, utilizes Gaussian mixture models to represent both point sets and reformulates pair-wise registration as the problem of aligning two Gaussian mixtures such that a statistical discrepancy measure between the two GMMs is minimized. Although these approaches can achieve pair-wise registration with good accuracy and robustness, they are time-consuming due to the huge number of point correspondences that must be established. Besides, they are unable to directly solve the multi-view registration problem.

For multi-view registration, the intuitive method is the alignment-and-integration method [5], which sequentially aligns and integrates two point sets until all point sets are integrated into one model. This approach is simple but suffers from error accumulation when the number of point sets is large. Bergevin et al. [3] then proposed the first solution for multi-view registration. It organizes all point sets in a star network and sequentially places each point set at the center of the network. For the center point set, it finds point correspondences in each of the other point sets and estimates the rigid transformation by the ICP algorithm. As this approach can only estimate the rigid transformation of one point set at a time, it may be difficult to obtain the desired registration results. Besides, the ICP algorithm is unable to deal with non-overlapping regions, which further limits the accuracy of this approach. Moreover, it becomes increasingly easy to be trapped in a local minimum as the number of point sets grows.

To account for non-overlapping regions, Zhu et al. [33] proposed a coarse-to-fine approach for multi-view registration, which sequentially traverses and refines the rigid transformation of each point set. More specifically, this approach views the traversed point set and all other coarsely aligned point sets as the data set and the model set, respectively. The trimmed ICP algorithm is then utilized to refine the rigid transformation of each traversed point set. To avoid local convergence, Tang et al. [28] proposed a hierarchical approach to multi-view registration under the perspective of a graph, where each node denotes a single point set and each edge denotes a connection between two overlapped point sets together with their overlap percentage. Based on the graph, hierarchical optimization is implemented on the edges, the loops, and the entire graph. As it needs to detect all small loops, this approach becomes difficult or impossible to apply if no loop exists.

To address the sequential estimation, Krishnan et al. [15] proposed an optimization-on-a-manifold approach for multi-view registration. This approach can simultaneously estimate all rigid transformations from the established correspondences between each pair of point sets. However, it is difficult to establish accurate point correspondences in practical applications. Accordingly, Mateo et al. [19] treated the pair-wise correspondences as missing data and proposed an approach for multi-view registration under the Bayesian perspective. Although this approach may be accurate, it needs to calculate a huge number of latent variables. Earlier, some other graph-based approaches [27, 29] were also proposed for multi-view registration. The difference is that each edge denotes the pair-wise registration of two connected nodes. Graph optimization is then performed to diffuse the registration error over a graph of adjacent point sets. Without a correspondence update, these approaches only transfer registration errors among graph nodes and are unable to actually reduce the total registration error.

Recently, Govindu and Pooja [10] proposed the motion averaging algorithm to solve the multi-view registration problem. This algorithm directly estimates the multi-view registration results (global motions) from a set of pair-wise registration results (relative motions). Meanwhile, Arrigoni et al. [2] introduced the low-rank and sparse (LRS) matrix decomposition to estimate global motions from a set of available relative motions. Compared with the motion averaging algorithm, the LRS method is more robust to unreliable relative motions. However, to obtain the desired multi-view registration results, it requires more relative motions than the motion averaging algorithm. Moreover, the reliability of each relative motion is different, but these two methods treat their contributions as equal, which inevitably decreases the registration performance. To address this issue, Guo et al. [11] proposed the weighted motion averaging algorithm to solve the multi-view registration problem. Meanwhile, Jin et al. [14] proposed the weighted LRS algorithm for multi-view registration. As these two approaches pay more attention to reliable relative motions, they can achieve more accurate and robust registration results than their original methods.

More recently, Evangelidis et al. [8] proposed the JRMPC method, which assumes that all points are realizations of the same GMM and then casts multi-view registration into a clustering problem. An EM algorithm is derived to achieve the clustering and estimate all rigid transformations as well as the GMM parameters. As this approach needs to estimate a huge number of parameters, it is time-consuming and easily trapped in a local minimum. To address this issue, Zhu et al. [32] derived a K-means algorithm to achieve the clustering and estimate the rigid transformations for multi-view registration. This approach needs to estimate fewer parameters, so it is more efficient and robust than JRMPC. However, the performance of these clustering-based approaches is seriously affected by the number of clusters, which is a preset parameter. In practical applications, it is difficult to know its optimal value without enough prior information. Besides, the clustering of points leads to information loss, which reduces the registration performance.

Although both our method and JRMPC utilize the EM algorithm to achieve multi-view registration, their principles are totally different. JRMPC assumes that all points are generated from a central GMM, which contains a huge number of components to be estimated. Accordingly, it needs to estimate the GMM parameters as well as all rigid transformations for multi-view registration. In contrast, our approach assumes that different data points are generated from the same number of GMMs, which utilize equal covariances and equal membership probabilities for all Gaussian components. Given a data point, one NN exists in each of the other point sets, and it can be viewed as one centroid of the corresponding GMM that generates the data point itself. As all centroids of each GMM can be efficiently searched and are assigned equal covariances, our approach only needs to estimate the rigid transformations.

3 Problem formulation

Let be the union of point sets and be points that belong to the th point set. Given the model centered frame, the goal of multi-view registration is to estimate the rigid transformation including a rotation matrix and a translation vector for each point set, so as to transform them from a set-centered frame into the model centered frame. Fig. 1 illustrates the principle of the proposed approach.

Fig. 1: The proposed method assumes that different well-aligned data points are generated from different GMMs, all of which are composed of equal components. For each data point in one point set, e.g. , once rotated and translated from the set-centered coordinate frame to the model-centered frame, has one nearest neighbor in each other well-aligned point set. These nearest neighbors represent all centroids of the special GMM to generate the data point itself.

For one data point in the $i$th point set, there exists one nearest neighbor in each of the other well-aligned point sets. Equipped with a Gaussian distribution, each nearest neighbor can be viewed as one centroid of a GMM, which contains one component per nearest neighbor. Accordingly, we can assume that the $i$th point set represents the data points and that each of these data points is generated from one special GMM defined by its nearest neighbors in the other point sets. Besides, we use equal covariances and equal membership probabilities for all GMM components. Under this assumption, it is reasonable to formulate the joint probability of a data point as follows:

(1)

where , and denote the rotation matrix and the translation vector of the th rigid transformation, respectively. For simplicity, we define the function for the rigid transformation imposed on the data point .

To account for noise and outliers, it is essential to add an extra uniform distribution into the probability function:

(2)

where is the parameter representing the ratio of outliers and denotes the uniform distribution parameterized by the number of point sets involved in the multi-view registration. As shown in Eq. (2), the probability function contains all rigid transformations for the multi-view registration. Accordingly, these model parameters must be estimated, which can be achieved by maximizing the corresponding likelihood function.
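Because the symbols of Eqs. (1) and (2) did not survive in this copy, the following is only a hedged sketch of the mixture that the text describes, not the authors' exact typesetting. All notation here is assumed for illustration: $M$ point sets, $x_{i,l}$ the $l$th point of the $i$th set, $y_{j,l}$ its nearest neighbor in the $j$th set, $(R_i, t_i)$ the $i$th rigid transformation, $\Sigma$ the shared covariance, $w$ the outlier ratio, and $u$ the density of the uniform component.

```latex
% Sketch under assumed notation: equal membership probabilities 1/(M-1)
% over the M-1 nearest-neighbor centroids of the data point x_{i,l}.
P(x_{i,l}) = \sum_{j \neq i} \frac{1}{M-1}\,
  \mathcal{N}\!\left(R_i x_{i,l} + t_i \,;\; R_j y_{j,l} + t_j,\ \Sigma\right)

% Same mixture with an extra uniform component of weight w
% that absorbs noise and outliers.
P(x_{i,l}) = w\,u + (1 - w) \sum_{j \neq i} \frac{1}{M-1}\,
  \mathcal{N}\!\left(R_i x_{i,l} + t_i \,;\; R_j y_{j,l} + t_j,\ \Sigma\right)
```

Under this reading, the only free parameters are the rigid transformations and the covariance, since the centroids are fixed by the nearest-neighbor search and the mixing weights are equal.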

4 Multi-view registration approach under the perspective of EM

As the estimation of model parameters can be achieved by maximizing the likelihood function, it is reasonable to utilize the EM algorithm. Therefore, it is necessary to define a set of hidden variables , where means the observation is drawn from the Gaussian distribution . Given all point sets , model parameters can be estimated by maximizing the expected completed data log-likelihood function as follows:

(3)

As we utilize equal membership probabilities for all GMM components, denotes the constant term. Therefore, can be reformulated as:

(4)

For simplicity, it is reasonable to assume that all data points are independent and identically distributed. Accordingly, Eq. (4) can be straightforwardly rewritten as:

(5)

where denotes the posterior. By replacing the probability density of Gaussian distribution and ignoring constant terms, the objective function is reformulated as follows:

(6)

where and denotes the determinant of matrix . For simplicity, we restrict each Gaussian distribution to the isotropic covariance, i.e., , where denotes the identity matrix. Therefore, Eq. (6) can be reformulated as:

(7)

where denotes point dimension, e.g. for range point.

As the rotation matrices belong to the special orthogonal group, particular care should be taken in their estimation. Therefore, the multi-view registration can be formulated as the constrained optimization problem:

(8)

This optimization problem can be solved by the EM algorithm, augmented with the establishment of point correspondences in the E-step. In this way, our approach maximizes the likelihood function to estimate all rigid transformations. We will refer to this approach as the expectation-maximization perspective for multi-view registration (EMPMR), which achieves the multi-view registration iteratively. Each iteration of EMPMR includes both an E-step and an M-step.

4.1 E-step

In this step, we need to calculate the posterior probability of one data point generated from each component of the corresponding GMM. Before the calculation, the centroids of each GMM must be specified.

4.1.1 E-Corresponding-step

Given the parameter set obtained from the previous iteration, it is easy to transform all point sets into the same coordinate frame. Then, for one data point in the $i$th point set, its corresponding point must be found in each of the other point sets:

(9)

Eq. (9) denotes the NN search problem, which can be efficiently solved by the k-d tree based method [23]. For each data point in one point set, corresponding points are searched from the other point sets and they are viewed as the centroids of the GMM that generates the data point itself.
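The correspondence step can be illustrated with a small sketch. It assumes the point sets have already been transformed into the common frame, and it swaps the k-d tree for a brute-force NumPy search to stay dependency-free; the function names `nearest_neighbors` and `correspondences` are hypothetical and not taken from the paper.

```python
import numpy as np

def nearest_neighbors(data, model):
    """For each row of `data` (n x d), return the index and coordinates
    of its closest row in `model` (m x d).

    Brute-force stand-in for the k-d tree search: it builds the full
    n x m distance table, so it only suits small point sets.
    """
    diff = data[:, None, :] - model[None, :, :]      # shape (n, m, d)
    sq_dist = np.einsum("nmd,nmd->nm", diff, diff)   # squared distances
    idx = np.argmin(sq_dist, axis=1)
    return idx, model[idx]

def correspondences(aligned_sets, i):
    """Nearest neighbors of every point of set i in each other set.

    Returns an array of shape (M-1, n_i, d): one candidate GMM centroid
    per other point set and per data point.
    """
    data = aligned_sets[i]
    others = [s for j, s in enumerate(aligned_sets) if j != i]
    return np.stack([nearest_neighbors(data, s)[1] for s in others])
```

For point sets of realistic size, a k-d tree built once per point set would replace `nearest_neighbors` while keeping the same output layout.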

4.1.2 E-Probability-Step

Given the centroid and covariance , it is easy to calculate the posterior probability of data point generated from this Gaussian distribution, e.g. , as follows:

(10)

where and the notation denotes the probability density of Gaussian distribution defined as:

(11)
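As an illustration of the probability step, the sketch below computes the posteriors by Bayes' rule for a mixture with equal component weights, an isotropic variance, and a uniform outlier term. This is an assumed form consistent with the description above; the exact constants of Eq. (10) are not recoverable from this copy, and the names `posteriors`, `sigma2`, `w`, and `u` are hypothetical.

```python
import numpy as np

def posteriors(data, centroids, sigma2, w, u):
    """Posterior of each Gaussian component for each data point.

    data:      (n, d) data points in the common frame.
    centroids: (K, n, d) one centroid per component and data point.
    sigma2:    shared isotropic variance.
    w, u:      assumed outlier ratio and uniform density; w * u should
               be positive to avoid a zero denominator when every
               Gaussian term underflows.
    Returns an array of shape (K, n). For each point, the posteriors sum
    to less than one when w * u > 0; the rest is the outlier posterior.
    """
    K, n, d = centroids.shape
    sq = np.sum((data[None, :, :] - centroids) ** 2, axis=2)   # (K, n)
    norm = (2.0 * np.pi * sigma2) ** (-d / 2.0)
    gauss = norm * np.exp(-sq / (2.0 * sigma2))
    weighted = (1.0 - w) / K * gauss       # equal membership probabilities
    denom = w * u + weighted.sum(axis=0)
    return weighted / denom
```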

4.2 M-step

Given current values of and , this step requires to estimate all transformations by maximizing the function . Although rigid transformations must be estimated for multiple point sets, their estimation can be carried out independently for each point set. More specifically, we can alternately estimate one rigid transformation by fixing the other rigid transformations and the standard deviation at their current values. Accordingly, the rigid transformation of the $i$th point set can be estimated from the constrained problem:

(12)

Eq. (12) denotes a weighted least squares (LS) problem. As the rotation matrix is a special orthogonal matrix, the Singular Value Decomposition (SVD) based method can be utilized to solve this weighted LS problem.

To facilitate analysis, the function is defined as follows:

(13)

Taking the derivative of this function with respect to the translation vector, it is easy to obtain the following result:

(14)

Setting this derivative to zero, the translation vector can be estimated as:

(15)

Then the translation vector in Eq. (13) can be replaced by Eq. (15) and the objective function is simplified as:

(16)

where

(17)

and

(18)

Accordingly, the rotation matrix can be estimated by minimizing the function , which is expanded as follows:

(19)

The optimization problem illustrated in Eq. (19) has been well solved by the Singular Value Decomposition method [22, 20]. Therefore, we only present the resulting procedure for calculating each rotation matrix, without proof.

Compute the matrix and its singular value decomposition (SVD) results:

(20)
(21)

Estimate the rotation matrix:

(22)

According to Eq. (15), it is then easy to obtain the estimate of the translation vector. After the $i$th rigid transformation has been estimated, the next rigid transformation is estimated in the same way, until EMPMR obtains the desired results for multi-view registration.
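The rotation and translation updates follow the standard SVD solution of a weighted least squares alignment, which can be sketched as below. This is a generic weighted Kabsch-style routine under assumed names (`weighted_rigid_fit`, `src`, `dst`, `weights`), not a transcription of Eqs. (15) to (22). In the setting of this section, `src` would hold the data points of one point set repeated once per GMM component, `dst` the matching centroids in the common frame, and `weights` the posteriors.

```python
import numpy as np

def weighted_rigid_fit(src, dst, weights):
    """Find R, t minimizing sum_k weights[k] * ||R @ src[k] + t - dst[k]||^2.

    src, dst: (n, d) paired points; weights: (n,) non-negative weights.
    """
    w = weights / weights.sum()
    src_mean = w @ src                     # weighted centroids
    dst_mean = w @ dst
    src_c = src - src_mean
    dst_c = dst - dst_mean
    # Weighted cross-covariance matrix (d x d) and its SVD.
    H = (src_c * w[:, None]).T @ dst_c
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that det(R) = +1 (no reflection).
    D = np.eye(src.shape[1])
    D[-1, -1] = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ D @ U.T
    # Translation follows from aligning the weighted centroids.
    t = dst_mean - R @ src_mean
    return R, t
```

Computing the translation from the weighted centroids after the rotation mirrors the order described above: the translation is eliminated first, the rotation is solved by SVD, and the translation is then recovered.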

Finally, when all rigid transformations have been updated, the standard deviation must be updated. Taking the derivative of the objective function with respect to the standard deviation and setting it to zero, the standard deviation can be updated as:

(23)

Obviously, the proposed method utilizes Eqs. (9) and (23) to specify the GMM to generate the data point .
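For an isotropic covariance, setting the derivative of the expected log-likelihood to zero yields the familiar posterior-weighted mean of squared residuals divided by the point dimension. The sketch below shows that update under assumed names (`update_sigma2`, `residual_sq`, `post`, `dim`); it is offered as the generic isotropic-GMM result and may differ in normalization from Eq. (23).

```python
import numpy as np

def update_sigma2(residual_sq, post, dim):
    """Isotropic variance update for a GMM with shared variance.

    residual_sq: squared distances between transformed data points and
                 their centroids, one entry per (point, component) pair.
    post:        posteriors of the same shape as `residual_sq`.
    dim:         point dimension, e.g. 3 for range points.
    """
    return float(np.sum(post * residual_sq) / (dim * np.sum(post)))
```

The standard deviation is the square root of the returned value.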

4.3 Implementation

Based on the above description, the proposed EMPMR approach is summarized in Algorithm 1. Similar to most registration approaches, EMPMR is a locally convergent algorithm. For accurate registration, a good initial guess of each rigid transformation should be provided in advance.

Input: Point sets , maximum iteration ,
          initial guesses and .
Output:

1:  ;
2:  repeat
3:     ;
4:     for  do
5:        E-step
6:        Build the correspondence by Eq. (9);
7:        Estimate the posterior probability by Eq. (10);
8:        M-step
9:        Update and by Eqs. (15) and (22), respectively;
10:        Update by Eq. (23).
11:     end for
12:  until ('s change is negligible) or ()
Algorithm 1 EMPMR approach
Operation Complexity
k-d tree building
E-Corresponding-step
E-Probability-Step
M-Step
TABLE I: The total computation complexity of our approach
            Angel    Armadillo  Bunny   Buddha   Dragon  Hand
Point sets  36       12         10      15       15      36
Points      2347854  307625     362272  1099005  469193  1605575
TABLE II: Details of data sets utilized in experiments

4.4 Complexity analysis

This section analyzes the complexity of EMPMR. As EMPMR is proposed for the registration of multi-view point sets, the point set number and the total number of points in each point set are the central quantities. For ease of analysis, we suppose the iteration number of this approach is . Before iteration, a k-d tree must be built for each point set to accelerate the NN search, and the complexity is for each point set. At each iteration, three operations are implemented to estimate one rigid transformation.

E-Corresponding-step. For each point in the th point set, it is required to search its NN in each other point sets and the complexity is . As there are in the th point set and point sets involved in multi-view registration, the total complexity is for iterations.

E-Probability-Step. For each point in the th point set, there are hidden variables. Accordingly, the proposed approach requires to calculate hidden variables for the th point set. Given point sets, the total complexity is for iterations.

M-Step. Our method utilizes point pairs and their corresponding hidden variables to estimate the th rigid transformation. To estimate rigid transformations, the total complexity is for iterations.

Therefore, Table I lists the total computation complexity for the estimation of rigid transformations.

5 Experiments

In this section, EMPMR is tested and evaluated on six data sets, where four data sets are taken from the Stanford 3D Scanning Repository [18] and the other two data sets were provided by Torsello  [29]. Each of these data sets was acquired from one object model in different views and the multi-view point sets were provided along with the ground truth of rigid transformations for their registration. Table II illustrates some details of these data sets. To reduce the run time of registration, all data sets were uniformly down-sampled to around 2000 points per point set. In our method, the ratio of outliers is set to be .

To illustrate its performance, EMPMR is compared with three state-of-the-art approaches: the K-means based approach [32], the motion averaging approach with the TrICP algorithm [10], and the joint registration of multiple point clouds approach [8], which are abbreviated K-means, MATrICP, and JRMPC, respectively. For different data sets, all approaches utilize the same setting for each parameter. Experimental results are reported in the form of the runtime and the errors of the rotation matrix and the translation vector, where the error of rotation matrix and translation vector are defined as and , respectively. Here, and indicate the ground truth and the estimated one of the th rigid transformation, respectively. All compared approaches utilize the k-d tree method to search the NN. Experiments are performed on a four-core 3.6 GHz computer with 8 GB of memory.
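The exact error definitions did not survive in this copy. As a hedged illustration only, the sketch below uses one common choice, assumed here rather than taken from the paper: the Frobenius norm of the rotation difference and the Euclidean norm of the translation difference, each averaged over the point sets. The function name `registration_errors` is hypothetical.

```python
import numpy as np

def registration_errors(R_true, t_true, R_est, t_est):
    """Mean rotation and translation errors over M point sets.

    R_true, R_est: (M, 3, 3) ground-truth and estimated rotations.
    t_true, t_est: (M, 3) ground-truth and estimated translations.
    Assumed metric: Frobenius norm for rotations, Euclidean norm for
    translations, averaged over the M rigid transformations.
    """
    e_R = np.mean(np.linalg.norm(R_est - R_true, axis=(1, 2)))
    e_t = np.mean(np.linalg.norm(t_est - t_true, axis=1))
    return float(e_R), float(e_t)
```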

Data set   Initial         K-means [32]    MATrICP [10]    JRMPC [8]       EMPMR
           Rot.    Trans.  Rot.    Trans.  Rot.    Trans.  Rot.    Trans.  Rot.    Trans.
Angel      0.0312  2.0388  0.0079  0.9789  0.0146  2.8651  0.0092  1.9816  0.0038  0.8773
Armadillo  0.0415  2.7900  0.0114  2.2445  0.0121  2.4400  0.0189  2.8712  0.0113  0.8825
Bunny      0.0338  2.1260  0.0141  1.6088  0.0124  0.7181  0.0254  1.9551  0.0069  0.3468
Buddha     0.0371  1.6535  0.0208  1.0276  0.0120  0.8581  0.0226  0.9629  0.0123  1.0276
Dragon     0.0355  1.5216  0.0189  1.6140  0.0164  1.1037  0.0220  1.7466  0.0147  1.0573
Hand       0.0823  0.4986  0.0067  0.6114  0.0371  1.3391  0.0103  0.7062  0.0065  0.3289
TABLE III: Registration error of different approaches tested on six data sets (for each approach: rotation error, then translation error)

5.1 Accuracy and efficiency

Fig. 2 and Table III illustrate the multi-view registration results of different approaches tested on six data sets. To view these registration results in a more intuitive way, Fig. 3 illustrates all multi-view registration results in the form of cross-sections. As shown in Fig. 3 and Table III, EMPMR achieves the most accurate registration for all data sets except the Stanford Buddha. For the Stanford Buddha data set, both MATrICP and EMPMR obtain accurate registration results, where the former is slightly better than the latter. For the remaining data sets, accurate registration results may also be obtained by some of the other approaches. Although it is a probabilistic approach, JRMPC cannot obtain registration results as accurate as expected. In terms of efficiency, as displayed in Fig. 2, K-means is the most efficient method and EMPMR is comparable with K-means; both are more efficient than the other two approaches.

Fig. 2: Runtime comparison of different approaches tested on six data sets, where [10.3673, 3.2899, 2.6001, 3.8594, 4.026, 8.7427] minutes correspond to 100% runtime of each data set.
Fig. 3: Multi-view registration results in the form of cross-section. (a) Aligned 3D models. (b) Initial results. (c) Results of K-means method. (d) Results of MATrICP method. (e) Results of JRMPC method. (f) Our results.
Fig. 4: Multi-view registration error of EMPMR under varied for four different data sets. (a) Rotation errors. (b) Translation errors.
Fig. 5: Scene reconstruction of the EXBI data set by EMPMR. (a) Scene image of each point set. (b) Initial aligned point sets. (c) Multi-view registration results with noise. (d) Scene reconstruction.

To achieve multi-view registration, the K-means based approach sequentially traverses each point set and alternately performs clustering and rigid transformation estimation. In the clustering operation, it applies the K-means algorithm to cluster all aligned points into the preset number of clusters and utilizes one cluster centroid to represent all points in that cluster, which inevitably causes a loss of information. For the estimation of each rigid transformation, it aligns the point set to the cluster centroids, where the correspondence of each data point is searched among all cluster centroids. Since the number of cluster centroids is far smaller than the number of raw points in all point sets, this approach is very efficient but, due to the information loss, less accurate for multi-view registration.

For multi-view registration, the MATrICP approach recovers all global motions from a set of relative motions generated by pair-wise registration. As the pair-wise registration problem is easier than the multi-view registration problem, the motion averaging algorithm can transform the latter into the former. For accurate registration, the motion averaging algorithm is expected to take more relative motions as its input, and requesting more relative motions may introduce some unreliable ones. However, the motion averaging algorithm is sensitive to unreliable relative motions, and even one unreliable relative motion can lead to the failure of multi-view registration. As shown in Table III and Fig. 3, for the Hand data set, the MATrICP approach fails to obtain accurate registration results due to the input of unreliable relative motions.

JRMPC supposes that all points are drawn from a central Gaussian mixture and thus casts the multi-view registration problem into a clustering problem. It utilizes the EM algorithm to simultaneously estimate all GMM parameters and the rigid transformations that optimally align the point sets. As a probabilistic method, this approach is expected to obtain accurate results in theory. However, it needs to estimate a huge number of parameters, whose initial values should be fine-tuned and provided in advance. Without good initial parameters, this approach is unable to obtain the desired registration results, and it is difficult to provide good initial parameters for different data sets. Similar to the K-means based approach, the clustering in JRMPC also leads to information loss, which can reduce the accuracy of the registration results. Moreover, this approach needs to establish correspondences between each point and the center of each Gaussian component, which is very time-consuming. In practice, it is difficult to obtain promising results for multi-view registration with this approach. This conclusion is also verified by the experimental results.

Different from JRMPC, EMPMR assumes that each data point is generated from one corresponding GMM, whose centroids are specified by its NNs in the other point sets. As EMPMR utilizes equal covariances and equal membership probabilities for all GMM components, it only needs to estimate the rigid transformations for multi-view registration. Besides, it contains only one preset parameter, which can be set empirically. To explore more available information, NNs are searched from all other point sets, so it is more likely to obtain satisfactory results for the registration of multi-view point sets. Since this probabilistic method only needs to establish correspondences for each data point, it is more efficient than JRMPC. Accordingly, EMPMR has good performance for multi-view registration in both accuracy and efficiency.

5.2 Parameter sensitivity

In EMPMR, there is one parameter that must be set empirically. Thus, one question arising here is whether the performance of EMPMR is sensitive to this parameter. To answer this question, we conduct experiments on the Stanford Armadillo, Bunny, Buddha, and Dragon data sets to observe the effect of different parameter values on multi-view registration performance. Experimental results are reported in the form of registration errors, which are recorded in Fig. 4.

Fig. 4 displays experimental results with different values of this parameter on the four data sets. It can be observed that: 1) the setting of is more likely to obtain accurate results for multi-view registration; 2) the performance of EMPMR shows only small variations as long as the parameter is chosen in a suitable range, i.e., from 0.0005 to 0.01. In summary, EMPMR is relatively insensitive to its parameter as long as it is chosen from a suitable range, which makes it easy to apply EMPMR without much parameter tuning effort.

6 Scene reconstruction

In practice, a registration method has to deal with different kinds of point sets. To illustrate its applicability, EMPMR is tested on the EXBI data set [8] for scene reconstruction.

The EXBI data set contains 10 RGB-D point sets recorded by the Kinect sensor, which acquires point clouds with associated color information. During the acquisition of this data set, the Kinect sensor was moved manually around an indoor scene. For the scene reconstruction, only the geometric information is processed by EMPMR, while the color information is used to assist the final assessment. Given the initial rigid transformations, EMPMR is utilized to achieve accurate multi-view registration, which is the prerequisite for the scene reconstruction. Fig. 5 displays the scene reconstructed by EMPMR. Since the raw RGB-D data contain noise, the multi-view registration results should be filtered to obtain a good scene reconstruction. As shown in Fig. 5, EMPMR has the potential for multi-view registration of RGB-D data and can be applied to scene reconstruction.
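Once the rigid transformations are estimated, assembling the scene amounts to mapping every view into a common frame and stacking the results, with the colors carried along unchanged. The following minimal sketch assumes each transformation is given as a rotation matrix R and a translation vector t acting as x' = R x + t; the function name `merge_views` and the array layout are hypothetical and not taken from the paper.

```python
import numpy as np


def merge_views(points, colors, rotations, translations):
    """Map each view into the common frame and stack all views.

    points:       list of (N_k, 3) arrays with xyz coordinates per view
    colors:       list of (N_k, 3) arrays with RGB values per view
    rotations:    list of (3, 3) rotation matrices, one per view
    translations: list of (3,) translation vectors, one per view
    """
    merged_xyz = []
    for P, R, t in zip(points, rotations, translations):
        # Row-vector form of x' = R x + t applied to every point of the view.
        merged_xyz.append(P @ R.T + t)
    # Colors are not transformed; they are only reordered consistently.
    return np.vstack(merged_xyz), np.vstack(colors)
```

Any noise filtering of the merged cloud would be applied after this step.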

7 Conclusion

Under the expectation-maximization perspective, this paper proposes an effective approach for the registration of multi-view point sets. It assumes that each data point is generated from one GMM, which is specified by all NNs of this data point in the other point sets. In this way, the proposed method treats all point sets on an equal footing: different points are realizations of the same number of GMMs, and the multi-view registration is cast as a maximum likelihood estimation problem. To achieve the multi-view registration, an EM-based algorithm is derived to maximize the likelihood function and estimate the rigid transformations for all point sets. Experimental results illustrate its superior performance over state-of-the-art approaches in both accuracy and efficiency. Moreover, this approach can be applied to 3D scene reconstruction and 3D mapping of small environments. Our future work will extend this approach to the problem of robot 3D mapping.

Acknowledgment

This work is supported by the National Natural Science Foundation of China under Grant No. 61573273 and in part by the State Key Laboratory of Rail Transit Engineering Informatization (FSDI) under Grant Nos. SKLKZ19-01 and SKLK19-09. We would also like to thank Andrea Torsello for providing the Angel and Hand data sets.

References

  • [1] D. Aiger, N. J. Mitra, and D. Cohen-Or (2008) 4-points congruent sets for robust pairwise surface registration. In ACM transactions on graphics (TOG), Vol. 27, pp. 85. Cited by: §1.
  • [2] F. Arrigoni, B. Rossi, and A. Fusiello (2016) Global registration of 3d point sets via lrs decomposition. In European Conference on Computer Vision, pp. 489–504. Cited by: §2.
  • [3] R. Bergevin, M. Soucy, H. Gagnon, and D. Laurendeau (1996) Towards a general multi-view registration technique. IEEE Transactions on Pattern Analysis and Machine Intelligence 18 (5), pp. 540–547. Cited by: §2.
  • [4] P. J. Besl and N. D. McKay (1992) Method for registration of 3-d shapes. In Sensor fusion IV: control paradigms and data structures, Vol. 1611, pp. 586–606. Cited by: §1.
  • [5] Y. Chen and G. Medioni (1992) Object modelling by registration of multiple range images. Image and vision computing 10 (3), pp. 145–155. Cited by: §1, §2.
  • [6] D. Chetverikov, D. Stepanov, and P. Krsek (2005) Robust euclidean alignment of 3d point sets: the trimmed iterative closest point algorithm. Image and vision computing 23 (3), pp. 299–309. Cited by: §2.
  • [7] A. Dai, M. Nießner, M. Zollhöfer, S. Izadi, and C. Theobalt (2017) Bundlefusion: real-time globally consistent 3d reconstruction using on-the-fly surface reintegration. ACM Transactions on Graphics (ToG) 36 (3), pp. 24. Cited by: §1.
  • [8] G. D. Evangelidis and R. Horaud (2017) Joint alignment of multiple point sets with batch and incremental expectation-maximization. IEEE transactions on pattern analysis and machine intelligence 40 (6), pp. 1397–1410. Cited by: §2, TABLE III, §5, §6.
  • [9] W. Gao and R. Tedrake (2019) Filterreg: robust and efficient probabilistic point-set registration using gaussian filter and twist parameterization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 11095–11104. Cited by: §2.
  • [10] V. M. Govindu and A. Pooja (2013) On averaging multiview relations for 3d scan registration. IEEE Transactions on Image Processing 23 (3), pp. 1289–1302. Cited by: §2, TABLE III, §5.
  • [11] R. Guo, J. Zhu, Y. Li, D. Chen, Z. Li, and Y. Zhang (2018) Weighted motion averaging for the registration of multi-view range scans. Multimedia Tools and Applications 77 (9), pp. 10651–10668. Cited by: §2.
  • [12] B. Jian and B. C. Vemuri (2010) Robust point set registration using gaussian mixture models. IEEE transactions on pattern analysis and machine intelligence 33 (8), pp. 1633–1645. Cited by: §2.
  • [13] Z. Jiang, J. Zhu, Y. Li, J. Wang, Z. Li, and H. Lu (2019) Simultaneous merging multiple grid maps using the robust motion averaging. Journal of Intelligent & Robotic Systems 94 (3-4), pp. 655–668. Cited by: §1.
  • [14] C. Jin, J. Zhu, Y. Li, S. Pang, L. Chen, and J. Wang (2018) Multi-view registration based on weighted lrs matrix decomposition of motions. IET Computer Vision 13 (4), pp. 376–384. Cited by: §2.
  • [15] S. Krishnan, P. Y. Lee, J. B. Moore, S. Venkatasubramanian, et al. (2005) Global registration of multiple 3d point sets via optimization-on-a-manifold.. In Symposium on Geometry Processing, pp. 187–196. Cited by: §2.
  • [16] H. Lei, G. Jiang, and L. Quan (2017) Fast descriptors and correspondence propagation for robust global point cloud registration. IEEE Transactions on Image Processing 26 (8), pp. 3614–3623. Cited by: §1, §2.
  • [17] E. Lomonosov, D. Chetverikov, and A. Ekárt (2006) Pre-registration of arbitrarily oriented 3d surfaces using a genetic algorithm. Pattern Recognition Letters 27 (11), pp. 1201–1208. Cited by: §2.
  • [18] B. C. M. Levoy and K. Pull The stanford 3d scanning repository. Cited by: §5.
  • [19] X. Mateo, X. Orriols, and X. Binefa (2014) Bayesian perspective for the registration of multiple 3d views. Computer Vision and Image Understanding 118, pp. 84–96. Cited by: §2.
  • [20] A. Myronenko and X. Song (2009) On the closed-form solution of the rotation matrix arising in computer vision problems. arXiv preprint arXiv:0904.1613. Cited by: §4.2.
  • [21] A. Myronenko and X. Song (2010) Point set registration: coherent point drift. IEEE transactions on pattern analysis and machine intelligence 32 (12), pp. 2262–2275. Cited by: §2.
  • [22] A. Nüchter, J. Elseberg, P. Schneider, and D. Paulus (2010) Study of parameterizations for the rigid body transformations of the scan registration problem. Computer Vision and Image Understanding 114 (8), pp. 963–980. Cited by: §4.2.
  • [23] A. Nuchter, K. Lingemann, and J. Hertzberg (2007) Cached kd tree search for icp algorithms. In Sixth International Conference on 3-D Digital Imaging and Modeling (3DIM 2007), pp. 419–426. Cited by: §4.1.1.
  • [24] S. Rusinkiewicz and M. Levoy (2001) Efficient variants of the icp algorithm.. In 3dim, Vol. 1, pp. 145–152. Cited by: §2.
  • [25] R. B. Rusu, N. Blodow, and M. Beetz (2009) Fast point feature histograms (fpfh) for 3d registration. In 2009 IEEE International Conference on Robotics and Automation, pp. 3212–3217. Cited by: §2.
  • [26] R. Sandhu, S. Dambreville, and A. Tannenbaum (2009) Point set registration via particle filtering and stochastic dynamics. IEEE transactions on pattern analysis and machine intelligence 32 (8), pp. 1459–1473. Cited by: §2.
  • [27] S. Shih, Y. Chuang, and T. Yu (2008) An efficient and accurate method for the relaxation of multiview registration error. IEEE Transactions on Image Processing 17 (6), pp. 968–981. Cited by: §2.
  • [28] Y. Tang and J. Feng (2015) Hierarchical multiview rigid registration. In Computer Graphics Forum, Vol. 34, pp. 77–87. Cited by: §2.
  • [29] A. Torsello, E. Rodola, and A. Albarelli (2011) Multiview registration via graph diffusion of dual quaternions. In CVPR 2011, pp. 2441–2448. Cited by: §2, §5.
  • [30] J. Yang, H. Li, D. Campbell, and Y. Jia (2015) Go-icp: a globally optimal solution to 3d icp point-set registration. IEEE transactions on pattern analysis and machine intelligence 38 (11), pp. 2241–2254. Cited by: §1.
  • [31] F. Yu, J. Xiao, and T. Funkhouser (2015) Semantic alignment of lidar data at city scale. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1722–1731. Cited by: §1.
  • [32] J. Zhu, Z. Jiang, G. D. Evangelidis, C. Zhang, S. Pang, and Z. Li (2019) Efficient registration of multi-view point sets by k-means clustering. Information Sciences 488, pp. 205–218. Cited by: §2, §5.
  • [33] J. Zhu, Z. Li, S. Du, L. Ma, and T. Zhang (2014) Surface reconstruction via efficient and accurate registration of multiview range scans. Optical Engineering 53 (10), pp. 102104. Cited by: §2, TABLE III.