1 Introduction
A variety of applications, including 3D reconstruction, tracking, pose estimation, and object detection, invoke 3D registration as part of their operation [35, 5, 33]. To maximize the accuracy and speed of 3D registration, researchers have developed geometric feature descriptors [22, 11, 41, 9], pose optimization algorithms [38, 45, 24, 50], and end-to-end feature learning and registration pipelines [41, 2].
In particular, recent end-to-end registration networks have proven effective compared to classical pipelines. However, these end-to-end approaches have drawbacks that limit their accuracy and applicability. For example, PointNetLK [2] uses globally pooled features to encode the entire geometry of a point cloud, which decreases spatial acuity and registration accuracy. Deep Closest Point [41] makes strong assumptions on the distribution of points and correspondences, which do not hold for partially overlapping 3D scans.
In this work, we propose three modules for robust and accurate registration that resolve these drawbacks: a 6-dimensional convolutional network for correspondence confidence estimation, a differentiable Weighted Procrustes method for scalable registration, and a robust optimizer for fine-tuning the final alignment.
The first component is a 6-dimensional convolutional network that analyzes the geometry of 3D correspondences and estimates their accuracy. Our approach is inspired by a number of learning-based methods for estimating the validity of correspondences in 2D [46, 34] and 3D [31]. These methods stack the coordinates of each correspondence pair, forming one higher-dimensional vector per correspondence. Prior methods treat these vectors as an unordered set and apply global set-processing models for analysis. Such models largely disregard local geometric structure. Yet the correspondences are embedded in a metric space that induces distances and neighborhood relationships. In particular, 3D correspondences form a geometric structure in 6-dimensional space [8], and we use a high-dimensional convolutional network to analyze this 6D structure and estimate the likelihood that a given correspondence is correct (i.e., an inlier).

The second component we develop is a differentiable Weighted Procrustes solver. The Procrustes method [15] provides a closed-form solution for rigid registration in SE(3). A differentiable version of the Procrustes method by Wang et al. [41] has been used for end-to-end registration. However, the differentiable Procrustes method passes gradients through coordinates, which requires quadratic time and memory in the number of keypoints, limiting how many keypoints the network can process. We use the inlier probabilities predicted by our first module (the 6D convolutional network) to guide the Procrustes method, thus forming a differentiable Weighted Procrustes method. This method passes gradients through the weights associated with correspondences rather than through correspondence coordinates. The computational complexity of the Weighted Procrustes method is linear in the number of correspondences, allowing the registration pipeline to use dense correspondence sets rather than sparse keypoints. This substantially increases registration accuracy.
Our third component is a robust optimization module that fine-tunes the alignment produced by the Weighted Procrustes solver. This optimization module minimizes a differentiable loss via gradient descent on a continuous representation space [52]. The optimization is fast since it does not require neighbor search in the inner loop [48].
Experimentally, we validate the presented modules on a real-world pairwise registration benchmark [47] and large-scale scene reconstruction datasets [18, 6, 32]. We show that our modules are robust, accurate, and fast in comparison to both classical global registration algorithms [50, 35, 45] and recent end-to-end approaches [31, 41, 2]. All training and experiment scripts are available at https://github.com/chrischoy/DeepGlobalRegistration.
2 Related Work
We divide the related work into three categories following the stages of standard registration pipelines that deal with real-world 3D scans: feature-based correspondence matching, outlier filtering, and pose optimization.

Feature-based correspondence matching.
The first step in many 3D registration pipelines is feature extraction. Local and global geometric structure in 3D is analyzed to produce highdimensional feature descriptors, which can then be used to establish correspondences.
Traditional hand-crafted features commonly summarize pairwise or higher-order relationships in histograms [20, 37, 40, 36, 35]. Recent work has shifted to learning features via deep networks [47, 22]. A number of recent methods are based on global pooling models [12, 11, 49], while others use convolutional networks [14, 9].
Our work is agnostic to the feature extraction mechanism. Our modules primarily address subsequent stages of the registration pipeline and are compatible with a wide variety of feature descriptors.
Outlier filtering. Correspondences produced by matching features are commonly heavily contaminated by outliers. These outliers need to be filtered out for robust alignment. A widely used family of techniques for robust model fitting is based on RANdom SAmple Consensus (RANSAC) [38, 1, 35, 29, 19], which iteratively samples small sets of correspondences in the hope of sampling a subset that is free from outliers. Other algorithms are based on branch-and-bound [45], semidefinite programming [28, 25], and maximal clique selection [44]. These methods are accurate, but commonly require longer iterative sampling or more expensive computation as the signal-to-noise ratio decreases. One exception is TEASER [44], which remains effective even with high outlier rates. Other methods use robust loss functions to reject outliers during optimization [50, 4].

Our work uses a convolutional network to identify inliers and outliers. The network needs only one feed-forward pass at test time and does not require iterative optimization.
Pose optimization. Pose optimization is the final stage that minimizes an alignment objective on filtered correspondences. Iterative Closest Point (ICP) [3] and Fast Global Registration (FGR) [50] use second-order optimization to optimize poses. Makadia et al. [26] propose an iterative procedure to minimize correlation scores. Maken et al. [27] propose to accelerate this process by stochastic gradient descent.
Recent end-to-end frameworks combine feature learning and pose optimization. Aoki et al. [2] combine PointNet global features with an iterative pose optimization method [24]. Wang et al. [41, 42] train graph neural network features by backpropagating through pose optimization.
We further advance this line of work. In particular, our Weighted Procrustes method reduces the complexity of optimization from quadratic to linear and enables the use of dense correspondences for highly accurate registration of realworld scans.
3 Deep Global Registration
3D reconstruction systems typically take a sequence of partial 3D scans as input and recover a complete 3D model of the scene. These partial scans are scene fragments, as shown in Fig. 1. In order to reconstruct the scene, reconstruction systems often begin by aligning pairs of fragments [6]. This stage is known as pairwise registration. The accuracy and robustness of pairwise registration are critical and often determine the accuracy of the final reconstruction.
Our pairwise registration pipeline begins by extracting pointwise features. These are matched to form a set of putative correspondences. We then use a highdimensional convolutional network (ConvNet) to estimate the veracity of each correspondence. Lastly, we use a Weighted Procrustes method to align 3D scans given correspondences with associated likelihood weights, and refine the result by optimizing a robust objective.
The following notation will be used throughout the paper. We consider two point clouds, $X$ and $Y$, with $N_X$ and $N_Y$ points respectively, where $x_i, y_j \in \mathbb{R}^3$. A correspondence between $x_i \in X$ and $y_j \in Y$ is denoted as $(x_i, y_j)$ or $(i, j)$.
3.1 Feature Extraction
To prepare for registration, we extract pointwise features that summarize geometric context in the form of vectors in metric feature space. Our pipeline is compatible with many feature descriptors. We use Fully Convolutional Geometric Features (FCGF) [9], which have recently been shown to be both discriminative and fast. FCGF are also compact, with dimensionality as low as 16 to 32, which supports rapid neighbor search in feature space.
3.2 Correspondence Confidence Prediction
Given the features of two 3D scans, we use the nearest neighbor in feature space to generate a set of putative correspondences, or matches, $M$. This procedure is deterministic, and it can be hand-crafted to filter out noisy correspondences with ratio or reciprocity tests [50]. However, we propose to learn this heuristic filtering process with a convolutional network that analyzes the underlying geometric structure of the correspondence set.
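As a concrete illustration of the reciprocity test mentioned above, a brute-force mutual nearest-neighbor filter can be sketched in a few lines of NumPy; the function and variable names here are ours, not taken from the released code:

```python
import numpy as np

def mutual_nearest_matches(feat_x, feat_y):
    """Reciprocity test: keep (i, j) only if j is the nearest neighbor of
    feature i in Y's feature set AND i is the nearest neighbor of j in X's."""
    # Pairwise squared distances between the two feature sets.
    d2 = ((feat_x[:, None, :] - feat_y[None, :, :]) ** 2).sum(-1)
    nn_xy = d2.argmin(axis=1)  # nearest y-feature for each x-feature
    nn_yx = d2.argmin(axis=0)  # nearest x-feature for each y-feature
    return [(i, int(j)) for i, j in enumerate(nn_xy) if nn_yx[j] == i]
```

A production implementation would replace the quadratic distance matrix with an approximate nearest-neighbor index, but the filtering logic is the same.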
We first provide a 1-dimensional analogy to explain the geometry of correspondences. Let $X$ be a set of 1-dimensional points and let $Y$ be a translation of $X$: $y_i = x_i + t$. If an algorithm returns a set of possible correspondences $(x_i, y_j)$, then the correct correspondences (inliers) will form a line $y = x + t$ in the 2-dimensional correspondence space, whereas incorrect correspondences (outliers) will form random noise off the line. If we extend this to 3D scans and point clouds, we can represent a 3D correspondence $(x_i, y_j)$ as a point in 6-dimensional space. The inlier correspondences will be distributed on a lower-dimensional surface in this 6D space, determined by the geometry of the 3D input. We denote by $M^*$ the set of inliers, i.e., the correspondences that align accurately up to a threshold under the ground-truth transformation $T^*$. Meanwhile, the outliers will be scattered off this surface. To identify the inliers, we use a convolutional network. Such networks have proven effective in related dense prediction tasks, such as 3D point cloud segmentation [16, 7]. The convolutional network in our setting operates in 6-dimensional space [8]. The network predicts a likelihood for each correspondence, i.e., for each point in 6D space. The prediction is interpreted as the likelihood that the correspondence is true: an inlier.
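The 1-dimensional analogy above can be checked numerically. The values below are hypothetical, chosen only to mirror the inlier/outlier split described in the text (five inliers on the line, two outliers off it):

```python
import numpy as np

# Toy 1-D example: Y is X translated by t = 10.
x = np.arange(5.0)          # X = {0, 1, 2, 3, 4}
t = 10.0
inliers = [(xi, xi + t) for xi in x]   # lie exactly on the line y = x + t
outliers = [(0.0, 13.0), (3.0, 11.0)]  # mismatched pairs, off the line

def residual(pair, t):
    """Distance of a correspondence (x, y) to the inlier line y = x + t."""
    xi, yi = pair
    return abs(yi - (xi + t))
```

Inliers have zero residual; outliers have strictly positive residual. In 6D, the "line" becomes a surface determined by the rigid motion, and the network learns to recognize it instead of thresholding a residual.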
Note that the convolution operator is translation invariant, so our 6D ConvNet generates the same output regardless of the absolute position of the inputs in 3D. We use a network architecture similar to that of Choy et al. [9] to create a 6D convolutional network with skip connections between layers of the same spatial resolution. The architecture of the 6D ConvNet is shown in Fig. 2. During training, we use the binary cross-entropy loss between the predicted likelihood $p_{(i,j)}$ that a correspondence $(i, j)$ is an inlier and the ground-truth correspondences $M^*$ to optimize the network parameters:
\mathcal{L}_{bce} = -\frac{1}{|M|} \sum_{(i,j) \in M} \Big[ \mathbb{1}[(i,j) \in M^*] \log p_{(i,j)} + \mathbb{1}[(i,j) \notin M^*] \log\big(1 - p_{(i,j)}\big) \Big]   (1)

where $\mathbb{1}[\cdot]$ is the indicator function, $M^*$ is the set of ground-truth inlier correspondences, and $|M|$ is the cardinality of the set of putative correspondences $M$.
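A minimal NumPy sketch of this binary cross-entropy loss over putative matches (the clipping constant `eps` is our addition for numerical stability; it is not part of the loss definition):

```python
import numpy as np

def correspondence_bce(p, is_inlier, eps=1e-12):
    """Binary cross-entropy averaged over the putative correspondence set M.
    p: predicted inlier likelihoods in (0, 1); is_inlier: 0/1 ground truth."""
    p = np.clip(p, eps, 1.0 - eps)  # guard against log(0)
    return -np.mean(is_inlier * np.log(p) + (1 - is_inlier) * np.log(1 - p))
```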
3.3 Weighted Procrustes for SE(3)
The inlier likelihood estimated by the 6D ConvNet provides a weight for each correspondence. The original Procrustes method [15] minimizes the mean squared error between corresponding points and thus gives equal weight to all correspondences. In contrast, we minimize a weighted mean squared error. This change allows us to pass gradients through the weights rather than through the point coordinates [41], and enables the optimization to scale to dense correspondence sets.
Formally, Weighted Procrustes analysis minimizes:

e^2(R, t) = \sum_{(i,j) \in M} \tilde{w}_{(i,j)} \, \|y_j - R x_i - t\|^2   (2)

= \sum_{k=1}^{|M|} \tilde{w}_k \, \|y_{j_k} - R x_{i_k} - t\|^2   (3)

= \big\| (Y - R X - t \mathbf{1}^\top) \, W^{1/2} \big\|_F^2   (4)

where $X = [x_{i_1}, \dots, x_{i_{|M|}}]$, $Y = [y_{j_1}, \dots, y_{j_{|M|}}]$, and $\mathbf{1} = (1, \dots, 1)^\top$. $M$ is a list of index pairs $(i_k, j_k)$ that defines the correspondences $(x_{i_k}, y_{j_k})$. $w$ is the weight vector, and $\tilde{w} = \phi(w) / \sum_k \phi(w)_k$ denotes the normalized weight after a nonlinear transformation $\phi$ that applies heuristic prefiltering. $W = \operatorname{diag}(\tilde{w}_1, \dots, \tilde{w}_{|M|})$ forms the diagonal weight matrix.
Theorem 1: The $\hat{R}$ and $\hat{t}$ that minimize the squared error $e^2(R, t)$ are $\hat{t} = \bar{y} - \hat{R} \bar{x}$ and $\hat{R} = U S V^\top$, where $\bar{x} = \sum_k \tilde{w}_k x_{i_k}$ and $\bar{y} = \sum_k \tilde{w}_k y_{j_k}$ are the weighted centroids, $U \Sigma V^\top = \mathrm{SVD}(H)$ is the singular value decomposition of the weighted cross-covariance $H = \sum_k \tilde{w}_k (y_{j_k} - \bar{y})(x_{i_k} - \bar{x})^\top$, and $S = \operatorname{diag}(1, 1, \det(U) \det(V))$ ensures that $\hat{R}$ is a proper rotation with $\det(\hat{R}) = 1$.
Sketch of proof. First, we differentiate $e^2$ w.r.t. $t$ and set the partial derivative to 0. This gives $\hat{t} = \bar{y} - R \bar{x}$. Next, we substitute $\hat{t}$ into Eq. 4 and do the same for $R$. Expanding the squares yields two non-negative terms that do not depend on $R$, plus a cross term $-2 \operatorname{Tr}(R^\top H)$. We maximize this last term, whose maximum over rotations is the sum of the singular values of $H$, which leads to $\hat{R} = U S V^\top$. The full derivation is in the supplement.

We can easily extend the above theorem to incorporate a scaling factor, or anisotropic scaling for tasks such as scan-to-CAD registration, but in this paper we assume that partial scans of the same scene have the same scale.
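The closed-form solution of Theorem 1 can be sketched as a weighted variant of the Kabsch solver in NumPy (function and variable names are ours, not from the released code):

```python
import numpy as np

def weighted_procrustes(x, y, w):
    """Closed-form weighted rigid alignment (weighted Kabsch sketch).
    x, y: (n, 3) corresponding points; w: (n,) non-negative weights."""
    w = w / w.sum()                    # normalized weights w~
    mx, my = w @ x, w @ y              # weighted centroids
    xc, yc = x - mx, y - my
    H = (yc * w[:, None]).T @ xc       # 3x3 weighted cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # Force a proper rotation (det = +1), never a reflection.
    S = np.diag([1.0, 1.0, np.linalg.det(U) * np.linalg.det(Vt)])
    R = U @ S @ Vt
    t = my - R @ mx
    return R, t
```

Note that the gradient of the output with respect to `w` is well defined, which is what lets the solver sit inside an end-to-end pipeline.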
The Weighted Procrustes method generates rotation $\hat{R}$ and translation $\hat{t}$ as outputs that depend on the weight vector $w$. In our current implementation, $\hat{R}$ and $\hat{t}$ are directly passed to the robust registration module in Section 4 as an initial pose. However, we briefly demonstrate that they can also be embedded in an end-to-end registration pipeline, since Weighted Procrustes is differentiable. From a top-level loss function $\mathcal{L}$ of $\hat{R}$ and $\hat{t}$, we can pass the gradient through the closed-form solver and update parameters in downstream modules:
\frac{\partial \mathcal{L}}{\partial w_k} = \frac{\partial \mathcal{L}}{\partial \hat{R}} \frac{\partial \hat{R}}{\partial w_k} + \frac{\partial \mathcal{L}}{\partial \hat{t}} \frac{\partial \hat{t}}{\partial w_k}   (5)

where $\mathcal{L}$ can be defined as the combination of differentiable translation error (TE) and rotation error (RE) between the predictions $(\hat{R}, \hat{t})$ and the ground truth $(R^*, t^*)$:
\mathrm{TE}(\hat{t}) = \|\hat{t} - t^*\|_2   (6)

\mathrm{RE}(\hat{R}) = \arccos\Big(\frac{\operatorname{Tr}(\hat{R}^\top R^*) - 1}{2}\Big)   (7)
or the Frobenius norm of the relative transformation matrices as defined in [2, 41]. The final loss is the weighted sum of Eq. 1, Eq. 6, and Eq. 7.
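The error metrics of Eqs. 6 and 7 can be sketched as follows (returning RE in degrees, as the experiments report; the clip guards against floating-point values just outside arccos's domain):

```python
import numpy as np

def translation_error(t_hat, t_gt):
    """TE (Eq. 6): Euclidean distance between translations."""
    return np.linalg.norm(t_hat - t_gt)

def rotation_error(R_hat, R_gt):
    """RE (Eq. 7): geodesic angle between two rotation matrices, in degrees."""
    cos = (np.trace(R_hat.T @ R_gt) - 1.0) / 2.0
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))
```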
4 Robust Registration
In this section, we propose a fine-tuning module that minimizes a robust loss function to improve registration accuracy. We use a gradient-based method to refine poses, adopting a continuous representation of rotations [52] to remove discontinuities and construct a smooth optimization space. This module initializes the pose from the prediction of the Weighted Procrustes method. During iterative optimization, unlike Maken et al. [27], who find the nearest neighbor for each point at every gradient step, we rely on the correspondence likelihoods from the 6D ConvNet, which are estimated only once per initialization.
In addition, our framework naturally offers a failure detection mechanism. In practice, Weighted Procrustes may generate numerically unstable solutions when the number of valid correspondences is insufficient due to small overlap or noisy correspondences between input scans. By computing the ratio of the sum of the filtered weights to the total number of correspondences, i.e., $\sum_k \phi(w)_k / |M|$, we can easily approximate the fraction of valid correspondences and predict whether an alignment may be unstable. When this fraction is low, we resort to a more time-consuming but accurate registration algorithm, such as RANSAC [38, 1, 35] or a branch-and-bound method [45], to find a numerically stable solution. In other words, we can detect when our system might fail before it returns a result and fall back to a more accurate but slower algorithm, unlike previous end-to-end methods that use globally pooled latent features [2] or a singly stochastic matrix [41]; such latent representations are more difficult to interpret.

4.1 SE(3) Representation and Initialization
We use the 6D representation of 3D rotation proposed by Zhou et al. [52], rather than Euler angles or quaternions. This representation uses 6 parameters $a_1, a_2 \in \mathbb{R}^3$ and can be transformed into a $3 \times 3$ orthogonal matrix by

b_1 = N(a_1), \quad b_2 = N\big(a_2 - (b_1^\top a_2) \, b_1\big), \quad b_3 = b_1 \times b_2   (8)

where $b_1$, $b_2$, $b_3$ are the columns of the rotation matrix $R = [b_1, b_2, b_3]$ and $N(\cdot)$ denotes L2 normalization. Thus, the final representation that we use is $(a_1, a_2, t)$, which is equivalent to $(R, t)$ via Eq. 8.

To initialize $(a_1, a_2)$, we simply use the first two columns of the rotation matrix $\hat{R}$, i.e., $a_1 = \hat{R}_{:,1}$ and $a_2 = \hat{R}_{:,2}$. For convenience, we define this map as $f^{-1}(R)$, though the inverse is not unique, as there are infinitely many choices of $(a_1, a_2)$ that map to the same $R$.
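Eq. 8 and the (non-unique) inverse used for initialization can be sketched as:

```python
import numpy as np

def rot6d_to_matrix(a1, a2):
    """Eq. 8: map two 3-vectors to a rotation matrix via Gram-Schmidt
    (continuous 6D representation of Zhou et al.)."""
    b1 = a1 / np.linalg.norm(a1)          # N(a1)
    a2p = a2 - (b1 @ a2) * b1             # remove the b1 component of a2
    b2 = a2p / np.linalg.norm(a2p)        # N(...)
    b3 = np.cross(b1, b2)                 # completes a right-handed frame
    return np.stack([b1, b2, b3], axis=1) # columns b1, b2, b3

def matrix_to_rot6d(R):
    """One valid inverse: the first two columns of R."""
    return R[:, 0], R[:, 1]
```

Because the columns of any rotation matrix are already orthonormal, round-tripping through `matrix_to_rot6d` and `rot6d_to_matrix` reproduces the matrix exactly.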
4.2 Energy Minimization
We use a robust loss function to fine-tune the registration between the predicted inlier correspondences. The general form of the energy function is

E(R, t) = \sum_{(i,j) \in M} \phi(w_{(i,j)}) \, L(y_j, R x_i + t)   (9)

where $w$ and $M$ are defined as in Eq. 3 and $\phi$ is a prefiltering function. In the experiments, we use $\phi(w) = \mathbb{1}[w > \tau] \, w$, which clips weights below the threshold $\tau$ elementwise, since the neural network outputs bounded confidence scores.
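Under these definitions, the energy of Eq. 9 can be sketched as follows, using a Huber pointwise loss and a hypothetical threshold `tau = 0.3` (both the threshold value and the names are ours):

```python
import numpy as np

def huber(r, delta=1.0):
    """Standard Huber penalty: quadratic near zero, linear in the tails."""
    a = np.abs(r)
    return np.where(a <= delta, 0.5 * a ** 2, delta * (a - 0.5 * delta))

def energy(R, t, x, y, w, tau=0.3):
    """Eq. 9 sketch: clipped weights phi(w) = 1[w > tau] * w over
    per-correspondence residuals ||y_j - (R x_i + t)||."""
    phi = np.where(w > tau, w, 0.0)              # elementwise clipping
    r = np.linalg.norm(y - x @ R.T - t, axis=1)  # residual per correspondence
    return float(np.sum(phi * huber(r)))
```

In the actual module this scalar is minimized by gradient descent over the $(a_1, a_2, t)$ parameters rather than over $R$ directly.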
$L(\cdot, \cdot)$ is a pointwise loss function between $R x_i + t$ and $y_j$; we use the Huber loss in our implementation. The energy function is parameterized by $R$ and $t$, which in turn are represented as $(a_1, a_2, t)$. We can apply first-order optimization algorithms such as SGD or Adam to minimize the energy function, but higher-order optimizers are also applicable since the number of parameters is small. The complete algorithm is described in Alg. 1.

5 Experiments
We analyze the proposed model in two registration scenarios: pairwise registration, where we estimate an SE(3) transformation between two 3D scans or fragments, and multiway registration, which generates a globally consistent final reconstruction and camera poses for all fragments. Here, pairwise registration serves as a critical module in multiway registration.
For pairwise registration, we use the 3DMatch benchmark [47], which consists of 3D point cloud pairs from various real-world scenes with ground-truth transformations estimated from RGB-D reconstruction pipelines [17, 10]. We follow the train/test split and the standard procedure to generate pairs with at least 30% overlap for training and testing [12, 11, 9]. For multiway registration, we use the simulated Augmented ICL-NUIM dataset [6, 18] for quantitative trajectory results, and the Indoor LiDAR RGB-D dataset [32] and Stanford RGB-D dataset [6] for qualitative registration visualizations. Note that in this experiment we use networks trained on the 3DMatch training set and do not fine-tune on the other datasets; this illustrates the generalization ability of our models. Lastly, we use KITTI LiDAR scans [13] for outdoor pairwise registration. As the official splits do not have labels for pairwise registration, we follow Choy et al. [9] to create pairwise registration train/val/test splits.
For all indoor experiments, we use 5 cm voxel downsampling [35, 51], which randomly subsamples a single point within each 5 cm voxel to generate point clouds with uniform density. For safeguard registration, we use RANSAC with a safeguard threshold of 0.05, which translates to requiring at least 5% of the correspondences to be valid. We train the learning-based state-of-the-art models and our network on the training split of the 3DMatch benchmark. During training, we augment data by applying random rotations varying from -180 to 180 degrees around a random axis. Ground-truth pointwise correspondences are found using nearest neighbor search in 3D space. We train the 6-dimensional ConvNet on a single Titan XP with batch size 4. We use SGD with an exponential learning-rate decay factor of 0.99.
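The voxel downsampling step described above can be sketched with a simple hash map over voxel indices (the hash-map implementation and names are ours; Open3D provides an equivalent built-in):

```python
import numpy as np

def voxel_downsample(points, voxel=0.05, seed=0):
    """Keep one randomly chosen point per occupied voxel (5 cm grid)."""
    rng = np.random.default_rng(seed)
    keys = np.floor(points / voxel).astype(np.int64)  # integer voxel index
    buckets = {}
    # Iterate in random order so the surviving point per voxel is random.
    for idx in rng.permutation(len(points)):
        buckets.setdefault(tuple(keys[idx]), idx)     # first hit wins
    return points[sorted(buckets.values())]
```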
5.1 Pairwise Registration
In this section, we report registration results on the test set of the 3DMatch benchmark [47], which contains 8 different scenes, as depicted in Fig. 3. We measure translation error (TE) defined in Eq. 6, rotation error (RE) defined in Eq. 7, and recall. Recall is the ratio of successful pairwise registrations, where we define a registration to be successful if its rotation error and translation error are smaller than predefined thresholds. Average TE and RE are computed only on these successfully registered pairs, since failed registrations return poses that can be drastically different from the ground truth, making the error metrics unreliable.
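The recall and error statistics described above can be sketched as follows; the default thresholds assume the 30 cm / 15 degree values used for Table 1:

```python
import numpy as np

def registration_stats(te, re, te_thresh=0.30, re_thresh=15.0):
    """Recall plus mean TE/RE computed over the successful pairs only.
    te: translation errors in meters; re: rotation errors in degrees."""
    te, re = np.asarray(te), np.asarray(re)
    ok = (te < te_thresh) & (re < re_thresh)  # success mask
    return ok.mean(), te[ok].mean(), re[ok].mean()
```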
Recall  TE (cm)  RE (deg)  Time (s)  
Ours w/o safeguard  85.2%  7.73  2.58  0.70 
Ours  91.3%  7.34  2.43  1.21 
FGR [50]  42.7%  10.6  4.08  0.31 
RANSAC2M [35]  66.1%  8.85  3.00  1.39 
RANSAC4M  70.7%  9.16  2.95  2.32 
RANSAC8M  74.9%  8.96  2.92  4.55 
GoICP [45]  22.9%  14.7  5.38  771.0 
Super4PCS [29]  21.6%  14.1  5.25  4.55 
ICP (P2Point) [51]  6.04%  18.1  8.25  0.25 
ICP (P2Plane) [51]  6.59%  15.2  6.61  0.27 
DCP [41]  3.22%  21.4  8.42  0.07 
PointNetLK [2]  1.61%  21.3  8.04  0.12 
ElasticFusion [43]  InfiniTAM [21]  BADSLAM [39]  Multiway + FGR [50]  Multiway + RANSAC [51]  Multiway + Ours  
Living room 1  66.61  46.07  fail  78.97  110.9  21.06 
Living room 2  24.33  73.64  40.41  24.91  19.33  21.88 
Office 1  13.04  113.8  18.53  14.96  14.42  15.76 
Office 2  35.02  105.2  26.34  21.05  17.31  11.56 
Avg. Rank  3  5  5  3.5  2.5  2 
We compare our methods with various classical methods [50, 35, 45] and state-of-the-art learning-based methods [41, 42, 2, 31]. All experiments are evaluated on an Intel i7-7700 CPU and a GTX 1080Ti graphics card, except for Go-ICP [45], which was tested on an Intel i7-5820K CPU. In Table 1, we measure recall with a TE threshold of 30 cm, which is typical for indoor scene relocalization [30], and an RE threshold of 15 degrees, which is practical for partially overlapping scans in our experiments. In Fig. 4, we plot the sensitivity of recall to both thresholds by varying one threshold while setting the other to infinity. Fig. 5 includes detailed statistics on the separate test scenes. Our system outperforms all baselines on recall by a large margin and consistently achieves the lowest translation and rotation errors on most scenes.
Classical methods. To compare with classical methods, we evaluate point-to-point ICP, point-to-plane ICP, RANSAC [35], and FGR [50], all implemented in Open3D [51]. In addition, we test the open-source Python bindings of Go-ICP [45] and Super4PCS [29]. For RANSAC and FGR, we extract FPFH features from voxel-downsampled point clouds. The results are shown in Table 1. ICP variants mostly fail, as the dataset contains challenging 3D scan sequences with small overlap and large camera viewpoint changes. Super4PCS, a sampling-based algorithm, performs similarly to Go-ICP, an ICP variant with branch-and-bound search.
Feature-based methods, FGR and RANSAC, perform better. When aligning 5 cm voxel-downsampled point clouds, RANSAC achieves recall as high as 70%, while FGR reaches 40%. Table 1 also shows that doubling the number of RANSAC iterations improves performance only marginally. Note that our method is about twice as fast as RANSAC with 2M iterations while achieving higher recall and registration accuracy.
Learningbased methods. We use 3DRegNet [31], Deep Closest Point (DCP) [41], PRNet [42], and PointNetLK [2] as our baselines. We train all the baselines on 3DMatch with the same setup and data augmentation as ours for all experiments.
For 3DRegNet, we follow the setup outlined in [31], except that we do not manually filter outliers with ground truth, and we train and test with the standard realistic setup. We find that the registration loss of 3DRegNet does not converge during training, and the rotation and translation errors remain consistently above 30 degrees and 1 m at test time.
We train Deep Closest Point (DCP) [41] with 1024 randomly sampled points per point cloud for 150 epochs. We initialize the network with the pretrained weights provided by the authors. Although the training loss converges, DCP fails to achieve reasonable performance on point clouds with partial overlap. DCP uses a singly stochastic matrix to find correspondences, but this formulation assumes that all points in one point cloud have at least one corresponding point in the convex hull of the other point cloud. This assumption fails when some points have no true correspondences, as is the case for partially overlapping fragments. We also tried to train PRNet [42] in our setup, but failed to get reasonable results due to random crashes and high-variance training losses.
Lastly, we fine-tune PointNetLK [2] on 3DMatch for 400 epochs, starting from the pretrained weights provided by the authors. PointNetLK uses a single globally pooled feature per point cloud and regresses the relative pose between objects; we suspect that such a globally pooled feature fails to capture complex scenes such as those in 3DMatch.
In conclusion, while current end-to-end registration approaches work well on object-centric synthetic datasets, they fail on real-world data. Unlike synthetic data, real 3D point cloud pairs contain multiple objects, partial scans, self-occlusion, substantial noise, and may have only a small degree of overlap between scans.
5.2 Multiway Registration
Multiway registration for RGB-D scans proceeds in multiple stages. First, the pipeline estimates camera poses via off-the-shelf odometry and integrates multiple 3D scans to reduce noise and generate accurate 3D fragments of a scene. Next, a pairwise registration algorithm roughly aligns all fragments, followed by multiway registration [6], which optimizes fragment poses with robust pose graph optimization [23].
We use a popular open-source implementation of this registration pipeline [51] and replace its pairwise registration stage with our proposed modules. Note that we use the networks trained on the 3DMatch training set and test on the multiway registration datasets [18, 32, 6]; this demonstrates cross-dataset generalization.
We test the modified pipeline on the Augmented ICL-NUIM dataset [6, 18] for quantitative trajectory results, and on the Indoor LiDAR RGB-D dataset [32] and the Stanford RGB-D dataset [6] for qualitative registration visualizations. We measure the absolute trajectory error (ATE) on the Augmented ICL-NUIM dataset with simulated depth noise. As shown in Table 2, compared to state-of-the-art online SLAM [43, 21, 39] and offline reconstruction methods [50], our approach yields consistently low error across scenes.
For qualitative results, we compare pairwise fragment registration on these scenes against FGR and RANSAC in Fig. 6. Full scene reconstruction results are shown in the supplement.
5.3 Outdoor LIDAR Registration
We use outdoor LiDAR scans from the KITTI dataset [13] for registration, following [9]. The registration split of Choy et al. [9] uses GPS-IMU to create pairs that are at least 10 m apart, with ground-truth transformations generated from GPS followed by ICP to fix errors in the GPS readings. We use FCGF features [9] trained on the training set of the registration split to find correspondences, and we train the 6D ConvNet for inlier confidence prediction in the same way as for indoor registration. We use a voxel size of 30 cm for downsampling point clouds in all experiments. Registration results are reported in Table 3 and visualized in Fig. 7.
Recall  TE (cm)  RE (deg)  Time (s)  
FGR [50]  0.2%  40.7  1.02  1.42 
RANSAC [35]  34.2%  25.9  1.39  1.37 
FCGF [9]  98.2%  10.2  0.33  6.38 
Ours  96.9%  21.7  0.34  2.29 
Ours + ICP  98.0%  3.46  0.14  2.51 
6 Conclusion
We presented Deep Global Registration, a learning-based framework that robustly and accurately aligns real-world 3D scans. To achieve this, we used a 6D convolutional network for inlier detection, a differentiable Weighted Procrustes algorithm for scalable registration, and a gradient-based optimizer for pose refinement. Experiments show that our approach outperforms both classical and learning-based registration methods, and can serve as a ready-to-use plugin to replace alternative registration methods in off-the-shelf scene reconstruction pipelines.
References
 [1] (2008) 4-points congruent sets for robust pairwise surface registration. ACM Transactions on Graphics. Cited by: §2, §4.
 [2] (2019) PointNetLK: robust & efficient point cloud registration using PointNet. In CVPR, Cited by: §1, §1, §1, §2, §3.3, §4, §5.1, §5.1, §5.1, Table 1.
 [3] (1992) A method for registration of 3-D shapes. IEEE Transactions on Pattern Analysis and Machine Intelligence. Cited by: §2.
 [4] (2013) Sparse iterative closest point. In Eurographics, Cited by: §2.
 [5] (2010) 3D deformable face tracking with a commodity depth camera. In ECCV, Cited by: §1.
 [6] (2015) Robust reconstruction of indoor scenes. In CVPR, Cited by: §1, §3, Figure 6, §5.2, §5.2, §5.2, §5.
 [7] (2019) 4D spatio-temporal ConvNets: Minkowski convolutional neural networks. In CVPR, Cited by: §3.2.
 [8] (2020) High-dimensional convolutional networks for geometric pattern recognition. In CVPR, Cited by: §1, §3.2.
 [9] (2019) Fully convolutional geometric features. In ICCV, Cited by: §1, §2, §3.1, §3.2, §5.3, Table 3, §5.
 [10] (2017) BundleFusion: real-time globally consistent 3D reconstruction using on-the-fly surface reintegration. ACM Transactions on Graphics. Cited by: §5.
 [11] (2018) PPF-FoldNet: unsupervised learning of rotation invariant 3D local descriptors. In ECCV, Cited by: §1, §2, §5.
 [12] (2018) PPFNet: global context aware local features for robust 3D point matching. In CVPR, Cited by: §2, §5.
 [13] (2012) Are we ready for autonomous driving? The KITTI vision benchmark suite. In CVPR, Cited by: §5.3, Table 3, §5.
 [14] (2019) The perfect match: 3D point cloud matching with smoothed densities. In CVPR, Cited by: §2.
 [15] (1975) Generalized procrustes analysis. Psychometrika. Cited by: §1, §3.3.
 [16] (2018) 3D semantic segmentation with submanifold sparse convolutional networks. In CVPR, Cited by: §3.2.
 [17] (2017) Fine-to-coarse global registration of RGB-D scans. In CVPR, Cited by: §5.
 [18] (2014) A benchmark for RGB-D visual odometry, 3D reconstruction and SLAM. In ICRA, Cited by: §1, §5.2, §5.2, §5.
 [19] (2015) Registration with the point cloud library: a modular framework for aligning in 3D. RAL. Cited by: §2.
 [20] (1999) Using spin images for efficient object recognition in cluttered 3D scenes. Pattern Analysis and Machine Intelligence. Cited by: §2.
 [21] (2016) Real-time large-scale dense 3D reconstruction with loop closure. In ECCV, Cited by: §5.2, Table 2.
 [22] (2017) Learning compact geometric features. In ICCV, Cited by: §1, §2.
 [23] (2011) G2o: a general framework for graph optimization. In ICRA, Cited by: §5.2.
 [24] (1981) An iterative image registration technique with an application to stereo vision. In DARPA Image Understanding Workshop, Cited by: §1, §2.
 [25] (2003) A global solution to sparse correspondence problems. Pattern Analysis and Machine Intelligence. Cited by: §2.
 [26] (2006) Fully automatic registration of 3D point clouds. In CVPR, Cited by: §2.
 [27] (2019) Speeding up iterative closest point using stochastic gradient descent. In ICRA, Cited by: §2, §4.
 [28] (2016) Point registration via efficient convex relaxation. ACM Transactions on Graphics. Cited by: §2.
 [29] (2014) Super 4PCS fast global pointcloud registration via smart indexing. Computer Graphics Forum. Cited by: §2, §5.1, Table 1.
 [30] (2015) ORB-SLAM: a versatile and accurate monocular SLAM system. IEEE Transactions on Robotics. Cited by: §5.1.
 [31] (2019) 3DRegNet: a deep neural network for 3D point registration. arXiv. Cited by: §1, §1, §5.1, §5.1, §5.1.
 [32] (2017) Colored point cloud registration revisited. In ICCV, Cited by: §1, Figure 6, §5.2, §5.2, §5.
 [33] (2014) Realtime and robust hand tracking from depth. In CVPR, Cited by: §1.
 [34] (2018) Deep fundamental matrix estimation. In ECCV, Cited by: §1.
 [35] (2009) Fast point feature histograms (FPFH) for 3D registration. In ICRA, Cited by: Figure 1, (a)a, §1, §1, §2, §2, §4, Figure 6, §5.1, §5.1, Table 1, Table 3, §5.
 [36] (2008) Aligning point cloud views using persistent feature histograms. In IROS, Cited by: §2.
 [37] (2014) SHOT: unique signatures of histograms for surface and texture description. CVIU. Cited by: §2.
 [38] (2007) Efficient RANSAC for pointcloud shape detection. Computer Graphics Forum. Cited by: §1, §2, §4.
 [39] (2019) BAD SLAM: bundle adjusted direct RGBD SLAM. In CVPR, Cited by: §5.2, Table 2.
 [40] (2010) Unique shape context for 3D data description. In ACM Workshop on 3D Object Retrieval, Cited by: §2.
 [41] (2019) Deep closest point: learning representations for point cloud registration. In ICCV, Cited by: Figure 1, (c)c, §1, §1, §1, §1, §2, §3.3, §3.3, §4, §5.1, §5.1, §5.1, Table 1.
 [42] (2019) PRNet: self-supervised learning for partial-to-partial registration. In NeurIPS, Cited by: §2, §5.1, §5.1, §5.1.
 [43] (2015) ElasticFusion: dense SLAM without a pose graph. In Robotics: Science and Systems, Cited by: §5.2, Table 2.
 [44] (2019) A polynomialtime solution for robust registration with extreme outlier rates. In Robotics: Science and Systems, Cited by: §2.
 [45] (2015) Go-ICP: a globally optimal solution to 3D ICP point-set registration. Pattern Analysis and Machine Intelligence. Cited by: §1, §1, §2, §4, §5.1, §5.1, Table 1.
 [46] (2018) Learning to find good correspondences. In CVPR, Cited by: §1.
 [47] (2017) 3DMatch: learning local geometric descriptors from RGBD reconstructions. In CVPR, Cited by: Figure 1, §1, §2, Figure 3, §5.1, §5.
 [48] (1994) Iterative point matching for registration of free-form curves and surfaces. IJCV. Cited by: §1.
 [49] (2019) NM-Net: mining reliable neighbors for robust feature correspondences. In CVPR, Cited by: §2.
 [50] (2016) Fast global registration. In ECCV, Cited by: Figure 1, (b)b, §1, §1, §2, §2, §3.2, Figure 6, §5.1, §5.1, §5.2, Table 1, Table 2, Table 3.
 [51] (2018) Open3D: a modern library for 3D data processing. arXiv. Cited by: §5.1, §5.2, Table 1, Table 2, §5.
 [52] (2019) On the continuity of rotation representations in neural networks. In CVPR, Cited by: §1, §4.1, §4.