Do not Omit Local Minimizer: a Complete Solution for Pose Estimation from 3D Correspondences

04/03/2019 ∙ by Lipu Zhou, et al. ∙ Microsoft, Carnegie Mellon University

Estimating pose from given 3D correspondences, including point-to-point, point-to-line and point-to-plane correspondences, is a fundamental task in computer vision with many applications. We present a complete solution for this task, covering both the minimal problem and the least-squares problem. Previous works mainly focused on finding the global minimizer of the least-squares problem. However, existing methods that can reach the global minimizer are still unsuitable for real-time applications. Furthermore, as one of the contributions of this paper, we prove that there exist ambiguous configurations for any number of lines and planes. These configurations have several solutions in theory, so the correct solution may come from a local minimizer. Our algorithm is efficient and able to reveal local minimizers. We employ the Cayley-Gibbs-Rodriguez (CGR) parameterization of the rotation to derive a general rational cost for the three cases of 3D correspondences. The main contribution of this paper is to solve the resulting equation system of the minimal problem and the first-order optimality conditions of the least-squares problem, both of which have complicated rational forms. The central idea of our algorithm is to introduce intermediate unknowns to simplify the problem. Extensive experimental results show that our algorithm significantly outperforms previous algorithms when the number of correspondences is small. Moreover, when the global minimizer is the solution, our algorithm achieves the same accuracy as previous algorithms with guaranteed global optimality, while remaining fast enough for real-time applications.


1 Introduction

Estimating the pose from 3D correspondences, i.e., point-to-point, point-to-line and point-to-plane correspondences, is known as the 3D registration problem in the literature [25, 24, 26, 40, 4, 41]. It is one of the fundamental problems in computer vision with a wide range of applications, such as simultaneous localization and mapping (SLAM) [45, 44, 23, 30, 37], extrinsic calibration [47, 38, 22, 11, 49] and the iterative closest point (ICP) framework [3]. Besides, some camera pose estimation problems, such as the perspective-n-point (PnP) problem [12, 16] and the perspective-n-line (PnL) problem [42, 29], can be transformed into a 3D registration problem [41]. However, research on this problem is not as thorough as on other pose estimation problems.

Previous works mainly focus on solving the least-squares problem. Although great progress has been made, achieving both the optimal solution and real-time performance is still a challenge. Some algorithms [26, 4] are capable of finding the global minimum; however, their running time makes them unsuitable for real-time applications. Recently, Wientapper [41] provided an efficient algorithm, but it cannot guarantee global optimality in theory and runs the risk of returning no solution. Furthermore, previous works generally assume that the correct solution is the global minimizer. However, we find that the correct solution may come from a local minimizer in certain configurations.

A minimal solution is important for eliminating outliers in the RANSAC framework [9]. As the pose has 6 degrees of freedom (DoF), any combination of 3D correspondences providing 6 independent constraints forms a minimal configuration. Previous algorithms solve these minimal configurations case by case [6, 32, 31]. The contributions of this paper are as follows:

First, we prove that there exist ambiguous configurations for any number of plane and line correspondences, which lead to multiple solutions. When a configuration is close to an ambiguous one, the correct solution of a problem may come from a local minimum. Therefore, previous works [25, 26, 4] that only compute the global minimizer will fail in this case. Revealing local minimizers is essential for an algorithm to handle all the configurations.

Second, we present an efficient and accurate solution for the least-squares 3D registration problem. We use the CGR parametrization to represent the rotation, which generates a rational cost function. We derive its first-order optimality conditions, which form a high-order polynomial system that is hard to solve. Four intermediate unknowns are introduced to relax the original problem, which results in much simpler first-order optimality conditions. The Gröbner basis method [8] is applied to solve this equation system. Then we refine the solution by the Newton-Raphson method.

Third, we present a unified solution for the potential minimal configurations. Previous algorithms solve the minimal problems case by case [6, 32, 31]. We again use the CGR parametrization to represent the rotation, which generates a third-order equation system for the minimal configurations. Three intermediate unknowns are introduced to simplify the equation system. We introduce a novel hidden variable method [8] to solve the resulting second-order equation system for the rotation matrix. Then the translation can be computed from a linear system.

We evaluate our algorithm with synthetic and real data. Extensive experimental results show that recovering local minimizers is essential, especially when the number of correspondences is small. For a small number of correspondences, our algorithm significantly outperforms previous works [25, 26, 4, 41]. Besides, experimental results verify that our algorithm can converge to the global minimizer with the same accuracy as previous works with guaranteed global optimality [26, 4], while being much faster.

2 Related Works

This paper focuses on pose estimation from point-to-point, point-to-line and point-to-plane correspondences. Pose estimation from point-to-point correspondences was solved in early works [1, 13, 14], and closed-form solutions exist for this problem. However, estimating pose from point-to-line and point-to-plane correspondences is more complicated. These 3D correspondences actually yield a similar distance function, which is employed by previous works to construct a general cost function [25, 24, 26, 40, 4, 41]. The various cost functions differ in the parameterization of the rotation matrix. The raw rotation matrix is adopted in [24, 4], and the quaternion [13] is used in [25, 26, 40, 41]. Previous works mainly focus on finding the global minimizer of the resulting cost function. The authors of [26] proposed a provably optimal algorithm. They employed convex underestimators with branch-and-bound methods to iteratively compute the global minimum of the cost function. Although this algorithm can guarantee global optimality, it is very time-consuming. Olsson [24] reduced the computational complexity by applying the Lagrangian dual relaxation to approximate the cost function. Their experimental results showed that a single convex semidefinite program can approximate the original problem well. Recently, an improved result was obtained by a strengthened Lagrangian dual relaxation [4]. Although it lacks theoretical guarantees, their experimental results show that this algorithm can achieve the global minimum. Although these algorithms [24, 4] have reduced the computational time, they are still not suitable for demanding real-time applications. Recently, Wientapper [41] provided an efficient algorithm based on a Gröbner basis polynomial solver. Following [16], they introduced the first-order derivatives of the norm-one constraint of the quaternion into the first-order optimality conditions of the cost function, rather than applying the Lagrangian formulation for the norm-one constraint. This results in an efficient solution; however, their result is not optimal in theory. Furthermore, as the number of equations is greater than the number of unknowns, the equation system may have no solution. To handle this, they introduced 4 pre-rotations and solved the problem 4 times. The pre-rotations make the no-solution case less likely, but they cannot fully eliminate it in theory.

A minimal solution is essential in the RANSAC framework [9] for robust pose estimation. Any combination of the 3D correspondences resulting in 6 independent constraints forms a minimal configuration. In the literature, specific algorithms have been proposed to solve certain configurations. In [7], a minimal solution for three line-to-plane correspondences was proposed. The works [39, 48] studied the minimal solution of a more specific line-to-plane configuration, i.e., the lines are all on a plane. As one line-to-plane correspondence can be treated as two point-to-plane correspondences, these problems can be formulated as a point-to-plane registration problem. Naroditsky [22] presented a solution to 6 point-to-plane correspondences. Ramalingam [31] solved all the point-to-plane minimal configurations. They designed specific intermediate coordinate systems to simplify each minimal configuration. This method is inconvenient in practice, as it requires implementing a separate algorithm for each configuration.

3D registration without correspondence is a related but more complicated problem. Several works [20, 27, 43, 50] presented globally optimal solutions for 3D point registration without correspondence. The ICP framework [3] gives a general way to find correspondences and pose at the same time. Although this framework was originally designed for points, other 3D models can also be introduced into this framework [35, 5, 36]. The ICP framework iteratively finds the nearest elements as correspondences and then calculates the pose until it converges. Efficiently and accurately estimating the pose from the current nearest 3D correspondences is critical for the ICP framework.

3 Problem Formulation

In this paper, we use italic, boldfaced lowercase and boldfaced uppercase letters to represent scalars, vectors and matrices, respectively. The paper focuses on the problem of pose estimation from point-to-plane, point-to-line and point-to-point correspondences. We first consider the point-to-plane correspondence. Suppose we have a point in one coordinate system and the corresponding plane in another coordinate system, represented by the plane’s norm-one normal and a point lying on it. Then the scalar residual between a point and a plane can be written as

(1)

For the point-to-line correspondence, we can calculate the 3-dimensional residual vector between a point and the corresponding line represented by the norm-one direction and a point on it as

(2)

where is the identity matrix. Lastly, a point-to-point correspondence yields a 3-dimensional residual vector

(3)

The rotation and translation together have 6 DoF. Our problem is to calculate them from any configuration of point-to-plane, point-to-line, and point-to-point correspondences that has at least 6 effective constraints. We refer to this total as the number of effective correspondences. When it is greater than 6, this is a least-squares problem, described in Section 4. When it equals 6, this forms a minimal problem, solved in Section 5.
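The three residual types can be sketched in a few lines of NumPy. This is only an illustration under assumed notation, since the symbols of (1), (2) and (3) are not recoverable here: `p` is the source point, `n` a unit plane normal, `d` a unit line direction, `q` a point on the target plane or line (or the target point itself), and `R`, `t` the rotation and translation. The sign convention and all function names are hypothetical.

```python
import numpy as np

def point_to_plane_residual(R, t, p, n, q):
    """Signed distance from the transformed point R p + t to the plane (n, q)."""
    return float(n @ (R @ p + t - q))

def point_to_line_residual(R, t, p, d, q):
    """Part of R p + t - q that is orthogonal to the unit line direction d."""
    return (np.eye(3) - np.outer(d, d)) @ (R @ p + t - q)

def point_to_point_residual(R, t, p, q):
    """Difference between the transformed point and its target point."""
    return R @ p + t - q

def total_cost(R, t, planes, lines, points):
    """Sum of squared residuals; planes hold (p, n, q), lines (p, d, q), points (p, q)."""
    cost = sum(point_to_plane_residual(R, t, *c) ** 2 for c in planes)
    cost += sum(np.sum(point_to_line_residual(R, t, *c) ** 2) for c in lines)
    cost += sum(np.sum(point_to_point_residual(R, t, *c) ** 2) for c in points)
    return float(cost)
```

A plane contributes one scalar equation, while a line residual has only two independent components because the projector removes the component along `d`. This matches the 1, 2 and 3 constraints per correspondence type used later for the minimal configurations.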

4 Least Squares Solution

Using the notation in (1), (2) and (3), we define our cost function as follows

(4)

where represents the determinant of .

4.1 Ambiguous Configurations

In previous works, the global minimizer is treated as the optimal solution. It is well known that there is a unique solution for at least three point-to-point correspondences, except for the degenerate configuration in which all the points are collinear. However, for planes and lines, there exist configurations which have several solutions. We call them ambiguous configurations. An ambiguous configuration differs from a degenerate configuration: the former has several solutions, whereas the latter has infinitely many. Formally, we have the following lemma for point-to-line correspondences. The proof is in the supplementary material.

Lemma 1.

For any set of points and any two poses, there exist lines such that both poses are exact solutions for the point-to-line correspondences.

Similarly, we have lemma 2 for point-to-plane correspondences.

Lemma 2.

For any set of points and any three poses, there exist planes such that all three poses are exact solutions for the point-to-plane correspondences.

Finally, we have the following theorem for point-to-line and point-to-plane correspondences.

Theorem 1.

For any set of points on lines and points on planes and any two poses, there exist lines and planes such that both poses are exact solutions for the point-to-line and point-to-plane correspondences.

When measurements are close to an ambiguous configuration, a prior is required to identify the correct solution. Such a prior is generally available in real applications. For instance, we generally have a rough estimate of the pose between two sensors in the extrinsic calibration problem [47, 38, 22, 11, 49]. For pose estimation in SLAM [45, 44, 23, 30, 37], the current pose should be consistent with the previous motion trajectory.

Previous works [25, 24, 26, 4] using convex approximation can only find the global minimizer. However, the global minimizer may not be the correct solution of a problem if the measurements are close to an ambiguous configuration. Therefore, we cannot simply omit local minimizers.
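One simple way to see why ambiguous configurations exist for point-to-line correspondences is to construct one. The sketch below is an illustrative construction, not the proof from the supplementary material: for each point it builds the line through the point's images under two arbitrary poses, so both poses give a zero residual. The function names are hypothetical, and the construction assumes the two images of each point are distinct.

```python
import numpy as np

def random_rotation(rng):
    """Random rotation matrix from the QR factorization of a Gaussian matrix."""
    Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
    if np.linalg.det(Q) < 0:
        Q[:, 0] = -Q[:, 0]  # flip one column so that det(Q) = +1
    return Q

def ambiguous_lines(points, pose_a, pose_b):
    """For each point, return (direction, point on line) of the line through both images."""
    (Ra, ta), (Rb, tb) = pose_a, pose_b
    lines = []
    for p in points:
        qa, qb = Ra @ p + ta, Rb @ p + tb
        d = (qb - qa) / np.linalg.norm(qb - qa)  # assumes qa != qb
        lines.append((d, qa))
    return lines

def line_residual_norm(R, t, p, d, q):
    """Distance from the transformed point R p + t to the line (d, q)."""
    return np.linalg.norm((np.eye(3) - np.outer(d, d)) @ (R @ p + t - q))
```

The same idea extends to planes: three non-collinear images of a point span a plane, which is consistent with the three exact solutions stated for point-to-plane correspondences.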

4.2 Rotation Parameterization

Solving for the rotation matrix is the crux of pose estimation. Previous works [25, 26, 4, 41] adopt non-minimal representations for the rotation, which results in additional quadratic constraints in the minimization problem. This paper adopts the CGR parametrization [12, 21], which gives a minimal representation for the rotation matrix and removes the quadratic constraints in (4). The CGR parametrization expresses a rotation matrix as

(5)

where the CGR parameter is a 3-dimensional vector, and the skew-symmetric matrix in (5) is built from this vector.
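For reference, the standard Cayley form of this parameterization can be written in a few lines. The sketch below assumes the usual expression R = ((1 - sᵀs) I + 2[s]ₓ + 2 s sᵀ) / (1 + sᵀs), where the symbol `s` for the 3-vector and the function names are illustrative choices, since the notation of (5) is not recoverable here.

```python
import numpy as np

def skew(s):
    """Skew-symmetric (cross-product) matrix of a 3-vector."""
    return np.array([[0.0, -s[2], s[1]],
                     [s[2], 0.0, -s[0]],
                     [-s[1], s[0], 0.0]])

def cgr_to_rotation(s):
    """Rotation matrix from the CGR vector s (rational in s, denominator 1 + s.s)."""
    s = np.asarray(s, dtype=float)
    ss = s @ s
    return ((1.0 - ss) * np.eye(3) + 2.0 * skew(s) + 2.0 * np.outer(s, s)) / (1.0 + ss)

def rotation_to_cgr(R):
    """Inverse map, using [s]_x = (R - R^T) / (1 + trace(R))."""
    S = (R - R.T) / (1.0 + np.trace(R))
    return np.array([S[2, 1], S[0, 2], S[1, 0]])
```

The common denominator 1 + sᵀs is what makes every residual a rational function of the parameters. Note also that this map cannot represent a rotation by exactly 180 degrees, because the corresponding vector would have infinite norm.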

4.3 Rational Formulation of Residual

We will show that the residuals from point-to-point, point-to-line and point-to-plane correspondences have a general form

(6)

where and are two -dimensional vectors and is a scalar. We first consider the point-to-plane residual in (1). It is obvious that and . Let us then consider the point-to-line residual. We define and , where and are the three rows of . Using this notation, equation (2) can be written as

(7)

It is easy to find that and . Lastly, we consider the point-to-point correspondence. Similar to the point-to-line correspondence, we define and . Substituting this notation into (3), it is easy to find

(8)

The similarity between (8) and (7) is obvious. Therefore, also has the general form (6).

Let . Substituting (5) into the general residual in (6) and adding the three terms together yields

(9)

where is a third order polynomial with terms .

4.4 First-Order Optimality Conditions

Now we consider the least-squares problem (4). Using (9), the squared residual can be represented as

(10)

where is a 6th order polynomial in and . For the point-to-line distance in (4), we have

(11)

As mentioned above, and all have the same form as . Therefore, the point-to-line distance will have the same form as (10). Similarly, will also yield the same form as (10). After the summation of squared residuals in (4), we know that would have the same rational form as (10). We write it as

(12)

where is a 6th order polynomial function in and . To find the critical points of (12), we first calculate its first order optimality conditions as follows:

(13)

Canceling the denominator of (13) yields

(14)

and are of degree 5. Therefore, and are of degree 7 and 5, respectively. Although the Gröbner basis method gives a general way to solve a polynomial system, it is computationally demanding and numerically unstable when applied to a high-order polynomial system, as the experimental results in Fig. 1 show. The Newton-Raphson method provides an alternative way to find the roots of (14). Denote an equation system as . For the th iteration, the Newton-Raphson method updates the solution as

(15)

This iterative method requires an initial solution for the rotation and translation unknowns. In the next section, we describe how to calculate an accurate initial solution.
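As a generic illustration of the Newton-Raphson update, the sketch below iterates x ← x − J(x)⁻¹ F(x) for a small square system. It uses a forward-difference Jacobian for brevity, which is an assumption made here; an implementation of (14) would more likely use analytic derivatives of the polynomials. All names are hypothetical.

```python
import numpy as np

def numerical_jacobian(F, x, eps=1e-7):
    """Forward-difference approximation of the Jacobian of F at x."""
    x = np.asarray(x, dtype=float)
    f0 = np.asarray(F(x), dtype=float)
    J = np.zeros((f0.size, x.size))
    for j in range(x.size):
        step = np.zeros_like(x)
        step[j] = eps
        J[:, j] = (np.asarray(F(x + step), dtype=float) - f0) / eps
    return J

def newton_raphson(F, x0, tol=1e-12, max_iter=50):
    """Refine x0 toward a root of F(x) = 0 with Newton-Raphson steps."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        f = np.asarray(F(x), dtype=float)
        if np.linalg.norm(f) < tol:
            break
        # Solve J dx = f instead of forming the inverse explicitly
        x = x - np.linalg.solve(numerical_jacobian(F, x), f)
    return x
```

Because the iteration only converges locally, the quality of the starting point decides which stationary point is reached. Refining every candidate from the relaxed problem, rather than only the best one, is what allows local minimizers to be kept.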

4.5 Initial Estimation from Relaxation

The difficulty of solving (14) lies in the denominator of CGR parameterization (5). To solve this problem, we introduce the following intermediate variable

(16)

Substituting (16) into (9) will transform (9) from a rational function to a polynomial function. Furthermore, we define

(17)

Using (16) and (17), the residual (9) can be expressed as

(18)

where . This formulation turns the rational function (9) into a polynomial function (18) that is easier to handle. Furthermore, in (18) is decoupled from , and (18) is linear in .

Now we consider the least-squares problem (4) for the new unknowns. We stack the residuals from point-to-plane, point-to-line, and point-to-point correspondences to get

(19)

where is a matrix and is a matrix. Then (4) can be written as

(20)

As (19) is linear for , then has a closed-form solution as

(21)

Substituting (21) into (20), we derive a cost function only involving as

(22)

where . The elements in the vector are second order monomials except for the constant term. Thus (22) is a fourth order polynomial function. We compute the first order optimality conditions to get all the stationary points. This first-order condition contains four third order polynomials for and as

(23)

According to Bézout’s theorem [8], there exist at most 3^4 = 81 solutions. The polynomial system in (23) only contains third and first degree monomials, so it satisfies the 2-fold symmetry [18, 2]: if any nontrivial vector of unknowns is a solution, its negation is also a solution. Excluding the trivial zero solution, the remaining 80 solutions come in 40 such pairs, thus there are at most 40 independent solutions. We adopt the algorithm introduced in [19] to generate a polynomial solver that exploits this property to yield an efficient solution.

After solving (23), we are then able to compute from (21). The CGR parameters and can be recovered from the definitions in (16) and (17). The above formulation treats as independent unknowns. However, they are related as they are functions of and . To recover the minimizer of (12), we refine the solution using the Newton-Raphson iteration method in (15). We summarize our least-squares solution in Algorithm 1.
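The elimination of the linearly appearing unknowns is a standard least-squares step and can be sketched generically. The example assumes a stacked residual of the form A x + B y, where `x` stands for the vector of monomials and `y` for the unknowns that enter linearly; the names `A`, `B`, `K` and `P` are illustrative and need not match the matrices in (19) to (22).

```python
import numpy as np

def eliminate_linear_unknowns(A, B):
    """For the cost ||A x + B y||^2, return K and P such that the optimal y is
    P x and the cost minimized over y equals x^T K x."""
    # (B^T B)^{-1} B^T, computed with a linear solve rather than an inverse
    BtB_inv_Bt = np.linalg.solve(B.T @ B, B.T)
    P = -BtB_inv_Bt @ A          # closed-form minimizer y*(x) = P x
    K = A.T @ A + A.T @ B @ P    # K = A^T A - A^T B (B^T B)^{-1} B^T A
    return K, P
```

After this step the cost depends only on `x`, and `K` has a fixed size that does not grow with the number of correspondences. This is consistent with the solve time being independent of the number of correspondences.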

Input: point-to-plane, point-to-line, and point-to-point correspondences
Output: the rotation and the translation
1. Compute the coefficient matrices and in (19).
2. Compute in (22).
3. Compute the first-order optimality conditions (23).
4. Solve the equation system (23) for .
5. Recover using (17).
6. Compute from (21).
7. Refine the solution by Newton-Raphson iteration (15).

Algorithm 1 Least-squares solution

5 Minimal Solution

Unlike most pose estimation problems, which have a unique minimal configuration, our problem has multiple minimal configurations. Specifically, any combination of 3D correspondences with 6 effective constraints forms a minimal configuration. Table 1 lists the minimal configurations. We name each configuration by its numbers of point-to-point (Pt), point-to-line (L) and point-to-plane (Pl) correspondences. As one point-to-point, point-to-line and point-to-plane correspondence provides 3, 2 and 1 constraints, respectively, it is easy to verify that the configurations in Table 1 are minimal configurations, except for the case with 2 point-to-point and 1 point-to-plane correspondences. This case seems to have 7 constraints, as 2 point-to-point correspondences seemingly provide 6 constraints and 1 point-to-plane correspondence provides 1 constraint. However, the collinear configuration is degenerate for point-to-point correspondences [1], and two point-to-point correspondences always form a collinear configuration. One additional constraint is therefore required to recover the rotation and translation. We propose a unified solution for these minimal cases.

name        #point  #line  #plane
Pt0L0Pl6    0       0      6
Pt0L1Pl4    0       1      4
Pt1L0Pl3    1       0      3
Pt0L2Pl2    0       2      2
Pt1L1Pl1    1       1      1
Pt2L0Pl1    2       0      1
Pt0L3Pl0    0       3      0
Table 1: Minimal configurations.
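The rows of Table 1 can be reproduced by a short enumeration. The sketch below counts 3, 2 and 1 constraints per point, line and plane correspondence and searches for a total of 6. The two-point case is handled as a special rule, adding the extra plane that the collinearity argument requires. The function name is hypothetical.

```python
from itertools import product

def minimal_configurations():
    """Enumerate (#point, #line, #plane) triples that form minimal configurations."""
    configs = []
    for n_pt, n_line, n_plane in product(range(3), range(4), range(7)):
        if 3 * n_pt + 2 * n_line + n_plane != 6:
            continue
        if n_pt == 2:
            # Two points are always collinear, so the rotation about the line
            # joining them stays free; one point-to-plane constraint fixes it.
            n_plane += 1
        configs.append((n_pt, n_line, n_plane))
    return configs
```

The enumeration yields seven configurations, matching the seven rows of Table 1.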

Let us first consider the rational equation from (9). Its numerator, , is a third order polynomial. Assume that we have 6 equations in the form of (9). This will lead to an equation system with 6 third order equations, which is hard to solve. We cannot apply the intermediate unknowns defined in (16) and (17) to the minimal problem, since this would result in 7 unknowns while there are only 6 constraints. Here we introduce 3 new intermediate unknowns as

(24)

Define . Then (9) can be rewritten as

(25)

where . Define the numerator of (25). Eliminating the denominator of (25), we have

(26)

This is a quadratic equation in and a linear equation in the new unknowns . Compared to the original equation from the numerator of (9), the new one is much simpler. For the minimal problem, we have 6 effective equations as (26). Let us divide the 6 equations into two groups. Each group has 3 equations as

(27)

We can easily calculate using the first three equations as

(28)

Substituting (28) into the second part of (27), we have

(29)

where () is an element of .

5.1 Hidden Variable Polynomial Solver

Here we present a hidden variable method [8] to solve (29). We treat one unknown as a constant. This unknown is called the hidden variable. Without loss of generality, we treat and as unknowns, and as a constant. Thus (29) can be rewritten as an equation system in and as

(30)

where , , . Then we introduce an auxiliary variable to convert to a homogeneous equation , i.e., every monomial in is of degree 2. This leads to the following homogeneous equation system:

(31)

If we treat , and as variables, we can rewrite (31) as a linear homogeneous system in , and as follows

(32)

where , , , . Let us denote the coefficient matrix of (32) as . If the polynomial system (31) has a nontrivial solution, there will exist a nontrivial solution for the homogeneous linear system (32). According to linear algebra, (32) has a nontrivial solution if and only if the determinant of is zero, i.e., . Define . has the same form as in (31). Similarly, we can construct two homogeneous linear systems for and . This will yield two homogeneous equations and , respectively. Stacking () together, we have a homogeneous linear system as

(33)

where is a matrix with polynomials in as elements, and . The system (33) has a nontrivial solution if and only if . This is an eighth order polynomial equation in . We can get by solving this equation. Comparing (30) and (31), we can find that if is a solution of (30), is a solution of (31), and vice versa. After we get the solution of , we can back-substitute it into (33) and set , then solve the resulting linear system for and . After we get , we can recover by (5) and calculate using (28). Then can be recovered from (24). We summarize our minimal solution in Algorithm 2.

Input: a minimal configuration
Output: the rotation and the translation
1. Compute the coefficient matrices in (27).
2. Compute the coefficients in (29).
3. Solve (29) for by the hidden variable method described in Section 5.1 and recover using (5).
4. Compute using (28) and recover by (24).

Algorithm 2 Minimal solution

6 Experiments

In this section, we compare our algorithm with state-of-the-art algorithms, including Briales’s algorithm [4], BnB [26], Olsson’s algorithm [24] and Wientapper’s algorithm [41]. We also consider the point-to-point case alone, as it has a closed-form solution. For this case, we adopt Arun’s algorithm [1] as implemented in OpenGV [15]. The algorithms are evaluated in terms of accuracy and computational time using synthetic and real data.

Method            KITTI 03                      KITTI 04                      KITTI 07
                  Err ()   Err ()   Time (s)    Err ()   Err ()   Time (s)    Err ()   Err ()   Time (s)
Olsson [24]       2.03e-1  5.46e-1  2.22        5.75e-1  5.76e-1  1.88        3.60e-1  2.53e-1  2.08
Wientapper [41]   2.00e-1  5.72e-1  0.58        4.32e-1  3.49e-1  0.55        3.61e-1  2.58e-1  0.56
Briales [4]       1.96e-1  5.61e-1  2.16        4.31e-1  3.37e-1  1.88        3.60e-1  2.53e-1  2.03
Ours              1.95e-1  5.61e-1  0.33        4.31e-1  3.37e-1  0.30        3.60e-1  2.53e-1  0.30
Table 2: Experimental results on the 03, 04, and 07 sequences of the KITTI dataset [10].

6.1 Experiments with Synthetic Data

Our synthetic data is generated as in [4]. Specifically, each geometric element is determined by randomly sampling a point within a sphere of radius . For lines and planes, a random unit direction and normal are generated. We uniformly sample the Euler angles of the rotation matrix ( and ). The translation elements are uniformly distributed within . We use and to represent the estimated rotation and translation and use and to represent the ground truth. The rotation error is evaluated by the angle of the axis-angle representation of , and the translation error by . We consider the effective number of correspondences as in [4]. Specifically, the effective number of correspondences for point-to-plane, point-to-line, and point-to-point correspondences is calculated as . Given an , we randomly generate a combination of , and whose effective number of correspondences is .
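The two error measures can be computed as follows. This sketch assumes that the rotation error is the angle of the relative rotation between the ground truth and the estimate, reported in degrees, and that the translation error is the Euclidean norm of the difference. Both are common choices, but the exact expressions and units are not recoverable from the text above, and the function names are hypothetical.

```python
import numpy as np

def rotation_error_deg(R_est, R_gt):
    """Angle (degrees) of the axis-angle representation of R_gt^T R_est."""
    cos_angle = (np.trace(R_gt.T @ R_est) - 1.0) / 2.0
    # Clip to guard against values slightly outside [-1, 1] from rounding
    return float(np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0))))

def translation_error(t_est, t_gt):
    """Euclidean distance between estimated and ground-truth translations."""
    return float(np.linalg.norm(np.asarray(t_est, dtype=float) - np.asarray(t_gt, dtype=float)))
```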

Algorithm time ()
Our least-squares algorithm 2.95
DLSSol 20.0
Our minimal algorithm 0.296
DMinSol 0.599
Table 3: Computational time. For the least-squares problem, we consider the time to solve (13), which is independent of the number of correspondences.

Effect of intermediate unknowns  We introduce intermediate unknowns to simplify the least-squares and the minimal problems. To verify their benefit, we evaluate the performance of the direct least-squares solution (DLSSol) from solving (13), and the direct minimal solution (DMinSol) from solving the equation system from the numerator of (9) for the minimal problem. We employed the algorithm in [19] to generate the solvers. For the least-squares problem, we run 2000 trials for each . Fig. 1 shows the results. Directly solving (13) can recover the global minimizer of (4) in theory. However, the large mean errors of DLSSol verify that this polynomial solver is very unstable. For the minimal problem, we run 20000 trials for the 6 point-to-plane configuration. Fig. 2 shows the results. The error distribution of DMinSol has a much longer tail than that of our algorithm.

Table 3 lists the average computational time of different algorithms. For the least-squares problem, we compare the computational time of solving the first order optimality conditions (13). This time is independent of the number of correspondences. As shown in Table 3, our least-squares solution is about 7 times faster than DLSSol, and our minimal solution is about 2 times faster than DMinSol.

The above results verify that our intermediate unknowns can increase the numeric stability as well as reduce the computational time.

Figure 1: Comparison of our least-squares solution with the direct least-squares solution (DLSSol), which solves the polynomial system (14) using the Gröbner basis method [19].
Figure 2: Comparison of our minimal solution with the direct minimal solution (DMinSol), which solves the polynomial system (9) using the Gröbner basis method [19].

Least-squares solution  We conduct experiments to evaluate the performance of different algorithms in terms of a varying number of correspondences, an increasing level of noise, and computational time. The results of all experiments are from 100 independent trials, as in [4].

The first experiment considers a fixed noise level and an increasing number of correspondences. Let us denote the standard deviation of a zero mean Gaussian noise distribution as . We set . varies from to . Fig. 3 shows the result. For the point-to-point case, most of the algorithms can recover the optimal solution except for [24] when the number of points is small. For the mixed correspondences, the results of BnB [26] and Briales’s algorithm [4] overlap. This is consistent with the results in [4]. Our algorithm significantly outperforms previous works when the number of correspondences is small, as the ambiguous case is more likely to happen in that regime. When the number of correspondences is large, our algorithm achieves the same accuracy as [26], which has guaranteed global optimality. Wientapper’s algorithm [41] cannot find the optimal solution even when the number of correspondences is large.

Figure 3: Rotation and translation errors for increasing number of correspondences. The noise level is fixed to 0.05m.

In the second experiment, is from to , stepping by 0.02m. The results are illustrated in Fig. 4. For the point-to-point case, most of the algorithms provide the optimal solution as [1], except for [41] when . For the mixed case, BnB, Briales’s algorithm and our algorithm present better results than other algorithms. Our algorithm gives a slightly better result than BnB and Briales’s algorithm when .

Figure 4: Rotation and translation errors for increasing noise level. The number of point correspondences is fixed to 5 and the number of effective correspondences is set to 10.

In the last experiment, we compare the computational time of different algorithms. We vary the effective number of correspondences from 10 to 2000. For every value, we run each algorithm 100 times to calculate the average running time. Fig. 5 provides the result. We did not run the BnB algorithm [26], because it is too slow for a large number of correspondences. It is clear that our algorithm is the fastest among the compared algorithms. Wientapper’s algorithm [41] is efficient, but it needs to solve each problem 4 times, so the gap between the computational time of the two algorithms increases as the number of correspondences grows. In real applications, the number of correspondences can be 10 times larger than in our simulation, as shown in Table 4.

Figure 5: Computational time of different algorithms.

Minimal solution  The equation system (29) is important for the estimation of . Actually, Ramalingam [31] formulated some point-to-plane configurations as a quadratic equation system like (29). We first consider the numerical stability of the polynomial solver for (29). We compare our hidden variable method, E3Q3 [17], and the Gröbner basis polynomial solver generated by [19]. We randomly generate a real solution and the coefficients of (29). Then we substitute this real solution into (29) to calculate the constant terms of (29). We run each algorithm for 20000 trials and compute the estimation error of this real solution. Fig. 6 (a) shows the estimation error histograms of the compared algorithms. It is clear that our hidden variable method is more stable than the other algorithms. The histograms of E3Q3 and the Gröbner basis method have long tails. This is because E3Q3 and the Gröbner basis method need to compute the inverse of a matrix. When this matrix is close to singular, the performance of these algorithms degrades. Fig. 6 (b) demonstrates the performance of E3Q3 under a degenerate case. We generate a nearly singular matrix for E3Q3, whose least singular value is in . In this situation, E3Q3 performs very poorly.

We also evaluate the numerical stability of our unified minimal solution. We run 2000 independent trials for each minimal configuration. As demonstrated in Fig. 7, our algorithm provides accurate solutions for all minimal configurations. This avoids solving each configuration case by case.

Figure 6: Numeric stability of polynomial solvers for (29).
Figure 7: Numeric stability of our minimal solution.
Sequences   Average Correspondences Per Frame
            #P-to-P   #P-to-L   #P-to-PL
KITTI 03    63        17        21117
KITTI 04    37        30        19994
KITTI 07    111       40        20126
Table 4: Average numbers of point-to-point (P-to-P), point-to-line (P-to-L), and point-to-plane (P-to-PL) correspondences per frame for KITTI sequences 03, 04 and 07.

6.2 Experiments with Real Data

We generated our real-world dataset from the KITTI dataset [10]. Our dataset contains point-to-point, point-to-line and point-to-plane correspondences described in Table 4. There are more than 20000 correspondences in each frame, with the majority of them being point-to-plane correspondences. We evaluate the accuracy and running time of different algorithms on this dataset.

For each frame, 2D feature points are detected by the ORB feature detector [34]. We project LiDAR points into the image plane, and select the LiDAR points around an ORB feature. Then, we fit a local plane to these LiDAR points. Finally, we calculate the 3D coordinates of an ORB feature by calculating the intersection of the back-projection ray of the ORB feature and the local plane. We use the ORB descriptors to match 2D feature points, and obtain the 3D point-to-point correspondences.
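The plane fitting and ray intersection steps can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the LiDAR points are assumed to be already expressed in the camera frame, the camera is a pinhole with intrinsic matrix `K`, and the function names `fit_plane` and `backproject_to_plane` are hypothetical.

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane n.x + d = 0 through an (N, 3) array of points, N >= 3."""
    centroid = points.mean(axis=0)
    # The right singular vector with the smallest singular value is the plane normal.
    _, _, vt = np.linalg.svd(points - centroid)
    normal = vt[-1]
    return normal, -normal @ centroid

def backproject_to_plane(pixel, K, normal, d):
    """Intersect the back-projection ray of pixel (u, v) with the plane n.x + d = 0.

    Returns the 3D point in the camera frame, or None if the ray is (nearly)
    parallel to the plane or the intersection lies behind the camera.
    """
    ray = np.linalg.solve(K, np.array([pixel[0], pixel[1], 1.0]))
    denom = normal @ ray
    if abs(denom) < 1e-9:
        return None
    scale = -d / denom          # point on the ray is scale * ray
    return scale * ray if scale > 0 else None

# Example: LiDAR points near a feature, all lying on the plane z = 10.
K = np.array([[700.0, 0.0, 600.0],
              [0.0, 700.0, 180.0],
              [0.0, 0.0, 1.0]])
local_points = np.array([[-1.0, -1.0, 10.0], [1.0, -1.0, 10.0],
                         [1.0, 1.0, 10.0], [-1.0, 1.0, 10.0], [0.0, 0.5, 10.0]])
normal, d = fit_plane(local_points)
feature_3d = backproject_to_plane((600.0, 180.0), K, normal, d)
```

The sign of the fitted normal is arbitrary, which is harmless here because the intersection depends only on the ratio of `d` to the dot product of the normal and the ray.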

Next, we generate point-to-line correspondences. For each frame, 2D lines are detected by the Line Segment Detector (LSD) [33] and described by the Line Band Descriptor (LBD) [46]. A 2D line is represented by its two 2D endpoints. The 3D endpoints of a line are generated in the same way as the 3D coordinates of the 2D feature points described above. We then generate 3D line correspondences by matching their LBD features. Given a line-to-line correspondence, two point-to-line correspondences are generated, one for each of its endpoints.

Finally, we calculate point-to-plane correspondences. For each frame, planes are extracted from LiDAR points by the region growing algorithm [28]. To match a plane point to a previous plane, we find the nearest LiDAR point in the previous planes.
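The nearest-point matching can be sketched as below. This is a brute-force illustration with hypothetical names (`match_points_to_planes`, `prev_plane_ids`); it assumes each previous LiDAR point carries the label of the plane it was assigned to by the plane extraction step. With roughly 20000 points per frame a spatial index such as a k-d tree would normally replace the dense distance matrix.

```python
import numpy as np

def match_points_to_planes(query_points, prev_points, prev_plane_ids):
    """Assign each query point the plane id of its nearest previous plane point.

    query_points:   (M, 3) plane points of the current frame
    prev_points:    (N, 3) plane points of the previous frame
    prev_plane_ids: (N,)   plane label of every previous point
    Returns (plane ids, nearest-neighbor distances), both of length M.
    """
    diffs = query_points[:, None, :] - prev_points[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)          # (M, N) pairwise distances
    nearest = np.argmin(dists, axis=1)
    return prev_plane_ids[nearest], dists[np.arange(len(query_points)), nearest]

prev_points = np.array([[0.0, 0.0, 0.0], [10.0, 0.0, 0.0]])
prev_plane_ids = np.array([7, 9])
query_points = np.array([[0.1, 0.0, 0.0], [9.0, 0.0, 0.0]])
matched_ids, matched_dists = match_points_to_planes(query_points, prev_points, prev_plane_ids)
```

In practice a distance threshold on `matched_dists` would typically be used to reject matches that are too far away; the threshold value is not specified in the text.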

We evaluate the performance of different algorithms on sequences 03, 04, and 07 of the KITTI dataset. We did not test BnB [26], as it is extremely slow on this large dataset. Our algorithm achieves results that are the same as or slightly better than those of the state-of-the-art algorithm [4] while being around 7 times faster, as shown in Table 2.

7 Conclusions

In this paper, we present an efficient and accurate least-squares solution and a unified minimal solution for the 3D registration problem. We prove that there exist ambiguous configurations for any number of point-to-plane and point-to-line correspondences. An algorithm therefore needs the ability to reveal local minimizers. We use the CGR parameterization to represent the rotation, which removes the quadratic constraints on the rotation. However, this results in a rational residual that is hard to solve. We introduce several intermediate variables to simplify the first-order optimality conditions of the least-squares problem and the equation system of the minimal configuration. We evaluate our algorithm on synthetic and real data. The experimental results show that computing local minimizers is essential, especially when the number of correspondences is small. Besides, our algorithm is as accurate as previous globally optimal solutions when the number of correspondences is large, but is much faster.
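To illustrate why the CGR parameterization removes the rotation constraints while making the residual rational, the sketch below builds a rotation matrix from an unconstrained 3-vector using the standard CGR (Cayley) formula. The notation may differ from the one used earlier in the paper, and the function names `skew` and `cgr_to_rotation` are hypothetical.

```python
import numpy as np

def skew(s):
    """Cross-product matrix [s]_x such that skew(s) @ v == np.cross(s, v)."""
    return np.array([[0.0, -s[2], s[1]],
                     [s[2], 0.0, -s[0]],
                     [-s[1], s[0], 0.0]])

def cgr_to_rotation(s):
    """Rotation from CGR parameters: ((1 - s.s) I + 2 [s]_x + 2 s s^T) / (1 + s.s)."""
    s = np.asarray(s, dtype=float)
    ss = s @ s
    return ((1.0 - ss) * np.eye(3) + 2.0 * skew(s) + 2.0 * np.outer(s, s)) / (1.0 + ss)

# Any real 3-vector gives a valid rotation, so no constraint on s is needed.
R = cgr_to_rotation([0.3, -0.2, 0.5])
orthogonality_error = np.abs(R.T @ R - np.eye(3)).max()
determinant = np.linalg.det(R)
```

Every entry of the matrix is a quadratic polynomial in `s` divided by `1 + s.s`, which is the source of the rational form of the residual. This parameterization cannot represent rotations by exactly 180 degrees, since they correspond to `s` of infinite norm.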

References

  • [1] K. S. Arun, T. S. Huang, and S. D. Blostein. Least-squares fitting of two 3-d point sets. IEEE Transactions on pattern analysis and machine intelligence, (5):698–700, 1987.
  • [2] E. Ask, Y. Kuang, and K. Åström. Exploiting p-fold symmetries for faster polynomial equation solving. In Proceedings of the 21st International Conference on Pattern Recognition (ICPR2012), pages 3232–3235. IEEE, 2012.
  • [3] P. J. Besl and N. D. McKay. Method for registration of 3-d shapes. In Sensor Fusion IV: Control Paradigms and Data Structures, volume 1611, pages 586–607. International Society for Optics and Photonics, 1992.
  • [4] J. Briales, J. Gonzalez-Jimenez, et al. Convex global 3d registration with lagrangian duality. In International Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
  • [5] A. Censi. An icp variant using a point-to-line metric. In Robotics and Automation, 2008. ICRA 2008. IEEE International Conference on, pages 19–25. IEEE, 2008.
  • [6] H. H. Chen. Pose determination from line-to-plane correspondences: Existence condition and closed-form solutions. In [1990] Proceedings Third International Conference on Computer Vision, pages 374–378. IEEE, 1990.
  • [7] H. H. Chen. Pose determination from line-to-plane correspondences: existence condition and closed-form solutions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 13(6):530–541, Jun 1991.
  • [8] D. A. Cox, J. Little, and D. O’Shea. Using algebraic geometry, volume 185. Springer Science & Business Media, 2006.
  • [9] M. A. Fischler and R. C. Bolles. Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM, 24(6):381–395, 1981.
  • [10] A. Geiger, P. Lenz, C. Stiller, and R. Urtasun. Vision meets robotics: The kitti dataset. International Journal of Robotics Research (IJRR), 2013.
  • [11] R. Gomez-Ojeda, J. Briales, E. Fernandez-Moral, and J. Gonzalez-Jimenez. Extrinsic calibration of a 2d laser-rangefinder and a camera based on scene corners. In 2015 IEEE International Conference on Robotics and Automation (ICRA), pages 3611–3616. IEEE, 2015.
  • [12] J. A. Hesch and S. I. Roumeliotis. A direct least-squares (dls) method for pnp. In 2011 International Conference on Computer Vision, pages 383–390. IEEE, 2011.
  • [13] B. K. Horn. Closed-form solution of absolute orientation using unit quaternions. JOSA A, 4(4):629–642, 1987.
  • [14] B. K. Horn, H. M. Hilden, and S. Negahdaripour. Closed-form solution of absolute orientation using orthonormal matrices. JOSA A, 5(7):1127–1135, 1988.
  • [15] L. Kneip and P. Furgale. Opengv: A unified and generalized approach to real-time calibrated geometric vision. In 2014 IEEE International Conference on Robotics and Automation (ICRA), pages 1–8. IEEE, 2014.
  • [16] L. Kneip, H. Li, and Y. Seo. Upnp: An optimal o (n) solution to the absolute pose problem with universal applicability. In European Conference on Computer Vision, pages 127–142. Springer, 2014.
  • [17] Z. Kukelova, J. Heller, and A. Fitzgibbon. Efficient intersection of three quadrics and applications in computer vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1799–1808, 2016.
  • [18] V. Larsson and K. Åström. Uncovering symmetries in polynomial systems. In European Conference on Computer Vision, pages 252–267. Springer, 2016.
  • [19] V. Larsson, K. Astrom, and M. Oskarsson. Efficient solvers for minimal problems by syzygy-based reduction. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 820–829, 2017.
  • [20] H. Li and R. Hartley. The 3d-3d registration problem revisited. In 2007 IEEE 11th International Conference on Computer Vision, pages 1–8. IEEE, 2007.
  • [21] F. M. Mirzaei and S. I. Roumeliotis. Optimal estimation of vanishing points in a manhattan world. In 2011 International Conference on Computer Vision, pages 2454–2461. IEEE, 2011.
  • [22] O. Naroditsky, A. Patterson, and K. Daniilidis. Automatic alignment of a camera with a line scan lidar system. In 2011 IEEE International Conference on Robotics and Automation, pages 3429–3434. IEEE, 2011.
  • [23] R. A. Newcombe, S. Izadi, O. Hilliges, D. Molyneaux, D. Kim, A. J. Davison, P. Kohi, J. Shotton, S. Hodges, and A. Fitzgibbon. Kinectfusion: Real-time dense surface mapping and tracking. In 2011 IEEE International Symposium on Mixed and Augmented Reality, pages 127–136. IEEE, 2011.
  • [24] C. Olsson and A. Eriksson. Solving quadratically constrained geometrical problems using lagrangian duality. In 2008 19th International Conference on Pattern Recognition, pages 1–5. IEEE, 2008.
  • [25] C. Olsson, F. Kahl, and M. Oskarsson. The registration problem revisited: Optimal solutions from points, lines and planes. In Computer Vision and Pattern Recognition, 2006 IEEE Computer Society Conference on, volume 1, pages 1206–1213. IEEE, 2006.
  • [26] C. Olsson, F. Kahl, and M. Oskarsson. Branch-and-bound methods for euclidean registration problems. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(5):783–794, 2009.
  • [27] C. Papazov and D. Burschka. Stochastic global optimization for robust point set registration. Computer Vision and Image Understanding, 115(12):1598–1609, 2011.
  • [28] J. Poppinga, N. Vaskevicius, A. Birk, and K. Pathak. Fast plane detection and polygonalization in noisy 3d range images. In 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 3378–3383. IEEE, 2008.
  • [29] B. Přibyl, P. Zemčík, and M. Čadík. Absolute pose estimation from line correspondences using direct linear transformation. Computer Vision and Image Understanding, 161:130–144, 2017.
  • [30] P. F. Proença and Y. Gao. Probabilistic rgb-d odometry based on points, lines and planes under depth uncertainty. Robotics and Autonomous Systems, 104:25–39, 2018.
  • [31] S. Ramalingam and Y. Taguchi. A theory of minimal 3d point to 3d plane registration and its generalization. International journal of computer vision, 102(1-3):73–90, 2013.
  • [32] S. Ramalingam, Y. Taguchi, T. K. Marks, and O. Tuzel. P2: A minimal solution for registration of 3d points to 3d planes. In European Conference on Computer Vision, pages 436–449. Springer, 2010.
  • [33] G. Randall, J. Jakubowicz, R. G. von Gioi, and J. Morel. Lsd: A fast line segment detector with a false detection control. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32:722–732, 12 2008.
  • [34] E. Rublee, V. Rabaud, K. Konolige, and G. R. Bradski. Orb: An efficient alternative to sift or surf. 2011.
  • [35] A. Segal, D. Haehnel, and S. Thrun. Generalized-icp. In Robotics: science and systems, volume 2, pages 742–749. IEEE, 2015.
  • [36] J. Serafin and G. Grisetti. Nicp: Dense normal based point cloud registration. In 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 742–749. IEEE, 2015.
  • [37] Y. Taguchi, Y.-D. Jian, S. Ramalingam, and C. Feng. Point-plane slam for hand-held 3d sensors. In 2013 IEEE International Conference on Robotics and Automation, pages 5182–5189. IEEE, 2013.
  • [38] R. Unnikrishnan and M. Hebert. Fast extrinsic calibration of a laser rangefinder to a camera. Robotics Institute, Pittsburgh, PA, Tech. Rep. CMU-RI-TR-05-09, 2005.
  • [39] F. Vasconcelos, J. P. Barreto, and U. Nunes. A minimal solution for the extrinsic calibration of a camera and a laser-rangefinder. IEEE transactions on pattern analysis and machine intelligence, 34(11):2097–2107, 2012.
  • [40] F. Wientapper and A. Kuijper. Unifying algebraic solvers for scaled euclidean registration from point, line and plane constraints. In Asian Conference on Computer Vision, pages 52–66. Springer, 2016.
  • [41] F. Wientapper, M. Schmitt, M. Fraissinet-Tachet, and A. Kuijper. A universal, closed-form approach for absolute pose problems. Computer Vision and Image Understanding, 173:57–75, 2018.
  • [42] C. Xu, L. Zhang, L. Cheng, and R. Koch. Pose estimation from line correspondences: A complete analysis and a series of solutions. IEEE transactions on pattern analysis and machine intelligence, 39(6):1209–1222, 2017.
  • [43] J. Yang, H. Li, and Y. Jia. Go-icp: Solving 3d registration efficiently and globally optimally. In Proceedings of the IEEE International Conference on Computer Vision, pages 1457–1464, 2013.
  • [44] J. Zhang and S. Singh. Loam: Lidar odometry and mapping in real-time. In Robotics: Science and Systems, volume 2, page 9, 2014.
  • [45] J. Zhang and S. Singh. Visual-lidar odometry and mapping: Low-drift, robust, and fast. In 2015 IEEE International Conference on Robotics and Automation (ICRA), pages 2174–2181. IEEE, 2015.
  • [46] L. Zhang and R. Koch. An efficient and robust line segment matching approach based on lbd descriptor and pairwise geometric consistency. Journal of Visual Communication and Image Representation, 24(7):794–805, 2013.
  • [47] Q. Zhang and R. Pless. Extrinsic calibration of a camera and laser range finder (improves camera calibration). In 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)(IEEE Cat. No. 04CH37566), volume 3, pages 2301–2306. IEEE, 2004.
  • [48] L. Zhou. A new minimal solution for the extrinsic calibration of a 2d lidar and a camera using three plane-line correspondences. IEEE Sensors Journal, 14(2):442–454, 2014.
  • [49] L. Zhou, Z. Li, and M. Kaess. Automatic extrinsic calibration of a camera and a 3d lidar using line and plane correspondences. In 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 5562–5569. IEEE, 2018.
  • [50] Q.-Y. Zhou, J. Park, and V. Koltun. Fast global registration. In European Conference on Computer Vision, pages 766–782. Springer, 2016.

Appendix A Proofs of Lemma 1, Lemma 2 and Theorem 1

As mentioned in the paper, point-to-line and point-to-plane correspondences have ambiguous configurations. Here we prove Lemma 1, Lemma 2 and Theorem 1 of the paper.

Lemma 1. For any points and any and , there exist lines to make and are exact solutions for the point-to-line correspondences.

Proof.

Define the th point of an arbitrary point set with points. For any and , we can have and . Then we can use and to define a line with direction and a point on it, as demonstrated in Fig. 8 (a). According to how we construct , we know that passes through and . Therefore and are two solutions for the point-to-line correspondences . ∎

Lemma 2. For any points and any , and , there exist planes to make , and are exact solutions for the point-to-plane correspondences.

Proof.

Define the th point of an arbitrary point set with points. For any , and , we can have , and . Then we can find a plane passing through , and , as demonstrated in Fig. 8 (b). According to how we construct , we know that , and are three solutions for the point-to-plane correspondences .
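The constructions in Lemma 1 and Lemma 2 can be checked numerically. The sketch below is only an illustration: the names `random_pose`, `pose_a`, `pose_b`, `pose_c` and the random test data are assumptions, not notation from the paper. For every point it builds a line through the images of the point under two arbitrary poses and a plane through its images under three arbitrary poses, then measures the residual of each pose.

```python
import numpy as np

def random_pose(rng):
    """Random rigid pose (R, t); R comes from a QR factorization with det fixed to +1."""
    q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]      # flip one axis to turn a reflection into a rotation
    return q, rng.standard_normal(3)

def apply(pose, p):
    R, t = pose
    return R @ p + t

def point_to_line_distance(x, anchor, direction):
    u = direction / np.linalg.norm(direction)
    offset = x - anchor
    return np.linalg.norm(offset - (offset @ u) * u)

def point_to_plane_distance(x, anchor, normal):
    return abs((x - anchor) @ normal) / np.linalg.norm(normal)

rng = np.random.default_rng(1)
points = rng.standard_normal((6, 3))
pose_a, pose_b, pose_c = (random_pose(rng) for _ in range(3))

worst_line, worst_plane = 0.0, 0.0
for p in points:
    qa, qb, qc = apply(pose_a, p), apply(pose_b, p), apply(pose_c, p)
    direction = qb - qa                      # Lemma 1: line through both images
    normal = np.cross(qb - qa, qc - qa)      # Lemma 2: plane through all three images
    for pose in (pose_a, pose_b):
        worst_line = max(worst_line, point_to_line_distance(apply(pose, p), qa, direction))
    for pose in (pose_a, pose_b, pose_c):
        worst_plane = max(worst_plane, point_to_plane_distance(apply(pose, p), qa, normal))
```

Both `worst_line` and `worst_plane` stay at the level of floating-point rounding, so the constructed lines admit two exact poses and the constructed planes admit three, regardless of how many points are used.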

Figure 8: A schematic of the point-to-line ambiguous configuration (a) and the point-to-plane ambiguous configuration (b). In (a), and will transform to . In (b), , and will transform to
Figure 9: A schematic of point-to-line and point-to-plane ambiguous configuration. and will transform to , and transform to

Theorem 1. For any points on lines, points on planes and any and , there exist lines and planes to make and are exact solutions for the point-to-line and point-to-plane correspondences.

Proof.

We first consider the points on lines. Define as the th point. According to Lemma 1, we can find lines to make and exact solutions for the point-to-line correspondences .

Then we consider the points on planes. For the th point within them, we can define and . Let us denote as a plane passing through the line defined by and . According to how we construct , we know that and are two solutions for the point-to-plane correspondences .

Therefore, and are the two solutions for the point-to-line correspondences