1 Introduction
Many robotic systems rely on cameras and laser range finders to compute environment geometry [1]. Two-dimensional (2D) Laser Range Finders (LRFs), which measure depth along a single plane, are widely used due to their low weight, low cost, and low power requirements.
Taking advantage of measurements from an LRF or a LIDAR combined with a camera, however, requires precise knowledge of the relative pose (orientation and position) between them. This is a classical extrinsic calibration problem, where the objective is to determine the transformation between two coordinate frames. Establishing correspondences between the two sensors is easier for 3D LIDARs, since distinct features can be identified both among the laser points and in the camera image. Existing methods include 3D LIDAR-camera calibration using a circle-based calibration target [2] and an arbitrary trihedron [3].
Extrinsic calibration of a 2D LRF is more challenging because a 2D LRF produces only a single scanning plane for each pose, which is invisible to a regular camera. This makes it difficult to find correspondences, so additional constraints must be used. We note that if we were given the correspondences between laser points and their images (e.g., from an IR camera), the extrinsic calibration problem would reduce to a standard PnP (Perspective-n-Point) computation [4]. However, in our case, these correspondences are unknown. We study this extrinsic calibration problem and make the following contributions:

We show that a single observation of two non-coplanar triangles sharing a common side (See Fig. 1) suffices to unambiguously solve the calibration problem.

Even though planar, triangular, or V-shaped rectangular patterns have already been proposed to solve the calibration problem, we show that previous methods do not sufficiently constrain the problem to allow for a unique solution. Consequently, they rely on a large number of measurements and a good initial estimate.

We present a robust method to calibrate from a single observation in the presence of noise.

For additional accuracy, we show that by using only a few additional observations, our method achieves significantly smaller error than existing methods.
2 Related Work
One of the earliest methods for LRF-camera calibration is presented by Zhang and Pless [5], using point-on-plane constraints (laser points must lie on the plane of the calibration pattern). The weakness of this approach is that the cost function in the nonlinear optimization only penalizes the distance between the laser points and the corresponding plane, which constrains only two out of six degrees of freedom. The remaining four degrees of freedom of the relative pose between the camera and the LRF are ambiguous (See Fig. 2). Consequently, the solution may converge to a local minimum, so this method requires a large number of different observations with a wide range of views (more than 20 snapshots) for accuracy. Vasconcelos et al. [6] and Zhou [7] presented minimal solutions to address the disadvantages of [5]. In contrast to our method, both techniques require multiple observations (at least three) and have inherent degeneracies (two planes being parallel, or the existence of a dangerous cylinder).
With point-on-line constraints (laser points must lie on intensity edges or lines obtained from the image), the approaches proposed in [8] and [9] respectively use a black triangular board and a V-shaped calibration target. The results from these two methods are not accurate due to the sparse sampling of laser points. Further, a large number (usually more than 100) of images are needed to compensate for the lack of constraints per observation. Using a triangular board [8] turns out to be one way to strengthen the constraints of [5]: the constraint that points lie on the border lines removes the ambiguity of the horizontal translation, and the triangular plane removes the ambiguity of the vertical translation of the relative LRF-camera pose. However, two degrees of freedom still remain ambiguous (See Fig. 2). Essentially, the drawback is that the constraints are imposed on the image: each observation provides too few constraints on depth and orientation (two linear geometric constraints plus the three nonlinear rotation-matrix constraints, against nine unknowns). Additional details are explained in Section 4.
Building on the ideas in [8] and [9] (minimizing the projected distance on the image between the intersected laser points and the feature lines), the authors in [10] also propose a V-shaped calibration target. They increase the laser-point sampling for each observation by introducing more feature lines and virtual endpoints, but the same drawback as in [8] and [9] still exists (See Fig. 2). Therefore, they still need a large number (around 50) of different observations to achieve a reasonable result. Moreover, the convergence of the subsequent optimization requires a good initial estimate, which these methods do not guarantee.
The method in [11] provides an analytical solution using a white board with a black band in the middle, and needs only six different observations. Similarly, the authors in [12] give an analytical solution using a white board with a black line in the middle; compared with [11], it further computes the optimal least-squares solution to improve robustness to noise. However, both methods still require a large number of different observations for accuracy because of the insufficient constraints per observation (See Fig. 2). In practice, handling these ambiguities through a large number of snapshots from different valid views is not robust.
The work described in [13] presents an approach requiring only the observation of a scene corner (an orthogonal trihedron) commonly found in human-made environments. Moreover, this method builds line-to-plane and point-to-plane constraints, which are sufficient for each observation. The authors in [14] further extend this work by deriving a robust minimal solution. However, the assumption of three orthogonal planes depends heavily on the accuracy of the three right angles, which are difficult to realize exactly in practice. Further, when multiple observations from different views are needed for additional accuracy, the sharp angle between the laser beams and the walls on the two sides often corrupts the laser measurement: the scanned laser line on the wall appears curved. Our calibration target instead accommodates an arbitrary obtuse angle, so the view angle between the pattern and the laser beams can be adjusted.
In contrast to previous methods, our proposed method uses a triangular V-shaped calibration target with a checkerboard on each face, which is also used for camera calibration. Our method builds sufficient constraints to guarantee the uniqueness of the solution for each observation. In theory, a single snapshot suffices to calibrate the camera-LRF rig. In practice, an accurate result can be achieved with only a few images (previous methods require 30 or more). Further, the angle between the two triangular boards of our calibration target can be arbitrary, which makes the target convenient to build.
3 Problem Formulation
Our calibration setup is shown in Fig. 3. A V-shaped calibration target is formed by two triangular boards with a checkerboard on each triangle. , , and are the four corners of the target. For a given rigidly connected LRF-camera pair, and are respectively the LRF's and the camera's frames of reference. We define the triangles as and , and let and . For simplicity, we use the same notation to refer to the planes in which they lie. In contrast to the methods proposed in [9] and [10], the angle between and is not required to be a fixed value; it can be an arbitrary obtuse angle. For each observation, the scanning plane of the LRF intersects the three sides , and at points , and respectively in the LRF frame. Moreover, the camera and LRF should be either synchronized or held stationary during data collection. Each observation of the calibration target consists of an image acquired from the camera and a single scan obtained from the LRF. The features to be extracted from a single observation are: three laser points , and
in the LRF frame; four unit normal vectors
, , and of respective planes , , and in the camera frame; and two distances and from the camera to planes and respectively. Further details of feature extraction are described in Section 6.
Problem Definition: The objective is to obtain the relative pose between the two sensors: the orientation and position of the LRF with respect to (w.r.t.) the camera frame. Without loss of generality, the laser scanning plane is defined as the plane . Thus, a 3D laser point represented as can be transformed to the point in the camera frame by .
The camera is modeled by the standard pinhole model. We ignore lens distortions in the rest of the paper, and assume that the images have already been undistorted, e.g. using the functions from MATLAB Camera Calibration Toolbox [15].
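As a concrete illustration of the coordinate conventions above, the following sketch lifts a raw range/bearing reading into the camera frame and projects it with the pinhole model. The function names, the assignment of the bearing angle to the x-z plane, and the example intrinsics are illustrative assumptions, not notation from the paper:

```python
import numpy as np

def laser_to_camera(rng, theta, R, t):
    """Lift a 2D LRF reading (range, bearing) into the camera frame.

    Following the paper's convention, the scanning plane is the plane in
    which the second coordinate of every laser point is zero; mapping the
    bearing onto the x and z axes is our own assumption.
    """
    p_lrf = np.array([rng * np.cos(theta), 0.0, rng * np.sin(theta)])
    return R @ p_lrf + t

def project_pinhole(p_cam, K):
    """Standard pinhole projection of a camera-frame 3D point to pixels."""
    uvw = K @ p_cam
    return uvw[:2] / uvw[2]
```

With identity rotation and a translation along the optical axis, a point at bearing zero lands on the image plane where the intrinsics place it.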
4 Uniqueness Of The Solution
In this section, we show how the features from a single observation constrain the calibration problem to a unique solution.
4.1 Constraints
A single laser scan consists of a depth value for each angle at which the depth was sensed. In the LRF frame, we assume that the sensor is at the origin . Let us express the feature points as , where are the indices of the feature points. Let the feature normal vectors of the planes be , where . We now have correspondences between 3D points in the LRF frame and planes in the camera frame. Thus, our constraint is that each laser point, transformed to the camera frame, must lie on its corresponding plane; the constraints divide into three parts.
First, laser points and must respectively lie on the planes and . Then, the first two constraints have the form
(1) 
where and are the unknowns. Second, laser points and must both lie on the plane . Then, we have two more constraints
(2) 
Similarly, laser points and must both lie on the plane . This gives two more constraints:
(3) 
Let , and be the three columns of . Since we set up the LRF coordinate system in a way that the second coordinate of any laser point is zero, we do not have an explicit dependence on . Once we solve for and , can be obtained by
(4) 
Since is an orthonormal matrix, we have three further constraints
(5) 
To summarize, we have nine unknowns (in , and ) and a system of six linear and three nonlinear equations. In the next section, we show that these constraints are independent and hence sufficient to obtain a solution.
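The structure of the six linear and three nonlinear constraints can be checked numerically with a small sketch like the one below. The pairing of laser points to planes and all variable names are illustrative assumptions based on the description in Section 4.1:

```python
import numpy as np

# Hypothetical pairing (laser-point index, plane index): the first laser
# point lies on one side plane, the third on the other, and the two
# triangle planes each contain two of the points (indices illustrative).
PAIRS = [(0, 0), (2, 1), (0, 2), (1, 2), (1, 3), (2, 3)]

def linear_residuals(r1, r3, t, points, normals, dists, pairs=PAIRS):
    """Six point-on-plane residuals n_i . (x_j r1 + z_j r3 + t) - d_i.

    Laser points have the form (x, 0, z), so R p = x r1 + z r3 and the
    middle column r2 of the rotation never appears.
    """
    res = []
    for j, i in pairs:
        x, _, z = points[j]
        res.append(normals[i] @ (x * r1 + z * r3 + t) - dists[i])
    return np.array(res)

def orthonormality_residuals(r1, r3):
    """The three nonlinear constraints: unit norms and orthogonality."""
    return np.array([r1 @ r1 - 1.0, r3 @ r3 - 1.0, r1 @ r3])
```

At the true pose all nine residuals vanish, which is what the analytical solution in Section 5 exploits.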
4.2 Proof of Uniqueness
For a single observation of the calibration target, our method builds the system of Eqs. (1)-(5). To prove that the proposed method does not induce any ambiguity, the nine equations must be independent. We show that the six linear equations are linearly independent. Since the three nonlinear equations encode orthonormality rather than geometry, they are independent of the six linear equations.
From the constraint formulation, the six linear equations can be expressed in the following form
(6) 
where is the vector of unknowns with , and , is the distance vector denoted as , and is the coefficient matrix whose elements are expressed using components from and as defined in Subsection 4.1. Lemma 1 below states that the three unit vectors , and are linearly independent, which means they span the entire 3D space.
Lemma 1
Let be the normal vector of plane for , as defined in Fig. 3. Then the normal vectors in any cardinality-three subset of are linearly independent (there are four such subsets: I. , and ; II. , and ; III. , and ; IV. , and ).
The proof is presented in Appendix 10.
As a corollary, the unit vector can be expressed as a combination of the first three unit vectors . This allows us to perform Gaussian elimination on Eq. (6) as follows:

Keep and unchanged, and let ;

Let and ;

Let .
Here, is transformed as
(7) 
with submatrices , , and where
(8) 
with , , and . Since the laser features , and are extracted from two distinct line segments, their coordinates cannot all be equal; otherwise the three points would come from an invalid observation. Therefore, , and can be calculated. After Gaussian elimination, the distance vector is transformed into a new vector denoted as , where .
Let us first take a closer look at the structure of . Since the unit vectors , and are linearly independent (Lemma 1), the matrix is nonsingular and can be reduced to an upper triangular matrix. Thus, the first three linear equations are independent. Next, the unit vectors , and are also linearly independent (Lemma 1), so the matrix is also nonsingular and can be reduced to an upper triangular matrix, which means the last three linear equations are also independent. By the procedure above, we reduce to a matrix whose lower-left block is zero, as shown below
(9) 
where represents an upper triangular matrix and represents a square matrix. From the matrix structure in Eq. (9), we conclude that the six linear geometric constraints are linearly independent, which means that, together with the three nonlinear equations, we can solve for the nine unknown components in , and respectively. Hence, there is no ambiguity in our proposed method, in which the relative pose between the camera and LRF is determined from a single snapshot of the calibration target.
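The independence argument can also be verified numerically: stacking the six linear constraints as rows acting on the unknown vector [r1; r3; t] should give a rank-6 matrix for any valid observation. The configuration below is a synthetic example, not data from the paper:

```python
import numpy as np

def coefficient_matrix(points, normals, pairs):
    """Rows of the 6 x 9 matrix A acting on q = [r1; r3; t]: the constraint
    n_i . (x_j r1 + z_j r3 + t) = d_i becomes [x_j n_i, z_j n_i, n_i] q = d_i."""
    rows = []
    for j, i in pairs:
        x, _, z = points[j]
        n = normals[i]
        rows.append(np.concatenate([x * n, z * n, n]))
    return np.array(rows)
```

A degenerate configuration (e.g., all four normals equal, violating Lemma 1) drops the rank below six, mirroring the ambiguity of the earlier methods.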
5 Analytical Solution
In this section, we first present how to obtain the solution for the extrinsic calibration of the camera-LRF system from just a single observation of the calibration target. Then, we show the solution from multiple observations, which is needed to reduce the effect of noise. Note that our constraint system admits an analytical solution, which is more general than the closed-form solution in [11]. Moreover, we present a strategy to exclude spurious solutions via a cheirality check.
5.1 From a Single Observation
We outline seven steps to solve the polynomial system (Eqs. (1)-(5)). For convenience, the geometric constraints (Eqs. (1)-(3)) are reformulated as follows
(10) 
where the parameters , and are defined as
(11) 
STEP 1: The problem is reformulated as a nonlinear optimization, as stated in [12]:
(12) 
where . From the reformulated problem (12), the optimal solution for is obtained as shown below
(13) 
where and .
Lemma 2
is a nonsingular matrix and thus invertible.
Lemma 2 is proved in Appendix 11. Since for a laser point , we rearrange the expression for in (13) into the form
(14) 
in which and .
STEP 2: With , , and defined as
(15) 
we substitute (14) in constraints (10) and obtain
(16) 
where , and . Then, is further expressed in terms of
(17) 
where and .
Note that is invertible. The proof is straightforward: assume that is non-invertible and thus rank deficient. From (16), we have , which is a system for solving . Then for any given (and thus given ), the rank deficiency results in infinitely many solutions for (which is ). This means our system would be ambiguous, contradicting the uniqueness proof in Section 4. Hence, is invertible.
STEP 3: Now we can eliminate by substituting (17) in the three remaining second order constraints (5). After full expansion, we have the following
(18)  
(19)  
(20) 
where the coefficients and the constant are computed in a closed form in terms of the components of and . To solve the polynomial system (Eqs. (18)(20)), we aim to obtain a univariate polynomial in using Macaulay resultant [16]. This multivariate resultant is the ratio of two determinants, the denominator (21) and numerator (22)
(21) 
(22) 
with elements , , , , and defined as
(23) 
We set this resultant to , and obtain the univariate polynomial equation
(24) 
where the coefficients are computed in closedform of the coefficients of Eqs. (18)(20).
Although the eighth-degree (higher than four) univariate polynomial does not admit a closed-form root expression, we can obtain all of its roots by computing the eigenvalues of its companion matrix [17]. For numerical stability, we approximate the roots through an iterative method [18] which uses a generalized companion matrix constructed from and initialized by . Here, and are expressed as
(25) 
where , and . is first initialized as the eigenvalues of . Then at each iteration, is updated as the eigenvalues of until converges. Eight possible solutions for are obtained.
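A minimal sketch of the plain companion-matrix root finder (the generalized, iteratively updated companion matrix of [18] is omitted here) might look as follows; the coefficient ordering, highest-degree term first, is our own convention:

```python
import numpy as np

def companion_roots(coeffs):
    """All roots of c[0]*x^n + ... + c[n] via eigenvalues of the companion matrix."""
    c = np.asarray(coeffs, dtype=float)
    c = c / c[0]                   # make the polynomial monic
    n = len(c) - 1
    C = np.zeros((n, n))
    C[1:, :-1] = np.eye(n - 1)     # ones on the sub-diagonal
    C[:, -1] = -c[1:][::-1]        # last column holds the negated coefficients
    return np.linalg.eigvals(C)    # complex in general; real roots have ~0 imag part
```

For the calibration problem the eighth-degree polynomial (24) would be fed in the same way, and complex eigenvalues would be filtered out afterwards.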
STEP 4: Each solution for (numeric value ) corresponds to a single solution for the rest of the unknowns. For numerical stability, we compute the Sylvester resultant [16] of Eq. (18) and Eq. (20) w.r.t . With the determinant of this resultant set to zero, we obtain a quartic polynomial in
(26) 
where . Similarly, we compute the Sylvester resultant of Eq. (19) and Eq. (20) w.r.t , and set its determinant to zero to obtain another quartic polynomial in . To solve this overdetermined system, we minimize the sum of the squares of and , and thus set the derivative of + w.r.t to zero, yielding a seventh-degree polynomial. The seven solutions for , obtained by the iterative method mentioned above, are tested for whether both and hold.
After substituting the numeric solutions and into Eqs. (18)(20), we perform the same optimization method to solve the overdetermined system for . Note that we have a closedform roots expression for the thirddegree polynomial obtained from the derivative of the cost function w.r.t (the sum of the squares of three polynomials in (18)(20)). We only keep the solution if all Eqs. (18)(20) hold.
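The Sylvester resultant used in this step is the determinant of the Sylvester matrix of the two polynomials. A minimal construction, with coefficients listed highest degree first, could look like this (a vanishing determinant signals a common root):

```python
import numpy as np

def sylvester(p, q):
    """Sylvester matrix of polynomials p (degree m) and q (degree n),
    given as coefficient lists with the highest-degree term first."""
    m, n = len(p) - 1, len(q) - 1
    S = np.zeros((m + n, m + n))
    for i in range(n):                  # n shifted copies of p's coefficients
        S[i, i:i + m + 1] = p
    for i in range(m):                  # m shifted copies of q's coefficients
        S[n + i, i:i + n + 1] = q
    return S

def resultant(p, q):
    """Resultant of p and q: zero exactly when they share a root."""
    return np.linalg.det(sylvester(p, q))
```

In STEP 4 the same construction would be applied with the coefficients treated as polynomials in the remaining variable, eliminating it from the pair of equations.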
STEP 5: After obtaining , can be calculated from Eq. (17) and can be retrieved from Eq. (4). Finally, can be obtained using Eq. (13).
The eight possible solutions yield up to four real solutions. Complex solutions are eliminated as follows: we square all the elements of , , and , and check whether they all have nonnegative real parts.
STEP 6: In practice, when the solution for fails to deliver a real solution in the presence of noise, we use its projection onto the real domain as the initial value. Eqs. (18)-(20) are then treated as a whole system for , which can be solved using the Trust-Region Dogleg method [19] [20]. At each iteration , the trust-region subproblem is
(27) 
where is updated, and the Jacobian is defined as . The step to obtain is then constructed as
(28) 
where is the largest value such that . The Cauchy step and the Gauss-Newton step are respectively defined as and , where and .
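A sketch of a single dogleg step combining the Cauchy and Gauss-Newton steps within the trust region is given below; it follows the standard Powell dogleg construction rather than reproducing the exact update rules of [19] [20]:

```python
import numpy as np

def dogleg_step(J, F, radius):
    """One Powell dogleg step for min ||F + J s||^2 subject to ||s|| <= radius."""
    g = J.T @ F                                   # gradient of 0.5 * ||F||^2
    s_gn = -np.linalg.solve(J.T @ J, g)           # Gauss-Newton step
    if np.linalg.norm(s_gn) <= radius:
        return s_gn                               # full step fits in the region
    s_c = -(g @ g) / (g @ (J.T @ (J @ g))) * g    # Cauchy (steepest-descent) step
    if np.linalg.norm(s_c) >= radius:
        return -radius * g / np.linalg.norm(g)    # clip to the boundary
    # otherwise walk from the Cauchy point toward Gauss-Newton until the boundary
    d = s_gn - s_c
    a, b, c = d @ d, 2 * (s_c @ d), s_c @ s_c - radius ** 2
    tau = (-b + np.sqrt(b * b - 4 * a * c)) / (2 * a)
    return s_c + tau * d
```

The accepted step is then scaled back (the factor in the text) whenever the predicted reduction does not materialize.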
STEP 7: We must select the unique solution from the spurious ones through a cheirality check. Fig. 4 shows four possible real solutions, which are , , and in terms of a combination of the LRF frame and the camera frame.
The unique solution is determined as follows: both the LRF and the camera must face in the same direction, and the three laser points , and , after transformation to the camera frame, must be in front of the camera.
5.2 From Multiple Observations
It is always reasonable to use multiple observations from different views to suppress noise. The problem in this case is formulated as
(29) 
where and is the total number of observations. The solution is obtained by following the seven steps described in Section 5.1. As the rotation is represented by Rodrigues' formula, the calibration result is further refined using the Levenberg-Marquardt (LM) method [21] [22], with the accurate initial value provided by our solution to (29).
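A minimal Levenberg-Marquardt loop over a Rodrigues-parameterized pose, with a finite-difference Jacobian, can be sketched as follows. The residual model (point-on-plane distances with one plane offset per constraint) and all names are illustrative assumptions, not the paper's exact cost function:

```python
import numpy as np

def rodrigues(w):
    """Rotation matrix from an axis-angle vector (Rodrigues' formula)."""
    th = np.linalg.norm(w)
    if th < 1e-12:
        return np.eye(3)
    k = w / th
    K = np.array([[0, -k[2], k[1]], [k[2], 0, -k[0]], [-k[1], k[0], 0]])
    return np.eye(3) + np.sin(th) * K + (1 - np.cos(th)) * (K @ K)

def residuals(x, points, pairs, normals, dists):
    """Point-on-plane residuals for the pose x = [rotation vector, translation];
    dists holds one plane offset per constraint (an assumed data layout)."""
    R, t = rodrigues(x[:3]), x[3:]
    return np.array([normals[i] @ (R @ points[j] + t) - dists[k]
                     for k, (j, i) in enumerate(pairs)])

def refine_lm(x0, args, iters=100, lam=1e-3):
    """Minimal LM loop with a central-difference Jacobian and adaptive damping."""
    x = np.asarray(x0, dtype=float).copy()
    r = residuals(x, *args)
    for _ in range(iters):
        J = np.empty((len(r), len(x)))
        for c in range(len(x)):
            e = np.zeros(len(x)); e[c] = 1e-6
            J[:, c] = (residuals(x + e, *args) - residuals(x - e, *args)) / 2e-6
        dx = np.linalg.solve(J.T @ J + lam * np.eye(len(x)), -J.T @ r)
        r_new = residuals(x + dx, *args)
        if r_new @ r_new < r @ r:       # accept the step, relax damping
            x, r, lam = x + dx, r_new, lam * 0.5
        else:                           # reject the step, increase damping
            lam *= 10.0
    return x
```

Starting from the analytical solution, the loop polishes the pose until the residuals stop decreasing.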
6 Details of the Calibration Approach
We explain below how to accurately extract the features required by our method: four normal vectors () and two distances () from the camera, and three laser edge points () from the LRF.
6.1 Data from the LRF
Instead of being concave, our calibration target can be built in a convex shape, as shown in Fig. 5. We place the calibration target on a planar object (such as a table, a wall, or the ground) so that , and are accurately determined by intersections of line segments. Specifically, the laser points on the objects are first segmented into four parts (e.g., using the Ramer-Douglas-Peucker (RDP) algorithm [23]). We then fit a line to each segment using the total least squares method [24]. is thus obtained as the intersection of lines and . Similarly, and are respectively determined by the other two pairs and .
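Assuming the scan has already been segmented (e.g., by RDP), the total-least-squares line fit and the line-line intersection can be sketched as below. Lines are represented as n . x = c with a unit normal n, one of several equivalent parameterizations:

```python
import numpy as np

def fit_line_tls(pts):
    """Total-least-squares fit of a 2D line n . x = c to an (N, 2) point array.

    The right singular vector with the smallest singular value of the
    centered points is the direction of least variance, i.e. the line normal.
    """
    mean = pts.mean(axis=0)
    _, _, Vt = np.linalg.svd(pts - mean)
    n = Vt[-1]
    return n, n @ mean

def intersect(n1, c1, n2, c2):
    """Intersection point of the lines n1 . x = c1 and n2 . x = c2."""
    return np.linalg.solve(np.vstack([n1, n2]), np.array([c1, c2]))
```

Each corner estimate then comes from intersecting the two fitted lines of its adjacent segments, which is far less noisy than picking a single endpoint sample.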
6.2 Data from the Camera
Using our calibration target with two checkerboards, camera calibration is first performed with the MATLAB Toolbox [15]. The two normal vectors and of planes and can be obtained directly from the calibration toolbox. Each checkerboard lies on the plane of its own world frame. For example, from the calibration result we have the rotation and translation , which represent the orientation and position of the plane w.r.t. the camera. is simply minus the third column of , and the distance from the camera to is . Similarly, and can also be calculated.
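The plane normal and camera-to-plane distance can be read off the checkerboard extrinsics as sketched below, assuming the board occupies the z = 0 plane of its world frame and adopting the sign convention described above (normal equal to minus the third column of the rotation):

```python
import numpy as np

def plane_from_extrinsics(R_cb, t_cb):
    """Checkerboard plane in the camera frame from its extrinsics (R_cb, t_cb).

    Every board point X = R_cb @ [u, v, 0] + t_cb satisfies n . X = n . t_cb,
    because the first two columns of R_cb are orthogonal to the third.
    Returns the unit normal and the camera-to-plane distance |n . t_cb|.
    """
    n = -R_cb[:, 2]
    return n, abs(n @ t_cb)
```

Applying this to each checkerboard yields the pairs of normals and distances required by the constraint system.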
From the image (See Fig. 5), the unit line directions of , and are projected to , and respectively, and the projection of the vertex is . As stated in [13], we perform a weighted optimization initialized by the LSD line detector [25] to accurately estimate the set . Specifically, only the pixels within a rectangular region are considered for fitting . Let be the normal of . The optimization problem is then expressed as
(30) 
where each region has valid pixels, whose gradient magnitudes (used as weights) are above a threshold. Given the intrinsic matrix , the normal vector () is obtained by .
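Assuming the standard back-projection relation, the plane through the camera center and an image line l (in homogeneous form) has normal proportional to K^T l, since l . (K X) = 0 implies (K^T l) . X = 0. A minimal sketch:

```python
import numpy as np

def line_backprojection_normal(l, K):
    """Unit normal of the plane spanned by the camera centre and the image
    line l (homogeneous coefficients, l . x_img = 0), given intrinsics K."""
    n = K.T @ l
    return n / np.linalg.norm(n)
```

The resulting unit normals of the three edge-line planes are exactly the per-line constraints used alongside the checkerboard planes.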
6.3 Snapshot Selection
The solution from each snapshot should guarantee that the laser points from line and from respectively lie on and . With multiple observations, this criterion guides us to select well-conditioned snapshots for further accuracy. Given the solution and , we keep a snapshot if it satisfies
(31) 
where and respectively have and points with as the distance threshold.
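The selection test can be sketched as a simple point-to-plane residual check; the threshold value and the data layout are illustrative assumptions:

```python
import numpy as np

def snapshot_ok(R, t, points_by_plane, normals, dists, eps=0.005):
    """Keep a snapshot only if every laser point, transformed to the camera
    frame, lies within eps (metres, an assumed threshold) of its plane."""
    for pts, n, d in zip(points_by_plane, normals, dists):
        for p in pts:
            if abs(n @ (R @ p + t) - d) > eps:
                return False
    return True
```

Snapshots failing the check are discarded before the multi-observation refinement of Section 5.2.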
7 Synthetic Experiments
Using synthetic data, we first validate the numerical stability of our solution. We then explore the sensitivity to noise, along with the effect of the angle between the two triangular boards, from a single snapshot. We further present the noise-reduction results using multiple observations. Finally, we show the accuracy improvement from snapshot selection.
For the simulation setting, the obtuse angle between the two triangular boards of the calibration target is set to . Then, we uniformly and randomly generate roll, pitch and yaw in the range of for the orientation of the LRF w.r.t. the camera, and the three components of the position from 5 to 30 cm. For each instance of the ground truth, we randomly generate the orientation and position of the calibration target within the ranges of and 50 to 150 cm.
7.1 Numerical Stability
We performed Monte-Carlo trials to validate the numerical stability of our solution in the noise-free case. For each trial, only one snapshot of the calibration target is needed. The error metric is the difference between the ground truth and our solution. The histogram of errors is shown in Fig. 6. Due to randomness, the computational error varies, but the accuracy remains high (around ). This demonstrates that our method correctly solves the LRF-camera calibration problem, which further validates the sufficiency of the constraints from a single snapshot.
7.2 Sensitivity to Noise
This simulation tests the sensitivity of our solution to noise in the feature data. Two sources of error are taken into account: the laser depth uncertainty along the beam direction and the pixel uncertainty in line detection. Based on the practical setting, we respectively set the standard deviations (STDs) of the laser noise and the pixel error to mm and px in the image. A factor varying from 0 to 1 combines the noise from both sensors as () [13] in one plot. 1000 Monte-Carlo trials are performed, each of which needs only one snapshot. The metrics for angular error (rotation) and distance error (translation) are respectively and . Fig. 8 demonstrates that, as the noise level increases, our solution is not sensitive to the image noise but is more sensitive to the laser depth noise.
The sensitivity of our solution to the angle between the two triangular boards of the calibration target is tested next. We set mm and px. The angle varies from to . Fig. 8 shows the means and STDs of both the angular and distance errors. Our solution is not sensitive to the board angle but still has the smallest errors around .
7.3 Noise Reduction
We report a simulation designed to test the noise reduction of our solution when using multiple observations. Here, varies from 1 mm to 10 mm with set to 3 px. At each noise level, the means and STDs are calculated for both and over 1000 Monte-Carlo trials. From Fig. 10, we observe that as the number of observations increases, the means and STDs of both errors asymptotically decrease. Moreover, with a small number of snapshots, our solution can provide a highly accurate initial value for further optimization. Specifically, and are respectively around and 5 mm with only 5 snapshots.
With snapshot selection, additional accuracy is achieved, as shown in Fig. 10, where mm and px. The distance threshold is set to 5 mm. The errors and are further decreased by nearly compared with the solution without any single-snapshot check.
8 Real Experiments
We perform experiments on real data to further validate our method in comparison with two existing methods.
An LRF (Hokuyo URG-04LX) is rigidly mounted on a stereo rig consisting of a pair of Point Grey Chameleon CMLN-13S2C cameras (See Fig. 1). The sensors are synchronized based on time stamps. The LRF is set to a horizontal field of view of , with an angular resolution of and a line scanning frequency of 10 Hz. Its scanning accuracy is cm within a range from 2 cm to 100 cm, with error for ranges from 100 cm to 400 cm. The cameras have a resolution of pixels and are precalibrated based on [26]. Prior to data processing, the images are warped to remove radial and tangential distortions.
8.1 Comparison with Existing Methods
We compare our method with two state-of-the-art algorithms, [5] and [10], using the ground truth of the stereo-rig baseline. Specifically, let represent the transformation (rotation and translation) between two frames. For each method, the LRF is first calibrated w.r.t. both the left and right cameras to obtain and . We then compute the relative pose (baseline) between the stereo cameras and compare it with the ground truth calibrated from the toolbox [15]. Hence, the error matrix is