A Novel Method for Extrinsic Calibration of a 2-D Laser-Rangefinder and a Camera

by Wenbo Dong, et al.
University of Minnesota

We present a novel solution for extrinsically calibrating a camera and a Laser Rangefinder (LRF) by computing the transformation between the camera frame and the LRF frame. Our method is applicable for LRFs which measure only a single plane. It does not rely on observing the laser plane in the camera image. Instead, we show that point-to-plane constraints from a single observation of a V-shaped calibration pattern composed of two non-coplanar triangles suffice to uniquely constrain the transformation. Next, we present a method to obtain a solution using point-to-plane constraints from single or multiple observations. Along the way, we also show that previous solutions, in contrast to our method, have inherent ambiguities and therefore must rely on a good initial estimate. Real and synthetic experiments validate our method and show that it achieves better accuracy than previous methods.






1 Introduction

Many robotic systems rely on cameras and laser range finders to compute environment geometry [1]. Two-dimensional (2D) Laser Range Finders (LRFs), which measure depth along a single plane, are commonly used due to their low weight, low cost, and low power requirements.

Taking advantage of measurements from an LRF or a LIDAR combined with a camera, however, requires precise knowledge of the relative pose (orientation and position) between them. This is a classical extrinsic calibration problem where the objective is to determine the transformation between two coordinate frames. Establishing correspondences between two sensors is easier for 3D LIDARs since distinct features can be identified both among laser points and in the camera image. Existing methods include 3D LIDAR-camera calibration by using a circle-based calibration target [2] and an arbitrary trihedron [3].

Extrinsic calibration of a 2D LRF is more challenging because a 2D LRF produces only a single scanning plane for each pose, which is invisible to a regular camera. This makes it difficult to find correspondences. Therefore, additional constraints must be used. We note that if we were given the correspondences between laser points and their images (e.g., from an IR camera), the extrinsic calibration problem would reduce to a standard PnP (Perspective-n-Point) computation [4]. However, in our case, these correspondences are unknown. We study this extrinsic calibration problem and make the following contributions:

Figure 1: The calibration system, incorporating a calibration target and a capture rig. Left: the calibration target formed by two triangular boards with a checkerboard on each triangle. Right: the capture rig consisting of a 2D LRF and stereo cameras. (Only one camera is involved in the calibration; the other is used only for validation in the real experiments.)
  • We show that a single observation of two non-coplanar triangles sharing a common side (See Fig. 1) suffices to unambiguously solve the calibration problem.

  • Even though planar, triangular or V-shaped rectangular patterns have already been proposed to solve the calibration problem, we show that previous methods do not sufficiently constrain the calibration problem to allow for a unique solution. Therefore, they rely on a large number of measurements and a good initial estimate.

  • We present a robust method to calibrate from a single observation in the presence of noise.

  • For additional accuracy, we show that by using only a few additional observations, our method achieves significantly smaller error than existing methods.

2 Related Work

One of the earliest methods for LRF-camera calibration was presented by Zhang and Pless [5], using points-on-plane constraints (laser points must lie on the plane of the calibration pattern). The weakness of this approach is that the cost function in the nonlinear optimization only penalizes the distance between the laser points and the corresponding plane, which constrains only two out of six degrees of freedom; the remaining four degrees of freedom of the relative pose between the camera and the LRF remain ambiguous (see Fig. 2). Consequently, the solution may converge to a local minimum, and this method requires a large number of different observations with a wide range of views (more than 20 snapshots) for accuracy.

Vasconcelos et al. [6] and Zhou [7] presented minimal solutions to address the disadvantages of [5]. In contrast to our method, both techniques require multiple observations (at least three) and suffer from inherent degeneracies (e.g., when two planes are parallel, or when the poses lie on a dangerous cylinder).

Figure 2: Existing methods do not sufficiently constrain the problem.
(a): In Zhang’s approach [5], the rectangular calibration board can be moved horizontally and vertically, and can also be rotated about two different axes, without changing the solution; (b): In Li’s approach [8], the triangular calibration board can be rotated about two different axes, and each red laser edge point can be moved along the green plane through the camera center and one board edge; (c): In methods [9] and [10], the V-shaped calibration target can be moved vertically, and each red laser edge point can be moved along its corresponding green plane through the camera center and one board edge; (d): In approaches [12] and [11], which use single or double line features but rely on the same principle, the rectangular calibration board can be moved vertically, and each red laser feature point can be moved along the green plane through the camera center and one feature line.

With point-on-line constraints (laser points must lie on intensity edges or lines obtained from the image), the approaches proposed in [8] and [9] respectively use a black triangular board and a V-shaped calibration target. The results from these two methods are not accurate due to the sparse sampling of laser points. Further, a large number of images (usually more than 100) is needed to compensate for the lack of constraints in each observation. Using a triangular board [8] turns out to be one way to strengthen the constraints of [5]: the points-on-border-lines constraint removes the ambiguity of the horizontal translation, and the triangular plane removes the ambiguity of the vertical translation of the relative LRF-camera pose. However, two degrees of freedom still remain ambiguous (see Fig. 2). Essentially, the drawback is that the constraints are imposed on the image: four degrees of freedom of depth and orientation remain under-constrained (two linear geometric constraints plus the three nonlinear rotation-matrix constraints are not enough to solve for nine unknowns). Additional details are explained in Section 4.

Building on the ideas in [8] and [9] (minimizing the projected distance on the image between the intersected laser points and the feature lines), the authors in [10] also propose a V-shaped calibration target. They increase the laser-point sampling for each observation by introducing more feature lines and virtual end-points, but the same drawback as in [8] and [9] still exists (see Fig. 2). Therefore, they still need a large number (around 50) of different observations to achieve a reasonable result. Besides, the convergence of the subsequent optimization requires a good initial estimate, which these methods do not guarantee.

The method in [11] provides an analytical solution using a white board with a black band in the middle; it needs only six different observations. Similarly, the authors in [12] give an analytical solution to this problem using a white board with a black line in the middle. Compared with [11], it further computes the optimal least-squares solution to improve robustness to noise. However, both methods still cannot avoid using a large number of different observations for accuracy because of the insufficient constraints per observation (see Fig. 2). In practice, compensating for these ambiguities with a large number of snapshots from different valid views is not robust.

The work described in [13] presents an approach that only requires the observation of a scene corner (an orthogonal trihedron) commonly found in human-made environments. Moreover, this method builds line-to-plane and point-to-plane constraints, which are sufficient for each observation. The authors in [14] further extend this work by deriving a robust minimal solution. However, the assumption of three orthogonal planes depends heavily on the accuracy of the three right angles, which are difficult to realize exactly in practice. Further, when multiple observations from different views are needed for additional accuracy, the sharp angle between the laser beams and the walls on the two sides often corrupts the laser measurement: the scanned laser line on the wall appears curved. Our calibration target instead accommodates an arbitrary obtuse angle, so the view angle between the pattern and the laser beams can be adjusted.

In contrast to previous methods, our proposed method uses a triangular V-shaped calibration target with a checkerboard on each board, which is also used for camera calibration. Our method builds sufficient constraints to guarantee the uniqueness of the solution for each observation. In theory, a single snapshot suffices to calibrate the camera-LRF rig. In practice, an accurate result can be achieved with only a few images, whereas previous methods require 30 or more. Further, the angle between the two triangular boards of our calibration target can be arbitrary, which makes the target convenient to build.

3 Problem Formulation

Our calibration setup is shown in Fig. 3. A V-shaped calibration target is formed by two triangular boards with a checkerboard on each triangle; the two triangles share a common side, and the target has four corners. For a given rigidly connected LRF-camera pair, {L} and {C} are respectively the LRF's and the camera's frames of reference. We denote the two triangles, and for simplicity the planes they lie in, by π1 and π2. In contrast to the methods proposed in [9] and [10], the angle between π1 and π2 is not required to be a fixed value; it can be an arbitrary obtuse angle. For each observation, the scanning plane of the LRF intersects the outer side of π1, the common side, and the outer side of π2 at points P1, P2 and P3 respectively, expressed in the LRF frame. Moreover, the camera and the LRF should be either synchronized or held stationary during data collection. Each observation of the calibration target consists of an image acquired from the camera and a single scan obtained from the LRF. The features to be extracted from a single observation are: the three laser points P1, P2 and P3 in the LRF frame; the four unit normal vectors n1, n2, n3 and n4 of the respective planes π1, π2, π3 and π4 in the camera frame, where π3 and π4 are the planes through the camera center and the two outer target edges as observed in the image; and the two distances d1 and d2 from the camera to the planes π1 and π2 respectively. Further details of feature extraction are described in Section 6.


Figure 3: A single pair of LRF-camera observations of the calibration target with the definition of parameters for geometry constraints.

Problem Definition: The objective is to obtain the relative pose between the two sensors: the orientation R and position t of the LRF with respect to (w.r.t.) the camera frame. Without loss of generality, the laser scanning plane is defined as the plane Y = 0 of the LRF frame. Thus, a 3D laser point represented as P = [x, 0, z] can be transformed to the point P' in the camera frame by P' = R P + t.
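As a concrete illustration, a minimal NumPy sketch of this transformation (the extrinsics R and t below are made-up values, not results from the paper):

```python
import numpy as np

def laser_to_camera(p_laser, R, t):
    """Transform a 2D LRF measurement [x, 0, z] into the camera frame."""
    return R @ p_laser + t

# Illustrative extrinsics: a small rotation about Y and a 10 cm offset.
theta = np.deg2rad(5.0)
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0,           1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
t = np.array([0.1, 0.0, 0.0])

p = np.array([0.5, 0.0, 2.0])   # laser point: second coordinate is zero
q = laser_to_camera(p, R, t)
```

Since R is a rotation, the transformed point keeps its distance to the translated origin.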

The camera is modeled by the standard pinhole model. We ignore lens distortions in the rest of the paper, and assume that the images have already been undistorted, e.g. using the functions from MATLAB Camera Calibration Toolbox [15].

4 Uniqueness Of The Solution

In this section, we show how the features from a single observation constrain the calibration problem to a unique solution.

4.1 Constraints

A single laser scan consists of a depth value for each angle at which the depth was sensed. In the LRF frame, we assume that the sensor is at the origin. Let us express the feature points as Pi = [xi, 0, zi], where i ∈ {1, 2, 3} are the indices of the feature points. Let the feature normal vectors of the planes be nj, where j ∈ {1, 2, 3, 4}. We now have correspondences between 3D points in the LRF frame and planes in the camera frame. Thus, our constraint is that each laser point, transformed to the camera frame, must lie on the corresponding plane; the constraints can be divided into three parts.

First, laser points P1 and P3 must respectively lie on the planes π3 and π4, which pass through the camera center. The first two constraints therefore have the form

    n3 · (R P1 + t) = 0,   n4 · (R P3 + t) = 0,    (1)

where R and t are the unknowns. Second, laser points P1 and P2 must both lie on the board plane π1. Then, we have two other constraints

    n1 · (R P1 + t) = d1,   n1 · (R P2 + t) = d1.    (2)

Similarly, laser points P2 and P3 must both lie on the board plane π2. This gives two more constraints:

    n2 · (R P2 + t) = d2,   n2 · (R P3 + t) = d2.    (3)
Let r1, r2 and r3 be the three columns of R. Since we set up the LRF coordinate system in a way that the second coordinate of any laser point is zero, we do not have an explicit dependence on r2. Once we solve for r1 and r3, r2 can be obtained by

    r2 = r3 × r1.    (4)

Since R is an orthonormal matrix, we have three further constraints

    ||r1|| = 1,   ||r3|| = 1,   r1 · r3 = 0.    (5)

To summarize, we have nine unknowns (in r1, r3 and t) and a system of six linear and three nonlinear equations. In the next section, we show that these constraints are independent and hence sufficient to obtain a solution.

4.2 Proof of Uniqueness

For a single observation of the calibration target, our method builds the system of Eqs. (1)-(5). To prove that the proposed method does not induce any ambiguity, the nine equations must be shown to be independent. We show that the first six linear equations are linearly independent. Since the three nonlinear equations (5) do not involve the geometric features, they are independent of the six linear equations.

From the constraint formulation, the six linear equations can be expressed in the form

    A x = b,    (6)

where x = [r1; r3; t] is the 9-vector of unknowns, b = [0, 0, d1, d1, d2, d2] is the distance vector, and A is the 6×9 coefficient matrix whose elements are expressed using the components of the laser points Pi and the normals nj defined in Subsection 4.1. Lemma 1 below states that the three unit vectors n1, n2 and n3 are linearly independent, which means they span the entire 3D space.

Lemma 1

Let nj be the normal vector of plane πj for j ∈ {1, 2, 3, 4}, defined as in Fig. 3. The normal vectors in any cardinality-three subset of {n1, n2, n3, n4} are linearly independent (there are four such subsets: I. {n1, n2, n3}; II. {n1, n2, n4}; III. {n1, n3, n4}; IV. {n2, n3, n4}).

The proof is presented in Appendix 10.

As a corollary, the unit vector n4 can be expressed as a combination of the first three unit vectors: n4 = α n1 + β n2 + γ n3. This allows us to perform Gaussian elimination on Eq. (6) as follows:

  • Keep the three rows whose translation blocks are n3, n1 and n2 (the rows for P1 ∈ π3, P1 ∈ π1 and P2 ∈ π2) unchanged;

  • Subtract the row for P1 ∈ π1 from the row for P2 ∈ π1, and subtract the row for P2 ∈ π2 from the row for P3 ∈ π2;

  • From the row for P3 ∈ π4, subtract α times the row for P1 ∈ π1, β times the row for P2 ∈ π2 and γ times the row for P1 ∈ π3; since n4 = α n1 + β n2 + γ n3, this eliminates the translation block of that row.

With the eliminated rows placed last, A is transformed as

    Ā = [ Ā11  N ]
        [ Ā21  0 ]    (7)

with sub-matrices Ā11 and Ā21 (each 3×6, acting on r1 and r3) and N (the 3×3 block acting on t, whose rows are n3, n1 and n2), where the rows of Ā21 carry coefficients such as λ1 = x2 - x1 and λ2 = x3 - x2 multiplying the normal vectors. Since the laser features P1, P2 and P3 are extracted from two distinct line segments, their coordinates cannot all be equal; otherwise the three points would come from an invalid, degenerate observation. Therefore, these coefficients are nonzero. After Gaussian elimination, the distance vector b is transformed into a new vector denoted b̄.

Let us first take a close look at the structure of Ā. Since the unit vectors n1, n2 and n3 are linearly independent (Lemma 1), the matrix N is non-singular and can be reduced to an upper triangular matrix. Thus, the first three linear equations are independent. Next, the unit vectors n1, n2 and n4 are also linearly independent (Lemma 1). Then, the block Ā21 has full row rank and can likewise be reduced to row echelon form, which means the last three linear equations are also independent. The procedure above thus reduces A to a matrix with a zero corner block, shown as

    Ā ~ [ *  U ]
        [ S  0 ]    (9)

where U is an upper triangular matrix and S is a full-row-rank block. From the matrix structure in Eq. (9), we conclude that the six linear equations for the geometry constraints are linearly independent, which means that together with the three nonlinear equations (5) we can solve for the nine unknown components of r1, r3 and t. Hence, there is no ambiguity in our proposed method: the relative pose between the camera and the LRF is determined from a single snapshot of the calibration target.
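The independence claim can be checked numerically. The sketch below, which assumes the plane assignments reconstructed in Eqs. (1)-(3) above, builds the 6×9 coefficient matrix A for a synthetic target and verifies that the ground-truth unknowns satisfy A x = b and that A has full row rank:

```python
import numpy as np

rng = np.random.default_rng(0)

def unit(v):
    return v / np.linalg.norm(v)

def rot_x(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

# Ground-truth extrinsics (any proper rotation works for this check).
R = rot_x(0.4)
t = np.array([0.1, -0.05, 0.2])

# Three laser points [x, 0, z] with distinct x-coordinates, and their
# camera-frame positions Q_i = R P_i + t.
P = [np.array([0.3, 0.0, 1.0]),
     np.array([0.0, 0.0, 1.2]),
     np.array([-0.4, 0.0, 1.1])]
Q = [R @ p + t for p in P]

# Board planes pi1 (through Q1, Q2) and pi2 (through Q2, Q3); edge planes
# pi3, pi4 pass through the camera center and Q1, Q3 respectively.
n1 = unit(np.cross(Q[1] - Q[0], rng.standard_normal(3)))
n2 = unit(np.cross(Q[2] - Q[1], rng.standard_normal(3)))
n3 = unit(np.cross(Q[0], rng.standard_normal(3)))
n4 = unit(np.cross(Q[2], rng.standard_normal(3)))
d1, d2 = n1 @ Q[0], n2 @ Q[1]

def row(n, p):
    x, _, z = p
    return np.concatenate([x * n, z * n, n])   # unknowns: [r1; r3; t]

A = np.stack([row(n3, P[0]), row(n4, P[2]),
              row(n1, P[0]), row(n1, P[1]),
              row(n2, P[1]), row(n2, P[2])])
b = np.array([0.0, 0.0, d1, d1, d2, d2])

x_true = np.concatenate([R[:, 0], R[:, 2], t])
residual = np.abs(A @ x_true - b).max()     # constraints hold at the truth
rank = np.linalg.matrix_rank(A)             # and are independent: rank 6
```

The random auxiliary directions only serve to pick valid plane normals; for a generic configuration the rank is 6, matching the proof above.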

5 Analytical Solution

In this section, we first present how to obtain the solution for the extrinsic calibration of the camera-LRF system from just a single observation of the calibration target. We then show the solution from multiple observations, which is needed to reduce the effect of noise. Note that our constraint system admits an analytical solution, which is more general than the closed-form solution in [11]. Moreover, we present a strategy to exclude spurious solutions through a cheirality check.

5.1 From a Single Observation

We outline seven steps to solve the polynomial system (Eqs. (1)-(5)). For convenience, the geometry constraints (Eqs. (1)-(3)) are reformulated as follows


where the auxiliary parameters are defined as


STEP 1: The problem is reformulated as a nonlinear optimization problem, as stated in [12]:


From the reformulated problem (12), the optimal solution for the translation is obtained in closed form as


where the involved matrices are built from the constraint coefficients.

Lemma 2

is a non-singular matrix and thus invertible.

Lemma 2 is proved in Appendix 11. Since a laser point has the form [x, 0, z], we rearrange the expression of the translation in (13) into the form


in which the remaining coefficient matrices depend only on the known features.

STEP 2: With the auxiliary matrices defined as


we substitute (14) into the constraints (10) and obtain


where the new coefficients absorb the substitution. The unknown vector is then further expressed in terms of the remaining variables as


Note that the matrix in (17) is invertible. The proof is straightforward. Assume the matrix is non-invertible and thus rank-deficient. From (16), for any given value of the remaining unknowns, the rank deficiency would yield infinitely many solutions, i.e., the system would be ambiguous, which contradicts the uniqueness proof in Section 4. Hence, the matrix is invertible.

STEP 3: Now we can eliminate variables by substituting (17) into the three remaining second-order constraints (5). After full expansion, we have the following


where the coefficients and the constant terms are computed in closed form from the components of the known features. To solve the polynomial system (Eqs. (18)-(20)), we aim to obtain a univariate polynomial using the Macaulay resultant [16]. This multivariate resultant is the ratio of two determinants, the denominator (21) and the numerator (22)


with elements defined as


We set this resultant to zero, and obtain the univariate polynomial equation


where the coefficients are computed in closed form from the coefficients of Eqs. (18)-(20).

Although an eighth-degree univariate polynomial (degree higher than four) has no closed-form root expression, all of its roots can be obtained by computing the eigenvalues of its companion matrix [17]. For numerical stability, we approximate the roots through an iterative method [18] that uses a generalized companion matrix constructed from the polynomial and a set of root estimates, expressed as


where the diagonal entries are the current root estimates. The estimates are first initialized as the eigenvalues of the standard companion matrix. Then, at each iteration, they are updated as the eigenvalues of the generalized companion matrix until convergence. Eight possible solutions are obtained.
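The companion-matrix technique is what `numpy.roots` implements internally; a small sketch with an eighth-degree polynomial whose roots are known (the generalized-companion refinement of [18] is omitted here):

```python
import numpy as np

# An eighth-degree polynomial with known real roots, highest degree first.
true_roots = np.array([-2.0, -1.5, -0.5, 0.25, 0.75, 1.0, 1.5, 2.0])
coeffs = np.poly(true_roots)          # expand (x - r_1)...(x - r_8)

# np.roots builds the companion matrix of the polynomial and returns its
# eigenvalues, exactly the technique described above.
roots = np.sort(np.roots(coeffs).real)
```

For well-separated roots the eigenvalue computation recovers them to high precision, which is why the iterative refinement matters mainly near-degenerate cases.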

STEP 4: Each root (a numeric value) corresponds to a single solution for the rest of the unknowns. For numerical stability, we compute the Sylvester resultant [16] of Eq. (18) and Eq. (20) w.r.t. the eliminated variable. With the determinant of this resultant set to zero, we obtain a quartic polynomial


Similarly, we compute the Sylvester resultant of Eq. (19) and Eq. (20) and set its determinant to zero to obtain another quartic polynomial. To solve this overdetermined system, we minimize the sum of the squares of the two quartics, and thus set the derivative of their sum to zero to obtain a seventh-degree polynomial. The seven roots, obtained by the iterative method mentioned above, are kept only if both quartics (approximately) vanish.

After substituting the numeric solutions into Eqs. (18)-(20), we apply the same optimization strategy to solve the overdetermined system for the last unknown. Note that we have a closed-form root expression for the third-degree polynomial obtained from the derivative of the cost function (the sum of the squares of the three polynomials in (18)-(20)). We only keep a solution if all of Eqs. (18)-(20) hold.

STEP 5: After the rotation unknowns are obtained, the remaining quantities are calculated from Eq. (17), r2 is retrieved from Eq. (4), and finally the translation t is obtained using Eq. (13).

The eight possible solutions give us up to four real solutions. The complex solutions can be eliminated as follows: we square all the elements of the candidate solution vectors and check whether they all have non-negative real parts.

STEP 6: In practice, when a root fails to deliver a real solution in the presence of noise, we use its projection onto the real domain as the initial value. Eqs. (18)-(20) are then treated as a whole system, which can be solved using the trust-region dogleg method [19][20]. At each iteration k, the trust-region subproblem is


where the trust radius is updated at each iteration and the Jacobian is evaluated at the current iterate. The step is then constructed as


where τ is the largest value such that the step remains within the trust region. The Cauchy step and the Gauss-Newton step are respectively the minimizer of the model along the negative gradient and the solution of the linearized system.
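A minimal sketch of the dogleg construction described above, applied to a toy two-equation system rather than to Eqs. (18)-(20); the trust-radius update rule here is a common textbook choice, not necessarily the authors':

```python
import numpy as np

def dogleg_step(F, J, radius):
    """One trust-region dogleg step for the system F(x) = 0 with Jacobian J."""
    g = J.T @ F                       # gradient of 0.5 * ||F||^2
    B = J.T @ J                       # Gauss-Newton Hessian approximation
    p_gn = np.linalg.solve(B, -g)     # Gauss-Newton step (assumes full rank)
    if np.linalg.norm(p_gn) <= radius:
        return p_gn
    p_c = -(g @ g) / (g @ B @ g) * g  # Cauchy step along -g
    if np.linalg.norm(p_c) >= radius:
        return -radius * g / np.linalg.norm(g)
    # Interpolate between the Cauchy and Gauss-Newton steps to hit the boundary.
    d = p_gn - p_c
    a, b, c = d @ d, 2.0 * (p_c @ d), p_c @ p_c - radius**2
    tau = (-b + np.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)
    return p_c + tau * d

def solve_dogleg(fun, jac, x0, radius=1.0, tol=1e-10, max_iter=100):
    x = np.asarray(x0, float)
    for _ in range(max_iter):
        F, J = fun(x), jac(x)
        if np.linalg.norm(F) < tol:
            break
        p = dogleg_step(F, J, radius)
        # Trust-radius update from actual vs. predicted reduction.
        F_new = fun(x + p)
        pred = np.linalg.norm(F)**2 - np.linalg.norm(F + J @ p)**2
        act = np.linalg.norm(F)**2 - np.linalg.norm(F_new)**2
        rho = act / pred if pred > 0 else -1.0
        if rho > 0.1:
            x = x + p
        radius = radius * 2.0 if rho > 0.75 else (radius * 0.5 if rho < 0.25 else radius)
    return x

# Toy system: a circle of radius 2 intersected with the line x = y.
def fun(x):
    return np.array([x[0]**2 + x[1]**2 - 4.0, x[0] - x[1]])

def jac(x):
    return np.array([[2.0 * x[0], 2.0 * x[1]], [1.0, -1.0]])

sol = solve_dogleg(fun, jac, np.array([1.5, 1.0]))
```

From the initial guess the iteration converges to (sqrt(2), sqrt(2)), the root in the first quadrant.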

STEP 7: We must choose the unique solution from among the spurious ones through a cheirality check. Fig. 4 shows the four possible real solutions in terms of combinations of the LRF frame and the camera frame.

The unique solution is determined as follows: both the LRF and the camera must face the same direction, and the three laser points P1, P2 and P3, after transformation to the camera frame, must be in front of the camera.

Figure 4: The four possible real solutions, shown as combinations of the LRF frame and the camera frame. The unique solution is determined through the cheirality check.
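A sketch of the depth part of the cheirality check (the facing-direction test is omitted, and the candidate poses below are illustrative, not solutions from the method):

```python
import numpy as np

def passes_cheirality(R, t, laser_points):
    """Keep a candidate (R, t) only if every laser point, transformed to
    the camera frame, has positive depth (lies in front of the camera)."""
    return all((R @ p + t)[2] > 0 for p in laser_points)

pts = [np.array([0.3, 0.0, 1.0]), np.array([0.0, 0.0, 1.2])]

R_ok, t_ok = np.eye(3), np.array([0.0, 0.0, 0.5])
# A mirrored candidate: a rotation by pi about X flips the depth axis.
R_bad, t_bad = np.diag([1.0, -1.0, -1.0]), np.array([0.0, 0.0, -0.5])
```

Only the first candidate places both laser points in front of the camera.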

5.2 From Multiple Observations

It is often worthwhile to use multiple observations from different views to suppress noise. The problem in this case is formulated as


where N is the total number of observations. The solution is obtained following the seven steps described in Section 5.1. With the rotation represented by Rodrigues' formula, the calibration result is further refined using the Levenberg-Marquardt (LM) method [21][22], initialized with the accurate value given by our solution of (29).
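A sketch of the refinement stage on synthetic data: the rotation is parameterized by a Rodrigues vector and point-to-plane residuals are minimized with Levenberg-Marquardt via SciPy's `least_squares`. The residual below is a generic point-to-plane cost, not necessarily the paper's exact cost (29):

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

rng = np.random.default_rng(1)

# Synthetic ground truth: Rodrigues vector and translation.
rvec_true = np.array([0.10, -0.20, 0.30])
t_true = np.array([0.05, 0.10, 0.15])
R_true = Rotation.from_rotvec(rvec_true).as_matrix()

# Laser points [x, 0, z] and point-to-plane correspondences consistent
# with the ground truth (so the residual vanishes at the true pose).
pts = np.column_stack([rng.uniform(-0.5, 0.5, 12),
                       np.zeros(12),
                       rng.uniform(0.8, 1.5, 12)])
normals = rng.standard_normal((12, 3))
normals /= np.linalg.norm(normals, axis=1, keepdims=True)
dists = np.einsum('ij,ij->i', normals, pts @ R_true.T + t_true)

def residuals(params):
    R = Rotation.from_rotvec(params[:3]).as_matrix()
    q = pts @ R.T + params[3:]
    return np.einsum('ij,ij->i', normals, q) - dists

# Initialize near the truth (standing in for the analytical solution)
# and refine with LM.
x0 = np.concatenate([rvec_true, t_true]) + 0.01 * rng.standard_normal(6)
sol = least_squares(residuals, x0, method='lm')
```

With noise-free correspondences the refinement recovers the ground-truth pose to high precision.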

6 Details of the Calibration Approach

We explain below how to accurately extract the features required by our method: the four normal vectors n1, ..., n4 and the two distances d1, d2 from the camera, and the three laser edge points P1, P2, P3 from the LRF.

6.1 Data from the LRF

Instead of being concave, our calibration target can be built in a convex shape, as shown in Fig. 5. We can place the calibration target on a planar object (such as a table, a wall, or the ground) so that P1, P2 and P3 are accurately determined by intersections between line segments. Specifically, the laser intersections with the objects are first segmented into four parts (e.g., using the Ramer-Douglas-Peucker (RDP) algorithm [23]). We then fit a line to each segment using the total least squares method [24]. Each of P1, P2 and P3 is then obtained as the intersection of the corresponding pair of adjacent fitted lines.
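The two geometric sub-steps can be sketched as follows (total-least-squares line fitting via SVD and line-line intersection, on noise-free 2D points; the RDP segmentation is assumed already done):

```python
import numpy as np

def fit_line_tls(pts):
    """Total least squares line fit: centroid plus dominant direction (SVD)."""
    c = pts.mean(axis=0)
    _, _, vt = np.linalg.svd(pts - c)
    return c, vt[0]                 # point on line, unit direction

def intersect_lines(c1, d1, c2, d2):
    """Intersection of two 2D lines given in point-direction form."""
    # Solve c1 + s*d1 = c2 + u*d2 for (s, u).
    A = np.column_stack([d1, -d2])
    s, _ = np.linalg.solve(A, c2 - c1)
    return c1 + s * d1

# Noise-free segments sampled from two lines that meet at (1, 2).
ts = np.linspace(0.1, 1.0, 8)
seg1 = np.array([[1.0, 2.0]]) + np.outer(ts, [1.0, 1.0])
seg2 = np.array([[1.0, 2.0]]) + np.outer(ts, [1.0, -2.0])
p = intersect_lines(*fit_line_tls(seg1), *fit_line_tls(seg2))
```

Intersecting the fitted (infinite) lines rather than raw endpoints is what makes the edge points robust to the sparse laser sampling.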

6.2 Data from the Camera

Using our calibration target with two checkerboards, camera calibration is first performed with the MATLAB Toolbox [15]. The two normal vectors n1 and n2 of planes π1 and π2 can be obtained directly from the calibration toolbox. Each checkerboard lies on the Z = 0 plane of its own world frame. For example, given the rotation R1 and translation t1 from the calibration result, which represent the orientation and position of plane π1 w.r.t. the camera, n1 is simply minus the third column of R1, and the distance from the camera to π1 is d1 = n1 · t1. Similarly, n2 and d2 can be calculated.
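In code, with the convention above (board frame Z = 0, extrinsics mapping board to camera coordinates; the pose values below are illustrative), the plane parameters follow directly:

```python
import numpy as np

def board_plane(R_board, t_board):
    """Plane normal and signed camera-to-plane distance from a
    checkerboard's extrinsics (board lies on its own Z = 0 plane)."""
    n = -R_board[:, 2]          # minus the third column of the rotation
    d = n @ t_board             # signed distance from the camera center
    return n, d

# Synthetic board pose: rotation about X, origin about 1 m from the camera.
a = 0.3
R_board = np.array([[1.0, 0.0, 0.0],
                    [0.0, np.cos(a), -np.sin(a)],
                    [0.0, np.sin(a), np.cos(a)]])
t_board = np.array([0.1, -0.2, 1.0])
n, d = board_plane(R_board, t_board)

# Every board point [u, v, 0], transformed to the camera frame, must
# satisfy the plane equation n . X = d.
uv = np.array([[0.0, 0.0], [0.2, 0.1], [-0.1, 0.3]])
X = (R_board @ np.column_stack([uv, np.zeros(3)]).T).T + t_board
```

Take the absolute value of d if a positive distance is needed.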

From the image (see Fig. 5), the three target edges project to image lines and the common vertex projects to an image point. As stated in [13], we perform a weighted optimization, initialized by the LSD line detector [25], to accurately estimate this set of lines. Specifically, only the pixels within a rectangular region around each initial line are considered for the fit. The optimization problem is expressed as


where each region contributes the valid pixels whose gradient magnitudes, used as weights, are above a threshold. Given the intrinsic matrix K, the normal vector of the plane through the camera center and the edge observed as image line l is obtained by normalizing Kᵀl.
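The back-projection step can be sketched as follows (the intrinsics and points are illustrative): the image line through two projected points is their cross product in homogeneous coordinates, and Kᵀl gives the normal of the plane through the camera center and the observed edge:

```python
import numpy as np

K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])

# Two camera-frame points on a 3D target edge, and their projections.
X1, X2 = np.array([0.2, -0.1, 1.5]), np.array([-0.3, 0.2, 2.0])
x1, x2 = K @ X1, K @ X2          # homogeneous image points

l = np.cross(x1, x2)             # image line through both projections
n = K.T @ l
n /= np.linalg.norm(n)           # normal of the plane through the camera
                                 # center and the observed edge
```

Both 3D points lie on that plane, since n · X = lᵀ K X = lᵀ x = 0 for any point projecting onto the line.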

Figure 5: A calibration target built in a convex V-shape is put on the table. Accurate data extraction from both the LRF and the camera is performed by line intersection and optimized line detection.

6.3 Snapshot Selection

The solution from each snapshot should guarantee that the laser points on the two boards lie on the planes π1 and π2 respectively. With multiple observations, this criterion guides the selection of well-conditioned snapshots for further accuracy. Given the solution R and t, we keep a snapshot only if it satisfies

    (1/N1) Σ_{P ∈ S1} | n1 · (R P + t) - d1 | ≤ ε,   (1/N2) Σ_{P ∈ S2} | n2 · (R P + t) - d2 | ≤ ε,

where the laser segments S1 and S2 respectively contain N1 and N2 points, and ε is a distance threshold.
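The selection test can be sketched as a mean point-to-plane residual check (assuming the criterion reconstructed above; the threshold value is illustrative):

```python
import numpy as np

def snapshot_ok(R, t, seg1, seg2, n1, d1, n2, d2, eps=0.005):
    """Keep a snapshot only if the laser points on each board lie close
    to the corresponding plane (mean absolute point-to-plane residual)."""
    r1 = np.abs((seg1 @ R.T + t) @ n1 - d1).mean()
    r2 = np.abs((seg2 @ R.T + t) @ n2 - d2).mean()
    return r1 <= eps and r2 <= eps

# Laser segments that lie exactly on the plane z = 1 in the camera frame
# when R = I and t = 0.
seg = np.array([[x, 0.0, 1.0] for x in np.linspace(-0.2, 0.2, 5)])
n, d = np.array([0.0, 0.0, 1.0]), 1.0
```

A pose error of 2 cm along the plane normal, for example, would push the residual above a 5 mm threshold and reject the snapshot.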

7 Synthetic Experiments

Using synthetic data, we first validate the numerical stability of our solution. We then explore the sensitivity to noise, along with the effect of the angle between the two triangular boards, from a single snapshot. We further present the noise reduction obtained using multiple observations. Finally, we show the accuracy improvement from snapshot selection.

For the simulation setting, the obtuse angle between two triangle boards of the calibration target is set to . Then, we uniformly and randomly generate roll, pitch and yaw in the range of for the orientation of the LRF w.r.t the camera, and three components of the position from 5 to 30 cm. For each instance of the ground truth, we randomly generate the orientation and position of the calibration target within the range of and 50 to 150 cm.

7.1 Numerical Stability

We performed Monte-Carlo trials to validate the numerical stability of our solution in the noise-free case. For each trial, only one snapshot of the calibration target is needed. The error metric is the difference between the ground truth and our solution. The histogram of errors is shown in Fig. 6. Due to randomness, the computational error varies, but the accuracy remains high. This demonstrates that our method correctly solves the LRF-camera calibration problem, and further validates the sufficiency of the constraints from a single snapshot.

Figure 6: The histogram of computational errors for computed rotation and translation in random and noise-free Monte-Carlo trials.

7.2 Sensitivity to Noise

This simulation tests the sensitivity of our solution to noise in the feature data. Two sources of error are taken into account: the laser depth uncertainty along the beam direction and the pixel uncertainty in line detection. Based on the practical setting, we set the standard deviations (STDs) of the laser noise (in mm) and the pixel error (in px) accordingly. A factor varying from 0 to 1 combines the noise from both sensors in one plot, as in [13]. 1000 Monte-Carlo trials are performed, each of which needs only one snapshot. The metrics used are the angular error for rotation and the distance error for translation. Fig. 7 demonstrates that as the noise level increases, our solution is not sensitive to the image noise but has a greater sensitivity to the laser depth noise.
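The exact error formulas are elided in this copy; the conventional choices, which match the description, are the geodesic rotation angle and the Euclidean translation distance. A sketch:

```python
import numpy as np

def angular_error_deg(R_est, R_true):
    """Geodesic distance between two rotation matrices, in degrees."""
    c = (np.trace(R_est.T @ R_true) - 1.0) / 2.0
    return np.degrees(np.arccos(np.clip(c, -1.0, 1.0)))

def distance_error(t_est, t_true):
    """Euclidean distance between two translation estimates."""
    return np.linalg.norm(t_est - t_true)

# A 5-degree rotation about Z should report exactly 5 degrees of error.
a = np.deg2rad(5.0)
Rz = np.array([[np.cos(a), -np.sin(a), 0.0],
               [np.sin(a), np.cos(a), 0.0],
               [0.0, 0.0, 1.0]])
err = angular_error_deg(Rz, np.eye(3))
```

The clip guards against trace values slightly outside [-1, 1] from floating-point noise.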

We further test the sensitivity of our solution to the angle between the two triangular boards of the calibration target, with the laser and pixel noise levels fixed. The board angle is varied over a range of obtuse values. Fig. 8 shows the means and STDs of both the angular and distance errors. Our solution is not sensitive to the board angle; the errors vary only gently over the obtuse range.

Figure 7: Errors in estimated rotation and translation as a function of the laser noise with different levels of the pixel noise. Each point represents the median error for 1000 Monte-Carlo trials. Each line corresponds to a different level of the pixel noise versus the laser noise changed by their own STDs.
Figure 8: Means and STDs of the errors for estimated rotation and translation versus the angle between the two triangular boards of the calibration target. Each point represents the mean error with its STD from 1000 Monte-Carlo trials. The errors vary only gently as long as the board angle is obtuse.

7.3 Noise Reduction

We report a simulation designed to test the noise reduction of our solution when using multiple observations. Here, the laser noise STD varies from 1 mm to 10 mm with the pixel noise STD set to 3 px. At each noise level, the means and STDs of both the rotation and translation errors are calculated from 1000 Monte-Carlo trials. From Fig. 9, we observe that as the number of observations increases, the means and STDs of both errors asymptotically decrease. Moreover, with a small number of snapshots, our solution achieves a highly accurate initial value for further optimization; the translation error, for example, is around 5 mm with only 5 snapshots.

With snapshot selection, additional accuracy is achieved, as shown in Fig. 10 (with the laser and pixel noise levels fixed as above). The distance threshold is set to 5 mm. The rotation and translation errors are further decreased by a sizable margin compared with the solution without any single-snapshot check.

Figure 9: Means and STDs of the errors for estimated rotation and translation versus the number of observations of the calibration target in 1000 Monte-Carlo trials with the laser noise changed by its STD.
Figure 10: Comparison of the means and STDs of the errors for estimated rotation and translation, with the single-view check and without any check, versus the number of observations of the calibration target in 1000 Monte-Carlo trials.

8 Real Experiments

We perform experiments using real data to further validate our method in comparison with two existing methods.

A Hokuyo URG-04LX LRF is rigidly mounted on a stereo rig which consists of a pair of Point Grey Chameleon CMLN-13S2C cameras (see Fig. 1). The sensors are synchronized based on time stamps. The LRF is configured with its horizontal field of view, its angular resolution, and a line scanning frequency of 10 Hz. Its scanning accuracy is specified as an absolute error within the 2 cm to 100 cm range and as a percentage error within the 100 cm to 400 cm range. The cameras are pre-calibrated based on [26]. Prior to data processing, the images are warped to remove the radial and tangential distortions.

8.1 Comparison with Existing Methods

We compare our method to two state-of-the-art algorithms, [5] and [10], using the stereo-rig baseline as ground truth. Specifically, let T represent the transformation (rotation and translation) between two frames. For each method, the LRF is first calibrated w.r.t. both the left and right cameras. We then compute the relative pose (baseline) between the stereo cameras and compare it with the ground truth calibrated from the toolbox [15]. Hence, the error matrix is