Camera Calibration: a USU Implementation

07/31/2003
by   Lili Ma, et al.

The task of camera calibration is to estimate the intrinsic and extrinsic parameters of a camera model. Although some restricted techniques can infer 3-D information about a scene from uncalibrated cameras, effective camera calibration procedures open up the possibility of using a wide range of existing algorithms for 3-D reconstruction and recognition. Applications of camera calibration include vision-based metrology, robust visual platooning, and visual docking of mobile robots, where depth information is important.

1 Introduction

Depending on the kind of calibration object used, calibration methods fall into two main categories: photogrammetric calibration and self-calibration. Photogrammetric calibration refers to methods that observe a calibration object whose geometry in 3-D space is known with very good precision. Self-calibration does not need a 3-D calibration object. Three images of a coplanar object taken by the same camera with fixed intrinsic parameters are sufficient to estimate both the intrinsic and the extrinsic parameters. The obvious advantage of self-calibration is that it is easy to set up; the disadvantage is that it is usually considered unreliable. However, the author of [3] shows that by preceding the algorithm with a very simple normalization (translation and rotation of the coordinates of the matched points), results comparable with the best iterative algorithms are obtained. A four-step calibration procedure is proposed in [4]. The four steps are: linear parameter estimation, nonlinear optimization, correction using circle/ellipse, and image correction. For a simple start, linear parameter estimation and nonlinear optimization are enough. In [5], a plane-based calibration method is described in which the calibration is performed by first determining the absolute conic, a matrix formed from the camera's 5 intrinsic parameters (see Section 4.2). In [5], the skew parameter (which describes the skewness of the two image axes) is assumed to be zero, and the authors observe that only the relative orientation of the planes and the camera matters for singularities, and that planes parallel to each other provide exactly the same information.

Recently, Intel has made its "Camera Calibration Toolbox for Matlab" freely available online [6]. The Intel camera calibration toolbox first finds the feature locations in the input images, which are captured by the camera to be calibrated using a checkerboard calibration object, and then calculates the camera's intrinsic parameters. However, when we used images captured by our desktop camera as input, the detected feature locations contained large errors. We decided not to use Intel's method since its flexibility and accuracy are poor. Therefore, in this report, our work is mainly based on the self-calibration algorithm originally developed by the Microsoft Research group [1, 2], which is commonly regarded as a major contribution to camera calibration. The key feature of Microsoft's calibration method is that the absolute conic is used to estimate the intrinsic parameters and that the skew parameter is taken into account. The technique proposed in [1, 2] only requires the camera to observe a planar pattern at a few (at least 3, if both the intrinsic and the extrinsic parameters are to be estimated uniquely) different orientations. Either the camera or the calibration object can be moved by hand as long as no singularity problem is introduced, and the motion of the calibration object or camera need not be known in advance.

By “flexibility”, we mean that the calibration object is coplanar and easy to set up; by “robustness”, we mean that the extracted feature locations are accurate and that possible singularities due to improper input images can be detected and avoided.

The main contributions in this report are briefly summarized as follows:

  1. A complete code platform implementing Microsoft’s camera calibration algorithm has been built. (Microsoft did not release the feature location pre-processor [1, 2]);

  2. A technical error in Microsoft’s camera calibration equations has been corrected (Equation (6.3));

  3. A new method to effectively find the feature locations of the calibration object has been used in the code. More specifically, a scan line approximation algorithm is proposed to accurately determine the partitions of a given set of points;

  4. A numerical indicator is used to indicate the possible singularities among input images to enhance the robustness in camera calibration (under development and to be included);

  5. The intrinsic parameters of our desktop camera and the ODIS camera have been determined using our code. Our calibrated results have also been cross-validated using Microsoft code;

  6. A new radial distortion model is proposed so that the radial undistortion can be performed analytically with no numerical iteration;

  7. Based on the results of this work, some new application possibilities have been suggested for our mobile robots, such as ODIS.

The rest of the report is organized as follows. First, some notation and preliminaries are given, such as the camera pinhole model, the intrinsic parameters, and the extrinsic parameters. Then, the calibration method proposed in [1, 2] is re-derived with a correction to a minor technical error in [1, 2]. Using this method, calibration results for 3 different cameras are presented. Finally, several issues are proposed for future investigation and possible applications of camera calibration are discussed.

2 Camera Projection Model

To use the information provided by a computer vision system, it is necessary to understand the geometric aspects of the imaging process: the projection from the 3-D world reference frame to the 2-D image plane causes direct depth information to be lost, so that each point on the image plane corresponds to a ray in 3-D space [7]. The most common geometric model of an intensity imaging camera is the perspective or pinhole model (Figure 1). The model consists of the image plane and a 3-D point called the center, or focus, of projection. The distance between the image plane and the center of projection is called the focal length, and the line through the center of projection perpendicular to the image plane is the optical axis. The intersection of the image plane with the optical axis is called the principal point or the image center. As shown in Figure 1, the image of a 3-D point is the point at which the straight line through that point and the center of projection intersects the image plane. The basic perspective projection [8] in the camera frame is

(1)

where the 3-D point is expressed in the camera frame and its projection lies on the image plane. In the camera frame, the third component of an image point is always equal to the focal length. For this reason, we can drop it and describe an image point by its first two coordinates only.

Figure 1: The perspective camera model

3 Aspects in Which Real Cameras Deviate from the Pinhole Model

A real camera deviates from the pinhole model in several aspects. The most significant effect is lens distortion. Because of various constraints in the lens manufacturing process, straight lines in the world imaged through real lenses generally become somewhat curved in the image plane. However, this distortion is almost always radially symmetric and is referred to as radial distortion. Radial distortion that causes the image to bulge toward the center is called barrel distortion, and distortion that causes the image to shrink toward the center is called pincushion distortion (see Figure 2). The center of distortion usually coincides with the image center.

Figure 2: The barrel distortion and the pincushion distortion

The second deviation is the imperfect flatness of the imaging medium. Digital cameras, which have precisely flat and rectilinear imaging arrays, are generally not susceptible to this kind of distortion.

Another deviation is that the imaged rays do not necessarily intersect at a single point, which means there is no mathematically precise principal point, as illustrated in Figure 3. This effect is most noticeable in extreme wide-angle lenses. However, the locus of convergence is almost always small enough to be treated as a point, especially when the objects being imaged are large with respect to the locus of convergence.

Figure 3: Lens camera deviates from the pinhole model in locus of convergence

4 Camera Parameters

 

Definition: Camera Parameters

Camera parameters are the parameters linking the coordinates of points in 3-D space with the coordinates of their corresponding image points. In particular, the extrinsic parameters are the parameters that define the location and orientation of the camera reference frame with respect to the world reference frame and the intrinsic parameters are the parameters necessary to link the pixel coordinates of an image point with the corresponding coordinates in the camera reference frame.  

4.1 Extrinsic Parameters

The extrinsic parameters are defined as any set of geometric parameters that uniquely define the transformation between the world reference frame and the camera frame. A typical choice for describing the transformation is a translation vector together with an orthogonal rotation matrix. According to Euler's rotation theorem, an arbitrary rotation can be described by only three parameters. As a result, the rotation matrix has 3 degrees of freedom and the extrinsic parameters have 6 degrees of freedom in total. Given a rotation matrix as in Equation (2), one method to obtain the 3 parameters that uniquely describe it is to extract the Euler angles [9], such that

(2)
(3)

where

(4)

In the non-degenerate case, the solutions for the Euler angles are

(5)
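For concreteness, the following Python sketch extracts the Euler angles from a rotation matrix. It assumes a Z-Y-X (yaw-pitch-roll) factorization, which may differ from the convention behind Equations (2)-(5); the function name and the handling of the degenerate case are ours.

```python
import numpy as np

def euler_zyx_from_rotation(R, eps=1e-8):
    """Extract Euler angles (phi, theta, psi) from a 3x3 rotation matrix R.

    A minimal sketch assuming R = Rz(phi) @ Ry(theta) @ Rx(psi); the report's
    Equations (2)-(5) may use a different Euler-angle convention.
    """
    theta = np.arcsin(-R[2, 0])               # since R[2, 0] = -sin(theta)
    if abs(np.cos(theta)) > eps:              # non-degenerate case
        phi = np.arctan2(R[1, 0], R[0, 0])    # R[1, 0] / R[0, 0] = tan(phi)
        psi = np.arctan2(R[2, 1], R[2, 2])    # R[2, 1] / R[2, 2] = tan(psi)
    else:
        # Gimbal lock (cos(theta) ~ 0): phi and psi are coupled; fix phi = 0.
        phi = 0.0
        psi = np.arctan2(R[0, 1], R[1, 1])
        if R[2, 0] > 0:                       # theta ~ -pi/2 flips the sign
            psi = -psi
    return phi, theta, psi
```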

4.2 Intrinsic Parameters

The intrinsic parameters are as follows:

  • The focal length:

  • The parameters defining the transformation between the camera frame and the image plane
    Neglecting any geometric distortion and assuming that the CCD array is made of a rectangular grid of photosensitive elements, we have:

    (6)

    with the coordinates, in pixels, of the image center and the effective sizes of the pixel in the horizontal and vertical directions, respectively. The current set of intrinsic parameters thus consists of the focal length, the image-center coordinates, and the effective pixel sizes.

  • The skew parameter describing the skewness of the two image axes:
    The skewness of the two image axes is illustrated in Figure 4.

    Figure 4: Skewness of two image axes
  • The parameters characterizing the radial distortion: the two radial distortion coefficients
    The radial distortion is governed by the following equation [10]:

    (7)

    Two distortion coefficients are usually enough. The relationship between the distorted and the undistorted image points can be approximated by

    (8)

    where the real observed distorted image points are related to the ideal projected undistorted image points. So, up to now, the set of intrinsic parameters consists of the parameters above together with the two distortion coefficients.
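As an illustration, the sketch below applies this two-coefficient model in pixel coordinates, following the common formulation in which the distortion radius is computed from the ideal point expressed in normalized camera coordinates; the function name and array shapes are our assumptions, not part of [1, 2].

```python
import numpy as np

def distort_points(xy_ideal, uv_ideal, u0, v0, k1, k2):
    """Apply the two-coefficient radial distortion model of Equation (8).

    xy_ideal : (N, 2) ideal points in normalized camera coordinates
    uv_ideal : (N, 2) corresponding ideal (undistorted) pixel coordinates
    (u0, v0) : principal point; k1, k2 : radial distortion coefficients
    """
    r2 = np.sum(xy_ideal ** 2, axis=1)            # r^2 = x^2 + y^2
    factor = k1 * r2 + k2 * r2 ** 2               # k1*r^2 + k2*r^4
    uv_distorted = uv_ideal + (uv_ideal - [u0, v0]) * factor[:, None]
    return uv_distorted
```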

4.3 Projection Matrix

With the homogeneous transformation and the camera parameters, we can form a matrix, called the projection matrix, that directly links a point in the 3-D world reference frame to its projection in the image plane. That is:

(9)

where the scaling factor is arbitrary and the intrinsic matrix fully depends on the intrinsic parameters. The calibration method used in this work first estimates the projection matrix and then uses the absolute conic to estimate the intrinsic parameters [1, 2]. From Equations (1) and (6), we have

(10)

From Equation (9), we have

(11)

with a scaling factor. Combining the above two equations gives the first of the desired relations; the second follows in the same manner.
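A small sketch of Equation (9): build the intrinsic matrix from the five intrinsic parameters, transform world points with the extrinsic parameters, and divide by the scale factor. Lens distortion is ignored and the parameter names are ours.

```python
import numpy as np

def intrinsic_matrix(alpha, beta, gamma, u0, v0):
    """3x3 intrinsic matrix: scale factors, skew, and principal point."""
    return np.array([[alpha, gamma, u0],
                     [0.0,   beta,  v0],
                     [0.0,   0.0,   1.0]])

def project(A, R, t, X_world):
    """Project (N, 3) world points to (N, 2) pixel coordinates."""
    X_cam = X_world @ R.T + t          # world frame -> camera frame
    x = X_cam @ A.T                    # apply intrinsics (homogeneous pixels)
    return x[:, :2] / x[:, 2:3]        # divide by the arbitrary scale factor
```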

5 Extraction of Feature Locations

5.1 Calibration Object

The calibration method illustrated here is a self-calibration method, which uses the planar calibration object shown in Figure 5, on which 64 squares are evenly spaced and the side of each square is 1.3 cm.

Figure 5: Current calibration object

The procedure to extract the feature locations of the above calibration object is illustrated in Table 2. The input image is an intensity image. After thresholding it with a certain value (150 in our case), we obtain a binary image. The binary image then goes through a connected-component labeling algorithm [11, 12] that outputs a region map in which each class of connected pixels is given a unique label. For every class in the region map, we need to know whether or not it can be a square box. In our approach, this is done by first detecting the edge of the class and then finding the number of partitions of the edge points. If the number of partitions is not equal to 4, which means the class is not a 4-sided polygon, we bypass it. Otherwise, we fit a line through all the edge points that lie between each two adjacent partition points, obtaining 4 line fits. The final output for the class is the set of intersections of these 4 lines, which approximate the 4 corners of the box. After running through all the classes in the region map, if the number of detected boxes equals the actual number of boxes in the calibration object, we record all the detected corners and arrange them in the same order as the 3-D points in space (for a given calibration object, we know the exact coordinates of the feature points in the world reference frame, and we need to arrange the detected feature points in a fixed order so that, after detecting feature points in the observed images, an algorithm can map each point in the world frame to its corresponding projection in the image plane). After processing several such images, we are fully prepared for the calibration computation.


Threshold the input intensity image (PGM) to make it binary (the threshold is 150)
Find connected components using the 8-connectivity method
Loop for every class in the region map
Select the class whose area lies between 20 and 3000 pixels
Binary edge detection of this class
Find the partitions of the edge points
If the number of partitions = 4
Fit a line between each two adjacent partition points
Output the 4 line intersections
End if
End loop
If the total number of detected boxes = the number of boxes in the calibration object
Arrange the intersections in the same order as the points in the world reference frame
End if
Table 2: Procedures to Extract Feature Locations for One Input Image
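A minimal sketch of the first steps of Table 2 (thresholding, connected-component labeling with 8-connectivity, and area screening) is given below. It uses SciPy's labeling routine rather than the algorithms of [11, 12], assumes dark squares on a light background, and the function name is ours; the area bounds follow Table 2.

```python
import numpy as np
from scipy import ndimage

def find_box_candidates(gray, threshold=150, min_area=20, max_area=3000):
    """Threshold an intensity image and return the candidate square regions."""
    binary = gray < threshold                       # dark squares -> foreground
    # 8-connectivity: every pixel in the 3x3 neighborhood is a neighbor
    labels, num = ndimage.label(binary, structure=np.ones((3, 3), dtype=int))
    candidates = []
    for k in range(1, num + 1):
        mask = labels == k
        if min_area <= mask.sum() <= max_area:      # area screening
            candidates.append(mask)
    return candidates
```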

5.2 Binary Image Edge Detection (Boundary Finding)

A boundary point of an object in a binary image is a point whose 4-neighborhood or 8-neighborhood intersects both the object and its complement. Boundaries of binary images are classified by their connectivity and by whether they lie within the object or its complement. The four classifications are: interior or exterior 8-boundaries and interior or exterior 4-boundaries [13]. In our approach, we use the interior 8-boundary operator, shown in Figure 6, which is denoted as:

(12)

where

  1. the first operand is the input binary image;

  2. the result is the output boundary binary image;

  3. N is the 4-neighborhood;

  4. eroding by N assigns to each pixel the minimum pixel value over its 4-neighborhood.

Figure 6: Objects and their interior 8-boundary
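The interior 8-boundary can equivalently be computed as the object minus its erosion by the 4-neighborhood cross, which is how the sketch below (using SciPy; the function name is ours) implements Equation (12).

```python
import numpy as np
from scipy import ndimage

def interior_8_boundary(binary):
    """Interior 8-boundary of a binary image (Equation (12)).

    A pixel belongs to the boundary if it lies in the object and at least
    one of its 4-neighbors does not, i.e. the object minus its erosion by
    the 4-neighborhood.
    """
    binary = np.asarray(binary, dtype=bool)
    cross = np.array([[0, 1, 0],
                      [1, 1, 1],
                      [0, 1, 0]], dtype=bool)   # 4-neighborhood (plus center)
    eroded = ndimage.binary_erosion(binary, structure=cross)
    return binary & ~eroded
```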

5.3 Partitions of Edge Points

Given a set of points that characterize the boundary of some object, a common question is what shape the object is when we use polygons, especially convex polygons, to represent objects in the real world. The set of points can be the output of range-finding sensors, such as laser or sonar, or it can come from images captured by a camera and preprocessed by an edge detector, which is the case here. In our problem, we know beforehand that the region of interest is a square, and we can use the scan line approximation method [14, 15] to find the number of partitions. The scan line approximation algorithm is described in Table 3. Figure 7 is an illustration.


Problem Definition

Assumption: Object is described using a convex polygon
Given: A set of data points that have already been sorted in a certain order
Find: Partition points

Algorithm

ScanLineApproximation (startindex, endindex, datapoints)
Draw a line connecting the start point and the end point
Calculate the maximum distance from each point lying between startindex and endindex to this line
If the maximum distance is greater than a predefined threshold
Record the index k of the point that gives the maximum distance
ScanLineApproximation (startindex, k, datapoints)
ScanLineApproximation (k, endindex, datapoints)
End if
Table 3: Scan Line Approximation Algorithm [14, 15]

Figure 7: Illustration of scan line approximation algorithm (a) 3 sides (b) 4 sides
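A recursive Python sketch of the algorithm in Table 3 is shown below. The points are assumed to be ordered along the boundary, and the default threshold of 2 pixels is only an illustrative value; the actual choice is discussed next.

```python
import numpy as np

def scan_line_partition(points, start, end, threshold=2.0, partitions=None):
    """Recursive scan line approximation (Table 3).

    points     : (N, 2) edge points sorted along the boundary
    start, end : indices of the current chord endpoints
    Returns the indices of the partition (corner) points found so far.
    """
    if partitions is None:
        partitions = []
    if end - start < 2:
        return partitions
    p0, p1 = points[start].astype(float), points[end].astype(float)
    chord = p1 - p0
    rel = points[start + 1:end] - p0
    # perpendicular distance of each in-between point to the chord
    dist = np.abs(rel[:, 0] * chord[1] - rel[:, 1] * chord[0]) / (np.linalg.norm(chord) + 1e-12)
    if dist.max() > threshold:
        k = start + 1 + int(np.argmax(dist))   # farthest point becomes a partition
        partitions.append(k)
        scan_line_partition(points, start, k, threshold, partitions)
        scan_line_partition(points, k, end, threshold, partitions)
    return partitions
```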

In Table 3, the algorithm is described and implemented recursively. When applying this algorithm, an important issue is how to choose the threshold. Unfortunately, this threshold is application-dependent. In our implementation, we choose a threshold of a few pixels; the smaller the boxes, or the farther the camera is away from the calibration object, the smaller the threshold should be. Figure 8 shows the fitting results using the partitions found by the scan line approximation algorithm, where all the input data are the edge points of classes in the region map. Another issue is how to choose the initial starting and ending points. It is obvious that they cannot be on the same side; otherwise, due to noise in the data, the point whose distance to the line connecting the starting and ending points is maximal might not lie near a corner. That is why we always start near corners, as in Figure 7. This problem can be solved simply by first finding the pair of adjacent points for which the maximal distance from all the other points to the line through them is largest.

Figure 8: Fitting results using partitions found by scan line approximation algorithm

Figure 9 shows an example of the processed images at all steps, where the input images are captured by a desktop camera. Notice that, in the image titled “Binary Image + Partition Points”, the triangular region in the upper right corner does not appear in the next step. The reason is that, after finding partition points, the number of partition points does not equal 4 and we thus bypass this class.

Figure 9: Feature points extraction for desktop images (1)

6 Calibration Method

In this section, the calibration method in [1, 2] is described in detail. This algorithm is a self-calibration method that uses the calibration object shown in Figure 5. It only requires the camera to observe a planar pattern at a few different orientations. Either the camera or the calibration object can be moved by hand, and the motion need not be known. The reason this is feasible is that one image observed by a camera provides 2 constraints on the camera's intrinsic parameters, which are regarded as unchanged here. With 3 images observed by the same camera, 6 constraints are established and we are able to recover the 5 intrinsic parameters. Once the intrinsic parameters are known, we can estimate the extrinsic parameters and the distortion coefficients, and feed all of these initial estimates into a nonlinear optimization routine to obtain the final estimates. Another aspect that makes [1, 2] appealing is that the author provides calibration results and an executable file on the web page [16], along with the sample images. The calibration procedure is illustrated in Table 4.


Linear Parameter Estimation

Estimate Homographies (Section 6.1)
Let n be the number of images that we want to observe
Loop for i from 1 to n
Assume the calibration object lies on the plane Z = 0 of the world reference frame
Estimate the homography between the calibration object and its image
Change the orientation of either the calibration object or the camera
End Loop
Estimate Intrinsic Parameters (Section 6.3)
For each homography we have 2 constraints on the 5 intrinsic parameters
Now we have 2n constraints and we can solve for the 5 intrinsic parameters using SVD
Estimate Extrinsic Parameters (Section 6.4)
Using the estimated intrinsic parameters and homographies, we can estimate the extrinsic parameters
Estimate Distortion Coefficients (Section 6.5)
Using the estimated intrinsic and extrinsic parameters, we can get the ideal projected image points
Along with the real observed image points, we can estimate the two distortion coefficients

Nonlinear Optimization (Section 6.6)

Take all parameters estimated above as an initial guess
Use a nonlinear optimization routine to obtain the final estimated values
Table 4: Camera Calibration Procedures

The idea of assuming the calibration object is always at Z = 0 even after some unknown movement may be bewildering (we are speaking of the case where the camera is kept static and the calibration object is moved). Common sense says that the world reference frame is unique, so how can we assume the calibration object is still at Z = 0 after some rotation and translation? The answer is that, as mentioned before, only the relative position and orientation between the calibration object and the camera matters. Each image provides 2 constraints that are independent of all the others. Equivalently, one can think of the calibration object as static and the camera as moving. The basic calibration equations are given as follows.

6.1 Homography Between the Model Plane and Its Image

Without loss of generality, we assume the calibration object lies on the plane Z = 0 of the world reference frame. Denoting the i-th column of the rotation matrix by ri, we have:

(13)

Therefore a model point in 3-D space is related to its image point by a homography

(14)

where the homography is the product of the intrinsic matrix and the matrix whose columns are r1, r2, and the translation vector. In this way, the homography is defined up to a scaling factor.

Given an image of the calibration object, the homography can be estimated by a maximum likelihood criterion. Let the model points and their image points be given, and assume each image point is corrupted by Gaussian noise with mean 0 and a known covariance matrix. Then the maximum likelihood estimate of the homography is obtained by minimizing

(15)

where the predicted image point, written in terms of the rows of the homography matrix, is

(16)

In practice, we simply assume the same covariance for all points. This is reasonable if the points are extracted independently with the same procedure. For each pair of model and image points, we have

(17)

Collecting the unknown entries of the homography into a single vector, we then have

(18)

When we are given n points, we have 2n of the above equations, which can be written in matrix form as a homogeneous linear system. The coefficient matrix has 2n rows, and the solution is well known to be its right singular vector associated with the smallest singular value.
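A sketch of this linear estimation step, written as the standard direct linear transformation solved with SVD; the row ordering of the coefficient matrix and the final normalization are our choices.

```python
import numpy as np

def estimate_homography(model_pts, image_pts):
    """Estimate the 3x3 homography (up to scale) from N >= 4 correspondences.

    model_pts, image_pts : (N, 2) arrays of model and image points.
    """
    rows = []
    for (X, Y), (u, v) in zip(model_pts, image_pts):
        rows.append([X, Y, 1, 0, 0, 0, -u * X, -u * Y, -u])
        rows.append([0, 0, 0, X, Y, 1, -v * X, -v * Y, -v])
    L = np.asarray(rows, dtype=float)  # 2N x 9 coefficient matrix
    _, _, Vt = np.linalg.svd(L)
    h = Vt[-1]                         # right singular vector, smallest singular value
    return h.reshape(3, 3) / h[-1]     # fix the arbitrary scale (assumes h[-1] != 0)
```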

6.2 Constraints on the Intrinsic Parameters

Given the estimated homography, we have

(19)

with an arbitrary scalar. Using the knowledge that the first two columns of the rotation matrix are orthogonal, we have

(20)

Since these two columns also have the same norm,

(21)

Given a homography, these are the 2 constraints we obtain on the intrinsic parameters.

6.3 Estimation of Intrinsic Parameters

Let

(22)

Note that this matrix is symmetric and can therefore be described by a 6-dimensional vector of its distinct entries. Writing each column of the homography in terms of its three components, we have

(23)

Denote

(24)

the two constraints in Equation (21) become

(25)

If n images of the calibration object are taken, stacking n such pairs of equations yields a homogeneous linear system whose coefficient matrix has 2n rows.

When n ≥ 3, we will in general have a unique solution defined up to a scaling factor. The solution is well known to be the right singular vector of the stacked coefficient matrix associated with the smallest singular value; it determines the symmetric matrix of Equation (22) up to a scaling factor. Once this matrix is estimated, the intrinsic parameters can be extracted from it by

(26)

The original equation used in [1, 2] to estimate one of these parameters contains an obvious mistake, since once all the other 5 parameters are known, that parameter can be estimated directly from the corresponding entry of the estimated matrix. The reason why using the wrong equation still achieves good accuracy might be that the two scale factors along the image axes are usually close to each other.
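The sketch below builds the two constraints per homography in the standard form of Zhang's derivation (the vectors denoted in Equation (24)) and solves the stacked homogeneous system with SVD; the closed-form extraction of the intrinsic parameters in Equation (26) is omitted, and the helper names are ours.

```python
import numpy as np

def _v(H, i, j):
    """Constraint row built from columns i and j of homography H."""
    hi, hj = H[:, i], H[:, j]
    return np.array([hi[0] * hj[0],
                     hi[0] * hj[1] + hi[1] * hj[0],
                     hi[1] * hj[1],
                     hi[2] * hj[0] + hi[0] * hj[2],
                     hi[2] * hj[1] + hi[1] * hj[2],
                     hi[2] * hj[2]])

def solve_b(homographies):
    """Stack the two constraints of Equation (25) per homography and solve."""
    V = []
    for H in homographies:
        V.append(_v(H, 0, 1))                  # orthogonality constraint
        V.append(_v(H, 0, 0) - _v(H, 1, 1))    # equal-norm constraint
    _, _, Vt = np.linalg.svd(np.asarray(V))
    return Vt[-1]                              # defines the symmetric matrix up to scale
```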

6.4 Estimation of Extrinsic Parameters

Once the intrinsic matrix is known, the extrinsic parameters can be estimated as:

(27)

where the scale factor is chosen to normalize the columns. Of course, due to noise in the data, the computed matrix does not in general satisfy the properties of a rotation matrix. One way to estimate the best rotation matrix from a general 3 × 3 matrix Q is to compute the singular value decomposition Q = USV^T (for example with the Matlab function svd); the best rotation matrix is then R = UV^T.
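A sketch of this orthogonalization step; the determinant check, which keeps the result a proper rotation, is an addition of ours.

```python
import numpy as np

def nearest_rotation(Q):
    """Best rotation matrix (Frobenius norm) approximating a general 3x3 Q."""
    U, _, Vt = np.linalg.svd(Q)       # Q = U S V^T
    R = U @ Vt                        # drop the singular values
    if np.linalg.det(R) < 0:          # guard against an improper rotation
        U[:, -1] *= -1
        R = U @ Vt
    return R
```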

6.5 Estimation of Distortion Coefficients

Assuming the center of distortion is the same as the principal point, Equation (8) describes the relationship between the ideal projected undistorted image points and the real observed distorted image points. Given m points in n images, we can stack all the equations together to obtain 2mn equations in a matrix form Dk = d, where k is the vector of the two distortion coefficients. The linear least-squares solution for k is

(28)
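A sketch of this linear least-squares step, assuming the same two-coefficient model as Equation (8) with the radius computed from the ideal points in normalized coordinates; the array layout and function name are ours.

```python
import numpy as np

def estimate_distortion(uv_ideal, uv_observed, xy_ideal, u0, v0):
    """Least-squares estimate of (k1, k2) from ideal and observed pixel points."""
    r2 = np.sum(xy_ideal ** 2, axis=1)
    du = uv_ideal - [u0, v0]
    D = np.empty((2 * len(r2), 2))                # two rows of D per point
    D[0::2, 0] = du[:, 0] * r2
    D[0::2, 1] = du[:, 0] * r2 ** 2
    D[1::2, 0] = du[:, 1] * r2
    D[1::2, 1] = du[:, 1] * r2 ** 2
    d = (uv_observed - uv_ideal).reshape(-1)      # interleaved u, v residuals
    k, *_ = np.linalg.lstsq(D, d, rcond=None)     # k = (D^T D)^-1 D^T d
    return k                                      # [k1, k2]
```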

6.6 Nonlinear Optimization: Complete Maximum Likelihood Estimation

Assuming that the image points are corrupted by independent, identically distributed noise, the maximum likelihood estimate can be obtained by minimizing the following objective function

(29)

where each term is the squared distance between an observed image point and the projection of the corresponding model point computed with the estimated parameters. This is a nonlinear optimization problem that can be solved with the Matlab optimization function fminunc.

In our implementation, one observation is that omitting the initial estimation of the distortion coefficients and simply setting them to 0 gives the same optimization results as starting from a good initial guess. Clearly, a “good” initial guess of the distortion coefficients is not required in practice.
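A sketch of the refinement step using SciPy's Levenberg-Marquardt routine in place of Matlab's fminunc; the residual callback, which must return the reprojection errors of Equation (29) for a given parameter vector, is left to the caller. Consistent with the observation above, the distortion coefficients in the initial guess can simply be set to 0.

```python
from scipy.optimize import least_squares

def refine(params0, reprojection_residuals):
    """Refine all calibration parameters by minimizing Equation (29).

    params0                : initial parameter vector (intrinsics, extrinsics,
                             distortion; distortion entries may start at 0)
    reprojection_residuals : callable p -> flat array of (observed - projected)
                             image-point differences
    """
    result = least_squares(reprojection_residuals, params0, method="lm")
    return result.x
```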

7 Calibration of Different Cameras

In this section, some calibration results are presented using the images provided in [16]. Images captured by a desktop camera and by the ODIS camera are also used. For each camera, images are captured and the feature locations are extracted from each image. Here, we always use 5 images for calibration. Using a different number of images is also feasible; in practice, we found that 5 images are sufficient for camera calibration.

7.1 Code Validation Using Images in [1, 2]

In [1, 2], the calibration images are posted on the web page [16]. We use the reported results to validate our implementation with the same calibration images.

7.1.1 Plot of the Observed and the Projected Image Points - Microsoft Images

Figure 10 shows the observed and the projected image points using the images provided in [1, 2].

Figure 10: Plot of the observed and the projected image points - Microsoft images (blue dots: the real observed image points; red dots: the projected image points using the estimated camera parameters)

7.1.2 Comparison of Calibration Results - Microsoft Images

The parameters before and after nonlinear optimization, along with the Microsoft calibration results obtained by executing the executable file posted on the web page [16], are shown in Table 5. Comparing these final calibration results, one finds slight differences between some of the parameters. However, as can be seen in Section 7.1.3, the objective functions from these two calibration codes are very close.


Parameter               Our Implementation          Microsoft
                        Before Opt.    After Opt.   After Opt.
Scale factor (u axis)   871.4450       832.5010     832.5
Skew                    0.2419         0.2046       0.2045
Image center u0         300.7676       303.9584     303.959
Scale factor (v axis)   871.1251       832.5309     832.53
Image center v0         220.8684       206.5879     206.585
Distortion k1           0.1371         -0.2286      -0.2286
Distortion k2           -2.0101        0.1903       0.1903
Table 5: Comparison of Calibration Results - Microsoft Images

7.1.3 Objective Function - Microsoft Images

Table 6 shows the comparison of the final values of the objective function defined in Equation (29) after nonlinear optimization between the Microsoft result and our implementation. The results are very close, so we can conclude that our code is correct for the Microsoft images. In what follows, we present two more groups of calibration results, for our desktop camera and for the ODIS camera in our center. As in the case of the Microsoft images, we compare the results in the same way for further validation.


Microsoft Our Code
144.8799 144.8802
Table 6: Objective Function - Microsoft Images
Remark 1

The options used with the Matlab function fminunc were not recorded, so slightly different results can be obtained with different options.

7.1.4 Nonlinear Optimization Iterations - Microsoft Images

Table 7 shows the nonlinear optimization iterations, where the initial guess for all parameters is the estimate obtained in Section 6. The nonlinear optimization uses the Matlab function fminunc. From this table, we can see that after 52 iterations the value of the objective function defined in Equation (29) drops from 1055.89 at the first iteration to its final value of 144.88.


Iteration   Function Count   f(x)   Step-size   Directional Derivative
1 37 1055.89 0.001 -5.05e+009
2 78 1032.26 9.36421e-009 -7.97e+004
3 120 915.579 1.85567e-007 -1.13e+004
4 161 863.19 1.1597e-008 -2.78e+005
5 202 860.131 1.77145e-008 -2.45e+004
6 244 836.386 1.37495e-007 -4.83e+003
7 285 820.765 1.58388e-008 -1.13e+005
8 327 816.13 1.86391e-007 -2e+003
9 368 800.842 3.79421e-008 -4.12e+004
10 410 788.888 3.68321e-007 -3.29e+003
11 452 769.459 1.47794e-006 -1.38e+003
12 493 738.541 1.13935e-006 3.2e+003
13 535 692.716 3.81991e-006 606
14 576 674.548 1.0489e-006 360
15 618 631.838 3.81559e-006 -1.26e+003
16 659 616.973 1.91963e-006 -1.79e+003
17 700 604.857 2.96745e-006 -724
18 742 593.179 6.19011e-006 -980
19 783 573.66 3.68278e-006 -4.67e+003
20 824 544.497 4.69657e-006 -2.28e+003
21 865 537.636 6.60037e-006 -15.1
22 906 530.468 3.62979e-006 -103
23 947 525.032 2.35736e-006 -85.4
24 989 523.091 4.80959e-006 22.9
25 1031 509.698 2.98894e-005 -320
26 1072 505.972 2.3338e-006 -2.54e+003
27 1114 499.005 6.5114e-005 -61.8
28 1155 493.817 4.78348e-006 -587
29 1197 468.823 0.000433333 -162
30 1238 378.238 0.000463618 -924
31 1279 280.68 0.00024013 -1.45e+003
32 1320 230.465 0.000148949 -990
33 1361 223.328 0.00017819 31.5
34 1402 221.525 0.0649328 10.6
35 1444 218.413 0.179589 -0.000716
36 1486 191.642 0.577805 -0.521
37 1528 176.735 1.68909 0.003
38 1570 173.092 1.62593 8.26e-005
39 1612 169.211 1.38363 -0.0033
40 1654 157.984 2.9687 0.00291
41 1696 146.471 1.87697 -0.0366
42 1737 145.015 0.894999 0.00491
43 1778 144.911 0.869282 -0.000345
44 1820 144.893 1.56986 0.000205
45 1862 144.882 1.50158 5.98e-005
46 1903 144.88 1.31445 -0.00126
47 1946 144.88 0.0356798 0.000261
48 1989 144.88 1.06404 -9.25e-006
49 2030 144.88 0.771663 1.23e-005
50 2071 144.88 0.123009 -0.000482
51 2109 144.88 0.0615043 -0.000114
52 2147 144.88 -0.0307522 -1.18e-005
Table 7: Nonlinear Optimization Iterations - Microsoft Images

7.2 Calibration of a Desktop Camera

This section shows the calibration results for a desktop camera.

7.2.1 Extracted Corners in the Observed Images - The Desktop Camera Case

Figure 11 shows the extracted feature points in the observed images captured by the desktop camera. The extracted corners are marked by a cross and the dot in the center of each box is just to test if the detected boxes are in the same order as the 3-D points in the world reference frame. Due to the low accuracy of this camera, the extracted feature points deviate a lot from their “true positions”, as “sensed” or “perceived” by our human observers.

Figure 11: Extracted corners in the observed images captured by the desktop camera

7.2.2 Plot of the Observed and the Projected Image Points - The Desktop Camera Case

Figure 12 shows the observed and the projected image points captured by the desktop camera. For descriptions, please refer to Figure 10.

Figure 12: Plot of the observed and the projected image points - the desktop camera case

7.2.3 Comparison of Calibration Results - The Desktop Camera Case

Table 8 shows the calibration results of our implementation and of the Microsoft executable file.


Parameter               Our Implementation          Microsoft
                        Before Opt.    After Opt.   After Opt.
Scale factor (u axis)   350.066701     277.1457     277.145
Skew                    1.693062       -0.5730      -0.573223
Image center u0         200.051398     153.9923     153.989
Scale factor (v axis)   342.500985     270.5592     270.558
Image center v0         100.396596     119.8090     119.812
Distortion k1           0.096819       -0.3435      -0.343527
Distortion k2           -0.722239      0.1232       0.123163
Table 8: Comparison of Calibration Results - The Desktop Camera Case

7.2.4 Objective Function - The Desktop Camera Case

From Table 9, we can see that the final objective function values obtained by our implementation and by the Microsoft code are almost identical.


Microsoft Our Code
778.9763 778.9768
Table 9: Objective Function - The Desktop Camera Case

7.2.5 Nonlinear Optimization Iterations - The Desktop Camera Case

For data format and descriptions, please refer to Table 7.


Iteration   Function Count   f(x)   Step-size   Directional Derivative
1 37 7077.09 0.001 -6.46e+009
2 79 6943.03 4.15059e-008 1.29e+005
3 121 6821.48 2.25034e-007 -3.08e+003
4 162 6760.09 5.93771e-008 -2.22e+004
5 204 6747.99 1.11643e-007 -2.72e+003
6 246 6574.48 8.41756e-007 -2.97e+004
7 287 6543.07 4.72854e-008 -2.6e+004
8 329 6428.87 6.11914e-007 -1.47e+003
9 370 6386.18 5.19775e-008 -1.8e+004
10 412 6108.9 1.57962e-006 -1.37e+005
11 453 5961.18 2.7321e-006 -940
12 495 5886.28 1.25318e-005 -5.55e+003
13 536 5825.17 1.33725e-005 -1.96e+003
14 577 5784.52 1.59682e-005 -710
15 619 5727.74 6.05741e-005 -192
16 661 5519.12 0.000129424 -292
17 702 4413.34 0.000150092 -4.65e+005
18 743 4214.43 3.64671e-005 -4.58e+004
19 784 4106.31 4.97942e-005 -426
20 825 4062.22 3.43799e-005 -546
21 867 3974.48 0.000114609 -298
22 908 3879.17 9.30444e-005 413
23 949 3808.83 8.52215e-005 -158
24 990 3766.27 5.46297e-005 -865
25 1032 3691.29 0.000156467 -263
26 1073 3622.83 0.000149539 -364
27 1114 3607.38 0.000119243 4.45
28 1156 3591.08 0.000379983 -6.78
29 1197 3575.83 0.000240085 -21.2
30 1239 3556 0.000489406 -2.82
31 1280 3549.92 0.000163356 -23.3
32 1322 3420.37 0.00606838 -44.4
33 1363 3285.3 0.00265288 -305
34 1404 3268.6 0.00371299 -3.06
35 1446 2911.69 0.0421166 -284
36 1487 2502.19 0.0301324 -809
37 1529 1810.42 0.841983 -2.73
38 1570 1590.51 0.863446 -1.84
39 1611 1457.62 0.317624 -10.9
40 1652 1224.1 0.681949 -73.5
41 1693 1147.71 0.316197 -32.5
42 1734 1061.67 0.479219 -0.771
43 1775 990.417 0.423682 -0.0632
44 1816 944.745 0.4477 -0.602
45 1857 892.944 0.711193 -0.207
46 1898 871.868 0.552265 -0.523
47 1939 855.089 0.463739 0.0449
48 1981 832.165 1.1442 -0.16
49 2022 818.818 0.517951 -0.0946
50 2063 811.471 0.697572 -0.00811
51 2104 806.001 0.58951 0.00154
52 2146 799.658 0.894743 -0.00199
53 2187 794.968 0.97607 -0.0104
54 2228 789.198 0.923713 -0.00576
55 2269 786.548 0.653634 -0.00545
56 2310 783.866 0.706569 -0.00184
57 2351 781.579 0.908965 -0.000913
58 2392 780.375 0.925928 -0.00159
59 2433 779.786 0.642352 7.13e-005
60 2474 779.452 0.783045 -9.29e-005
61 2515 779.317 1.00864 -0.000402
62 2556 779.175 1.36328 -4e-005
63 2597 779.07 1.16283 0.000299
64 2638 779.024 0.799815 0.000659
65 2680 779.004 1.26311 -0.000271
66 2722 778.987 1.75016 -1.44e-005
67 2763 778.979 1.3062 -6.31e-006
68 2804 778.978 0.956557 -1.84e-005
69 2846 778.977 1.36853 -2.32e-006
70 2887 778.977 1.191 1.01e-006
71 2928 778.977 0.736495 -4.04e-006
72 2969 778.977 0.498588 -0.000232
73 3007 778.977 0.249294 -1.46e-005
Table 10: Nonlinear Optimization Iterations - The Desktop Camera Case

7.3 Calibration of the ODIS Camera

Now we want to calibrate the ODIS camera, a 1/4-inch color board camera with a 2.9 mm lens [17].

7.3.1 Extracted Corners in the Observed Images - The ODIS Camera Case

Figure 13 shows the extracted feature points in the observed images captured by the ODIS camera.

Figure 13: Extracted corners in observed images captured by the ODIS camera

7.3.2 Plot of the Observed and the Projected Image Points - The ODIS Camera Case

Figure 14 shows the observed and the projected image points captured by the ODIS camera. For descriptions, please refer to Figure 10.

Figure 14: Plot of the observed and the projected image points - the ODIS camera case

7.3.3 Comparison of Calibration Results - The ODIS Camera Case

(See Table 11)


Parameter               Our Implementation          Microsoft
                        Before Opt.    After Opt.   After Opt.
Scale factor (u axis)   320.249458     260.7636     260.764
Skew                    10.454189      -0.2739      -0.273923
Image center u0         164.735845     140.0564     140.056
Scale factor (v axis)   306.001053     255.1465     255.147
Image center v0         85.252209      113.1723     113.173
Distortion k1           0.071494       -0.3554      -0.355429
Distortion k2           -0.342866      0.1633       0.163272
Table 11: Comparison of Calibration Results - The ODIS Camera Case

7.3.4 Objective Function - The ODIS Camera Case

(See Table 12)


Microsoft Our Code
840.2665 840.2650
Table 12: Objective Function - The ODIS Camera Case

7.3.5 Nonlinear Optimization Iterations - The ODIS Camera Case

(See Table 13)


Iteration   Function Count   f(x)   Step-size   Directional Derivative
1 37 5872.79 0.001 -9.99e+008
2 78 5849.4 4.67898e-008 -1.93e+005
3 120 5841.17 1.15027e-007 -2.18e+003
4 162 5734.46 7.19555e-007 -3.04e+003
5 203 5703.36 1.27443e-007 -6.05e+003
6 245 5524.51 4.61815e-007 -1.11e+004
7 286 5505.25 7.93522e-008 -7.52e+003
8 328 5286.49 9.13589e-007 -3.24e+004
9 369 5252.79 4.27308e-007 -1.41e+003
10 410 5147.55 2.0105e-007 -5.05e+004
11 452 5123.77 1.07891e-006 -278
12 494 5111.44 8.41391e-006 -59.2
13 535 5085.05 4.14977e-006 -301
14 577 5033.1 0.00012499 -49.6
15 618 5017.1 8.89033e-006 -4.93e+003
16 660 4875.16 3.23114e-005 -3.2e+003
17 702 4638.53 9.16416e-005 -2e+004
18 743 4569.17 2.25799e-005 -9.29e+003
19 785 4502.31 7.32695e-005 -277
20 827 4446.65 0.000227162 -86.5
21 868 4199.75 0.000361625 -9.57e+003
22 909 4140.98 6.2771e-005 -2.72e+003
23 950 4052.51 8.45911e-005 -374
24 991 3989.78 0.000138648 -44.9
25 1032 3907.17 0.000104626 907
26 1074 3856.12 0.000173797 -298
27 1116 3779.25 0.000346633 -21.2
28 1157 3645.96 0.000344504 -692
29 1198 3619.6 0.000148309 -20.3
30 1240 3539.96 0.000609224 -67.3
31 1281 3204.79 0.000461207 -8.13e+003
32 1322 3173.84 9.8575e-005 -2.55e+003
33 1364 3039.27 0.00127777 -1.33e+003
34 1406 2869.31 0.0301385 -28.1
35 1447 2805.64 0.00227993 -3.43e+003
36 1488 2252.13 0.0840027 -2.31e+003
37 1529 1774.1 0.405937 -46.5
38 1570 1438.04 0.390156 -0.399
39 1612 1242.37 0.864944 -0.572
40 1653 1166.62 0.702619 0.178
41 1694 1124.36 0.492462 -0.158
42 1735 1030.76 1.09544 -1
43 1776 957.955 0.812872 -3
44 1817 905.386 0.837528 -0.884
45 1858 877.988 0.5546 -0.0476
46 1899 856.077 0.793661 -0.164
47 1940 847.3 1.14172 -0.00252
48 1981 843.999 0.434118 -0.0237
49 2022 842.438 0.780044 -0.0019
50 2063 841.821 0.826689 4.11e-006
51 2105 841.286 1.55349 -0.000208
52 2146 840.828 1.651 -0.000259
53 2187 840.559 0.904055 -0.000108
54 2228 840.414 0.636094 0.000147
55 2270 840.334 1.07033 -4.16e-005
56 2311 840.311 1.03604 -6.22e-005
57 2353 840.29 1.88839 -3.81e-005
58 2394 840.274 1.85123 -1.93e-005
59 2435 840.269 1.53297 -6.25e-006
60 2476 840.266 1.20127 -9.11e-006
61 2517 840.265 1.23825 -8.04e-006
62 2558 840.265