1 Introduction
The microlens array (MLA) based light field cameras, including the conventional light field camera [1] and the focused light field camera [2], can capture the radiance of light rays in both spatial and angular dimensions, i.e., the 4D light field [3, 4]. The data from a light field camera are equivalent to narrow-baseline images from traditional cameras with coplanar projection centers. Measuring the same point from multiple directions enables or strengthens applications in computational photography and computer vision, such as digital refocusing [5], depth estimation [6], segmentation [7], and so on. Recent work has also proposed methods for light field registration [8] and stitching [9, 10] to expand the field of view (FOV). To support these applications, it is crucial to calibrate light field cameras accurately and establish the exact relationship between the ray space and the 3D scene. Building a model that describes the ray sampling pattern of light field cameras plays an important role here. Previous approaches have dealt with imaging models for light field cameras of different optical designs [11, 12, 13, 14, 15]. Their common point is that the microlens is regarded as a pinhole and the main lens is described as a thin lens. However, several open issues remain in these models and methods. Firstly, the proposed models focus on the angular and spatial information of rays, but the relationship between the light field and 3D scene geometry is not explored. Secondly, very little prior work has considered a generic model to describe light field cameras with different image formations [1, 2]. Thirdly, the intrinsic parameters of existing light field camera models are either redundant or incomplete, so the corresponding solutions are neither effective nor efficient.
In this paper, we first propose a multi-projection-center (MPC) model based on the two-parallel-plane (TPP) representation [3, 4]. We then deduce the transformations between 3D scene geometry and 4D light rays. Based on the geometric transformations in the MPC model, we characterize various light field cameras in a generic 6-intrinsic-parameter model and present an effective intrinsic parameter estimation algorithm. Experimental results on both virtual (simulated data) and physical (Lytro, Illum, and a self-assembled focused) light field cameras verify the effectiveness and efficiency of our model.
Our main contributions are threefold:
(1) We deduce the transformations that describe the relationship between the light field and the scene structure.
(2) We describe light field cameras with different image formations by a generic 6-parameter model without redundancy.
(3) We propose an effective intrinsic parameter estimation algorithm for light field cameras, including a closed-form linear solution and a nonlinear optimization.
The remainder of the paper is organized as follows. Section 2 summarizes related work on light field camera models and calibration methods. Section 3 introduces our MPC model and the transformations between the 3D structure and the 4D light field; based on this theory of light field parameterization, a generic 6-intrinsic-parameter light field camera model is proposed. Section 4 provides the details of our calibration method and analyzes the computational complexity of the closed-form solution. In Section 5, we present extensive results on simulated and real-scene light fields, demonstrating more accurate intrinsic parameter estimation than previous work [11, 13].
2 Related Work
To acquire the 4D light field, various imaging systems have been developed from traditional cameras. Wilburn et al. [16] present a camera array to obtain light fields with high spatial and angular resolutions, calibrated with the classic approach [17]. More generally, in the traditional multi-view geometry framework, multiple cameras in different poses are defined as a set of unconstrained rays, known as the Generalized Camera Model (GCM) [18]; the ambiguity of the reconstructed scene is a classic topic [19]. However, applications of the camera array are limited by its high cost and complex control. In contrast, the MLA enables a single camera to record the 4D light field more conveniently and efficiently, although the baseline and spatial resolution are smaller than those of a camera array. Compared with a camera array, the multiple projection centers of an MLA-based light field camera are strictly aligned on a plane by physical design. Recent work is devoted to intrinsic parameter calibration of light field cameras of two designs [1, 2], which differ considerably in the image pattern of the microlenses.
The main difference between light field cameras is the relative position of the main lens's imaging plane and the MLA plane [20]. It determines the distribution of rays from the same point, which in turn affects how subapertures are extracted from the raw image, i.e., from the microlens images [21, 22]. Nevertheless, measurements of the same point from multiple directions are obtained in both types of light field cameras, equivalent to the data of the GCM. Therefore, the light field camera model can draw on classic multi-view geometry theory.
Recently, several state-of-the-art methods have proposed models for the conventional light field camera, in which multiple viewpoints or subapertures are convenient to synthesize. Dansereau et al. [11] present a model to decode pixels into rays for a Lytro camera, where a 12-free-parameter transformation matrix relates pixels to a reference plane outside the camera (in the nonlinear optimization, 10 intrinsic parameters and 5 distortion coefficients are finally estimated). However, their calibration method, which reuses a traditional camera calibration algorithm, is not effective, and the decoding matrix contains redundant parameters. Bok et al. [13] formulate a geometric projection model consisting of a main lens and an MLA (their extended work has been published in IEEE TPAMI [23]); intrinsic parameters are estimated directly from the raw images, and an analytical solution is deduced. Moreover, Thomason et al. [15] address the misalignment of the MLA and estimate its position and orientation.
Apart from this, other researchers have explored models for the focused light field camera, in which multiple projections of the same point are convenient to recognize. Johannsen et al. [12] propose to calibrate the intrinsic parameters of the focused light field camera: by reconstructing 3D points from the parallax in adjacent microlens images, the parameters (including depth distortion) are estimated. However, their camera model assumes that the geometric center of each micro image lies on its microlens's optical axis. This assumption causes inaccuracy in the reconstructed points, and the estimated results are finally compensated by the coefficients of depth distortion. Hahne et al. [24] further discuss the influence of this assumption, i.e., the deviation between a microlens and its image. Heinze et al. [25] apply a model similar to that of Johannsen et al. [12] and deduce a linear initialization of the intrinsic parameters.
In short, previous light field camera models are either redundant or complex, which leads to a non-unique solution of the intrinsic parameter estimation or to inaccuracy in decoding the light field. An unreliable camera model is also a bottleneck that may impede light field applications in computer vision and computational photography, especially light field registration, stitching, and enhancement. To support further applications, a general light field camera model that represents rays and scene geometry more concisely is in urgent need.
3 Multi-Projection-Center Model
In this section, we first propose the MPC model based on the TPP representation of the light field. We then deduce the transformation matrices that relate 3D scene geometry and 4D rays. Finally, we utilize the MPC model to describe the image formation of both conventional and focused light field cameras and define generic intrinsic parameters. Table I gives the notation used in the following sections.
Table I. Notation (the original symbols were lost in extraction).
- Indexed pixel of the raw image inside the camera
- Virtual (conjugate) light field outside the camera
- Decoded physical light field
- Intrinsic parameters
- 3D point in the world coordinates
- 3D point reconstructed from the indexed pixels
- 3D point reconstructed from the decoded light field
- Rotation matrix of the extrinsic parameters
- Translation vector of the extrinsic parameters
- Measurement matrix of rays
- Homogeneous projection matrix
- Non-homogeneous projection matrix partitioned from the homogeneous one
- Homography matrix decided by intrinsic and extrinsic parameters only
- Distortion vector
3.1 The Coordinates of the MPC Model
As shown in Fig. 1, there are three coordinate systems in the MPC model: the 3D world coordinates, the 3D camera coordinates, and the 4D TPP coordinates (one plane serving as the view plane and the other as the image plane). In general, the world and camera coordinates are related by the extrinsic parameters. In the traditional TPP representation, the spacing between the two parallel planes is normalized to describe a set of rays [3, 4]. Although this is complete and concise, to derive the transformation between 3D structure and 4D rays in light field cameras we prefer a model whose two parallel planes have a free (non-normalized) spacing.
In the MPC model, the light field is parameterized with a non-normalized plane spacing. A ray is parameterized by its two plane intersections: the intersection with the view plane is the projection center, and the intersection with the image plane is the corresponding projection.
Given a projection center (i.e., a view or subaperture) and a 3D point, we obtain the image projection in the local coordinates of that view,
(1) 
Since there are multiple projection centers, the 3D point can be observed multiple times. Obviously, when there is only one projection center on the view plane (with the plane spacing playing the role of the focal length), the image formation degenerates into the traditional central-projective camera model [19].
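To make the multi-projection-center sampling concrete, the following sketch projects a 3D point into a grid of views. All names and conventions are our illustrative assumptions (not the paper's stripped notation): the view plane lies at z = 0, the image plane at spacing D, and (u, v) are local per-view image coordinates.

```python
def project_mpc(point, centers, D):
    """Project a 3D point into each view of an assumed MPC model.

    Assumptions (ours, for illustration): the view plane is z = 0, the
    image plane sits at distance D, and (u, v) are local image
    coordinates of the view whose projection center is (i, j, 0).
    """
    x, y, z = point
    projections = []
    for i, j in centers:
        # Similar triangles through the projection center (i, j, 0):
        u = D * (x - i) / z
        v = D * (y - j) / z
        projections.append((u, v))
    return projections

# A point observed from a 3x3 grid of projection centers:
grid = [(i, j) for i in (-1.0, 0.0, 1.0) for j in (-1.0, 0.0, 1.0)]
views = project_mpc((0.2, -0.1, 5.0), grid, D=1.0)
```

With a single center at the origin, the formula reduces to the central-projective pinhole model, matching the degenerate case described above.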
3.2 Transformation between Geometry and Rays
It is known that rays from one point in different directions enable 3D reconstruction. Let the rays intersect at a point in 3D space; we can then relate the rays and the 3D point by triangulation,
(2) 
where the matrix consists of the rays and the MPC plane spacing.
If two rays originate from one 3D point, they can be represented in the following two equivalent forms,
(3) 
and
(4) 
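The triangulation above can be sketched as a small least-squares problem. The parameterization x = s + (u/D)z, y = t + (v/D)z used here is our assumption about the stripped equations, consistent with the projection sketch in Sec. 3.1; the function name is illustrative.

```python
import numpy as np

def triangulate_two_rays(ray1, ray2, D):
    """Triangulate a 3D point from two rays in a two-parallel-plane
    form (s, t, u, v) with plane spacing D, where (s, t) is the
    projection center and (u, v) the local image projection.
    Assumed model: x = s + (u / D) * z and y = t + (v / D) * z.
    """
    rows, rhs = [], []
    for s, t, u, v in (ray1, ray2):
        rows.append([1.0, 0.0, -u / D]); rhs.append(s)
        rows.append([0.0, 1.0, -v / D]); rhs.append(t)
    point, *_ = np.linalg.lstsq(np.asarray(rows), np.asarray(rhs), rcond=None)
    return point  # (x, y, z)

# Two rays of the point (1, 2, 4) seen from centers (0, 0) and (1, 0), D = 2:
p = triangulate_two_rays((0.0, 0.0, 0.5, 1.0), (1.0, 0.0, 0.0, 1.0), 2.0)
```

With more than two rays, further rows are simply stacked into the same system, which is how the over-determined multi-view case is handled.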
3.3 3D Projective Transformation
In fact, a linear transformation of the light field coordinates causes a 3D projective distortion of the reconstructed point [19], as deduced from Eqs. (3) and (4). As shown in Fig. 2, we consider three examples of linear transformations: changing the plane spacing, scaling in the image plane (in general there are four scaling factors, two for the view plane and two for the image plane), and translation in the image plane of a specific view (in general, translation applies to both planes). The details are derived as follows.

(1) If the plane spacing is changed, the imaging point passed by each ray changes, and so does the intersection of the rays. Substituting this into Eqs. (3) and (4), we have
(5) 
where the points are expressed in homogeneous coordinates.
(2) If the image-plane coordinates are translated by an offset, the rays are transformed accordingly. Substituting this into Eqs. (3) and (4), we obtain the transformation matrix between the original and the transformed reconstructions,
(6) 
(3) If the coordinates are scaled by a scaling vector, the corresponding transformation matrix between the original and the transformed reconstructions is,
(7) 
and
(8) 
As shown in the leftmost part of Fig. 2, a scene with a Lambertian cube is recorded by an MPC model; the observations of the cube from multiple directions form a 4D light field. If the coordinates are linearly transformed while the light intensity is kept constant, the intersections of the rays are transformed by a 3D projective matrix. The cube is therefore distorted according to the transformation parameters (the three right-hand parts of Fig. 2).
3.4 The MPC Model in Light Field Cameras
Light field cameras evolved from traditional cameras, and they record the real-world scene in different but similar ways. In traditional cameras, the central projection onto a 2D image is a dimension reduction of 3D space [19]. In a light field camera, the 3D structure projected by the main lens is arranged on the image sensor according to the design of the light path. The processes of these multiple central projections are analyzed as follows.
On the one hand, for a conventional light field camera, the sampling pattern of the light field is shown in Fig. 3. Each pixel of a subaperture image is extracted from a microlens image: the subaperture image of a given view gathers the pixels at the same position in the local microlens image coordinates, as shown in Fig. 3. Obviously, there are two light fields, i.e., one inside the camera and one in the outer world. Considering the projection of the main lens, there is a 3D projective distortion between the 3D points reconstructed from the two.
On the other hand, for the focused light field cameras, two sampling patterns of the light field in two different optical paths are shown in Fig. 4. The microlenses project the distorted 3D scene inside the camera onto the image sensor, where the image extent is controlled by the aperture of the main lens and the distances between components. The light field inside the camera can be decoded from the pixels of the image sensor and the corresponding optical centers of the microlenses, which are determined by the layout of the MLA, as shown in Fig. 5b. Through the coordinate transformations discussed in Sec. 3.3, the outside light field is obtained, i.e., the conjugate MPC coordinates outside the camera. The real-world scene can then be reconstructed from this light field without projective distortion.
The indexed pixels of a light field camera form a set of indices rather than a physical light field. In a conventional light field camera, the indexed pixels are the subaperture images indexed by view; in a focused light field camera, they are the microlens images indexed by their relative positions on the raw image. Obviously, a linear transformation of the indexed pixels eliminates the 3D projective distortion caused by the main lens. However, to parameterize the 4D light field without redundancy, the spacing of the two parallel planes should be 1. According to Eqs. (5) to (8), this normalization is a linear operation on the coordinates, and the corresponding transformation matrices are all identity matrices. This means that indexed pixels can be transformed into physical rays in the real-world scene by the linear transformations discussed above. The indexed pixels and the decoded physical light fields of the two camera designs are shown in Fig. 5, where pixels and physical rays are related by intrinsic parameters.
In summary, we can transform an indexed pixel of the raw image into a ray of the normalized physical light field by a decoding matrix consisting of the intrinsic parameters,
(9) 
Let two 3D points be reconstructed from the indexed pixels and from the decoded light field, respectively. According to Eq. (9), their relationship is
(10) 
where the transformation is determined by the intrinsic parameters in the decoding matrix, i.e., entirely by the mapping from indexed pixels to real-world light rays.
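As a concrete sketch of the decoding step of Eq. (9): the matrix below maps an indexed pixel (i, j, u, v, 1) to a ray of the normalized light field. The six parameter names (ki, kj for view-plane scales, ku, kv for image-plane scales, u0, v0 for principal-point offsets) and the exact placement of the offsets are our assumptions, since the paper's published notation was stripped from this extraction.

```python
import numpy as np

def decoding_matrix(ki, kj, ku, kv, u0, v0):
    """Assemble a hypothetical 6-parameter decoding matrix in the
    spirit of Eq. (9): diagonal scales plus image-plane offsets."""
    return np.array([
        [ki, 0.0, 0.0, 0.0, 0.0],
        [0.0, kj, 0.0, 0.0, 0.0],
        [0.0, 0.0, ku, 0.0, -u0],
        [0.0, 0.0, 0.0, kv, -v0],
        [0.0, 0.0, 0.0, 0.0, 1.0],
    ])

def decode(pixel_index, D_mat):
    """Transform one indexed pixel into a physical ray (s, t, u, v)."""
    i, j, u, v = pixel_index
    return (D_mat @ np.array([i, j, u, v, 1.0]))[:4]

# Example with parameter magnitudes of the order reported in Sec. 5:
D_mat = decoding_matrix(2.4e-4, 2.5e-4, 2.0e-3, 1.9e-3, 0.32, 0.33)
ray = decode((10.0, 20.0, 100.0, 200.0), D_mat)
```

Because the matrix is affine in each coordinate with no cross terms between the view and image planes, it contains exactly six free parameters, matching the non-redundant model argued for above.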
In addition, the light field inside a conventional light field camera (Fig. 3) can also be parameterized by an MPC model consisting of the image sensor and the MLA. However, considering the convenience of extracting subaperture images and the difficulty of detecting points on the raw image of a conventional light field camera, we prefer to treat its data as a set of subaperture images. Conversely, for the focused camera, we place the parameterization plane at the raw image plane and process the raw image directly.
4 Light Field Camera Calibration
We verify our light field camera model by intrinsic parameter calibration. Below we provide the details of how to solve for the intrinsic parameters, including a linear closed-form solution and a nonlinear optimization that minimizes the reprojection error. In our method, the prior scene points are provided by a planar calibration board in different poses.
4.1 Linear Initialization
After the necessary preprocessing, the microlens images are recognized [11, 21, 26]. We assume that a prior 3D point in the world coordinates is related to the corresponding 3D point in the MPC coordinates by a rigid motion with rotation and translation. The relationship among the measurements, the extrinsic parameters, and the intrinsic parameters is obtained from Eqs. (2) and (10).
(11) 
where the measurement matrix consists of rays derived from the indexed pixels, as in Eq. (1).
Suppose that the calibration board lies on the plane z = 0 of the world coordinates, so the corresponding column of the rotation matrix drops out. To solve the unknown parameters, we simplify Eq. (11) as,
(12) 
where the left-hand matrix is obtained by stacking rows of the measurement matrix, a direct (Kronecker) product operator is used, and the homography matrix consists only of intrinsic and extrinsic parameters, defined as
(13) 
In addition, the measurement matrix must contain at least two rays from the light field, according to Eq. (2). By stacking the measurements of at least three non-collinear points, the homography can be estimated from Eq. (12).
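The stacked system of Eq. (12) is a homogeneous linear least-squares problem, which is conventionally solved with the singular value decomposition. The sketch below is generic and not tied to the paper's exact matrix layout; the function name is ours.

```python
import numpy as np

def solve_homogeneous(A):
    """Least-squares solution of A h = 0 subject to ||h|| = 1: the
    right singular vector associated with the smallest singular value.
    This is the standard recipe for estimating a homography from a
    stacked linear system such as Eq. (12)."""
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1]

# Example: the null space of this 2x3 system is spanned by (1, 1, 1).
A = np.array([[1.0, 0.0, -1.0],
              [0.0, 1.0, -1.0]])
h = solve_homogeneous(A)
```

The returned vector is defined only up to sign and scale, which is consistent with the scale ambiguity of the homography noted later in this section.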
To derive the intrinsic parameters from the homography, we partition it to extract an upper triangular matrix. Denoting its elements by row and column indices, we rewrite Eq. (13) as follows,
(14) 
where the submatrix is the top-left block of the homography.
Utilizing the orthonormality of the column vectors of the rotation matrix, we have
(15)  
Let a symmetric matrix denote the product of the upper triangular matrix and its transpose. Its analytical form is
(16) 
Note that there are only 5 distinct nonzero elements in the symmetric matrix. To solve for them, we rewrite Eq. (15) as follows,
(17) 
By stacking at least two such equations as Eq. (17) (from two poses), we can obtain a unique nonzero solution, defined up to an unknown scale factor.
Once the symmetric matrix is determined, it is an easy matter to recover the upper triangular factor using Cholesky factorization [27]. Denoting the elements of the estimated factor by row and column indices, all but two of the intrinsic parameters are estimated from ratios of these elements,
(18)  
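The Cholesky step above can be sketched as follows. We assume the symmetric matrix has the form B = U U^T with U upper triangular; since numpy's `cholesky` returns a lower factor, we conjugate by the reversal permutation. Because B is known only up to scale, the recovered factor inherits that scale, which is why the intrinsics are taken from ratios of its elements as in Eq. (18).

```python
import numpy as np

def upper_triangular_factor(B):
    """Recover the upper-triangular U of a symmetric positive definite
    B = U U^T via Cholesky. np.linalg.cholesky yields a *lower* factor,
    so we use the reversal permutation P: P B P = (P U P)(P U P)^T,
    where P U P is lower triangular."""
    P = np.eye(B.shape[0])[::-1]
    L = np.linalg.cholesky(P @ B @ P)
    return P @ L @ P  # back-permuted: upper triangular

# Example: a known upper-triangular U is recovered from B = U U^T.
U = np.array([[2.0, 1.0],
              [0.0, 3.0]])
U_hat = upper_triangular_factor(U @ U.T)
```

The factor is unique once the diagonal is constrained to be positive, which matches the uniqueness claim of the linear initialization.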
Apart from the intrinsic parameters, the extrinsic parameters of the different poses can be extracted as follows,
(19)  
where a vector norm is used, and a sign factor taking the value 1 or -1 is decided by the image formation: one sign holds for the conventional light field camera and the focused one with the shorter light path (as shown in Figs. 3 and 4b), and the opposite sign for the focused light field camera with the longer light path (see Fig. 4a).
To obtain the other two intrinsic parameters, we substitute the results of Eq. (19) into Eq. (11) and use the estimated extrinsic parameters. Then Eq. (2) is rewritten as,
(20) 
Stacking the measurements over different poses, we obtain a unique nonzero solution for the two remaining parameters.
4.2 Nonlinear Optimization
The most common distortion of a traditional camera is radial distortion. In a light field camera, the optical properties of the main lens and the physical machining errors of the MLA may distort the rays. Theoretically, due to the two-level imaging design with a main lens and a microlens array, radial distortion on the image plane and sampling distortion on the view plane should exist simultaneously. In this paper, we only consider the distortion on the image plane and omit the sampling distortion on the view plane (i.e., the angular sampling grid is assumed ideal without distortion). The distortion is modeled as,
(21) 
where the ray is transformed from the measurement by the intrinsic parameters according to Eq. (9), and the undistorted projection is computed from the distorted one in the local image coordinates of the view. In the distortion vector, two coefficients regulate radial distortion on the image plane, and the other two represent the distortion of the image plane affected by the sampling view, which is caused by non-paraxial rays through the main lens.
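A hedged stand-in for the radial part of Eq. (21) is sketched below. It keeps only the two radial coefficients and omits the two view-dependent terms described above; the coefficient names (k1, k2) and the direction of the mapping are our own assumptions.

```python
def rectify_image_point(u, v, k1, k2):
    """Map a distorted image-plane point to its rectified counterpart
    with a polynomial radial model (assumed form, not the paper's
    exact Eq. (21)): scale = 1 + k1*r^2 + k2*r^4."""
    r2 = u * u + v * v
    scale = 1.0 + k1 * r2 + k2 * r2 * r2
    return u * scale, v * scale
```

With k1 = k2 = 0 the mapping is the identity, i.e., a distortion-free camera; that limit makes the model easy to initialize from the linear solution of Sec. 4.1.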
We minimize the following cost function, initialized with the solution of Section 4.1, to refine the parameters, including the intrinsic parameters, the distortion vector, and the extrinsic parameters of all poses.
(22) 
where the measured image point is decoded according to Eq. (9) and rectified according to Eq. (21), and the predicted point is the projection of the 3D world point according to Eq. (1).
In Eq. (22), the rotation is parameterized by the Rodrigues formula [28]. In addition, the Jacobian matrix of the cost function is simple and sparse. This nonlinear minimization problem can be solved with the Levenberg-Marquardt algorithm based on the trust region method [29]. We adopt MATLAB's built-in solver to complete the optimization.
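The refinement of Eq. (22) can be sketched with a minimal Levenberg-Marquardt loop. The paper itself uses MATLAB's trust-region implementation, so the simple damping schedule, the toy 3-parameter reprojection cost (a focal-like scale and a principal point standing in for the full intrinsic, distortion, and extrinsic set), and all names below are illustrative assumptions.

```python
import numpy as np

def levenberg_marquardt(residual_fn, x0, n_iter=60):
    """Minimal LM loop with a forward-difference Jacobian; real
    solvers add trust-region logic, this toy only adapts damping."""
    x = np.asarray(x0, dtype=float)
    lam = 1e-3
    for _ in range(n_iter):
        r = residual_fn(x)
        J = np.empty((r.size, x.size))
        for k in range(x.size):            # numerical Jacobian
            dx = np.zeros_like(x)
            dx[k] = 1e-6
            J[:, k] = (residual_fn(x + dx) - r) / 1e-6
        step = np.linalg.solve(J.T @ J + lam * np.eye(x.size), -J.T @ r)
        if np.linalg.norm(residual_fn(x + step)) < np.linalg.norm(r):
            x, lam = x + step, lam * 0.5   # accept step, relax damping
        else:
            lam *= 10.0                    # reject step, damp harder
    return x

# Toy reprojection cost on synthetic noiseless observations.
pts = np.array([[0.2, -0.1, 5.0], [0.4, 0.3, 6.0], [-0.3, 0.2, 4.0],
                [0.1, 0.4, 7.0], [-0.2, -0.3, 5.5]])
true_k, true_u0, true_v0 = 800.0, 320.0, 240.0
obs = np.array([[true_k * x / z + true_u0, true_k * y / z + true_v0]
                for x, y, z in pts])

def residuals(p):
    k, u0, v0 = p
    pred = np.array([[k * x / z + u0, k * y / z + v0] for x, y, z in pts])
    return (pred - obs).ravel()

est = levenberg_marquardt(residuals, [500.0, 300.0, 200.0])
```

On this noiseless toy problem the loop recovers the generating parameters; in the real calibration, the residual stacks all corners of all poses as in Eq. (22).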
4.3 Computational Complexity
The calibration algorithm for light field cameras is summarized in Alg. 1. Consider the sampling number on the view plane, the number of prior points on the calibration board, and the number of poses. For the measurements of each pose, a set of linear equations is solved to estimate the homography; further linear equations are then solved to obtain the intrinsic parameters. The main complexity is spent on the solution of the homographies of the different poses.
By contrast, the algorithm of Dansereau et al. [11] calculates a homography for every subaperture image in every view, so it suffers from higher complexity and lower accuracy in parameter initialization. The algorithm of Bok et al. [13] solves a linear equation for every pose; however, both intrinsic and extrinsic parameters appear in those equations, which causes inaccuracy in the solution.
5 Experimental Results
In this section, we verify our light field camera model by the calibration of intrinsic parameters. We present various experimental results on both simulated and real datasets. The performance is analyzed by comparison with the ground truth or with the baseline algorithms [11] and [13].
5.1 Simulated Data
In this subsection, we verify our calibration method on simulated data. The simulated light field camera has the intrinsic parameters listed in Table II, following Eq. (9). These parameters are close to the settings of a Lytro camera, so that we obtain plausible input close to real-world scenarios. The calibration target is a checkerboard pattern of corner points on regular cells.
5.1.1 Performance w.r.t. the Number of Poses and Views
Firstly, we test the performance with respect to the number of poses and the number of views. We vary the number of poses from 2 to 8 and sweep the number of views over several grid sizes. For each combination of poses and views, 200 trials with independently generated calibration board poses are run. The rotation angles are randomly drawn from a fixed range, and all measurements are perturbed with Gaussian noise of zero mean and a standard deviation of 0.5 pixels.
The calibration results with increasing measurements are shown in Fig. 6. We find that the relative errors decrease as the number of poses increases. When the number of poses is greater than 2, all the relative errors are within an acceptable level, as summarized in Table III. Meanwhile, for a fixed number of poses, the errors shrink as the number of views grows; for the larger numbers of poses and views, all the relative errors become very small. Furthermore, the standard deviations of the relative errors of Fig. 6 are shown in Fig. 7, from which we can see that the standard deviations decrease significantly when the number of poses is greater than 2 and remain stably at a low level for larger numbers of poses and views. The results in Fig. 6 and Fig. 7 verify the effectiveness of the proposed calibration algorithm.
Table III. Ground-truth parameter values and the minimum/maximum relative errors.

Ground truth  2.4000e-04  2.5000e-04  2.0000e-03  1.9000e-03  0.3200  0.3300
Min           0.0842      0.0795      0.1019      0.1020      0.1633  0.1295
Max           2.0376      1.9238      0.6871      0.6881      1.0511  0.9298
5.1.2 Performance w.r.t. the Measurement Noise
Secondly, we employ the measurements of 3 poses and a fixed grid of views to verify the robustness of the calibration algorithm. Gaussian noise with zero mean and a varying standard deviation is added to the projected image points, stepping the noise level over a range of values. For each noise level, we perform 150 independent trials. The mean results compared with the ground truth are shown in Fig. 8 and demonstrate that the errors increase almost linearly with the noise level. Even for noise larger than is typical in practical calibration, the errors of most parameters remain small; although one parameter exhibits a larger relative error, its absolute error is below one pixel (it corresponds to the principal point of subaperture imaging in Eq. (9)), which further shows that the proposed algorithm is robust to higher noise levels.
5.2 Physical camera
We also verify the calibration method on real-scene light fields captured by conventional and focused light field cameras. For the conventional light field camera, we use a Lytro and an Illum to obtain measurements. For the focused one, we use a self-assembled camera built according to the optical design of Fig. 4a.
5.2.1 Conventional Light Field Camera
The subaperture images are obtained by the method of Dansereau et al. [11]. We compare the proposed method in terms of ray reprojection error with state-of-the-art methods, including DPW by Dansereau et al. [11] and BJW by Bok et al. [13].
Firstly, we carry out calibration on the datasets collected by [11]. For each pose, the middle subapertures are utilized, as in DPW. Table IV summarizes the root mean square (RMS) ray reprojection errors of our method and DPW [11]. In Table IV, the errors of DPW[11]^1 are taken directly from the paper, while the errors of DPW[11]^2 are obtained by running their latest released code. For the initial solution, the proposed method provides a smaller ray reprojection error than DPW except on datasets A and B. The result on dataset A is worse because of bad corner extraction in several poses (i.e., the 7th to 10th light fields). For the optimized solution, compared with DPW [11], which employs 12 intrinsic parameters, the proposed MPC model employs only half as many parameters yet achieves similar performance in ray reprojection error (the results on datasets A, B and D are better, while those on datasets C and E are worse). Light fields within each dataset are taken over a range of depths and orientations, as shown in Fig. 9; datasets A and B cover larger ranges than datasets C and D, while dataset E covers the smallest range. Large ranges are reasonable in all datasets and only reduce the accuracy in light of the distortion model that considers the shifted view, which is the main reason why the performance of the proposed method is worse than that of DPW on dataset E. From dataset A, we select the 6 light fields whose corners are exactly extracted for the proposed method; the ray reprojection error then decreases markedly, as shown in Table IV. Considering that the errors reported by DPW are the quantity minimized in its own optimization (i.e., the ray reprojection error), we additionally evaluate the performance in terms of the mean reprojection error for DPW and BJW. As exhibited in Table V, the errors of the proposed method are clearly smaller than those of DPW and BJW.
In addition, calibration with fewer poses is conducted on the datasets of [11]. For dataset D, we randomly select 6 light fields; for datasets B, C and E, 5 light fields are randomly selected. Table VI summarizes the RMS ray reprojection errors and RMS reprojection errors of the proposed method and DPW [11]; the proposed method achieves smaller errors than DPW. Besides, the calibration results on datasets D and E are clearly improved by reducing the number of poses: a smaller range of poses contributes to this improvement, as shown in Fig. 9. Table VII lists the intrinsic parameter estimation results. The reprojection errors of the subaperture images of dataset B are summarized in Table VIII, where the distribution of errors is almost homogeneous. All results verify the effectiveness of the proposed method.
Table IV. RMS ray reprojection errors on the datasets of [11].

                         A        A(6)     B        C        D        E
Initial    DPW[11]^1     3.2000   -        5.0600   8.6300   5.9200   13.8000
           DPW[11]^2     0.5190   0.4229   0.5403   0.8832   1.1021   5.9567
           Ours          15.3753  0.5400   0.5952   0.5837   0.7473   2.6235
Optimized  DPW[11]^1     0.0835   -        0.0628   0.1060   0.1050   0.3630
           DPW[11]^2     0.0822   0.0903   0.0598   0.1300   0.1149   0.3843
           Ours          0.0810   0.0810   0.0572   0.1123   0.1046   0.5390
Table V. Mean reprojection errors on the datasets of [11].

           A        A(6)     B        C        D        E
DPW[11]    0.2284   0.3338   0.1582   0.1948   0.1674   0.3360
BJW[23]    0.3736   -        0.2589   -        -        0.2742
Ours       0.2200   0.2375   0.1568   0.1752   0.1475   0.2731
Table VI. Results with fewer poses.

        Ray reprojection error    Reprojection error
        DPW[11]   Ours            DPW[11]   Ours
B(5)    0.0643    0.0622          0.2380    0.1458
C(5)    0.1260    0.1250          0.2323    0.1705
D(6)    0.0941    0.0622          0.2024    0.1458
E(5)    0.2967    0.2888          0.3525    0.2049
Table VII. Intrinsic parameter estimation results on the datasets of [11].

A           B           C           D           E
2.6998e-04  2.7937e-04  2.4569e-04  2.6833e-04  2.3004e-04
2.7608e-04  2.8874e-04  2.5359e-04  2.6930e-04  2.3073e-04
1.8572e-03  1.8357e-03  1.8122e-03  1.8342e-03  1.7585e-03
1.8692e-03  1.8323e-03  1.8133e-03  1.8352e-03  1.7634e-03
0.3417      0.3415      0.3550      0.3343      0.3520
0.3449      0.3344      0.3382      0.3275      0.3615
0.2288      0.1829      0.1639      0.1719      0.1612
0.0928      0.0875      0.0174      0.0213      0.0483
4.5308      3.6330      3.3591      3.5122      2.7747
4.4428      3.6064      3.3394      3.4662      2.8320
Table VIII. Reprojection errors of the subaperture images of dataset B (views indexed relative to the center).

      -3       -2       -1        0        1        2        3
-3    0.1930   0.1820   0.1781   0.1759   0.1759   0.1812   0.2372
-2    0.1836   0.1763   0.1700   0.1687   0.1718   0.1786   0.1813
-1    0.1815   0.1724   0.1669   0.1658   0.1692   0.1761   0.1826
 0    0.1783   0.1731   0.1683   0.1662   0.1713   0.1798   0.1897
 1    0.1772   0.1733   0.1706   0.1705   0.1748   0.1837   0.1851
 2    0.1769   0.1761   0.1757   0.1768   0.1809   0.1836   0.1815
 3    0.2039   0.1746   0.1728   0.1730   0.1798   0.1833   0.2755
Table IX. RMS ray reprojection errors on our datasets at three calibration stages.

                                Illum1   Illum2   Lytro1   Lytro2
Initial            DPW[11]      0.9355   0.6274   0.6201   0.5057
                   BJW[13]      1.0765   0.8330   1.6676   1.0201
                   Ours         0.7104   0.4899   0.3538   0.2364
Optimized without  DPW[11]      0.5909   0.4866   0.1711   0.1287
rectification      BJW[13]      -        -        -        -
                   Ours         0.5654   0.4139   0.1703   0.1316
Optimized with     DPW[11]      0.2461   0.2497   0.1459   0.1228
rectification      BJW[13]      0.3966   0.3199   0.4411   0.2673
                   Ours         0.1404   0.0936   0.1400   0.1124
Table X. Intrinsic parameter estimation results on our datasets.

Illum1      Illum2      Lytro1      Lytro2
3.5721e-04  2.2464e-04  5.9386e-04  3.8915e-04
3.5455e-04  2.3299e-04  5.7870e-04  3.8247e-04
1.4309e-03  1.6670e-03  9.5083e-04  1.3195e-03
1.4303e-03  1.6657e-03  9.4794e-04  1.3261e-03
0.4565      0.5178      0.1964      0.2775
0.2827      0.3557      0.1865      0.2521
0.3001      0.3562      0.4559      0.0254
0.2779      0.2595      6.8221      0.8469
1.4109      0.6185      1.3060      2.2441
1.4204      0.8879      1.3234      2.2684
Table XI. Reprojection errors of subaperture images (views indexed relative to the center).

      -5       -3       -1        0        1        3        5
-5    0.7880   0.2997   0.3008   0.3041   0.3058   0.3178   0.8070
-3    0.2992   0.3003   0.3033   0.3025   0.3015   0.2930   0.3077
-1    0.2988   0.3086   0.3176   0.3182   0.3141   0.2996   0.2827
 0    0.2942   0.3139   0.3115   0.3058   0.3064   0.3024   0.2772
 1    0.2934   0.3178   0.3118   0.2963   0.3077   0.3057   0.2784
 3    0.3002   0.2966   0.3170   0.3093   0.3115   0.2856   0.2843
 5    0.3283   0.2961   0.2851   0.2854   0.2841   0.2852   0.3102
Unlike the core idea of DPW, BJW directly utilizes the raw data instead of subapertures. However, it imposes a stricter requirement on the acquisition of the calibration board: the data for calibration must be unfocused in order to make the measurements detectable, so some datasets provided by DPW are incalculable for BJW, as shown in Table V (i.e., datasets C and D). In order to compare directly with DPW and BJW, we collect four additional datasets (http://www.npucvpg.org/opensource) using Lytro and Illum cameras. The dataset Illum1 includes 9 poses, Lytro1 includes 8 poses, and Illum2 and Lytro2 each include 10 poses of a checkerboard target; the middle views of each camera are used for calibration. Table IX summarizes the RMS ray reprojection errors of the three calibration stages compared with DPW and BJW. As exhibited in Table IX, the proposed method obtains smaller ray reprojection errors for the initial solution, which verifies the effectiveness of the linear initialization of both intrinsic and extrinsic parameters. Besides, the proposed method provides similar or even smaller ray reprojection errors after optimization without rectification compared with DPW. The result on dataset Lytro2 is relatively larger than that of DPW, mainly because two distortion coefficients in our model play a role similar to elements of the decoding matrix in [11]. Considering that the MPC model employs fewer parameters (6 versus 12 for DPW), the proposed method is competitive with acceptable calibration performance. More importantly, we achieve smaller ray reprojection errors once distortion rectification is introduced into the optimization; the errors are encouraging in that the proposed method outperforms both DPW and BJW.
Consequently, the 6-parameter MPC model and the 4-parameter distortion model are effective representations of light field cameras.
The reason why we compare ray reprojection errors here is to eliminate differences between camera models. The decoding matrix in [11] is similar to Eq. (9), except for its non-diagonal elements; those nonzero entries indicate that pixels on the same subaperture image have specific relationships among different views. If we compute the estimated rays from that matrix for a given measurement, the views may differ, which indicates that there are errors on both the view plane and the image plane in [11]. As a result, it is not reasonable to compare the reprojection error only.
Moreover, the results of intrinsic parameter estimation and pose estimation on our datasets are shown in Table X and Fig. 10, respectively. After calibration, we measure the RMS reprojection errors of the subaperture images using the estimated parameters, as shown in Table XI. To further verify the accuracy of the intrinsic and extrinsic parameter estimation, we stitch all other light fields onto the first pose, as shown in Fig. 11, from which we can see that the light fields of all views are registered and stitched very well. Finally, it is worth noting that there is distinct distortion in the Illum camera. In Fig. 12, we show the original central-view subaperture image and the rectification results using the distortion models of the proposed method, DPW [11] and BJW [13]. Since the reprojection error measures the image distance between a projected point and a rectified one, it can be used to quantify the quality of the distortion rectification. In Fig. 12, we list the RMS reprojection error of the central-view subaperture image for each method in parentheses, which further verifies that the rectification results of the proposed method are better than those of the baseline algorithms.
High-precision calibration is essential in the early stages of the light field processing pipeline. To verify the accuracy of the geometric reconstruction of the proposed method compared with the baseline methods, we capture two real-scene light fields, reconstruct several typical corner points, and estimate the distances between them. Fig. 13 shows the reconstruction results on the central-view subaperture images. As exhibited in Fig. 13, the distances between points reconstructed by the proposed method are nearly equal to the lengths measured on the real objects with rulers. In addition, Table