Creating Realistic Ground Truth Data for the Evaluation of Calibration Methods for Plenoptic and Conventional Cameras

Camera calibration methods usually consist of capturing images of known calibration patterns and using the detected correspondences to optimize the parameters of the assumed camera model. A meaningful evaluation of these methods relies on the availability of realistic synthetic data. In previous works concerned with conventional cameras the synthetic data was mainly created by rendering perfect images with a pinhole camera and subsequently adding distortions and aberrations to the renderings and correspondences according to the assumed camera model. This method can bias the evaluation since not every camera perfectly complies with an assumed model. Furthermore, in the field of plenoptic camera calibration there is no synthetic ground truth data available at all. We address these problems by proposing a method based on backward ray tracing to create realistic ground truth data that can be used for an unbiased evaluation of calibration methods for both types of cameras.

1 Introduction

The most commonly used camera calibration procedure consists of three steps: i) capturing images of calibration patterns, ii) detection of the patterns, i.e., points of interest in the images belonging to the calibration pattern, and iii) using the correspondences to optimize the parameters of the assumed mathematical camera model. In order to evaluate single parts of this pipeline, synthetic calibration pattern renderings with known correspondences can be beneficial in two ways. Firstly, the quality of the pattern detection method can be assessed by comparing the detector results on the renderings to the ground truth positions, and secondly, the optimization as well as the camera model can be evaluated using the ground truth correspondences without depending on a possibly biased pattern detector. However, the validity of such an evaluation depends on the quality of the ground truth data, i.e., its ability to reflect real data.
For conventional cameras the synthetic image generation is usually done by first rendering the calibration pattern from the perspective of a simple pinhole camera model, so that the correspondences are easy to calculate. The images and correspondences are then distorted according to the assumed camera model (see [24, 6, 13]). This procedure poses the problem that the generated data does not reflect a real camera, but a virtual camera perfectly complying with the assumed distortion model. Accordingly, in a comparative evaluation of different calibration algorithms, those methods assuming the exact same camera model have an advantage. Another problem is posed by the modeling of de-focus and image degradation effects like vignetting. In previous works these are either not modeled at all or simulated by adding Gaussian noise and blur to the perfectly distorted images. In neither case are the resulting images directly comparable to real data, which results from a significantly more complex image formation process.
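To make the criticized standard pipeline concrete, the following minimal Python sketch generates correspondences in the spirit of the approach described above: project the pattern with an ideal pinhole model and then distort the ideal points. The intrinsic parameters and the Brown-Conrady radial distortion are hypothetical stand-ins for whatever camera model the respective work assumes.

```python
import numpy as np

def project_pinhole(points_3d, K, R, t):
    """Project 3D pattern points with an ideal pinhole camera."""
    p_cam = (R @ points_3d.T + t.reshape(3, 1)).T        # world -> camera coordinates
    p_norm = p_cam[:, :2] / p_cam[:, 2:3]                 # perspective division
    return (K[:2, :2] @ p_norm.T + K[:2, 2:3]).T          # apply intrinsics

def distort_radial(points_px, K, k1, k2):
    """Apply a Brown-Conrady radial distortion to ideal pixel coordinates."""
    p_norm = (points_px - K[:2, 2]) / np.diag(K)[:2]      # back to normalized coordinates
    r2 = np.sum(p_norm**2, axis=1, keepdims=True)
    p_dist = p_norm * (1.0 + k1 * r2 + k2 * r2**2)        # radial distortion factor
    return p_dist * np.diag(K)[:2] + K[:2, 2]

# Hypothetical checkerboard corners on the z=0 plane and hypothetical camera parameters.
corners = np.array([[x, y, 0.0] for y in range(6) for x in range(9)], dtype=float) * 0.03
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
R, t = np.eye(3), np.array([0.0, 0.0, 0.5])

ideal = project_pinhole(corners, K, R, t)
ground_truth = distort_radial(ideal, K, k1=-0.2, k2=0.05)  # "ground truth" = distorted ideal points
```

The bias discussed above stems from exactly this construction: the distorted points are, by definition, perfectly consistent with the chosen distortion model.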

Figure 1: Schematics of a plenoptic camera as introduced by Adelson and Wang [1] and Lumsdaine and Georgiev [14] based on the ideas of Lippmann [11]. The red and green cones indicate the areas visible from the corresponding pixels.

In the case of plenoptic cameras this imaging process is even more complicated since an additional microlens array (MLA) is placed between the main lens and the image sensor (compare Fig. 1). This renders the standard pipeline of creating perfect images and adding distortions and degradation afterwards infeasible since distortions of the main lens affect the position and angle of a ray hitting the MLA and therefore have to be applied before the light rays enter the microlenses.
We propose a pipeline to generate realistic renderings of calibration patterns with ground truth correspondences, i.e., the positions of the calibration patterns’ points of interest in the form of 2D pixel coordinates as well as 3D world coordinates. The key idea in generating these ground truth positions is to use ray tracing not only to render a realistic image of a calibration pattern, but also to render a position image whose pixels store positional information about the scene points hit by the rays traced from the respective pixel. Since the pose of the calibration pattern and thereby the 3D positions of its points of interest are exactly known, the search for the ground truth positions is reduced to simply finding the pixel positions with the correct positional information in the rendering. Since a straightforward implementation of this idea is computationally expensive, we also propose a second method in which a ray in scene space is calculated for every pixel and then intersected with the calibration pattern model to find the ground truth point positions. In summary, our contributions are:

  • An extension of the plenoptic camera model of [15] to include multiple microlens types

  • Two methods for calculating the ground truth positions of the points of interest in the rendered images

  • Publicly available implementations of the simulation and ground truth creation methods (https://gitlab.com/ungetym/plenoptic_ground_truth_creator)

Note that, while the descriptions throughout this work focus on plenoptic cameras, the whole pipeline is directly applicable to conventional cameras. We simply choose to describe the method for plenoptic cameras since these present the more complex case and the research in this area is in greater need of ground truth data, as the standard approach of distorting perfect renderings is not applicable here.

2 Related Work

Camera simulation in computer graphics: The idea of using more realistic, physically-based lens models instead of a perfect pinhole camera for synthesizing images via ray tracing was first explored by Potmesil and Chakravarty [19] and later refined by Kolb et al. [8] and Wu et al. [21]. These models were further extended by Wu et al. [22] regarding the use of spectral ray tracing to simulate certain wave optics effects.
In contrast to these advanced methods for conventional cameras, the simulation of plenoptic cameras is a less explored area. This type of camera has only gained interest during the past decade due to the emergence of the first prototypes by Ng et al. [16] and the commercial realizations by Lytro (no longer existing) and Raytrix [20]. Despite becoming a more active field of research, the simulation of plenoptic cameras in most publications concerned with synthetic images is rather rudimentary. Fleischmann and Koch [4] render images without any main lens, and Zhang et al. [23] as well as Liang and Ramamoorthi [10] use a simplified thin main lens model. Accordingly, the synthesized images do not show the distortion and image degradation effects present in real data. Further works by Liu et al. [12] and Li et al. [9] based on ray splitting require an unrealistically large distance between the camera and the scene objects as well as simple scene materials. More recently, Michels et al. [15] proposed to fully model a plenoptic camera’s components and presented an implementation for Blender [2]. Although wave optics effects and multiple microlens types are not simulated in this approach, we base our method on it due to its availability and extensibility.
Evaluation of pattern detectors: While previous works on the calibration of plenoptic cameras either use manually labeled data [17] or evaluate the detection and calibration as a combined system relying on precise real-life measurements [7, 3], approaches for conventional cameras have been evaluated with synthetic data for a broad variety of different patterns over the past decades. Lucchese and Mitra [13] use projective warping and Gaussian blur on checkerboard images, and Zhang [24] renders square pattern images assuming a pinhole camera with non-zero skew and also applies Gaussian blur. Heikkilä [6] employs ray tracing and additional Gaussian noise as well as blur, but uses exactly the same camera model for rendering the point patterns as for the calibration. Ha et al. [5] render sharp single triangle pattern corners for a perspective camera and add different levels of Gaussian noise and blur afterwards.
In summary, there is no previous work on the calibration of plenoptic cameras featuring synthetic data, and the works dealing with conventional cameras use simplified or ideal models to generate synthetic data.

3 Organization

Sections 4.1 to 4.3 describe the extended camera model for ray tracing and our general approaches for creating the ground truth positions. Subsequently, section 4.4 provides some insights into why a direct approach via forward ray tracing is not used. Finally, section 5.1 explains the realization in Blender and the remaining sections are devoted to the evaluation of our approach.

4 Method

The ray tracing approach presented in [15] is capable of producing realistic images for plenoptic cameras with one microlens type, and by deactivating the MLA it can also be used to simulate conventional cameras within the bounds of ray tracing, i.e., without wave optical effects. Nevertheless, instead of directly using this simulation for our positional rendering approach, we first extend the camera model in order to also be able to represent multi-focus plenoptic cameras as distributed by Raytrix [18].

4.1 Simulation of Plenoptic Cameras

Figure 2: Schematics of the plenoptic camera model for ray tracing. While the objective’s lenses are fully modeled, the MLA is approximated by two planes with recalculated normals and the sensor is simulated by a combination of an orthographic camera and a diffusor plane.

The basic setup of the camera model is given in Fig. 2. As described in [15], the objective’s lenses are explicitly modeled and the refraction at their surfaces is smoothed by recalculating the surface normals in order to avoid image artifacts resulting from the polygonal surface structure. The sensor is modeled by combining an orthographic camera with a diffusor plane which randomly refracts the rays traced from the orthographic camera within a specified angle distribution. This simulates a real pixel’s field of view (FOV) and its response to light rays with different angles of incidence. Accordingly, the diffusor plane can be thought of as the location of the sensor.
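As a rough illustration of the diffusor idea (not the actual Blender node setup of [15]), the sketch below perturbs a ray leaving the orthographic camera within a cone. The cone half-angle and the uniform distribution over the spherical cap are assumptions standing in for the "specified angle distribution" mentioned above.

```python
import numpy as np

def diffuse_ray(direction, half_angle_rad, rng):
    """Randomly tilt a unit ray direction within a cone, emulating a pixel's FOV."""
    # Sample a direction uniformly on the spherical cap around the local z-axis.
    cos_theta = rng.uniform(np.cos(half_angle_rad), 1.0)
    sin_theta = np.sqrt(1.0 - cos_theta**2)
    phi = rng.uniform(0.0, 2.0 * np.pi)
    local = np.array([sin_theta * np.cos(phi), sin_theta * np.sin(phi), cos_theta])

    # Build an orthonormal basis with 'direction' as the z-axis and rotate into it.
    z = direction / np.linalg.norm(direction)
    helper = np.array([1.0, 0.0, 0.0]) if abs(z[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    x = np.cross(helper, z); x /= np.linalg.norm(x)
    y = np.cross(z, x)
    return local[0] * x + local[1] * y + local[2] * z

rng = np.random.default_rng(0)
# Rays from the orthographic camera all start parallel to the optical axis (0, 0, 1).
perturbed = [diffuse_ray(np.array([0.0, 0.0, 1.0]), np.radians(5.0), rng) for _ in range(4)]
```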
The last component, the MLA, is designed as a simple two-plane model by exploiting the lensmaker's equation for a thin lens with index of refraction (IOR) $n$ and focal length $f$, given by

$\frac{1}{f} = (n - 1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right),$   (1)

where $R_1$ and $R_2$ describe the front and back surface curvature radii. For a flat back surface, given by $R_2 \to \infty$, it follows $R_1 = f\,(n - 1)$, and since a large radius leads to the front surface locally nearly being a plane, the microlenses can be constructed by using a two plane model with large IOR and recalculated surface normals. We extend this part of the model to feature differently focused microlenses on the same MLA, as shown in Fig. 3, by setting different values for $f$ depending on the coordinates of a microlens in the hexagonal grid. Since, to our knowledge, the Raytrix cameras are the only commercially available multi-focus plenoptic cameras, we describe the extension for a setup with three microlens types as used by Raytrix. This model, however, can easily be modified to feature different configurations.
For the three-microlens setup, the type of a microlens is determined by its center coordinates in the hexagonal grid, as visualized in Fig. 3. In order to match the setup of a Raytrix camera, the three focal lengths have to be chosen carefully with two restrictions in mind. First, all focal lengths should be larger than the distance between the MLA and the sensor plane, since the MLA in Raytrix cameras is placed between the main lens and the virtual image of the scene, and thus the microlenses collect converging light rays (compare Fig. 1). Second, the depths of field (DoF) of the three lens types should slightly overlap to create a connected combined DoF without focus gaps [18].
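The exact assignment of types to grid coordinates is not reproduced here; the sketch below uses one plausible three-coloring of a hexagonal grid in axial coordinates, (a - b) mod 3, as a hypothetical assignment rule, together with the microlens focal lengths and MLA-to-sensor distance listed later in section 5.2.

```python
import numpy as np

def microlens_type(a, b):
    """One possible 3-coloring of a hexagonal grid in axial coordinates (a, b):
    adjacent lenses never receive the same type (hypothetical assignment rule)."""
    return (a - b) % 3

# Multi-focus setup from section 5.2: three focal lengths, all larger than the
# MLA-to-sensor distance so that the microlenses collect converging rays.
mla_sensor_distance = 1.7e-3                        # meters
focal_lengths = np.array([1.9e-3, 2.1e-3, 2.3e-3])  # meters
assert np.all(focal_lengths > mla_sensor_distance)

def microlens_focal_length(a, b):
    """Focal length of the microlens at hexagonal grid coordinates (a, b)."""
    return focal_lengths[microlens_type(a, b)]

print(microlens_focal_length(0, 0), microlens_focal_length(1, 0), microlens_focal_length(2, 0))
```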

Figure 3: Rendering of a checkerboard with three differently focused microlens types overlaid with the hexagonal MLA layout. The black tuples are the coordinates of the center points with respect to the visualized base and the colored numbers indicate the lens type.

The described model can now be used to render realistic images of calibration patterns (or arbitrary scenes) for various plenoptic as well as conventional camera setups, where the camera type can be switched by (de)activating the MLA and modifying the parameters and positioning of the sensor and MLA. Note that, for the sake of simplicity, the illustrations in the remaining sections contain the schematics of a real plenoptic camera instead of the model described here.

4.2 Rendering Positional Information

Figure 4: Positional rendering visualized: The right image shows a section of the rendering containing the pixel $p$, and the schematics in the middle visualize the bundle of rays traced from $p$ and its intersection with the calibration pattern object. On the left, the set of scene points hit by the rays is shown in red. Despite the calibration pattern not being in focus of the microlens, the pixel's positional information, $P(p)$, can be calculated as the mean of this set of points, shown in blue.

Since the pattern position and orientation are exactly known for the rendering, we can assume to have a set $X = \{X_1, \dots, X_N\}$ of 3D locations of the relevant pattern points, e.g., the corners of a checkerboard. In order to use this information for finding the ground truth pixel positions of the calibration pattern points in the images rendered with the previously described setup, the same model is used to render positional information via ray tracing. In the usual backward ray tracing pipeline, for every pixel $p$ a set of rays $R(p)$ is traced through the camera into the scene and the colors of the scene points hit by the rays are accumulated, which is briefly formalized in the following. Let $x_r$ denote the first scene point hit by the ray $r$ and split the set of rays into two disjoint sets $R(p) = R_B(p) \cup R_S(p)$, with $R_B(p)$ containing the rays not leaving the camera due to being blocked by the aperture or camera housing and $R_S(p)$ denoting the set of rays intersecting scene objects, whereby every ray leaving the camera is assumed to be in $R_S(p)$. The color of a pixel $p$ in the calibration pattern rendering $I$ is then given by

$I(p) = \frac{1}{|R(p)|} \sum_{r \in R_S(p)} c(x_r),$   (2)

where $c(x_r)$ describes the color of the 3D point hit by ray $r$ and rays in $R_B(p)$ are not assumed to add non-zero color information. We would like to remark that the color $c(x_r)$ can be the result of further ray tracing calculations depending on the scene objects' reflectivity and transmission properties. However, for the task at hand only the first scene point hit by a ray, $x_r$, is considered.
For the positional rendering the same procedure is used, but instead of the color values the 3D positions are accumulated and averaged, i.e., the value of the positional rendering $P$ at pixel position $p$ is given by

$P(p) = \frac{1}{|R_S(p)|} \sum_{r \in R_S(p)} x_r,$   (3)

as visualized in Fig. 4. Note that the pixel value is normalized by $|R_S(p)|$ instead of $|R(p)|$ as in Equation 2, since we are interested in the average scene point hit by the rays, unbiased by vignetting, i.e., the amount of blocked rays.
Despite knowing the average scene position a camera pixel is seeing, the known 3D calibration point positions can most likely not be found directly in the positional rendering, since $P$ only provides a discrete sampling of the continuous calibration pattern plane. The naive solution to this problem is searching for pixels at which the value of $P$ is close to a position $X_i$, i.e., for every $X_i$ we search for

$p_i^* = \operatorname{arg\,min}_{p} \, \| P(p) - X_i \|_2$   (4)

and accept the solution $p_i^*$ if the distance for this position is within a certain threshold, i.e., $\| P(p_i^*) - X_i \|_2 < \epsilon$ for some $\epsilon > 0$. This procedure, however, does not work for plenoptic images, since these can contain a scene point multiple times in different microlens images, as shown in Fig. 1 and Fig. 3. Fortunately, the MLA configuration is known and therefore the image can be split into microlens images, each containing only the rendered information for exactly one microlens. The search can then be performed independently in each of these images.
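A minimal sketch of this naive search (Equation 4), operating on a single (per-microlens) positional image; the array layout and helper names are illustrative assumptions.

```python
import numpy as np

def naive_corner_search(pos_image, corners, threshold):
    """For every 3D corner, find the pixel of a (per-microlens) positional image
    whose stored scene position is closest to it, and accept it only if the
    distance is below the given threshold."""
    h, w, _ = pos_image.shape
    flat = pos_image.reshape(-1, 3)
    results = {}
    for i, corner in enumerate(corners):
        dist = np.linalg.norm(flat - corner, axis=1)
        best = np.argmin(dist)
        if dist[best] < threshold:                        # accept only close matches
            results[i] = np.array([best % w, best // w])  # (x, y) pixel coordinates
    return results

# Toy example: a 4x4 positional image of a fronto-parallel plane at z = 1.
ys, xs = np.mgrid[0:4, 0:4]
pos_image = np.dstack([xs * 0.01, ys * 0.01, np.ones_like(xs, dtype=float)])
corners = np.array([[0.02, 0.03, 1.0]])                   # one known corner position
print(naive_corner_search(pos_image, corners, threshold=0.005))
```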
This naive solution for finding the ground truth pixel positions has the obvious drawback of limited accuracy. The solution is only accurate up to the pixel grid of $P$, and if the number of samples, i.e., rays per pixel, is not sufficient, the found pixel $p_i^*$ might even be an outlier due to $P(p_i^*)$ containing a wrong position. This accuracy problem is tackled in the following by rendering $P$ with a higher resolution than $I$, filtering out unreliable results and finally using interpolation near the filtered pixel positions. First, one can observe that, for a sufficiently large number of samples per pixel, the values of small neighborhoods in $P$ form an approximately equidistant grid on the calibration pattern plane, as visualized in Fig. 5.

Figure 5: The values of $P$ in a neighborhood of a pixel approximately form a grid on the calibration pattern plane. This visualization shows the effect that the number of rays has on the grid structure: the two positional images shown were calculated with different numbers of samples per pixel.

This observation is used as a constraint for filtering the point candidates.
Assume $P$ was rendered with a resolution of $s \cdot w \times s \cdot h$, where $w \times h$ is the resolution of $I$ and $s$ denotes the scaling factor, and let $p = (u, v)$ be a pixel such that some corner position $X_i$ is located in the polygon given by $P(u,v)$, $P(u{+}1,v)$, $P(u,v{+}1)$ and $P(u{+}1,v{+}1)$ as shown in Fig. 5, and without loss of generality let $P(u,v)$ be the closest of the four corners to $X_i$. In order to allow the interpolation of the pixel position within these coordinates, we first check if a small neighborhood $N(p)$ of $p$ approximately forms an equidistant grid. To this end, the average distances between the values of vertical and horizontal neighbors,

$\bar{d}_v = \frac{1}{|V|} \sum_{(q_1, q_2) \in V} \| P(q_2) - P(q_1) \|_2$   (5)

$\bar{d}_h = \frac{1}{|H|} \sum_{(q_1, q_2) \in H} \| P(q_2) - P(q_1) \|_2,$   (6)

with $V$ and $H$ denoting the sets of vertically and horizontally neighboring pixel pairs in $N(p)$, are calculated and then used to define the first constraint

$\big|\, \| P(q_2) - P(q_1) \|_2 - \bar{d}_v \,\big| \leq t_d \cdot \bar{d}_v$   (7)

for all $(q_1, q_2) \in V$ and a threshold $t_d$, and analogously

$\big|\, \| P(q_2) - P(q_1) \|_2 - \bar{d}_h \,\big| \leq t_d \cdot \bar{d}_h$   (8)

for all $(q_1, q_2) \in H$. In this length constraint, $t_d$ describes the maximal relative deviation, which simply enforces that the lengths of horizontal and vertical lines in the grid do not deviate too much from the respective average. A similar constraint is calculated for the angles of grid connections, i.e., the average angle

$\bar{\gamma} = \frac{1}{|C|} \sum_{(g_1, g_2) \in C} \angle(g_1, g_2),$   (9)

with $C$ denoting the pairs of a horizontal and a vertical grid connection meeting in a common pixel of $N(p)$, is calculated and the respective constraint is formulated as

$\big| \angle(g_1, g_2) - \bar{\gamma} \big| \leq t_\gamma$   (10)

for all $(g_1, g_2) \in C$ and a threshold $t_\gamma$. If both constraints hold for the neighborhood of $p$, the ground truth pixel position in $I$ is interpolated via

$\hat{p}_i = \frac{1}{s} \, (u + \alpha,\; v + \beta),$   (11)

where $\alpha$ and $\beta$ are the solution of the linear equation

$X_i = P(u,v) + \alpha \big( P(u{+}1,v) - P(u,v) \big) + \beta \big( P(u,v{+}1) - P(u,v) \big)$   (12)

and the factor $\frac{1}{s}$ is used to rescale the pixel position to the size of $I$. Note that the solution exists and does not require numerical approximations, since all values of $P$ in the neighborhood as well as the point $X_i$ are located on the same plane.
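A sketch of this interpolation step (Equations 11 and 12), assuming the grid constraints have already been verified; since the corner and the neighboring positional values lie on a common plane, the overdetermined 3x2 system can be solved exactly, here via least squares.

```python
import numpy as np

def interpolate_corner(P, u, v, corner, scale):
    """Solve X_i = P(u,v) + a*(P(u+1,v)-P(u,v)) + b*(P(u,v+1)-P(u,v)) for (a, b)
    and rescale (u + a, v + b) from the positional image to the calibration image."""
    p00 = P[v, u]
    du = P[v, u + 1] - p00        # horizontal grid direction on the pattern plane
    dv = P[v + 1, u] - p00        # vertical grid direction on the pattern plane
    A = np.stack([du, dv], axis=1)                  # 3x2 system matrix
    ab, *_ = np.linalg.lstsq(A, corner - p00, rcond=None)
    a, b = ab
    return np.array([u + a, v + b]) / scale         # sub-pixel position in I

# Toy positional image of a plane at z = 1, rendered with scaling factor s = 4.
ys, xs = np.mgrid[0:8, 0:8]
P = np.dstack([xs * 0.01, ys * 0.01, np.ones_like(xs, dtype=float)])
corner = np.array([0.033, 0.045, 1.0])              # known corner on the plane
print(interpolate_corner(P, u=3, v=4, corner=corner, scale=4))
```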

4.3 Calculating Pixel Rays

A major disadvantage of the approach described in the previous section is that it requires one additional image to be rendered for every calibration pattern position. Especially for a large resolution scaling factor $s$ and large sample numbers this is inefficient, considering that the camera setup usually does not change during the creation of one dataset and therefore the exact same rays are traced through the camera into the scene for every positional rendering. To circumvent this redundancy, we propose to render only two positional images per camera setup: the first one, $P_{near}$, for a plane located at the start of the camera's DoF and another one, $P_{far}$, for a plane at the DoF's end. For every pixel $p$ the 3D points $P_{near}(p)$ and $P_{far}(p)$ define a ray in the scene space (compare Fig. 6), similar to the often used two-plane parametrization of the plenoptic function. Given the rendering $I$ of a calibration pattern as before, the corresponding positional image $P$, as defined in the previous section, can be calculated by intersecting the pattern plane with the pixel rays, i.e.,

$P(p) = P_{near}(p) + t_p \big( P_{far}(p) - P_{near}(p) \big)$   (13)

$\text{with} \quad t_p = \frac{\big\langle n_E,\; q_E - P_{near}(p) \big\rangle}{\big\langle n_E,\; P_{far}(p) - P_{near}(p) \big\rangle},$   (14)

where $q_E$ is an arbitrary point of the calibration pattern plane and $n_E$ denotes its normal.
This method for calculating the positional image requires only two positional renderings per camera setup instead of one positional rendering per calibration pattern image $I$.
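A minimal sketch of the ray-plane intersection in Equations 13 and 14; variable names are illustrative.

```python
import numpy as np

def intersect_pattern_plane(p_near, p_far, plane_point, plane_normal):
    """Intersect the pixel ray defined by P_near(p) and P_far(p) with the
    calibration pattern plane given by a point on it and its normal."""
    direction = p_far - p_near
    denom = np.dot(plane_normal, direction)
    if abs(denom) < 1e-12:                        # ray parallel to the pattern plane
        return None
    t = np.dot(plane_normal, plane_point - p_near) / denom
    return p_near + t * direction                 # P(p): 3D point seen by the pixel

# Toy example: pixel ray from the near to the far DoF plane, pattern plane at z = 0.75.
p_near = np.array([0.1, 0.2, 0.5])
p_far = np.array([0.3, 0.1, 1.5])
print(intersect_pattern_plane(p_near, p_far, np.array([0.0, 0.0, 0.75]), np.array([0.0, 0.0, 1.0])))
```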

Figure 6: Two plane approach: The colored planes are used to create the positional renderings $P_{near}$ and $P_{far}$, and the positional information for a pixel $p$ is then given by the intersection of the calibration pattern object and the ray defined by $P_{near}(p)$ and $P_{far}(p)$.

4.4 How-Not-To: Forward Ray Tracing

Instead of rendering whole positional images using computationally expensive backward ray tracing with large numbers of samples, one might wonder why we do not simply use forward ray tracing, i.e., tracing rays from the known positions $X_i$ to the sensor. This idea is appealing since the number of required rays would be reduced drastically. However, this approach only works for scene points that are either in focus or create a perfect circle of confusion on the sensor. In the former case, the rays all hit exactly one single pixel on the sensor (for a plenoptic camera with one microlens type, they might hit unique pixels in different microlens images), and in the latter case one can simply take the center of the circle of confusion as the ground truth position. By treating the sensor as a continuous plane instead of discretizing it into pixels during the ray tracing, even sub-pixel accuracy could be reached. However, the shape of the area of confusion can vary heavily depending on the optical system used for the imaging, and it is unclear which point should be regarded as the ground truth position for arbitrary shapes, which in addition can be split over multiple microlens images.
Nevertheless, forward ray tracing could be used to determine the areas of the sensor which should be rendered via backward ray tracing. After rendering a calibration pattern image $I$, the positions $X_i$ could be traced to the continuous sensor plane and, after choosing the resolution of $P$, these sensor areas could be discretized into a set of pixels which are subsequently used for the positional rendering.

5 Evaluation

Figure 7: Method 1: Accuracy of interpolated corners for different numbers of samples and different scalings of $P$. For every combination of a scaling factor and a number of samples, the resulting corners were compared to a reference solution rendered with a high scaling factor and sample count. The average difference and standard error of the mean (SEM) in terms of pixels are shown by the colored bars. Furthermore, the black dots show the ratio of detected corners corresponding to the respective colored bars.

5.1 Realization in Blender

In order to evaluate our approach, the model of [15] for Blender 2.79c was first extended by modifying the MLA materials to support up to three configurable microlens types as described in section 4.1. With this setup, a calibration pattern rendering $I$ can easily be created. A corresponding positional image $P$, however, cannot directly be rendered, since the Cycles renderer does not provide the functionality to accumulate only the rays hitting the scene. However, giving the calibration pattern plane a material that emits its positional information and everything else a purely black material results in a rendering $\tilde{P}$ with

$\tilde{P}(p) = \frac{1}{|R(p)|} \sum_{r \in R_S(p)} x_r = \frac{|R_S(p)|}{|R(p)|}\, P(p),$   (15)

which differs from $P$ (see Equation 3) only by the factor $\frac{|R_S(p)|}{|R(p)|}$ describing the ratio of rays hitting the calibration pattern. This factor can be calculated by rendering an additional single channel image for which the calibration pattern plane emits a purely white material and everything else remains black, i.e., $c(x_r) = 1$ if $x_r$ lies on the calibration pattern plane and $c(x_r) = 0$ otherwise. The resulting image $M$ then contains the desired factor,

$M(p) = \frac{1}{|R(p)|} \sum_{r \in R_S(p)} c(x_r) = \frac{|R_S(p)|}{|R(p)|},$   (16)

and accordingly $P$ can be calculated by dividing the three channels of $\tilde{P}$ by $M$.
Unfortunately, this procedure requires a lot of redundant ray tracing, since the same rays are traced into the scene for $\tilde{P}$ as for $M$. Fortunately, a point on a plane parameterized by an origin $o$ and two spanning vectors $s_1, s_2$, i.e., $x = o + u\, s_1 + v\, s_2$, is uniquely determined by the parameters $(u, v)$. Thus, rendering the UV coordinates suffices to reconstruct the corresponding 3D point on the plane. Consequently, only two channels are needed to save the positional information and the third channel of $\tilde{P}$ can be used to store the ray proportion $\frac{|R_S(p)|}{|R(p)|}$. Analogous to the previous normalization, the positional image in terms of UV coordinates, $P_{UV}$, is given by dividing the first two channels of $\tilde{P}$ by its third channel. The search for the ground truth positions can then be performed by transferring the corner positions $X_i$ into UV coordinates and searching for these in $P_{UV}$.
For the second method the reparametrization of the planes is not necessary. Placing the two planes parallel to the world's coordinate axes results in the points of the same plane having one fixed coordinate. This fixed coordinate can be saved in a small configuration file instead of the image channels of $P_{near}$ and $P_{far}$; thus the freed channel in these images can again be used to save the ray proportion. The final positional renderings $P_{near}$ and $P_{far}$ are then calculated by dividing the two positional channels by the ray proportion channel and subsequently replacing the latter by the externally saved fixed coordinate.
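To illustrate the normalization described above, the following sketch assumes a rendered float image whose first two channels store the UV coordinates scaled by the ray proportion and whose third channel stores the ray proportion itself; the channel layout and the plane parameterization are assumptions about the exported render passes.

```python
import numpy as np

def normalize_positional_render(raw, origin, s1, s2, eps=1e-8):
    """Recover 3D positions from a render whose channels store
    (u * m, v * m, m), with m the proportion of rays hitting the pattern plane."""
    m = raw[..., 2:3]
    uv = raw[..., :2] / np.maximum(m, eps)        # undo the vignetting factor
    # Map UV parameters back to 3D points on the plane: x = origin + u*s1 + v*s2.
    return origin + uv[..., :1] * s1 + uv[..., 1:2] * s2

# Toy render: 2x2 image of a plane spanned by s1, s2, with 80% of the rays hitting it.
origin = np.array([0.0, 0.0, 1.0])
s1, s2 = np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])
raw = np.zeros((2, 2, 3))
raw[0, 1] = [0.2 * 0.8, 0.3 * 0.8, 0.8]           # pixel seeing UV = (0.2, 0.3)
print(normalize_positional_render(raw, origin, s1, s2)[0, 1])  # -> [0.2, 0.3, 1.0]
```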

5.2 Number of Samples and Resolution

In order to assess the dependency of the resulting accuracy on the number of samples and the positional image resolution, we used a plenoptic camera setup with a double Gaussian 100mm objective, an MLA-to-sensor distance of 1.7mm, an MLA-to-main lens distance of 123.3mm and focal lengths of 1.9mm, 2.1mm and 2.3mm for the microlenses. The MLA as well as the sensor have the same size, and the MLA consists of a hexagonally ordered grid of microlenses, whereby the hexagonal ordering results in a larger number of microlenses in the vertical direction. Furthermore, the thresholds $t_d$ and $t_\gamma$ for the constraints given in section 4.2 have been chosen empirically. We would like to remark that further tests confirmed that the general conclusions of the following evaluation also hold for different threshold values, as these mainly regulate the number of positional outliers.
With this setup we rendered multiple images showing a checkerboard located at different depths and with varying angles. To these images we applied our first approach for different combinations of sample numbers and resolutions. The results are presented in Fig. 7, where the mean and SEM of the differences between the determined pixel positions and the reference positions are shown. These results show that increasing the scaling factor $s$ significantly decreases the error and variance while the number of detected corners significantly improves. In contrast to this observation, the number of samples seems to have only a limited impact in the tested range: the results rendered with the highest tested sample count show only a small average improvement for the pixel positions and for the ratio of detected corners compared to the results produced with fewer samples. Only positional images rendered with significantly fewer samples appear to suffer from an imprecision caused by the low sample rate.

5.3 Comparison of the two Methods

Figure 8: Method 2: Accuracy of corners calculated via the two plane method. The resulting corners were compared to the same reference solution as in Fig. 7. The average difference and SEM in terms of pixels are shown by the colored bars, and the black squares show the absolute difference between the means of both methods for the respective combination of scaling factor and number of samples.

The same combinations of sample numbers and resolutions shown in Fig. 7 were also used to evaluate the differences between the two proposed approaches. As Fig. 8 shows, the convergence behavior of the two plane approach with respect to the reference from the previous section is identical to that of the first method, and the means of both methods deviate from each other only marginally. The remaining fluctuations between the two methods, which show no clear winner in terms of accuracy, are a result of two aspects. Firstly, the two plane method has the disadvantage of additional intersection calculations, which can introduce further precision errors, especially since the ray tracing is usually done on GPUs with only single precision. Secondly, small positional errors can have different effects in the two methods: while this error is located directly on the calibration pattern plane for the first method, the impact of a positional error in the two plane method varies with the pose of the calibration pattern, since the error is located on the near or far plane.
However, since the difference in accuracy is negligible for a sufficiently large number of samples, the two plane method is recommended due to its significantly smaller render time. In our experiments we used two Nvidia Titan X GPUs, and even with this setup, rendering a single image $I$ or positional rendering at the evaluated resolutions and sample counts takes a considerable amount of time. Furthermore, the time needed for searching the positions $\hat{p}_i$, including the calculation of $P$ from $P_{near}$ and $P_{far}$ in the second method, is in both cases several orders of magnitude smaller than the rendering time. Accordingly, the two plane method is significantly faster for every dataset consisting of more than two images per camera setup.

5.4 Conclusion and Limitations

The proposed methods are able to produce highly accurate, realistic data for the evaluation of calibration methods and a wide range of cameras. However, they are limited regarding the geometry of the calibration object. Throughout this work it is assumed that the calibration pattern is placed on a plane, which excludes calibration objects featuring a three-dimensional structure, such as checkerboard cubes. If the scene points hit by the rays traced from a pixel are not located on a common plane, the calculation of $P(p)$ via simple averaging as described in Equation 3 is incorrect. To a limited extent this problem might be avoidable by using the two plane method. However, it remains to be analyzed whether the intersection of a more complex scene and the mean ray of a pixel can be used in the same manner as in this work.
Another problem of a geometrical nature results from the proposed method for filtering outliers, where it is assumed that a small neighborhood of pixels is projected to a grid on the calibration pattern plane, as shown in Fig. 5. While this assumption holds true for most common camera setups, it is theoretically possible for an optical system to contain high-frequency distortions which deform the grids even in the smallest neighborhoods. In this case it is recommended to skip the filter and simply render $P$ or the two proxy planes with a significantly larger resolution, such that the solution of the naive approach given by Equation 4 is sufficiently accurate.

6 Acknowledgment

This work was supported by the German Research Foundation, DFG, No. K02044/8-1 and the EU Horizon 2020 program under the Marie Sklodowska-Curie grant agreement No 676401.

References

  • [1] E. H. Adelson and J. Y. Wang (1992) Single lens stereo with a plenoptic camera. IEEE Transactions on Pattern Analysis and Machine Intelligence 14 (2), pp. 99–106.
  • [2] Blender Foundation. Blender 2.79c. Accessed 07.06.2019.
  • [3] Y. Bok, H. Jeon, and I. S. Kweon (2017) Geometric calibration of micro-lens-based light field cameras using line features. IEEE Transactions on Pattern Analysis and Machine Intelligence 39 (2), pp. 287–300.
  • [4] O. Fleischmann and R. Koch (2014) Lens-based depth estimation for multi-focus plenoptic cameras. In German Conference on Pattern Recognition, pp. 410–420.
  • [5] H. Ha, M. Perdoch, H. Alismail, I. So Kweon, and Y. Sheikh (2017) Deltille grids for geometric camera calibration. In Proceedings of the IEEE International Conference on Computer Vision, pp. 5344–5352.
  • [6] J. Heikkilä (2000) Geometric camera calibration using circular control points. IEEE Transactions on Pattern Analysis and Machine Intelligence 22 (10), pp. 1066–1077.
  • [7] O. Johannsen, C. Heinze, B. Goldluecke, and C. Perwaß (2013) On the calibration of focused plenoptic cameras. In Time-of-Flight and Depth Imaging. Sensors, Algorithms, and Applications, pp. 302–317.
  • [8] C. Kolb, D. Mitchell, and P. Hanrahan (1995) A realistic camera model for computer graphics. In SIGGRAPH, Vol. 95, pp. 317–324.
  • [9] T. Li, S. Li, Y. Yuan, Y. Liu, C. Xu, Y. Shuai, and H. Tan (2017) Multi-focused microlens array optimization and light field imaging study based on Monte Carlo method. Optics Express 25 (7), pp. 8274–8287.
  • [10] C. Liang and R. Ramamoorthi (2015) A light transport framework for lenslet light field cameras. ACM Transactions on Graphics (TOG) 34 (2), pp. 16.
  • [11] G. Lippmann (1908) Épreuves réversibles, photographies intégrales. Académie des Sciences, pp. 446–451.
  • [12] B. Liu, Y. Yuan, S. Li, Y. Shuai, and H. Tan (2015) Simulation of light-field camera imaging based on ray splitting Monte Carlo method. Optics Communications 355, pp. 15–26.
  • [13] L. Lucchese and S. K. Mitra (2002) Using saddle points for subpixel feature detection in camera calibration targets. In Asia-Pacific Conference on Circuits and Systems, Vol. 2, pp. 191–195.
  • [14] A. Lumsdaine and T. Georgiev (2009) The focused plenoptic camera. In 2009 IEEE International Conference on Computational Photography (ICCP), pp. 1–8.
  • [15] T. Michels, A. Petersen, L. Palmieri, and R. Koch (2018) Simulation of plenoptic cameras. In 2018 3DTV-Conference: The True Vision - Capture, Transmission and Display of 3D Video (3DTV-CON), pp. 1–4.
  • [16] R. Ng, M. Levoy, M. Brédif, G. Duval, M. Horowitz, and P. Hanrahan (2005) Light field photography with a hand-held plenoptic camera. Computer Science Technical Report CSTR 2 (11), pp. 1–11.
  • [17] S. Nousias, F. Chadebecq, J. Pichat, P. Keane, S. Ourselin, and C. Bergeles (2017) Corner-based geometric calibration of multi-focus plenoptic cameras. In Proceedings of the IEEE International Conference on Computer Vision, pp. 957–965.
  • [18] C. Perwass and L. Wietzke (2012) Single lens 3D-camera with extended depth-of-field. In Human Vision and Electronic Imaging XVII, Vol. 8291, pp. 829108.
  • [19] M. Potmesil and I. Chakravarty (1981) A lens and aperture camera model for synthetic image generation. ACM SIGGRAPH Computer Graphics 15 (3), pp. 297–305.
  • [20] Raytrix GmbH. Accessed 07.06.2019.
  • [21] J. Wu, C. Zheng, X. Hu, Y. Wang, and L. Zhang (2010) Realistic rendering of bokeh effect based on optical aberrations. The Visual Computer 26 (6-8), pp. 555–563.
  • [22] J. Wu, C. Zheng, X. Hu, and F. Xu (2013) Rendering realistic spectral bokeh due to lens stops and aberrations. The Visual Computer 29 (1), pp. 41–52.
  • [23] R. Zhang, P. Liu, D. Liu, and G. Su (2015) Reconstruction of refocusing and all-in-focus images based on forward simulation model of plenoptic camera. Optics Communications 357, pp. 1–6.
  • [24] Z. Zhang (2000) A flexible new technique for camera calibration. IEEE Transactions on Pattern Analysis and Machine Intelligence 22 (11), pp. 1330–1334.