Half of the earth’s surface, specifically everything more than 200 meters below water level, is not illuminated by sunlight. In addition, many other environments such as caves, tunnels, or cavities explored by endoscopy do not provide naturally illuminated scenes. To successfully perform vision tasks in those conditions, artificial light sources are required. An efficient way to explore dark areas is to integrate a light source into the vision system. Thus, cars use headlights for driving in the dark, and autonomous underwater vehicles (AUVs) are equipped with lights for exploring the deep sea. However, the visual appearance of objects varies heavily under changing illumination conditions, and traditional computer vision solutions can struggle in such cases. Vision in the dark with moving light sources, especially non-isotropic ones, has not been studied as thoroughly as other topics in recent decades. Knowledge of the relative pose of light sources with respect to the camera can not only improve the performance of vision-based algorithms, but also facilitate many other applications such as shape from shading (SFS), shape from shadow, augmented reality, photometric stereo and image-based rendering techniques in computer graphics.
Hence, this paper presents a novel strategy to calibrate the fixed relative pose (position and orientation) of a directional light source (i.e. a point light with a non-isotropic angular characteristic) with respect to (wrt.) the camera in a camera-light vision system. As a basis, an energy-preserving rendering model is proposed and applied to estimate the relative pose parameters of lights. This model considers camera, object and light properties in order to render the pixel value as the irradiance that arrives at the pixel. The relative light pose is then estimated by minimizing the difference between real and rendered pixel intensity values.
2 Previous Work and Main Contributions
Knowledge of the light pose became important in shape from shading approaches, which try to recover the 3D shape of objects from the variations of the shading in the image. Most SFS solutions assume that all light rays are parallel and that the illuminant direction can be estimated from either the first derivative of the image intensity [1], the occluding boundary and intensity extrema [2] or the shading along image contours [3].
Another family of approaches uses reflective objects to reflect the light into the image scene. The light source position can then be acquired by tracing the reflected rays from highlights in the images. The reflective objects can be specular spheres [4, 5] or even general specular surfaces [6]. These methods have two main problems: First, the exact localization of the highlight is very difficult, since it is spatially extended. Second, because the light position is triangulated through the reflection, inaccuracies in highlight detection have a large impact on the estimated light position.
Besides detecting highlights on specular objects, other types of objects with different properties are used to infer the light source position: [7] uses a Lambertian cube to estimate the location of a light source, [8] designs a planar mirror with an attached chessboard pattern and a diffuse region to recover the position of a light source, and [9] uses the inside and outside highlights of a clear hollow sphere to estimate the position and direction of the illuminant.
More recent approaches such as [10] implement a light position calibration technique that leverages Structure-from-Motion (SfM) algorithms to optimize the triangulation of reflected highlights from at least two reflective spheres in a single image. [11] uses a more general calibration object with a Lambertian plane and small shadow casters to estimate the shadow caster positions and the illuminant position and direction by solving an SfM problem. However, all methods introduced above treat the light source as isotropic and only estimate its location. To the knowledge of the authors, only [12] gives a solution to calibrate the light position and orientation for a non-isotropic point light from multi-view images of a weakly textured planar scene. However, this approach is based on reflections and works only for a single rotationally symmetric light source.
The main contributions of this paper are: (1) a practical solution to calibrate the relative poses of the light sources wrt. the camera in camera-light vision systems; (2) a physical light propagation model that simulates pixel intensities under consideration of energy preservation; (3) an analysis-by-synthesis approach that solves light pose calibration by optimization.
3 Energy preserving rendering models
This section describes the physical models involved in the rendering methods under consideration of energy preservation. Their corresponding geometric relationships are depicted in Figure 2 and described in the following.
As we will use a reference plane for calibration (e.g. a flat white wall, which is used in this paper, or any mobile planar single-color target), we let it define the world coordinate system, with the z-axis parallel to the reference plane surface normal. Each pixel in the image is modeled as a square with four vertices and is back-projected to the reference plane as a quadrilateral. We further assume that the reference plane has a Lambertian surface, which reflects the incident light equally in all directions. The intensity (illuminance) of a pixel is then proportional to the energy arriving on the back-projected quadrilateral on the reference plane. From an energy-preserving perspective, the light energy is completely distributed over a hemisphere with the light source located at its center. When the vertices of the quadrilateral are projected onto the surface of this light hemisphere, the energy passing through the area they delimit is the same as the energy arriving in the area defined by the quadrilateral on the reference plane.
3.1 Camera and projection model
The camera model in this paper is the perspective model. From an energy-preserving point of view, pixels in CCD arrays are treated as squares covering an area rather than as infinitely small points. The pixel intensity is interpreted as the light energy arriving at each cell of the CCD chip. The back-projected region on the reference plane for each pixel is obtained by shooting four rays through the pixel corner vertices and intersecting them with the plane.
This back-projection links the 3D and 2D coordinates of the pixel corner vertices through the camera matrix, which holds the intrinsic parameters. In addition, distortion parameters are considered during the projection. All those parameters can be obtained from a standard camera calibration procedure [13]. The extrinsic parameters are the rotation and the center of the camera; they can be computed by standard SfM approaches. Finally, the position along each viewing ray is fixed by a scale factor: when the coordinate system is based on the reference plane (world) coordinate system, it is equal to the z-component of the viewing ray divided by the last element of the camera center.
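The back-projection of a pixel footprint can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the convention `x_cam = R @ (X_world - C)`, the pixel extent `[u, u+1] x [v, v+1]` and the omission of lens distortion are all assumptions made for the example. The ray parameter `t` is a parameterization chosen for this sketch and need not coincide with the scale factor described above.

```python
import numpy as np

def backproject_pixel_corners(u, v, K, R, C):
    """Intersect the four corner rays of pixel (u, v) with the plane z = 0.

    Assumed convention (not taken from the paper): x_cam = R @ (X_world - C),
    the pixel covers [u, u+1] x [v, v+1], and lens distortion is ignored.
    """
    corners = np.array([[u, v, 1.0], [u + 1, v, 1.0],
                        [u + 1, v + 1, 1.0], [u, v + 1, 1.0]])
    # Viewing rays in world coordinates, one per row.
    rays = (R.T @ np.linalg.inv(K) @ corners.T).T
    # Ray parameter t such that C + t * ray lies on the plane z = 0.
    t = -C[2] / rays[:, 2]
    return C + t[:, None] * rays

# Hypothetical camera: 1000 px focal length, facing the wall from 2 m.
K = np.array([[1000.0, 0.0, 320.0], [0.0, 1000.0, 240.0], [0.0, 0.0, 1.0]])
R = np.diag([1.0, -1.0, -1.0])   # optical axis points along world -z
C = np.array([0.0, 0.0, 2.0])    # camera center in world coordinates
quad = backproject_pixel_corners(320, 240, K, R, C)
```

The four rows of `quad` are the vertices of the quadrilateral on the reference plane whose received energy determines the pixel value.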
Besides the geometric calibration of the camera, a radiometric calibration is also required in order to recover the linear relationship between the pixel intensity value and the light energy that has arrived at the pixel. We obtain the response curve following [14].
3.2 Reflectance model
Reflectance rendering is a well-studied topic in computer graphics. A well-known model is the bidirectional reflectance distribution function (BRDF). This paper adopts Lambert’s Cosine Law, since the filmed object, i.e. a white wall, can be considered a Lambertian surface. In addition, we apply the Inverse Square Law to fulfill the energy-preserving property.
In this model, light that is cast onto a surface is reflected equally in all directions, and the reflected irradiance only depends on the incident angle, which can be derived from the dot product of the incident light ray with the reference plane surface normal. The irradiance received by the camera decreases quadratically with the distance from the 3D reference plane point to the camera.
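The two laws can be combined in a few lines. The following is a minimal sketch under stated assumptions: the function name and argument layout are hypothetical, `incident` is a relative energy value, and the distance-dependent falloff on the light side is deliberately left out here because the light source model covers it.

```python
import numpy as np

def reflected_to_camera(incident, point, light_pos, cam_pos, normal):
    """Relative energy that reaches the camera from a Lambertian plane point."""
    to_light = light_pos - point
    # Lambert's cosine law: scale by the cosine of the incident angle,
    # i.e. the dot product of the unit light direction and the unit normal.
    cos_in = max(np.dot(to_light / np.linalg.norm(to_light), normal), 0.0)
    # Inverse square law: quadratic falloff with the point-to-camera distance.
    d_cam = np.linalg.norm(cam_pos - point)
    return incident * cos_in / d_cam ** 2

normal = np.array([0.0, 0.0, 1.0])
value = reflected_to_camera(1.0, np.zeros(3), np.array([1.0, 0.0, 1.0]),
                            np.array([0.0, 0.0, 2.0]), normal)
```

Light arriving at 45 degrees is attenuated by the cosine factor, and doubling the camera distance quarters the returned value.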
3.3 Light source model
Irradiance models of light sources can generally be grouped into two categories: isotropic and non-isotropic. The latter category can be further divided into rotationally symmetric lights and lights with an arbitrary pattern. Different types of lights require different parameterizations to properly describe the relative pose with respect to the camera. The isotropic light model only considers the relative position of the light source, as its orientation is irrelevant (same radiance in all directions). A rotationally symmetric non-isotropic light needs two additional rotation angles to describe the relative rotation from the camera’s optical axis to the light’s central axis (the rotation around the central axis is irrelevant due to the symmetry). The angular characteristic can be formulated as a radiation intensity distribution (RID) curve, which is assumed to be known in this contribution, but which could potentially also be included in the optimization scheme presented later. For lights with a non-symmetric angular characteristic, the radiance pattern can be stored in a grid, which is then used to look up the corresponding radiance energy; these lights are characterized by an additional rotational degree of freedom for their pose.
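For the rotationally symmetric case, the two rotation angles can be turned into a central axis as sketched below. The angle convention (rotate the optical axis about x, then about y) and both function names are assumptions chosen for illustration; any other two-angle convention would serve equally well.

```python
import numpy as np

def light_axis(pitch, yaw):
    """Unit central axis of the light, expressed in the camera frame.

    Assumed convention: start from the optical axis (0, 0, 1), rotate by
    `pitch` about x, then by `yaw` about y. A roll about the axis is omitted
    because it does not change a rotationally symmetric pattern.
    """
    cp, sp, cy, sy = np.cos(pitch), np.sin(pitch), np.cos(yaw), np.sin(yaw)
    rot_x = np.array([[1.0, 0.0, 0.0], [0.0, cp, -sp], [0.0, sp, cp]])
    rot_y = np.array([[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]])
    return rot_y @ rot_x @ np.array([0.0, 0.0, 1.0])

def off_axis_angle(ray, axis):
    """Angle in radians between a light ray and the light's central axis."""
    c = np.dot(ray, axis) / (np.linalg.norm(ray) * np.linalg.norm(axis))
    return np.arccos(np.clip(c, -1.0, 1.0))
```

The off-axis angle is the only quantity needed to look up a symmetric RID curve, which is why five pose parameters (three for position, two for orientation) suffice for such a light.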
The approach taken in this contribution can handle both symmetric and non-symmetric lights. For the experiments and for clarity of presentation, however, we restrict ourselves to symmetric lights.
For a symmetric light, the rendered pixel value depends on three quantities. The first is the solid angle formed by the projected pixel vertices on the light hemisphere. The second is the average irradiance from the RID curve, which only depends on the angle between the light ray and the light’s central axis. Since the RID gives a relative measurement of the light energy distribution, the third is a scale factor that covers all scale effects (e.g. analogue-digital conversion, reference plane surface albedo) and linearly links the relative radiance measurement to the pixel intensity value.
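One possible way to evaluate these quantities numerically is sketched below. Several choices here are assumptions and not taken from the paper: the solid angle is computed with the Van Oosterom and Strackee triangle formula after splitting the quadrilateral into two triangles, the "average" RID value is taken as the mean over the four corner directions, the RID curve is given as sampled angle/value pairs, and the three quantities are simply multiplied. Reflectance and camera-side effects are left out.

```python
import numpy as np

def triangle_solid_angle(a, b, c):
    """Solid angle of triangle (a, b, c) as seen from the origin
    (Van Oosterom and Strackee formula)."""
    la, lb, lc = np.linalg.norm(a), np.linalg.norm(b), np.linalg.norm(c)
    num = abs(np.dot(a, np.cross(b, c)))
    den = la * lb * lc + np.dot(a, b) * lc + np.dot(a, c) * lb + np.dot(b, c) * la
    return 2.0 * np.arctan2(num, den)

def render_pixel(quad, light_pos, axis, rid_angles, rid_values, scale):
    """Assumed model: scale * mean RID over the corners * solid angle."""
    v = quad - light_pos                      # corner directions from the light
    omega = (triangle_solid_angle(v[0], v[1], v[2])
             + triangle_solid_angle(v[0], v[2], v[3]))
    # Off-axis angle of each corner direction, used to look up the RID curve.
    cosines = (v @ axis) / np.linalg.norm(v, axis=1)
    theta = np.arccos(np.clip(cosines, -1.0, 1.0))
    mean_rid = np.interp(theta, rid_angles, rid_values).mean()
    return scale * mean_rid * omega

# Hypothetical 2 mm x 2 mm footprint, 1 m below a light with a flat RID.
h = 0.001
quad = np.array([[-h, -h, 0.0], [h, -h, 0.0], [h, h, 0.0], [-h, h, 0.0]])
value = render_pixel(quad, np.array([0.0, 0.0, 1.0]), np.array([0.0, 0.0, -1.0]),
                     np.array([0.0, np.pi / 2]), np.array([1.0, 1.0]), scale=2.0)
```

Because the solid angle already shrinks with distance and with oblique incidence, no separate inverse-square or cosine term is needed on the light side, which is the energy-preserving argument made in Section 3.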
This section outlines the approach to estimate the relative pose (position and orientation) of a non-isotropic point light that is rigidly attached to a camera, using an energy-preserving rendering approach. All images used in this paper are single-channel raw images, which offer a higher dynamic range and thus allow higher accuracy. Assume that several images of a flat reference plane have been taken by a camera-light vision system. An estimate of the relative pose between camera and light can then be obtained by the following steps:
(1) Camera Calibration
Geometric calibration is performed by a standard camera calibration procedure to compute the camera matrix and distortion parameters. Radiometric calibration obtains the response curve of the camera. The intensity values of the acquired images can then be corrected into a linear space.
(2) Multi-view SfM
SfM is performed to obtain the extrinsics (rotation matrix and camera center) for each image in the reference plane coordinate system. Alternatively, markers can be used.
(3) Rendering
Select several pixels in each image and render their intensity values under the initial light pose and scale factor, employing the energy-preserving rendering model from Section 3.
(4) Optimization
Minimize the difference between measured and rendered pixel intensity values to refine the initial light pose (for a rotationally symmetric light: the light position and two Euler angles) and the scale factor. The optimization target function is built from these differences over all selected pixels.
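The analysis-by-synthesis loop of the steps above can be sketched on synthetic data. This is not the authors' implementation: SciPy's `least_squares` stands in for the solver, plane points are rendered as point samples instead of pixel footprints, the RID curve `cos(theta) ** 4`, the camera poses, the plain squared-difference cost and all numeric values are hypothetical, and the "measured" intensities are generated from a known pose so that the result can be checked.

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

RID_EXPONENT = 4  # hypothetical RID curve: cos(theta) ** 4

def render(params, views, points):
    """Point-sampled rendering of the plane points for every view.

    params = (lx, ly, lz, pitch, yaw, scale): light position and axis angles
    in the camera frame, plus the global scale factor.
    """
    pos_cam, (pitch, yaw), scale = params[:3], params[3:5], params[5]
    axis_cam = Rotation.from_euler("xy", [pitch, yaw]).apply([0.0, 0.0, 1.0])
    out = []
    for R, C in views:                       # assumed: x_cam = R @ (X_world - C)
        light = C + R.T @ pos_cam            # light position in the world frame
        axis = R.T @ axis_cam                # light axis in the world frame
        rays = points - light
        dist = np.linalg.norm(rays, axis=1)
        cos_theta = np.clip(rays @ axis / dist, 0.0, 1.0)   # off-axis cosine
        cos_in = np.clip(-rays[:, 2] / dist, 0.0, 1.0)      # plane normal is +z
        out.append(scale * cos_theta ** RID_EXPONENT * cos_in / dist ** 2)
    return np.concatenate(out)

# Six hypothetical views with varying tilt and distance to the wall (z = 0).
R0 = np.diag([1.0, -1.0, -1.0])
views = []
for tilt_x, tilt_y, height in [(0.0, 0.0, 1.5), (0.2, 0.0, 2.0), (-0.2, 0.1, 2.5),
                               (0.0, -0.2, 3.0), (0.1, 0.2, 1.8), (-0.1, -0.1, 2.2)]:
    R = Rotation.from_euler("xy", [tilt_x, tilt_y]).as_matrix() @ R0
    views.append((R, np.array([0.1 * height, -0.1, height])))
grid = np.linspace(-1.0, 1.0, 5)
points = np.array([[x, y, 0.0] for x in grid for y in grid])

truth = np.array([0.30, 0.00, 0.10, 0.05, -0.10, 5.0])
measured = render(truth, views, points)      # stands in for real pixel values

def residuals(params):
    """Difference between rendered and measured intensities."""
    return render(params, views, points) - measured

start = np.array([0.25, 0.05, 0.05, 0.0, 0.0, 3.0])   # coarse initial guess
fit = least_squares(residuals, start, xtol=1e-12, ftol=1e-12, gtol=1e-12)
print(fit.x)
```

The views deliberately differ in distance and tilt; with near-identical views the scale factor and the light distance would be hard to separate.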
5 Experiments and evaluations
We built a camera-light vision system for evaluation by fixing a SONY Alpha 7 camera and a Bridgelux RS Array LED (with a LEDiL CA12900 reflector) on a metal bar (see Figure 4 (a)). Twelve raw images of a flat white wall were taken with this system from different views (see Figure 4 (b)). One hundred valid pixels in each image are chosen for estimating the relative pose of the light. The relative pose of the light and the scale factor are estimated with the procedure described in Section 4. In our implementation, the optimization problem is solved with Ceres Solver [15].
The evaluation is run on different numbers of images with the same, very coarsely tape-measured initial values. The optimization is solved with 8 to 12 images to evaluate the consistency of the method.
[Figure 5; axis label: number of images]
The results for different numbers of input images are shown in Figure 5. The estimated relative pose of the light is consistent and reliable, and more input images yield more robust optimization results.
During the evaluation, we also noticed that varying the distance and viewing angle across the images significantly improves the accuracy of the light calibration results (similar to camera calibration).
An additional test with far-off initial values (errors of 30° and 1 m) was also conducted. The calibration still converged and the results remained consistent, which indicates that the optimization in the smooth, one-light setting has a large basin of convergence. Full images are rendered in Figure 6 to give a more intuitive sense of the estimated light poses.
This paper presents an optimization strategy, based on energy-preserving rendering, for the calibration of the relative light pose wrt. the camera in a camera-light vision system. The method only requires several images of the light pattern on a flat reference plane, taken from different views and distances, as input. The light pose is then estimated by minimizing the residuals between real and rendered pixel intensities in an analysis-by-synthesis fashion, which is also suited to an extension to multiple light sources. The experimental results indicate that the method is able to estimate the relative light position and orientation consistently and robustly, and independently of the initial value. In the experiment, we applied the proposed method to a known symmetric non-isotropic light. However, the method is not limited to this type of light and can be extended to other light types. Future work should examine how the light’s radiance intensity distribution can best be included in the optimization.
- [1] Alex P. Pentland, “Finding the illuminant direction,” JOSA, vol. 72, no. 4, pp. 448–455, 1982.
- [2] Yihing Yang and Alan Yuille, “Sources from shading,” in Proceedings. 1991 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. IEEE, 1991, pp. 534–539.
- [3] Qinfen Zheng and Rama Chellappa, “Estimation of illuminant direction, albedo, and shape from shading,” in Proceedings. 1991 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. IEEE, 1991, pp. 540–545.
- [4] Wei Zhou and Chandra Kambhamettu, “Estimation of illuminant direction and intensity of multiple light sources,” in European Conference on Computer Vision. Springer, 2002, pp. 206–220.
- [5] Kwan-Yee K. Wong, Dirk Schnieders, and Shuda Li, “Recovering light directions and camera poses from a single sphere,” in European Conference on Computer Vision. Springer, 2008, pp. 631–642.
- [6] Jan Jachnik, Richard A. Newcombe, and Andrew J. Davison, “Real-time surface light-field capture for augmentation of planar specular surfaces,” in 2012 IEEE International Symposium on Mixed and Augmented Reality (ISMAR). IEEE, 2012, pp. 91–97.
- [7] Martin Weber and Roberto Cipolla, “A practical method for estimation of point light-sources,” in BMVC, 2001, vol. 2001, pp. 471–480.
- [8] Hui-Liang Shen and Yue Cheng, “Calibrating light sources by using a planar mirror,” Journal of Electronic Imaging, vol. 20, no. 1, pp. 013002, 2011.
- [9] Takahito Aoto, Takafumi Taketomi, Tomokazu Sato, Yasuhiro Mukaigawa, and Naokazu Yokoya, “Position estimation of near point light sources using a clear hollow sphere,” in Proceedings of the 21st International Conference on Pattern Recognition (ICPR2012). IEEE, 2012, pp. 3721–3724.
- [10] Jens Ackermann, Simon Fuhrmann, and Michael Goesele, “Geometric point light source calibration,” in VMV, 2013, pp. 161–168.
- [11] Hiroaki Santo, Michael Waechter, Masaki Samejima, Yusuke Sugano, and Yasuyuki Matsushita, “Light structure from pin motion: Simple and accurate point light calibration for physics-based modeling,” in Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 3–18.
- [12] Jaesik Park, Sudipta N. Sinha, Yasuyuki Matsushita, Yu-Wing Tai, and In So Kweon, “Calibrating a non-isotropic near point light source using a plane,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014, pp. 2259–2266.
- [13] Z. Zhang, “Flexible camera calibration by viewing a plane from unknown orientations,” in Proceedings of the International Conference on Computer Vision, Corfu, Greece, 1999, pp. 666–673.
- [14] P. E. Debevec and J. Malik, “Recovering high dynamic range radiance maps from photographs,” in SIGGRAPH ’97: Proceedings of the 24th annual conference on Computer graphics and interactive techniques, New York, NY, USA, 1997, pp. 369–378, ACM Press/Addison-Wesley Publishing Co.
- [15] Sameer Agarwal, Keir Mierle, and Others, “Ceres solver,” http://ceres-solver.org.