Consumer-grade cameras usually have a rolling shutter (RS) mechanism which causes consecutive rows of an image to be captured with an inter-line delay. RS cameras are widely used in smartphones, mixed reality products, and autonomous cars. Thanks to the RS feature, these cameras have been used to measure distances, to estimate motion at a high frequency, and to identify modulated flickering LEDs. On the other hand, RS introduces image distortion when there is relative motion between the camera and the scene. For better performance, this distortion needs to be considered in applications sensitive to motion. In response, methods tailored to RS cameras have been proposed for video stabilization, camera calibration, structure from motion, vision-aided odometry, and dense mapping.
For applications like vision-aided odometry and dense mapping, a camera is usually rigidly attached to other sensors, e.g., depth cameras, lidars, and IMUs, to provide complementary data. One important step in these applications is the spatiotemporal calibration of the sensors. Existing calibration methods usually assume that the camera uses a global shutter (GS). For specific sensor assemblies, e.g., a lidar-camera rig or a RGB-D sensor, it is possible to estimate the extrinsic parameters using standstill data. But for a lidar-camera rig, many recent algorithms require motion between the sensor system and the scene for fast calibration [19, 16], and hence these methods were only validated with GS cameras. For camera-IMU systems, the extrinsic calibration always requires egomotion, where the RS skew comes into play. But there have been few spatiotemporal calibration methods for RS cameras. Consequently, it has been unclear how the RS affects the extrinsic calibration.
Taking the camera-IMU system as an example, this paper looks into the effect of RS on spatiotemporal calibration. We formulate the calibration problem with continuous-time B-splines, which allow interpolating a unique camera pose for each observation in a RS image, precisely handling the RS effect. By differentiating the B-splines, it is straightforward to accommodate the IMU data. This formulation has been shown to improve simultaneous localization and mapping (SLAM) with a RGB-D camera, and spatiotemporal calibration of combinations of a GS camera, an IMU, and a laser range finder. In this regard, our work extends the existing calibration methods to deal with the RS effect.
The proposed method was evaluated with simulated data generated from four sets of public calibration data, and with real data captured by two industrial camera-IMU systems. The simulation and real data tests showed that considering the RS effect in spatiotemporal calibration often improved relative orientation by 1° and relative location by 2 cm over results of a GS-based calibration approach.
The contributions of this paper are as follows.
We propose a continuous-time B-spline-based approach for spatiotemporal calibration of systems composed of a RS camera and an IMU. Our novelty lies in considering the RS effect in continuous-time spatiotemporal calibration (Section 3).
We quantify the RS effect on extrinsic and temporal calibration with simulation and real data, answering questions about the necessity of accounting for the RS effect (Section 4).
As a byproduct, the proposed approach also provides accurate estimates for the line delay (Section 4.2.2).
2 Related work
The wide application range of RS cameras has drawn much research about their effect on a variety of vision-based tasks. In general, modeling the RS effect leads to better performance with RS cameras. This improvement has been validated for video stabilization, structure from motion [6, 25, 18, 4], dense mapping, and vision-aided odometry [14, 20, 26].
Though many sensor systems use RS cameras, e.g., the Kinect Azure and camera-IMU systems on smartphones, little research has been conducted on the spatiotemporal calibration of a RS camera. Many existing extrinsic calibration studies either deal with GS cameras or assume that the data are captured while the sensor system is held still. Spatiotemporal calibration for a GS camera-IMU system or a GS camera-lidar system is achieved by continuous-time B-splines. Calibration methods for a GS camera-lidar system with motion relative to the scene have been presented in [19, 16]. A camera-lidar system was calibrated with data collected by holding the device at different poses. Basso et al. estimated the extrinsic parameters of a RGB-D system using synchronized data at standstills. To calibrate the line delay of a RS camera, a continuous-time optimization method based on B-splines has been proposed.
Driven by the question of how the RS affects spatiotemporal calibration, we develop a calibration approach for the camera-IMU system and quantify the RS effect on sensor assemblies with precise reference values. Cumulative cubic B-splines have been used for visual inertial SLAM with a RS camera, showing that the SLAM system could improve scale estimation by considering the RS effect. A study close to ours presents a discrete-time optimization approach to calibrate the spatiotemporal parameters of a RS camera-IMU system. That approach assumes constant IMU biases and linearly interpolates poses for rolling shutter observations from poses at discrete times. Due to the varying time offset across iterations of the nonlinear refinement, the IMU factors have to be integrated repeatedly, adding to the computation. Somewhat surprisingly, their tests showed that their approach and the GS camera-IMU calibration tool in Kalibr achieved similar spatiotemporal calibration results.
3 Continuous-time rolling-shutter-camera-IMU calibration
This section presents our spatiotemporal calibration method for the combined device of a RS camera and an IMU. The inputs are images of a calibration target, e.g., an Aprilgrid, and the corresponding gyroscope and accelerometer data. Observations extracted from these data are used in a least-squares solver for estimating the spatiotemporal parameters and the device motion. Since landmark observations in an image are exposed at different times due to the RS effect, we have to interpolate the camera pose at these observations by fitting the motion with sparse poses. Early studies considering the RS effect often linearly interpolated the camera pose under a constant velocity assumption. An alternative approach is to use continuous-time basis splines, e.g., [5, 20], which have better fitting capacity and flexibility with high-order curves. Our method uses continuous-time vector-valued B-splines to accommodate landmark observations at different times. In the following, we first briefly review the vector-valued B-splines for representing poses, and then describe the observation models used in calibration.
3.1 Continuous-time B-splines
With B-splines of order $k$, a state variable $\mathbf{x}(t)$ is interpolated by weighting control points $\mathbf{p}_i$, $i = 0, \ldots, n$,
$$\mathbf{x}(t) = \sum_{i=0}^{n} B_{i,k}(t)\,\mathbf{p}_i.$$
The weights of these control points are computed with basis splines $B_{i,k}(t)$, which are analytical functions of time defined recursively,
$$B_{i,1}(t) = \begin{cases} 1 & t_i \le t < t_{i+1} \\ 0 & \text{otherwise} \end{cases}, \qquad
B_{i,k}(t) = \frac{t - t_i}{t_{i+k-1} - t_i}\,B_{i,k-1}(t) + \frac{t_{i+k} - t}{t_{i+k} - t_{i+1}}\,B_{i+1,k-1}(t),$$
where the epochs denoted by $t_i$ are also known as knots. A basis spline $B_{i,k}$ takes nonzero values in only $k$ time intervals, $[t_i, t_{i+k})$. Thus, the vector variable fitted by B-splines of order $k$ is determined by the $k$ control points that contribute to its value at time $t \in [t_i, t_{i+1})$:
$$\mathbf{x}(t) = \sum_{j=i-k+1}^{i} B_{j,k}(t)\,\mathbf{p}_j.$$
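As a concrete illustration, the recursion above can be sketched in Python (the function name is ours, not from any released calibration code):

```python
def bspline_basis(i, k, t, knots):
    """Cox-de Boor recursion: basis spline B_{i,k} of order k over `knots`,
    evaluated at time t (zero outside [knots[i], knots[i+k]))."""
    if k == 1:
        return 1.0 if knots[i] <= t < knots[i + 1] else 0.0

    def ratio(num, den):
        # By convention, 0/0 terms arising from repeated knots vanish.
        return 0.0 if den == 0.0 else num / den

    return (ratio(t - knots[i], knots[i + k - 1] - knots[i])
            * bspline_basis(i, k - 1, t, knots)
            + ratio(knots[i + k] - t, knots[i + k] - knots[i + 1])
            * bspline_basis(i + 1, k - 1, t, knots))
```

At any time inside the valid range, the $k$ overlapping basis functions sum to one (partition of unity), which is what makes the fitted state a local weighted average of the nearby control points.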
Defining $u = (t - t_i)/(t_{i+1} - t_i)$, $\mathbf{x}(t)$ can be written in matrix form, which facilitates efficient value and derivative evaluation,
$$\mathbf{x}(t) = \begin{bmatrix} \mathbf{p}_{i-k+1} & \cdots & \mathbf{p}_i \end{bmatrix} \mathbf{M}^{(k)} \begin{bmatrix} 1 & u & \cdots & u^{k-1} \end{bmatrix}^\top,$$
where $\mathbf{M}^{(k)} \in \mathbb{R}^{k \times k}$. The entries of the matrix $\mathbf{M}^{(k)}$ can be found in the literature. For uniform B-splines of evenly spaced knots, its entries can be computed analytically by
$$m^{(k)}_{s,n} = \frac{C_{k-1}^{n}}{(k-1)!} \sum_{l=s}^{k-1} (-1)^{l-s}\, C_{k}^{l-s}\, (k-1-l)^{k-1-n},$$
with $s, n \in \{0, \ldots, k-1\}$ and the binomial coefficient $C_k^s = \frac{k!}{s!\,(k-s)!}$.
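For uniform knots, the matrix form can be sketched as follows (a minimal illustration with our own function names; the row index $s$ selects the control point and the column index $n$ the power of $u$):

```python
from math import comb, factorial

import numpy as np

def blending_matrix(k):
    """Blending matrix M for uniform B-splines of order k (degree k-1):
    x(u) = [p_{i-k+1} ... p_i] @ M @ [1, u, ..., u^{k-1}]^T, u in [0, 1)."""
    M = np.zeros((k, k))
    for s in range(k):
        for n in range(k):
            M[s, n] = comb(k - 1, n) / factorial(k - 1) * sum(
                (-1) ** (l - s) * comb(k, l - s) * (k - 1 - l) ** (k - 1 - n)
                for l in range(s, k))
    return M

def eval_uniform_bspline(ctrl, k, u):
    """Evaluate the spline on one segment, given the k control points that
    influence it and the normalized segment time u in [0, 1)."""
    return np.asarray(ctrl).T @ blending_matrix(k) @ np.array(
        [u ** n for n in range(k)])
```

For $k = 2$ this reduces to linear interpolation between the two neighboring control points, which is a quick sanity check on the formula.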
To fit motion with vector-valued B-splines, the device pose relative to the frame on the calibration target is expressed by the angle-axis representation $\boldsymbol{\theta}(t)$ for rotation and the translation component $\mathbf{t}(t)$. Thus, the motion spline with control points is given by
$$\begin{bmatrix} \boldsymbol{\theta}(t) \\ \mathbf{t}(t) \end{bmatrix} = \sum_{j=i-k+1}^{i} B_{j,k}(t) \begin{bmatrix} \boldsymbol{\theta}_j \\ \mathbf{t}_j \end{bmatrix},$$
where $\boldsymbol{\theta}_j$ and $\mathbf{t}_j$ are the control points for rotation and translation, respectively.
The above describes the vector-valued B-splines for which the control points are in a vector space. It is also possible to define cumulative B-splines on Lie groups, e.g., SO(3). The cumulative B-splines on SO(3) are free of rotation singularities, but are more complex when dealing with unevenly spaced knots. We choose the vector-valued B-splines, partly for fair comparison with existing methods. To deal with potential angle jumps (axis flips) at 2π, etc., in the angle-axis representation, consecutive angle-axis values are kept close by choosing among equivalent representations of the same rotation.
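The axis-flip handling can be sketched as follows: among the angle-axis vectors representing the same rotation, pick the one nearest the previous sample (a simplified illustration; the candidate set and search depth are our own choices):

```python
import numpy as np

def closest_angle_axis(r, r_prev):
    """Among angle-axis vectors equivalent to r (same rotation), return the
    one closest to r_prev, so a fitted curve of samples stays continuous."""
    theta = np.linalg.norm(r)
    if theta < 1e-12:
        # Near identity: equivalent vectors are multiples of 2*pi along
        # the previous direction.
        n = r_prev / (np.linalg.norm(r_prev) + 1e-12)
        cands = [r] + [2.0 * np.pi * m * n for m in (-1, 1)]
    else:
        n = r / theta
        # (theta + 2*pi*m) * n; m = -1 also covers the flipped-axis form
        # -(2*pi - theta) * n.
        cands = [(theta + 2.0 * np.pi * m) * n for m in (-2, -1, 0, 1, 2)]
    return min(cands, key=lambda c: np.linalg.norm(c - r_prev))
```

This keeps a sequence of angle-axis samples free of 2π jumps before the B-spline fit, at the cost of allowing vector norms above π.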
To model the time-varying IMU biases, we fit them by B-splines with control points $\mathbf{b}_{g,j}$ and $\mathbf{b}_{a,j}$,
$$\begin{bmatrix} \mathbf{b}_g(t) \\ \mathbf{b}_a(t) \end{bmatrix} = \sum_{j=i-k+1}^{i} B_{j,k}(t) \begin{bmatrix} \mathbf{b}_{g,j} \\ \mathbf{b}_{a,j} \end{bmatrix},$$
where $\mathbf{b}_g$ and $\mathbf{b}_a$ are gyroscope and accelerometer biases, respectively.
3.2 Observation models
For the camera-IMU system, the camera provides images from which landmark observations are detected, and the IMU provides angular velocity and linear acceleration measurements.
The landmark observations are modeled with the classic reprojection model. For a landmark of homogeneous coordinates $\mathbf{l}^j$, its reprojection error in image $i$ of timestamp $t_i$ per the camera clock is
$$\mathbf{e}^{ij} = \mathbf{z}^{ij} - \pi\!\left(\mathbf{T}_{CB}\,\mathbf{T}_{BW}(t^{ij})\,\mathbf{l}^j;\,\mathbf{c}\right),$$
where $\pi$ is the camera projection model with the intrinsic parameters $\mathbf{c}$, the camera extrinsic parameters are in $\mathbf{T}_{CB}$, the observation is $\mathbf{z}^{ij}$ with image row $v^{ij}$, and the observation time $t^{ij} = t_i + t_d + v^{ij}\,t_l$ accounts for the camera time offset $t_d$ relative to the IMU and line delay $t_l$. The noise affecting image observations is assumed to be 2D white Gaussian with magnitude $\sigma$ in each dimension.
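The observation time model can be made concrete with a small helper (the anchor-row convention is our assumption for illustration; conventions differ across camera drivers):

```python
def rs_observation_time(frame_stamp, row, t_d, t_l, anchor_row=0.0):
    """Exposure time (IMU clock) of a feature on image `row`:
    frame timestamp + camera-IMU time offset t_d + rolling-shutter line
    delay t_l times the row offset from the stamped row."""
    return frame_stamp + t_d + (row - anchor_row) * t_l
```

The camera pose used in the reprojection error is then interpolated from the B-spline at this time, so each image row effectively gets its own pose.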
We adopt two IMU models: the calibrated model with bias terms, and the scale-misalignment model considering scale and misalignment effects besides biases.
Recall that the accelerometer triad measures the acceleration in the IMU frame $\{B\}$ due to specific forces,
$$\mathbf{a} = \mathbf{R}_{BW}\left(\ddot{\mathbf{p}}_{WB} - \mathbf{g}\right).$$
In the calibrated model, the IMU measurements $\tilde{\boldsymbol{\omega}}$ and $\tilde{\mathbf{a}}$ are affected by gyroscope and accelerometer biases, $\mathbf{b}_g$ and $\mathbf{b}_a$, and Gaussian white noise processes, $\mathbf{n}_g$ and $\mathbf{n}_a$,
$$\tilde{\boldsymbol{\omega}} = \boldsymbol{\omega} + \mathbf{b}_g + \mathbf{n}_g, \qquad \tilde{\mathbf{a}} = \mathbf{a} + \mathbf{b}_a + \mathbf{n}_a.$$
The biases are usually assumed to be driven by Gaussian white noise processes, $\mathbf{w}_g$ and $\mathbf{w}_a$,
$$\dot{\mathbf{b}}_g = \mathbf{w}_g, \qquad \dot{\mathbf{b}}_a = \mathbf{w}_a.$$
The power spectral densities of $\mathbf{n}_g$, $\mathbf{n}_a$, $\mathbf{w}_g$, and $\mathbf{w}_a$ are denoted by $\sigma_g^2$, $\sigma_a^2$, $\sigma_{w_g}^2$, and $\sigma_{w_a}^2$, respectively.
For the scale-misalignment model, the accelerometer measurement is corrupted by systematic errors encoded in a lower triangular matrix $\mathbf{M}_a$, biases $\mathbf{b}_a$, and noise $\mathbf{n}_a$,
$$\tilde{\mathbf{a}} = \mathbf{M}_a \mathbf{a} + \mathbf{b}_a + \mathbf{n}_a.$$
The 6 nonzero entries of $\mathbf{M}_a$ encompass the 3-DOF scale factor error and 3-DOF misalignment. The gyroscope measurement is corrupted by systematic errors encoded in a matrix $\mathbf{M}_g$, the $g$-sensitivity effect encoded in a matrix $\mathbf{M}_{g,a}$, biases $\mathbf{b}_g$, and noise $\mathbf{n}_g$,
$$\tilde{\boldsymbol{\omega}} = \mathbf{M}_g \boldsymbol{\omega} + \mathbf{M}_{g,a} \mathbf{a} + \mathbf{b}_g + \mathbf{n}_g.$$
The 9 entries of $\mathbf{M}_g$ encompass the 3-DOF scale factor error, 3-DOF misalignment, and 3-DOF relative orientation between the gyroscope input axes and the frame defined by the accelerometer input axes.
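The two measurement models can be sketched as follows (our own helper names; noise terms are passed in explicitly for clarity):

```python
import numpy as np

def calibrated_imu(w, a, b_g, b_a, n_g, n_a):
    """Calibrated model: measurements equal truth plus bias and noise."""
    return w + b_g + n_g, a + b_a + n_a

def scale_misalign_imu(w, a, M_g, M_ga, M_a, b_g, b_a, n_g, n_a):
    """Scale-misalignment model: M_a is lower triangular (scale and
    misalignment); M_g also absorbs the gyroscope-to-accelerometer frame
    rotation, and M_ga encodes g-sensitivity."""
    return M_g @ w + M_ga @ a + b_g + n_g, M_a @ a + b_a + n_a
```

With $\mathbf{M}_g = \mathbf{M}_a = \mathbf{I}$ and $\mathbf{M}_{g,a} = \mathbf{0}$, the scale-misalignment model reduces to the calibrated model, which is why the two variants can share the same estimator.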
In summary, variables to be estimated and known parameters in our optimization-based calibration are listed in Table 1.
|pose of the camera relative to the IMU|
|camera time offset relative to the IMU|
|gravity direction in the target frame|
|line delay of the rolling shutter camera|
|landmark coordinates on the target|
|camera intrinsic parameters including distortion|
|local gravity magnitude, e.g., 9.80665 m/s²|
4 Experiments
To study the RS effect on spatiotemporal calibration of a camera-IMU system and validate the proposed calibration method, we conducted simulation with four public calibration datasets and tests on real data captured by two industrial camera-IMU units.
The proposed calibration approach was implemented on top of Kalibr. For the following tests, both poses and IMU biases were fitted by sixth-order B-splines, which are piece-wise fifth-degree polynomials, accommodating the diverse motion required to render the extrinsic parameters observable. Knots were placed at 100 Hz for the pose B-splines, and at 50 Hz for the bias B-splines. Both simulation and calibration used an image noise of 1 pixel and a pinhole camera model with equidistant distortion. The equidistant distortion model, encoded by four parameters, was chosen for its fit to a wide range of lenses and its high precision. The camera intrinsic parameters required by the spatiotemporal calibration were obtained with the camera calibration tool in Kalibr when necessary.
To evaluate calibration results, differences between reference values and estimated parameters are computed. For the extrinsic parameter $\mathbf{T}_{CB}$, the deviation of its estimated value $\hat{\mathbf{T}}_{CB}$ is computed by
$$\Delta\mathbf{T} = \hat{\mathbf{T}}_{CB}\,\mathbf{T}_{CB}^{-1} = \begin{bmatrix} \Delta\mathbf{R} & \Delta\mathbf{t} \\ \mathbf{0}^\top & 1 \end{bmatrix}.$$
Its estimation error is quantified by the angle $\Delta\theta$ of the angle-axis representation of $\Delta\mathbf{R}$, and the translation norm $\lVert\Delta\mathbf{t}\rVert$.
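The error metrics can be sketched as follows (the frame convention of the deviation is our assumption for illustration):

```python
import numpy as np

def extrinsic_errors(R_est, t_est, R_ref, t_ref):
    """Rotation error: geodesic angle of dR = R_ref^T R_est, in degrees.
    Translation error: Euclidean norm of the difference."""
    dR = R_ref.T @ R_est
    cos_a = np.clip((np.trace(dR) - 1.0) / 2.0, -1.0, 1.0)
    return np.degrees(np.arccos(cos_a)), np.linalg.norm(t_est - t_ref)
```

The clipping guards against round-off pushing the trace-based cosine slightly outside [-1, 1] for near-identical rotations.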
4.1 Simulation
To quantify how RS affects the spatiotemporal parameters of a camera-IMU system, we simulated RS camera and IMU data based on four public calibration datasets, and then estimated these parameters with the proposed method.
The four public datasets include the camera-IMU calibration sample provided by Kalibr, and the calibration data of the EuRoC, TUM-VI, and UZH datasets. The four calibration datasets are summarized at https://github.com/VladyslavUsenko/basalt-mirror/blob/master/doc/Calibration.md.
To create the simulated data, for every dataset we first ran the GS-camera-IMU calibration tool in Kalibr and saved the fitted pose B-splines. Then, from the B-splines, noisy RS camera observations of the target and IMU measurements spanning 50 seconds were simulated using reference camera and IMU parameters for the dataset. These reference parameters were obtained by simply rounding the calibrated precise values for easy interpretation. For all datasets, the IMU noise parameters used in simulating IMU data and in the subsequent calibration were identical to those for the ADIS16448 found in the Kalibr sample dataset, i.e., accelerometer noise density = 1.0 and accelerometer random walk = 2.0, along with the corresponding gyroscope noise density and random walk.
The RS image observations were simulated at four line delays, 137.5, 82.5, 51.563, and 41.25 µs, which correspond to the pixel clocks, 12, 20, 32, and 40 MHz, of a Matrix Vision Bluefox MLC202dG camera with a line length of 1650 pixels. To avert a bias in the camera time offset when a GS-camera-IMU calibration method processes the simulated RS camera data, a simulated RS image was assigned the timestamp of its central row.
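The mapping from pixel clock to line delay used above is simply the line length divided by the pixel clock; a quick check with the values quoted in the text:

```python
def line_delay_us(line_length_px, pixel_clock_hz):
    """Rolling-shutter line delay in microseconds: the time to read out
    one image row of `line_length_px` pixels at the given pixel clock."""
    return line_length_px / pixel_clock_hz * 1e6
```

For example, 1650 pixels at a 12 MHz pixel clock gives the 137.5 µs delay listed above.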
In the end, the simulated data were processed by our RS camera-IMU calibration method and the camera-IMU calibration tool in Kalibr. For both methods, the calibrated IMU model (10) was used. The errors in the extrinsic parameters, and in the time offset and line delay, are visualized in Figs. 2 and 3, respectively. From Fig. 2, we see that longer line delays led to greater extrinsic errors for the GS-camera-IMU calibration method, while our method maintained small extrinsic errors across varying line delays and datasets. The GS-camera-based method often had errors of more than 0.5° in orientation and 2 cm in translation. Fig. 3 shows that ignoring the RS effect often led to time offset errors greater than 4 ms. And Fig. 3(b) shows that the line delay could usually be estimated accurately to within 1 µs. Overall, the simulation shows that the RS effect can noticeably deteriorate the spatiotemporal calibration of a camera-IMU system if ignored, and that our method can accurately estimate the RS line delay and remove the adverse effect.
4.2 Real data tests
We also tested the proposed method on two RS camera-IMU systems built with industrial cameras offering precise RS control. One device combined an IDS uEye 3241LE-M-GL camera fitted with a Lensagon BM4018S118 lens of 126° diagonal FOV and a SparkFun OpenLog Artemis IMU board. The other was composed of a Matrix Vision Bluefox MLC202dG camera fitted with an E1M3518 lens of 90° diagonal FOV and an OpenLog Artemis IMU board. The uEye camera allows switching between RS mode and GS mode in operation. Thus, the extrinsic calibration result with data captured in GS mode can serve as an accurate reference. The Artemis board carries an InvenSense ICM-20948 IMU (costing less than $4), and is able to capture the inertial data at about 230 Hz. The Bluefox camera only supports RS mode, but it has precise pixel clocks and a known line length for determining the reference line delays.
In data acquisition, the exposure time of the two cameras was set to 5 ms to reduce motion blur; the focus distance of both lenses was about 1.5 meters and remained fixed. For every device, the data collection began with capturing the camera intrinsic calibration data. For the uEye camera, a video was captured while the camera in GS mode was moved in front of the static target. For the Bluefox camera, an image sequence was captured by holding the RS camera at 100 different poses. Afterwards, the RS camera-IMU data were collected while the device was moved in front of the static target. For every device, a set of five one-minute RS sessions was captured at each of four pixel clocks, 12, 20, 32, and 40 MHz.
The IMU noise parameters were estimated by Allan variance analysis as detailed in the supplementary material. For the accelerometer and gyroscope noise density, the Allan analysis obtained values reasonably close to those on the ICM-20948 datasheet. The noise values read from the Allan analysis were obtained for the three accelerometers and three gyroscopes, and then plugged into the compared calibration methods without inflation; specifically, accelerometer noise density = 2.3 and accelerometer random walk = 6.5, along with the corresponding gyroscope noise density and random walk. It is somewhat counter-intuitive that the InvenSense IMU has smaller noise parameters than the more expensive ADIS16448. One possible reason is that the latter's noise values in the Kalibr sample data had been inflated. In a preliminary test on whether to inflate the noise parameters, we inflated the noise density and random walk parameters by two scalars, ran the proposed calibration method, and iterated the two steps. The two scalars were grid-searched for the minimum total cost. In the end, we found that inflating the IMU noise parameters led to smaller reprojection errors but larger accelerometer and gyroscope errors, and that extraordinary inflation caused difficulty in convergence of the optimization-based method. Thus, we chose to use the original noise parameters obtained by the principled method.
All RS camera and IMU data captured by the two devices were processed by the proposed method and the GS-camera-IMU calibration tool in Kalibr. Each method has two variants, one with the calibrated IMU model, and the other with the scale-misalignment IMU model (Section 3.2). The two variants aim to tease apart the scale and misalignment effects commonly found in a consumer-grade IMU from the extrinsic calibration. Overall, we have four calibration methods with shorthand names: GS calibrated IMU, GS scale-misalign IMU, RS calibrated IMU, and RS scale-misalign IMU.
4.2.1 uEye-Artemis assembly
For the uEye camera and Artemis IMU assembly, to obtain the camera intrinsic parameters, the video captured in GS mode was down-sampled in frame rate and then processed by the intrinsic calibration tool in Kalibr. The intrinsic parameters were used by all the compared calibration methods. To obtain the reference extrinsic parameters, we used a one-minute camera-IMU session captured with the uEye camera in GS mode at the pixel clock 40 MHz. The session was then processed by the Kalibr camera-IMU calibration tool with the scale-misalignment IMU model, attaining reference extrinsic parameters which agreed very well with values measured by ruler, where the uncertainty in a measured distance is about 3 mm.
The RS camera-IMU data were processed by the aforementioned four calibration methods. The translation, rotation, and reprojection errors are visualized in Fig. 4. Similarly to the simulation results, Fig. 4(a) and (b) show that extrinsic calibration errors grew with line delays for the GS calibration methods, and that the RS calibration methods kept relatively small errors across pixel clocks. For the RS calibration methods, the scale-misalignment IMU model slightly improved the estimated extrinsic parameters. Overall, the RS calibration methods consistently outperformed the GS-based methods for both IMU models. For instance, at 40 MHz, the GS calibration methods had errors of about 0.4° in rotation and 15 mm in translation, whereas the RS calibration method with the scale-misalignment IMU model incurred errors of about 0.15° in rotation and 2 mm in translation. Fig. 4(c) shows that the RS effect caused reprojection errors of about 5 pixels for the GS calibration methods, while the RS calibration methods had sub-pixel errors. These results confirm that considering the RS effect can substantially improve the extrinsic estimation. Considering that the line delays for the uEye camera at pixel clocks 12, 20, 32, and 40 MHz are roughly 107.7, 64.6, 40.4, and 32.3 µs, the improvements are quite relevant for consumer cameras, which often have line delays in the range of 25–60 µs.
4.2.2 Bluefox-Artemis assembly
For the Bluefox camera and Artemis IMU assembly, the intrinsic parameters were estimated with the aforementioned image sequence by the calibration tool in Kalibr. Since accurate reference extrinsic parameters were unavailable, the errors in the extrinsic parameters were computed relative to values measured by hand. The accurate reference line delay is the ratio of the line length (1650 pixels, provided in the datasheet) to the pixel clock of the Bluefox camera.
The Bluefox-IMU data were processed by the four calibration methods. The errors in the extrinsic parameters and the median reprojection errors are illustrated in Fig. 5. Though the reference extrinsic parameters are inaccurate, Fig. 5(a) and (b) show that the RS effect prevented the GS calibration methods from achieving coherent spatial estimation, while the proposed method had much smaller variances in the extrinsic parameters. Fig. 5(c) shows that the GS calibration methods had much greater reprojection errors than the RS calibration methods, similarly to Fig. 4(c) with the uEye data.
The line delay estimates for the Bluefox camera are visualized in Fig. 6. By comparing them to the reference values, we see that the proposed method can accurately estimate line delays. Unsurprisingly, it achieved better accuracy than the RS camera calibration tool, probably because the latter uses priors on angular and linear acceleration rather than real IMU measurements to constrain device motion.
5 Conclusion
In summary, we have presented, to our knowledge, the first continuous-time spatiotemporal calibration method for a RS camera-IMU system. The method is also able to estimate the inter-line delay of a RS camera. Through extensive simulation and real-data tests with accurate reference values, we showed that RS could degrade the extrinsic calibration results of a GS-based method by a few degrees in orientation and a few centimeters in translation. These tests also validated that our approach achieves accurate and consistent line delay estimation and extrinsic calibration.
Future work includes extending the presented method to other sensor modalities, e.g., lidar-camera systems.
References
-  IEEE standard specification format guide and test procedure for single-axis laser gyros. IEEE Std 647-2006 (Revision of IEEE Std 647-1995), pages 1-83, 2006.
-  Akash Bapat, True Price, and Jan-Michael Frahm. Rolling shutter and radial distortion are features for high frame rate multi-camera tracking. In , pages 4824–4833, Salt Lake City, UT, USA, June 2018. IEEE.
-  Filippo Basso, Emanuele Menegatti, and Alberto Pretto. Robust intrinsic and extrinsic calibration of RGB-D cameras. IEEE Transactions on Robotics, 34(5):1315–1332, 2018.
-  Yuchao Dai, Hongdong Li, and Laurent Kneip. Rolling shutter camera relative pose: Generalized epipolar geometry. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 4132–4140, Las Vegas, NV, USA, June 2016.
-  Paul Furgale, Timothy D. Barfoot, and Gabe Sibley. Continuous-time batch estimation using temporal basis functions. In 2012 IEEE International Conference on Robotics and Automation (ICRA), pages 2088–2095, Saint Paul, MN, USA, May 2012.
-  Johan Hedborg, Per-Erik Forssén, Michael Felsberg, and Erik Ringaby. Rolling shutter bundle adjustment. In 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 1434–1441, Providence, RI, USA, June 2012. IEEE.
-  Jianzhu Huai, Yujia Zhang, and Alper Yilmaz. The mobile AR sensor logger for Android and iOS devices. In IEEE SENSORS, pages 1–4, Montreal, Canada, Oct. 2019.
-  Sunghoon Im, Hyowon Ha, Gyeongmin Choe, Hae-Gon Jeon, Kyungdon Joo, and In So Kweon. Accurate 3D reconstruction from small motion clip for rolling shutter cameras. IEEE Transactions on Pattern Analysis and Machine Intelligence, 41(4):775–787, Apr. 2019.
-  Jaehyeon Kang and Nakju L. Doh. Automatic targetless camera-LIDAR calibration by aligning edge with Gaussian mixture model. Journal of Field Robotics, 37(1):158-179, 2020.
-  Juho Kannala and Sami S. Brandt. A generic camera model and calibration method for conventional, wide-angle, and fish-eye lenses. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(8):1335–1340, Aug. 2006.
-  Christian Kerl, Jorg Stuckler, and Daniel Cremers. Dense continuous-time tracking and mapping with rolling shutter RGB-D cameras. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), pages 2264–2272, Santiago, Chile, Dec. 2015.
-  Namhoon Kim, Junsu Bae, Cheolhwan Kim, Soyeon Park, and Hong-Gyoo Sohn. Object distance estimation using a single image taken from a moving rolling shutter camera. Sensors, 20(14):3860, Jan. 2020.
-  Chang-Ryeol Lee, Ju Yoon, and Kuk-Jin Yoon. Calibration and noise identification of a rolling shutter camera and a low-cost inertial measurement unit. Sensors, 18(7):2345, July 2018.
-  Mingyang Li and Anastasios I. Mourikis. Vision-aided inertial navigation with rolling-shutter cameras. The International Journal of Robotics Research, 33(11):1490–1507, 2014.
-  Jérôme Maye, Paul Furgale, and Roland Siegwart. Self-supervised calibration for robotic systems. In 2013 IEEE Intelligent Vehicles Symposium (IV), pages 473–480, Gold Coast, Queensland, Australia, 2013. IEEE.
-  Michał R. Nowicki. Spatiotemporal calibration of camera and 3D laser scanner. IEEE Robotics and Automation Letters, 5(4):6451–6458, Oct. 2020.
-  Luc Oth, Paul Furgale, Laurent Kneip, and Roland Siegwart. Rolling shutter camera calibration. In IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pages 1360–1367, Portland, OR, USA, June 2013. IEEE.
-  Hannes Ovrén and Per-Erik Forssén. Trajectory representation and landmark projection for continuous-time structure from motion. The International Journal of Robotics Research, 38(6):686–701, May 2019.
-  Chanoh Park, Peyman Moghadam, Soohwan Kim, Sridha Sridharan, and Clinton Fookes. Spatiotemporal camera-LiDAR calibration: A targetless and structureless approach. IEEE Robotics and Automation Letters, 5(2):1556–1563, 2020.
-  Alonso Patron-Perez, Steven Lovegrove, and Gabe Sibley. A spline-based trajectory representation for sensor fusion and rolling shutter cameras. International Journal of Computer Vision, 113(3):208–219, July 2015.
-  Kaihuai Qin. General matrix representations for B-splines. In Proceedings Pacific Graphics '98, Singapore, 1998.
-  Joern Rehder, Janosch Nikolic, Thomas Schneider, Timo Hinzmann, and Roland Siegwart. Extending kalibr: Calibrating the extrinsics of multiple IMUs and of individual axes. In IEEE Intl. Conf. on Robotics and Automation (ICRA), pages 4304–4311, Stockholm, Sweden, May 2016.
-  Joern Rehder, Roland Siegwart, and Paul Furgale. A general approach to spatiotemporal calibration in multisensor systems. IEEE Transactions on Robotics, 32(2):383–398, Apr. 2016.
-  Erik Ringaby and Per-Erik Forssén. Efficient video rectification and stabilisation for cell-phones. International Journal of Computer Vision, 96(3):335–352, 2012.
-  Olivier Saurer, Marc Pollefeys, and Gim Hee Lee. Sparse to dense 3D reconstruction from rolling shutter images. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 3337–3345, Las Vegas, NV, USA, June 2016. IEEE.
-  David Schubert, Nikolaus Demmel, Lukas von Stumberg, Vladyslav Usenko, and Daniel Cremers. Rolling-shutter modelling for direct visual-inertial odometry. Technical report, Technical University of Munich, Germany, Nov. 2019.
-  Christiane Sommer, Vladyslav Usenko, David Schubert, Nikolaus Demmel, and Daniel Cremers. Efficient derivative computation for cumulative B-splines on Lie groups. In 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 11145–11153, June 2020.
-  Pei Sun, Henrik Kretzschmar, Xerxes Dotiwalla, Aurelien Chouard, Vijaysai Patnaik, Paul Tsui, James Guo, Yin Zhou, Yuning Chai, and Benjamin Caine. Scalability in perception for autonomous driving: Waymo open dataset. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 2446–2454, June 2020.
-  Vladyslav Usenko, Nikolaus Demmel, and Daniel Cremers. The double sphere camera model. In 2018 International Conference on 3D Vision (3DV), pages 552–560, Verona, Italy, Sept. 2018. IEEE.
-  Yuan Zhuang, Luchi Hua, Longning Qi, Jun Yang, Pan Cao, Yue Cao, Yongpeng Wu, John Thompson, and Harald Haas. A survey of positioning systems using visible LED lights. IEEE Communications Surveys Tutorials, 20(3):1963–1988, 2018.
This supplementary material presents Allan variance analysis results of an OpenLog Artemis board, and extrinsic calibration results of the uEye camera with data captured in global shutter (GS) mode.
A. Allan variance analysis
To characterize the noises of the InvenSense ICM-20948 IMU on the OpenLog Artemis board, the Allan variance analysis technique was used to analyze its static data. Seventeen hours of data at about 230 Hz were captured in our lab at night while the Artemis board was placed on a table with one axis of the board roughly along gravity. The median of the temperatures recorded by the IMU was 27.36 °C, and their standard deviation was 1.36 °C. The Allan standard deviations were computed for the three accelerometers and three gyroscopes, shown in Fig. 7 and Fig. 8, respectively. These figures are annotated with the interpreted values for the white noise, bias stability, and bias random walk, whose corresponding log-log slopes are -1/2, 0, and 1/2, respectively, according to the standard.
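The Allan deviation computation can be sketched as follows (an overlapping-estimator sketch; the exact estimator and cluster sizes used for the figures are not specified here):

```python
import numpy as np

def allan_deviation(x, fs, taus_m):
    """Overlapping Allan deviation of signal x sampled at rate fs, for the
    cluster sizes in taus_m (in samples). Returns (tau, adev) pairs."""
    dt = 1.0 / fs
    theta = np.cumsum(x) * dt  # integrated signal
    out = []
    for m in taus_m:
        tau = m * dt
        # Second difference of the integrated signal over cluster length m.
        d = theta[2 * m:] - 2.0 * theta[m:-m] + theta[:-2 * m]
        avar = np.sum(d ** 2) / (2.0 * tau ** 2 * d.size)
        out.append((tau, np.sqrt(avar)))
    return out
```

On the resulting log-log curve, the white-noise region has slope -1/2, so the noise density can be read near tau = 1 s, matching how the annotated values in Figs. 7 and 8 are interpreted.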
The noise parameters and their averages are compiled in Table 2. For comparison, the reference noise density values read from the ICM-20948 datasheet are appended. From Table 2, we see that the noise densities from the Allan variance analysis are reasonably close to those from the datasheet.
B. uEye-Artemis calibration with GS data
To see the variation of spatiotemporal calibration across different settings other than the RS effect, we carried out camera-IMU calibration with data captured by the uEye camera in GS mode. These data were captured along with the RS data for the test in Section 4.2.1, including four sets of GS data captured at pixel clocks 12, 20, 32, and 40 MHz. Each set contained five one-minute sessions.
In calibrating the uEye-Artemis system, the same camera intrinsic parameters as in Section 4.2.1 were adopted. These 20 sessions were processed with both the calibrated and scale-misalignment IMU models. We computed the translation and rotation errors of the extrinsic parameters relative to the reference value in (15), shown in Fig. 9(a) and (b), respectively. From these figures, we see that the estimated translations and rotations had typically small variances. Especially with the GS scale-misalign IMU calibration method, the translation variations were mostly less than 2 mm, and the rotation variations mostly less than 0.2°. These variations, arising from different settings in GS mode and different sessions, were significantly smaller than the calibration deviations incurred when ignoring the RS effect, which were about 15 mm in translation and 0.4° in rotation even at pixel clock 40 MHz (see Fig. 4(a-b)).
Boxplots of the median reprojection errors for the two calibration methods, GS calibrated IMU, and GS scale-misalign IMU, are shown in Fig. 9(c), which shows that subpixel reprojection errors were attained when the GS data were processed by the GS camera calibration method. This contrasted with the reprojection errors (often greater than 2 pixels) when the RS data were processed by the GS camera calibration method (see Fig. 4).
Overall, these additional results show that the RS effect is much more pronounced than the calibration uncertainty due to different sessions of data, and the baseline GS camera-IMU calibration methods worked well for GS data.