1 Introduction
A pan-tilt platform consists of two motors, rotating in the pan and tilt directions respectively. These two rotational degrees of freedom grant the mounted system a theoretically 360-degree field of view. This is particularly beneficial to computer vision applications, since a camera can only obtain data from the field of view directed along its optical axis at any one time [NLW17]. To obtain scene information over a larger field of view, the camera has to be translated or rotated to capture a series of images. Various applications utilize pan-tilt platforms, from obvious video surveillance [DC03] to video conferencing, human-computer interaction, and augmented/mixed reality [WBIH12, BCYH17].
Despite continuous endeavors in the literature on pan-tilting mechanics, many studies base their work on fully factory-assembled and calibrated platforms, such as in [DC03, WBIH12, LZHT15], which are priced from several hundred to thousands of dollars. This may be problematic because, owing to sound and reliable assembly quality, a pan-tilting model may work seemingly error-free even if it assumes an ideal rotation in which the rotation axes are perfectly orthogonal and aligned with the camera optical axis.
On the other hand, there is growing interest in user-created robots [Par11], where users build their own versions of robots, for example with consumer kits on the market [BCYH17]. In this case, it is unlikely that exact kinematic specifications will be provided, for example in a CAD file format. Moreover, the mechanism may be fabricated and assembled in a do-it-yourself or otherwise arbitrary manner. All these factors contribute to erroneous, if even existent, kinematic equations, rendering kinematic control methods obsolete.
In this paper, we address the issue of accurate pan-tilt manipulation even when the pan-tilt kinematics are loosely coupled, or often unavailable. More specifically, we propose an operating mechanism for an arbitrarily assembled pan-tilt model with loose kinematics, based on rotational motion modeling of the mounted camera. Our method builds on a pan-tilt model that is general enough to calibrate and compensate for assembly mismatches, such as skewed rotation axes or off-origin rotations.
2 Related Work
Calibrating the pan-tilt motion of a camera has been broadly studied in the computer vision field, especially for surveillance using pan-tilt-zoom (PTZ) cameras. For example, Davis and Chen in [DC03] presented a method for calibrating pan-tilt cameras that incorporates a more accurate, complete model of camera motion. Pan and tilt rotations were modeled as occurring around detached arbitrary axes in space, without the assumption that the rotation axes are aligned to the camera optical axis. Wu and Radke in [WR13] introduced a camera calibration model for a pan-tilt-zoom camera which explicitly reflects how focal length and lens distortion vary as a function of zoom scale. Using a nonlinear optimization, the authors were able to accurately calibrate multiple parameters in one whole step. They also investigated and analyzed multiple cases of pan-tilt errors to maintain the calibrated state even after extended continuous operation. Li et al. in [LZHT15] presented a novel method to calibrate the rotation angles of a pan-tilt camera online, using only one control point. By converting the nonlinear pan-tilt camera model into a linear model in the sines and cosines of the pan and tilt parameters, a closed-form solution could be derived by solving a quadratic equation in their tangents.
In the optics and measurement literature, calibration methods for a turntable or rotational axis have been proposed. Chen et al. in [CDCZ14] fixed a checkerboard plate on a turntable and captured multiple views of one pattern. Retrieved 3D corner points over a 360° view were used to form multiple circular trajectory planes, whose equations and parameters were acquired using a constrained global optimization method. Niu et al. in [NLW17] proposed a method for calibrating the relative orientation of a camera fixed on a rotation axis, where the camera cannot directly 'see' the rotation axis. Utilizing two checkerboards, one for the rotating camera and one for an external observing camera, they were able to calibrate the relative orientation of the two cameras and the rotation axis, represented in the same coordinate system.
In this paper, we propose an operating mechanism for an arbitrarily assembled pan-tilt model with loose kinematics, based on rotational motion modeling of the mounted camera. Our contributions can be summarized as follows:

First, the proposed method models and calibrates the rotation of a generic pan-tilt platform by recovering the directions and positions of its axes in 3D space, utilizing an RGB-D camera.

Second, the proposed method is capable of manipulating servo rotations with respect to the camera, based on the inverse kinematic interpretation of its pan-tilt transformation model.
3 Pan-Tilt Rotation Modeling
Our goal is to model the rotational movement of a pan-tilting platform, so that we can estimate the pose of the RGB-D camera mounted on top of it. The platform rotates about arbitrarily assembled, independent axes [DC03] with loosely coupled kinematics. The structural model of such a setup is illustrated in Figure 1.
Figure 1: Structural model of the pan-tilt setup; rotation trajectories are represented with respect to the upper-left corner of the checkerboard.
3.1 Rotation Parameter Acquisition
To model the movement of the motor rotation, we first calibrate the parameters of the pan and tilt rotations. The calibration is a two-step process: we first estimate the direction vector of the rotation axis, and then estimate the center of the circular trajectory. The directions and centers of the two rotations are all free variables.
When arbitrary points are rotated around a rotation axis, they trace closed circular trajectories on 3-dimensional planes, where each plane is perpendicular to the rotation axis and each circle center lies on the rotation axis. From the coordinates of the same point across rotated frames, the circular trajectory can be obtained using the least squares method [Sch06, DCYH13].
During the calibration, the camera captures multiple frames of a large checkerboard in front of it while it rotates, so that the checkerboard moves from one end of the field of view to the other. Since the structure of the checkerboard is known in advance, all the rotation trajectories can be represented with respect to that of the top-left corner. Then, we can parametrize the rotation with every corner of every frame and solve the objective function as a whole. If the checkerboard comprises $m$ corners in the vertical direction and $n$ corners in the horizontal direction, and $K$ frames were taken throughout the calibration, we have $m \times n \times K$ corners in total to globally optimize.
For the rotation of the upper-left corner, let us denote its unit rotation direction vector as $\mathbf{n} = (n_x, n_y, n_z)^T$ and its rotation circle center as $\mathbf{c} = (c_x, c_y, c_z)^T$. Then the rotation axis equation becomes

$\mathbf{x}(t) = \mathbf{c} + t\,\mathbf{n}$  (1)

and the rotation plane, which the upper-left corner is on, is $d = \mathbf{n} \cdot \mathbf{c}$ away from the origin:

$\mathbf{n} \cdot \mathbf{x} = d.$  (2)
Since all the rotation circles made from the checkerboard corners are defined on the same rotation axis, the distances between their planes can be defined with respect to the indices of the checkerboard corners. Let us denote the distances between the planes as $d_v$ and $d_h$, respectively, in the vertical and horizontal directions. Then for the corner at the $i$th row and $j$th column of the checkerboard, the rotation circle center becomes $\mathbf{c}_{ij}$, where

$\mathbf{c}_{ij} = \mathbf{c} + (i\,d_v + j\,d_h)\,\mathbf{n}.$  (3)
The ideal rotation trajectory is modeled as a great circle, represented as the intersection of a plane and a sphere in 3D space, where the plane passes through the sphere center. Here, the plane for the corner at row $i$ and column $j$ can be modeled as

$\mathbf{n} \cdot \mathbf{x} = d + i\,d_v + j\,d_h$  (4)

and the sphere, whose intersection with the plane is the circle of radius $r_{ij}$, can be modeled as

$\|\mathbf{x} - \mathbf{c}_{ij}\|^2 = r_{ij}^2.$  (5)
Let us denote the 3D vertex of the corner at the $i$th row and $j$th column of the $k$th captured checkerboard frame as $\mathbf{v}_{ij}^{k}$. Then, we can set up the objective function whose goal is to find the parameters that minimize the following error for the plane model:

$E_{\mathrm{plane}} = \sum_{i,j,k} \left( \mathbf{n} \cdot \mathbf{v}_{ij}^{k} - d - i\,d_v - j\,d_h \right)^2.$  (6)

From Equation 3 and Equation 6, one can calculate the parameters that minimize the following error for the circle model:

$E_{\mathrm{circle}} = \sum_{i,j,k} \left( \|\mathbf{v}_{ij}^{k} - \mathbf{c}_{ij}\|^2 - r_{ij}^2 \right)^2.$  (7)
The global least squares method is adopted to minimize the errors of the two objective functions over the coordinate variations of all 70 checkerboard corners in all frames. This yields optimized parameters for the rotation axis calibration [CDCZ14].
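As an illustration of this two-step fit, the following sketch recovers the axis direction and circle centre for a single corner trajectory with NumPy. It is a simplified stand-in for the global optimization above (it fits one trajectory at a time rather than all corners jointly), and the function name and the plane-then-circle decomposition are our own:

```python
import numpy as np

def fit_rotation_axis(points):
    """Fit the rotation-axis direction and circle centre from 3D points
    sampled along one circular trajectory: plane fit by SVD, then a
    linear least-squares (Kasa) circle fit inside that plane."""
    pts = np.asarray(points, dtype=float)
    centroid = pts.mean(axis=0)
    # The plane normal (direction of least variance) is the axis direction.
    _, _, vt = np.linalg.svd(pts - centroid)
    normal = vt[-1]
    # Build an orthonormal in-plane basis and project the points into 2D.
    u, v = vt[0], np.cross(vt[-1], vt[0])
    xy = np.column_stack([(pts - centroid) @ u, (pts - centroid) @ v])
    # Linear circle fit: x^2 + y^2 = 2*a*x + 2*b*y + c.
    A = np.column_stack([2 * xy[:, 0], 2 * xy[:, 1], np.ones(len(xy))])
    rhs = (xy ** 2).sum(axis=1)
    (a, b, c), *_ = np.linalg.lstsq(A, rhs, rcond=None)
    radius = np.sqrt(c + a ** 2 + b ** 2)
    centre = centroid + a * u + b * v   # lift the 2D centre back to 3D
    return normal, centre, radius
```

The recovered normal and centre correspond to the direction vector and circle center of one corner; the global optimization additionally couples all corner trajectories through the inter-plane distances of Equation 3.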
3.2 Pan-Tilt Transformation Model
If we denote a 3D point taken in some local camera frame after rotating by the tilt and pan angles $(\theta_t, \theta_p)$ as $\mathbf{p}'$, and the point before the rotations as $\mathbf{p}$, the relationship can be written with the rotation model as

$\mathbf{p}' = T_p\, R_p(\theta_p)\, T_p^{-1}\; T_t\, R_t(\theta_t)\, T_t^{-1}\, \mathbf{p}.$  (8)
Here, $R_t(\theta_t)$ is a 4×4 matrix that rotates by $\theta_t$ around the direction vector of the tilt axis, and $T_t$ is a 4×4 matrix that translates by the coordinates of the pivot point of the tilt axis. The transformations regarding pan are analogous. Note that in the model the tilt rotation comes before the pan rotation. This is due to the kinematic configuration of the pan-tilt system we used: as shown in Figure 2, the tilting servo is installed on the panning arm. Thus, to ensure a unique rotation in world space, we tilt first, then pan.
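A minimal sketch of this composition, assuming the calibrated axis directions and pivot points are given (the helper names and the folding of each translate-rotate-translate triple into one matrix are our own choices):

```python
import numpy as np

def rot4(direction, pivot, angle):
    """4x4 homogeneous rotation by `angle` about the axis through `pivot`
    with direction `direction`, i.e. T(pivot) R(angle) T(-pivot)."""
    n = np.asarray(direction, dtype=float)
    n = n / np.linalg.norm(n)
    piv = np.asarray(pivot, dtype=float)
    # Rodrigues' rotation formula for the 3x3 rotation part.
    K = np.array([[0.0, -n[2], n[1]],
                  [n[2], 0.0, -n[0]],
                  [-n[1], n[0], 0.0]])
    R = np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)
    M = np.eye(4)
    M[:3, :3] = R
    M[:3, 3] = piv - R @ piv   # folds T(pivot) R T(-pivot) into one matrix
    return M

def pan_tilt_transform(p, pan, tilt, pan_axis, tilt_axis):
    """Apply Equation 8: tilt first, then pan. Each *_axis argument is a
    (direction, pivot) pair describing one rotation axis."""
    M = rot4(*pan_axis, pan) @ rot4(*tilt_axis, tilt)
    q = M @ np.append(np.asarray(p, dtype=float), 1.0)
    return q[:3]
```

For example, with the pan axis through the origin along +Y and the tilt axis along +X, a 90° pan moves the point (0, 0, 1) to (1, 0, 0); a non-origin pivot shifts the whole circular trajectory accordingly, which is exactly the off-origin effect the calibration recovers.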
3.3 Servo Control with Inverse Kinematics
Our scenario is that the user wants to orient the camera so that the target object is located at the image center after the rotation. Specifically, the optical axis of the camera should pass through the target point, so the task is to rotate some point on the optical axis onto a target point in world space. This can be thought of as rotating the linearly actuated end-effector (some point on the optical axis) of a robot arm (the pan-tilting platform) to the target point, with unknown pan and tilt angles. Using inverse kinematics notation on Equation 8, the problem becomes finding the parameter vector of 2 rotations and 1 translation $(\theta_p, \theta_t, d)$ that minimizes the distance error to the target. We adopt the Jacobian transpose method [Bus04] for Algorithm 1.
In our practice, we initialized the angle parameters as 0 and terminated the optimization once the error fell below a fixed threshold. The maximum iteration count was set to 100, though 25 iterations on average were enough to satisfy the condition. One pitfall here is that the magnitudes of the parameters can differ extensively if $d$ is expressed in millimeters: $\theta_p$ and $\theta_t$ range between $-1.57$ and $1.57$ radians, while $d$ may vary from 450 to 8000 mm. This difference causes too much fluctuation in the update values, leading to far-from-optimal solutions. One solution is to optimize $d$ in meters, so that its domain difference from the angles becomes minimal.
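Algorithm 1 is not reproduced here, but a generic Jacobian-transpose loop of the kind it describes can be sketched as follows; the finite-difference Jacobian and the step size `alpha` are our own illustrative choices, and the forward model `f` stands in for the pan-tilt transformation of Equation 8:

```python
import numpy as np

def ik_jacobian_transpose(f, params0, target, alpha=0.1, tol=1e-3, max_iter=100):
    """Jacobian-transpose inverse kinematics: repeatedly update
    params += alpha * J^T * error until the end-effector f(params)
    is within `tol` of `target`. J is estimated by central differences."""
    params = np.asarray(params0, dtype=float)
    target = np.asarray(target, dtype=float)
    for _ in range(max_iter):
        err = target - f(params)
        if np.linalg.norm(err) < tol:
            break
        J = np.empty((3, params.size))   # 3 x n Jacobian
        h = 1e-6
        for j in range(params.size):
            step = np.zeros_like(params)
            step[j] = h
            J[:, j] = (f(params + step) - f(params - step)) / (2 * h)
        params = params + alpha * (J.T @ err)
    return params
```

Keeping all parameters on comparable scales (angles in radians, $d$ in meters) is exactly the normalization discussed above; otherwise the Jᵀ·error update over-weights the large-magnitude translation component.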
4 Evaluation
4.1 System Configuration and Calibration
To validate the proposed method, we built an experimental setup comprising a Microsoft Kinect v2 and two HS-785HB servos for pan and tilt, controlled with an Arduino. For ease of explanation, we assume the internal and external parameters of the color and depth cameras are already known. The checkerboard consists of 8 × 11 black-and-white checkers, each 100 mm × 100 mm in size.
We captured 28 frames by rotating the camera end-to-end in the pan direction, and 11 frames in the tilt direction. With the coordinates of the 70 extracted 3D corners per frame, we estimated the parameter values in Equation 6, which form a linear function of multiple planes, using the least squares method. Then, we fitted the corner trajectories to circles with the parameters in Equation 7. In Table 1, we show the values of the 6 key rotation parameters that govern the pan-tilt transformation of Equation 8.
Table 1: Calibrated rotation parameters: axis direction components and rotation center coordinates (mm).

Parameter     Pan rotation     Tilt rotation
n_x           0.011783038      0.998429941
n_y           0.982956803      0.007633507
n_z           0.183458670      0.055492186
c_x (mm)      82.414993286     412.069976807
c_y (mm)      458.764739990    153.644714355
c_z (mm)      108.336227417    22.413515091
4.2 Experiment Design
Here, we examine the performance of the proposed pan-tilt rotation model in terms of accurate targeting capability. The system is tasked to adjust its attitude so that the target point passes through the optical center of the camera, i.e. lands at the image center. We measure the errors between the estimated pixels/vertices and the actually captured pixels/vertices after the rotation. The coordinates of the 70 checkerboard corners when the system is at its rest pose, i.e. pan = tilt = 0, are used as target points, constituting 70 trials in total.
We compare our results with those produced by the Single Point Calibration Method (SPCM) of [LZHT15] as the baseline. To steer the system, we used the inverse kinematics of Algorithm 1 for the proposed method, while for SPCM the pan and tilt values are calculated in closed form based on its geometrical model.
4.3 Results and Discussion
Figure 3 summarizes the evaluation results of the two models. In [LZHT15], the authors evaluated the model only with Root Mean Squared Errors (RMSE) of L2 norms in XY image pixels. Since we have depth values available, we also collected RMSEs of real distances (in mm), and additionally measured Mean Absolute Errors (MAE) in each of the X, Y (and Z) directions, in both pixels and mm, for further analysis.
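For clarity, the two error measures can be computed as follows; the helper name and the (N, dims) array layout are our own illustrative choices, not the paper's:

```python
import numpy as np

def targeting_errors(estimated, actual):
    """RMSE of per-trial L2 distances plus per-axis MAE.
    `estimated` and `actual` are (N, dims) arrays, e.g. N = 70 trials
    with dims = 2 for pixel errors or dims = 3 for metric errors."""
    diff = np.asarray(estimated, dtype=float) - np.asarray(actual, dtype=float)
    rmse = np.sqrt((np.linalg.norm(diff, axis=1) ** 2).mean())
    mae = np.abs(diff).mean(axis=0)   # one MAE per axis: X, Y (and Z)
    return rmse, mae
```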
In the graph, the proposed method outperforms SPCM in every metric, especially in the Z direction in real distances. This can be explained by SPCM's omission of the non-origin rotation centers (see Table 1). Also, SPCM assumes the rotation axes are identical to the world X and Y axes, which leads to additional errors. The error differences become small in the image plane; we attribute this to the normalizing effect of perspective projection onto the image plane.
Besides the comparative analysis, the results show seemingly high error patterns. We conjecture that a number of factors contributed to this. First, there is the color-depth coordinate conversion: we detected checkerboard corners in color images, then converted them into depth image coordinates. Due to limitations in the Kinect SDK, conversion at sub-pixel resolution was impossible, increasing the errors. Second, the servo rotation manipulation could be inaccurate, possibly due to the unreliable mechanics of low-grade servos, such as hysteresis, jitter, or inaccurate pulse-width-to-angle mapping.
5 Conclusion and Future Work
In this paper, we have proposed an accurate control method for arbitrarily assembled pan-tilt camera systems, based on a rotation transformation model and its inverse kinematics. The proposed model is capable of recovering the rotation model parameters, including the positions and directions of the axes of the pan-tilting platform in 3D space. The comparative experiment demonstrates that the proposed method outperforms the baseline in accurately localizing target world points in local camera frames captured after rotations.
In future work, we would like to extend the proposed operating mechanism to a full-scale camera pose-estimation framework that can be used in projection-based augmented reality, point cloud registration, and 3D reconstruction. We would also like to delve into the hardware limitations discussed in Section 4.3 and improve the pan-tilt model to further compensate for the errors.
6 Acknowledgment
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIP) (No. NRF-2015R1A2A1A10055673).
References
 [BCYH17] Byun J., Chae S., Yang Y., Han T.: AIR: Anywhere immersive reality with user-perspective projection. In 38th Annual Eurographics Conference (2017), The Eurographics Association, pp. 5–8.
 [Bus04] Buss S. R.: Introduction to inverse kinematics with Jacobian transpose, pseudoinverse and damped least squares methods. IEEE Journal of Robotics and Automation 17, 119 (2004), 16.
 [CDCZ14] Chen P., Dai M., Chen K., Zhang Z.: Rotation axis calibration of a turntable using constrained global optimization. Optik - International Journal for Light and Electron Optics 125, 17 (2014), 4831–4836.
 [DC03] Davis J., Chen X.: Calibrating pan-tilt cameras in wide-area surveillance networks. In IEEE International Conference on Computer Vision (2003), Citeseer.
 [DCYH13] Dai M., Chen L., Yang F., He X.: Calibration of revolution axis for 360 deg surface measurement. Applied Optics 52, 22 (2013), 5440–5448.
 [LZHT15] Li Y., Zhang J., Hu W., Tian J.: Method for pan-tilt camera calibration using single control point. JOSA A 32, 1 (2015), 156–163.
 [NLW17] Niu Z., Liu K., Wang Y., Huang S., Deng X., Zhang Z.: Calibration method for the relative orientation between the rotation axis and a camera using constrained global optimization. Measurement Science and Technology 28, 5 (2017), 055001.
 [Par11] Park W.: Philosophy and strategy of minimalism-based user created robots (UCRs) for educational robotics: education, technology and business viewpoint. International Journal of Robots, Education and Art 1, 1 (2011), 26–38.
 [Sch06] Schaffrin B.: A note on constrained total least-squares estimation. Linear Algebra and its Applications 417, 1 (2006), 245–258.
 [WBIH12] Wilson A., Benko H., Izadi S., Hilliges O.: Steerable augmented reality with the Beamatron. In Proceedings of the 25th Annual ACM Symposium on User Interface Software and Technology (2012), ACM, pp. 413–422.
 [WR13] Wu Z., Radke R. J.: Keeping a pan-tilt-zoom camera calibrated. IEEE Transactions on Pattern Analysis and Machine Intelligence 35, 8 (2013), 1994–2007.