Self-Calibration of Cameras with Euclidean Image Plane in Case of Two Views and Known Relative Rotation Angle

07/30/2018 · Evgeniy Martyushev et al. · South Ural State University

The internal calibration of a pinhole camera is given by five parameters that are combined into an upper-triangular 3×3 calibration matrix. If the skew parameter is zero and the aspect ratio is equal to one, then the camera is said to have a Euclidean image plane. In this paper, we propose a non-iterative self-calibration algorithm for a camera with Euclidean image plane in the case where the remaining three internal parameters --- the focal length and the principal point coordinates --- are fixed but unknown. The algorithm requires a set of N ≥ 7 point correspondences in two views and also the measured relative rotation angle between the views. We show that the problem generically has six solutions (including complex ones). The algorithm has been implemented and tested both on synthetic data and on a publicly available real dataset. The experiments demonstrate that the method is correct, numerically stable, and robust.




1. Introduction

The problem of camera calibration is an essential part of numerous computer vision applications, including 3d reconstruction, visual odometry, medical imaging, etc. At present, a number of calibration algorithms and techniques have been developed. Some of them require observing a planar pattern viewed at several different positions [5, 9, 25]. Other methods use 3d calibration objects consisting of two or three pairwise orthogonal planes whose geometry is known with good accuracy [24]. There are also calibration algorithms assuming that the scene contains pairs of mutually orthogonal directions [3, 14]. In contrast to the methods just mentioned, self-calibration does not require any special calibration objects or scene restrictions [6, 16, 18, 23]; only image feature correspondences in several uncalibrated views are required. This makes the self-calibration approach highly flexible and indispensable in some real-time applications.

In two views, camera calibration is given by ten parameters — five internal and five external — whereas the fundamental matrix describing the epipolar geometry in two views has only seven degrees of freedom. This means that self-calibration in two views is only possible under at least three further assumptions on the calibration parameters. For example, we can assume that the skew parameter is zero and so is the translation vector, i.e. the motion is a pure rotation. Then, the three orientation angles and the remaining four internals can be self-calibrated from at least seven point matches [7]. Another possibility is that all the internal parameters except a common focal length are known. Then, there exist minimal self-calibration solutions operating with six matched points [20, 2, 11, 13].

The five internal calibration parameters have different interpretations. The skew parameter and the aspect ratio describe the pixel's shape. In most situations, e.g. under zooming, these internals do not change. Moreover, for modern cameras the pixel's shape is very close to a square, and hence the skew and aspect ratio can be assumed to be given and equal to 0 and 1, respectively. Following [10], we say that a camera in this case has a Euclidean image plane. On the other hand, the focal length and the principal point coordinates describe the relative placement of the camera center and the image plane. The focal length is the distance between the camera center and the image plane, whereas the principal point is the orthogonal projection of the center onto that plane. All these internals should be considered unknown, since even for modern cameras the principal point can be relatively far from the geometric image center. Besides, it is well known that the focal length and the principal point always vary together under zooming [22].

The aim of this paper is to propose an efficient solution to the self-calibration problem for a camera with Euclidean image plane. As mentioned above, at most seven calibration parameters can be self-calibrated in two views. Since a camera with Euclidean image plane has eight parameters, one additional assumption has to be made. In this paper we reduce the number of external parameters by assuming that the relative rotation angle between the views is known. The problem thus becomes minimally constrained by seven point correspondences in two views. In practice, the relative rotation angle can be reliably found from, e.g., the readings of an inertial measurement unit (IMU). The possibility of using such additional data in structure-from-motion has been demonstrated in [12].

In general, the joint usage of a camera and an IMU requires external calibration between the devices, i.e. we have to know the transformation matrix between their coordinate frames. However, if only the relative rotation angle is used, then the external calibration is unnecessary, provided that both devices are fixed on a common rigid platform. Thus, the rotation angle of the IMU can be directly used as the rotation angle of the camera [12]. This fact makes the presented self-calibration method more convenient and flexible for practical use.

To summarize, we propose a new non-iterative solution to the self-calibration problem from at least seven matched points in two views, under the following assumptions:

  • the camera intrinsic parameters are the same for both views;

  • the camera has Euclidean image plane;

  • the relative rotation angle between the views is known.

Our self-calibration method is based on ten quartic equations. Nine of them are well known and follow from the famous cubic constraint on the essential matrix. The last, novel one (see Eq. (13)) arises from the condition that the relative rotation angle is known.

Finally, throughout the paper it is assumed that the cameras and scene points are in sufficiently general position. There are critical camera motions for which self-calibration is impossible unless some further assumptions on the internal parameters or the motion are made [21]. There also exist degenerate configurations of scene points. However, in this paper we restrict ourselves to the generic case of camera motions and point configurations.

The rest of the paper is organized as follows. In Section 2, we recall some definitions and results from multiview geometry and derive our self-calibration constraints. In Section 3, we describe the algorithm in detail. In Sections 4 and 5, the algorithm is validated in a series of experiments on synthetic and real data. In Section 6, we discuss the results and draw conclusions.

2. Preliminaries

2.1. Notation

We preferably use for scalars, for column 3-vectors or polynomials, and both for matrices and column 4-vectors. For a matrix the entries are , the transpose is , the determinant is , and the trace is . For two 3-vectors  and  the cross product is . For a vector  the notation  stands for the skew-symmetric matrix such that for any vector . We use  for the identity matrix.

2.2. Fundamental and essential matrices

Let there be given two cameras and , where  is a matrix and  is a 3-vector. Let  be a 4-vector representing the homogeneous coordinates of a point in 3-space, and let  and  be its images, that is

where  means equality up to a non-zero scale. The coplanarity constraint for a pair of corresponding points reads

where the matrix F is called the fundamental matrix.

It follows from the definition of the fundamental matrix that det F = 0. This condition is also sufficient. Thus we have

Theorem 1 ([8]).

A non-zero 3×3 matrix F is a fundamental matrix if and only if det F = 0.
The essential matrix E is the fundamental matrix for a pair of calibrated cameras [I | 0] and [R | t], where R is called the rotation matrix and t is called the translation vector. Hence, E = [t]×R. Matrices E and F are related by

E = K'^T F K,

where K and K' are the upper-triangular calibration matrices of the first and second camera, respectively.

The fundamental matrix has seven degrees of freedom, whereas the essential matrix has only five. This translates into extra constraints on the essential matrix. The following theorem gives one possible form of such constraints.

Theorem 2 ([15]).

A matrix E of rank two is an essential matrix if and only if

E E^T E - (1/2) tr(E E^T) E = 0.
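The constraint of Theorem 2 is easy to check numerically for any matrix of the form E = [t]×R. A minimal numpy sketch (our own illustrative code; the helper names are not from the paper) that also verifies the trace relation used in the next subsection:

```python
import numpy as np

def skew(v):
    """Skew-symmetric matrix [v]_x such that [v]_x @ w == np.cross(v, w)."""
    return np.array([[0, -v[2], v[1]],
                     [v[2], 0, -v[0]],
                     [-v[1], v[0], 0]])

def rotation(axis, theta):
    """Rotation about a unit axis by angle theta (Rodrigues formula)."""
    K = skew(axis)
    return np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * K @ K

# Build a generic essential matrix E = [t]_x R
axis = np.array([1.0, 2.0, 2.0]) / 3.0   # unit axis
R = rotation(axis, 0.7)
t = np.array([0.3, -0.5, 1.0])
E = skew(t) @ R

# Theorem 2: E E^T E - (1/2) tr(E E^T) E = 0
residual = E @ E.T @ E - 0.5 * np.trace(E @ E.T) * E
assert np.allclose(residual, 0)

# Trace relation used in Section 2.3: tr(R) = 1 + 2 cos(theta)
assert np.isclose(np.trace(R), 1 + 2 * np.cos(0.7))
```

The first identity holds because E E^T = ||t||² I − t t^T, whose product with E reduces to ||t||² E.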
2.3. Self-calibration constraints

Let θ be the angle of rotation between the two calibrated camera frames. If θ is known, then the trace of the rotation matrix is known too, since tr R = 1 + 2 cos θ. This leads to an additional quadratic constraint on the essential matrix.

Proposition 1.

Let be a real non-zero essential matrix and . Then  satisfies the equation


Let be such that . Then,


where . It is clear that if  satisfies Eq. (6), then so does  and vice versa. Let us represent  in terms of a unit quaternion , i.e.


where . Then,


Substituting this into (6), after some computation, we get


This completes the proof, since . ∎

It is well known [8, 15] that for a given essential matrix E there is a "twisted pair" of rotations  and  so that . By Proposition 1, the traces of both rotations must be roots of Eq. (6). Since the equation is quadratic, there are no other roots. Thus we have

Proposition 2.

Let be a real non-zero essential matrix satisfying Eq. (6) for a certain . Then, either or , where  is the twisted pair of rotations for .

Now suppose that we are given two cameras with unknown but identical calibration matrices, i.e. K' = K. Then we have

E = K^T F K, (11)

where F is the fundamental matrix. Substituting this into Eqs. (5)–(6), we get the following ten equations:


where . Constraints (12)–(13) involve the internal parameters of the camera and hence can be used for its self-calibration. We note that not all of these constraints are necessarily linearly independent.

Proposition 3.

If the fundamental matrix  is known, then Eq. (12) gives at most three linearly independent constraints on the entries of .


Recall that matrix  is generically of rank two. Let the right and left null vectors of  be  and  respectively. Denote by  the l.h.s. of Eq. (12). Then it is clear that


It follows that, given , at least six of are linearly dependent. Proposition 3 is proved. ∎

3. Description of the Algorithm

The initial data for our algorithm are the point correspondences , , and also the trace  of the rotation matrix .

3.1. Data pre-normalization

To significantly improve the numerical stability of our algorithm, the points  and  are first normalized as follows (adapted from [8]). We construct a matrix  of the form


so that the new points, represented by the columns of matrix


satisfy the following conditions:

  • their centroid is at the coordinate origin;

  • their average distance from the origin is √2.

From now on we assume that  and  are normalized.
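The normalization above can be sketched as follows (our own implementation of the standard isotropic scheme from [8]; function and variable names are illustrative):

```python
import numpy as np

def normalize_points(pts):
    """Isotropic normalization: translate the centroid of pts (an (N, 2)
    array) to the origin and scale so that the average distance from the
    origin is sqrt(2). Returns (pts_norm, T), where T is the 3x3 matrix
    acting on homogeneous coordinates."""
    centroid = pts.mean(axis=0)
    d = np.linalg.norm(pts - centroid, axis=1).mean()
    s = np.sqrt(2) / d
    T = np.array([[s, 0, -s * centroid[0]],
                  [0, s, -s * centroid[1]],
                  [0, 0, 1]])
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])   # homogeneous coords
    pts_norm = (T @ pts_h.T).T[:, :2]
    return pts_norm, T

# Example: four image points in pixel coordinates
pts = np.array([[100.0, 200.0], [320.0, 240.0], [50.0, 400.0], [610.0, 30.0]])
pn, T = normalize_points(pts)
assert np.allclose(pn.mean(axis=0), 0)                            # centroid at origin
assert np.isclose(np.linalg.norm(pn, axis=1).mean(), np.sqrt(2))  # mean distance sqrt(2)
```

The same matrix T is the one used for denormalization in Subsection 3.4.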

3.2. Polynomial equations

The fundamental matrix  is estimated from N ≥ 7 point correspondences in two views. The algorithm is well known; see [8] for details. In the minimal case N = 7, there are either one or three real solutions. Otherwise, the solution is generically unique.
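For the non-minimal case, the linear estimation step can be sketched as follows (our own illustrative implementation of the standard 8-point-style scheme; the minimal 7-point variant instead parametrizes a two-dimensional null space and solves det F = 0):

```python
import numpy as np

def fundamental_8pt(x1, x2):
    """Linear least-squares estimate of F from N >= 8 correspondences
    x1[i] <-> x2[i], each an (N, 2) array, solving x2^T F x1 = 0.
    Rank two is enforced afterwards by zeroing the smallest singular
    value. A sketch: in practice the points should be pre-normalized."""
    N = len(x1)
    A = np.zeros((N, 9))
    for i in range(N):
        u, v = x1[i]
        up, vp = x2[i]
        # Row of the linear system for vec(F) in row-major order
        A[i] = [up*u, up*v, up, vp*u, vp*v, vp, u, v, 1]
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)          # null vector of A
    U, S, Vt = np.linalg.svd(F)
    S[2] = 0                          # enforce det F = 0 (Theorem 1)
    return U @ np.diag(S) @ Vt
```

The rank-two projection is exactly the condition of Theorem 1.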

Let both cameras be identically calibrated and have Euclidean image planes, i.e.


where is the focal length and is the principal point. It follows that


where we introduce a new variable .

Substituting and  into constraints (12) and (13), we get ten quartic equations in variables and . Let  be the l.h.s. of (12). By Proposition 3, up to three of  are linearly independent. Let , , and . The objective is to find all feasible solutions of the following system of polynomial equations


Let us define the ideal . Unfortunately, ideal  is not zero-dimensional: there is a one-parameter family of solutions of system (19) corresponding to the unfeasible case . We state without proof the decomposition


where is the radical of  and is the quotient ideal, which is already zero-dimensional. By (20), the affine variety of  is the union of a finite set of points in  and a conic in the plane .

3.3. Gröbner basis

In this subsection, the Gröbner basis of ideal  is constructed. We start by rewriting equations (19) in the form


where is a coefficient matrix, and


is a monomial vector. Let us consider the following sequence of transformations:


Here each  is the reduced row echelon form of . The monomials in  are ordered so that the left block of matrix  is an identity matrix for each .

Below we exploit some properties of the intermediate polynomials in (23), e.g. they can be factorized or have a degree lower than one would expect from the corresponding monomial vector. All of these properties have been verified in Maple using randomly generated instances of the problem over the field of rationals.

Let us denote by the th row of matrix . Now we describe in detail each transformation of sequence (23).

  • The last row of corresponds to a 3rd degree polynomial. Matrix  of size is obtained from  by appending 3 new rows and 10 new columns. The rows are: , and . Monomial vector


    where we underlined the new monomials (columns) of matrix .

  • The polynomials corresponding to the last three rows of  are divisible by . Matrix  of size is obtained as follows. We append 6 new rows to , which are , and for . Monomial vector .

  • Matrix  of size is obtained from  by appending 6 new rows: , and for . Monomial vector .

  • The last row of  corresponds to a 2nd degree polynomial. Thus we proceed with polynomials of degree up to . We eliminate from  the rows and columns corresponding to all 4th degree polynomials and monomials, respectively. Matrix  of size  is obtained from  as follows. We keep the rows of  with numbers 4, 10, 11, 12, 13, 16, 17, 19, and append 3 new rows: , and . Monomial vector

  • Finally, matrix  of size is obtained from by appending 3 new rows: , and . Monomial vector .

The last six rows of matrix constitute the (reduced) Gröbner basis of ideal  w.r.t. the graded reverse lexicographic order with .

3.4. Internal and external parameters

Given the Gröbner basis of , the action matrix  for multiplication by  in the quotient ring can easily be constructed as follows. We denote by  the lower-right submatrix of . Then the first three rows of  are the last three rows of . The rest of  consists almost entirely of zeros, except


The six solutions are then found from the eigenvectors of matrix ; see [4] for details. Complex solutions and those with  are excluded.
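The eigenvalue trick can be illustrated in the univariate case, where the action matrix of multiplication by x in the quotient ring C[x]/<f> is simply the companion matrix of f, and its eigenvalues are the roots of f (a toy analogue of the multivariate construction used here; the example polynomial is ours):

```python
import numpy as np

# f(x) = x^3 - 6x^2 + 11x - 6 = (x - 1)(x - 2)(x - 3), coefficients
# listed from the leading (monic) term down to the constant term.
f = [1.0, -6.0, 11.0, -6.0]
n = len(f) - 1

# Companion matrix: multiplication by x in C[x]/<f> with respect to the
# monomial basis {1, x, x^2}; the last column holds minus the trailing
# coefficients of the monic polynomial.
C = np.zeros((n, n))
C[1:, :-1] = np.eye(n - 1)
C[:, -1] = [-f[3], -f[2], -f[1]]

roots = np.linalg.eigvals(C)
assert np.allclose(sorted(roots.real), [1.0, 2.0, 3.0])
```

In the multivariate setting the same idea applies: the Gröbner basis determines the multiplication operator on the finite-dimensional quotient ring, and its eigenstructure yields all solutions of the system.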

Having found the calibration matrix , we compute the essential matrix  from formula (11), and then the externals  and  using the standard procedure; see e.g. [17]. Note that, due to Proposition 2 and the cheirality constraint [8, 17], the trace of the estimated matrix  must equal .
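The standard SVD-based extraction of the externals from an essential matrix can be sketched as follows (our own illustrative code, cf. [8, 17]; the cheirality test that selects the physical candidate by triangulating a point is omitted):

```python
import numpy as np

def decompose_essential(E):
    """Return the four candidate (R, t) pairs encoded by an essential
    matrix: the 'twisted pair' of rotations combined with +-t (t is
    recovered only up to scale, here unit norm). The cheirality test
    selects the physically valid candidate among the four."""
    U, _, Vt = np.linalg.svd(E)
    # Force proper rotations (det = +1)
    if np.linalg.det(U) < 0:
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    W = np.array([[0., -1., 0.],
                  [1., 0., 0.],
                  [0., 0., 1.]])
    Ra, Rb = U @ W @ Vt, U @ W.T @ Vt   # twisted pair of rotations
    t = U[:, 2]                         # left null vector of E
    return [(Ra, t), (Ra, -t), (Rb, t), (Rb, -t)]
```

By Proposition 2, the known trace of the rotation can also be used here to disambiguate the twisted pair.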

Finally, the denormalized entities (see subsection 3.1 and the definition of matrix ) are found as follows:

  • fundamental matrix is ;

  • calibration matrix is ;

  • essential matrix is unchanged, as ;

  • externals  and  are also unchanged.

4. Experiments on Synthetic Data

In this section we test the algorithm on synthetic image data. The default data setup is given in the following table:

Distance to the scene 1
Scene depth 0.5
Baseline length 0.1
Image dimensions

4.1. Numerical accuracy

First we verify the numerical accuracy of our method. The numerical error is defined as the minimal relative error in the calibration matrix, that is

Here  stands for the Frobenius norm, the index  counts all real solutions, and

is the ground truth calibration matrix. The numerical error distribution is reported in Fig. 1.

Figure 1. Distribution of the numerical error for noise-free data. The median error is

4.2. Number of solutions

Figure 2. The number of real (left) and feasible (right) solutions for the internal calibration. Noise free data

In general, the algorithm outputs six solutions for the calibration matrix, counting both real and complex ones. The number of real solutions is usually two or four. The number of real solutions with positive  (we call such solutions feasible) is equal to one in most cases. The corresponding distributions are shown in Fig. 2.

4.3. Behaviour under noise

To evaluate the robustness of our solver, we add two types of noise to the initial data. First, image noise, modelled as zero-mean Gaussian with a standard deviation varying from 0 to 1 pixel in a  image. Second, angle noise, resulting from an inaccurately measured relative rotation angle . In practice,  is computed by integrating the angular velocity measurements obtained from a 3d gyroscope. Therefore, a realistic model of the angle noise is quite complicated and depends on a number of factors, such as the value of , the measurement rates, the inertial sensor noise model, etc. In a very simplified manner, the angle noise can be modelled as  [12], where  has a Gaussian distribution with zero mean and standard deviation . In our experiments  ranges from  to .

Figure 3. The relative error in the calibration matrix at varying level of noise (left) and at varying number of points (right). The number of image points on the left figure is . The standard deviation for the image noise on the right figure is 1 pixel

Fig. 3 demonstrates the behaviour of the algorithm under increasing image and angle noise. Here and below, each point on the diagram is the median of  trials.

4.4. Comparison with the existing solvers

We compare our algorithm for the minimal number of points (N = 7) with the 6-point solver from [20] and the 5-point solver from [19].

Fig. 4 depicts the relative focal length error at varying levels of image noise. Here and below means a -miscalibration in the parameters  and , i.e. the data was generated using the ground truth calibration matrix , whereas the solutions were found assuming that

Figure 4. The relative focal length error against noise standard deviation in pixels
Figure 5. The rotational (left) and translational (right) errors in degrees against noise standard deviation in pixels

In Fig. 5, the rotational and translational errors of the algorithms are reported. As can be seen, for realistic levels of image noise, the 5-point algorithm expectedly outperforms both our solution and the 6-point solver. However, it is worth emphasizing that our solution is more suitable for the internal self-calibration of a camera than for its pose estimation. Once the self-calibration is done, it is more efficient to switch to the 5- or even 4-point solvers from [12, 17].

We also compared the speed of the algorithms. The average running times over  trials are  ms (our 7pt),  ms (6pt) and  ms (5pt) on a system with a 2.3 GHz processor. The most expensive step of our algorithm is the computation of the five reduced row echelon forms in sequence (23).

5. Experiments on Real Data

In this section we validate the algorithm using the publicly available EuRoC dataset [1]. The dataset contains sequences of data recorded from an IMU and two cameras on board a micro-aerial vehicle, together with ground truth. Specifically, we used the “Machine Hall 01” sequence (easy conditions) and only the images taken by camera “0”. The image size is  (WVGA) and the ground truth calibration matrix is

Figure 6. The pair of undistorted images with the time stamps “1403636646263555584” and “1403636646613555456” from the EuRoC dataset and the matched points

A sequence of 68 image pairs was derived from the dataset and the algorithm was applied to every image pair; see the example in Fig. 6. Here it is necessary to make a few remarks.

  • Since the algorithm assumes the pinhole camera model, every image was first undistorted using the ground truth parameters.

  • As was mentioned in Subsection 4.2, the feasible solution is almost always unique. However, in rare cases multiple solutions are possible. To reduce the probability of getting such solutions, we additionally assumed that the principal point is sufficiently close to the geometric image center. More precisely, the solutions with

    were marked as unfeasible and hence discarded. This condition almost guarantees that the algorithm outputs at most one solution.

  • Given the readings of a triple-axis gyroscope, the relative rotation angle can be computed as follows. The gyroscope reading at time  is an angular rate 3-vector . Let , where . Then the relative rotation matrix  between the th and th frames is approximately found from the recursion


    where and the matrix exponential is computed by the Rodrigues formula


    The relative rotation angle is then derived from .

  • The image pairs with the relative rotation angle less than  degrees were discarded, since the motion in this case is close to a pure translation and self-calibration becomes unstable.
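The gyroscope integration described above can be sketched as follows (our own illustrative implementation; function names are not from the paper). The relative angle is recovered from the trace via tr R = 1 + 2 cos θ:

```python
import numpy as np

def so3_exp(w):
    """Matrix exponential of [w]_x, computed by the Rodrigues formula."""
    theta = np.linalg.norm(w)
    K = np.array([[0, -w[2], w[1]],
                  [w[2], 0, -w[0]],
                  [-w[1], w[0], 0]])
    if theta < 1e-12:
        return np.eye(3) + K   # first-order fallback for tiny rotations
    return (np.eye(3) + np.sin(theta) / theta * K
            + (1 - np.cos(theta)) / theta**2 * K @ K)

def relative_angle(omegas, dt):
    """Integrate angular-rate samples omegas (a sequence of 3-vectors)
    over a fixed time step dt and return the relative rotation angle,
    recovered from tr(R) = 1 + 2 cos(theta)."""
    R = np.eye(3)
    for w in omegas:
        R = R @ so3_exp(np.asarray(w) * dt)
    return np.arccos(np.clip((np.trace(R) - 1) / 2, -1.0, 1.0))

# Constant rate about a fixed axis: total angle = |w| * N * dt = 0.3 rad
w = [0.1, 0.2, 0.2]
assert np.isclose(relative_angle([w] * 100, 0.01), 0.3)
```

For a constant axis the exponentials commute and the integration is exact; with varying axes the recursion is the first-order approximation used in the paper.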

The estimated internal parameters for each image pair are shown in Fig. 7. The calibration matrix averaged over the entire sequence is given by


Hence the relative error in the calibration matrix is about .

Figure 7. The estimated focal length and principal point coordinates for the 68 image pairs. The dashed horizontal lines represent the average values

6. Conclusions

We have presented a new practical solution to the problem of self-calibration of a camera with Euclidean image plane. The solution operates with at least seven point correspondences in two views and the known value of the relative rotation angle.

Our method is based on a novel quadratic constraint on the essential matrix, see Eq. (6). We expect that this constraint could find other applications as well. In particular, it can be used to obtain an alternative solution to the problem from [12]. The investigation of this possibility is left for future work.

We have validated the solution in a series of experiments on synthetic and real data. Under the assumption of generic camera motion and point configuration, the algorithm is shown to be numerically stable and to perform well in the presence of noise. It is also fast and in most cases produces a unique feasible solution for the camera calibration.


  • [1] M. Burri, J. Nikolic, P. Gohl, T. Schneider, J. Rehder, S. Omari, M.W. Achtelik, and R. Siegwart, The EuRoC micro aerial vehicle datasets, The International Journal of Robotics Research 35 (2016), no. 10, 1157–1163.
  • [2] M. Byröd, K. Josephson, and K. Åström, Improving numerical accuracy of Gröbner basis polynomial equation solvers, IEEE 11th International Conference on Computer Vision (ICCV), IEEE, 2007, pp. 1–8.
  • [3] B. Caprile and V. Torre, Using vanishing points for camera calibration, International Journal of Computer Vision 4 (1990), no. 2, 127–139.
  • [4] D. Cox, J. Little, and D. O’Shea, Ideals, varieties, and algorithms, vol. 3, Springer, 2007.
  • [5] O.D. Faugeras, Three-dimensional computer vision: A geometric viewpoint, MIT Press, 1993.
  • [6] R. Hartley, Estimation of relative camera positions for uncalibrated cameras, European Conference on Computer Vision, Springer, 1992, pp. 579–587.
  • [7] R. Hartley, Self-calibration from multiple views with a rotating camera, European Conference on Computer Vision, Springer, 1994, pp. 471–478.
  • [8] R. Hartley and A. Zisserman, Multiple view geometry in computer vision, Cambridge University Press, 2003.
  • [9] J. Heikkilä, Geometric camera calibration using circular control points, IEEE Transactions on Pattern Analysis and Machine Intelligence 22 (2000), no. 10, 1066–1077.
  • [10] A. Heyden and K. Åström, Euclidean reconstruction from image sequences with varying and unknown focal length and principal point, Conference on Computer Vision and Pattern Recognition, IEEE, 1997, pp. 438–443.

  • [11] Z. Kukelova, M. Bujnak, and T. Pajdla, Polynomial eigenvalue solutions to the 5-pt and 6-pt relative pose problems, British Machine Vision Conference, vol. 2, 2008.
  • [12] B. Li, L. Heng, G.H. Lee, and M. Pollefeys, A 4-point algorithm for relative pose estimation of a calibrated camera with a known relative rotation angle, IEEE/RSJ International Conference on Intelligent Robots and Systems, IEEE, 2013, pp. 1595–1601.
  • [13] H. Li, A simple solution to the six-point two-view focal-length problem, European Conference on Computer Vision, Springer, 2006, pp. 200–213.
  • [14] D. Liebowitz and A. Zisserman, Combining scene and auto-calibration constraints, IEEE 7th International Conference on Computer Vision (ICCV), vol. 1, IEEE, 1999, pp. 293–300.
  • [15] S.J. Maybank, Theory of reconstruction from image motion, Springer-Verlag, 1993.
  • [16] S.J. Maybank and O.D. Faugeras, A theory of self calibration of a moving camera, International Journal of Computer Vision 8 (1992), no. 2, 123–151.
  • [17] D. Nistér, An efficient solution to the five-point relative pose problem, IEEE Transactions on Pattern Analysis and Machine Intelligence 26 (2004), no. 6, 756–770.
  • [18] L. Quan and B. Triggs, A unification of autocalibration methods, Asian Conference on Computer Vision, 2000, pp. 917–922.
  • [19] H. Stewénius, C. Engels, and D. Nistér, Recent developments on direct relative orientation, ISPRS Journal of Photogrammetry and Remote Sensing 60 (2006), no. 4, 284–294.
  • [20] H. Stewénius, D. Nistér, F. Kahl, and F. Schaffalitzky, A minimal solution for relative pose with unknown focal length, Image and Vision Computing 26 (2008), no. 7, 871–877.
  • [21] P. Sturm, Critical motion sequences for monocular self-calibration and uncalibrated Euclidean reconstruction, Conference on Computer Vision and Pattern Recognition, IEEE, 1997, pp. 1100–1105.
  • [22] P. Sturm, Self-calibration of a moving zoom-lens camera by pre-calibration, Image and Vision Computing 15 (1997), no. 8, 583–589.
  • [23] B. Triggs, Autocalibration and the absolute quadric, Conference on Computer Vision and Pattern Recognition, IEEE, 1997, pp. 609–614.
  • [24] R.Y. Tsai, A versatile camera calibration technique for high-accuracy 3d machine vision metrology using off-the-shelf TV cameras and lenses, IEEE Journal on Robotics and Automation 3 (1987), no. 4, 323–344.
  • [25] Z. Zhang, A flexible new technique for camera calibration, IEEE Transactions on Pattern Analysis and Machine Intelligence 22 (2000), no. 11, 1330–1334.