DeepAI
Log In Sign Up

Fitting Generalized Essential Matrices from Generic 6x6 Matrices and its Applications

This paper addresses the problem of finding the closest generalized essential matrix from a given 6x6 matrix, with respect to the Frobenius norm. To the best of our knowledge, this nonlinear constrained optimization problem has not been addressed in the literature yet. Although it can be solved directly, it involves a large amount of 33 constraints, and any optimization method to solve it will require much computational time. Then, we start by converting the original problem into a new one, involving only orthogonal constraints, and propose an efficient algorithm of steepest descent-type to find the goal solution. To test the algorithm, we evaluate with both synthetic and real data. From the results with synthetic data, we conclude that the proposed method is much faster than applying general optimization techniques to the original problem with 33 constraints. To conclude and to further motivate the relevance of our method, we develop an efficient and robust algorithm for estimation of the general relative pose problem, which will be compared with the state-of-the-art techniques. It is shown, in particular, that some existing approaches to solving the relative pose estimation problem can be considerably improved, if combined with our method for estimating the closest generalized essential matrix. Real data to validate the algorithm is used as well.

READ FULL TEXT VIEW PDF

page 1

page 2

page 3

page 4

01/21/2021

Fast and Robust Certifiable Estimation of the Relative Pose Between Two Calibrated Cameras

The Relative Pose problem (RPp) for cameras aims to estimate the relativ...
04/14/2019

Gyroscope-aided Relative Pose Estimation for Rolling Shutter Cameras

The rolling shutter camera has received great attention due to its low c...
03/30/2020

Certifiable Relative Pose Estimation

In this paper we present the first fast optimality certifier for the non...
08/30/2022

Prior-Aware Synthetic Data to the Rescue: Animal Pose Estimation with Very Limited Real Data

Accurately annotated image datasets are essential components for studyin...
02/10/2018

A generalized matrix Krylov subspace method for TV regularization

This paper presents an efficient algorithm to solve total variation (TV)...
11/23/2018

MURAUER: Mapping Unlabeled Real Data for Label AUstERity

Data labeling for learning 3D hand pose estimation models is a huge effo...
04/11/2021

USACv20: robust essential, fundamental and homography matrix estimation

We review the most recent RANSAC-like hypothesize-and-verify robust esti...

1 Introduction

The epipolar constraint is one of the fundamental geometry constraints in computer vision. It relates the rigid transformation between two cameras with different external parameters

[1, 2] and correspondences between points in the two images. It is one of the most common tools for scene reconstruction, known as passive techniques, i.e. two cameras looking at the same scene at different positions. These techniques have also been used in robot vision applications, such as visual odometry [3].

For many years, authors focused on perspective cameras to build this stereo pair [2], see Fig. 1LABEL:sub@fig:central_rep. However, these cameras have, among several disadvantages, a limited field of view. To overcome this, some authors have developed new camera systems. Special emphasis has been devoted to omnidirectional cameras, which can be either set up using catadioptric camera systems (combination of perspective cameras with mirrors) [4, 5, 6], or multi-perspective camera systems [7, 8], both ensuring a wider field of view. In most of these cases, these omnidirectional cameras are non-central (e.g. see for example [9] for omnidirectional catadioptric cameras).

Other types of imaging sensors have been proposed. Most of them cannot be modeled by the perspective camera model due to their physical constraints. Examples include pushbroom cameras [10] or refractive imaging devices [11]. Cameras that cannot be modeled by the perspective camera model are called non-central. To handle any type of camera (central or non-central), we consider a pair of camera systems that are parameterized by the general camera model [12, 13, 14]. In this model, camera pixels are mapped onto generic 3D straight lines in the world. If all these 3D lines intersect in a single point, the system will be central (see Fig. 1LABEL:sub@fig:central_rep), otherwise it is non-central (see Fig. 1LABEL:sub@fig:ncentral_rep).

For a central camera stereo problem, one can define a matrix that encodes the incident relation between the projection lines of both cameras [1, 2] (epipolar constraint). This matrix is called essential. However, in the case of general camera systems, such a matrix cannot be used because the central constraints are not fulfilled. Instead, it can be proved that there is a matrix that expresses the incident relation between the projection lines of two general camera systems [15, 16]. Such a matrix is called generalized essential

and has a special block structure, involving rotation and skew-symmetric matrices of order

. Often, computer vision problems where this matrix has to be determined are affected with noise and the result is a matrix of order that fails to fit the structure characterizing the generalized essential matrices. In these cases, one needs to find the closest generalized essential matrix from a generic matrix. From a mathematical point of view, this is a nonlinear constrained matrix optimization problem. The fact that it is a nonconvex problem raises many difficulties to find a global minimum.

Examples of applications that require the estimation of generalized essential matrices are: the computation of the minimal relative pose position for general camera models [17]; the estimation of the camera motion using multiple multi-perspective camera systems [7, 8]; general structure-from-motion algorithms [18, 19]; and the estimation of the camera absolute pose using known coordinates of 3D straight lines [20]. Since the special structure of generalized essential matrices involves many nonlinear constraints (a rotation matrix and its product with a skew-symmetric matrix), finding the correct parameters of these matrices may slow down significantly the algorithms. An important advantage of having a method to approximate general essential matrices from generic matrices, is to allow us to get rid of some of those constraints (or, at least, to reduce the tolerance of these constraints in the optimization processes) to turn the algorithms faster.

(a) Central Camera Systems
(b) General Camera Systems
Figure 1: Representation of the differences between central and general camera systems, Figs. LABEL:sub@fig:central_rep and LABEL:sub@fig:ncentral_rep, respectively. In this paper we address the general case, in which the camera system does not have a single view point.

In this paper, we show that the problem under investigation can be reformulated as an optimization problem with only orthogonal constraints. Then, we present an efficient algorithm to find a solution. In addition, we give theoretical arguments to explain why the problem has a global minimum (some open questions will be pointed out). We recall that methods for problems with orthogonal constraints are available in the literature [21, 22, 23, 24], but their difficulties depend significantly on the objective function. In our problem, the objective function is not easy to handle, raising many challenging issues. To address them, techniques from matrix theory and optimization on manifolds are required.

The method proposed in this paper aims at approximating a generalized essential matrix , from a general matrix , with the assumption that the latter was estimated by ignoring some of the generalized essential constraints. Our motivation for developing such a method is twofold: 1) when, for some reason, methods that estimate do not consider some of the generalized essential matrix constraints or ignore them altogether (such as DLT techniques), methods such as ours are helpful to obtain a true generalized essential matrix; and 2) when a large tolerance for the generalized essential constraints is used to speed up the computation of , in complex optimization techniques, methods like ours can be used to correct the result. Experiments illustrating the latter situation will be presented in Sec. 5.

We point out that the main goal of this work is not to propose a technique for the well-known problem of estimating a generalized essential matrix from a set of projection lines (as did in [25] for general camera systems and [26] for multi-camera systems), but instead, to estimate from a previously computed by other techniques. However, to reinforce the importance of our method, we propose a new approach for the general relative pose problem, which results from a combination of known techniques with posterior estimation of the closest generalized essential matrix.

We recall that, for the central case, similar approximation techniques have been used in pose position estimation. For example, in [27], for central perspective cameras, the author estimates a true essential matrix with respect to the Frobenius norm, by firstly using DLT techniques to compute a rough estimative to the essential matrix. He proved that this method performs almost as well as the best iterative algorithms, being faster in many cases. More recently, other methods have been developed using similar approximation techniques for the central perspective camera; see, for instance, [28, 29]. We believe that the method we are proposing in this paper will contribute for the development of similar techniques, but for general camera models.

1.1 Notation

Small regular letters denote scalars, e.g.

; small bold letters denote vectors, e.g.

; and capital bold letters denote matrices, e.g. . In a matrix , denotes the submatrix composed by the lines from to and columns from to . We represent general projection lines using Plücker coordinates [30]:

(1)

where and

are the line direction and moment, respectively. The operator

denotes a vector that can be represented up to a scale factor.

The hat operator represents the skew-symmetric matrix that linearizes the cross product, i.e. . denotes the Frobenius norm (also known as -norm) of the matrix , which can be defined as a function of the trace:

(2)

stands for the group of rotation matrices of order 3, i.e., the group of orthogonal matrices with determinant equal to 1. To conclude, denotes an diagonal matrix, with diagonal entries equals to , represents the matrix exponencial of , and the asterisk symbol in denotes a minimizer of an optimization problem.

1.2 Outline of the Paper

This paper is organized as follows. In the next section we start by revisiting the similar, but simpler, problem of finding the closest essential matrix arising in central camera systems. Then we formulate mathematically the problem under investigation in this work and explain how it could be solved in a straightforward fashion. In Sec. 3, we reformulate the original problem and derive an efficient solution for it. In Sec. 4, our method is compared with the direct solution using synthetic data. Results with the motivation on the use of our technique are shown in Sec. 5. Experimental results are discussed in the respective experimental sections. Finally, in Sec. 6, some conclusions are drawn.

2 Problem Definition and Direct Solutions

For a better understanding of the generalized essential matrix parametrization, we first revisit the simpler case of the regular essential matrix corresponding to Fig. 1LABEL:sub@fig:central_rep. An essential matrix aims at representing the incident relation between two projection lines of two cameras looking at the same 3D points in the world. The rigid transformation between both coordinate systems is taken into account.

Without loss of generality, it is assumed that both cameras are represented at the origin of each coordinate system, ensuring that all the 3D projection lines of each camera pass through the origin of the respective camera coordinate system. Under this assumption, one can represent 3D projection lines by the respective directions, here denoted as and for 3D projection lines that must intersect in the world, where and represent the left and right rays, respectively. Moreover, we assume that the transformation between both camera systems is given by a rotation matrix and a translation vector . Using this formulation, an essential matrix can be defined by:

(3)

Hence, the problem of finding the closest essential matrix from a general , may be formulated as:

(4)

It has an explicit solution (see, for instance, Theorem 5.9 in [1]):

(5)

where the orthogonal matrices and and scalars

are given by the singular value decomposition of

:

(6)

It turns out that, for the non-central case (the one addressed in this paper), and since the perspective constraints are note verified, we cannot represent the 3D projection lines only with its directions [12]. One has to parametrize these 3D projection lines using a general 3D straight lines representation (unconstrained 3D straight lines), see Fig. 1LABEL:sub@fig:ncentral_rep. Let us represent lines of both cameras as and , represented in each coordinate system and parameterized by Plücker coordinates (see Sec. 1.1). The incident relation between both 3D projection lines is given by:

(7)

where is a generalized essential matrix [15, 16, 31] and denotes the set of generalized essential matrices:

(8)

As observed in (8), a generalized essential matrix has a special form. It is built up by block matrices that depend on rotation and translation parameters. Likewise most algorithms for the estimation of essential matrices, in many situations is estimated without enforcing the respective constraints. This happens, in particular, when using DLT techniques [2] or iterative schemes that do not take into account all the constraints, to speed up the respective process.

One of the goals of this paper is to estimate a true generalized essential matrix (i.e., a matrix satisfying the constraints associated with (8)) that is closest to a general with respect to the Frobenius norm. Formally, this problem can be formulated as:

(9)

The critical part in the previous optimization problem is to ensure that belongs to the space of solutions . There is however a trivial way of ensuring this, which can be derived directly from (8). These constraints are associated with the fact that must be built up by stacking both and . This corresponds to the following optimization problem:

(10)
subject to

Some drawbacks associated with (10) are related with the large amount of constraints. In total, 33 constraints are involved, being many of them quadratic. As we shall see in the experimental results, this will increase significantly the computational time required to fit the generalized essential matrix.

In the next section, we derive an efficient algorith to solve (9), that exploits the particular features of the objective function and of the constraints.

3 An Efficient Solution

Now, we describe a method for the efficient approximation of the generalized essential matrix from a given matrix, with respect to the Frobenius norm. Our approach, first, represents the problem independently from the translation parameters, which means that only orthogonal constraints will be involved (Sec. 3.1). Then, we propose an efficient algorithm to solve the reformulated optimization problem (Sec. 3.2). To conclude this section, a pseudocode of the algorithm will be included (Sec. 3.3).

3.1 Reformulation of the Problem

Since our goal is to represent the problem independently from the translation parameters, we start by eliminating the skew-symmetric constraints in (9). To ease the notation, we define the following submatrices: ; ; ; and .

We start by finding a workable expression for the objective function in terms of and :

(11)

Attending to the linearity of the trace function and the fact that , the following expression is obtained for the objective function:

(12)

where:

(13)

is a constant. Let us denote and . With respect to the Frobenius norm, it is well-known and easy to show that the nearest skew-symmetric from a given generic matrix is the so-called skew-symmetric part of (check Theorem 5.3 in [32] with ):

(14)

Hence, we can replace in (12), yielding:

(15)

where:

(16)

Writing again the Frobenius norm in terms of the trace of a matrix gives:

(17)

with being the constant given in (16). This allows a new reformulation of the problem (9) as:

(18)
subject to

which has only orthogonal constraints.

3.2 Solving the Problem

Many optimization problems with orthogonal constraints have been investigated in the last two decades; see for instance [21, 23, 24, 33]. The right framework to deal with this kind of problems is to regard them as optimization problems on matrix manifolds. Tools from Riemaniann geometry, calculus on matrix manifolds, and numerical linear algebra are required. Similar techniques can be used in our particular problem (18), because the set of rotation matrices is a manifold. It has also a structure of Lie group (see [34]) and is a compact set. We recall that this latter property guarantees the existence of at least a global minimum for (18); see Part III of [35]. However, the complicated expression of the objective function turns hard to find an explicit expression for those global minima. In addition, the lack of convexity of our problem (neither the objective function nor the constraints are convex) may only guarantee the approximation of local minima.

It turns out however that some numerical experiments (not showed in this paper) suggest that the approximation produced by Algorithm 1, in Sec. 3.3, is indeed the global one. We could observe this because different initial guesses led to convergence to the same matrix. Unfortunately, a theoretical confirmation that Algorithm 1 always converges to a global minimum is still unknown. Nevertheless, it can be guaranteed that when is close to being a generalized essential matrix (this happens in many practical situations, as shown later in Section 4), a local minimizer for (18) will be very close to the global one. To give more insight into this claim, let us consider the original formulation (9) and denote by a local minimizer, by a global minimizer and assume that:

(19)

for a certain positive value . Since , we have:

(20)

which shows that:

(21)

This means that if then . For instance, if , then with an error not greater that . Note that both and are generalized essential matrices.

Once a local minimum of (18) has been computed, the corresponding skew-symmetric matrix needed in (9) will be:

(22)

Hence, the required nearest generalized essential matrix from will be, therefore, given by:

(23)

The algorithm we will propose in Sec. 3.3 is of steepest descent type. Loosely speaking, these methods are essentially based on the property that the negative of the gradient of the objective function points out the direction of fastest decrease. For more details on general steepest descent methods see, for instance, [35, Sec. 8.6] or [36, Ch. 3]. In our situation, one has to account the constraint that must be a rotation matrix and so our algorithm will evolve on the manifold . Hence we have to resort to steepest descent methods on matrix manifolds (see [34, Ch. 3]). For particular manifolds, tailored methods exploiting its special features have been proposed by many authors. For instance, for the complex Stiefel manifold, Manton [23, Alg. 15] proposed a modified steepest descent method based on Euclidean projections, where the length of the descent direction is calculated by the Armijo’s step-size rule. This method has been adapted and improved by Abrudan et al. in [33, Table II] for the manifold of unitary matrices, where geodesics replace the Euclidean projections. When dealing with manifolds that are Lie groups, geodesics are defined upon the matrix exponential, which is a much studied matrix function [37].

The strategy adopted for the steepest descent algorithm to be described below has been inspired in [23, 33] and we refer the reader to those papers for more technical details. The particular nature of the objective function (17

) is exploited in order to improve the efficiency of the method. In particular, we propose a specific technique to choose an initial guess (thus reducing the number of iterations) and avoid expensive methods based on Schur decompositions and Padé approximation for the computation of matrix exponentials.

Before displaying the steps of our algorithm, one needs to find the Euclidean and the Riemannian gradients of the objective function. After some calculations (see [38] for formulae on derivatives of the trace function), the Euclidean gradient of at (17) is:

(24)

and the Riemaniann gradient is:

(25)

Note that is a “tangent vector” that is actually a skew-symmetric matrix. Geodesics on (i.e., curves giving the shortest path between two points in the manifold) can be defined through the matrix exponential as:

(26)

where is a skew-symmetric matrix representing a translation and is a real scalar. Algorithm 1 summarizes the main steps of our method.

3.3 Algorithm and Computational Remarks

In a few words, Algorithm 1 starts with an initial approximation , finds the skew-symmetric matrix (the gradient direction on the manifold), and performs several steps along geodesics until convergence. The positive scalar controls the length of the “tangent vector” and, in turn, the overall convergence of the algorithm. To find an almost optimal , the algorithm uses the Armijo’s step-size rule as did in [23].

1:;
2:;
3:;
4: is an initial guess;
5:;
6:error ;
7:Choose a tolerance tol;
8:while error > tol do
9:     ;
10:     ;
11:     ;
12:     ;
13:     ;
14:     while  do
15:         ;
16:         ;
17:         ;      
18:     while  do
19:         ;
20:         ;      
21:     ;
22:     error;
23:     ;
24:;
25:;
26:.
Algorithm 1 Given a matrix , this algorithm approximates the closest generalized essential matrix from for a given tolerance tol.
(a) As a function of the computational speed.
(b) As a function of the number of iterations.
(c) As a function of the computational speed.
(d) As a function of the number of iterations.
Figure 2: Comparison between the method presented in this paper against general optimization techniques applied to (10), as a function of the variation of the Noise Level. The Tolerance value is set to and . We evaluate the methods for both the computational speed and the number of iterations, Figs. LABEL:sub@fig:results_evaluation_noise_speed_e9 & LABEL:sub@fig:results_evaluation_noise_iter_e9, and LABEL:sub@fig:results_evaluation_noise_speed_e6 & LABEL:sub@fig:results_evaluation_noise_iter_e6, for the Tolerance level of and , respectively.

Now we draw our attention for some important computational remarks about the algorithm.

  • As the algorithm runs, the function , which involves the computation of traces of products of matrices, is called several times. Note that the efficient computation of does not require matrix products. Instead, it can be carried out through the formula:

    (27)

    where the operator denotes the Hadamard product, i.e. entry-wise product. If and are matrices of order , the direct computation of the matrix product needs operations, while the trace at (27) just requires ;

  • The exponential of the skew-symmetric matrix in lines 12 and 19 of the algorithm can computed by means of the well-known Rodrigues’ formula [39]:

    (28)

    which involves (at leading cost) the computation of just one matrix product. The direct use of the MATLAB function (which is based on the scaling and squaring method combined with Padé approximation) would be much more expensive.

  • Note that the trace of any skew-symmetric matrix is always zero and so:

    (29)

    This guarantees that matrices do not leave the rotation manifold ; and

  • To conclude, we remark that the choice of the initial guess influences the running time of the algorithm. An obvious choice for

    would be the identity matrix

    of order . It turns out that other choices of may reduce significantly the number of iterations in the algorithm. In our experiments, we have chosen as the initial guess the rotation matrix that maximizes of the sum defining in (17). We recall that this problem has an explicit solution based on the singular value decomposition of (see [40, Sec. 12.4]):

    (30)

    with and being the orthogonal matrices arising in the singular value decomposition of , that is, . Since, in general, , it is expected that will be more close to the minimizer than .

4 Implementation of Our Method

In this section, Algorithm 1 is compared with general optimization techniques applied to the direct formulation of the problem (10), using synthetic data.

(a) As a function of the computational speed, for a Noise Level of .
(b) As a function of the number of iterations, for a Noise Level of .
(c) As a function of the computational speed, for a Noise Level of .
(d) As a function of the number of iterations, for a Noise Level of .
Figure 3: Comparison between Algorithm 1 and general optimization techniques applied to (10), as a function of the Tolerance considered in the algorithms. The colors of the curves and associated algorithms are identified in Fig. 2. We have considered two distinct values to the Noise Level: and , which correspond to the cases in which our method performed worst than general optimization techniques, but just in terms of the number of iterations (see Fig. 2). We evaluate the methods for both the number of iterations and computational speed, Figs. LABEL:sub@fig:results_evaluation_tolerance_iter & LABEL:sub@fig:results_evaluation_tolerance_iter_2 and LABEL:sub@fig:results_evaluation_tolerance_speed & LABEL:sub@fig:results_evaluation_tolerance_speed_2 respectively.

4.1 Experimental Results

Our method (here denoted as OUR) is tested against the following widely used general optimization techniques:

interior-point:

Solution of (10), by the interior-point method [41];

sqp:

Solution of (10) by the sequential quadratic programming method [42];

active-set:

Solution of (10) by the active-set method [43].

Likewise the general optimization techniques, OUR algorithm was optimized (part of it was implemented in C++), and can be accessed from MATLAB. All the results shown below were implemented on this framework.

For the data-set, we first generate random rotation matrices and random translation vectors . With these rotation and translation elements, we build a generalized essential matrix , as defined in (7).

To carry out the experiments, we propose a variation of the deviation of a generic matrix in from a true generalized essential matrix. The procedure is as follows: we first generate an error matrix

, in which the respective elements are randomly generated from a normal distribution with standard deviation equal to the variable

Noise Level, and then compute the “noisy” matrix as . All the aforementioned methods are then applied.

Two tolerance values for the algorithms were selected, & , and we change the variable Noise Level from to . Results for both the computational speed and the number of iteration are displayed in Figs. 2LABEL:sub@fig:results_evaluation_noise_speed_e9 & 2LABEL:sub@fig:results_evaluation_noise_speed_e6 and 2LABEL:sub@fig:results_evaluation_noise_iter_e9 & 2LABEL:sub@fig:results_evaluation_noise_iter_e6, respectively. For each value of the Noise Level, random trials were generated.

In addition to the evaluation of the deviation from the generalized essential matrix constraints, we have tested the proposed method against general optimization algorithms as a function of the tolerance of the algorithms as well (here denoted as Tolerance value). For that purpose, we fixed a Noise Level equal to and to  111These values for the Noise Level have been chosen to illustrate the case where our method performed worst than the other methods, with respect to the number of iterations., and select a Tolerance value ranging from to . The results for both the computational speed and the number of iterations are shown in Figs. 3LABEL:sub@fig:results_evaluation_tolerance_speed &3LABEL:sub@fig:results_evaluation_tolerance_speed_2 and 3LABEL:sub@fig:results_evaluation_tolerance_iter & 3LABEL:sub@fig:results_evaluation_tolerance_iter_2, respectively. Likewise the previous case, for each level of tolerance, random trials were generated.

Next, we discuss these experimental results.

4.2 Discussion

As observed in Figs. 2LABEL:sub@fig:results_evaluation_noise_speed_e9 and 2LABEL:sub@fig:results_evaluation_noise_speed_e6, our method is significantly faster than the general optimization techniques. Its computational speed rarely changes as a function of the deviation from the generalized essential matrix constraints. In fact, as shown in Figs. 3LABEL:sub@fig:results_evaluation_tolerance_speed and 3LABEL:sub@fig:results_evaluation_tolerance_speed_2, which represents the worst scenario for OUR method, it can be seen that it performs times faster than any other method. While our method requires less than seconds, the other methods can take between and seconds. This is a remarkable advantage of our method for applications requiring real-time computations, such as the camera relative pose estimation, which will be addressed later in Sec. 5.

In Figs. 2LABEL:sub@fig:results_evaluation_noise_iter_e9 and 2LABEL:sub@fig:results_evaluation_noise_iter_e6 one can observe that, contrarily to general optimization techniques, the relationship between the number of iterations and Noise Level is nearly linear. With the exception of the sqp method, for low levels of noise, in general our method requires less iterations. However, as described in the previous paragraph, the computational speed is significantly lower for any Noise Level, independently of the number of the required iterations.

In addition to the evaluation in terms of deviation from the true generalized essential matrices, we have also compared the proposed method with general optimization techniques as a function of the tolerance in the algorithms. We have considered a Noise Level of and , which causes the worst results for OUR method when evaluating in terms of deviation from a generalized essential matrix. In all results displayed in Fig. 3LABEL:sub@fig:results_evaluation_tolerance_speed and 3LABEL:sub@fig:results_evaluation_tolerance_speed_2, OUR method was significantly faster than any other algorithm, despite the fact that, for low levels of tolerance, the number of iterations required by our method being usually higher.

Still regarding the evaluation in terms of Tolerance, from Figs.  3LABEL:sub@fig:results_evaluation_tolerance_iter and 3LABEL:sub@fig:results_evaluation_tolerance_iter_2 one can easily see that, in the direct solutions, the number of iterations varies dramatically for different levels of noise. While, for OUR method, that variation is less significant.

5 Results in real applications

This section includes practical examples illustrating how Algorithm 1 can be combined with other techniques to improve the results.

In Sec. 5.1 more advantages of using generalized essential matrix approximations are evidenced, in a relative pose problem for general catadioptric cameras. In Sec. 5.2 another application of Algorithm 1, using real data, will be shown. To conclude this section, in Sec. 5.3, we discuss the experimental results.

5.1 A Relative Pose Problem with Non-Central Catadioptric Cameras

Let us consider a relative position estimation problem, using a non-central catadioptric camera, composed with a perspective camera and a spherical mirror [9, 44].

(a) Example of the simulated world scene.
(b) Example of the correspondence between image points.
Figure 4: Representation of the simulated environment created for the evaluation. At Fig. LABEL:sub@fig:synthetic_world we show the 3D scene simulated, including: the set of 3D points; the camera system; and the subset of the path that the camera must follow. Fig LABEL:sub@fig:synthetic_image shows an example of the projection of the 3D points in one image frame, and its correspondence with the projected points in the previous frame.

We synthetically generate a set of 3D points in the world (see Fig. 4LABEL:sub@fig:synthetic_world) and, then, define a path for the camera. While the camera is following the path, we compute the projection of the 3D points onto the image of the catadioptric camera system [44] (see Fig. 4LABEL:sub@fig:synthetic_image). Then, with the knowledge of the matching between pixels at consecutive image frames, we aim at computing the rotation and translation parameters ensuring the intersection of the respective inverse projection lines resulting from the images of the 3D points in consecutive image frames, in the world.

A general technique to handle this problem can be mathematically formulated as ( is a matrix in ):

(31)

where

(32)

and represent the matching between the image projection lines on left and right cameras, respectively, and is the number of matching points.

We consider six distinct methods for the computation of the relative pose, based on the estimation of the generalized essential matrix:

Full

Denotes the relative pose estimation that aligns 3D straight lines in the world to ensure that they intersect, by the optimization problem (31). A tolerance of was considered for the constraints;

Without Constraints

Denotes a method similar to Full, i.e. the problem of (31). However, in this case, a different value of was considered for the tolerance of constraints;

OUR + WC:

Consists in, first, estimating an initial solution using the Without Constraints method and, then, applying OUR method to estimate a true generalized essential matrix (Algorithm 1), with tolerance for the constraints;

interior-point + WC:

Same as OUR + WC but now the approximation is given by solving (10) with the interior-point;

SQP + WC:

Same as interior-point + WC, but with the approximation of (10) obtained with the sqp algorithm;

active-set + WC:

Same as interior-point + WC but now (10) is solved by active-set.

The results for the distribution of the computational time required to compute each image frame are shown in Fig. 5 (box plot graph). These results are commented later in Sec. 5.3.

Figure 5: Results for the distribution of the computational time required for computing the camera relative pose, for each of the methods described in Sec. 5.1.

Note that now we are dealing with a different optimization problem from (10), despite the constraints coincide.

5.2 Experiments with Real Data

To conclude the experimental results, we apply Algorithm 1 to an absolute pose estimation problem in the framework of general camera models, and known coordinates of 2D straight lines in the world coordinate system [20]. We consider a non-central catadioptric camera (with a spherical mirror) and moved the camera along a path in the lab.

3D lines in the world are associated with a set of pixels in the image (see Fig. 6). The goal is to find the generalized essential matrix that aligns the 3D inverse projection lines from these pixels with the known 3D straight lines in the world, in order to guarantee their intersection.

This problem can be solved by the same strategy proposed in the previous subsection, i.e. by using the optimization problem of (31), but in this case with the following objective function:

(33)

where: represent the known 3D straight lines in the world; the inverse projection lines corresponding pixels that are images of the line; are the number of image pixels that are images of the line ; and is the number of known 3D straight lines in the world.

We consider the six methods proposed in Sec. 5.1. A comparison between the trajectories of OUR + WC222Notice that the other approximate techniques would give the same results as OUR + WC. The difference between them depends only on the required computational time. method and the FULL are shown in Fig. 6. The distribution of the required computational time for each frame is shown at Fig. 7.

Figure 6: Results obtained with real images. At the left, we show an example of an image acquired by the non-central catadioptric camera. On the right, we show the results for the trajectory (top view) computed with the FULL and OUR+WC methods, described in Sec. 5.1.

5.3 Discussion of the Results

In this section we discuss the experimental results shown in the previous subsections. We start by analysing the results of the approximation of general matrices, in which we compare the performance of the proposed method against the direct solution (these are the main results of this paper) in a non-central catadioptric relative camera pose. Next, we discuss the experiments with real data, in which we compare performance of our approximation technique against the direct solution, in an absolute pose problem.

Figure 7: Distribution of the computational time obtained for the absolute pose problem using a real non-central catadioptric camera. We consider all the algorithms described in Sec. 5.1.

In all these tests, we have imposed as the maximum number of iterations for Algorithm 1. It is worth noting that the algorithm has never reached such a number of iterations, which means that it always converged.

Notice that one of the main contributions of this paper (Sec. 3) is not to estimate the relative pose of an imaging device, but, instead, to find generalized essential matrices from general matrices (that do not verify (10)).

5.3.1 Evaluation in a non-central catadioptric relative pose problem

When considering the experiments carried out in Sec. 5.1, first, we conclude that increasing the tolerance of constraints on the FULL algorithm (i.e. not fully considering the underlying constraints of the generalized essential matrix (31)) and, then, recover a true generalized essential matrix, by Algorithm 1, leads to significant savings in computational time. See the comparison between FULL and all the other methods in Figs. 5 and 7. In fact, it can be seen that the differences between OUR + WC and Without Constraints (the optimization without fully considering the underlying constraints of the generalized essential matrix) can be neglected, while this does not happen for the direct solution. We recall that the Without Constraints method does not produce a true generalized essential matrix, while the other ones do.

In addition, from Fig. 4, one can conclude that this procedure (compute and then find the closest ) does not diminishing significantly the results.

To conclude, one can see that estimating for and, then, find that approximates (see (9)) will result in a much faster algorithm than looking directly for .

5.3.2 Validation using Real Data

To conclude the experimental results, we validate the proposed fitting technique (Algorithm 1) against the direct solution (with the above mentioned general optimization techniques on the problem defined in (10)) in a real application of an absolute camera pose estimation, when using non-central catadioptric cameras, see Fig. 6.

When considering the results of Fig. 7, one can easily see that, while the use of the direct solution and all the three general optimization techniques will have an impact on the computation time (see the results for interior-point+WC, SQP+WC, and active-set against Without Constraints), the difference between OUR+WC and Without Constraints can be neglected, being much faster than the FULL technique.

Still from Fig. 6, one can conclude that approximating (a true generalized essential matrix) from a general matrix does not degrade significantly the results, being much faster than estimating directly (that is shown by comparing FULL and OUR+WC in Fig. 7).

In fact, although these tests have different goals (one is related with the relative camera pose and the other with the absolute pose), these results are very similar to the ones presented in Sec. 5.1, validating the results using synthetic data.

6 Conclusions

This paper addresses the fitting of generalized essential matrices, from general matrices, and its implications. We have presented a novel technique to estimate a generalized essential matrix, that is close to a general matrix which, to the best of our knowledge, has not been yet addressed in the literature, despite the fact that there are some direct solutions (which can be solved with general iterative techniques) that can be used to find such a solution.

Our contributions are: first, we express the problem using only orthogonal constraints and, then, we propose a specific optimization technique to find efficiently the goal solution. We test our method with synthetic data, comparing the use of our approximation techniques against the direct solution and general optimization techniques.

To conclude the paper, we presented some results to show the motivation of using approximating techniques such as the one presented in this paper, in real applications. We evaluate our method against the direct solution in a relative and absolute pose problems (the latter with real data), in which we prove that: 1) estimating (general matrix) and then fitting (a true generalized essential matrix) speed up significantly the required computational time; and 2) does not degrade significantly the results. We also concluded that, contrarily with the direct solution, when using our method the required additional computational time (i.e. the computation time that is required after the estimation of ) can be neglected.

References

  • [1] Y. Ma, S. Soatto, J. Košecká, S. S. and Sastry (2001) An Invitation to 3-D Vision: From Images to Geometric Models, Springer Interdisciplinary Applied Mathematics.
  • [2] R. Hartley, and A. Zisserman, A. (2004) Multiple View Geometry in Computer Vision, 2nd ed. Cambridge University Press.
  • [3] D. Nister, O. Naroditsky, and J. Bergen (2004) Visual odometry

    , IEEE Proc. Conf. Computer Vision and Pattern Recognition (CVPR),

    1:652–659.
  • [4] V. Nalwa (1996) A true omnidirectional viewer, technical report, Bell Laboratories.
  • [5] S. Nayar (1997) Catadioptric Omnidirectional Camera, IEEE Proc. Conf. Computer Vision and Pattern Recognition (CVPR), pp. 482–488.
  • [6] B.Micusik, T.Pajdla (2004) Autocalibration & 3D Reconstruction with Non-central Catadioptric Cameras, IEEE Proc. Conf. Computer Vision and Pattern Recognition (CVPR) pp. 58–65.
  • [7] J. H. Kim, H. Li and R. Hartley (2010) Motion Estimation for Nonoverlapping Multicamera Rigs: Linear Algebraic and Geometric Solutions, IEEE Trans. Pattern Analysis and Machine Intelligence, 32(6):1044–1059.
  • [8] G. H. Lee, M. Pollefeys, F. and Fraundorfer (2014) Relative Pose Estimation for a Multi-Camera System with Known Vertical Direction, IEEE Proc. Conf. Computer Vision and Pattern Recognition (CVPR), pp. 540–547.
  • [9] R. Swaminathan, M. D. Grossberg, and S. K. Nayar (2002) Non-Single Viewpoint Catadioptric Cameras: Geometry and Analysis, Int’l J. Computer Vision, 66(3):211–229.
  • [10] R. Gupta, and R. I. Hartley (1997) Linear Pushbroom Cameras, IEEE Trans. Pattern Analysis and Machine Intelligence, 19(9):963–975.
  • [11] T. Treibitz, Y. Y. Schechner, and H. Singh (2008) Flat Refractive Geometry, IEEE Proc. Conf. Computer Vision and Pattern Recognition (CVPR), pp. 1–8.
  • [12] M. Grossberg and S. Nayar (2001) A General Imaging Model and a Method for Finding its Parameters, IEEE Proc. Int’l Conf. Computer Vision (ICCV), 2:108–115.
  • [13] P. Sturm and S. Ramalingam (2004) A Generic Concept for Camera Calibration, Proc. European Conf. Computer Vision (ECCV), pp. 1–13.
  • [14] P. Miraldo and H. Araujo (2013) Calibration of Smooth Camera Models, IEEE Trans. Pattern Analysis and Machine Intelligence, 35(9):2091–2103.
  • [15] R. Pless (2003) Using Many Cameras as One, IEEE Proc. Conf. Computer Vision and Pattern Recognition (CVPR), 2:587–593.
  • [16] P. Sturm (2005) Multi-View Geometry for General Camera Models, IEEE Proc. Conf. Computer Vision and Pattern Recognition (CVPR), 1: 206–212.
  • [17] H. Stewénius, D. Nistér, M. Oskarsson, and K. Åström (2005) Solutions to Minimal Generalized Relative Pose Problems, Workshop on Omnidirectional Vision
  • [18] S. Ramalingam, S. K. Lodha, and P. Sturm (2006) A generic structure-from-motion framework, Computer Vision and Image Understanding, 103:218–228.
  • [19] E. Mouragnon, M. Lhuillier, M. Dhome, F. Dekeyser, and P. Sayd (2009) Generic and real-time structure from motion using local bundle adjustment, Image and Vision Computing, 27:1178–1193.
  • [20] P. Miraldo, H. Araujo, and N. Gonçalves (2015) Pose Estimation for General Cameras Using Lines, IEEE Trans. Cybernetics, 45:2156–2164.
  • [21] A. Edelman, T. Arias, and S. Smith (1998) The geometry of algorithms with orthogonal constraints, SIAM Journal on Matrix Analysis and Applications, 20:303–353.
  • [22] B. Jiang and Y.-H. Dai (2015) A framework of constraint preserving update schemes for optimization on Stiefel manifold, Mathematical Programming, Series A, 153:535–575.
  • [23] J. Manton (2002) Optimization Algorithms Exploiting Unitary Constraints, IEEE Trans. Signal Processing, 50:635–650.
  • [24] Z. Wen and W. Yin (2013) A feasible method for optimization with orthogonality constraints, Mathematical Programming, Series A, 142:397–434.
  • [25] H. Li, R. Hartley, and J-H. Kim (2008) A linear approach to motion estimation using generalized camera models, IEEE Proc. Computer Vision and Pattern Recognition (CVPR), pp. 1-8.
  • [26] L. Kneip and H. Li (2014) Efficient Computation of Relative Pose for Multi-Camera Systems, IEEE Proc. Computer Vision and Pattern Recognition (CVPR), pp. 446-453.
  • [27] R. Hertley (1997) In Defence of the Eight-Point Algorithm, IEEE Trans. Pattern Analysis and Machine Intelligence, 19:580-593.
  • [28] K. Yamaguchi, D. McAllester, and R. Urtasun (2013) Robust Monocular Epipolar Flow Estimation, IEEE Proc. Computer Vision and Pattern Recognition (CVPR), pp. 1863-1869.
  • [29] J. Zaragoza, T-J. Chin, M. S. Brown, and D. Suter (2013) As-Projective-As-Possible Image Stitching with Moving DLT, IEEE Proc. Computer Vision and Pattern Recognition (CVPR), pp. 2339-2344.
  • [30] H. Pottmann, and J. Wallner (2001) Computational Line Geometry, Springer-Verlag Berlin Heidelberg.
  • [31] P. Miraldo and H. Araujo (2015) Generalized essential matrix: Properties of the singular value decomposition, Image and Vision Computing, 34:45–50.
  • [32] J. R. Cardoso, C. S. Kenney, and F. Silva Leite (2003), Computing the square root and logarithm of a real

    -orthogonal matrix

    , Applied Numerical Mathematics, 46:173–196.
  • [33] T. Abrudan, J. Eriksso, and V. Koivunen (2008)

    Steepest Descent Algorithms for Optimization Under Unitary Matrix Constraint

    , IEEE Trans. on Signal Processing, 56:635–650.
  • [34] P.-A. Absil, R. Mahony, and R. Sepulchre (2007) Optimization Algorithms on Matrix Manifolds, Princeton University Press, New Jersey, USA.
  • [35] D. Luenberger, Y. Ye (2008) Linear and Nonlinear Programming, Third Ed., Springer: New York.
  • [36] J. Nocedal, S. Wright (1999) Numerical Optimization. Springer-Verlag: New York.
  • [37] C.B. Moler, C.F. Van Loan (2003) Nineteen dubious ways to compute the exponential of a matrix, twenty-five years later, SIAM Rev. 45: 3–49.
  • [38] H. Lütkepohl (1996) Handbook of Matrices, Jonh Wiley and Sons, 1996.
  • [39] R. Murray, L. Li, and S. Sastry (1994) A Mathematical Introduction to Robotic Manipulation. CRC Press, Boca Raton, FL,.
  • [40] G. Golub, and C. Van Loan (1996) Matrix Computations. Third Ed., Johns Hopkins University Press: Baltimore and London.
  • [41] Y. Nesterov and A. Nemirovskii (1994) Interior-Point Polynomial Algorithms in Convex Programming, SIAM: Studies in Applied and Numerical Mathematics.
  • [42] P. T. Boggs and J. W. Tolle (1995) Sequential Quadratic Programming, Acta Numerica, 4:1–51.
  • [43] K. G. Murty and F. T. Yu (1988) Linear Complementarity, Linear and Nonlinear Programming Berlin: Heldermann.
  • [44] A. Agrawal, Y. Taguchi, and S. Ramalingam (2010) Analytical Forward Projection for Axial Non-Central Dioptric and Catadioptric Cameras, Proc. European Conf. Computer Vision (ECCV), pp. 129-143.