Stable Camera Motion Estimation Using Convex Programming

12/18/2013 ∙ by Onur Ozyesil, et al. ∙ Princeton University Weizmann Institute of Science 0

We study the inverse problem of estimating n locations t_1, ..., t_n (up to global scale, translation and negation) in R^d from noisy measurements of a subset of the (unsigned) pairwise lines that connect them, that is, from noisy measurements of ± (t_i - t_j)/t_i - t_j for some pairs (i,j) (where the signs are unknown). This problem is at the core of the structure from motion (SfM) problem in computer vision, where the t_i's represent camera locations in R^3. The noiseless version of the problem, with exact line measurements, has been considered previously under the general title of parallel rigidity theory, mainly in order to characterize the conditions for unique realization of locations. For noisy pairwise line measurements, current methods tend to produce spurious solutions that are clustered around a few locations. This sensitivity of the location estimates is a well-known problem in SfM, especially for large, irregular collections of images. In this paper we introduce a semidefinite programming (SDP) formulation, specially tailored to overcome the clustering phenomenon. We further identify the implications of parallel rigidity theory for the location estimation problem to be well-posed, and prove exact (in the noiseless case) and stable location recovery results. We also formulate an alternating direction method to solve the resulting semidefinite program, and provide a distributed version of our formulation for large numbers of locations. Specifically for the camera location estimation problem, we formulate a pairwise line estimation method based on robust camera orientation and subspace estimation. Lastly, we demonstrate the utility of our algorithm through experiments on real images.



There are no comments yet.


page 19

page 25

page 26

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

Global positioning of objects from partial information about their relative locations is prevalent in many applications spanning fields such as sensor network localization [8, 58, 16, 18], structural biology [31], and computer vision [27, 9]. A well-known instance that attracted much attention from both the theoretical and algorithmic perspectives is that of estimating the locations from their pairwise Euclidean distances . In this case, the large body of literature in rigidity theory (cf. [4, 54]) provides conditions under which the localization is unique given a set of noiseless distance measurements. Also, much progress has been made with algorithms that estimate positions from noisy distances, starting with classical multidimensional scaling [48] to the more recent semidefinite programming (SDP) approaches (see, e.g., [8, 7]).
Here we consider a different global positioning problem, in which the locations need to be estimated from a subset of (potentially noisy) measurements of the pairwise lines that connect them (see Figure 1 for a noiseless instance). The line connecting and is identified with the rank- projection matrix defined by


Notice that there is no available information about the Euclidean distances between the points. The entire information of pairwise lines is represented as a measurement graph , where the ’th node in corresponds to the location and each edge is endowed with the corresponding projection matrix .

Figure 1: A (noiseless) instance of the line estimation problem in , with locations and pairwise lines.

The noiseless version of this problem (i.e., realization of locations from exact line measurements (1)) was previously studied in several different contexts such as discrete geometry, sensor network localization, and robotics, under various formulations (see [65, 64, 17, 18, 49]). The concepts and the results for the noiseless case, to which we refer here as parallel rigidity theory, are aimed at characterizing the conditions for the existence of a unique realization of the locations (of course, up to global translation, scaling and negation of the locations , since the pairwise lines are invariant under these transformations).
However, location estimation from (potentially) noisy line measurements did not receive much attention previously. The camera location estimation part of the structure from motion (SfM) problem in computer vision (see, e.g., [27]), where ’s represent the camera locations in , is an important example of the abstract problem with noisy measurements. To the best of our knowledge, a structured formulation (in terms of the pairwise lines) of the camera location estimation problem and its relation to the existing results of parallel rigidity theory (characterizing conditions for well-posed instances of the problem) were not considered previously. We now give more details on the camera location estimation problem and the existing techniques for its solution.

Camera Location Estimation in SfM: Structure from motion (SfM) (depicted in Figure 2) is the problem of recovering a D structure by estimating the camera motion corresponding to a collection of D images (cf. §4 for technical details).

Figure 2: The structure from motion (SfM) problem.

Classically, SfM is solved in three stages: Feature point matching between image pairs (as in Figure 8) and relative pose estimation of camera pairs based on extracted feature points Estimation of camera motion, i.e. global camera orientations and locations, from relative poses D structure recovery based on estimated camera motion by reprojection error minimization (e.g. bundle adjustment of [56]). Although the first and the third stages are relatively well understood and there exist accurate and efficient algorithms in the literature for these stages, existing methods for camera motion estimation, and specifically for the camera location estimation part, are sensitive to errors that result from mismatched feature correspondences. Among the widely used methods are incremental approaches (e.g. [1, 52, 53, 14, 70, 28, 19]), that integrate images to the estimation process one by one or in small groups. These incremental methods usually result in accumulation of estimation errors at each step and also require several applications of bundle adjustment for improved accuracy, hence leading to computational inefficiency. Alternatively, global estimation of the camera motion, i.e. solving for all camera orientations and/or locations simultaneously, can potentially yield a more accurate estimate. A generally adapted procedure to prevent high computational costs is to estimate the camera motion and the D structure separately. In principle, considering the large reduction in the number of variables, camera motion estimation can be performed jointly for all images, followed by D structure recovery using a single instance of reprojection error minimization. Obviously, such a procedure presumes a stable and efficient motion estimation algorithm. Most of the existing methods for motion estimation perform orientation and location estimation separately. Orientation estimation turns out to be a relatively well-posed problem and there exist several efficient and stable methods, e.g. [42, 3, 26, 24, 57]. On the other hand, global estimation of camera locations, specifically for large, unordered sets of images, usually suffers from instability to errors in feature correspondences (resulting in solutions clustered around a few locations, as for [3, 9, 51]

), sensitivity to outliers in pairwise line measurements (e.g., the

approach of [50, 34]) and susceptibility to local solutions in non-convex formulations (e.g., [24]). Hence, a well-formulated, robust, efficient method for global location estimation (scalable to large sets of images) with provable convergence and stability guarantees is of high value.
Early works of [23, 9] on location estimation reduce the problem to that of solving a set of linear equations that originate from the pairwise line measurements. Finding solutions to these linear systems can be done in a computationally efficient manner. However, these solutions have been empirically observed to be sensitive to errors in the pairwise line measurements. The sensitivity of such solutions is expressed by the tendency of the estimated locations to cluster, regardless of the true locations (cf. Figure 3 for such a clustering solution for a real data set, and also the optimization problem (3) and the following discussion). The multistage linear method of [51] attempts to resolve this issue by first performing pairwise reconstructions, then registering these in pairs, and finally aligning them by estimating relative scales and locations. Nonetheless, this approach does not produce satisfactory results in terms of estimation accuracy. Another interesting and efficient method is the Lie algebraic averaging approach of [24]. However, this non-convex method is susceptible to convergence to local optima. The spectral formulation of [3], which is based on a novel decomposition of the essential matrix and is similar to [23, 9] in its formulation of the problem, yields an efficient linear solution for the locations, though, it also suffers from spurious clustered solutions. Yet another formulation related to our work is the quasi-convex formulation of [50], which relies on (iteratively) optimizing a functional of the norm and requires the estimation of the signed directions of the pairwise lines, i.e. knowledge of . However, norm is highly susceptible to outliers, resulting in unsatisfactory solutions for real image sets. Also, the requirement for estimating signed line directions may introduce additional error to the problem. A similar idea is employed in [34] to simultaneously estimate the D structure, which exhibits the same difficulties. There are also various works that aim to improve the high sensitivity and inefficiency of these quasi-convex methods (see, e.g., [42, 45]). Another method requiring the estimation of the signed directions of the pairwise lines is studied in [57]. In contrast to the sensitivity of the quasi-convex method of [50] to outliers, the method in [57] is based on optimizing a functional of the norm, and hence produces more accurate location estimates. Additionally, [40] introduces a framework based on classical rigidity theory (involving pairwise distance information), which aims to identify rigid instances of the joint motion and structure estimation problem. Also, an SDP approach is introduced in [40] in order to jointly estimate motion and structure from noisy feature correspondences. We note that our work is significantly different from [40]: While we use parallel rigidity, [40] employs classical rigidity, leading to completely different SDP formulations.

Figure 3: A D snapshot of the collapsing solution of the least squares (LS) method [3, 9], for the Notre-Dame data set from [52]. The solution of [52] is taken as the ground truth.

Main Contributions and Broader Context: In this paper we make the following principal contributions for the problem of location estimation from pairwise lines:

The main contribution of our work is the introduction of a new semidefinite relaxation (SDR) formulation for estimating locations from pairwise line measurements. Empirically, we observe that solutions obtained by the SDR do not suffer from the clustering phenomenon of location estimates that plague many of the existing algorithms.

To quantify the performance of our SDR formulation, we prove exact (in the noiseless case, cf. Proposition 2.3) and stable (in the noisy case, cf. Theorem 8 and Corollary 9) location recovery results. We also provide a provably convergent and efficient alternating direction method to solve our SDR.

We provide a distributed version (scalable to a large number of locations) of the SDR formulation, based on spectral partitioning and convex programming. Additionally, we prove exact location recovery (in the noiseless case, cf. Proposition 3.4) for the distributed approach.

We formulate the camera location estimation problem of SfM in terms of the pairwise line measurements (between the camera locations in ). Moreover, we show how to improve the stability of our approach to outlier pairwise measurements by robust preprocessing of the available pairwise camera information and describe a robust iterative approach for camera orientation estimation.

We also demonstrate the efficiency and the accuracy of our formulations via synthetic and real data experiments. We note that these experiments show a specifically important characteristic of the SDR formulation: As long as the level of noise in the pairwise lines is below some threshold, the SDR always produces rank- solutions, i.e. it actually solves the original non-convex program, i.e. the relaxation is tight111This is sometimes referred to as “exact rank recovery”, and is not to be confused with our exact “location” recovery results for the SDR in the presence of noiseless pairwise lines.. Also, for higher noise levels, even when the SDR does not produce rank- solutions, its solution typically has a large spectral gap (i.e., it can be well approximated by a rank-

matrix). In other words, we do not observe a sharp phase transition in the quality of the relaxation.

Since the existing results of parallel rigidity theory (cf. Appendix A and, e.g., [65, 64, 17, 18, 49]) have not been previously applied to the camera location estimation problem, we provide a summary of the main results of parallel rigidity theory, which completely characterize the conditions for the problem to be well-posed (in , for arbitrary ). Also, we formulate a randomized algorithm to efficiently decide in the existence of these conditions.

In the literature, convex programming relaxations (and in particular semidefinite programs) have previously served as convex surrogates for non-convex (particularly NP-hard) problems arising in several different areas, such as sensor network localization (from pairwise distances) [54, 8], low-rank matrix completion [12], phase retrieval [13]

, robust principal component analysis (PCA) 

[11], multiple-input multiple-output (MIMO) channel detection [43, 66], and many others (also see [21, 38, 61, 71]). Notably, the SDP formulation for sensor network localization [54, 8] is not guaranteed (even in the noiseless case) to provide the unique configuration of a globally rigid framework (cf. [4], for global rigidity and other fundamental concepts in “classical” rigidity theory). Only if the framework is “uniquely localizable” [54], then the SDP is guaranteed to provide the unique solution in the noiseless case. In contrast, our SDR formulation is guaranteed to provide the unique solution (up to global scale, translation and negation) for a parallel rigid framework (cf. §2.1). Similar to our case, the tightness of the relaxation, i.e. obtaining rank- solutions from semidefinite relaxations, is also observed in several different SDR formulations (see, e.g., [15, 5] and the survey [71]).

Organization of the Paper: In §2 we provide the connection to parallel rigidity theory and introduce the SDR formulation. Also, we prove exact (in the noiseless case) and stable (in the noisy case) location recovery results, and formulate an alternating direction method for the SDR. In §3 we introduce the distributed approach and prove its well-posedness. In §4, we formulate the camera location estimation problem in SfM as a problem of estimating locations from their pairwise line measurements. We also present the robust camera orientation and pairwise line estimation procedures. We evaluate the experimental performance of our algorithm in §5, using synthetic and real data sets. Lastly, §6 is a summary.

Reproducibility: The methods and algorithms presented in this paper are packaged in a MATLAB toolbox that is freely available for download from the first author’s webpage at


We denote vectors in

, , in boldface. and are used for the identity and all-ones matrices, respectively. and denote the (Euclidean) sphere in and the special orthogonal group of rotations acting on , respectively. We use the hat accent, to denote estimates of our variables, as in is the estimate of . We use star to denote solutions of optimization problems, as in . For an symmetric matrix ,

denote its eigenvalues (in ascending order) and

denotes that is positive semidefinite (i.e., , for all ). Also, for an matrix , denotes the vector formed by the diagonal entries of , and conversely, for , denotes the diagonal matrix formed by the entries of . Lastly, we use the letters and to denote the number of locations and the number of edges of graphs that encode the pairwise line information.

2 Location Estimation

Consider a graph of pairwise lines, with each edge endowed with a pairwise line corresponding to the locations (i.e., satisfying (1)). Given this information, we first address the question of unique realizability of the locations in the next subsection, followed by our SDP formulation for location estimation (from noisy pairwise lines).

2.1 Parallel Rigidity

The unique realizability of locations (or the solvability problem) from pairwise line measurements was previously studied, mainly under the name of parallel rigidity theory (see, e.g., [65, 64, 17, 18, 49]). However, to the best of our knowledge, the concepts and the results of parallel rigidity theory have not been related to the well-posedness of the camera location estimation part of SfM, which is why we study them again here. Note that while camera orientation estimation from noiseless pairwise ratios (cf. §4) only requires the connectivity of the measurement graph, connectivity alone is insufficient for camera location estimation (see Figure 4 for such an instance). To address this problem, we now briefly discuss the main results of parallel rigidity theory. For further details on fundamental concepts and results in parallel rigidity theory, see Appendix A.

Figure 4: (a) A formation of locations on a connected graph, which is not parallel rigid (in and ). Non-uniqueness is demonstrated by two non-congruent location solutions and , each of which can be obtained from the other by an independent rescaling of the solution for one of its maximally parallel rigid components, (b) Maximally parallel rigid components of the formation in (a), (c) A parallel rigid formation (in and ) obtained from the formation in (a) by adding the extra edge linking its maximally parallel rigid components, (d) A formation of locations, which is parallel rigid in , but is not parallel rigid in

We call the (noiseless) pairwise line information a “formation” and consider the following fundamental questions: Using this information, can we uniquely realize the points , of course, up to a global translation, scale and negation (i.e. can we realize a set of points congruent to )? Is unique realizability a generic property (i.e. is it only a function of the underlying graph, independent of the particular realization of the points, assuming they are in generic position) and can it be decided efficiently? If we cannot uniquely realize , can we efficiently determine maximal components of that can be uniquely realized? These questions were previously studied in several different contexts like discrete geometry, bearing and angle based sensor network localization and robotics, under various formulations and names (see [65, 64, 17, 18, 49, 35, 36, 37] and references therein).
The identification of parallel rigid formations is addressed in [65, 64, 17, 18] (also see [35] and the survey [32]), where it is shown that parallel rigidity in () is a generic property of the measurement graph equivalent to unique realizability, and admits a complete combinatorial characterization (a generalization of Laman’s condition from classical rigidity to parallel rigidity): For a graph , let denote the set consisting of copies of each edge in . Then, is generically parallel rigid in if and only if there exists a nonempty , with , such that for all subsets of , we have


where denotes the vertex set of the edges in .

To elucidate the conditions of Theorem 4, let us consider the simple examples provided in Figure 4: For , if exists, the certificate set satisfying the conditions in Theorem 4 is simply a subset of the edge set . The graphs in the subfigures (b) and (c) are minimally parallel rigid in , i.e. is the certificate. On the other hand, the graphs in the subfigures (a) and (d) do not have sufficiently many edges to be parallel rigid, i.e. even if we consider to be the set of all edges , we get . For , let us first consider the (triangle) graphs in the subfigure (b): If we set to be the set of two copies of any of the two edges in , and a single copy of the remaining edge (e.g., , where denotes the ’th copy of the edge ), then is a certificate. For the graph in the subfigure (c), satisfies the conditions of Theorem 4. Also, for the graph in the subfigure (d), is the certificate. On the other hand, if we consider the graph in the subfigure (a), can satisfy the first condition , if and only if , for some , . However, in this case, has a subset consisting of two copies of each of the three edges in a triangle graph, which violates the condition (2).
The conditions of Theorem 4 can be used to design efficient algorithms (e.g., adaptations of the pebble game algorithm [33], with a time complexity of ) for testing parallel rigidity. In Appendix A we detail a randomized algorithm (having time complexity ) for testing parallel rigidity. Moreover, polynomial time algorithms for finding maximally parallel rigid components of non-parallel rigid formations are provided in [36, 35].
In the presence of noiseless pairwise line measurements, parallel rigidity (and hence, unique realizability) is equivalent to error-free location recovery. However, for real images, we are provided with noisy measurements. In this case, instead of uniqueness of the solution of a specific location estimation algorithm, we consider the following question: Is there enough information for the problem to be well-posed, i.e. if the noise is small enough, can we estimate the locations stably? For formations which are not parallel rigid, instability may result from independent scaling and translation of maximally rigid components. In this sense, we consider problem instances on parallel rigid graphs to be well-posed.
Considering the existing results of parallel rigidity theory, we take the following approach for location estimation: Given a (noisy) formation on , we first check for parallel rigidity of , if the formation is non-rigid we extract its maximally rigid components (using the algorithm in [36]) and estimate the locations for the largest maximally rigid component. The details of our formulation for (stable) location estimation are provided in the next section.

2.2 Location Estimation by SDR

In this section we introduce and describe our location estimation algorithm. Suppose that we are given a set of (noisy) pairwise line measurements on . Also, for each , let denote the matrix of projection onto the -dimensional subspace orthogonal to this line. Firstly, let us consider the least squares approach studied in [3, 9]222In [3], the least squares approach is studied specifically for the location estimation problem in with a slightly different definition for the pairwise measurements ’s (also see §4), whereas in [9], it is studied in arbitrary dimension using unnormalized ’s (i.e., ’s do not necessarily satisfy )., which is equivalent to solving the following (non-convex) program

subject to (3b)

Note that we can rewrite the cost function in (3) as (since the projection matrices satisfy ), which is why we call (3) the least squares approach. The constraints in (3) are designed to exclude the trivial case , for some . In fact, (3) is an eigenvalue problem and hence can be efficiently solved. However, for large and noisy data sets, this formulation turns out to be “ill-conditioned” in the following sense: The solution has the tendency to “collapse”, i.e. the problem is not sufficiently constrained to prevent the (less trivial, but still undesired) solutions of the form and , where has a (relatively) small degree in (for such collapsing solutions in , see Figure 3 for a real data set, and Figure 9 for synthetic data). For large data sets having nodes with significantly varying degrees, collapsing solutions of (3) (sometimes occurring in more than one group of points) can be observed even for low levels of noise in the ’s. It is worthwhile noting that the problem of collapsing solutions is not caused by the quadratic nature of the cost function in (3): Formulations of (3) using sum of (unsquared) projection errors, i.e.

, as the cost function (which are also significantly harder to solve) that we studied using (heuristic) iteratively-reweighted least squares solvers exhibit even worse characteristics in terms of collapsing solutions.

We overcome the collapsing behavior in two steps, first by introducing non-convex “repulsion constraints” in (3) and then formulating an SDR version of the resulting non-convex problem. The non-convex problem is given by

subject to (4b)

where is a constant fixing the (undetermined) scale of the solution (wlog we take ) and the cost function is rewritten in a slightly different way. The repulsion constraints are non-convex constraints making the estimation problem difficult even for small-sizes. We introduce a matrix of size , whose blocks are given by . Consequently, and . To rewrite the cost function in (4) linearly in terms of , we define a Laplacian matrix , whose blocks are given by


and which satisfies . Note that is symmetric, since and , and also positive semidefinite, since for , . Also, for every edge , we define a matrix , whose ’th block is given by


and which allows us to rewrite the inequality constraints in (4), linearly in as . Moreover, the equality constraint can be rewritten linearly in as , for .
We now relax the only non-convex constraint, that is , to formulate the following SDR (known as “matrix lifting”):

subject to (7b)

After solving for in (7), we compute the location estimates

by a deterministic rounding procedure, i.e. we compute the leading eigenvector

of and let be given by the ’th block of .

2.3 Stability of SDR

In this section we analyze the SDR (7) in terms of exact location recovery in the presence of noiseless pairwise line information and stable recovery with noisy line information.
We first introduce some notation to simplify our statements. Consider a set of locations in generic position and let denote the unit vector from to . Then, the (noiseless) formation corresponding to is given by , where . We also let denote the projection matrices onto the -dimensional subspace orthogonal to , i.e. , denote the Laplacian matrix corresponding to ’s (cf. (5)), and denote the matrix of the noiseless locations, i.e. where the i’th block of is equal to .
We start with the simpler exact recovery result. [Exact Recovery in the Noiseless Case] Assume that the (noiseless) formation is parallel rigid. Then, the SDR  (with ), followed by the deterministic rounding procedure, recovers the locations exactly, in the sense that any rounded solution of the SDR is congruent to . Wlog, we assume and . Then we have , i.e. is a minimizer of (7) (since is also feasible by construction). The parallel rigidity of the formation implies , where the only eigenvector of with non-zero eigenvalue and the eigenvectors of (in (7)) with nonzero eigenvalues form an orthogonal basis for the nullspace of (see, Appendix A). Let , , denote the (normalized) eigenvectors of corresponding to its positive eigenvalues. Consider an arbitrary minimizer of (7). Since , where , satisfies for all . Also, by the feasibility of , we get , i.e. for all . Hence, form an orthogonal basis for the nullspace of . This implies , where is of the form for some (by the feasibility of ), establishing the uniqueness of the solution up to scale. As a result, applying the rounding procedure to any solution of (7) yields exact recovery of ’s (of course, up to congruence).  
Our next result is the stability of the SDR with noisy pairwise line information.
Noise Model and Error Function: We let each edge of the measurement graph be endowed with a line measurement , where with and . Also, denotes the Laplacian of the graph , where is the (diagonal) degree matrix of (whose ’th diagonal element is equal to the degree of the ’th node) and is the adjacency matrix of .
This time we assume (wlog), and . For a solution of the SDR (7), we consider the following error function as our measure of stability


We are now ready to present the main result of this section: [Stability of SDR Solution] Consider a set of noisy pairwise line measurements related to the (noiseless) parallel rigid formation as in the noise model given above, and let be a solution of . Then,


where, the (data dependent) constants and are given by and . Here, the parameter is given by .
Proof. See Appendix B.
We can obtain the stability of the estimated locations, i.e. the rounded solution of (7), as a corollary of Theorem 8: [Location Stability] Let be as in Theorem 8 and denote its normalized eigenvector corresponding to its largest eigenvalue. Then


We use a particular implication of the Davis-Kahan theorem (see, e.g., Ch. 7 in [6]) in order to relate the recovery error in the rounded solution to the error in the solution of the SDR. To this end, observe that


For a given symmetric matrix and a subset of the real line, let denote the projection matrix onto the subspace spanned by the eigenvectors of , whose eigenvalues are in . Then Davis-Kahan theorem implies


where . In our case, if we let for and for , where , we obtain


Here, we use the fact and the feasibility of , i.e. that , to get (in fact, we can construct a solution of the SDR satisfying the stronger bound , see e.g. [47], however we ignore this slight improvement for simplicity). Also, considering (11) and (38) from the proof of Theorem 8, we recover the claim of the corollary.

Remark 1

We note that the bounds in Theorem 8 and Corollary 9 are quite loose when compared to our experimental observations. Nevertheless, the recovery error is within a constant factor of the noise level . Also observe that Proposition 2.3, i.e. exact recovery in the noiseless case, is implied by Theorem 8 when .

Remark 2

The proximity of the solution of (7) to the space of positive semidefinite rank- matrices can be considered as a measure of the quality of the relaxation. In our experiments with real images and simulated data, we make the following observations: As long as the noise level is below some threshold, we always get , i.e. we can actually solve the non-convex problem efficiently by the SDR (7). For higher levels of noise, is no longer a rank-1 matrix, but it typically has a large spectral gap . In other words, we do not observe a sharp phase transition in the quality of the SDR, and the relaxation is stable under various noise models. Figure 5 provides an experimental evaluation in of our observations about the stability of relaxation, using synthetic measurements generated under the noise model (21) of §5.1 (and assuming in (21), i.e. no outlier measurements, for simplicity) for graphs of nodes and various edge density . We observe that even if the location recovery performance (represented by normalized root mean squared error (NRMSE) defined in (22)) degrades as the noise level increases, the tightness of the relaxation is preserved up to relatively high noise levels.

Figure 5: Spectral gap of solutions of (7) and logarithmic recovery error (see (22) for NRMSE) versus noise level , for graphs with nodes and various edge density (results are averaged over random realizations of noisy line measurements and locations)

2.4 Alternating Direction Augmented Lagrangian Method

The SDR (7) is solvable in polynomial time by the classical primal-dual interior-point SDP algorithms (e.g. [59]). However, in case of a dense measurement graph (i.e., assuming ), the interior-point methods become impractical for large number of locations, with a time complexity of (and a space complexity of ) for each iteration of the algorithm. In practice, the computational bottleneck becomes an issue for problem sizes of . In this respect, here we provide the details of alternating direction augmented Lagrangian method (ADM), which is a first-order method with superior computational efficiency [63]. ADM is an iterative algorithm based on minimization of an augmented Lagrangian function of the dual SDP. In comparison to interior-point methods that aim to satisfy complementary slackness while maintaining primal-dual feasibility at each iteration, ADM aims to construct a primal-dual feasible pair while maintaining complementary slackness. At each iteration, ADM minimizes the dual augmented Lagrangian function first with respect to dual variables, then with respect to dual slack variables and finally updates the primal variables. In the minimization over each variable, ADM regards the other variables as fixed.
In order to obtain an ADM framework for the SDP (7), we rewrite it in standard form and procure the ADM framework (involving variables of larger dimensions) for standard form SDPs developed in [63]. Such an approach yields a (provably) convergent algorithm, however, in general it has a high computational cost (due to the high dimensionality of the variables associated with the standard form SDP). In our case, we are able to simplify the ADM framework for the standard form of (7) significantly and hence do not artificially increase the computational cost by rewriting (7) in standard form (we also experimentally observed that the “ad-hoc” approach in [63] developed for SDPs involving inequality constraints, which is at least as costly as our resulting algorithm, did not produce a convergent algorithm for (7)). We provide the details of rewriting (7) in standard form, constructing the ADM framework for the augmented Lagrangian of its dual and the simplification of the resulting algorithm in Appendix C. A pseudo-code version of our ADM algorithm is given below (see Appendix C for the linear operators and the efficient computation of ).

  Initialize: s.t. , , and
  for  do
  end for
Algorithm 1 Alternating direction augmented Lagrangian method (ADM) for SDP (7)

We refer the reader to [63] for practical details related to termination rules using measures of infeasibility, stagnation detection, updating the parameter for faster convergence, additional step size parameter used to update the primal variables and , and also for convergence analysis of ADM. Considering the convergence rate analysis of ADM provided in [29], we need iterations in order to achieve an accuracy. Note that, at each iteration, the most computationally expensive step of Algorithm 1 is the spectral decomposition of . However, since we experimentally observe a stable SDP relaxation resulting in a low-rank primal solution , computation of and can be greatly simplified by computing only a few negative eigenvalues of (e.g., by using Arnoldi iterations [2]). As a result, assuming complexity for spectral decomposition, the time complexity of (already significantly less compared to interior point methods) can be even further reduced.

3 Distributed Approach

The ADM framework introduced in §2.4 provides a computationally feasible alternative to classical SDP solvers and allows us to solve the SDR (7) beyond . However, for large sets of images (), the need for a distributed algorithm is apparent. In this section, we provide the details of a distributed algorithm for translation estimation, based on spectral graph partitioning and convex programming.
The main structure of our distributed location estimation algorithm is the following: Given a maximum problem size, i.e. an integer denoting the maximum number of locations our computational resources can efficiently estimate by (7), we first partition into subsets (that we call “patches”) of sizes at most (by maintaining sufficient connectivity in the induced subgraphs and sufficient overlaps between patches). Then, for each induced subgraph, we extract the maximally parallel rigid components. We then find for each rigid component the “local” coordinate estimates by the SDR (7). Finally, we stitch the local estimates into a global solution by convex programming.
We note that, the main idea of our distributed approach, i.e. division of the problem into smaller subproblems and then constructing the global solution from the local solutions, is also adapted for various problems in the literature (see, e.g., [16]). However, depending on the structure and the challenges of the specific problem studied, these distributed methods usually have significant differences. For instance, as compared to [16], while the same algorithm (namely the eigenvector method (EVM)) is used in our approach to remove the pairwise sign ambiguities between local solutions (cf. §3.2), the steps involving the graph partitioning and extraction of well-posed local problems, computation of local solutions, and estimation of global locations from (sign corrected) local estimates are significantly different.

3.1 Graph Partitioning

In order to partition into sufficiently overlapping subsets (for high quality global reconstruction) with sufficiently dense induced graphs (for high quality local estimation) of sizes bounded by , we use the following algorithm, which bears a resemblance with the graph partitioning algorithm of [39]. Starting with , at the ’th iteration partition each graph in (where,

denotes the set of graphs to be partitioned) into two subgraphs using the spectral clustering algorithm of 

[44]. Then, extend the computed partitions to include the -hop neighborhoods of their vertices in (and, of course, the induced edges). Assign the (extended) partitions with sizes smaller than to the set of patches, and those with sizes larger than to . Repeat until there is no subgraph left to partition, i.e. until the ’th iteration, where .
After the partitioning step, we extract the maximally parallel rigid components of each patch as described in §2.1 (after this stage we use the term patch for parallel rigid patches). We then remove the patches that are subsets of another patch from the set of patches. We also remove the patches that do not have sufficient overlap (i.e. overlap size , also see next section) with any other patch (which happens very rarely and is required since they cannot be used in the global location estimation). At this stage, we get a patch graph , where denotes the patches and if and only if there is sufficient overlap between the patches and . Here, if is not connected (which was never the case in our experiments), we can either extract the largest connected component of or extend the patches to include their -hop neighborhoods until is connected for the next steps of our algorithm. We then compute the “local” coordinate estimates for these rigid patches (whose negation signs, scales and translations with respect to the global coordinate system are undetermined at this stage) by solving the SDR (7). The computation of the local coordinates for each patch can be done in parallel in a multi-core processing environment, where each processing unit computes the local coordinates of one or more patches.

3.2 Pairwise Patch Registration

After solving the SDR (7) for each patch , we obtain estimates of the representations of the locations in the coordinate system of each patch. The representations satisfy


where denotes the global coordinates of the ’th location, and and denote the signed scale and translation of patch , respectively (we use the signed scale, i.e. can assume negative values, because of the unknown negation). Given , , our objective is to estimate the locations by formulating an efficient algorithm, which will be robust to possible outliers in the estimates . In this respect, firstly observe that any algorithm designed to minimize the errors in the linear equations (14) should also exclude trivial solutions of the form and (for some ), for all and . However, similar to the location estimation from noisy pairwise lines problem, the existence of this null-space (spanned by the trivial solutions) results in collapsing solutions for under-constrained optimization programs. As in the case of the least squares solver for the location estimation problem, we experimentally observed such collapsing solutions for the spectral method designed to minimize the sum of squared norms of the errors in equations (14) by excluding the solutions in the null-space.
Collapsing solutions can be avoided simply by requiring , for all , which is a non-convex constraint. Similar to the construction of the SDR (7), the non-convexity (resulting from the unknown patch signs allowing to assume negative values) can be resolved by using matrix lifting. An efficient method in this direction is the adaptation of the partial matrix lifting (only applied to the variables involved in non-convex constraints) method of [15] to our problem. In this method, using the sum of squared norms of the errors in equations (14) as the cost function, the unconstrained variables (’s and ’s) are analytically computed as functions of the constrained variables (in our case, ’s) and the resulting quadratic form (in ’s) is used to define a matrix lifting relaxation for the constrained variables (see [15] for further details). However, this time, instead of using a matrix lifting method, we pursue a different approach: To overcome the non-convexity in , we first estimate the unknown sign of each patch and then impose the convex constraints for the sign-corrected patches (i.e. patches with the local estimates replaced with , where is the estimate of the negation of patch ). Estimation of patch signs from pairwise signs (see (15) for pairwise sign estimation) is performed using the eigenvector method (EVM) (see, e.g., [16]), which is a robust and efficient spectral algorithm allowing a reliable estimation of patch signs. Using the estimated signs, we can minimize the sum of unsquared norms in equations (14) (which cannot be used as a convex cost function in the matrix lifting approach), and hence maintain robustness to outliers in the estimates . In our experiments with simulated data and real images, this two step formulation produced more accurate location estimates compared to the matrix lifting alternative (with similar running times, since the partial matrix lifting results in a semidefinite program with a matrix variable of size ), making it our choice for stitching the global solution. We now provide the details of the sign estimation procedure, whereas the final step of location estimation from sign-corrected patches is covered in §3.3.
In order to estimate the patch signs , the relative pairwise signs , , are estimated first. This is accomplished by solving the following least squares problem for each


where denote the relative (signed) scale and translation between and , respectively. The relative sign estimate is given by .
Using the relative sign estimates , the sign estimates are computed by EVM, which is a spectral method for finding signs with the goal of satisfying as many equations of the form for as possible (see [16] for the details). Here, we note that, although the sum of squared norms cost function in (15) can be replaced by the sum of (unsquared) norms cost to improve robustness to outliers in ’s, we prefer the more efficient least squares version (in fact, (15) has a closed-form solution) since we did not experimentally observe a significant improvement in the accuracy of signs estimated by EVM.

3.3 Global Stitching of Local Patches

Stitching the local patches into a globally consistent -dimensional map comprises the last step of our distributed approach. As we discussed in §3.2, we aim to efficiently estimate the global locations using the linear equations (14), while maintaining robustness to outliers in ’s and preventing collapsing solutions. In this respect, using the estimated patch signs (i.e., estimates of signs of ) in (14), we maintain robustness by minimizing sum of (unsquared) norms of errors in equations (14), while simply constraining to prevent collapse. Hence, we solve the following convex program (using, e.g. [59]), in which we jointly estimate the scales and translations of the sign-corrected patches (i.e., patches with the local estimates replaced with ) and the global locations

subject to (16b)

3.4 Well-posedness of the Distributed Problem

Similar to the well-posedness of location estimation from pairwise lines, we consider the following question for the distributed problem: Do the local coordinate estimates provide enough information to yield a well-posed instance of the global location estimation problem? Once again, we consider an instance of the distributed problem to be well-posed if the global locations can be uniquely (of course, up to congruence) estimated from the noiseless local coordinates . We (partially) answer this question in Proposition 3.4, where it is shown that the local coordinate estimates provided via the specific construction of the patch graph given in §3.1 are sufficient for well-posedness. This result is established by proving exact recovery of global locations from noiseless local coordinates using the two step global location construction algorithm. [Exact Recovery from Noiseless Local Coordinates] Consider a graph of patches and a set of (noiseless) local coordinates corresponding to the global locations (i.e., satisfy  for a set of signed scales and translations of the patches, for all and ). Then, if is connected and satisfies if and only if , the two step global location construction algorithm (i.e., estimation of patch signs by and EVM [16] followed by global location estimation by ) recovers the global locations exactly when provided with the noiseless local coordinates (i.e., for and ), in the sense that any solution of the algorithm is congruent to .
Proof. See Appendix D.

Remark 3

We note that, in the presence of noiseless pairwise lines (on parallel rigid ), and assuming that we can obtain a connected patch graph from using the graph partitioning procedure of §3.1, Propositions 2.3 and 3.4 imply exact recovery of from the lines .

Remark 4

The conditions imposed on in Proposition 3.4 (i.e. connectivity and that, for all ,