Analysis of minima for geodesic and chordal cost for a minimal 2D pose-graph SLAM problem

by Felix H. Kong, et al.
University of Technology Sydney

In this paper, we show that for a minimal pose-graph problem, even in the ideal case of perfect measurements and spherical covariance, using the so-called "wrap function" when comparing angles results in multiple suboptimal local minima. We numerically estimate regions of attraction to these local minima for some numerical examples, and give evidence to show that they are of nonzero measure. In contrast, under the same assumptions, we show that the chordal distance representation of angle error has a unique minimum up to periodicity. For chordal cost, we also search for initial conditions that fail to converge to the global minimum, and find that this occurs with far fewer points than with geodesic cost.



I Introduction

Simultaneous Localization and Mapping (SLAM) is concerned with simultaneously estimating the pose of a robot (localization) and building a map of its surroundings (mapping). This capability has been useful in many areas, such as unmanned aerial vehicles [1, 2], autonomous ground vehicles [3, 4], and a plethora of other applications [5].

Currently, the “modern” approach to SLAM is to represent the robot’s trajectory as a graph: that is, to represent the robot’s poses as nodes, and measurements from those poses as edges. Then, given this graph, a weighted least-squares optimization problem is typically solved to estimate the most likely robot poses given the robot’s measurements [5]. However, even in 2D SLAM, the optimization problem is usually nonlinear and nonconvex, so iterative solvers may converge to a local minimum instead of the global minimum. Although modern solvers appear to achieve a global minimum much of the time, it is as yet unclear under what conditions local minima exist, and how many there are, even for very small problems.

It is known that the cost on error in orientations of poses is a major contributor to the nonlinearity of the problem [6, 7]. Because of this, the choice of a particular representation of orientation error can affect the existence, number, and nature of minima in a pose-graph optimization problem. In 2D SLAM, one common method to evaluate orientation error is to directly subtract the two (scalar) angles, then “wrap” this difference onto the interval from $-\pi$ to $\pi$, resulting in the “geodesic distance” between two orientations. Open-source SLAM implementations such as Google’s Cartographer [8], as well as other popular software such as MATLAB’s Navigation Toolbox, use geodesic distance via the “wrap” function in their cost functions. Representing orientation error using geodesic distance is intuitive and widely known; however, it has been linked empirically to convergence to suboptimal local minima [9].
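As a minimal sketch of the wrapped angle difference described above, the following Python snippet may help. The function names `wrap` and `geodesic_error` are illustrative only and are not taken from Cartographer or the Navigation Toolbox, and the half-open convention $[-\pi, \pi)$ is an assumption made here for concreteness.

```python
import math

def wrap(angle):
    """Map an angle in radians to the equivalent angle on [-pi, pi).

    The half-open convention is an assumption of this sketch.
    """
    return (angle + math.pi) % (2.0 * math.pi) - math.pi

def geodesic_error(theta_a, theta_b):
    """Wrapped difference between two planar orientations (hypothetical helper)."""
    return wrap(theta_a - theta_b)

# Orientations of 350 deg and 10 deg differ by 340 deg before wrapping,
# but are only 20 deg apart once the difference is wrapped.
err = geodesic_error(math.radians(350.0), math.radians(10.0))
```

The modulo operation is what makes the resulting error function only piecewise smooth: the wrapped difference jumps between the two ends of the interval whenever the raw difference crosses an odd multiple of $\pi$.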

Another method to evaluate orientation error is to use the “chordal distance” (e.g. [10, 11]), which is calculated using the Frobenius norm of the difference in rotation matrices. The use of the chordal distance in the cost function of a pose-graph optimization problem has been investigated in several previous works. By using chordal cost and reformulating the problem as an equality-constrained optimization problem, global optimality can be certified for a pose-graph problem by solving a semidefinite program [11, 12]. Building upon this work, methods to speed up the computation of the global optimality certificate are proposed in [13]. Solvers have also been developed that yield certifiably globally optimal solutions [14, 15, 16].
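The chordal distance between two planar orientations can be sketched in a few lines of Python. The helper names `rotation` and `chordal_distance` are hypothetical; the code simply evaluates the Frobenius norm of the difference of two 2x2 rotation matrices, which for planar rotations reduces to $2\sqrt{2}\,|\sin(\Delta/2)|$ for an angle difference $\Delta$.

```python
import math

def rotation(theta):
    """2x2 rotation matrix for a planar angle, as nested tuples."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c, -s), (s, c))

def chordal_distance(theta_a, theta_b):
    """Frobenius norm of R(theta_a) - R(theta_b)."""
    ra, rb = rotation(theta_a), rotation(theta_b)
    return math.sqrt(sum((ra[i][j] - rb[i][j]) ** 2
                         for i in range(2) for j in range(2)))

# Compare against the closed form 2*sqrt(2)*|sin(delta/2)|.
delta = 1.2
numeric = chordal_distance(0.4, 0.4 + delta)
closed_form = 2.0 * math.sqrt(2.0) * abs(math.sin(delta / 2.0))
```

Unlike the wrapped difference, this quantity is a smooth, periodic function of the angles, so no explicit wrapping step is needed.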

(a) Regions , marked by red lines.
(b) Poses at local minimum using geodesic cost.
Fig. 3: (a) This heatmap motivates why there may be local minima when using geodesic distance. This is a plot of one period of the angular part of the cost function using geodesic distance. Clearly, local minima exist in the upper-left and lower-right triangular regions, which suggests that there may be local minima in the full cost function. See Section III for details. (b) Poses corresponding to a local minimum of a minimal 2D pose-graph problem using geodesic cost. Note the large orientation error of pose 2. See Section V for more detail about this example problem.

While practical SLAM problems are much larger, analyzing a small problem allows clear conclusions to be made, which can inform the analysis of larger problems. Several papers have investigated “minimal” SLAM problems in an attempt to show the fundamental structure and limitations of different formulations of SLAM. In the formulation in [7], the authors concluded that for noise in some bounded interval, there is a unique global minimum, and no local minima. However, the authors assumed that the angle differences always lie within the range on which the wrap function acts as the identity. This allowed the angular terms to be treated as linear, which facilitates analysis, but cannot be assumed in general. In another paper [9], the authors compare the use of geodesic and chordal distance in a feature-based SLAM problem with two robot poses and a single landmark, which is considered to have no orientation. For that problem, using chordal cost, the authors concluded that a unique global minimum exists, regardless of noise.

This paper compares the influence of using geodesic or chordal distance on the convergence properties of a minimal pose-graph problem. In this paper, we study a planar pose-graph problem with three poses and three measurements. Our contributions are:

  • We prove that even in the case of perfect measurements with spherical covariance, if geodesic distance is used, multiple suboptimal local minima exist (see Figure 3(b) for an example of a local minimum). This is a direct result of representing orientation error using geodesic distance (see Figure 3(a)). This clarifies the work in [7]; in particular, it answers the question of what happens when angle differences outside the range on which wrap is the identity are considered.

  • We numerically estimate the regions of attraction to the local minima for some examples with varying noise magnitude, and show that they are of nonzero measure. Conservative regions of attraction to the global minimum have been investigated for higher-dimensional problems using Gauss-Newton [17]; in this paper, due to the small size of the problem, we can approximate the size of the region of attraction to the global (and local) minima, with little conservatism.

  • We build upon [9], asking the question: “Does a unique global minimum for chordal cost with any noise magnitude exist for the 3-pose case too?” By adding orientation information to the landmark, that problem becomes the 3-pose problem considered in this paper. We prove that for the noise-free case, a unique global minimum exists, but provide a counterexample to show that uniqueness of the global minimum does not hold for arbitrary noise in the 3-pose case.

  • Finally, we search for points that failed to converge to the global minimum in the case of chordal cost. Across all three example problems, only four singleton points are found, each failing to converge due to numerical issues. This is a significant reduction in area compared to the regions of attraction to local minima in the geodesic cost case.

The paper is structured as follows: we first define the minimal SLAM problem using geodesic and chordal cost in Section II, and rewrite them in more convenient formulations. Then, we analyze the number and nature of minima for geodesic and chordal cost in Sections III and IV, respectively. In Section V, we analyze a few examples and compute their regions of attraction to local minima when using geodesic cost. Finally, we conclude with Section VI.

II Two formulations of a 3-pose planar pose-graph SLAM problem

II-A Notation and conventions

In this paper, we use the semicolon to mean vertical vector concatenation. For an angle $\theta$, let $R(\theta)$ be its corresponding rotation matrix.

II-B The 3-pose problem

In this paper we consider a 2D pose-graph problem with three poses and three measurements. Let each of the poses have a position , and an orientation for ; for short we write . Let the vector and the vector . In this paper we will treat and as fixed, so we exclude them from and .

Suppose at each pose the robot has taken some measurements from pose to pose : a relative position , and a relative rotation ; as before, we use the shorthand . We assume the most ideal case, that each measurement in and has variance $\sigma^2$; hence let $\Sigma = \sigma^2 I$, where $I$ is the (square) identity matrix whose size is determined by context. Hence $\Sigma$ is a spherical covariance matrix [18]. For simplicity, we have assumed that and have the same variance. However, the analysis in this paper holds if the position and orientations have different variances.

In this paper’s formulation of the 3-pose problem, we assume there are three measurements, resulting in three relative positions and three relative rotations . Throughout the paper, we assume that none of the measurements are zero, and that none of the poses are equal to another.

With these definitions, a pose-graph optimization problem can be set up to find robot poses that best satisfy these measurements, according to some cost function. One formulation of the cost function is what we will call the “geodesic cost” :


where is the Mahalanobis distance with respect to covariance , is the set , and returns the angle equivalent to on the interval .

The other cost function we consider in this paper will be called “chordal cost” :


where $\|\cdot\|_F$ is the Frobenius norm of a matrix. The factor of $1/2$ is introduced so that the geodesic and chordal costs have the same linearization at the origin, cf. [12, Remark 1]. The two cost functions are different ways of quantifying the same qualitative idea: they evaluate how well given poses “match” the measurements. Then, by minimizing either cost, poses can be found that explain the measurements well.
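The role of the scaling factor on the chordal term can be checked with a short worked expansion. This is a sketch under stated assumptions: the factor is taken to be $1/2$, and the symbols $R(\cdot)$, $\theta_a$, $\theta_b$, and $\Delta$ are notation introduced here for illustration.

```latex
\frac{1}{2}\,\bigl\| R(\theta_a) - R(\theta_b) \bigr\|_F^2
  = \frac{1}{2}\,\bigl( 4 - 4\cos\Delta \bigr)
  = 2\,(1 - \cos\Delta)
  = \Delta^2 - \frac{\Delta^4}{12} + O(\Delta^6),
\qquad \Delta = \theta_a - \theta_b .
```

For $|\Delta| < \pi$ the squared wrapped difference is exactly $\Delta^2$, so under this scaling the two angular penalties agree to second order around zero error and differ only in the higher-order terms.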

Notice that and share the same , and differ only in and . For our particular problem, simplifies to:


since and does not depend on .

II-C Dimensionality reduction via Schur complement

The decision space for the optimization problems minimizing and is six-dimensional. In this subsection we reduce it to two dimensions by noticing that the problem of minimizing can be solved in closed-form given any . The following lemma will aid us in this [18]:

Lemma 1

The linear least-squares problem of minimizing has a unique solution if has full column rank, and :


where .
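To illustrate the kind of closed-form solve that Lemma 1 provides, here is a minimal sketch of an unweighted linear least-squares problem with two unknowns, solved through the normal equations. The function name `solve_least_squares_2` and the example data are hypothetical, and the unweighted form is an assumption; the lemma as used in the paper may carry a covariance weighting.

```python
def solve_least_squares_2(A, b):
    """Minimize ||A x - b||^2 for an m-by-2 matrix A via the normal equations.

    A is a list of 2-element rows and b a list of m numbers. A unique
    minimizer exists only when A has full column rank.
    """
    # Entries of the Gram matrix G = A^T A and of the vector h = A^T b.
    g00 = sum(r[0] * r[0] for r in A)
    g01 = sum(r[0] * r[1] for r in A)
    g11 = sum(r[1] * r[1] for r in A)
    h0 = sum(r[0] * bi for r, bi in zip(A, b))
    h1 = sum(r[1] * bi for r, bi in zip(A, b))
    det = g00 * g11 - g01 * g01
    if abs(det) < 1e-12:
        raise ValueError("A does not have full column rank")
    # Solve the 2x2 system G x = h with Cramer's rule.
    return (g11 * h0 - g01 * h1) / det, (g00 * h1 - g01 * h0) / det

A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [1.0, 2.0, 4.0]
x0, x1 = solve_least_squares_2(A, b)

# At the minimizer the residual is orthogonal to the columns of A.
residual = [bi - (r[0] * x0 + r[1] * x1) for r, bi in zip(A, b)]
```

Because the inner problem has a closed-form minimizer of this kind, the linear unknowns can be eliminated and the outer problem only needs to search over the remaining variables.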

Now, we rewrite for use with Lemma 1:


where , is the matrix that is , and


The matrix is . Hence by Lemma 1,


evaluates to [18, Theorem 1]:


where the constants , , and are determined by the measurement and covariance data only. Notice that , since the measurements are assumed to be nonzero.

Then, if we let


then instead of solving (3) and (4), we can instead solve the two-dimensional problems:


We will use and interchangeably, and similarly with and . The following lemma tells us what minima in and imply about minima in and , which will be used in the proofs of Theorems 1 and 2.

Lemma 2

Consider the problem of minimizing a function of two variables with . Suppose also there exists a function that for fixed ,


Then, if




If additionally is known to have a unique minimum, then is its unique minimum.

For any , by (20) and (19),


and hence (21). Uniqueness of the minimum on yields uniqueness of .

III Analyzing local minima of “geodesic cost”

In this section, we consider the pose-graph optimization problem (17). We further reduce it to a set of one-dimensional optimization problems, and use these 1D optimization problems to analyze the local and global minima of (17) and the original problem (3).

III-A Representing (17) as three 1D optimization problems

Let be the square with and . Because wrap() is $2\pi$-periodic in and , it suffices to consider only when analyzing . Figure 7(a) shows a surface plot of .

(a) Surface plot of on for .
(b) Surface plot of on .
(c) 1D problem costs .
Fig. 7: Plots from the noise-free 3-pose example problem in Section V using geodesic cost. (a) Notice that on each square , there are three minima of , one on each region (cf. Figure 3(a)). (b) Now for : suboptimal local minima are marked by pink x’s; we show their existence in Theorem 1. The global minimum in is marked by a black x. (c) 1D optimal costs plotted on their domain of definition in . Notice minima exist for with cost approximately equal to 20, which are marked with black ‘x’s.

On the region , the function is:


To keep notation compact, we use the shorthand


Then, on , can be replaced by:


Hence we can rewrite this as for on some appropriate regions of . This results in a natural subdivision of into three regions (see Figure 3(a)): for , , which corresponds to the (open) lower right triangle, for , , which corresponds to the (open) upper left triangle, and for , the middle region (also open). Notice we have included in any points on the non-differentiable boundary where .

Then, for each , for , can be rewritten:


For each , it can be seen that (26) is a least-squares cost function in . That is, for any given and , we can find the optimal that minimizes by again using Lemma 1. This reduces the 2D optimization problem of minimizing to minimizing a 1D problem. We rewrite as:


where the column vectors , and . The matrix is equal to . Hence the angular component of the cost function can be written:


where , and


Hence can also be re-written as a set of one-dimensional cost functions:


For each , this is obviously smooth, and has derivatives:


Hence we have reduced the dimension of the optimization problem from 2D in to a set of three 1D optimization problems in . Figure 7(c) shows the one-dimensional for each region for the example problem in Section V.

We also define the 2- and 6-dimensional cost functions for each . In place of , we consider three corresponding 2D problems:


and in place of ,

Remark 1

Note however that even if some is a global minimum of , this does not necessarily imply it is a global minimum of . This is because only on ; may be less than outside .

III-B Main result: Existence of multiple local minima of

Now that we have represented as a triplet of 1D problems , we use them to analyze and .

In this section, we assume that the measurements are “perfect”; that is,


When the measurements are perfect, , the global minimum on , should match measurements exactly: . This can be seen by checking that . We are more interested in proving the existence of suboptimal local minima.

We claim that even in the ideal case of spherical covariance and perfect measurements, there are multiple suboptimal local minima of and . The proofs contain only elementary linear algebra and vector calculus, and have been relegated to the appendix.

Lemma 3

Assume (35) holds. Then, there are no global minima of in .

Theorem 1

Assume that the measurements are perfect, i.e. (35). Then, has at least two suboptimal local minima on , one in , and the other in . Each of these corresponds to a (suboptimal) local minimum of .

However, in practice, (35) does not hold, and there is usually some inconsistency in the measurements:


It is easy to show that the boundaries of the regions vary with ; see Figure 14 for some examples. In the event that is “large”, the number of minima on may change.

Remark 2

For “large enough” measurement mismatch , there may not exist minima on the open set . In the proof of Theorem 1, suppose and fixed and , i.e. we are in case “b” in the proof. Then, Theorem 1 relies on finding a minimum in the interval , where . However, with , the interval of on which is defined shrinks. If it shrinks enough so that the minimum of found through Theorem 1 is not actually in , there will not be a minimum in . The same logic applies to .

In conclusion, the use of geodesic distance in the cost function results in a nonsmooth cost function that has multiple suboptimal local minima, even in the case of perfect measurements.
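The mechanism can be seen on a toy angular-only cost, sketched below in Python. This is not the full cost function of the paper: positions are ignored, the three edge errors are assumed to be `d1`, `d2 - d1`, and `d2` for orientation deviations `d1` and `d2` from a measurement-consistent solution, and all function names are hypothetical. For this toy cost, the point $(-2\pi/3,\ 2\pi/3)$ is a strict local minimum of the wrapped (geodesic) version, while the same point is a local maximum of the chordal version.

```python
import math

def wrap(angle):
    """Equivalent angle on [-pi, pi) (half-open convention assumed here)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi

def geodesic_angular_cost(d1, d2):
    """Toy angular cost: squared wrapped errors on three assumed edges."""
    return wrap(d1) ** 2 + wrap(d2 - d1) ** 2 + wrap(d2) ** 2

def chordal_angular_cost(d1, d2):
    """Same edges, using (1/2)||R(a) - R(b)||_F^2 = 2 (1 - cos(a - b))."""
    return 2.0 * ((1.0 - math.cos(d1))
                  + (1.0 - math.cos(d2 - d1))
                  + (1.0 - math.cos(d2)))

def ring(center, radius=0.1, count=8):
    """Points on a small circle around `center`, used to probe its neighbourhood."""
    return [(center[0] + radius * math.cos(2.0 * math.pi * k / count),
             center[1] + radius * math.sin(2.0 * math.pi * k / count))
            for k in range(count)]

candidate = (-2.0 * math.pi / 3.0, 2.0 * math.pi / 3.0)
geo_here = geodesic_angular_cost(*candidate)   # 4*pi^2/3, well above the global value 0
cho_here = chordal_angular_cost(*candidate)    # 9

# Every nearby sample has higher geodesic cost but lower chordal cost.
geodesic_is_local_min = all(geodesic_angular_cost(*p) > geo_here for p in ring(candidate))
chordal_is_local_max = all(chordal_angular_cost(*p) < cho_here for p in ring(candidate))
```

In this toy setting the suboptimal minimum arises because all three wrapped errors settle at magnitude $2\pi/3$: each individual term is then inside the linear range of wrap, even though the errors are far from zero.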

IV Analyzing minima of chordal cost

In this section we investigate the minima of optimization problems (18) and (4). In contrast to the previous section, we show that if measurements are perfect, the use of chordal distance yields a unique global minimum, and no suboptimal local minima.

Expanding and simplifying from (2):


Figure 10(a) shows for the example problem considered in Figure 3(b). Hence (18) can be rewritten:


Figure 10(b) shows for our example. We will also make use of , the Jacobian of :


and the Hessian


IV-A Main result: Unique existence of global minimum of

The main claims of this section are the following theorems. Again, the proofs are elementary and have been relegated to the appendix.

Theorem 2

Assume that the measurements are perfect, i.e. (35) holds. Then, is the unique minimum on .

However, for the case of imperfect measurements, this is no longer true. In the worst case, for , we have two distinct global minima on :

Theorem 3

If with , multiple distinct global minima of exist on .

Hence it cannot be true that a unique global minimum exists for arbitrary noise. This is a significant difference of the 3-pose problem compared to the “one-step” problem in [9], which had a unique minimum for any noise magnitude.

(a) Cost for chordal distance .
(b) Chordal cost on .
Fig. 10: Plots from the noise-free 3-pose example problem in Section V using chordal cost. (a) Compared to in Figure 3(a), is much better behaved. (b) Compared to Figure 7(b), is smooth and has a unique minimum on . Maxima are marked with pink ‘x’s; the unique global minimum is marked with a black ‘x’.
Fig. 13: Poses at local minima of for example problems 2 (left) and 3 (right), which incorporate noisy, imperfect measurements (c.f. Table I).
Fig. 14: Sampled initial conditions (IC’s) that did not converge to the global minimum for each example problem in the first three columns of Table I. (Top row) All IC’s that did not converge to the global minimum converged instead to a local minimum. The diagonal lines indicate the boundaries of . (Bottom row) IC’s not converging to the global minimum are marked as for better visibility; these IC’s all failed to converge due to numerical issues.

V Examples and Discussion

In this section we consider several numerical examples, firstly to illustrate the results of Theorem 1 in the case of perfect measurements, and secondly to give more intuition about the noisy case, which is far more common in practice.

Theorem 1 applies to any planar 3-pose problem with perfect measurements, and not just a single, contrived example. To emphasize this, we consider three problems with three different ground truths. To investigate the effect of noise, we have applied three levels of noise, one to each example problem: . Noise of was added to each orientation measurement; no noise was added to position measurements. Figure 13 shows poses corresponding to the local minima of the two noisy 3-pose problems, and Table I summarizes these problems. We have also verified the existence of these local minima separately using MATLAB’s Navigation Toolbox.

Problem | Ground truth (rad) | Noise (rad) | % of IC’s converging to local min (geodesic cost)
1 | | 0 | 0.2%
2 | | 0.1 | 0.4%
3 | | | 19.8%
TABLE I: Percentage of sampled points converging to a local minimum

Figure 14 shows plots of initial conditions (IC’s) on that failed to converge to the global minimum; both chordal and geodesic cost were considered for each example problem in Table I. Even though the problems have different ground truth poses, comparison still makes sense, since the positions have been “optimized out” (c.f. Section II-C), and the - and - axes in these plots are deviations of and . For each problem, a uniform grid of was constructed. Each grid point was used as the initial condition for MATLAB’s fminunc solver. If a global minimum was reached, the grid point was omitted from the plot; otherwise, it was plotted. Hence the top row of Figure 14 shows an approximation of the region of attraction (ROA) to suboptimal local minima for each problem. While we have only shown that a finite number of sampled IC’s converged to a local minimum, it seems reasonable due to smoothness that all points on some continuous area “in between” these sampled IC’s will also converge to the local minimum (obviously, this depends on the choice of solver).
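The grid-sampling procedure described above can be sketched as follows. This is a toy stand-in, not the experiment in the paper: it uses an assumed angular-only geodesic cost (edge errors `d1`, `d2 - d1`, `d2`), plain fixed-step gradient descent instead of MATLAB’s fminunc, and a coarse 20-by-20 grid; all names and parameter values are hypothetical, so the resulting fraction is not comparable to Table I.

```python
import math

TWO_PI = 2.0 * math.pi

def wrap(angle):
    """Equivalent angle on [-pi, pi) (half-open convention assumed here)."""
    return (angle + math.pi) % TWO_PI - math.pi

def cost(d1, d2):
    """Toy angular geodesic cost on three assumed edges."""
    return wrap(d1) ** 2 + wrap(d2 - d1) ** 2 + wrap(d2) ** 2

def gradient(d1, d2):
    """Gradient of `cost` away from the non-differentiable wrap boundaries."""
    e1, e12, e2 = wrap(d1), wrap(d2 - d1), wrap(d2)
    return 2.0 * (e1 - e12), 2.0 * (e12 + e2)

def descend(d1, d2, step=0.1, iterations=300):
    """Fixed-step gradient descent from the initial condition (d1, d2)."""
    for _ in range(iterations):
        g1, g2 = gradient(d1, d2)
        d1, d2 = d1 - step * g1, d2 - step * g2
    return d1, d2

def stuck_initial_conditions(n=20, cost_threshold=1.0):
    """Grid points on [-pi, pi)^2 whose descent ends above the global minimum cost."""
    stuck = []
    for i in range(n):
        for j in range(n):
            start = (-math.pi + TWO_PI * i / n, -math.pi + TWO_PI * j / n)
            if cost(*descend(*start)) > cost_threshold:
                stuck.append(start)
    return stuck

stuck = stuck_initial_conditions()
fraction_stuck = len(stuck) / (20 * 20)
```

Plotting the points in `stuck` gives a sampled picture of the regions of attraction to the suboptimal minima of this toy cost. As in the text, the result depends on the solver: a different step size or a second-order method would generally change which initial conditions end up in which basin.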

The top row of Figure 14 yields several interesting conclusions. For and , the sampled IC’s in always converged to the global minimum. This is consistent with the experience of many users that although wrap() is used in the cost function, good results are obtained. A common method for initializing for consecutive poses is to use odometry; if odometry measurements are reasonably accurate, the initial conditions are likely to be in , i.e. the linear region of wrap(). This is also consistent with the conclusions in [7], namely that in the noise-free case, a unique minimum that is globally optimal exists if wrap() is assumed to be the identity.

However, for each problem, in the case of geodesic cost, there were some IC’s that failed to converge to the global minimum (shown in blue). All of these IC’s converged successfully instead to a local minimum. In the case of , the local minimum in the top left region disappeared (c.f. Remark 2). However, while the total number of suboptimal local minima decreased, the region of attraction to the other local minimum is enormous; almost 20% of the initial conditions on converged to it.

The bottom row of Figure 14 shows the same investigation applied to chordal cost . Clearly, many fewer IC’s fail to converge to the global minimum. Even though Theorem 2 guarantees a unique global minimum on , there were several singleton initial conditions across the that failed to converge to the global minimum. The result of fminunc for each of these initial conditions had a large gradient (around 20), and had Hessians with condition number on the order of , suggesting numerical issues pertaining to the choice of solver and tolerances.

We emphasize that the nature of IC’s failing to converge to the global minimum for is different from those in , which converged successfully, albeit to suboptimal local minima. While some issues pertaining to numerical solvers persist, it is clear from Figure 14 that has considerable advantages over when it comes to convergence.

VI Conclusion

In this paper, we have shown that for a minimal pose-graph problem, even in the case of ideal measurements, the use of geodesic distance in the cost function results in multiple suboptimal local minima. For several numerical examples, we give evidence that the regions of attraction to these local minima are of nonzero measure, and show that some of these regions of attraction increase in size as noise is added.

In contrast, under the same idealized conditions, the use of the chordal cost instead of geodesic cost yields a unique global minimum, up to periodicity. In our examples, which have various total noise magnitudes up to , the region of attraction of the global minimum is shown to be the whole of , except for one or two points due to numerical issues. However, for extremely large noise , we show that multiple distinct global minima exist, even for chordal cost.

While we cannot claim our results apply directly to larger problems, the existence of these regions of attraction due to geodesic cost for this ideal, minimal problem suggests that similar regions may exist for larger problems. Going forward, a clear future direction would be to extend the analysis to the 3D case and the $n$-pose 2D case. Also, investigating the connection to Lie-algebraic methods, which share the benefits of using chordal cost, would be a valuable addition to our understanding of the fundamental nature of pose-graph SLAM problems.


In this appendix we give short proofs to save space. The full proofs are available at:

Proof (Lemma 3). We first consider the case , aiming to show that for every , . Consider the difference in the 1D optimal costs on and :


since measurements are perfect. For , this is negative; therefore and hence no global minima exist in . The proof for is very similar and therefore has been omitted.

The following propositions will help us in the proof of Theorem 1.

Proposition 1


Proof (Proposition 1). We know that . We aim to show that also equals this expression, so that on .

Now, can be directly evaluated:


where is the identity matrix. Recall also that (c.f. (9)). Then, let and be position measurements in the - and -directions between poses and , in the frame of pose . Finally, letting and , we can directly evaluate and :


Now we turn our attention to , and show that it can also be written . Figure 15 shows a diagram of the geometry used to write in this way. From Figure 15, it is evident that

by (43) and (44), and the fact that multiplying both of them by does not change the value of the two-argument arctangent.

Fig. 15: Drawing of geometry for the proof of Proposition 1.
Proposition 2

On , .

On , via Proposition 1, . Also, by direct calculation,


Hence for , .

Proof (Theorem 1). Consider first , which corresponds to the top left triangle, where . Then, by (35),


via (45). We aim to show that on an interval where . Now, , if and only if


We consider two cases: case “a”, where , and case “b”, where . Since we are not considering the non-differentiable boundaries of , we ignore the case where .

In case “a”, (48) is always true, so for all , and any critical point will be a minimum. We cannot solve (46) directly, but we can show a solution exists. By substitution, it can be shown that , and that . Since is continuous, by the intermediate value theorem, for some .

For case “b”, on . As in case “a”, we still use the intermediate value theorem, but instead evaluate , which equals:


which is strictly negative if . Hence there exists a root of on , where , which implies a minimum of on .

It is trivial to check that for , satisfies the inequality constraints at the boundaries of ; hence the minimum is in . Since minimizes , by Lemma 2, is a global minimum of . However, while is a global minimum of , only on . By Remark 1 and Lemma 3, is merely a local minimum of . Again by Lemma 2, the minimum of and the existence of implies that is a global minimum of