I Introduction
Simultaneous Localization and Mapping (SLAM) is the backbone of several robotics applications. SLAM is already widely adopted in consumer applications (e.g., robot vacuum cleaning, warehouse maintenance, virtual/augmented reality), and is a key enabler for truly autonomous systems operating in the wild, ranging from unmanned aerial vehicles operating in GPS-denied scenarios to self-driving cars and trucks.
Despite the relentless advances in SLAM, both researchers and practitioners are well aware of the brittleness of current SLAM systems. While SLAM failures are a tolerable price to pay in some consumer applications, they may put human life at risk in several safety-critical applications. For this reason, SLAM is often avoided in those applications (e.g., self-driving cars) in favor of alternative solutions where the map is built beforehand in an offline (and typically human-supervised) manner, even though this implies extra setup costs.
Arguably, the main cause of SLAM failure is the presence of incorrect data association and outliers [1]. Incorrect data association is caused by perceptual aliasing, the phenomenon in which different places generate a similar visual (or, in general, perceptual) footprint. Perceptual aliasing leads to associating the measurements taken by the robot with the wrong portion of the map, which typically causes map deformations and potentially catastrophic failure of the mapping process. The problem is exacerbated by the fact that those outliers are highly correlated: due to the temporal nature of the data collection, perceptual aliasing creates a large number of mutually-consistent outliers. This correlation makes it even harder to judge whether a measurement is an inlier, contributing to the brittleness of the resulting SLAM pipeline. Surprisingly, while the SLAM literature has extensively focused on mitigating the effects of perceptual aliasing, to the best of our knowledge there is no framework to directly model outlier correlation.
Contribution. This work provides a unified framework to model perceptual aliasing and outlier correlation in SLAM. We propose a radically new approach to obtain provably-robust SLAM algorithms: rather than developing techniques to mitigate the impact of perceptual aliasing, we explicitly model perceptual aliasing using a discrete-continuous graphical model (DCGM). A simple illustration is given in Fig. 1. The figure shows a DCGM where the continuous variables, shown in blue, describe a standard SLAM formulation, i.e., a pose graph [1], in which the triangles represent the trajectory of a moving robot and the edges represent measurements. The figure shows that we associate a discrete variable (large red circles) to each edge/measurement in the pose graph. The discrete variables decide between accepting or rejecting a given measurement. The red edges in the top portion of the figure model the correlation between discrete variables. The expert reader will recognize the graph at the top of the figure (in red) as a discrete Markov Random Field (MRF) [2]. The proposed model can naturally capture outlier correlation: for instance, we can model the correlation between three nearby edges in Fig. 1 as a clique involving the corresponding discrete variables in the MRF (red triangle in the figure). Similarly, we can capture the temporal correlation of wheel slippage episodes by connecting variables corresponding to consecutive edges.
Our second contribution is the design of a semidefinite programming (SDP) relaxation that computes a near-optimal estimate of the variables in the DCGM. Inference in the DCGM is intractable in general, due to the non-convexity of the corresponding estimation problem and to the presence of discrete variables. We show how to obtain an SDP relaxation with per-instance suboptimality guarantees, generalizing previous work on provably-correct SLAM without outliers [3, 4, 5, 6, 7]. The SDP relaxation can be solved in polynomial time by off-the-shelf convex solvers, without relying on an initial guess.
Our last contribution is an experimental evaluation on standard SLAM benchmarking datasets. The experimental results show that the proposed DCGM model compares favorably with state-of-the-art methods, including Vertigo [8], RRR [9], and DCS [10]. Moreover, they confirm that modeling outlier correlation further increases the resilience of the proposed model, which is able to compute correct SLAM estimates even when a large fraction of the loop closures are highly-correlated outliers. Our current (Matlab) implementation is slow compared to state-of-the-art methods, but the proposed approach can be sped up by designing a specialized solver along the lines of [7]. We leave these numerical aspects (which are both interesting and non-trivial on their own) to future work.
Paper structure. Section II provides preliminary notions on MRFs and pose graph optimization. Section III presents our new hybrid discrete-continuous graphical model. Section IV presents a semidefinite programming relaxation for inference in the DCGM. Section V presents experimental results, while Section VI concludes the paper.
II Preliminaries and Related Work
This section reviews basic concepts about Markov Random Fields and Pose Graph Optimization.
II-A Markov Random Fields (MRFs)
Markov Random Fields (MRFs) are a popular graphical model for reconstruction and recognition problems in computer vision and robotics [11, 2, 12]. A pairwise MRF is defined by a set of nodes we want to label and a set of edges or potentials, representing probabilistic constraints involving the labels of a single node or a pair of nodes. Here we consider binary MRFs, where we associate a binary label x_i ∈ {−1,+1} to each node i. The maximum a posteriori (MAP) estimate of the variables in the MRF is the assignment of the node labels that attains the maximum of the posterior distribution of the MRF or, equivalently, the minimum of the negative log-posterior [11]:
min_{x_i ∈ {−1,+1}}  − Σ_{i∈U} ū_i x_i − Σ_{(i,j)∈B} b̄_ij x_i x_j    (1)
where U is the set of unary potentials (probabilistic constraints involving a single node) and B is the set of binary potentials (involving a pair of nodes). Intuitively, if ū_i > 0 (resp. ū_i < 0), the unary term encourages the label x_i = +1 (resp. x_i = −1) for node i. Similarly, if b̄_ij > 0, the binary term encourages nodes i and j to have the same label, since agreeing labels decrease the cost (1). While several choices of unary and binary potentials are possible, the expression in eq. (1) is a very popular model, referred to as the Ising model [2, Section 1.4.1].
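To make the MAP problem above concrete, the following sketch computes the Ising energy and the exact MAP assignment of a tiny binary MRF by exhaustive enumeration; the function and variable names are our own illustration, not from the paper, and brute force is of course only viable for very small graphs:

```python
from itertools import product

def ising_energy(x, unary, binary):
    """Negative log-posterior of a binary MRF (Ising model, cf. eq. (1)).

    x      : dict node -> label in {-1, +1}
    unary  : dict node -> u_i   (u_i > 0 encourages x_i = +1)
    binary : dict (i, j) -> b_ij (b_ij > 0 encourages x_i == x_j)
    """
    e = -sum(u * x[i] for i, u in unary.items())
    e -= sum(b * x[i] * x[j] for (i, j), b in binary.items())
    return e

def map_estimate(nodes, unary, binary):
    # Exhaustive search over all 2^n labelings: exact MAP for tiny MRFs.
    best = None
    for labels in product([-1, 1], repeat=len(nodes)):
        x = dict(zip(nodes, labels))
        e = ising_energy(x, unary, binary)
        if best is None or e < best[1]:
            best = (x, e)
    return best

# One unary prior pulling node "a" to +1, and two agreement potentials:
nodes = ["a", "b", "c"]
unary = {"a": 1.0}
binary = {("a", "b"): 2.0, ("b", "c"): 2.0}
x_map, e_map = map_estimate(nodes, unary, binary)  # all labels become +1
```

The agreement potentials propagate the unary evidence on node "a" to the other nodes, which is exactly the mechanism the paper later uses to couple inlier/outlier decisions.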
II-B Pose Graph Optimization (PGO)
Pose Graph Optimization (PGO) is one of the most popular models for SLAM. PGO consists in estimating a set of poses (i.e., rotations and translations) from pairwise relative pose measurements. In computer vision, a similar problem (typically involving only rotations) is used as a preprocessing step for bundle adjustment in Structure from Motion (SfM) [16].
PGO estimates n poses from relative pose measurements. Each to-be-estimated pose T_i = (t_i, R_i), i = 1, …, n, comprises a translation vector t_i ∈ ℝ^d and a rotation matrix R_i ∈ SO(d), where d = 2 in planar problems and d = 3 in three-dimensional problems. For a pair of poses (i, j), a relative pose measurement (t̄_ij, R̄_ij), with t̄_ij ∈ ℝ^d and R̄_ij ∈ SO(d), describes a noisy measurement of the relative pose between T_i and T_j. Each measurement is assumed to be sampled from the following generative model:
t̄_ij = R_iᵀ (t_j − t_i) + t^ε_ij,    R̄_ij = R_iᵀ R_j R^ε_ij    (2)
where t^ε_ij and R^ε_ij represent the translation and rotation measurement noise, respectively. PGO can be thought of as an MRF with variables living on a manifold: we need to assign a pose to each node in a graph, given the relative measurements associated with the edges of the graph. The resulting graph is usually referred to as a pose graph.
Assuming the translation noise is Normally distributed with zero mean and information matrix ω_t² I_d, and the rotation noise follows a Langevin distribution [5, 4] with concentration parameter ω_R², the MAP estimate for the unknown poses can be computed by solving the following optimization problem:

min_{t_i ∈ ℝ^d, R_i ∈ SO(d)}  Σ_{(i,j)∈E} ω_t² ‖t_j − t_i − R_i t̄_ij‖² + (ω_R²/2) ‖R_j − R_i R̄_ij‖_F²    (3)
where ‖·‖_F denotes the Frobenius norm. The derivation of (3) is given in [4, Proposition 1]. The estimator (3) involves solving a non-convex optimization problem, due to the non-convexity of the set SO(d). Recent results [4, 7] show that one can still compute a globally-optimal solution to (3), when the measurement noise is reasonable, using convex relaxations.
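As a concrete illustration of the cost in (3), the following sketch evaluates the translation and Frobenius-norm rotation residuals for a toy 2D pose graph. This is our own minimal construction (the weight names w_t, w_R and the data layout are illustrative assumptions), not the paper's implementation:

```python
import math

def rot2(theta):
    """2x2 rotation matrix as nested lists."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_vec(A, v):
    return [sum(A[i][k] * v[k] for k in range(2)) for i in range(2)]

def pgo_cost(poses, measurements, w_t=1.0, w_R=1.0):
    """Cost (3) for 2D poses.

    poses        : list of (t, theta), with t a 2-vector and theta an angle
    measurements : list of (i, j, t_ij, theta_ij) relative measurements
    """
    cost = 0.0
    for i, j, t_ij, th_ij in measurements:
        t_i, th_i = poses[i]
        t_j, th_j = poses[j]
        R_i = rot2(th_i)
        # translation residual: t_j - t_i - R_i * t_ij
        pred = mat_vec(R_i, t_ij)
        r_t = [t_j[k] - t_i[k] - pred[k] for k in range(2)]
        cost += w_t * sum(r * r for r in r_t)
        # rotation residual: || R_j - R_i * R_ij ||_F^2
        R_pred = mat_mul(R_i, rot2(th_ij))
        R_j = rot2(th_j)
        cost += (w_R / 2.0) * sum((R_j[a][b] - R_pred[a][b]) ** 2
                                  for a in range(2) for b in range(2))
    return cost

# A noiseless measurement evaluates to zero cost at the true poses.
poses = [([0.0, 0.0], 0.0), ([1.0, 0.0], math.pi / 2)]
meas = [(0, 1, [1.0, 0.0], math.pi / 2)]
```

Perturbing any pose away from the measurement makes the cost strictly positive, which is the least-squares behavior the robust formulations below set out to tame.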
Unfortunately, the minimization (3) follows from the assumption that the measurement noise is light-tailed (e.g., Normally distributed translation noise), and it is known to produce completely wrong pose estimates when this assumption is violated, i.e., in the presence of outlying measurements.
II-C Robust PGO
The sensitivity of the formulation (3) to outliers is due to the fact that we minimize the squares of the residual errors (the quantities appearing in the squared terms): large residuals corresponding to spurious measurements dominate the cost. Robust estimators reduce the impact of outliers by adopting cost functions that grow slowly (i.e., less than quadratically) when the residual exceeds a given threshold; this is the idea behind robust M-estimators, see [17]. For instance, the Huber loss in Fig. 2 grows linearly outside a quadratic region around the origin. Ideally, one would like to adopt a truncated least squares (LS) formulation (Fig. 2), where the impact of arbitrarily large outliers remains bounded. Such a formulation, however, is non-convex and non-differentiable, typically making the resulting optimization hard.
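The qualitative difference between these losses can be sketched numerically; the thresholds below are illustrative choices of ours, not values from the paper:

```python
def squared_loss(r):
    return r * r

def huber_loss(r, k=1.0):
    """Quadratic for |r| <= k, linear beyond: robust, but still unbounded."""
    a = abs(r)
    return a * a if a <= k else k * (2 * a - k)

def truncated_ls(r, c=1.0):
    """Truncated least squares: any outlier's influence is capped at c^2."""
    return min(r * r, c * c)

# For a gross outlier (r = 100) the three losses behave very differently:
r = 100.0
losses = (squared_loss(r), huber_loss(r), truncated_ls(r))  # 10000, 199, 1
```

The squared loss lets the outlier dominate, the Huber loss slows its growth, and only the truncated LS loss bounds its influence outright, which is why the paper insists on the (harder) truncated formulation.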
Traditionally, outlier mitigation in SLAM and SfM relied on the use of robust M-estimators, see [18, 16]. Agarwal et al. [10] propose Dynamic Covariance Scaling (DCS), which dynamically adjusts the measurement covariances to reduce the influence of outliers. Olson and Agarwal [19] use a max-mixture distribution to accommodate multiple hypotheses on the noise distribution of a measurement. Casafranca et al. [20] minimize the ℓ1 norm of the residual errors. Lee et al. [21] use expectation maximization. Pfingsthorn and Birk [22] model ambiguous measurements using a mixture of Gaussians.

An alternative set of approaches attempts to explicitly identify and reject outliers. A popular technique is RANSAC, see [23], in which subsets of the measurements are sampled in order to identify an outlier-free set. Sünderhauf and Protzel [8, 24] propose Vertigo, which augments the PGO problem with latent binary variables (then relaxed to continuous variables) that are responsible for deactivating outliers. Latif et al. [9], Carlone et al. [25], Graham et al. [26], and Mangelson et al. [27] look for large sets of “mutually consistent” measurements.
III Discrete-Continuous Graphical Models for Robust Pose Graph Optimization
We propose a novel approach for robust PGO that addresses three main limitations of the state of the art. First, rather than mitigating outlier correlation, we explicitly model it. Second, our estimation scheme (Section IV) does not rely on any initial guess. Third, we go beyond recently-proposed convex relaxations for robust rotation and pose estimation [28, 6, 29], and use a non-convex loss, namely the truncated LS cost in Fig. 2. This circumvents issues with convex robust loss functions, which are known to have a low breakdown point (e.g., the Huber loss [6] or the ℓ1 norm [28, 6, 29] can be compromised by the presence of a single “bad” outlier).

III-A A unified view of robust PGO
Let us partition the edges of the pose graph into odometric edges E_od and loop-closure edges E_lc. Perceptual aliasing affects exteroceptive sensors, hence, while we can typically trust odometric edges, loop closures may include outliers.
According to the discussion in Section II-C, an ideal formulation for robust PGO would use a truncated LS cost for the loop-closure edges in E_lc:
min_{t_i, R_i}  Σ_{(i,j)∈E_od} [ ω_t² ‖t_j − t_i − R_i t̄_ij‖² + (ω_R²/2) ‖R_j − R_i R̄_ij‖_F² ]
  + Σ_{(i,j)∈E_lc} [ f_c̄t( ω_t ‖t_j − t_i − R_i t̄_ij‖ ) + f_c̄R( (ω_R/√2) ‖R_j − R_i R̄_ij‖_F ) ]    (4)
where, for a positive scalar c̄, the truncated LS function f_c̄ is:
f_c̄(r) = r²  if r² ≤ c̄²,   c̄²  otherwise    (5)
While the formulation (4) would be able to tolerate arbitrarily “bad” outliers, it has two main drawbacks. First, f_c̄ is non-convex, adding to the non-convexity already induced by the rotations (SO(d) is a non-convex set). Second, the cost is non-differentiable, as shown in Fig. 2, hence also preventing the use of fast (but local) smooth optimization techniques.
The first insight behind the proposed approach is simple but powerful: we can rewrite the truncated LS cost (5) as a minimization over a binary variable:
f_c̄(r) = min_{θ ∈ {−1,+1}}  ((1+θ)/2) r² + ((1−θ)/2) c̄²    (6)
To show the equivalence between (6) and (5), observe that for any r such that r² ≤ c̄², the minimum in (6) is attained at θ = +1, yielding f_c̄(r) = r²; on the other hand, for any r such that r² > c̄², the minimum in (6) is attained at θ = −1, yielding f_c̄(r) = c̄².
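This equivalence is easy to check numerically. The sketch below (our own naming, assuming the {−1,+1} convention for the binary variable) compares the truncated LS cost with its binary-variable rewriting on a range of residuals:

```python
def truncated_ls(r, c):
    """Truncated LS cost, direct form (cf. eq. (5))."""
    return min(r * r, c * c)

def truncated_ls_binary(r, c):
    """Truncated LS rewritten as a minimization over a binary variable
    theta in {-1, +1}: theta = +1 accepts the residual, theta = -1
    pays the fixed penalty c^2 instead (cf. eq. (6))."""
    return min((1 + theta) / 2 * r * r + (1 - theta) / 2 * c * c
               for theta in (-1, 1))

# The two forms agree for every residual, on both sides of the threshold.
checks = all(abs(truncated_ls(r, 1.0) - truncated_ls_binary(r, 1.0)) < 1e-12
             for r in [0.0, 0.5, 0.99, 1.0, 1.01, 10.0])
```

The minimizing theta is exactly the inlier/outlier decision, which is what turns the robust loss into a discrete variable that can later be coupled across measurements.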
We can now use the expression (6) to rewrite the cost function (4) by introducing a binary variable for each rotation and translation measurement:
min_{t_i, R_i, θ^t_ij, θ^R_ij ∈ {−1,+1}}  Σ_{(i,j)∈E_od} [ ω_t² ‖t_j − t_i − R_i t̄_ij‖² + (ω_R²/2) ‖R_j − R_i R̄_ij‖_F² ]
  + Σ_{(i,j)∈E_lc} [ ((1+θ^t_ij)/2) ω_t² ‖t_j − t_i − R_i t̄_ij‖² + ((1−θ^t_ij)/2) c̄_t²
  + ((1+θ^R_ij)/2) (ω_R²/2) ‖R_j − R_i R̄_ij‖_F² + ((1−θ^R_ij)/2) c̄_R² ]    (7)
where c̄_t and c̄_R are simply the largest admissible residual errors for a translation and a rotation measurement to be considered an inlier. Intuitively, θ^t_ij decides whether a translation measurement is an inlier (θ^t_ij = +1) or an outlier (θ^t_ij = −1); θ^R_ij plays the same role for rotation measurements. While the expression (7) resembles formulations in the literature, e.g., Sünderhauf’s switchable constraints [8], the advantage of this interpretation is that the parameters c̄_t and c̄_R have a physical interpretation (maximum admissible residuals). Moreover, we push the boundary of the state of the art by modeling outlier correlation (next subsection) and proposing global semidefinite solvers (Section IV).
III-B Modeling outlier correlation and perceptual aliasing
The goal of this section is to introduce extra terms in the cost (7) to model the correlation between subsets of binary variables, hence capturing outlier correlation. First of all, for the sake of simplicity, we assume that a unique binary variable θ_ij is used to decide whether both the translation and the rotation components of measurement (i,j) are accepted, i.e., we set θ_ij = θ^t_ij = θ^R_ij. This assumption is not necessary for the following derivation, but it allows using a more compact notation. In particular, we rewrite (7) more succinctly as:
(8) 
where, for two matrices A and B of compatible dimensions, ⟨A, B⟩ denotes the standard inner product and, following [30], we defined:

and for simplicity we denote the common inlier threshold by c̄.
We already observed in Section II-A that, to model the correlation between two discrete variables θ_ij and θ_kl, we can add terms to the cost function that penalize a mismatch between θ_ij and θ_kl. This leads to generalizing problem (8) as follows:
(9) 
where the set C contains pairs of correlated edges, i.e., pairs of edges such that if one is found to be an outlier, the other is likely to be an outlier as well.
Problem (9) describes a discrete-continuous graphical model (DCGM) like the one pictured in Fig. 1: the optimization problem returns the most likely assignment of the variables in the graphical model, which contains both continuous variables (the poses) and discrete variables (the binary inlier/outlier variables). The reader can notice that if the assignment of the discrete variables is given, (9) reduces to PGO, while if the continuous variables are given, then (9) becomes an MRF, where the second sum in (9) defines the unary potentials for each discrete variable in the MRF.
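To build intuition for this alternating structure, here is a deliberately simplified 1-D analogue (entirely our own construction, without the correlation terms, and not the method proposed in this paper, which instead uses the global relaxation of Section IV): estimating a scalar from gated measurements by alternating a continuous least-squares step with a discrete thresholding step.

```python
def alternating_inference(measurements, c=1.0, iters=20):
    """Toy 1-D analogue of problem (8): estimate a scalar x from
    measurements y_k, each gated by a binary variable theta_k in {-1,+1}.
    Alternation: fix thetas -> least squares on accepted y_k (the 'PGO'
    step); fix x -> threshold each residual (the 'MRF' step).
    This is a local heuristic and depends on the initial guess."""
    # Median initialization: the plain average would be dragged by outliers.
    x = sorted(measurements)[len(measurements) // 2]
    thetas = [1] * len(measurements)
    for _ in range(iters):
        # Discrete step: accept y_k iff its residual is below the threshold c.
        thetas = [1 if (y - x) ** 2 <= c * c else -1 for y in measurements]
        inliers = [y for y, t in zip(measurements, thetas) if t == 1]
        if not inliers:
            break
        # Continuous step: least squares over the accepted measurements.
        x = sum(inliers) / len(inliers)
    return x, thetas

# Three consistent measurements near 1.0 and one gross outlier at 50.
x_hat, thetas = alternating_inference([0.9, 1.0, 1.1, 50.0], c=1.0)
```

With a poor initialization this alternation can reject everything or latch onto outliers, which is precisely the failure mode that motivates the initialization-free SDP relaxation of the next section.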
IV Inference in DCGM via Convex Relaxation
The DCGM presented in Section III captures two very desirable aspects: (i) it uses a robust truncated LS loss function and (ii) it can easily model outlier correlation. On the downside, the optimization (9) is intractable in general, due to the presence of discrete variables and the non-convex nature of the rotation set SO(d).
Here we derive a convex relaxation that is able to compute near-optimal solutions to (9) in polynomial time. While we do not expect to compute exact solutions to (9) in all cases in polynomial time (the problem is NP-hard in general), our goal is to obtain a relaxation that works well when the noise on the inliers is reasonable (i.e., similar to the one found in practical applications) and whose quality is largely insensitive to the presence of a large number of (arbitrarily “bad”) outliers.
In order to derive our convex relaxation, it is convenient to reformulate (9) using a more compact matrix notation. Let us first “move” the binary variables inside the norm and drop constant terms from the objective in (9):
(10) 
where we noted that (1+θ_ij)/2 is either zero or one, hence it can be safely moved inside the norm, and we dropped the constant terms.
We can now stack the pose variables into a single matrix X. We also use a matrix representation for the binary variables, with one block per loop closure, where I_d denotes the identity matrix of size d. Finally, we define:

(11)
The following proposition provides a compact reformulation of problem (10) using the matrix in (11):
Proposition 1 (Geometric Formulation of DCGM).
Intuitively, the first term in (12) captures the terms on the first and last lines of (10), while the remaining sum (one term for each loop closure) captures the terms on the second line of (10), which couple discrete and continuous variables.
The final step before obtaining a convex relaxation is to write the “geometric” constraints in (12) in terms of linear algebra. Towards this goal, we relax the set SO(d) (rotation matrices) to O(d) (orthogonal matrices), i.e., we drop the constraint that rotation matrices need to have determinant +1. This is done for the sake of simplicity: in related work we found the determinant constraint to be redundant [31], and it can still be modeled as shown in [31]. The following proposition rewrites the constraints in (12) in algebraic terms.
Proposition 2 (Algebraic Formulation of DCGM).
If we relax the set SO(d) (Special Orthogonal group) to O(d) (Orthogonal group), Problem (12) can be written as
(13) 
where the bracketed subscript notation denotes the block of the matrix in a given block row and block column, the symbol “*” denotes entries that are unconstrained (we follow the notation of [30]), and the last constraint enforces the corresponding block to be an isotropic diagonal matrix, i.e., a scalar multiple of the identity matrix I_d.
Let us explain the constraints in (13) using the block structure of Z described in (11). The diagonal blocks associated with the poses are of the form R_iᵀR_i, hence the first constraint in (13) captures the orthogonality of the rotation matrix included in each pose. The diagonal blocks associated with the binary variables contain products θ_ij², and since θ_ij ∈ {−1,+1} we have θ_ij² = 1, which is captured by the second constraint in (13); similar considerations hold for the remaining diagonal blocks. Finally, the products between rotation blocks and binary variables (captured by the corresponding off-diagonal blocks) must be isotropic diagonal matrices, producing the third constraint in (13). We note that this constraint is quadratic, since it requires imposing that the off-diagonal entries of the block are zero and the diagonal entries are identical.
It is now easy to derive an SDP relaxation for Problem (13).
Proposition 3 (Semidefinite Relaxation of DCGM).
The following SDP is a convex relaxation of Problem (13):
(14) 
Proof.
The SDP relaxation can be solved using off-the-shelf convex solvers. In particular, we note that the constraint on the blocks of Z can be implemented as a set of linear equality constraints. The SDP relaxation (14) enjoys the typical per-instance optimality guarantees described in related work [4, 5, 6, 7]. In particular, if the solution of (14) has the rank of a valid solution of (13), then the relaxation solves (13) exactly. Moreover, the optimal objective of (14) is a lower bound on the optimal objective of (13), a property that can be used to evaluate how suboptimal a given estimate is, see [4, 5].
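When the SDP solution is not exactly of the right rank, a rounding step is needed. The sketch below is a simplified illustration of ours (not the paper's implementation, and independent of any specific SDP solver): given the block of the solution associated with the binary variables, it recovers ±1 labels by sign-rounding the leading eigenvector, computed with a small power-iteration helper:

```python
def power_iteration(Z, iters=200):
    """Leading eigenpair of a symmetric matrix (nested lists) by power iteration."""
    n = len(Z)
    v = [1.0] * n
    for _ in range(iters):
        w = [sum(Z[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    lam = sum(v[i] * sum(Z[i][j] * v[j] for j in range(n)) for i in range(n))
    return lam, v

def round_binary(Z):
    """If the SDP block were exactly rank one, Z = z z^T with z in {-1,+1}^n.
    In general, round the signs of the leading eigenvector, fixing the first
    entry to +1 to remove the global sign ambiguity."""
    _, v = power_iteration(Z)
    s = 1.0 if v[0] >= 0 else -1.0
    return [1 if s * x >= 0 else -1 for x in v]

# A rank-one block built from z = [+1, -1, +1]: rounding recovers z exactly.
z = [1, -1, 1]
Z = [[zi * zj for zj in z] for zi in z]
theta = round_binary(Z)
```

On a tight instance the rounded labels coincide with the optimal discrete assignment; otherwise they provide the outlier set used to re-solve the problem, as described in the experiments.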
V Experiments
This section presents two sets of experiments. Section V-A reports the results of Monte Carlo runs on a synthetic dataset and shows that the proposed technique compares favorably with the state of the art, and that modeling outlier correlation leads to performance improvements. Section V-B evaluates the proposed techniques on three real benchmarking datasets and shows that our approach outperforms related techniques while not requiring parameter tuning or an initial guess.
V-A Experiments on Synthetic Dataset
Methodology. For this set of experiments, we built a synthetic dataset composed of a simple trajectory on a grid of 20 by 10 nodes. We then added random groups of loop closures between the rows (50 loop closures in total), as described in [8]. Typically, in the presence of perceptual aliasing, the outliers form mutually-consistent groups, e.g., the SLAM front-end generates multiple false loop closures in sequence. To simulate this phenomenon, we set the loop closures in each group to be either all inliers or all outliers. We set the standard deviations of the translation and rotation noise for the inlier measurements (odometry and correct loop closures) to fixed values in meters and radians, respectively. The maximum admissible errors for the truncated LS cost (5) are set as a multiple of the measurement noise standard deviation. We tested the performance of our techniques for increasing levels of outliers, up to the case where most of the loop closures are outliers. Fig. 7 shows the overlay of multiple trajectories (5 runs) estimated by our techniques versus the ground truth trajectory (green) at the highest outlier level.

Compared techniques. We evaluate the performance of the proposed technique, DCGM, which solves the minimization problem (9). In order to show that capturing outlier correlation leads to performance improvements, we also test a variation of the proposed approach, called DCGMd, which implements the minimization problem (8), where outliers are assumed uncorrelated (the “d” stands for decoupled). In both DCGM and DCGMd, we solve the SDP using cvx [32] in Matlab. If the resulting matrix does not have the expected rank (in which case we are not guaranteed to obtain an exact solution to the non-relaxed problem), we round the result to detect the set of outliers, and re-run the optimization without the outliers.
We benchmarked our approach against three other robust PGO techniques: Vertigo [8], RRR [9], and DCS [10]. For Vertigo we use the default parameters, while for RRR and DCS we report results for multiple choices of parameters, since these parameters have a significant impact on performance. In particular, for RRR we consider three cluster sizes, and for DCS we consider three values of its parameter [10]. For all these techniques, we used the odometric estimate as the initial guess.
Results and Interpretation. Fig. 4 reports the average translation error for all the compared approaches for increasing percentages of outliers. Vertigo’s error grows quickly beyond 30% of outliers. For DCS, the performance relies heavily on correct parameter tuning: for some parameter choices it has excellent performance, while it fails for others. Unfortunately, these parameters are difficult to tune in general (we will observe in Section V-B that the parameter choices mentioned above may not produce the best results in the real tests). The proposed techniques, DCGMd and DCGM, compare favorably against the state of the art, while being slightly less accurate than RRR, which produced the best results in simulation.


In order to shed light on the performance of DCGM and DCGMd, Fig. 9 reports the average percentage of outliers rejected by these two techniques. While both techniques are able to reject most outliers, DCGM rejects all outliers in all tests, even when a large fraction of the loop closures are spurious. As expected, modeling outlier correlation as in DCGM improves outlier rejection performance. We also recorded the number of incorrectly rejected inliers: both approaches do not reject any inlier, and for this reason we omit the corresponding figure.
In our tests, the SDP relaxation (14) produces low-rank solutions with 2 relatively large eigenvalues, followed by 2 smaller ones (the remaining eigenvalues are numerically zero).



V-B Experiments on Real Datasets
Methodology. In this section, we consider three real-world standard benchmarking datasets: the CSAIL dataset (1045 poses and 1172 edges), the FR079 dataset (989 poses and 1217 edges), and the FRH dataset (1316 poses and 2820 edges). A more detailed description of the datasets is given in [33]. We spoiled those datasets with 20 randomly grouped outliers. We benchmarked our approach against Vertigo, RRR, and DCS.
Table I: Average translation error (m) on the real datasets.

          DCGM   Vertigo  RRR (=1)  RRR (=5)  RRR (=10)  DCS ()  DCS ()  DCS ()
CSAIL     0.211  1.424    0.039     0.058     0.045      1.417   1.377   0.050
FR079     0.018  0.052    0.547     0.547     0.547      0.052   0.319   0.542
FRH       0.002  0.004    4.357     4.357     4.357      0.010   1.019   4.324
Results and Interpretation. Table I presents the average translation error for all datasets and techniques. As in the simulated datasets, RRR has the smallest errors on the CSAIL dataset, Vertigo performs poorly, and the performance of DCS depends heavily on the parameter choice. The proposed approach ensures good performance, with an average error of roughly 21 cm. The results on FR079 and FRH are even more interesting. In both cases, RRR has a large error. Given that the main parameter in RRR is the cluster size, we believe that the sizes of the clusters of loop closures in CSAIL are within the range of values tested for RRR, while in FR079 and FRH, which both have many more loop closures than CSAIL, the cluster size is probably underestimated by the method.

Vertigo and DCS have acceptable performance on FR079 and FRH, except for some parameter choices. This is probably due to the fact that these datasets have a relatively good initial guess, hence Vertigo and DCS, which rely on iterative optimization, are able to reject most outliers. The proposed method, DCGM, has the best performance on both FR079 and FRH. We attribute this performance boost to the fact that the proposed approach provides more direct control over the maximum admissible error on each measurement, while the parameters in DCS and Vertigo have a less clear physical interpretation. This makes it more difficult for DCS and Vertigo to strike a balance between outlier rejection and inlier selection: even when these approaches are able to discard most outliers, they may lose accuracy, since they also tend to discard good measurements. The difficulty of parameter tuning for DCS is confirmed by the fact that the value recommended by Agarwal et al. [10] leads to good results on FR079 and FRH, but fails on CSAIL.

Fig. 13 shows the trajectory estimates produced by DCGM for the three real datasets: CSAIL, FR079, and FRH.
VI Conclusion
We introduced a discrete-continuous graphical model (DCGM) to capture perceptual aliasing and outlier correlation in SLAM. We then developed a semidefinite programming (SDP) relaxation to perform near-optimal inference in the DCGM and obtain robust SLAM estimates. Our experiments show that the proposed approach compares favorably with the state of the art, while not requiring parameter tuning and not relying on an initial guess for optimization. This paper opens several avenues for future work. First, our Matlab implementation is currently slow: we plan to develop specialized solvers to optimize the SDP relaxations presented in this paper efficiently, leveraging previous work [7]. Second, we plan to extend our testing to 3D SLAM problems: the mathematical formulation in this paper is general, but for numerical reasons we had to limit our tests to relatively small 2D problems. Finally, it would be interesting to provide a theoretical bound on the number of outliers that the proposed technique can tolerate.
This appendix proves Proposition 1 by showing how to reformulate problem (10) using the matrix in (11). Let us start by rewriting problem (10) by replacing the (scalar) discrete variables with “binary” selection matrices:
(15) 
where we also rearranged the summands. Note the division in the second and third sums in (15), needed to compensate for the fact that we are now working with matrices.
We now develop each term in (15). The first summation in (15) can be written as
(16) 
where the first matrix in (16) is the Connection Laplacian [7] of the graph which has the same set of nodes as the original pose graph, but only includes the odometric edges E_od. We can use a derivation similar to [30] to show that the Connection Laplacian of a generic graph can be written as
(17) 
where the matrices are given as follows:
(18) 
(19) 
The bracketed subscript notation denotes the block of a matrix at a given block row and block column, while the k-th edge in E_od is denoted e_k.
The first three terms in (15) are linear with respect to parts of the matrix in (11), so we write the sum of (16), (20), and (21) compactly using the matrix

(23)

which is the first term in eq. (12). Here 0 denotes a zero matrix of appropriate size. In order to complete the proof, we only need to show that the last sum in (15) can be written as the remaining term in (12). Towards this goal, we develop each squared norm in the last sum using a derivation similar to (16) and get:
(24) 
where the subscript denotes a graph with a single edge. We can write the corresponding quantities as matrix blocks in (11):

(25)
which enables us to write each squared norm in terms of the matrix variable as follows:
(26) 
where