The problem of group synchronization is critical for various tasks in data science, including structure from motion (SfM), simultaneous localization and mapping (SLAM), Cryo-electron microscopy imaging, sensor network localization, multi-object matching and community detection. Rotation synchronization, also known as rotation averaging, is the most common group synchronization problems in 3D reconstruction. It asks to recover camera rotations from measured relative rotations between pairs of cameras. Permutation synchronization, which has applications in multi-object matching, asks to obtain globally consistent matches of objects from possibly erroneous measurements of matches between some pairs of objects. The simplest example of group synchronization issynchronization, which appears in community detection.
The general problem of group synchronization can be mathematically formulated as follows. Assume a graph with vertices indexed by , a group , and set of group elements . The problem asks to recover from noisy and corrupted measurements of the group ratios . We use the star superscript to emphasize the ground truth solution of group synchronization. We note that one can only recover, or approximate, the original group elements up to a right group action. Indeed, for any , can also be written as and thus is also a solution. The above mentioned synchronization problems (rotation, permutation, and synchronization) correspond to the groups , , and , respectively.
The most challenging issue for group synchronization is the practical scenario of highly corrupted and noisy measurements. Traditional least squares solvers often fail to produce accurate results in such a scenario. Moreover, some basic estimators that seem to be robust to corruption often do not tolerate in practice high level of noise. We aim to propose a general method for group synchronization that may tolerate high levels and different kinds of corruption and noise. While our basic ideas are formally general, in order to carefully refine and test them, we focus on the special problem of rotation synchronization, which is also known as rotation averaging (Hartley et al., 2013)
. We choose this problem as it is the most common, and possibly most difficult, synchronization problem in 3D computer vision.
1.1 Related Works
Most previous group synchronization solvers minimize an energy function. For the discrete groups and , least squares energy minimization is commonly used. Relevant robustness results, under special corruption and noise models, are discussed in Abbe et al. (2014); Abbe (2017); Bandeira (2018); Chen et al. (2014); Huang & Guibas (2013); Huroyan (2018); Pachauri et al. (2013).
For Lie groups, such as , that is, the group of orthogonal matrices with determinant 1, where , least squares minimization was proposed to handle Gaussian noise (Bandeira et al., 2017; Eriksson et al., 2018; Govindu, 2004). However, when the measurements are also adversarially corrupted, this framework does not work well and other corruption-robust energy functions need to be used (Chatterjee & Govindu, 2013, 2018; Hartley et al., 2011; Maunu & Lerman, 2020; Wang & Singer, 2013). The most common corruption-robust energy function uses least absolute deviations. Wang & Singer (2013) prove that under a very special probabilistic setting with
, the pure minimizer of this energy function can exactly recover the underlying group elements with high probability. However, their assumptions are strong and they use convex relaxation, which changes the original problem and is expensive to compute.Maunu & Lerman (2020) apply a trimmed averaging procedure for robustly solving synchronization. They are able to recover the ground truth group elements under a special deterministic condition on the topology of the corrupted subgraph. However, the verification of this condition and its extension to , where , are nontrivial. Hartley et al. (2011) used the Weiszfeld algorithm to minimize the least-absolute-deviations energy function with . Their method iteratively computes geodesic medians. However, they update only one rotation matrix per iteration, which results in slow empirical convergence and may increase the possibility of getting stuck at local minima. Chatterjee & Govindu (2018) proposed a robust Lie-algebraic averaging method over . They apply an iteratively reweighted least squares (IRLS) procedure in the tangent space of to optimize different robust energy functions, including the one that uses least absolute deviations. They claim that the use of the
norm for deviations results in highest empirical accuracy. The empirical robustness of the two latter works is not theoretically guaranteed, even in simple settings. A recent deep learning method(Purkait et al., 2019) solves a supervised version of rotation synchronization, but it does not apply to the above unsupervised formulation of the problem.
Huang et al. (2017) use least absolute deviations minimization for solving 1D translation synchronization, where with addition. They propose a special version of IRLS and provide a deterministic exact recovery guarantee that depends on properties of the graph and its Laplacian. They do not explain their general result in an adversarial setting, but in a very special noisy setting.
Robustness results were established for least absolute deviations minimization in camera location estimation, which is somewhat similar to group synchronization (Hand et al., 2018; Lerman et al., 2018). These results assume special probabilistic setting, however, they have relatively weak assumptions on the corruption model.
Several energy minimization solutions have been proposed to synchronization (Birdal et al., 2018; Briales & Jiménez, 2017; Rosen et al., 2019; Arrigoni et al., 2016, 2018). This problem asks to jointly estimate camera rotations and locations from relative measurements of both. Neither of these solutions successfully address highly corrupted scenarios.
Other works on group synchronization, which do not minimize energy functions but aim to robustly recover corrupted solutions, screen corrupted edges using cycle consistency information. For a group with group identity denoted by , any , any cycle of length and any corresponding product of ground-truth group ratios along , , the cycle-consistency constraint is . In practice, one is given the product of measurements, that is, , and in order to “approximately satisfy the cycle-consistency constraint” one tries to enforce to be sufficiently close to . Zach et al. (2010) uses the cycle-consistency constraint to detect corrupted relative rotations in
. It seeks to maximize a log likelihood function, which is based on the cycle-consistency constraint, using either belief propagation or convex relaxation. However, no theoretical guarantees are provided for the accuracy of outlier detection. Moreover, the log likelihood function implies very strong assumptions on the joint densities of the given relative rotations.Shen et al. (2016)classify the relative rotations as uncorrupted if they belong to any cycle that approximately satisfies the cycle-consistency constraint. However, this work only exploits local information and cannot handle the adversarial corruption case, where corrupted cycles can be approximately consistent.
An iterative reweighting strategy, IR-AAB (Shi & Lerman, 2018), was proposed to detect and remove corrupted pairwise directions for the different problem of camera location estimation. It utilizes another notion of cycle-consistency to infer the corruption level of each edge. Lerman & Shi (2019) extend the latter idea, and interpret it as a message passing procedure, to solve group synchronization with any compact group. They refer to their new procedure as cycle-edge message passing (CEMP). While We follow ideas of Lerman & Shi (2019); Shi & Lerman (2018), we directly solve for group elements, instead of estimating corruption levels, using them to initial cleaning of edges and solving the cleaner problem with another method.
To the best of our knowledge, the unified frameworks for group synchronization are Gao & Zhao (2019); Lerman & Shi (2019); Perry et al. (2018). However, Gao & Zhao (2019) and Perry et al. (2018) assume special probabilistic models that do not address adversarial corruption. Furthermore, Gao & Zhao (2019) only applies to Lie groups and the different setting of multi-frequencies.
1.2 Contribution of This Work
Current group synchronization solvers often do not perform well with highly corrupted and noisy group ratios. In order to address this issue, we propose a rotation synchronization solver that can address in practice high levels of noise and corruption. Our main ideas seem to generalize to group synchronization with any compact group, but more careful developments and testing are needed for other groups. We emphasize the following specific contributions of this work:
We propose the message passing least squares (MPLS) framework as an alternative paradigm to IRLS for group synchronization, and in particular, rotation synchronization. It uses the theoretically guaranteed CEMP algorithm for estimating the underlying corruption levels. These estimates are then used for learning better weights for the weighted least squares problem.
While MPLS can be formally applied to any compact group, we refine and test it for the group . We demonstrate state-of-the-art results for rotation synchronization with both synthetic data having nontrivial scenarios and real SfM data.
2 Setting for Robust Group Synchronization
Some previous robustness theories for group synchronization typically assume a very special and often unrealistic corruption probabilistic model for very special groups (Pachauri et al., 2013; Wang & Singer, 2013). In general, simplistic probabilistic models for corruption, such as generating potentially corrupted group ratios according to the Haar measure on (Wang & Singer, 2013), may not generate some nontrivial scenarios that often occur in practice. For example, in the application of rotation synchronization that arise in SfM, the corrupted camera relative rotations can be self-consistent due to the ambiguous scene structures (Wilson & Snavely, 2014). However, in common probabilistic models, such as the one in Wang & Singer (2013), cycles with corrupted edges are self-consistent with probability zero. A more realistic model is the adversarial corruption model for the different problem of camera location (Lerman et al., 2018; Hand et al., 2018). However, it also assumes very special probabilistic models for the graph and camera locations, which are not realistic. A more general model of adversarial corruption with noise is due to Lerman & Shi (2019) and we review it here.
We assume a graph and a compact group with a bi-invariant metric , that is, for any , , , . For , or any Lie group, is commonly chosen to be the geodesic distance. Since is compact, we can scale and assume that .
We partition into and , which represent sets of good (uncorrupted) and bad (corrupted) edges, respectively. We will need a topological assumption on , or equivalently, . A necessary assumption is that is connected, though further restrictions on may be needed for establishing theoretical guarantees (Lerman & Shi, 2019).
In the noiseless case, the adversarial corruption model generates group ratios in the following way.
That is, for edges , the corrupted group ratio can be arbitrarily chosen from . The corruption is called adversarial since one can maliciously corrupt the group ratios for and also maliciously choose as long as the needed assumptions on are satisfied. One can even form cycle-consistent corrupted edges, so that they can be confused with the good edges.
In the noisy case, we assume a noise model for , where . In theory, one may need to restrict this model (Lerman & Shi, 2019), but in practice we test highly noisy scenarios.
3 Issues with the Common IRLS
We first review the least squares minimization, least absolute and unsquared deviations minimization and IRLS for group synchronization. We then explain why IRLS may not form a good solution for the group synchronization problem, and in particular for Lie algebraic groups, such as the rotation group.
The least squares minimization can be formulated as follows:
where one often relaxes this formulation. This formulation is generally sensitive to outliers and thus more robust energy functions are commonly used when considering corrupted group ratios. More specifically, one may choose a special function and solve the following least unsquared deviation formulation
The special case of (Hand et al., 2018; Hartley et al., 2011; Ozyesil & Singer, 2015; Wang & Singer, 2013) is referred to as least absolute deviations. Some other common choices are (Chatterjee & Govindu, 2013) and (Chatterjee & Govindu, 2018).
The least unsquared formulation is typically solved using IRLS, where at iteration one solves the weighted least squares problem:
In the first iteration the weights can be initialized in a certain way, but in the next iterations the weights are updated using the residuals of this solution. Specifically, for and iteration , the residual is and the weight is
where the function depends on the choice of . For , where , , where is a regularization parameter and here we fix .
The above IRLS procedure poses the following three issues. First, its convergence to the solution is not guaranteed, especially under severe corruption. Indeed, IRLS succeeds when it accurately estimates the correct weights for each edge. Ideally, when the solution is close to the ground truth , the residual must be close to the corruption level so that weight must be close to . However, if edge is severely corrupted (or edge has high noise) and either or is wrongly estimated, then the residual might have a very small value. Thus the weight in (5) can be extremely large and may result in an inaccurate solution in the next iteration and possibly low-quality solution at the last iteration.
The second issue is that for common groups each iteration of (4) requires either SDP relaxation or tangent space approximation (for Lie groups). However, if the weights of IRLS are wrongly estimated in the beginning, then they may affect the tightness of the SDP relaxation and the validity of tangent space approximation. Therefore, such procedures tend to make the IRLS scheme sensitive to corruption and initialization of weights and group elements.
At last, when dealing with noisy data where most of , , are significantly greater than , the current reweighting strategy usually gives non-negligible positive weights to outliers. This can be concluded from the expression of (e.g., for minimization) and the fact that in a good scenario and can be away from 0. Therefore, outliers can be overweighed and this may lead to low-quality solutions. We remark that this issue is more noticeable in Lie groups, such as the rotation group, as all measurements are often noisy and corrupted; whereas in discrete groups some measurements may be rather accurate (Shi et al., 2020).
4 Message Passing Least Squares (MPLS)
In view of the drawbacks of the common IRLS scheme, we propose the MPLS (Message Passing Least Squares), or Minneapolis, algorithm. It carefully initializes and reevaluates the weights of a weighted least squares problem by our CEMP algorithm (Lerman & Shi, 2019) or a modified version of it. We first review the ideas of CEMP in Section 4.1. We remark that its goal is to estimate the corruption levels and not the group elements . Section 4.2 formally describes MPLS for the general setting of group synchronization. Section 4.3 carefully refines MPLS for rotation synchronization. Section 4.4 summarizes the complexity of the proposed algorithms.
4.1 Cycle-Edge Message Passing (CEMP)
The CEMP procedure aims to estimate the corruption levels from the cycle inconsistencies, which we define next. For simplicity and for ease of computation, we work here with 3-cycles, that is, triangles in the graph. For each edge , we independently sample with replacement 50 nodes that form 3-cycles with and . That is, if is such a node then and are in . We denote this set of nodes by . We remark that the original version of CEMP in Lerman & Shi (2019) uses all 3-cycles and can be slower. We define the cycle inconsistency of the 3-cycle associated with edge and as follows
The idea of CEMP is to iteratively estimate each corruption level for from a weighted average of the cycle inconsistencies . To motivate this idea, we assume the noiseless adversarial corruption model and formulate the following proposition whose easy proof appears in Lerman & Shi (2019).
If , that is, , , then
If the condition of the above proposition holds, we call a good cycle with respect to , otherwise we call it a bad cycle.
CEMP also aims to estimate the conditional probability that the cycle is good, so The conditioning is on the estimates of corruption levels computed in the previous iteration. CEMP uses this probability as a weight for the cycle inconsistency . The whole weighted sum thus aims to estimate the conditional expectation of at iteration . This estimate is denoted by . The iteration of CEMP thus contains two main steps: 1) Computation of the weight of ; 2) Computation of as a weighted average of the cycle inconsistencies . In the former stage messages are passed from edges to cycles and in the latter stage messages are passed from cycles to edges. The simple procedure is summarized in Algorithm 1. We formulate it in generality for a compact group with a bi-invariant metric and a graph , for which the cycle inconsistencies, , were computed in advance. Our default parameters are later specified in Section 5.1. For generality, we write instead of 50.
Algorithm 1 can be explained in two different ways (Lerman & Shi, 2019). First of all, it can be theoretically guaranteed to be robust to adversarial corruption and stable to low level of noise (see Theorem 5.4 in Lerman & Shi (2019)
). Second of all, our above heuristic explanation can be made more rigorous using some statistical assumptions. Such assumptions are common in other message passing algorithms in statistical mechanics formulations and we thus find it important to motivate them here. We remark that our statistical assumptions are weaker than those of previous works on message passing(Donoho et al., 2009; Yedidia et al., 2003), and in particular, those for group synchronization (Perry et al., 2018; Zach et al., 2010). We also remark that they are not needed for establishing the above mentioned theory.
Our first assumption is that is a good cycle if and only if . Proposition 4.1 implies the only if part, but the other part is generally not true. However, under special random corruption models (e.g., models in Wang & Singer (2013); Ozyesil & Singer (2015)), the assumed equivalence holds with probability . We further assume that and
are both i.i.d. random variables and that for any, is independent of for . We further assume that for any
We also assume the existence of good cycle for any .
In view of these assumptions, in particular the i.i.d. sampling and (7), we obtain that the expression for used in Algorithm 1 coincides with the conditional probability that is a good cycle, that is, the conditional probability that :
Using the definition of conditional expectation, the equivalence assumption, the above i.i.d. sampling assumptions and (7), we show that the expression for used in Algorithm 1 coincides with the conditional expectation of :
4.2 General Formulation of MPLS
MPLS uses the basic ideas of CEMP in order to robustly estimate the residuals and weights of the IRLS scheme as well as to carefully initialize this scheme. It also incorporates a novel truncation idea. We explain in this and the next section how these new ideas address the drawbacks of the common IRLS procedure, which was reviewed in Section 3. We sketch MPLS in Algorithm 2, demonstrate it in Figure 1 and explain it below. For generality, we assume in this section a compact group , a bi-invariant metric on and a graph with given relative measurements and cycle inconsistencies computed in advance. Our default parameters are later specified in Section 5.1.
Instead of using the traditional IRLS reweighting function explained in Section 3, we use its truncated version with a parameter . We decrease as the iteration number increases in order to avoid overweighing outliers. By doing this we aim to address the third drawback of IRLS mentioned in Section 3. We remark that the truncated function is and the additional term is needed to ensure that the graph with weights resulting from is connected.
The initial step of the algorithm estimates the corruption levels by CEMP. The initial weights for the IRLS procedure follow (5) with additional truncation. At each iteration, the group ratios are estimated from the weighted least squares procedure in (4). However, the weights are updated in a very different way. First of all, for each the corruption level is re-estimated in two different ways and a convex combination of the two estimates is taken. The first estimate is a residual computed with the newly updated estimates . This is the error of approximating the given measurement by the newly estimated group ratio. The other estimate practically applies CEMP to re-estimate the corruption levels. For edge , the latter estimate of is denoted by . For interpretation, we can replace (7) with and use it to derive analogs of (4.1) and (4.1). Unlike CEMP, we use the single parameter, , as we assume that CEMP provides a sufficiently good initialization. At last, a similar weight as in (5), but truncated, is applied to the combined estimate .
We remark that utilizing the estimate for the corruption level addresses the first drawback of IRLS discussed in Section 3. Indeed, assume the case where and is close to 0. Here, computed by IRLS is relatively large; however, since , needs to be small. Unlike in IRLS, we expect that in MPLS should not be too small as long as for some , are sufficiently large. This happens as long as there exists some for which the cycle is good. Indeed, in this case is sufficiently large and for good cycles .
We further remark that is a good approximation of under certain conditions. For example, if for all , and , then plugging in the definition of to the expression of , using the fact that is sufficiently large and at last applying Proposition 4.1, we obtain that
This intuitive argument for a restricted case conveys the idea that “local good information” can be used to estimate . The theory of CEMP (Lerman & Shi, 2019) shows that under weaker conditions such information can propagate through the whole graph within a few iterations, but we cannot extend it to MPLS.
If the graph is dense with sufficiently many good cycles, then we expect that this good information can propagate in few iterations and that will have a significant advantage over . However, in real scenarios of rotation synchronization in SfM, one may encounter sparse graphs, which may not have enough cycles and, in particular, not enough good cycles. In this case, utilizing is mainly useful in the early iterations of the algorithm. On the other hand, when are close to , will be sufficiently close to . Aiming to address rotation synchronization, we decrease , the weight of , with . In other applications, different choices of can be used (Shi et al., 2020).
The second drawback of IRLS, discussed in Section 3, is the possible difficulty of implementing the weighted least squares step of (4). This issue is application-dependent, and since in this work we focus on rotation synchronization (equivalently, synchronization), we show in the next subsection how MPLS can deal with the above issue in this specific problem. Nevertheless, we claim that our framework can also be applied to other compact group synchronization problems and we demonstrate this claim in a follow up work (Shi et al., 2020).
4.3 MPLS for synchronization
Rotation synchronization, or synchronization, aims to solve 3D rotations from measurements of the 3D relative rotations . Throughout the rest of the paper, we use the following normalized geodesic distance for :
where is the matrix logarithm and the normalization factor ensures that the diameter of is . We provide some relevant preliminaries of the Riemannian geometry of in Section 4.3.1 and then describe the implementation of MPLS for , which we refer to as MPLS-, in Section 4.3.2.
4.3.1 Preliminaries: and
We note that is a Lie group, and its corresponding Lie algebra,
, is the space of all skew symmetric matrices, which is isomorphic to. For each , its corresponding element in is , where denotes matrix logarithm. Each can be represented as for some in the following way:
In other words, we can map any to and , where denotes the matrix exponential function. We remark that geometrically
is the tangent vector atof the geodesic path from to .
4.3.2 Details of MPLS-
We note that in order to adapt MPLS to the group , we only need a specific algorithm to solve the following formulation of the weighted least squares problem at iteration
where the last equality follows from the bi-invariance of . The constraints on orthogonality and determinant of are non-convex. If one relaxes those constraints, with an appropriate choice of the metric , then the solution of the least squares problem in the relaxed Euclidean space often lies away from the embedding of into that space. For this reason, we follow the common choice of according to (11) and implement the Lie-algebraic Averaging (LAA) procedure (Govindu, 2004; Chatterjee & Govindu, 2013, 2018; Tron et al., 2008). We review LAA, explain why it may be problematic and why our overall implementation may overcome its problems. LAA aims to move from to along the manifold using the right group action , where . For this purpose, it defines so that
and (12) can be transformed to the still hard to solve equation
LAA then maps and to the tangent space of by and . Applying (11) and the fact that the Riemannian logarithmic map, which is represented by , preserves the geodesic distance and using a “naive approximation”: . Therefore, LAA uses the following approximation
Consequently, LAA transforms (13) as follows:
However, the approximation in (14) is only valid when , , , which is unrealistic.
One can check that the following conditions: (), and for imply that , , and thus imply (14). Therefore, to make LAA work we need to give large weights to edges with small and provide a good initialization that is reasonably close to and so that for all are still close to the ground truth. Our heuristic argument is that good approximation by CEMP, followed by MPLS, addresses these requirements. Indeed, to address the first requirement, we note that good initialization by CEMP can result in and by the nature of , is large when is small. As for the second requirement, we assign the weights , obtained by CEMP, to each and find the minimal spanning tree for the weighted graph by Prim’s algorithm. We initialize the rotations by fixing , multiplying relative rotations along the computed minimal spanning tree and consequently obtaining for any node . We summarize our MPLS version of rotation averaging in Algorithm 3.
4.4 Computational Complexity
CEMP requires the computation of for and . Its computational complexity per iteration is thus of order as we use for all . Since we advocate few iterations () of CEMP, or due to its fast convergence under special settings (Lerman & Shi, 2019), we can assume that its total complexity is . The computational complexity of MPLS depends on the complexity of solving the weighted least squares problem, which depends on the group. For MPLS-, the most expensive part is solving the weighted least squares problem in the tangent space, whose complexity is at most . This is thus also the complexity of MPLS- per iteration. Unlike CEMP, we have no convergence guarantees yet for MPLS.
5 Numerical Experiments
We test the proposed MPLS algorithm on rotation synchronization, while comparing with state-of-the-art methods. We also try simpler ideas than MPLS that are based on the basic strategy of CEMP. All computational tasks were implemented on a machine with 2.5GHz Intel i5 quad core processors and 8GB memory.
We use the following default parameters for Algorithm 1: for ; ; and . If an edge is not contained in any 3-cycle, we set its corruption level as 1. For MPLS-, which we refer to in this section as MPLS, we use the above parameters of Algorithm 1 and the following ones for :
Here, denotes the empirical distribution of . That is, for , 1, 2, 3, we ignore , , , of edges that have highest