Unlabelled sensing was recently introduced and explored in Unnikrishnan et al. (2015) as a problem with duality connections to the well-known problem of compressed sensing Donoho et al. (2006). In this problem, similar to linear regression, the response signal is modeled as a linear combination of a set of covariates. However, the correspondence of the responses to the covariates is modeled as having been shuffled by an unknown permutation matrix. For this reason, the problem has also been termed linear regression with shuffled labels by Abid et al. (2017), linear regression with an unknown permutation by Pananjady et al. (2016), or linear regression without correspondence (RWOC) by Hsu et al. (2017), the last of which will be used to refer to the problem herein. Although RWOC is, in general, an NP-hard problem, there have been several advances in recent years proposing signal-to-noise ratio (SNR) bounds for recovery of the permutation matrix and the regression coefficients Pananjady et al. (2016); Unnikrishnan et al. (2018). Conversely, the same works have also analyzed the SNR and sampling regimes in which no recovery is possible.
Preceding the recent literature on RWOC from the theoretical statistics community, there have been many efforts in the computer vision community to solve a related subproblem in the form of point set registration. Point set registration is a problem that consists of simultaneously finding a transformation and a matching of point sets residing in two or three-dimensional image space such that some notion of energy between the matched sets is minimized Van Kaick et al. (2011); Tam et al. (2013). The types of allowable transformations and energy functions distinguish the various methods that aim to solve this problem Besl and McKay (1992); Myronenko and Song (2010); Zhou et al. (2016). The point set registration problem can be seen as a specialization of RWOC, since the transformation term can be seen as the two or three-dimensional regression coefficients and the set matching is equivalent to recovering a permutation matrix Hast et al. (2013); Irani and Raghavan (1999); Aiger et al. (2008); Mount et al. (1999); Tam et al. (2013); Indyk et al. (1999); Pokrass et al. (2013). In general, point set registration methods employ an iterative strategy of solving for the transformation and updating the matching, which works well in practice but offers no guarantee of reaching the global optimum Chetverikov et al. (2002). Only a few point set registration methods provide approximate globally optimal solutions Yang et al. (2016); Zhou et al. (2016). These methods rely on severe constraints on the transformation domains in order to employ branch and bound techniques on discretizations.
Critically, the computer vision community has attempted to solve the point set registration problem through consideration of outliers and missing correspondences, which are typically encountered in real-world applications. A common technique used in point set registration to robustify the optimization against outliers is to employ random sampling consensus (RANSAC) subroutines Fischler and Bolles (1981); Torr and Zisserman (2000). The main advantage of RANSAC is that its randomization procedure can substantially reduce the computational cost of an otherwise combinatorial search.
Motivated by applications in biological imaging data such as matching the neuronal populations of Caenorhabditis elegans (C. elegans) across different worms, we aim to unify the ideas presented in RWOC literature and robust point set registration methods to provide provably approximate solutions to the RWOC problem in the presence of outliers and missing data. Robustly and automatically matching and identifying neurons in C. elegans could expedite the post-experimental data analysis and hypothesis testing cycle Bubnis et al. (2019); Kainmueller et al. (2014); Nguyen et al. (2017).
Main contributions: The main contribution of this paper is the introduction of randomized algorithms for the recovery of the regression coefficients in the RWOC problem that take into account noise, missing data, and outliers. Hsu et al. (2017) provide algorithms for the noisy case without generative assumptions; their algorithm considers square permutation matrices, which assumes that the entire signal is captured in the responses and does not take into account any missing correspondences or outliers. Unnikrishnan et al. (2015, 2018) provide combinatorial existence arguments; our method is designed for the practical purpose of matching point clouds that may have noisy measurements and outliers. This is undoubtedly the case in the application domain of neuron tracking and matching in biological applications. Specifically, we demonstrate the efficacy of the proposed method in the identification and tracking of in vivo C. elegans neurons. In summary, our contributions are four-fold:
We introduce the notion of "robust" regression without correspondence (rRWOC) that models missing correspondences between responses and covariates as well as completely missed associations in the form of outliers and missing data.
We introduce a polynomial time algorithm to find the exact solution for the one-dimensional noiseless rRWOC and the approximate solution in the noisy regime.
We introduce a randomized approximately correct algorithm that is more efficient than pure brute-force approaches for multi-dimensional rRWOC.
We demonstrate biological applications of our approach to point set registration problems in the context of automatic matching and identification of the cellular layout of the nervous system of C. elegans worms.
Paper organization: In section 2, we introduce our statistical regression model (rRWOC) that accounts for permuted correspondences, outliers, and noise. We then demonstrate the added computational complexity of recovery of rRWOC in contrast with simple linear regression and RWOC in a one-dimensional case in section 3.1. In section 3.2, we provide a randomized algorithm for the rRWOC problem in multiple dimensions with convergence bounds. Lastly, in section 4.1, we verify the theoretical recovery guarantees in simulated experiments and in section 4.2 show the neuroscience application of the proposed algorithms in the C. elegans neuron matching problem.
2 Regression model
First, we introduce notation. Let and denote two d-dimensional point sets consisting of and points, respectively. Let us call the reference or source set which is assumed to be free from outliers. Let denote the target set which contains outliers and missing correspondences. Let the set of indices denote the indices of which are inliers. Conversely, let denote the set of indices of which are outliers. By construction, these sets are a disjoint partition of the entire index set of target points: and . Let denote a possibly unbalanced permutation matrix where there are at most ones placed such that no row or column has more than a single one. All other entries are zeroes. Let denote the location of the one in the th row of the permutation matrix . Next, let denote the regression coefficients and denote zero-mean Gaussian noise. Lastly, let denote the uniform distribution within some closed convex set. Given these definitions, we can define the robust regression without correspondence (rRWOC) model as
In contrast with linear regression, where the sole objective is to recover the coefficients , the two-fold objective of RWOC is to recover the correct permutation matrix , and the regression coefficients . To add to the complexity of the problem, the three-fold objective of rRWOC is to recover the inlier set , the permutation , and the coefficients .
To aid in the recovery of the solution in rRWOC, let us introduce the following two assumptions.
Assumption 1 (Maximal inlier set).
For point sets , , there exists a triple that is maximal in the sense that such that any other triple is not considered to be the underlying regression model.
Assumption 2.
The source point set is free of outliers, while the target point set may contain outliers.
Assumption 1 allows the identifiability of whether a given hypothetical index set can be considered to be the true underlying inlier set or not. In practical terms, suppose we generate simulated data with points in of which are outliers generated uniformly and the remainder generated with respect to a coefficient such that . There may be cases such that uniformly generated "outliers", , are structured such that there exists a coefficient and permutation such that where . In this case, is identifiable but not verifiable as "correct."
The utility of assumption 2 is that it acts as a loose generative model on since the point cloud can be modeled as having been generated by a single process, such that complexity-reducing procedures such as coresets Boutsidis et al. (2013) can be employed. Furthermore, assumption 2 constrains the applications of the proposed model to template-to-target point set matching. This is in contrast with multiple object tracking Luo et al. (2014), where there may be outliers and missing data in both the source and target frames.
Equipped with the rRWOC model and the corresponding assumptions, we now demonstrate the progressive increase in the complexity of recovery of ordinary linear regression, RWOC, and rRWOC in one-dimension.
3.1 Optimal regression in
Linear regression in one dimension with known correspondences and no outliers can be solved in time by simply taking the ratio of the sum of the responses to the sum of the covariates:
On the other hand, RWOC in the one-dimensional case can be solved in steps via the method of moments and a simple sorting operation. Namely, first, the regressor can be estimated using the ratio of the first moments of the responses to the covariates:
and then the permutation can be recovered using the re-arrangement inequality,
where denotes sorted and denotes sorted and and denote the permutation matrices that capture the sorting operations.
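The two steps above (moment ratio, then sorting) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `rwoc_1d` is hypothetical, the data are assumed noiseless with no outliers, and the sum of the covariates is assumed to be nonzero. The handling of a negative slope is an added assumption, since sorting order is reversed in that case.

```python
def rwoc_1d(x, y):
    """Sketch of noiseless 1-D RWOC: y is assumed to be a shuffled copy of beta * x."""
    # The ratio of sums does not depend on the ordering of y.
    beta = sum(y) / sum(x)  # assumes sum(x) != 0
    # Indices that sort each set in ascending order.
    order_x = sorted(range(len(x)), key=lambda i: x[i])
    order_y = sorted(range(len(y)), key=lambda j: y[j])
    if beta < 0:
        # A negative coefficient reverses the order of the responses.
        order_y.reverse()
    perm = [None] * len(x)
    for i, j in zip(order_x, order_y):
        perm[i] = j  # x[i] is paired with y[perm[i]]
    return beta, perm
```

Calling `rwoc_1d([3.0, 1.0, 2.0], [2.0, 6.0, 4.0])` returns the coefficient together with a list `perm` such that `y[perm[i]]` is the response paired with `x[i]`.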
In the case with outlier elements in , the problem is non-trivial, even in one dimension, since sorting does not allow the identification of outliers. To solve the one dimensional rRWOC, we introduce algorithm 1 which recovers the triplet in an exhaustive fashion.
Proposition 1 (Correctness of Algorithm 1).
(The full proof is included in supplementary material) The overview of the proof is as follows. In the noiseless case, if then . The projection maps all reference points to their exact corresponding target points. Thus the Hungarian algorithm will yield these as the assignments since they incur minimal cost. Therefore, we will have . The cardinality of inliers is lower bounded and not equal to since outlier points may by chance be transformed to points in as well. Contrarily, suppose the transformation for yields a larger hypothesized inlier set , such that then this means that there are more points in that are closer to than , contradicting the assumption that is the maximal inlier set. ∎
The time complexity of algorithm 1 can be analyzed as follows. The main computational cost is due to linear assignment, which incurs a cost of if the Jonker and Volgenant (1986) variant is used. Linear assignment is repeated times. If and are of the same order, then algorithm 1 has complexity .
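The exhaustive search described above can be illustrated with a short Python sketch. It is not a transcription of algorithm 1: the function names are hypothetical, and the linear assignment step is replaced by a greedy sweep over sorted values, which in one dimension counts the largest one-to-one matching within a margin `eps` but does not return a minimum-cost assignment. Every pair of a reference point and a target point proposes a coefficient, and the coefficient with the most matched points is kept.

```python
def count_inliers_1d(x, y, beta, eps):
    """Count one-to-one matches with |beta * x_i - y_j| <= eps (greedy sweep, 1-D only)."""
    px = sorted(beta * xi for xi in x)
    py = sorted(y)
    i = j = matched = 0
    while i < len(px) and j < len(py):
        diff = px[i] - py[j]
        if abs(diff) <= eps:
            matched += 1
            i += 1
            j += 1
        elif diff < 0:
            i += 1  # px[i] is too small to match py[j] or anything after it
        else:
            j += 1  # py[j] is too small to match px[i] or anything after it
    return matched


def exhaustive_rrwoc_1d(x, y, eps=1e-9):
    """Try every (reference, target) pair as a hypothesized correspondence."""
    best_beta, best_count = None, -1
    for xi in x:
        if xi == 0:
            continue  # a zero covariate cannot determine the coefficient
        for yj in y:
            beta = yj / xi
            count = count_inliers_1d(x, y, beta, eps)
            if count > best_count:
                best_beta, best_count = beta, count
    return best_beta, best_count
```

The sweep costs a sort per hypothesis, so this sketch does not reproduce the complexity stated for algorithm 1; it only shows the structure of the search.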
However, if the ratio of inliers to outliers is relatively high, then it is possible to use randomization procedures like RANSAC Fischler and Bolles (1981); Torr and Zisserman (2000) to speed up the algorithm to yield the correct regression coefficient with high probability. This is demonstrated in algorithm 2.
Proposition 2 (Correctness of Algorithm 2).
Suppose there are inliers in and that . In iterations, algorithm 2 yields the correct regression coefficient with probability for an appropriately selected margin parameter .
The success of algorithm 1 relies on the fact that the exhaustive search eventually hits a tuple such that which yields the correct regression coefficient. Therefore, when randomly sampling , the probability of choosing a corresponding pair is . The probability of iterating times such that no correct correspondence is selected is where is the desired success rate. Taking logs yields, ∎
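The last step of this argument can be written out as a worked derivation. The notation below is assumed for illustration and is not taken from the original: p is the probability that a single random draw selects a correct correspondence, q is the desired success probability, and T is the number of iterations.

```latex
\begin{align*}
\Pr[\text{no correct pair in } T \text{ draws}] = (1-p)^{T} &\le 1-q \\
\Longleftrightarrow\quad T \log(1-p) &\le \log(1-q) \\
\Longleftrightarrow\quad T &\ge \frac{\log(1-q)}{\log(1-p)},
\qquad 0 < p < 1,\; 0 < q < 1 .
\end{align*}
```

The inequality flips in the last line because log(1-p) is negative. Since log(1-p) ≤ -p, choosing T ≥ (1/p) log(1/(1-q)) is also sufficient.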
The time complexity of randomized algorithm 2 is .
3.2 Randomized approximation algorithm for
The exhaustive approach for the dimensional case requires -subset comparisons of , in order to guarantee hitting correct (in the noiseless case) or approximately correct (in the noisy case) regression coefficients, with complexity . However, especially in higher dimensions, the randomized procedure enables a substantial reduction in the number of iterations needed to yield, with high probability, a correct triplet of inlier set, permutation, and regression coefficients. The randomized algorithm for rRWOC in is demonstrated in algorithm 3. Conceptually, the idea of the algorithm is illustrated in figure 1. Random ordered -tuples of reference and target point sets are sampled and are used to align the remainder of the point set. The number of hypothetical inliers for each hypothetical correspondence is assessed by checking whether the transformed reference points are arbitrarily close to a target point. With high probability, if a correct -tuple correspondence is captured, the number of transformed reference points matching a target point will be high (Figure 1 top), otherwise it will result in a partial coverage (Figure 1 bottom).
Analogous to the analysis of algorithm 2, the probability of drawing inliers out of points with k outliers in is . The probability of matching the drawn inliers with the corresponding sampled reference points in is . The probability that any single draw fails to match is . The probability that draws will all be incorrect is . If we set this to be the probability of failure , we then have the estimate for the number of draws we need to make as ∎
The complexity of algorithm 3 can be analyzed as follows. In each inner loop, the regression coefficient solution requires time, the Hungarian algorithm requires to compute the input distance matrix and then to optimize the permutation matrix. The rest of the operations are . Therefore, the overall time complexity is
In the worst case, where , the complexity reaches the exhaustive rate . However, allowing for a slight tolerance for failure rate, the speed up can be substantial.
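To give a feel for how the number of draws scales, the following Python sketch computes a per-draw hit probability and the resulting number of draws. The counting model is an assumption made here for illustration and may differ from the expressions in the original: a draw picks d distinct target points and an ordered d-tuple of distinct reference points uniformly at random, and each inlier target has exactly one true reference. The names `hit_probability` and `draws_needed` are hypothetical.

```python
import math


def hit_probability(m, n, k, d):
    """Chance that one draw pairs d inlier targets with their true references, in order.

    m: number of reference points, n: number of target points,
    k: number of outliers among the targets, d: dimension (tuple size).
    """
    # All d sampled targets must be inliers.
    p_inliers = math.comb(n - k, d) / math.comb(n, d)
    # The sampled ordered d-tuple of references must be the matching one.
    p_match = 1.0 / math.perm(m, d)
    return p_inliers * p_match


def draws_needed(p_hit, success_prob):
    """Smallest T with (1 - p_hit)**T <= 1 - success_prob."""
    if p_hit >= 1.0:
        return 1
    return math.ceil(math.log(1.0 - success_prob) / math.log(1.0 - p_hit))
```

Under this model the hit probability shrinks quickly with the dimension d and with the outlier count k, which is why tolerating a small failure rate matters most when the inlier fraction is high.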
Margin parameter () selection: Both of the proofs of the noiseless and the noisy cases of proposition 1 rely on knowledge of the true regression coefficient and the noise variance in order to estimate the margin coefficient and output the optimal regression coefficient with high probability. However, in practice, as in many RANSAC-like robust regression settings, these parameters cannot be known a priori, and the margin parameter is typically determined via empirical heuristics and/or cross-validation Fischler and Bolles (1981).
In the noiseless case, an appropriate heuristic is choosing arbitrarily small since the correct regression should yield zero residual. However, for the noisy case, if available, supervised data should be used with known correspondences to estimate the actual dispersion of point correspondences.
4 Numerical results
To verify the theoretical guarantees of the proposed algorithms, simulated data in 3 dimensions was generated in both noisy and noiseless regimes. Furthermore, iterative solutions of and were obtained to demonstrate the suboptimality of local minima found using block coordinate descent for this non-convex problem.
The neuroscience application of rRWOC was demonstrated in the context of point set matching of neurons of C. elegans worms recorded using fluorescence microscopy imaging. The matching accuracy with respect to ground truth was assessed for rRWOC as well as a robust variant of the iterative closest point (ICP) algorithm Besl and McKay (1992) known as trimmed ICP Chetverikov et al. (2002).
Computational setup and code: All experiments were performed on an Intel i5-7500 CPU at 3.40GHz with 32GB RAM. MATLAB code for the 3D version of algorithm 3 is included in supplementary material along with sample C. elegans neuron point clouds.
4.1 Simulated data
A three-dimensional source point set was generated by sampling for where . A random transformation was obtained by computing the QR factorization of a random Gaussian matrix such that , taking the orthonormal rotation component . This was randomly scaled by a factor between so that . For , inlier target points were generated by transforming a random subset of by and adding Gaussian noise with varying : . Furthermore, points in were sampled uniformly at random from the convex hull of the inlier points: . This procedure yielded two unordered multisets, and . Using these unordered multisets as input to rRWOC, the regression coefficients were estimated. If , the event was considered a correct recovery, otherwise a failure. The margin parameter was set to be . Also, using the randomized algorithm 3, the success probability parameter was set to .
This procedure was repeated 100 times for varying , varying and varying to assess the empirical recovery rate as a function of outlier amount, SNR and missing correspondences in the target, respectively. The recovery rates vs. outlier ratio, and SNR can be seen in figure 2-middle. The recovery rates vs. missing data ratio and SNR can be seen in figure 2-left. Lastly, the comparison of the recovery rate of exhaustive and randomized rRWOC versus iterative closest point can be seen in figure 2-right.
These empirical results demonstrate that for a sufficiently high SNR and outlier ratio less than , the proposed algorithm yields almost perfect recovery rates. Furthermore, the comparisons with the iterative closest point algorithm (ICP) show that rRWOC is much more robust to outliers than ICP, since the inclusion of any outliers results in failure of ICP to recover the true transformation.
4.2 Neuron matching of C. elegans
For this application, we have used the publicly available C. elegans fluorescence imaging dataset of Nguyen et al. (2017) found at http://dx.doi.org/10.21227/H2901H. The worm C. elegans is a widely known model organism for studying the nervous system due to the known structural connectome of the 302 neurons it contains. The data provided 3D z-stack videos of the head of the worm, which consists of approximately 185 to 200 neurons, captured for several minutes and imaged at 4 Hz. In figure 3, the depth-colored 2D projection of a video frame can be seen superimposed with annotation points delineating the locations of neurons. Figure 3 also highlights the need for a method of matching and aligning worm point clouds that is robust to outliers or missing associations. Here, we define outliers as points where there is no neuron present and define missing data as neurons with no detection present.
A frame from the video was randomly selected. From there, a randomly sampled annotation subset of 40 neurons was selected as the source point set . Of those, 30 points were randomly transformed using the procedure described in section 4.1, with ten outlier points added to yield the target set . The variance of the added noise for each neuron was estimated through a training procedure which involved computing the alignment of all frames of the video to the first frame and computing the positional covariance of each neuron in the aligned space using the approach of Evangelidis and Horaud (2018).
Since the positional variance of each neuron was uniquely identified using training data, we used variable margin parameters for rRWOC such that where is the covariance matrix of the th neuron and denotes the th eigenvalue. Randomized rRWOC (algorithm 3) was deployed with . The results were compared with ICP. The recovery rates in terms of recovering the transformation as well as the permutation , are summarized in table 1. In general, rRWOC was able to recover both the transformation and permutation better than ICP, which tends to be initialization-dependent. In all of the experiments, ICP was initialized with random rotations.
Conclusion: In this paper, we expanded on the linear regression without correspondence model Unnikrishnan et al. (2018); Abid et al. (2017); Hsu et al. (2017); Pananjady et al. (2016) to account for missing data and outliers. Furthermore, we provided several exact and approximate algorithms for the recovery of regression coefficients under noiseless and noisy regimes. The proposed algorithms are combinatorial at worst with variable dimension. However, randomization procedures make the average case complexity in constant dimension tractable given enough tolerance for failure. We provided several theoretical guarantees for exact recovery and running time complexity. Furthermore, we empirically demonstrated the recovery rates of the proposed algorithms in simulated and biological data. This can be thought of as a general framework for dissociating the outliers from a model-based data transformation process. The same principles can apply for the cases where either the generative noise is non-Gaussian, or some prior information exists about the structure of the outliers. Case-specific noise analysis is required for a particular model selection. Future work can focus on finding theoretical bounds on the robustness of the inlier recovery as a function of the number of outliers and the statistics of the generative and outlier distributions.
- Abid et al. (2017) Abubakar Abid, Ada Poon, and James Zou. Linear regression with shuffled labels. arXiv preprint arXiv:1705.01342, 2017.
- Aiger et al. (2008) Dror Aiger, Niloy J Mitra, and Daniel Cohen-Or. 4-points congruent sets for robust pairwise surface registration. In ACM transactions on graphics (TOG), volume 27, page 85. ACM, 2008.
- Besl and McKay (1992) Paul J Besl and Neil D McKay. Method for registration of 3-d shapes. In Sensor Fusion IV: Control Paradigms and Data Structures, volume 1611, pages 586–607. International Society for Optics and Photonics, 1992.
- Boutsidis et al. (2013) Christos Boutsidis, Petros Drineas, and Malik Magdon-Ismail. Near-optimal coresets for least-squares regression. IEEE transactions on information theory, 59(10):6880–6892, 2013.
- Bubnis et al. (2019) Greg Bubnis, Steven Ban, Matthew D DiFranco, and Saul Kato. A probabilistic atlas for cell identification. arXiv preprint arXiv:1903.09227, 2019.
- Chetverikov et al. (2002) Dmitry Chetverikov, Dmitry Svirko, Dmitry Stepanov, and Pavel Krsek. The trimmed iterative closest point algorithm. In Object recognition supported by user interaction for service robots, volume 3, pages 545–548. IEEE, 2002.
- Donoho et al. (2006) David L Donoho et al. Compressed sensing. IEEE Transactions on information theory, 52(4):1289–1306, 2006.
- Evangelidis and Horaud (2018) Georgios Dimitrios Evangelidis and Radu Horaud. Joint alignment of multiple point sets with batch and incremental expectation-maximization. IEEE transactions on pattern analysis and machine intelligence, 40(6):1397–1410, 2018.
- Fischler and Bolles (1981) Martin A Fischler and Robert C Bolles. Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM, 24(6):381–395, 1981.
- Hast et al. (2013) Anders Hast, Johan Nysjö, and Andrea Marchetti. Optimal ransac-towards a repeatable algorithm for finding the optimal set. 2013.
- Hsu et al. (2017) Daniel J Hsu, Kevin Shi, and Xiaorui Sun. Linear regression without correspondence. In Advances in Neural Information Processing Systems, pages 1531–1540, 2017.
- Indyk et al. (1999) Piotr Indyk, Rajeev Motwani, and Suresh Venkatasubramanian. Geometric matching under noise: Combinatorial bounds and algorithms. In SODA, pages 457–465, 1999.
- Irani and Raghavan (1999) Sandy Irani and Prabhakar Raghavan. Combinatorial and experimental results for randomized point matching algorithms. Computational Geometry, 12(1-2):17–31, 1999.
- Jonker and Volgenant (1986) Roy Jonker and Ton Volgenant. Improving the hungarian assignment algorithm. Operations Research Letters, 5(4):171–175, 1986.
- Kainmueller et al. (2014) Dagmar Kainmueller, Florian Jug, Carsten Rother, and Gene Myers. Active graph matching for automatic joint segmentation and annotation of c. elegans. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 81–88. Springer, 2014.
- Kuhn (1955) Harold W Kuhn. The hungarian method for the assignment problem. Naval research logistics quarterly, 2(1-2):83–97, 1955.
- Luo et al. (2014) Wenhan Luo, Junliang Xing, Anton Milan, Xiaoqin Zhang, Wei Liu, Xiaowei Zhao, and Tae-Kyun Kim. Multiple object tracking: A literature review. arXiv preprint arXiv:1409.7618, 2014.
- Mount et al. (1999) David M Mount, Nathan S Netanyahu, and Jacqueline Le Moigne. Efficient algorithms for robust feature matching. Pattern recognition, 32(1):17–38, 1999.
- Myronenko and Song (2010) Andriy Myronenko and Xubo Song. Point set registration: Coherent point drift. IEEE transactions on pattern analysis and machine intelligence, 32(12):2262–2275, 2010.
- Nguyen et al. (2017) Jeffrey P Nguyen, Ashley N Linder, George S Plummer, Joshua W Shaevitz, and Andrew M Leifer. Automatically tracking neurons in a moving and deforming brain. PLoS computational biology, 13(5):e1005517, 2017.
- Pananjady et al. (2016) Ashwin Pananjady, Martin J Wainwright, and Thomas A Courtade. Linear regression with an unknown permutation: Statistical and computational limits. In 2016 54th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pages 417–424. IEEE, 2016.
- Pokrass et al. (2013) Jonathan Pokrass, Alexander M Bronstein, Michael M Bronstein, Pablo Sprechmann, and Guillermo Sapiro. Sparse modeling of intrinsic correspondences. In Computer Graphics Forum, volume 32, pages 459–468. Wiley Online Library, 2013.
- Tam et al. (2013) Gary KL Tam, Zhi-Quan Cheng, Yu-Kun Lai, Frank C Langbein, Yonghuai Liu, David Marshall, Ralph R Martin, Xian-Fang Sun, and Paul L Rosin. Registration of 3d point clouds and meshes: a survey from rigid to nonrigid. IEEE transactions on visualization and computer graphics, 19(7):1199–1217, 2013.
- Torr and Zisserman (2000) Philip HS Torr and Andrew Zisserman. Mlesac: A new robust estimator with application to estimating image geometry. Computer vision and image understanding, 78(1):138–156, 2000.
- Unnikrishnan et al. (2015) Jayakrishnan Unnikrishnan, Saeid Haghighatshoar, and Martin Vetterli. Unlabeled sensing: Solving a linear system with unordered measurements. In 2015 53rd Annual Allerton Conference on Communication, Control, and Computing (Allerton), pages 786–793. IEEE, 2015.
- Unnikrishnan et al. (2018) Jayakrishnan Unnikrishnan, Saeid Haghighatshoar, and Martin Vetterli. Unlabeled sensing with random linear measurements. IEEE Transactions on Information Theory, 64(5):3237–3253, 2018.
- Van Kaick et al. (2011) Oliver Van Kaick, Hao Zhang, Ghassan Hamarneh, and Daniel Cohen-Or. A survey on shape correspondence. In Computer Graphics Forum, volume 30, pages 1681–1707. Wiley Online Library, 2011.
- Yang et al. (2016) Jiaolong Yang, Hongdong Li, Dylan Campbell, and Yunde Jia. Go-icp: A globally optimal solution to 3d icp point-set registration. IEEE transactions on pattern analysis and machine intelligence, 38(11):2241–2254, 2016.
- Zhou et al. (2016) Qian-Yi Zhou, Jaesik Park, and Vladlen Koltun. Fast global registration. In European Conference on Computer Vision, pages 766–782. Springer, 2016.
Appendix A Proof of proposition 1
A.1 Noiseless case
Among the hypothetical regression coefficients obtained through all possible pairs of and , if a correct correspondence is encountered (i.e. ), we have where is the true coefficient. Therefore if we let then . Using this estimate, the distances of the remaining covariates regressed to their corresponding responses are
Therefore, when computing via the Hungarian algorithm Kuhn (1955), each column of the distance matrix corresponding to inlier points in (i.e. ) will have at least one zero element. Thus, the optimal assignment will include all of the permutations since they incur zero cost. Since there are of them by assumption 1, then . This is an inequality because there might be additional outlier points that are by chance close to the regressed points.
Conversely, for a pair where , we have the estimated coefficient . The distances of the remaining covariates regressed with this estimate to their corresponding responses are
Therefore, without loss of generality, assuming (if the correspondence can be automatically inferred by choosing any . If there aren’t any , then this implies is a point without correspondence in ), we have
for some . can be explicitly stated as
On the other hand,
Therefore, when computing via the Hungarian algorithm, there will be fewer than assignments in the optimal assignment such that . Otherwise, this would imply the coefficient is a coefficient that explains the inliers, which by assumption 1 cannot be the case. Thus, .
This shows that the maximal cardinality of a hypothetical inlier set is at least , and it is only achieved for a coefficient that is obtained by a correct correspondence pair. This is sufficient to show that algorithm 1 recovers the true coefficient under the noiseless regime.
A.2 Noisy case
Let the noise model of the inlier regression be . Therefore, if a correct correspondence is encountered, we have where is the true coefficient. The coefficient estimated from this pairing is . When this coefficient is applied to we see that
Therefore, if is small (i.e. in the SNR regime of Pananjady et al. (2016)), we have for with high probability. Thus the row-wise minimal cost assignment in the Hungarian algorithm will be with high probability. However, even if , if we set margin such that , with high probability we will have that
where denotes the regression coefficient obtained via incorrect correspondence . Therefore, if is sufficiently small, with high probability, algorithm 1 recovers the coefficient for some where denotes the set of inliers.