1 Introduction
Several applications in computer vision such as image alignment, panoramic mosaics, 3D reconstruction, motion tracking, object recognition, among others, rely on feature-based approaches. These approaches can be divided in three distinct steps. First, characteristic image features are detected on the input images. They can come either in the form of special image points
[1] or as distinguished image regions [2], and should meet some repeatability criteria to ensure they can be detected on images, regardless of their location, pose, illumination conditions, scale, etc. Second, a unique description is given to each detected feature. In the case of feature points, this is done by describing its surroundings [1, 3, 4]. For regions, their silhouettes and/or the contained texture information are employed [5, 6]. Finally, the descriptors of both images are matched together. Ideally, these matches correspond to samples of the underlying model to be estimated. Unluckily, the presence of outliers (i.e. matches that do not correspond to this model) complicates the estimation task. Moreover, the locations of matching points are frequently contaminated with noise commonly referred to as measurement noise [7].Introduced by Fischler and Bolles [8], Random Sample Consensus (RANSAC) is an iterative method that deals with outliers, and is used to simultaneously solve the correspondence problem while estimating the implicit global transformation. By randomly generating hypotheses on the transform matching the points, it tries to achieve a maximum consensus in the input dataset in order to deduce the matches according to the transformation, known as inliers. Once the inliers are discriminated, they are used to estimate the parameters of the underlying transformation by regression. RANSAC achieves good accuracy when observations are noiseless, even with a significant number of outliers in the input data.
Instead of using every sample in the dataset to perform the estimation as in traditional regression techniques, RANSAC tests in turn many random sets of sample pairs. Since picking an extra point decreases exponentially the probability of selecting an outlier-free sample
[9], RANSAC takes the minimum sample size (MSS) to determine a unique candidate transform, thus incrementing its chances of finding an all-inlier sample, i.e., a sample exclusively composed of inliers. This transform is assigned a score based on the cardinality of its consensus set. Finally the method returns the hypothesis that achieved the highest consensus. To refine the resulting transformation, a last-step minimization is performed using only its inliers.RANSAC weaknesses. The RANSAC method is able to give accurate results even when there are many outliers in the input data. However, it leaves some room for improvement. First, the probability of RANSAC obtaining a reasonable result increases with the number of iterations, however it may never reach the optimal solution. In addition, RANSAC results have a high degree of variability for the same input data, and this variability increases with the amount of input points and their measurement noise. Second, although robust to outliers, RANSAC is not particularly immune to measurement noise on the input data [7]. Furthermore, the final minimization step weighs uniformly the assumed inlier match, ignoring whether they are real inliers or how strongly they may be affected by noise. Third, the maximum tolerable distance parameter should be sufficiently tight to obtain a precise transformation, however it must also be loose to find enough input samples [10]. Because of such considerations, setting this parameter is a difficult task, even under low measurement noise. Fourth, even though the total amount of iterations can be set according to the theoretical bound proposed by the original authors [8], more iterations are typically required [7]. Finally and most importantly, the accuracy of RANSAC is based on the single sample where the best model was found. Although this may be accurate in some cases, it nevertheless ignores the other all-inlier models that may have been generated throughout the iterations, and which could be used to improve the resulting accuracy.
Contributions. To make up for these issues, we propose Random Sample Aggregated Consensus (RANSAAC), a simple yet powerful method that combines the random sample consensus scheme with a statistical approach. By aggregating the random hypotheses weighted by their consensus set cardinalities, the proposed approach improves systematically on RANSAC. We give practical implementations of this idea on 2D parametric transformation models, by proposing a simple strategy that allows to rapidly aggregate 2D parametric transformations using negligible computational resources.
Its main benefits over traditional RANSAC are fourfold. First, results are both more accurate and with less variability (in terms of standard deviation of the error). The improvement in accuracy over the traditional approach is on average by a factor of two to three, and is even more important with higher noise, more inliers available, and higher outlier ratios. Moreover, it improves systematically over other state-of-the-art RANSAC extensions. Second, as with the original RANSAC method, the accuracy is dramatically improved by adding a local optimization step, which in fact seems suited for our approach because it avoids discarding the potentially useful intermediate models generated during its execution. Third, by including this step, the theoretical adaptative stopping criterion proposed in
[8] becomes more realistic and could be used to effectively stop the iterations without affecting the final accuracy. Finally, using the proposed 2D transformation aggregation method adds an almost negligible extra computational cost to the RANSAC algorithm.2 RANSAC and its variants
2.1 The RANSAC method
Given an unknown transformation with parameters and a dataset of pairs consisting of samples, RANSAC computes
(1) |
where is usually the squared distance and the cost function is defined as
(2) |
The parameter is a critical parameter of the algorithm and will be discussed below. Finally, the optional last-step minimization refines the previous result by computing
(3) |
where the inlier matches are defined by
(4) |
2.1.1 RANSAC iterations and model error
The authors of RANSAC [8] show that by uniformly drawing samples, with
(5) |
where is the minimum sample size (MSS), one can ensure with confidence that at least one outlier-free sample is obtained from a dataset having inlier rate . Therefore, the total number of iterations could be set according to Eq. (5). However, since the inlier rate is in general not known beforehand, this value is updated every time a better hypothesis is found, by using Eq. (5) and taking as the current inlier ratio.
However, as noted by Chum et al. [9], the number of iterations of Eq. (5) is overly optimistic, so that an outlier-free sample does not guarantee an accurate estimation either. This is due not only to the noise on the position of the input points, but also to the underlying noise on the model hypotheses themselves [11, 12]. This other noise is caused by several factors such as limited numerical precision, poor conditioning of the model estimation method, or by other sometimes non-avoidable reasons such as the rounding of pixel intensities or point coordinates. In practice, the number of samples required in RANSAC should be set typically to two or three times the theoretical number [7].
2.1.2 Distance parameter
The parameter depends on the measurement noise of the input samples. Assume that these are 2D points with Gaussian noise of zero mean and standard deviation . Then the direct transfer error, computed as
(6) |
i.e. the distance between the projected point on the first image onto the second image and its match, is the sum of squared Gaussian variables which follows a distribution with degrees of freedom. The probability that this error is lower than a given value can be computed from the cumulative distribution . Capturing a fraction of the inliers is ensured by setting , which means that a true inlier will be incorrectly rejected percent of the time. Then if the error of the estimated transformation for a particular set of matches () is measured as the direct transfer error, this implies setting for [13]. However, if the error is given by the symmetric transfer error, namely
(7) |
then the distribution has four degrees of freedom (i.e. ). Therefore, one must set for or for . However, since in practice, the standard deviation of the noise is not available, this parameter is usually set experimentally. Throughout this article, this parameter was fixed to and the error was assumed to be measured as in Eq. (7).
2.1.3 Final refinement step
The original RANSAC method reduces the effect of measurement noise by re-estimating the transform using a last-step minimization on the resulting inliers. Depending on the underlying model, both linear and non-linear minimization methods exist [13, 14], with the latter one achieving better results with a higher computational cost. Although this final stage does in general improve RANSAC results, it gives equal importance to each match for the minimization. Then if RANSAC is not able to completely discriminate the outliers within the input data, this introduces a bias in the computation often resulting in a performance drop.
2.2 RANSAC Alternatives or Improvements
One of the aims of robust statistics is to reduce the impact of outliers. In computer vision, a robust estimator must be able to find the correct model despite of the noisy measurements and the high percentage of outliers.
In the context of regression analysis, the Least-Median-of-Squares (LMedS) and the Least Trimmed Squares (LTS) are well-known classical methods and still widely used
[15]. Both approaches resemble RANSAC by generating hypotheses from minimalistic samples. However instead of discarding the outliers, both methods sort the input points based on their residual error. The LMedS then returns the hypothesis with the lowest residual of the median point. The LTS method returns instead the model for which the sum of a fixed percentage of best ranked input data is minimized. Although the LMeds method does not require to specify the parameter of RANSAC, it implicitly assumes at least of inliers in the dataset. This method was successfully used to estimate the epipolar geometry between two images [16].Since its introduction and due to the massive proliferation of panorama generation [17], 3D estimation [13] and invariant feature points [1, 18], several modifications of RANSAC were proposed [19, 12, 20], mainly addressing three major aspects: computational cost, accuracy, and robustness against degenerate cases, i.e., cases where it is not possible to recognize an incorrect hypothesis by simply scoring the drawn minimal sample. Interestingly, these techniques can be grouped based on the strategy they adopt.
One of such strategies is to modify the random sampling process of RANSAC to generate better hypotheses, improving both speed and accuracy. In this group, NAPSAC [21] segments the sample data by assuming that inliers tend to be spatially closer to one another than outliers, thus picking points laying within a hypersphere of a specific radius. The PROSAC algorithm [22] exploits the ordering structure of the set of tentative correspondences and bases its sampling on the similarity computed on local descriptors. GroupSAC [23] combines both strategies by segmenting the input data based on feature characteristics. Suter and his collaborators show that taking random non-minimal subsets of samples considerably reduces the bias caused by taking minimal size samples over noisy input data [24, 25]. Another approach [26] is to perform an initial, not so extense round of random samples together with their hypotheses to build some prior knowledge, followed by guiding the subsequent sampling using the residuals of these hypotheses, i.e. the residual sorting information.
Another strategy is to perform a simple test by partially evaluating the dataset to discard hypotheses before computing the score, thus reducing the overall computational cost. The test [27] checks whether randomly drawn points are inliers to the hypothesis, discarding it otherwise. In [28], the authors propose to use a sequential probability ratio test (SPRT) to discard “bad” hypotheses, based on Wald’s theory of sequential testing [29]. Finally, instead of evaluating one model at a time, Preemptive RANSAC [30] filters hypotheses by scoring several of them in parallel over the same subset of samples, permitting real-time applications.
To improve on accuracy, instead of computing the cardinality of the set of sample points having smaller residuals than a specified threshold, several methods modify how samples are scored by weighting inliers based on their residual error. The MSAC method [31], a simple redescending M-estimator [32], takes the maximum between the error and the threshold to score each datum to finally minimize over their sum. Conversely, MLESAC [33]
maximizes the log likelihood of the given data by assuming the inlier error has Gaussian distribution while the outlier error follows an uniform law.
One strategy used to improve both accuracy and speed is to perform a local optimization step after a “good” enough model has been found. The locally optimized RANSAC (LO-RANSAC) [9], as well as its variants [34, 7], performs a fixed amount of iterations by taking non-minimal samples of the inliers of each newly found best hypothesis, in a so-called inner-RANSAC procedure. For each model generated by taking a non-minimal sample, a larger distance threshold is used to compute its inliers and an iterative procedure is performed to refine this model by progressively shrinking the threshold. The recently published Optimal RANSAC method by Hast et al. [35] also exploits this technique, however they add small modifications such as dataset pruning to remove outliers among iterations along with adaptive termination of the method when the same inlier set is found again after the inner-RANSAC procedure.
To increase the robustness of RANSAC against degenerate data in an epipolar geometry estimation context, DEGENSAC [36] performs a test aimed at identifying hypotheses where five or more correspondences in the minimal sample are related by a homography, to avoid bad solutions on images containing a dominant plane in the scene. In cases where most of the data samples are degenerate and do not provide a unique solution, QDEGSAC [37] performs several RANSAC executions, iteratively, by adding constraints to the inliers of the previous results.
Based on these approaches, Raguram et al. [12] studied several RANSAC extensions and proposed an algorithm that intelligently combined them into a single modular method called Universal RANSAC (USAC). This method obtains state-of-the-art results using few computational resources. It does so by combining ideas from LO-RANSAC, DEGENSAC, the SPRT test and PROSAC.
Another recent trend is to approach the robust estimation problem in an inverse fashion [38, 39, 40, 41]. The idea is to use the residuals for each data point obtained from several random models to generate the preference matrix, which relates input points with the generated hypotheses. This matrix is finally clustered to distinguish between inliers and outliers. The methods in the literature mainly differ in how to sample, define, and cluster this matrix. While J-Linkage [38] uses a binary method to define the relationship between each data point and each hypothesis in the matrix, T-Linkage [39] uses a continuous thresholded value, conceptually similar to the difference between RANSAC and MSAC scores. To perform the clustering, both methods rely on an agglomerative clustering technique: each iteration of the algorithm merges the two clusters with the smallest distance. J-Linkage uses the Jaccard distance metric while T-Linkage relies on the Tanimoto distance.
The approach of Tepper and Sapiro [41] performs bi-clustering on the preference matrix by using non-negative matrix factorization (NMF). A similar approach is used in [40], where they combine Robust PCA with Symmetric NMF to cluster the preference matrix. It should be noted that, opposite to RANSAC where a single model is estimated, these approaches target multimodel estimation and therefore are not the focus of the present work.
Finally, to address the selection of the parameter, ORSA [42] proposes an a-contrario criterion to avoid setting hard thresholds for inlier/outlier discrimination. On the other hand, Raguram [7] estimates the covariance of the underlying transform and propagates this error due to the noise by modifying the regions for which each data point is considered an inlier.
3 Random Sample Aggregated Consensus
As explained above, the RANSAC algorithm as well as its extensions discard the current best hypotheses when a new hypothesis with larger consensus is found. However, these hypotheses could be useful to obtain a more precise estimation of the underlying model. In fact, if hypotheses generated by all-inlier noisy samples are similar enough, i.e., their variance is bounded, then it makes sense to aggregate them to improve the accuracy of the method. The idea behind the proposed
RANSAAC approach is to take advantage of every generated hypothesis, using its score as a confidence measure allowing not only to discard outliers but also to give more importance to better models. In this section we first discuss the aggregation strategy employed, followed by presenting the local optimization improvement.3.1 Aggregation of 2D parametric transformations
In the context of RANSAC, we shall first illustrate this idea through a simple but classic example. Assume again that two images are related by a homography , each with some detected feature points and a certain amount of matching pairs between both images. Let us consider a single feature from the first image and its corresponding point on the second image (the match () does not necessarily need to be a detected match). At iteration RANSAC randomly picks matches, generates a hypothesized homography and computes its score using the whole dataset. This hypothesis in fact yields a tentative match . If the matches randomly selected were all inliers, then, with high probability, will be close to the real matching point on the second image. Thus, after several lucky all-inlier iterations, many estimates close to are actually found.
All of these estimates are affected by two noise sources: the underlying noise of the model hypothesis as explained before, and the measurement noise of the input points. Provided both of them have zero mean and a symmetric distribution function, aggregating these estimates should yield a value closer to the true point . However, since there may be outliers in the input samples, it is better to assign a weight for each generated hypothesis reflecting its reliability. Luckily, RANSAC already computes this measure, which is its so-called score. Therefore, by weighting each of the resulting estimated points using the RANSAC score and aggregating them by, for example, taking their weighted mean, the aggregated point will, with a high probability, be closer to its true value .
Therefore, in order to aggregate the transformations, the proposed approach requires the possibility to perform operations, such as averaging, directly on the models. Often, the set of these models has typically the structure of a Riemannian manifold [43, 44]. So they could be directly aggregated inside this manifold, by understanding its geodesics [45]. However, for the case of 2D transformations mapping points in one image to points on another, instead of aggregating on this manifold, a simple strategy based on pre-selecting points could be used, virtually adding no cost to the overall computation.
Let be a set of matching points where , , both and with predefined points on the first image. For each iteration with hypothesis RANSAAC computes the projection of the predefined points
(8) |
and the RANSAC score
(9) |
where is again the squared distance and is the cost function. After iterations, for each predefined point , RANSAAC aggregates the different estimations using the weights , resulting in trustworthy matches
(10) |
This procedure, in fact, could be regarded as a way to generate denoised inlier matches, since could be any point on the first image, while will be placed, with a high probability, close to the true matching point .
3.1.1 Predefined Source Points
The source points that are a parameter of the method depend on the global transform being estimated. The cardinality of this basis should be at least equal to the MSS. We select points which can uniquely determine the desired transform and minimize the noise caused by model generation, as depicted in Fig. 1. This strategy could be applied to any model that can be correctly estimated from points sampled on the image extents. For a homography, for example, the four points on the corners of the image should be selected. For a transformation involving a homography and two radial distortion parameters such as the one recently proposed by Kukelova et al. [46], an additional point on the center of the top and/or the left image border must be added.
3.1.2 Aggregation Strategies
For the case of 2D parametric transformations, instead of simply returning the best hypothesis, RANSAAC gathers, for each point of the predefined basis, a set of estimates of its location on the second image. These points form a weighted cluster, each one being associated the score of the hypothesis from which it was derived. All of these estimates can then be aggregated to compute a more accurate location for each point of the basis. This yields a minimum sample that permits to recover the transform. In the example shown in Fig. 2, five possible hypotheses are shown. Since the fifth hypothesis (green arrow) obtained a lower score, it has less influence on the resulting aggregation.
Two aggregation techniques were considered, namely the weighted mean and the weighted geometric median [47]. To aggregate the points of the set , with weights , with , the weighted mean computes
(11) |
where is a parameter of the aggregation procedure.
The weighted geometric median of a set is defined as the point with the minimum sum of distances to the others
(12) |
It can be calculated using the Weiszfeld’s algorithm [48], which is a form of iteratively re-weighted least squares [49] which iteratively computes
(13) |
where again is a parameter of the method.
3.2 Local Optimization
One of the major improvements of the original RANSAC method, in terms of accuracy, is to include an intermediate step every time a new best hypothesis has been found. This step, known as the local optimization step, achieves improved results by using the fact that models with good enough inlier supports are probably not distant from the true underlying model. Therefore, by taking non-minimal samples from the inliers of this new best model followed by a greedy strategy to refine the results, the LO-RANSAC algorithm [9] and its variants [34, 12] improve systematically over RANSAC.
The local optimization step is shown in Algorithm 1. The method begins by taking non-minimal samples of the inliers of the new best hypothesis found, and computes the model by performing a least squares fit (lines 10-11). Then, it computes the inliers of this model for the whole dataset using a larger distance threshold (line 12) and uses these inliers to calculate a second model by least squares fitting (line 15). This second model is later refined in an iterative fashion, by progressively shrinking the distance threshold until its original value is attained (lines 16-21). The refined model with the largest amount of inliers is then returned.
In practice, if an all-inlier sample is picked during a RANSAC iteration, this method obtains excellent approximations to the optimal solution. However, since RANSAC only keeps the best model defined by its inlier support, the last may not necessarily be the optimal model, as a better model may have appeared during these inner iterations and was subsequently discarded. This suggests that RANSAAC may significantly benefit from this procedure, and it turns out that it does. Therefore, saving and returning these intermediate models is the only extra work needed to adapt this method to RANSAAC, as seen by the lines with pink background in Algorithm 1.
As in its original formulation, the local optimization is executed every time a new best hypothesis is found during RANSAC iterations, working as an “algorithm within an algorithm”. This means that the local optimization step does not update RANSAC’s internal best-so-far variables ( and ), but has its own. In the locally optimized version of RANSAAC (i.e. LO-RANSAAC), instead of aggregating all generated hypotheses, we only aggregate hypotheses obtained from the local optimization step, since these models are likely to be closer to the ground truth. We evaluated the inclusion of all other generated models in the aggregation step and observed that the results were slightly less accurate. Furthermore, excluding these hypotheses in the aggregation reduces the processing time.
The final RANSAAC algorithm and its locally optimized version add a few lines to the traditional RANSAC method and an almost negligible computational cost. Algorithm 2 shows RANSAC together with both RANSAAC and LO-RANSAAC approaches, coded in colors. Each variant is understood by its identifying color. The original RANSAC is coded with white background. The RANSAAC method is obtained by including the pink and grey lines. The LO-RANSAAC procedure includes green and grey lines. As seen within the lines with pink background, RANSAAC first verifies for each iteration if the hypothesis is acceptable (i.e. that the amount of inliers is larger than the MSS), and then applies it to the source points while storing its scores as weights. The LO procedure, coded in green background, just performs the local optimization step when a new best model is found, and stores its results. Finally, after iterating, if no acceptable transform was found, it returns the traditional RANSAC result. However, if several hypotheses were considered valid, the computed projected points are aggregated using their weights and the algorithm returns the model that best fits these estimates.
4 Experiments
We compared the proposed method and its variants (LO, , ) with several state-of-the-art approaches both qualitative and quantitatively. Two transformation types were considered, namely the projective transform estimated using the DLT algorithm [50], and a homographic model that allows distortions on both images. The latter was estimated using the method of Kukelova et al. [46]
, which, given five matches, computes a homography and one radial distortion coefficient per image (using the division model and assuming the distortion center is on the center of the image). Since homographies are, along with fundamental matrices, the most frequent global model estimated in computer vision, we focus our evaluation on it. Nevertheless, we show results proving that the proposed approach also works on the other transform types as well, and could be potentially generalized to any parametric model obtained from points provided the possibility of averaging those models is available.
Matlab implementations of RANSAAC and all its variants are available at http://dev.ipol.im/~mrais/ransaac/.
4.1 Qualitative Evaluation: Paintings Registration with Projective Transformations
We first evaluated both RANSAC and RANSAAC with weighted mean aggregation on pictures of paintings taken in a museum under poor lighting conditions, which resulted in considerable noise visible on the output images. The objective was then to register each photograph of a painting detail taken from a shorter range, with the whole painting [51]. By using SIFT features and matching its descriptors, we proceeded to register both images using both methods. An example can be seen in Fig. 3. The camera distortion was corrected beforehand so the images are related by a pure homography. After registering both images, the resulting difference is mainly noise, however RANSAAC performs better, as evidenced in the flamingo on the bottom part of the painting (middle column of Fig. 3). When measurement noise is added to the input points on both images (), the difference in the registration performance becomes much more evident. Edges due to misalignments appear all over the difference image in RANSAC, but are dimmer on the RANSAAC results (right row of Fig. 3).
To get a better understanding of the behaviour of the method, figures 4 and its zoomed in version 5 shows the point distribution with their sizes according to their obtained weights using . As can be seen, the RANSAC algorithm always returns the model given by the points with the highest weight, ignoring every other estimation. By incorporating other weighted estimations and aggregating them, a closer point to the ground truth is obtained for every corner, therefore yielding a more precise homography.
4.2 Quantitative Evaluation by a Simulated Ground Truth. Application 1: Estimating Projective Transforms.
We built a valid ground truth, composed of point matches and a transform. In order to obtain a valid real-life transform, we used the SIFT algorithm, followed by the application of RANSAC with 100000 iterations between two images. To generate the matches, we randomly selected a fixed set of points from the first image as inliers and we matched them to the points obtained by applying the ground truth transform to them. Finally, to incorporate outliers in the dataset, we randomly picked other points from the first image and matched them with random positions on the second image. Measurement noise up to was added afterwards, since such high noise levels may appear in real-life applications. A possible reason for this is noisy point matches coming from higher scales of the Laplacian pyramid, which translate to large noisy values on the original scale. Another reason is due to an imperfect model selection that, by setting large values for , accepts matching points far away from the selected model. For example, when searching for a planar homography on radially distorted images from the same planar scene.
On this dataset, we compared several state-of-the-art methods and all variants of RANSAAC. To measure the error, we averaged the symmetric transfer error in pixels for every inlier. If is the obtained model, is an inlier point in the first image, its match and the amount of inliers, then the mean error is defined as
(14) |
where and represent the original input points, before adding the noise.
By varying the number of inliers, the percentage of outliers, the input noise and the number of iterations performed, we computed for each experiment the average error per experiment and the standard deviations over 100 evaluations.
Note that for all presented results, the very same random samples for each iteration were used for RANSAC, LO-RANSAC and RANSAAC, thus ensuring equal chance for those methods. We used the implementation of USAC available from the author [12].
We evaluated the performance of the proposed approach under distinct conditions. Since the noise level was known beforehand, the RANSAC parameter was set according to section 2.1.2, and for RANSAAC this parameter was fixed to due to the higher tolerance of the method with respect to this parameter. Unless explicitly mentioned, the parameter was set depending on the amount of iterations performed and on the used aggregation scheme, indicated by the results obtained in the following section.
4.2.1 Aggregation Weights
Because of outliers in the input data, the weighting of the hypotheses on the aggregation is mandatory. Increasing the value of the weight parameter leads to eliminate the influence of poor hypotheses, but a too high value will imply using too few transforms for the final aggregation.
An experiment was performed to measure the impact of the parameter for RANSAAC and its LO-RANSAAC variant, shown in Fig. 6. For the case of RANSAAC, it turns out that the value of depends on both the outlier percentage, but most importantly, on the number of drawn hypotheses. Indeed, as more hypothesis are drawn, they should be better discriminated, suggesting higher values of . As soon as the value of is high enough so that it allows to correctly discriminate between inliers and outliers, then the resulting accuracy does not vary significantly. A similar trend was observed using aggregation and therefore its results were not included. However, for the LO-RANSAAC variant using aggregation, varying the parameter does not affect the resulting accuracy. This can be explained in two ways. First, the models returned by the LO step have usually fewer outliers. Secondly, the geometric median is more robust to outliers than the mean, therefore not requiring to completely avoid them in the aggregation. When using aggregation, results were similar as in the standard RANSAAC with low values of . However, the curves remained almost constant later, as with the .
4.2.2 Accuracy evolution along iterations
To give insight about the evolution of the error under RANSAC and under the proposed approach, their accuracy was computed while iterating. To this end, we simulated input points from a predefined real homography (i.e. computed from two images of a landscape), corrupted them with white Gaussian Noise of and added 50% outliers. Using this input, we ran 20000 iterations of RANSAC and our method and calculated the error for every iteration (for RANSAAC, this implied performing aggregation). Finally, we computed the average error per iteration from 1000 repetitions of the experiment. In Fig. 7 left, we observe how RANSAC does not only converge slower comparing it with both aggregation schemes of RANSAAC, but also that it never reaches their accuracy. However, when observing the first 500 iterations, RANSAC usually achieves better results at the beginning, while RANSAAC usually requires more all-inlier samples to produce more accurate results. This is expected, since RANSAAC gains from aggregating all generated samples, thus only having a single “good” hypothesis and several “bad” ones will always benefit a method such as RANSAC that just considers the single hypothesis with the highest obtained score. However, as more all-inlier hypotheses are sampled, RANSAAC starts improving over traditional RANSAC (in the figure this happens at 125 iterations).
4.2.3 Ransaac variants evaluation
We evaluated both aggregation variants for RANSAAC with and without local optimization, by varying the number of inliers, the percentage of outliers, noise level, and the number of iterations performed. Table I shows results under three different noise levels ( or ). For each method and noise level, both 1000 and 10000 iterations were tested, each row corresponding to situations presenting either 0%, 5%, 20% or 50% outliers respectively.
Several conclusions could be drawn from these results. As can be seen, by using the local optimization procedure, every aggregated method improved, and in every case, the best performing aggregation scheme was the weighted geometric median. Interestingly, this did not occur when no local optimization steps were performed. The reason behind this is evident, as both the outlier ratio and the number of inliers increase, the probability of obtaining good models by random sampling becomes lower, therefore averaging these not-so-close-to-the-optimal models will achieve better results than taking their median, which will be far away from the optimal solution. However, when better (i.e. closer to the optimal solution) hypotheses are available, then computing their median is not only more accurate but also drastically reduces the possible bias caused by including outliers in the estimation. Indeed, performing LO steps provides excellent models, which justifies why aggregation improves over .
It is remarkable how the aggregation benefits from local optimization, evidenced by the increase in accuracy up to an order of magnitude for 1000 iterations, 1000 inliers, 50% outliers and . Finally, performing more RANSAC iterations when using local optimization does not improve the resulting accuracy. Indeed, results could become slightly worse when performing more than enough iterations since there are more possible bad hypotheses that have to be correctly discriminated. In conclusion, adding local optimization to RANSAAC does not only improves results but also enables the method to perform fewer iterations, making it less computationally demanding.
To sum up our findings, LO-RANSAAC using wgmed is our recommended algorithm whenever local optimization is applicable. When this is not possible, the best RANSAAC results are obtained by using the wmean aggregation. An example of this is shown in section 4.3.
wmean | LO (wmean) | wgmed | LO (wgmed) | |||||
1k | 10k | 1k | 10k | 1k | 10k | 1k | 10k | |
100 inliers | ||||||||
0.5 | 0.19 | 0.19 | 0.20 | |||||
0.20 | 0.20 | 0.20 | ||||||
0.20 | 0.20 | |||||||
0.21 | 0.21 | 0.21 | ||||||
2 | 0.81 | 0.75 | ||||||
0.81 | 0.76 | |||||||
0.83 | 0.82 | |||||||
0.81 | ||||||||
5 | 1.90 | 1.88 | ||||||
2.07 | 2.01 | |||||||
2.01 | 2.03 | 2.03 | ||||||
1.94 | 1.97 | |||||||
1000 inliers | ||||||||
0.5 | 0.07 | 0.06 | ||||||
0.07 | 0.06 | 0.06 | ||||||
0.07 | 0.06 | 0.06 | ||||||
0.07 | 0.06 | |||||||
2 | 0.24 | |||||||
0.26 | 0.26 | |||||||
0.25 | 0.25 | |||||||
0.25 | ||||||||
5 | 0.62 | |||||||
0.65 | ||||||||
0.63 | ||||||||
0.63 |
4.2.4 Execution times comparison
We compared the execution times for both RANSAC and RANSAAC together with their extra steps. By performing 100000 iterations on a dataset composed of 1000 inliers and 500 outliers with measurement noise of , we computed the total running time of each approach. Execution times are indicative and were measured on a 3.1GHz Intel Core i7 5557U processor. As seen from Table II, the standard RANSAAC method took more time than the original RANSAC. Moreover, if the LO-RANSAAC variant is used, since only the hypotheses coming from the LO step are aggregated, the difference with respect to LO variant of RANSAC is almost negligible.
RANSAC | LSM | RANSAAC | agg. | agg. |
---|---|---|---|---|
52.810s | 0.007s | 53.269s | 0.007s | 0.052s |
4.2.5 Results by using adaptive termination
We compared the LO-RANSAAC algorithm using both proposed aggregation schemes together with USAC, RANSAC and RANSAC+M. A maximum of 1000 iterations was allowed for all methods, although the methods with adaptive termination varied this parameter dynamically during the execution, based on Eq. (5) using . Results show average errors for 100 executions, where the noise value was fixed to and the distance values were set for every method according to section 2.1.2. USAC results are duplicated since adaptive termination was always activated.
Several conclusions are drawn from results shown in Fig. 8. First, while the weighted mean aggregation slightly suffers from using adaptive termination, the weighted geometric median aggregation in fact slightly benefits from it. The reason for this is again the same as mentioned before, and is related with the fact that making more iterations implies also having to better discriminate between samples.
4.2.6 Comparison with state-of-the-art methods.
We compared the proposed LO-RANSAAC approach with RANSAC and several state-of-the-art variants, namely LORANSAC, MSAC [33], ZHANGSAC [16] and the recent USAC algorithm [12]. We included in the comparison two minimization methods applied directly on the inliers using an oracle: the linear least squares methods using the DLT algorithm and the non-linear method that minimizes the Sampson’s approximation to the geometric reprojection error [13]. For every method, we also evaluated adding the final minimization step (using the DLT algorithm) and kept only the best between both.
With 1000 inlier matches available, the average errors together with their standard deviations are shown in Fig. 9 for 50% and 75% outlier ratios. All approaches not using local optimization steps have much higher errors proving the importance of this method. The LO-RANSAC approach was able to attain good results, although it never achieves the accuracy of the USAC method. Every RANSAAC variant improved over USAC, and strikingly, the aggregation variant is close in performance to the methods that use an oracle. What is more, when there are 75% outliers, the USAC approach failed while the LO-RANSAAC method still achieves highly accurate and stable results.
4.2.7 Performance under high outlier ratios
We tested our method under extreme conditions with high outlier ratios. We evaluated the accuracy of different algorithms with 1000 inliers and 9000 outliers, using both 10000 and 20000 iterations. In table III results are shown under two noise levels: mild/moderate noise () or high noise (). Even under these extreme conditions, the LO-RANSAAC method was still able to achieve extremely low error values. On the contrary, the RANSAC method failed considerably, even after applying the last step minimization, while LO-RANSAC did not achieve errors of under a pixel in average even by using 20000 iterations. Furthermore, USAC did not return any results. Interestingly, the achieved precision of the LO-RANSAAC is close to the accuracy obtained by using an oracle.
RANSAC+M | LO-RANSAC | USAC | Oracle | ||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
10k | 20k | 10k | 20k | 10k | 20k | 10k | 20k | 10k | 20k | - | |
2 | 0.36 | 0.31 | 0.28 | ||||||||
5 | 0.94 | 1.15 | 0.74 |
4.2.8 Performance evaluation using random homographies
We also tested the robustness of the proposed approach to different homographies. We used the Zuliani toolbox available online [52] to simulate random homographies.
Then, we evaluated both LO-RANSAAC aggregation methods together with USAC [12]
which turned out to be the best approach during our previous tests and again, aided by an oracle, the least squares minimization of the Sampson approximation to the geometric reprojection error. The experiment again consisted in averaging the error over 100 trials by varying the noise and the inlier/outlier configuration. The amount of iterations was set to 1000. Differing from the previous experiments where both the homography and the inliers were fixed, a new random homography and random inliers sampled from it were generated for each trial. Then, as before, outliers were injected in the input data by picking random positions on both images, and finally Gaussian white noise was added to the final points. Results confirmed the improvement of LO-
RANSAAC over USAC as shown in Fig. 10, validating the versatility of the approach.As a final test, instead of fixing the amount of iterations performed, we evaluated the same methods using adaptive termination (see sec. 2.1.1). Again the maximum number of iterations was set to 1000, however in practice the algorithm performed around 12 and 85 iterations for 20% and 50% outliers respectively, together with a single LO step. Interestingly, even by taking such low amount of iterations, the method still improved over USAC. However, this did not occur using aggregation for the case of 1000 inliers, due to its already observed necessity to get more models in order for the average to be closer to the optimal solution. A possible workaround for this, validated empirically, was to double the theoretical number of iterations given by Eq. (5) in the adaptive procedure. On the contrary, the LO-RANSAAC method with aggregation obtained more accurate results than USAC on every case.
Lastly, we observed some instability for our approach when using 100/200 inlier ratio and . This peak was caused by some homographies which were not correctly averaged. In fact, some degenerate cases of homographies would not be correctly aggregated using the approach described in section 3.1. Indeed, averaging points close to the horizon of the homography (i.e., the line of points that are projected to the infinity) will be unstable. Because of the noise on the input data, each all-inlier homography will be noisy. Then it is possible for one of the preselected points to lie close but on different sides of the horizon line for two different valid homographies, so when projected, they end up on opposite sides of the image and their average does not follow the geodesics of the space. To correct this behaviour, it would suffice to early detect which preselected points are close to the horizon line and either avoid averaging them or restart the algorithm by preselecting new points away from the horizon. Nevertheless, for several applications such as panorama generation, the probability of this situation to occur is very low, since images are taken continously.
4.3 Application 2: Estimating Homography+Distortion
The recently proposed method [46] to estimate an homography and a distortion coefficient for each image was evaluated to test the robustness of the proposed approach against different model noises. In this case, the available algorithm to compute such model assumes only 5 input matches, and it is not possible to perform a least squares like minimization based on the authors supplied code. Therefore, this is an interesting case in which, evidently, no local optimization is possible as is, and no last step minimization could be applied. Therefore, we restricted ourselves to evaluating the original RANSAAC approach using aggregation, comparing it with RANSAC, as in the author’s work.
As seen in Fig. 12, the method outperforms RANSAC in every evaluated experiment on both accuracy and stability, and the difference in performance gets higher as the amount of iterations increase, particularly for higher outlier percentages. This experiment proves the versatility of our approach compared to other more complex models.
5 Conclusion
In this article we introduced a simple, yet powerful RANSAC modification that improves the method by combining the random consensus idea using samples with minimal cardinality with a statistical approach performing an aggregation of estimates. This comes with an almost negligible extra computational cost. The most interesting advantage of our approach is that it makes better use of the generated hypotheses, obtaining improved results when using the exact same hypotheses. What is more, most of RANSAC enhancements easily fit into the proposed method. By adding local optimization to RANSAAC (and using weighted geometric median aggregation), the resulting accuracy and stability increased considerably surpassing every other RANSAC variant. Moreover, it succeeded in accurately estimating models under 90% outliers, a situation where most approaches failed.
As a future work, we plan to extend the method to the case of epipolar geometry estimation.
Acknowledgments
Work partly founded by the Centre National d’Etudes Spatiales (CNES, MISS Project), the Office of Naval research (ONR grant N00014-14-1-0023) and by the Ministerio de Ciencia e Innovación under grant TIN2014-53772-R. During this work, Martin Rais had a fellowship of the Ministerio de Economia y Competividad (Spain), reference BES-2012-057113, for the realization of his Ph.D. thesis.
References
- [1] D. G. Lowe, “Distinctive image features from scale-invariant keypoints,” International Journal of Computer Vision, vol. 60, no. 2, pp. 91–110, 2004.
- [2] J. Matas, O. Chum, M. Urban, and T. Pajdla, “Robust wide baseline stereo from maximally stable extremal regions,” in Proceedings of the British Machine Vision Conference. BMVA Press, 2002, pp. 36.1–36.10, doi:10.5244/C.16.36.
- [3] H. Bay, A. Ess, T. Tuytelaars, and L. Van Gool, “Speeded-up robust features (surf),” Computer vision and image understanding, vol. 110, no. 3, pp. 346–359, 2008.
- [4] K. Mikolajczyk and C. Schmid, “A performance evaluation of local descriptors,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 10, pp. 1615–1630, Oct 2005.
- [5] S. Belongie, J. Malik, and J. Puzicha, “Shape matching and object recognition using shape contexts,” IEEE transactions on pattern analysis and machine intelligence, vol. 24, no. 4, pp. 509–522, 2002.
- [6] P.-E. Forssén and D. G. Lowe, “Shape descriptors for maximally stable extremal regions,” in 2007 IEEE 11th International Conference on Computer Vision. IEEE, 2007, pp. 1–8.
- [7] R. Raguram, J.-M. Frahm, and M. Pollefeys, “Exploiting uncertainty in random sample consensus,” in Computer Vision, 2009 IEEE 12th International Conference on. IEEE, 2009, pp. 2074–2081.
- [8] M. A. Fischler and R. C. Bolles, “Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography,” Commun. ACM, vol. 24, no. 6, pp. 381–395, Jun. 1981.
- [9] O. Chum, J. Matas, and J. Kittler, “Locally optimized RANSAC,” in Pattern recognition. Springer, 2003, pp. 236–243.
- [10] J. Choi and G. Medioni, “Starsac: Stable random sample consensus for parameter estimation,” in Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on, June 2009, pp. 675–682.
- [11] B. Tordoff and D. W. Murray, “Guided sampling and consensus for motion estimation,” in Proceedings of the 7th European Conference on Computer Vision-Part I, ser. ECCV ’02. London, UK, UK: Springer-Verlag, 2002, pp. 82–98.
- [12] R. Raguram, O. Chum, M. Pollefeys, J. Matas, and J.-M. Frahm, “USAC: A Universal Framework for Random Sample Consensus,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 8, pp. 2022–2038, Aug. 2013.
- [13] R. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision, 2nd ed. New York, NY, USA: Cambridge University Press, 2003.
- [14] T. Botterill, S. Mills, and R. Green, “Refining essential matrix estimates from ransac,” in Image and Vision Computing New Zealand, 2011, pp. 500–505.
- [15] P. J. Rousseeuw, “Least median of squares regression,” Journal of the American Statistical Association, vol. 79, no. 388, pp. 871–880, 1984.
- [16] Z. Zhang, R. Deriche, O. Faugeras, and Q.-T. Luong, “A robust technique for matching two uncalibrated images through the recovery of the unknown epipolar geometry,” Artificial Intelligence, vol. 78, no. 1â2, pp. 87–119, 1995, special Volume on Computer Vision.
- [17] M. Brown and D. G. Lowe, “Automatic panoramic image stitching using invariant features,” International Journal of Computer Vision, vol. 74, no. 1, pp. 59–73, 2006.
- [18] I. Rey Otero and M. Delbracio, “Anatomy of the SIFT Method,” Image Processing On Line, vol. 4, pp. 370–396, 2014.
- [19] R. Raguram, J.-M. Frahm, and M. Pollefeys, “A comparative analysis of RANSAC techniques leading to adaptive real-time random sample consensus,” Computer Vision–ECCV 2008, pp. 500–513, 2008.
- [20] S. Choi, T. Kim, and W. Yu, “Performance evaluation of ransac family.” in BMVC. British Machine Vision Association, 2009.
- [21] D. R. Myatt, P. H. S. Torr, S. J. Nasuto, J. M. Bishop, and R. Craddock, “Napsac: High noise, high dimensional robust estimation - it’s in the bag,” in BMVC, 2002.
- [22] O. Chum and J. Matas, “Matching with PROSAC-progressive sample consensus,” in Computer Vision and Pattern Recognition, 2005. CVPR 2005. IEEE Computer Society Conference on, vol. 1. IEEE, 2005, pp. 220–226.
- [23] K. Ni, H. Jin, and F. Dellaert, “Groupsac: Efficient consensus in the presence of groupings,” in Computer Vision, 2009 IEEE 12th International Conference on, Sept 2009, pp. 2193–2200.
- [24] T. T. Pham, T. J. Chin, J. Yu, and D. Suter, “The random cluster model for robust geometric fitting,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 36, no. 8, pp. 1658–1671, Aug 2014.
- [25] R. B. Tennakoon, A. Bab-Hadiashar, Z. Cao, R. Hoseinnezhad, and D. Suter, “Robust model fitting using higher than minimal subset sampling,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 2, pp. 350–362, Feb 2016.
- [26] T.-J. Chin, J. Yu, and D. Suter, “Accelerated hypothesis generation for multistructure data via preference analysis,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 34, no. 4, pp. 625–638, Apr. 2012.
- [27] J. Matas and O. Chum, “Randomized RANSAC with Td,d test,” Image and Vision Computing, vol. 22, no. 10, pp. 837–842, Sep. 2004.
- [28] O. Chum and J. Matas, “Optimal Randomized RANSAC,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 30, no. 8, pp. 1472–1482, Aug. 2008.
- [29] A. Wald, Sequential Analysis, ser. Wiley series in probability and mathematical statistics. Probability and mathematical statistics. J. Wiley & Sons, Incorporated, 1947.
- [30] D. Nister, “Preemptive RANSAC for live structure and motion estimation,” Machine Vision and Applications, vol. 16, no. 5, pp. 321–329, Dec. 2005.
- [31] P. Torr and A. Zisserman, “Robust computation and parametrization of multiple view relations,” in Computer Vision, 1998. Sixth International Conference on, Jan 1998, pp. 727–732.
- [32] P. J. Huber, Robust Statistics. Wiley, 1981.
- [33] P. Torr and A. Zisserman, “MLESAC: A New Robust Estimator with Application to Estimating Image Geometry,” Computer Vision and Image Understanding, vol. 78, no. 1, pp. 138–156, Apr. 2000.
- [34] K. Lebeda, J. Matas, and O. Chum, “Fixing the locally optimized ransac–full experimental evaluation,” in British Machine Vision Conference. Citeseer, 2012, pp. 1–11.
- [35] A. Hast, J. Nysjo, and A. Marchetti, “Optimal ransac - towards a repeatable algorithm for finding the optimal set,” Journal of WSCG, vol. no.1, pp. 21–30, 2013.
- [36] O. Chum, T. Werner, and J. Matas, “Two-view geometry estimation unaffected by a dominant plane,” in Computer Vision and Pattern Recognition, 2005. CVPR 2005. IEEE Computer Society Conference on, vol. 1, June 2005, pp. 772–779 vol. 1.
- [37] J. M. Frahm and M. Pollefeys, “Ransac for (quasi-)degenerate data (qdegsac),” in Computer Vision and Pattern Recognition, 2006 IEEE Computer Society Conference on, vol. 1, June 2006, pp. 453–460.
- [38] R. Toldo and A. Fusiello, Robust Multiple Structures Estimation with J-Linkage. Berlin, Heidelberg: Springer Berlin Heidelberg, 2008, pp. 537–547.
- [39] L. Magri and A. Fusiello, “T-linkage: A continuous relaxation of j-linkage for multi-model fitting,” in 2014 IEEE Conference on Computer Vision and Pattern Recognition, June 2014, pp. 3954–3961.
- [40] ——, “Robust multiple model fitting with preference analysis and low-rank approximation,” in Proceedings of the British Machine Vision Conference (BMVC), X. Xie, M. W. Jones, and G. K. L. Tam, Eds. BMVA Press, September 2015, pp. 20.1–20.12.
- [41] M. Tepper and G. Sapiro, “A biclustering framework for consensus problems,” SIAM Journal on Imaging Sciences, vol. 7, no. 4, pp. 2488–2525, 2014.
- [42] L. Moisan, P. Moulon, and P. Monasse, “Automatic Homographic Registration of a Pair of Images, with A Contrario Elimination of Outliers,” Image Processing On Line, vol. 2, pp. 56–73, 2012.
- [43] P. W. Michor and D. Mumford, “A zoo of diffeomorphism groups on ,” ArXiv e-prints, Nov. 2012.
- [44] R. Tron and K. Daniilidis, “On the quotient representation for the essential manifold,” in Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, ser. CVPR ’14. Washington, DC, USA: IEEE Computer Society, 2014, pp. 1574–1581.
- [45] G. Li, Y. Jian, Y. Liu, and Z. Shi, Projective Registration with Manifold Optimization. INTECH Open Access Publisher, 2009.
- [46] Z. Kukelova, J. Heller, M. Bujnak, and T. Pajdla, “Radial distortion homography,” in Computer Vision and Pattern Recognition (CVPR), 2015 IEEE Conference on, June 2015, pp. 639–647.
- [47] Z. Drezner and H. Hamacher, Facility Location: Applications and Theory. Springer Berlin Heidelberg, 2001.
- [48] E. Weiszfeld and F. Plastria, “On the point for which the sum of the distances to n given points is minimum,” Annals of Operations Research, vol. 167, no. 1, pp. 7–41, 2008.
- [49] I. Daubechies, R. DeVore, M. Fornasier, and C. S. Güntürk, “Iteratively reweighted least squares minimization for sparse recovery,” Communications on Pure and Applied Mathematics, vol. 63, no. 1, pp. 1–38, 2010.
- [50] I. E. Sutherland, “Three-dimensional data input by tablet,” Proceedings of the IEEE, vol. 62, no. 4, pp. 453–461, April 1974.
- [51] A. Buades, G. Haro, and E. Meinhardt-Llopis, “Obtaining High Quality Photographs of Paintings by Image Fusion,” Image Processing On Line, vol. 5, pp. 159–175, 2015.
- [52] M. Zuliani, “Ransac toolbox for matlab,” Nov. 2008. [Online]. Available: https://github.com/RANSAC/RANSAC-Toolbox
Comments
There are no comments yet.