I Introduction
Feature correspondence selection is a fundamental and critical task in computer vision and robotics. It is the basis for a wide range of applications, such as structure-from-motion
[1], simultaneous localization and mapping [2], tracking [3], image stitching [4], and object recognition [5], to name just a few. The main purpose of correspondence selection is to retrieve as many correct correspondences (also known as inliers) as possible from the initial feature matches. This task usually arises in the context of feature matching. The general feature matching process starts by detecting representative points, namely keypoints, in the two images to be matched. Then, local descriptors such as SIFT [6] and ORB [7] are employed to describe those keypoints. To build the connection between two images, keypoints with similar feature descriptors are matched, generating a set of raw feature matches. However, the initial feature matches often contain numerous wrong matches (as shown in Fig. 1(a)) due to the limited distinctiveness of feature descriptors and/or external interferences such as noise and occlusion. This problem makes correspondence selection a necessity for accurate feature matching. Fig. 1(b) shows that matches after correspondence selection are far more consistent than the initial feature matches. This consensus enables many high-level vision tasks. For instance, homography, affine and essential matrices can be estimated from consistent correspondences, allowing us to compute the transformation between two images and warp them into a unified coordinate system
[4]. Other applications include camera parameter estimation [1] and object tracking [3]. Nonetheless, the correspondence selection problem is difficult in real applications due to several factors, e.g., zoom, rotation, blur, viewpoint change, JPEG compression, light change, different rendering styles, and multi-structures. Different scenarios also lead to different distributions of feature matches, which are linearly non-separable. To address these problems, the many approaches presented during the past two decades can be divided into two categories [8]: parametric and non-parametric methods. (i) Parametric methods seek consistent correspondences based on parametric geometric models. Typical methods include the random sample consensus (RANSAC) [9], the progressive sample consensus (PROSAC) [10], and the universal framework for random sample consensus (USAC) [11]. (ii) Non-parametric methods are independent of parametric model assumptions. Some of them search for correspondence inliers via either a feature similarity constraint or a geometric constraint, such as the nearest neighbor similarity ratio (NNSR) [6], spectral technique (ST) [12], game-theoretic matching (GTM) [13], graph-based affine invariant matching (GAIM) [14] and locality preserving matching (LPM) [15]. There are also constraint-independent non-parametric methods such as identifying point correspondences by correspondence function (ICF) [16], vector field consensus (VFC) [8], grid-based motion statistics (GMS) [17] and coherence based decision boundaries (CODE) [18]. With the wealth of existing correspondence selection methods, however, it is on the one hand difficult for developers to choose the most proper method for a specific application, and on the other hand confusing for researchers to compare these methods under different conditions. This problem is mainly due to the fact that most methods were tested under a specific application scenario and compared with a limited number of baselines. Some performance evaluations in the field of image feature matching also exist. For instance, Mikolajczyk et al. [19] and Heinly et al. [20] evaluated the performance of several 2D feature descriptors. Aanæs et al. [21] investigated the performance of 2D feature detectors. Moreels et al. [22] performed an aggregated evaluation of both 2D detectors and descriptors. In addition to feature detectors and descriptors, Raguram et al. [23] tested the performance of a set of random sample consensus methods including the popular RANSAC and its variants. However, all these evaluations are either not in line with 2D correspondence selection or not comprehensive enough for an in-depth comparison. First, the critical step in correspondence selection is finding correspondence consensus, while feature detection and description aim at building high-quality initial feature correspondences (such quality is difficult to guarantee without correspondence selection [6]). Second, the performance of non-parametric approaches and some recent algorithms remains unclear (only parametric methods were tested in [23]).
In this regard, we present what is, to the best of our knowledge, the first comprehensive evaluation of 2D correspondence selection from different perspectives in a uniform experimental framework. The considered methods range from classical algorithms to the most recent ones, covering both parametric and non-parametric approaches. To be specific, RANSAC [9] and USAC [11] are selected from the parametric family, as RANSAC is arguably the most popular parametric approach and USAC is a well-known modified version of RANSAC. As for non-parametric methods, we choose NNSR [6] as a representative of methods based on descriptor similarity constraints. ST [12], GTM [13] and LPM [15] are selected as they all rely on geometric consensus. VFC [8] and the recent GMS [17] are taken into consideration since they eliminate outliers from the perspective of statistical measures. In order to compare these methods from different perspectives, we choose four standard datasets, i.e., VGG [24], Heinly [20], Symbench [25], and AdelaideRMF [26], as experimental platforms, with the aim of testing the overall performance of those correspondence selection methods when faced with a variety of nuisances rather than in their favoring circumstances. For instance, geometric constraints may become vulnerable under rigid/non-rigid transformations such as zoom and rotation; feature similarity constraints become unreliable when the image undergoes blur and light changes; parametric models (e.g., the homography matrix) can hardly cope with scenes with parallax (all these conclusions are verified in Sect. IV). The considered datasets well cover these concerns. To be specific, VGG is a hybrid dataset containing challenges including zoom, rotation, blur, JPEG compression, light and viewpoint change. Heinly contains pure zoom and rotation. Symbench involves scenes with light changes and varying rendering styles. AdelaideRMF possesses viewpoint change and multi-structures, resulting in parallax. The behavior of each method is quantitatively measured using precision, recall and F-measure [17, 27, 15]. In addition, the performance under pre-selected correspondences (with higher inlier ratios) and different detector-descriptor combinations is also assessed, to test flexibility with respect to inlier ratio and correspondence distribution changes. Finally, the efficiency with respect to different scales of initial feature matches is examined. Based on the experimental outcomes, we make an aggregated summary of the current advantages and limitations of the evaluated methods as well as their suitable applications. In a nutshell, the contributions of this paper are threefold:

A review and the core computation steps of eight state-of-the-art 2D correspondence selection algorithms are presented.

We comprehensively evaluate and compare the performance, the robustness to a variety of perturbations and the efficiency of each algorithm on four standard datasets consisting of hundreds of images with zoom, rotation, blur, viewpoint change, JPEG compression, light change, different rendering styles and multi-structures.

Instructive summarizations including merits, demerits and suitable applications of the tested methods are given, which can serve as a “user guide” for developers.
The remainder of this paper is organized as follows. Sect. II gives a review of 2D correspondence selection algorithms and relevant evaluations. Sect. III presents the core computation steps of eight state-of-the-art approaches. Sect. IV describes the experimental setup including datasets, criteria and implementation details of the evaluated methods. Qualitative and quantitative experimental results are shown in Sect. V. Summary and discussion are presented in Sect. VI. Conclusions are finally drawn in Sect. VII.
II Related Work
This section briefly reviews the prior works of 2D correspondence selection including both parametric and nonparametric categories. Relevant evaluations in the field of feature matching are also discussed.
II-A Correspondence selection methods
For parametric methods, the most well-known algorithm is arguably RANSAC, presented by Fischler et al. [9]. RANSAC iteratively explores the space of model parameters by random sampling and estimates the most reliable model based on the maximum number of inliers. Then, outliers can be removed using the generated model. Several variants of RANSAC such as MLESAC [28], LO-RANSAC [29], PROSAC [10] and USAC [11] were proposed in the following decades. MLESAC employs the maximum likelihood estimation rather than the inlier count to check the solutions. LO-RANSAC inserts an optimization process where the generated model is refined using the subset of inliers. A weighted sampling step is adopted instead of random sampling in PROSAC. This method sorts the raw correspondences by matching quality and generates hypotheses from the most promising correspondences. USAC extends the standard hypothesize-and-verify structure of RANSAC and presents a universal framework that integrates the advantages of previous parametric methods. In addition, some other approaches relying on local parametric structures have also been developed, such as agglomerative correspondence clustering (ACC) [30], multi-structure robust fitting (Multi-GS) [31], and Hough voting and inverted Hough voting (HVIV) [32]. ACC uses the Hessian-affine detector [33], which is invariant to affine transformations, to estimate local homography matrices as constraints. The initial correspondences are then clustered based on these constraints, and the clusters with inliers are expected to be larger than those constituted by outliers. Multi-GS generates a series of tentative hypotheses by random sampling and considers two correspondences from the same local structure to be inliers if they share a common list of hypotheses. HVIV employs the BPLR detector [34] to cluster correspondences and estimates the homographic transformation for each correspondence as well. The most plausible correspondence in each cluster is then selected using normalized kernel density estimation.
For non-parametric methods, their theoretical foundations are not always the same. A widely-used strategy is exploiting the consistency information of local geometric structures or appearance (feature similarity). Specifically, Lowe et al. [6] proposed the nearest neighbor similarity ratio (NNSR) method, which assigns a penalty equal to the ratio of the closest to the second-closest feature distance to each correspondence and treats correspondences with low ratios as inliers. Leordeanu et al. [12] presented the spectral technique (ST), where an affinity matrix is built using pairwise geometric constraints to remove mismatches in conflict with the most credible correspondences. Albarelli et al. [13] cast the selection of correspondences in a game-theoretic framework, known as game-theoretic matching (GTM), where a natural selection process allows corresponding points that satisfy a mutual distance constraint to thrive. Cho et al. [35] presented the reweighted random walk algorithm (RRWM) for graph matching. An association graph between two sets of candidate correspondences is drawn first, and reliable nodes indicating consistent correspondences in this graph are then selected by the reweighted random walk algorithm. Ma et al. [15] proposed locality preserving matching (LPM) to improve inlier selection by maintaining the local neighborhood structures of potential true matches. Some non-parametric approaches that formulate correspondence selection as a statistics problem have also been used, e.g., vector field consensus (VFC) [8] and grid-based motion statistics (GMS) [17]. VFC supposes that the noise around inliers and outliers follows different distributions, and estimates the probability of inliers by maximum likelihood estimation of the parameters of a mixture probabilistic model. Additionally, GMS rejects false matches by counting the quantity of matches in small neighborhoods and achieves real-time performance with an efficient grid-based score estimator.
II-B Other evaluations
In the feature matching field, some evaluations of 2D/3D local descriptors and detectors have been performed. For instance, Mikolajczyk et al. [19] evaluated the performance of 2D feature descriptors under transformations of rotation, zoom, viewpoint change, blur, JPEG compression, light change and keypoint localization errors. Moreels et al. [22] conducted an evaluation of several groups of 2D feature detectors and descriptors on images captured from the same 3D object under different viewpoints and lighting conditions. Heinly et al. [20] performed an evaluation of several 2D binary descriptors, aiming at testing their descriptiveness under different feature detectors on several scenes with illumination change, viewpoint change, pure camera rotation and pure scale change. Aanæs et al. [21] investigated the performance of several 2D feature detectors on a particular dataset wherein each scene was depicted from 119 camera positions with a range of light directions. In the 3D domain, Tombari et al. [36] compared two categories (i.e., fixed-scale and adaptive-scale) of 3D feature detectors in terms of distinctiveness, repeatability and efficiency under the nuisances of viewpoint changes, clutter, occlusions and noise. Guo et al. [37] tested the descriptiveness, robustness, compactness and efficiency of ten local geometric descriptors on eight datasets with radius variations, varying mesh resolution, Gaussian noise, etc. More relevant to our work is the evaluation performed by Raguram et al. [23], where RANSAC and a set of its variants were examined under different inlier ratios. Compared with [23], this paper considers both parametric and non-parametric methods as well as a variety of nuisances for a more comprehensive evaluation.
III Considered methods
Eight 2D correspondence selection algorithms, including two parametric ones, i.e., RANSAC [9] and USAC [11], and six non-parametric ones, i.e., NNSR [6], ST [12], GTM [13], VFC [8], GMS [17] and LPM [15], are considered in our evaluation. Before introducing their theories, we give some general notations for better readability.
Given two images to be matched, keypoint sets and local feature descriptor sets are computed for them as $\{\mathcal{K}_1, \mathcal{K}_2\}$ and $\{\mathcal{D}_1, \mathcal{D}_2\}$, respectively. This procedure can be accomplished using off-the-shelf detectors and descriptors, e.g., SIFT [6]. To generate the initial feature match set $\mathcal{C}$, keypoints are matched with each other based on feature similarity, i.e., a correspondence (match) $c \in \mathcal{C}$ is defined as $c = (k, k', s)$ with $k \in \mathcal{K}_1$, $k' \in \mathcal{K}_2$, and $s$ being the feature similarity score. The objective of correspondence selection is to dig out the maximum consensus (inlier) subset $\mathcal{C}^{inlier} \subseteq \mathcal{C}$. Core principles and computation steps of the evaluated algorithms are given as follows.
Nearest Neighbor Similarity Ratio [6]. NNSR directly utilizes descriptor similarities to remove less distinctive matches. Specifically, the ratio of the closest to the second-closest feature distance is used as a penalty for each correspondence. Therefore, a correspondence $c$ is judged as an inlier if
$$\frac{\|d - d'_{1st}\|}{\|d - d'_{2nd}\|} \leq t_{nnsr}, \quad (1)$$
where $\|\cdot\|$ hereinafter denotes the $L_2$ norm (this distance metric is suggested in [6]), and $d'_{1st}$ and $d'_{2nd}$ represent the most and the second most similar feature descriptors to $d$ (the descriptor of $c$ in the first image), respectively. The value of threshold $t_{nnsr}$ and the other thresholds mentioned in the following are presented in Table I.
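To make the rule concrete, the following NumPy sketch applies the ratio test of Eq. 1 on brute-force descriptor distances (the fixed ratio 0.8 is a common illustrative choice, not the adaptive threshold used in our evaluation):

```python
import numpy as np

def nnsr_filter(desc1, desc2, ratio=0.8):
    """Keep matches whose closest/second-closest distance ratio is low.

    desc1: (N, D) query descriptors; desc2: (M, D) candidate descriptors.
    Returns a list of (i, j) index pairs judged as inliers (Eq. 1).
    """
    matches = []
    for i, d in enumerate(desc1):
        # L2 distances from descriptor d to every descriptor in desc2
        dists = np.linalg.norm(desc2 - d, axis=1)
        j1, j2 = np.argsort(dists)[:2]      # closest and second closest
        if dists[j1] <= ratio * dists[j2]:  # Lowe's ratio test
            matches.append((i, j1))
    return matches
```

Ambiguous descriptors (closest and second-closest at nearly the same distance) are rejected, which is exactly the behavior the text above describes.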
Random Sample Consensus [9]. RANSAC follows a hypothesize-and-verify framework, repeating random sampling and checking procedures to maximize an objective function. For 2D correspondence selection, the desired parametric model is usually a plane homography matrix or a fundamental matrix. Taking the homography matrix as an example, at the $k$-th iteration RANSAC first randomly samples several correspondences (at least 4) from $\mathcal{C}$ and generates the model hypothesis $\mathbf{H}_k$ from those samples. Then, the hypothesis is verified via the following objective function
$$\mathbf{H}^* = \arg\max_{\mathbf{H}_k} \sum_{c \in \mathcal{C}} f(c, \mathbf{H}_k), \quad (2)$$
where $f$ is a binary function defined as
$$f(c, \mathbf{H}_k) = \begin{cases} 1, & \|k' - \mathbf{H}_k\,k\| \leq t_{ransac} \\ 0, & \text{otherwise}, \end{cases} \quad (3)$$
with $\mathbf{H}_k\,k$ being the projection of keypoint $k$ under the hypothesis and $t_{ransac}$ being a threshold that determines the accuracy of a judged inlier. The above steps are repeated $N_{iter}$ times and the model with the maximum objective function value is selected as the final model $\mathbf{H}^*$. Correspondences agreeing with $\mathbf{H}^*$ (producing values of 1 in Eq. 3) are identified as inliers.
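The hypothesize-and-verify loop of Eqs. 2-3 can be sketched as follows; for brevity this illustration fits a translation-only model (one sample per hypothesis) rather than a homography, which needs at least four samples:

```python
import numpy as np

def ransac_translation(src, dst, n_iters=200, t=3.0, seed=0):
    """Minimal hypothesize-and-verify loop in the spirit of Eqs. 2-3,
    using a translation-only model for brevity (the evaluation uses a
    homography or fundamental matrix).

    src, dst: (N, 2) matched keypoint coordinates.
    Returns (best_translation, boolean inlier mask).
    """
    rng = np.random.default_rng(seed)
    best_t, best_mask, best_score = None, None, -1
    for _ in range(n_iters):
        i = rng.integers(len(src))      # minimal sample: one match
        t_vec = dst[i] - src[i]         # model hypothesis
        resid = np.linalg.norm(dst - (src + t_vec), axis=1)
        mask = resid <= t               # binary verification (cf. Eq. 3)
        if mask.sum() > best_score:     # objective of Eq. 2: inlier count
            best_t, best_mask, best_score = t_vec, mask, mask.sum()
    return best_t, best_mask
```

Swapping the model estimator (e.g., a 4-point homography fit) and the residual function recovers the full algorithm.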
Spectral Technique [12]. ST locates the most reliable elements by matrix decomposition. It assumes that the connections among correct matches are much tighter than those among mismatches. Based on this assumption, ST first builds an adjacency matrix $\mathbf{M}$ as
$$\mathbf{M}(i,j) = \alpha(c_i, c_j), \quad (4)$$
where $\alpha(c_i, c_j)$ is the pairwise geometric affinity between $c_i$ and $c_j$. Second, the principal eigenvector $\mathbf{x}^*$ of $\mathbf{M}$ is computed using the singular value decomposition algorithm. Third, the maximum element $\mathbf{x}^*(i^*)$ of $\mathbf{x}^*$ is selected, indicating that $c_{i^*}$ is the most reliable correspondence. Fourth, $\mathbf{x}^*(i^*)$ is set to zero and the other components of $\mathbf{x}^*$ that are in conflict with $c_{i^*}$ are removed, i.e.,
$$\mathbf{x}^*(j) = 0 \ \ \text{if} \ \ \mathbf{M}(i^*, j) < t_{st}, \quad (5)$$
where $t_{st}$ is a predefined threshold. By repeating the third and fourth steps until $\mathbf{x}^*$ is empty or all-zero, the correspondences related to all elements selected from $\mathbf{x}^*$ are determined as inliers.
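The greedy spectral selection loop can be sketched as below; the conflict-removal rule (zeroing candidates whose affinity with the picked correspondence falls below a threshold) is one possible reading of Eq. 5, and the eigenvector is obtained with a symmetric eigendecomposition (SVD would serve equally):

```python
import numpy as np

def spectral_select(M, t=0.1):
    """Greedy spectral selection over an affinity matrix M (cf. Eq. 4).

    M: (N, N) symmetric non-negative affinity between correspondences.
    Picks correspondences by descending principal-eigenvector weight and
    zeroes out candidates in conflict with each pick (cf. Eq. 5).
    """
    w, v = np.linalg.eigh(M)
    x = np.abs(v[:, np.argmax(w)])     # principal eigenvector
    picked = []
    while x.max() > 0:
        i = int(np.argmax(x))          # most reliable remaining match
        picked.append(i)
        keep = M[i] >= t               # drop candidates in conflict with i
        keep[i] = False                # the pick itself is consumed
        x = np.where(keep, x, 0.0)
    return picked
```

A tight clique of mutually consistent matches dominates the eigenvector, so weakly connected outliers are never picked.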
Game Theory Matching [13]. GTM concentrates on extracting correspondences that are consistent with the majority of $\mathcal{C}$. Specifically, this strategy interprets the filtering process as a game-theoretic framework where players attempt to obtain high payoffs. At the beginning of this game, every two players extracted from a large population choose a pair of correspondences (serving as strategies in this context) from $\mathcal{C}$. They then receive a payoff linearly correlated with the coherence between these correspondences. Players who get high payoffs receive more support. In general, as the game goes on, players prefer to select more reliable correspondences to pursue higher payoffs.
Given a pair of correspondences $(c_i, c_j)$, the payoff function is defined as
$$\pi(c_i, c_j) = e^{-\lambda \max\left(\|T_i(k_j) - k'_j\|,\; \|T_j(k_i) - k'_i\|\right)}, \quad (6)$$
where $\lambda$ is a selectivity parameter, $\|\cdot\|$ represents the $L_2$ norm and $T_i$ is the similarity transformation estimated by (similarly for $T_j$)
(7)
where Eq. 7 involves the homographic transformation associated with $c_i$. Note that this algorithm particularly requires the local affine transformation cue to compute the payoff function. Next, the payoff matrix $\mathbf{\Pi}$, whose element in the $i$-th row and $j$-th column is defined as
$$\pi_{ij} = \pi(c_i, c_j), \quad (8)$$
can be generated. The population vector $\mathbf{x}$ is updated by the evolutionary stable states (ESS) algorithm [38] as
$$x_i(t+1) = x_i(t)\,\frac{(\mathbf{\Pi}\mathbf{x}(t))_i}{\mathbf{x}(t)^{\top}\mathbf{\Pi}\,\mathbf{x}(t)}, \quad (9)$$
where $x_i(t)$ represents the $i$-th element of $\mathbf{x}$ at iteration $t$. After the iterations, a correspondence $c_i$ is identified as an inlier if its corresponding $x_i$ is higher than a threshold $t_{gtm}$.
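A minimal sketch of the replicator update of Eq. 9, assuming the payoff matrix has already been built via Eqs. 6-8 (the iteration count and population threshold here are illustrative):

```python
import numpy as np

def replicator_select(Pi, n_iters=100, thresh=1e-3):
    """Evolve a population vector under the payoff matrix Pi (cf. Eq. 8)
    with the replicator update of Eq. 9; correspondences whose final
    population mass exceeds `thresh` are kept as inliers.
    """
    n = Pi.shape[0]
    x = np.full(n, 1.0 / n)          # uniform initial population
    for _ in range(n_iters):
        fitness = Pi @ x             # payoff of each strategy
        avg = x @ fitness            # average population payoff
        if avg == 0:
            break
        x = x * fitness / avg        # replicator dynamics (Eq. 9)
    return np.where(x > thresh)[0]
```

Strategies (correspondences) with below-average payoff shrink geometrically, so the population mass concentrates on the mutually consistent group.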
Universal RANSAC [11]. USAC integrates a universal framework for RANSAC, where each original step is optimized by drawing on the advantages of previous parametric approaches such as PROSAC [10], the SPRT test [39] and LO-RANSAC [29]. Further, this algorithm inserts degeneracy handling and local optimization processes after generating the minimal-sample model.
During the sampling step, USAC uses the weighted sampling algorithm of PROSAC [10], where the initial correspondences are first reordered in descending order of brute-force matching scores and correspondences with higher scores are favored. At the model checking stage (homography matrix or fundamental matrix), a correspondence is judged as an inlier by Eq. 3 with threshold $t_{usac}$ or by the equation
$$\tilde{k}'^{\top}\mathbf{F}_k\,\tilde{k} \leq t_f, \quad (10)$$
where $\mathbf{F}_k$ is the $k$-th hypothetic fundamental matrix, $t_f$ is the threshold and $\tilde{k}$ denotes the homogeneous coordinate of keypoint $k$ (similarly for $\tilde{k}'$). After generating the minimal-sample model, USAC verifies whether the model is interesting via the SPRT test [39]. The likelihood ratio after evaluating $j$ correspondences can be computed as
$$\lambda_j = \prod_{r=1}^{j} \frac{p(x_r \mid H_b)}{p(x_r \mid H_g)}, \quad (11)$$
where $H_g$ and $H_b$ respectively represent a “good” model and a “bad” model, $x_r$ is equal to 1 if the $r$-th correspondence is consistent with the generated model and 0 otherwise, $p(1 \mid H_g)$ is approximated by the inlier ratio and $p(x_r \mid H_b)$ follows a Bernoulli distribution. If $\lambda_j$ is higher than an adaptive threshold, the model will be discarded. When fitting the fundamental matrix by the epipolar geometry constraint, USAC utilizes DEGENSAC [40] to handle degeneracy. It assumes that the generated model is often incorrect for images containing a dominant scene plane. Accordingly, DEGENSAC employs a homographic transformation to reject the generated fundamental model if five or more sampled correspondences lie on the same plane. Eventually, USAC adds a local optimization step (LO-RANSAC [29]) to refine the minimal-sample model. It resamples correspondences only from the set of selected inliers and refines the previous model with the sampled subset. This whole process is repeated until confidence in the solution is achieved or the number of iterations reaches the upper bound $N_{max}$.

Vector Field Consensus [8].
VFC interpolates a vector field in which the posterior probability of a correct correspondence is estimated by the Bayes rule.
For a correspondence $c = (k, k')$, the transformation to a motion field is expressed as a sample pair $(\mathbf{x}, \mathbf{y})$, where $\mathbf{x} = k$ and $\mathbf{y} = k' - k$ is the motion vector. In this motion field, VFC assumes that the noise around inliers follows a Gaussian distribution and the noise around outliers follows a uniform distribution. Thus, the probability of the observations is a mixture model given by
$$p(\mathbf{Y} \mid \mathbf{X}; \boldsymbol{\theta}) = \prod_{n=1}^{N}\left(\frac{\gamma}{(2\pi\sigma^2)^{D/2}}\,e^{-\frac{\|\mathbf{y}_n - \mathbf{f}(\mathbf{x}_n)\|^2}{2\sigma^2}} + \frac{1-\gamma}{a}\right), \quad (12)$$
where $\boldsymbol{\theta} = \{\mathbf{f}, \sigma^2, \gamma\}$ is the set of unknown parameters, $\mathbf{f}$ is the vector field expected to be recovered, $\gamma$ is the mixing coefficient of the mixture probability model, i.e., $0 \leq \gamma \leq 1$, $\mathbf{X}$ and $\mathbf{Y}$ respectively are the sets of $\mathbf{x}_n$ and $\mathbf{y}_n$, $\sigma$ is the standard deviation of the Gaussian distribution, $1/a$ is the probability density of the uniform distribution and $D$ is the dimension of the output space. VFC employs the EM algorithm [41] to deal with the maximum likelihood estimation with latent variables. At the E-step, the diagonal element $p_n$ of a diagonal matrix $\mathbf{P}$ is computed by the Bayes rule
$$p_n = \frac{\gamma\, e^{-\frac{\|\mathbf{y}_n - \mathbf{f}(\mathbf{x}_n)\|^2}{2\sigma^2}}}{\gamma\, e^{-\frac{\|\mathbf{y}_n - \mathbf{f}(\mathbf{x}_n)\|^2}{2\sigma^2}} + \frac{(1-\gamma)\,(2\pi\sigma^2)^{D/2}}{a}}. \quad (13)$$
At the M-step, a coefficient matrix $\mathbf{C}$ is first obtained by solving the linear system
$$(\mathbf{K} + \lambda\sigma^2\mathbf{P}^{-1})\,\mathbf{C} = \mathbf{Y}, \quad (14)$$
where $\mathbf{K}$ is a matrix consisting of Gaussian kernel values and $\lambda$ is a regularization constant. Second, the vector field is estimated by
$$\mathbf{f}(\mathbf{x}) = \sum_{n=1}^{N} \mathbf{c}_n\, \kappa(\mathbf{x}, \mathbf{x}_n), \quad (15)$$
where $\mathbf{c}_n$ is the $n$-th row of $\mathbf{C}$ and $\kappa$ is the Gaussian kernel. Third, the values of $\sigma^2$ and $\gamma$ are updated by
$$\sigma^2 = \frac{\mathrm{tr}\left((\mathbf{Y}-\mathbf{V})^{\top}\mathbf{P}\,(\mathbf{Y}-\mathbf{V})\right)}{D \cdot \mathrm{tr}(\mathbf{P})} \quad (16)$$
and
$$\gamma = \frac{\mathrm{tr}(\mathbf{P})}{N}, \quad (17)$$
where $\mathbf{V} = \mathbf{K}\mathbf{C}$ is the matrix of predicted field values. The E-step and M-step are repeated until the parameters converge. Finally, the inlier set is generated as
$$\mathcal{C}^{inlier} = \{c_n : p_n > t_{vfc}\}, \quad (18)$$
where $t_{vfc}$ is a predefined threshold.
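The E-step of Eq. 13 reduces to a per-correspondence responsibility computation, sketched below under the Gaussian-plus-uniform mixture assumption of Eq. 12 (all parameter values in the usage are illustrative):

```python
import numpy as np

def vfc_estep(Y, F, sigma2, gamma, a):
    """E-step of VFC (cf. Eq. 13): posterior probability that each
    correspondence is an inlier, under a Gaussian (inlier) plus
    uniform (outlier) mixture on the residual motion vectors.

    Y: (N, D) observed motion vectors; F: (N, D) current field estimate;
    sigma2: Gaussian variance; gamma: mixing coefficient; 1/a: uniform density.
    """
    N, D = Y.shape
    sq = np.sum((Y - F) ** 2, axis=1)                    # squared residuals
    gauss = gamma * np.exp(-sq / (2 * sigma2)) \
            / (2 * np.pi * sigma2) ** (D / 2)            # inlier component
    unif = (1 - gamma) / a                               # outlier component
    return gauss / (gauss + unif)                        # Bayes rule
```

Correspondences whose motion agrees with the current field get responsibility near 1; gross outliers fall toward 0, which is exactly the diagonal of $\mathbf{P}$ used in the M-step.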
Grid-based Motion Statistics [17].
GMS shows that besides feature descriptiveness, the feature number also contributes to the quality of correspondences. It supposes that, under smooth motion, the quantity of correspondences in a small neighborhood around a true match is larger than that around a false match. For overlarge neighborhoods, regions are divided into multiple small region pairs, in which the distributions of the correspondence count are approximated by binomial distributions. Given a correspondence $c_i$, the joint statistical distribution of its neighborhood support $S_i$ is modeled as
$$S_i \sim \begin{cases} B(Kn,\, p_t), & c_i \text{ is true} \\ B(Kn,\, p_f), & c_i \text{ is false}, \end{cases} \quad (19)$$
where $n$ is the total number of correspondences in a region pair $(a, b)$ around $c_i$, $K$ is the quantity of small region pairs, $p_t$ is the probability that the nearest neighbor of each keypoint in $a$ is located in $b$ under the condition that $a$ and $b$ view the same location, and $p_f$ is the corresponding probability provided that $a$ and $b$ view different locations. $p_t$ and $p_f$ can be estimated by
(20)
and
(21)
where $\delta$ is the probability of a correspondence being correct, $m$ is the amount of keypoints in region $b$, $M$ is the size of $b$, and $\beta$ is a factor added to balance deviations caused by repeated structures. A quantitative score is next designed to evaluate the distinction between the two distributions as
(22) 
where is the mean value and is the standard deviation. This equation can be simplified as
(23) 
where the distinction is positive correlated to the number of correspondences.
In addition, to incorporate this approach into a realtime system, a fast girdbased score estimator is developed as follows. First, and are divided into nonoverlapping cells. Second, for each cell in , the cell containing the maximum amount of correspondences is grouped in . Third, in cellpair as well as its small neighborhoods (eight cellpairs), is estimated as
(24) 
where is the amount of correspondences in the cellpair . All correspondences in are judged as inliers if , where is a threshold approximated by with being a given parameter and being the average (of the nine cellpairs) amount of correspondences.
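The grid-based scoring of Eq. 24 and its threshold can be sketched as follows, assuming the per-cell-pair match counts have already been collected (the layout of `cell_counts` is an assumption for illustration):

```python
import numpy as np

def gms_scores(cell_counts, alpha=4.0):
    """Grid-based score of Eq. 24 with the threshold tau = alpha*sqrt(n_mean).

    cell_counts: (G, 9) matches per cell pair, one row per grid cell,
    columns = the cell pair itself plus its eight neighboring cell pairs.
    Returns a boolean array: True where the cell's matches are kept.
    """
    S = cell_counts.sum(axis=1)            # Eq. 24: sum over the 9 cell pairs
    n_mean = cell_counts.mean(axis=1)      # average matches per cell pair
    tau = alpha * np.sqrt(n_mean)          # threshold tau = alpha * sqrt(n)
    return S > tau
```

Because the score grows linearly with the support while the threshold grows only as its square root, well-supported cells pass easily and sparse accidental matches do not.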
Locality Preserving Matching [15]. This algorithm removes mismatches by digging out the local geometric structure consensus. With the hypothesis that the local structure around a correspondence does not change freely, a cost function is defined as
$$C(\mathcal{S};\lambda) = \sum_{c_i \in \mathcal{S}} \frac{1}{K}\Big(\sum_{j \mid k_j \in \mathcal{N}_{k_i}} d(k'_i, k'_j) + \sum_{j \mid k'_j \in \mathcal{N}_{k'_i}} d(k_i, k_j)\Big) + \lambda\big(|\mathcal{C}| - |\mathcal{S}|\big), \quad (25)$$
where $\lambda$ is a regularization parameter, $d(\cdot,\cdot)$ is the Euclidean distance between two keypoints, $\mathcal{N}_{k_i}$ and $\mathcal{N}_{k'_i}$ respectively are the sets of nearest neighbors of $k_i$ and $k'_i$, $K$ is the size of $\mathcal{N}_{k_i}$, and $\mathcal{S}$ is an inlier subset of $\mathcal{C}$. Under non-rigid transformations such as deformation, the absolute distance in Eq. 25 may not be preserved well. To address this issue, LPM converts the cost function to
$$C(\mathbf{p};\lambda) = \sum_{i=1}^{N} p_i c_i + \lambda\Big(N - \sum_{i=1}^{N} p_i\Big), \quad (26)$$
where $\mathbf{p} = (p_1, \dots, p_N)$ is a set of indicators with $p_i = 1$ indicating an inlier and $p_i = 0$ otherwise. This equation can be further reorganized by merging the items related to $p_i$ as
$$C(\mathbf{p};\lambda) = \sum_{i=1}^{N} p_i (c_i - \lambda) + \lambda N, \quad (27)$$
where the term $c_i$, defined in Eq. 28, is a constraint item measuring the local geometric structure change around $c_i$:
(28)
With the objective of minimizing the cost function, a correspondence should only be retained if its merged cost term is negative, i.e., $c_i - \lambda < 0$. For this purpose, the correct correspondence set is determined by
$$\mathcal{S}^* = \{c_i : c_i \leq \lambda,\ i = 1, \dots, N\}. \quad (29)$$
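The locality-preserving idea behind Eqs. 26-29 can be sketched with a simple neighborhood-overlap cost (an illustrative variant counting how many of a keypoint's $k$ nearest neighbors survive in its counterpart's neighborhood, not the exact cost of the paper):

```python
import numpy as np

def lpm_costs(src, dst, k=4):
    """Per-correspondence cost: the fraction of a keypoint's k nearest
    neighbors (in the first image) NOT preserved among its counterpart's
    k nearest neighbors in the second image. Low cost = locally consistent.
    """
    def knn_idx(pts):
        d = np.linalg.norm(pts[:, None] - pts[None, :], axis=2)
        np.fill_diagonal(d, np.inf)       # exclude self-distance
        return np.argsort(d, axis=1)[:, :k]
    n1, n2 = knn_idx(src), knn_idx(dst)
    common = [len(set(a) & set(b)) for a, b in zip(n1, n2)]
    return 1.0 - np.array(common) / k

def lpm_select(src, dst, k=4, lam=0.5):
    """Keep correspondences whose cost does not exceed lam (cf. Eq. 29)."""
    return np.where(lpm_costs(src, dst, k) <= lam)[0]
```

A wrong match carries its keypoint into a foreign neighborhood, so the neighbor sets stop overlapping and the cost exceeds the threshold.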
IV Experimental setup
The experimental setup is introduced in detail in this section. First, we list the implementations and parameter settings of the evaluated methods. Second, the characteristics of the four datasets, the evaluation criteria and the experimental deployment are described.
IV-A Implementations
In our experiments, the Hessian-affine detector [33] and SIFT descriptor [6] (a popular detector-descriptor combination [22]) are employed by default for image keypoint detection and description. Notice that another reason for using the Hessian-affine detector is that the evaluated GTM method requires local affine information; we also consider different detector-descriptor combinations in Sect. V-C. The initial correspondence set is generated by brute-force matching, i.e., greedy comparison of two feature sets. Parameters and implementations for each algorithm are listed in Table I.
No.  Algorithm    Implementation  Parameter settings
1    NNSR [6]     OPENCV          adaptive [42]
2    RANSAC [9]   OPENCV          10 pix; 2000
3    ST [12]      MATLAB          0.3
4    GTM [13]     OPENCV          adaptive [42]; 100; 0.0001
5    USAC [11]    OPENCV          850000; 10 pix; 1.5 pix
6    VFC [8]      OPENCV          0.1; 3; 0.75; 0.9
7    GMS [17]     OPENCV          4
8    LPM [15]     MATLAB          6; 4
Notably, for NNSR and GTM we set the corresponding thresholds adaptively using the OTSU algorithm [42] to reduce thresholding errors, as proper thresholds may vary across scenarios and even across images.
All methods are implemented in OPENCV or MATLAB and run on a PC equipped with a 3.2 GHz processor and 8 GB memory.
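Since NNSR and GTM use OTSU thresholding, a minimal histogram-based sketch of Otsu's method on a 1-D score array is given below (the bin count and the tie-breaking choice are illustrative assumptions):

```python
import numpy as np

def otsu_threshold(values, n_bins=64):
    """Otsu's method [42] on a 1-D array of scores: pick the threshold
    that maximizes the between-class variance of the induced two classes.
    """
    hist, edges = np.histogram(values, bins=n_bins)
    p = hist.astype(float) / hist.sum()          # normalized histogram
    centers = (edges[:-1] + edges[1:]) / 2
    best_t, best_var = centers[0], -1.0
    for i in range(1, n_bins):                   # candidate split points
        w0, w1 = p[:i].sum(), p[i:].sum()        # class weights
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (p[:i] * centers[:i]).sum() / w0   # class means
        mu1 = (p[i:] * centers[i:]).sum() / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            # threshold = center of the first bin of the upper class
            best_var, best_t = var_between, centers[i]
    return best_t
```

On a bimodal score distribution (distinctive vs. ambiguous matches), the returned threshold separates the two modes without any hand-tuned value.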
IV-B Datasets




We perform our experiments on four datasets, i.e., VGG [24], Symbench [25], Heinly [20], and AdelaideRMF [26]. Exemplar images from these datasets and a brief summary of their associated nuisances are shown in Fig. 2 and Table II, respectively.
Dataset           Challenges                                                                 Matching pairs
VGG [24]          Zoom, rotation, blur, viewpoint change, light change and JPEG compression  40
Symbench [25]     Light change, different rendering styles                                   46
Heinly [20]       Zoom and rotation                                                          29
AdelaideRMF [26]  Multi-structures, viewpoint change                                         38
The VGG dataset [24]. VGG is a hybrid dataset involving eight scenes. Each scene consists of six images, with the first image serving as the reference for the others. Challenges including blur, viewpoint change, zoom, rotation, light change, and JPEG compression exist in this dataset. The ground-truth is the homography matrix $\mathbf{H}_{gt}$, indicating that the transformation between two images in each scene satisfies the plane homographic constraint.
The Symbench dataset [25]. The Symbench dataset is composed of 46 image pairs. Each pair depicts the same object under light change or different rendering styles. The homographic transformation of each image pair is given as the ground-truth.
The Heinly dataset [20]. The Heinly dataset comprises images with dense or sparse viewpoint change, illumination change, and pure large-scale zoom or rotation. Considering that the nuisances of viewpoint change and illumination are covered by the other three datasets, we choose a subset of Heinly containing 29 image pairs shot on 4 scenes with the specific challenges, i.e., pure zoom or rotation, to perform a more targeted test. The ground-truth is provided as the homographic transformation.
The AdelaideRMF dataset [26]. AdelaideRMF includes 38 image pairs with viewpoint change and multi-structures. The keypoint coordinates of the initial correspondences are provided, and the ground-truth correspondences are manually labeled in this dataset.
Motivations for employing these datasets can be summarized as follows. (i) The eight scenes in the VGG dataset cover a particularly wide range of interferences such as rigid/non-rigid transformations and image quality variations. Both the generality to diverse conditions and the robustness to a specific nuisance can be assessed on this dataset. (ii) The focus of Symbench is the image quality variation caused by light change and different rendering styles, which gives rise to potential errors of feature detection and description. The performance in the context of image quality variation can thus be specifically evaluated. (iii) The subset of Heinly is selected with the aim of testing the performance under a geometric structure deformation (pure zoom or rotation). (iv) AdelaideRMF aims at evaluating the performance of correspondence selection algorithms when the plane homographic constraint fails and multiple consistent correspondence sets are involved due to multi-structures. All the above peculiarities make the evaluation benchmarks complementary to each other and allow us to identify prominent algorithms under a specific nuisance.
IV-C Criteria
The performance of the evaluated algorithms is measured via precision, recall and F-measure as in [27, 17, 15]. First, we denote the selected correspondence set, the ground-truth correspondence set and the correct subset of the selected correspondence set as $\mathcal{C}_{select}$, $\mathcal{C}_{gt}$ and $\mathcal{C}_{correct}$, respectively. Then, precision, recall and F-measure are respectively defined as
$$\text{Precision} = \frac{|\mathcal{C}_{correct}|}{|\mathcal{C}_{select}|}, \quad (30)$$
$$\text{Recall} = \frac{|\mathcal{C}_{correct}|}{|\mathcal{C}_{gt}|} \quad (31)$$
and
$$\text{F-measure} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}, \quad (32)$$
where $|\cdot|$ denotes the cardinality of a set. A correspondence $c = (k, k')$ belongs to $\mathcal{C}_{gt}$ if
$$\|k' - \mathbf{H}_{gt}\,k\| \leq t_{gt}, \quad (33)$$
where $\mathbf{H}_{gt}$ is the ground-truth homography matrix and $t_{gt}$ is a threshold (in pixels) that controls the upper bound of the accuracy of a true inlier in our experiments.
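The criteria of Eqs. 30-33 can be computed as follows (the 5-pixel default threshold is an assumption for illustration; correspondences are plain coordinate pairs):

```python
import numpy as np

def evaluate(selected, ground_truth, H_gt, t_gt=5.0):
    """Precision/recall/F-measure of Eqs. 30-32 on 2D correspondences.

    selected / ground_truth: lists of (k, kp) pairs, k and kp being 2-D
    points; H_gt: 3x3 ground-truth homography; t_gt: pixel threshold of
    Eq. 33 (the default value here is illustrative).
    """
    def is_correct(k, kp):
        k_h = H_gt @ np.array([k[0], k[1], 1.0])   # project k with H_gt
        proj = k_h[:2] / k_h[2]                    # back to inhomogeneous
        return np.linalg.norm(np.asarray(kp) - proj) <= t_gt

    correct = [m for m in selected if is_correct(*m)]
    precision = len(correct) / len(selected) if selected else 0.0
    recall = len(correct) / len(ground_truth) if ground_truth else 0.0
    f = (2 * precision * recall / (precision + recall)
         if precision + recall > 0 else 0.0)
    return precision, recall, f
```

Note that recall is computed against the ground-truth set as in Eq. 31, so an algorithm that selects few but accurate matches scores high precision yet low recall.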
IV-D Experimental deployment
Our experiments are deployed as follows. In Sect. V-A, the overall performance of the evaluated algorithms in different scenarios, i.e., on the four experimental datasets, is tested. In Sect. V-B, the performance with correspondences pre-selected by NNSR, which is commonly employed to improve the inlier ratio of initial matches [8, 23, 43, 44], is tested on the four datasets. In Sect. V-C, different detector-descriptor combinations are considered to examine the performance variation of the correspondence selection algorithms. Notice that different combinations of detector and descriptor are desired in different application contexts [33, 22] and result in different correspondence distributions and inlier ratios. In Sect. V-D, the robustness to different nuisances, i.e., blur, viewpoint change, zoom, rotation, light change, and JPEG compression, is independently examined on the VGG dataset. In Sect. V-E, we address efficiency concerns by examining the overall time cost of the algorithms on different datasets together with a speed comparison under different scales of initial matches. Finally, some representative visual results of the evaluated algorithms are shown in Sect. V-F.
V Results
Following the experimental arrangement in Sect. IV-D, this section presents the corresponding results together with necessary discussions and explanations.
V-A Performance on different datasets
In the following, we show the precision, recall and F-measure performance of the evaluated algorithms on different datasets, i.e., under different scenarios. In particular, the overall precision, recall and F-measure curves are shown in Fig. 3 for an aggregate view, and the F-measure scores for each image pair on the four datasets are shown in Fig. 4 to give a more detailed view. We mainly discuss the performance based on Fig. 3.
V-A1 Performance on the VGG dataset
Fig. 3(a) shows the outcomes on the VGG dataset. It is interesting to see that NNSR achieves the best precision performance, being marginally better than USAC, RANSAC and GMS. This result is due to the fact that the feature distinctiveness cue is rather selective on rich-textured images, e.g., the images in the VGG dataset. On the downside, feature distinctiveness is sometimes ambiguous and not a robust constraint, as evidenced by the merely mediocre recall of NNSR, which indicates that many correct correspondences have been filtered out. ST and LPM are generally inferior to the others on this dataset in terms of F-measure. That is because ST may fail to locate the main cluster in the spectral domain if the outlier ratio is large, resulting in quite poor recall. LPM achieves much better recall than ST, while its precision is surpassed by most competitors; this arises from the loose constraint employed in LPM. Overall, USAC is the best method on this dataset. The explanation is that USAC is a parametric method and the parametric model of each image pair in this dataset can be properly fitted.
V-A2 Performance on the Symbench dataset
Fig. 3(b) presents the results on the Symbench dataset. All methods suffer a clear drop in performance on this dataset when compared with the VGG dataset, which is attributed to light change and various rendering styles. More specifically, we observed that the average inlier ratio of initial correspondences on this dataset is lower than 10%. As previously explained, the feature distinctiveness constraint strongly relies on the discriminative power of the local feature descriptor. However, the rendering style variation makes it fairly challenging to maintain descriptiveness in this case. As a result, NNSR delivers very poor precision performance. Another significant difference from the VGG dataset is USAC's performance. One can see that USAC returns the most and the second most inferior precision and recall performance, respectively. That is because USAC may find empty inlier sets in some cases when its average estimated score decreases owing to the multiple constraints in this algorithm
[11]. In general, GMS and VFC are the two most well-behaved methods judging from their F-measure rankings. A common trait of these two algorithms is that both are independent of the descriptor similarity.
V-A3 Performance on the Heinly dataset
Fig. 3(c) presents the results on the Heinly dataset. Image pairs in this dataset only contain pure zoom or rotation, and we can observe that all methods obtain relatively decent performance. In terms of precision, NNSR and RANSAC clearly outperform the others. Regarding recall, LPM and RANSAC are the two best. Note that the reason for the high recall of LPM is that most inliers are selected by the loose constraint designed in this algorithm. As for NNSR and RANSAC, the former is attributed to the high distinctiveness of SIFT (we will see its performance variation with less distinctive descriptors in Sect. V-C), whereas the latter is owing to the powerful homography fitting ability of RANSAC. GMS, due to its sensitivity to large degrees of rotation [17], shows worse results compared to its performance on the VGG and Symbench datasets.
V-A4 Performance on the AdelaideRMF dataset
Fig. 3(d) presents the results on the AdelaideRMF dataset. Two remarks should be made on this dataset. First, as only manually labeled ground-truth correspondences are available, we present the exact scores rather than curves with respect to matching tolerance for each method. Second, the keypoints in this dataset are not located by feature detectors; rather, they were labeled manually. Thus, GTM, which requires local affine information, and NNSR, which is based on auto-detected keypoints, are not assessed on this dataset. Since each scene in this dataset contains multiple planes, the fundamental matrix based on the epipolar geometry constraint is employed to approximate the parametric model for RANSAC and USAC. By observing the scores in Fig. 3(d), one can see that GMS, LPM and VFC achieve the best precision, recall and F-measure performance, respectively. All three methods are nonparametric. This is reasonable since AdelaideRMF contains multi-structures, and the parametric assumption of methods like RANSAC and USAC fails in this case.
V-A5 Overall performance
By weighing up the results presented in Fig. 3 and Fig. 4, we can draw the following conclusions. First, the performance of all correspondence selection algorithms is affected by the initial inlier ratio. For instance, the performance of all algorithms deteriorates dramatically on the Symbench dataset with less than 10% inliers. Second, NNSR, which relies solely on feature distinctiveness, produces satisfactory results if images are well-textured and clean. Third, parametric approaches, i.e., RANSAC and USAC, prefer contexts where the transformation between two images can be well fitted by a parametric model, while nonparametric algorithms perform better in situations without large degrees of rigid/nonrigid transformation. Overall, VFC and RANSAC are the two best algorithms in the cross-dataset experiments.
V-B Performance on selected matches
Many existing works [8, 23, 43, 44] first prune false correspondences via NNSR and then use parametric or nonparametric methods for further selection. This experiment therefore examines this scenario. Remarkably, since NNSR fails to work on the AdelaideRMF dataset, this dataset is not considered in this test. Fig. 5 shows the difference between correspondences before and after applying NNSR, and the results of using NNSR-selected correspondences for selection are shown in Fig. 6.
On the VGG dataset, shown in Fig. 6(a), one can see that the performance of all methods has been improved using NNSR-selected matches compared to brute-force matches in Fig. 3(a). Particularly, USAC manages to be the best method regarding precision, recall and F-measure. Also, the gaps between most curves, excluding that of ST, are relatively small. On the Symbench and Heinly datasets, GMS and LPM respectively achieve the best overall performance, where LPM even produces an extremely high F-measure score, i.e., 97.27%, on the Heinly dataset. We can infer that LPM adapts well to initial correspondence sets with a high inlier ratio.
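For reference, the NNSR pre-selection used throughout is essentially Lowe's ratio test: keep a putative match only when the nearest descriptor is markedly closer than the second nearest. A minimal NumPy sketch might look as follows; the 0.8 ratio and the function name are illustrative assumptions, not the paper's exact setting:

```python
import numpy as np

def nnsr_select(desc1, desc2, ratio=0.8):
    """Nearest-neighbor similarity ratio (Lowe's ratio test) sketch.

    Keeps a putative match (i, j) only when the best-match distance is
    below `ratio` times the second-best distance. The 0.8 threshold is a
    common default, not necessarily the paper's setting.
    """
    # pairwise L2 distances between the two descriptor sets
    d = np.linalg.norm(desc1[:, None, :] - desc2[None, :, :], axis=2)
    order = np.argsort(d, axis=1)
    best, second = order[:, 0], order[:, 1]
    rows = np.arange(len(desc1))
    keep = d[rows, best] < ratio * d[rows, second]
    return [(i, int(best[i])) for i in rows if keep[i]]
```

Descriptors whose two nearest neighbors are nearly equidistant (ambiguous matches) are discarded, which raises precision but, as the experiments show, can also remove correct correspondences.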
V-C Performance under different detectors and descriptors
Combination    Dataset     NNSR   RANSAC  ST     USAC   VFC    GMS    LPM
SIFT + SIFT    Symbench     9.34    3.02   1.22   3.27  11.56  11.64   9.36
               Heinly      94.13   95.43  33.82  98.75  83.08  40.29  89.84
ORB + ORB      Symbench     4.70    5.27   2.13   3.33   3.00  11.62   6.31
               Heinly      57.62   58.98  17.57  56.45  56.30  50.24  60.30
ASIFT + ASIFT  Symbench     7.00    7.15   3.29   4.54  14.42  17.48  12.54
               Heinly      69.31   92.31  27.47  78.72  78.75  44.21  88.21
BLOB + FREAK   Symbench     4.62    2.35   0.95   1.97   6.25   0.50   2.19
               Heinly      68.63   76.25  20.32  74.45  68.71   4.30  66.64
In addition to Hessian-affine + SIFT, we also consider four other popular detector-descriptor combinations, i.e., SIFT + SIFT [6], ORB + ORB [7], ASIFT + ASIFT [45], and BLOB [46] + FREAK [47]. Fig. 7 shows the initial correspondences with these combinations on a sample image pair. Note that GTM is excluded from this test as it requires local affine information, which these detectors do not provide. Also, the AdelaideRMF dataset is not considered due to its human-labeled keypoints. The results are reported in Fig. 8 and Table III.
A common characteristic of these results is that the best correspondence selection algorithm generally varies with the combination of detector and descriptor. Still, we can find some consistencies, e.g., VFC achieves satisfactory performance on the VGG dataset regardless of the detector-descriptor combination. The performance of some methods fluctuates dramatically; for example, NNSR ranks first with SIFT + SIFT but performs poorly using ASIFT + ASIFT on the VGG dataset. On the Symbench and Heinly datasets, GMS and RANSAC are two prominent methods under different kinds of detector-descriptor combinations.
V-D Robustness
In this section, we independently evaluate the robustness of these algorithms to a specific nuisance, i.e., zoom, rotation, blur, viewpoint change, light change and JPEG compression, on the VGG dataset. Some exemplar images with different nuisances are exhibited in Fig. 9. The results are shown in Table IV.
Under zoom and rotation (case 1 and case 3), USAC and RANSAC, i.e., the two parametric methods, perform the best in terms of F-measure, mainly because zoom and rotation have little impact on homography fitting. Under blur (case 2 and case 6), GMS and NNSR outperform the others. GMS is independent of the feature similarity constraint, which explains its robustness here; NNSR's behavior is also explicable, as SIFT is very robust to blur. Regarding viewpoint change (case 4 and case 8), USAC and VFC are the best methods. Note that VFC generally delivers good performance under all kinds of nuisances, benefiting from its consensus search in a nonparametric field. USAC also achieves the best performance under light change (case 5) and JPEG compression (case 7), making it the method robust to the broadest categories of nuisances.
                           NNSR   RANSAC  ST     GTM    USAC   VFC    GMS    LPM
Case 1 (zoom and rotation)
  Precision                81.16  76.11  17.98  43.22  77.38  67.19  63.61  42.50
  Recall                   77.68  92.86   4.51  79.60  99.05  86.11  11.45  83.54
  F-measure                77.35  82.42   6.56  53.69  84.48  74.27  18.46  54.75
Case 2 (blur)
  Precision                74.57  36.87  44.23  67.00  49.66  29.44  41.71  27.73
  Recall                   79.39  41.85   8.23  56.74  60.00  51.41  50.45  54.75
  F-measure                71.87  38.71  13.46  61.12  54.30  35.27  45.54  35.86
Case 3 (zoom and rotation)
  Precision                61.53  70.54  15.97  44.92  67.41  49.38  58.57  44.59
  Recall                   57.91  83.28   1.97  52.16  79.95  99.22  57.21  76.43
  F-measure                53.74  74.81   3.50  44.83  73.12  61.91  56.35  55.10
Case 4 (viewpoint change)
  Precision                51.77  55.58  37.21  50.94  63.01  57.86  57.05  45.08
  Recall                   61.63  66.38   3.52  68.55  79.73  97.08  75.52  83.97
  F-measure                51.75  58.56   6.41  55.69  70.23  71.23  64.55  56.56
Case 5 (light change)
  Precision                76.28  81.44  61.90  68.90  83.76  71.99  64.89  57.65
  Recall                   63.75  86.97   6.76  80.35  100    100    87.95  84.46
  F-measure                68.00  82.34  11.61  73.94  91.11  82.49  74.37  67.90
Case 6 (blur)
  Precision                31.90  45.33  24.95  33.45  32.23  31.18  57.10  26.72
  Recall                   69.13  27.06   2.57  39.10  40.00  40.00  47.00  66.81
  F-measure                31.49  28.86   4.34  34.29  35.68  35.02  50.80  35.82
Case 7 (JPEG compression)
  Precision                89.47  87.07  89.46  80.66  89.59  89.48  79.87  75.87
  Recall                   61.17  97.41  28.59  94.42  100    100    96.70  93.38
  F-measure                72.42  91.81  43.07  86.88  94.43  94.26  87.25  83.43
Case 8 (viewpoint change)
  Precision                67.27  74.42  52.34  72.05  73.03  72.40  80.86  62.67
  Recall                   61.08  79.03   4.02  80.12  80.00  79.51  73.10  81.76
  F-measure                58.39  76.23   7.36  73.42  76.33  75.74  76.08  69.64
V-E Efficiency
To provide an overview of the evaluated methods considering both selection performance and efficiency, we present efficiency vs. F-measure plots on the four experimental datasets in Fig. 10. Owing to its fast execution speed and overall decent performance, GMS strikes a good balance between selection performance and efficiency.
To further test each algorithm's efficiency with respect to different numbers of initial correspondences, since this number may vary across applications and feature detectors, we vary the amount of initial correspondences over a wide range and record the average speed of the eight methods. This experiment has been repeated for 10 rounds and average statistics are retained. Because the codes of these algorithms are implemented either in OpenCV (C++) or MATLAB, we assess methods within the same platform independently. In addition, the VFC method is evaluated on both platforms and can serve as a reference for comparing cross-platform methods. Results are reported in Fig. 11.
For methods implemented in OpenCV, GMS is by far the most efficient, because it employs a grid framework for fast scoring. NNSR ranks second, as only a sorting operation is needed to rank correspondences. RANSAC is slightly slower than USAC, and the core time consumption of both methods is devoted to hypothesis generation and verification. GTM, whose computational complexity grows rapidly with the number of input correspondences, is significantly slower than the other five methods, and the margin widens as the number of correspondences increases. For methods implemented in MATLAB, LPM is very efficient as it relies on a simple yet effective strategy of preserving the local neighborhood structure. ST is the most inefficient method, being slower than the others by orders of magnitude with dense correspondences. This is because the time consumption for computing eigenvalues grows rapidly (roughly cubically for dense matrices) with the size of the affinity matrix.
V-F Visual results
To obtain a qualitative sense of the outputs of the evaluated algorithms, we present several visual results on the four experimental datasets in Fig. 12.
Two main observations can be made from the figure. First, the distributions of correspondences selected by different algorithms are generally different from each other. For instance, few correspondences are found by GTM on the bread in Fig. 12(d), whereas NNSR and LPM find plenty there. Second, the quantity of selected correspondences also varies across methods. In particular, LPM manages to return dense correspondences on most datasets, while ST seeks out far fewer than the others.
VI Summary and discussion
Scenarios               Case/Dataset        Superior methods     Inferior methods
Datasets                VGG                 USAC, RANSAC, VFC    ST
                        Symbench            GMS                  ST, USAC
                        Heinly              RANSAC, NNSR, LPM    ST, GTM, GMS
                        AdelaideRMF         VFC, LPM             ST, RANSAC
NNSR pre-selection      VGG                 USAC, RANSAC, LPM    ST
                        Symbench            GMS, VFC             ST
                        Heinly              LPM, RANSAC, VFC     ST, GMS
Det./Des. combinations  SIFT+SIFT           USAC, NNSR, GMS      ST, RANSAC
                        ORB+ORB             LPM, GMS, USAC       ST, VFC
                        ASIFT+ASIFT         VFC, RANSAC, GMS     ST, USAC, NNSR
                        BLOB+FREAK          VFC, NNSR, RANSAC    ST, GMS, USAC
Robustness              Zoom and rotation   USAC, RANSAC         ST, GTM, GMS
                        Blur                NNSR, GMS            ST, RANSAC
                        Viewpoint change    USAC, VFC            ST, NNSR
                        Light change        USAC, VFC            ST
                        JPEG compression    USAC, VFC            ST, NNSR
Efficiency                                  GMS, NNSR            ST, GTM
To provide quick guidance for developers on choosing a proper algorithm in a specific case, we list the superior and inferior correspondence selection algorithms in Table V. Also, the characteristics inherent to each evaluated algorithm are summarized as follows:

NNSR is arguably the most straightforward strategy for selecting correspondences. Its key strength is that ambiguous matches caused by repetitive patterns can be removed reliably in certain circumstances, provided that the employed feature detector can locate keypoints accurately and the descriptor possesses strong discriminative power, e.g., SIFT. Also, its high execution speed makes it suitable for real-time or near real-time systems. However, the limitation of NNSR is obvious because of the simple descriptor similarity constraint: it is vulnerable when image quality is low (e.g., under light change, blur, exposure changes, and style transfer) and texture information is limited.

RANSAC and USAC, i.e., the two evaluated parametric approaches, can effectively fit parametric models, including the homography and fundamental matrices, between two images, with the premise that the image pair obeys a homography or epipolar geometry constraint. They are therefore the preferred options in such circumstances. Nevertheless, this assumption also brings drawbacks: when nonrigid objects are captured, when images exhibit large parallax, or when there is pure rotation between two camera positions, RANSAC and USAC fail. Further, with high outlier ratios, reliable models may not be generated within a limited number of iterations, giving rise to expensive time costs. For RANSAC, models estimated from minimal samples sometimes fall into local optima. Although USAC improves over RANSAC, it does not guarantee convergence and may produce an empty inlier set due to its strict constraints.

ST and GTM are methods relying on an affinity matrix computed from the initial matches. These two methods are relatively time-consuming, especially ST. The performance of GTM is much better than that of ST, mainly because GTM employs local affine information to judge the compatibility of two correspondences, while ST is based on a rigidity constraint. ST, when given high-quality correspondences, is able to achieve high precision (as verified in Sect. V-B). These two methods are options for offline applications that desire high precision and have high-quality input.
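The core idea of ST can be sketched as follows: build an affinity matrix that rewards pairs of correspondences preserving inter-keypoint distances (a rigidity cue), then read the main cluster off the principal eigenvector. This is a simplified illustration of the spectral technique [12] with illustrative parameters; note the dense n-by-n matrix and eigendecomposition that make ST costly for large inputs:

```python
import numpy as np

def spectral_select(src, dst, sigma=2.0, thresh=0.3):
    """Simplified sketch of spectral correspondence selection.

    Pairwise affinity rewards pairs of correspondences that preserve the
    distance between their keypoints across the two images; the main
    cluster is read off the principal eigenvector. `sigma` and `thresh`
    are illustrative parameters, not the original algorithm's settings.
    """
    n = len(src)
    M = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d1 = np.linalg.norm(src[i] - src[j])   # distance in image 1
            d2 = np.linalg.norm(dst[i] - dst[j])   # distance in image 2
            M[i, j] = M[j, i] = np.exp(-(d1 - d2) ** 2 / sigma ** 2)
    w, v = np.linalg.eigh(M)                       # dense eigendecomposition
    principal = np.abs(v[:, -1])                   # eigenvector of largest eigenvalue
    return principal > thresh * principal.max()
```

The original method uses a greedy discretization of the eigenvector rather than a fixed threshold, but the sketch shows both why outliers receive small eigenvector components and where the cubic eigendecomposition cost comes from.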

LPM rejects outliers via local structure consistency. The constraint in LPM is relatively loose, resulting in high recall yet relatively low precision. LPM prefers scenarios where the geometric structure is well preserved between the same local patterns in the image pair, e.g., small degrees of rigid transformation. Similar to NNSR, it relies strongly on the discriminative power of the feature descriptor; in other words, retrieving the local consistency can be problematic if the local region contains too few inliers. We therefore suggest choosing LPM in contexts with well-preserved geometric structures and a need for dense correspondences.

VFC, as revealed by our experiments, is the most robust method across all tested scenarios. This is attributed to the fact that VFC is independent of both feature similarity and parametric models; specifically, it performs inlier selection in a vector field. VFC generalizes well to different application contexts and can cope with various kinds of nuisances, especially viewpoint change, light change and JPEG compression.

GMS, similar to VFC, is also independent of feature similarity and parametric models. However, it assumes that the motion between two images is smooth. Accordingly, it behaves unsatisfactorily on image pairs undergoing large degrees of rotation. If the motion smoothness assumption holds, however, its performance is superior even for correspondence sets with a very limited number of inliers, e.g., correspondences generated from the Symbench dataset. Another attractive merit of GMS is its ultra-fast execution speed even with several thousand initial correspondences, making it a preferred choice for real-time applications.
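The motion-smoothness cue behind GMS can be illustrated with a toy brute-force version (GMS itself achieves its speed with a grid-based statistic; the radius and support thresholds below are illustrative assumptions, not GMS's parameters):

```python
import numpy as np

def motion_support(matches, radius=20.0, min_support=3):
    """Toy illustration of the motion-smoothness cue behind GMS.

    `matches` is an (N, 4) array of (x, y, x', y'); a match is kept if at
    least `min_support` nearby matches exhibit a similar motion vector.
    GMS replaces this O(N^2) loop with fast grid-cell counting.
    """
    src, dst = matches[:, :2], matches[:, 2:4]
    motion = dst - src                             # per-match motion vector
    keep = np.zeros(len(matches), dtype=bool)
    for i in range(len(matches)):
        near = np.linalg.norm(src - src[i], axis=1) < radius
        similar = np.linalg.norm(motion - motion[i], axis=1) < radius / 2
        keep[i] = np.count_nonzero(near & similar) - 1 >= min_support
    return keep
```

Under large rotation, neighboring motion vectors differ in direction even for true inliers, which is exactly why this cue, and GMS, degrades in that setting.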
VII Conclusions
This paper has comprehensively evaluated eight state-of-the-art image correspondence selection algorithms, covering both the parametric and nonparametric families. The experiments addressed several critical issues regarding correspondence selection, e.g., different application scenarios (datasets), inputs from different combinations of feature detector and descriptor, robustness under various challenging conditions including zoom, rotation, blur, viewpoint change, JPEG compression, light change, different rendering styles and multi-structures, and efficiency. Advantages and limitations, in light of the experimental outcomes, are summarized so as to guide developers in choosing a proper algorithm for a specific scenario.
Remarkably, the performance of most existing algorithms changes dramatically across scenarios, and most methods fail to achieve satisfactory results when the inlier ratio of the initial correspondence set is low. We therefore believe that research should move towards correspondence selection algorithms with good generality and robustness to low inlier ratios.
Acknowledgment
We are deeply grateful to the authors of the evaluated algorithms and datasets for making their contributions publicly available. This work is supported by the National High Technology Research and Development Program of China (863 Program) under Grant 2015AA015904.
References
 [1] N. Snavely, S. M. Seitz, and R. Szeliski, “Modeling the world from internet photo collections,” International Journal of Computer Vision, vol. 80, no. 2, pp. 189–210, 2008.
 [2] S. Benhimane and E. Malis, “Realtime imagebased tracking of planes using efficient secondorder minimization,” in Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems, vol. 1, 2004, pp. 943–948.

 [3] S. Hare, A. Saffari, and P. H. Torr, “Efficient online structured output learning for keypoint-based object tracking,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2012, pp. 1894–1901.
 [4] M. Brown and D. G. Lowe, “Automatic panoramic image stitching using invariant features,” International Journal of Computer Vision, vol. 74, no. 1, pp. 59–73, 2007.
 [5] D. G. Lowe, “Object recognition from local scaleinvariant features,” in Proceedings of the IEEE International Conference on Computer Vision, vol. 2. IEEE, 1999, pp. 1150–1157.
 [6] ——, “Distinctive image features from scaleinvariant keypoints,” International Journal of Computer Vision, vol. 60, no. 2, pp. 91–110, 2004.
 [7] E. Rublee, V. Rabaud, K. Konolige, and G. Bradski, “Orb: An efficient alternative to sift or surf,” in Proceedings of the IEEE International Conference on Computer Vision, 2011, pp. 2564–2571.
 [8] J. Ma, J. Zhao, J. Tian, A. L. Yuille, and Z. Tu, “Robust point matching via vector field consensus,” IEEE Transactions on Image Processing, vol. 23, no. 4, pp. 1706–1721, 2014.
 [9] M. A. Fischler and R. C. Bolles, “Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography,” Communications of the ACM, vol. 24, no. 6, pp. 381–395, 1981.
 [10] O. Chum and J. Matas, “Matching with prosac - progressive sample consensus,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2005, pp. 220–226.
 [11] R. Raguram, O. Chum, M. Pollefeys, J. Matas, and J. M. Frahm, “Usac: A universal framework for random sample consensus,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 8, pp. 2022–2038, 2013.
 [12] M. Leordeanu and M. Hebert, “A spectral technique for correspondence problems using pairwise constraints,” in Proceedings of the IEEE International Conference on Computer Vision, 2005, pp. 1482–1489.
 [13] A. Albarelli, E. Rodolà, and A. Torsello, “Imposing semilocal geometric constraints for accurate correspondences selection in structure from motion: A gametheoretic perspective,” International Journal of Computer Vision, vol. 97, no. 1, pp. 36–53, 2012.
 [14] T. Collins, P. Mesejo, and A. Bartoli, An analysis of errors in graphbased keypoint matching and proposed solutions. Springer International Publishing, 2014.
 [15] J. Ma, J. Zhao, H. Guo, J. Jiang, H. Zhou, and Y. Gao, “Locality preserving matching,” in Proceedings of the International Joint Conference on Artificial Intelligence, 2017, pp. 4492–4498.
 [16] X. Li and Z. Hu, “Rejecting mismatches by correspondence function,” International Journal of Computer Vision, vol. 89, no. 1, pp. 1–17, 2010.
 [17] J. Bian, W.Y. Lin, Y. Matsushita, S.K. Yeung, T. D. Nguyen, and M.M. Cheng, “Gms: Gridbased motion statistics for fast, ultrarobust feature correspondence,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017.
 [18] W. Y. Lin, F. Wang, M. M. Cheng, S. K. Yeung, P. H. S. Torr, M. N. Do, and J. Lu, “Code: Coherence based decision boundaries for feature correspondence,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. PP, no. 99, pp. 1–1, 2017.
 [19] K. Mikolajczyk and C. Schmid, “A performance evaluation of local descriptors,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 10, pp. 1615–1630, 2005.
 [20] J. Heinly, E. Dunn, and J. M. Frahm, “Comparative evaluation of binary features,” in Proceedings of the European Conference on Computer Vision, 2012, pp. 759–773.
 [21] H. Aanæs, A. Dahl, and K. Steenstrup Pedersen, “Interesting interest points: A comparative study of interest point performance on a unique data set,” International Journal of Computer Vision, vol. 97, no. 1, pp. 18–35, 2012.
 [22] P. Moreels and P. Perona, “Evaluation of features detectors and descriptors based on 3d objects,” International Journal of Computer Vision, vol. 73, no. 3, pp. 263–284, 2007.
 [23] R. Raguram, J. M. Frahm, and M. Pollefeys, “A comparative analysis of ransac techniques leading to adaptive realtime random sample consensus,” in Proceedings of the European Conference on Computer Vision, 2008, pp. 500–513.
 [24] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool, “A comparison of affine region detectors,” International Journal of Computer Vision, vol. 65, no. 12, pp. 43–72, 2005.
 [25] N. Snavely and D. C. Hauagge, “Image matching using local symmetry features,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2012, pp. 206–213.
 [26] H. S. Wong, T. J. Chin, J. Yu, and D. Suter, “Dynamic and hierarchical multistructure geometric model fitting,” in Proceedings of the IEEE International Conference on Computer Vision, 2011, pp. 1044–1051.
 [27] W.Y. D. Lin, M.M. Cheng, J. Lu, H. Yang, M. N. Do, and P. Torr, “Bilateral functions for global motion modeling,” in Proceedings of the European Conference on Computer Vision. Springer, 2014, pp. 341–356.
 [28] P. H. S. Torr and A. Zisserman, “Mlesac: A new robust estimator with application to estimating image geometry,” Computer Vision and Image Understanding, vol. 78, no. 1, pp. 138–156, 2000.
 [29] O. Chum, J. Matas, and J. Kittler, “Locally optimized ransac,” Pattern Recognition, pp. 236–243, 2003.
 [30] M. Cho, J. Lee, and K. M. Lee, “Feature correspondence and deformable object matching via agglomerative correspondence clustering,” in Proceedings of the IEEE International Conference on Computer Vision. IEEE, 2009, pp. 1280–1287.
 [31] T.J. Chin, J. Yu, and D. Suter, “Accelerated hypothesis generation for multistructure robust fitting,” Proceedings of the European Conference on Computer Vision, pp. 533–546, 2010.
 [32] H. Y. Chen, Y. Y. Lin, and B. Y. Chen, “Robust feature matching with alternate hough and inverted hough transforms,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2013, pp. 2762–2769.
 [33] K. Mikolajczy and C. Schmid, “Scale & affine invariant interest point detectors,” International Journal of Computer Vision, vol. 60, no. 1, pp. 63–86, 2004.
 [34] J. Kim and K. Grauman, “Boundary preserving dense local regions,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2011, pp. 1553–1560.
 [35] M. Cho, J. Lee, and K. M. Lee, “Reweighted random walks for graph matching,” in Proceedings of the European Conference on Computer Vision, 2010, pp. 492–505.
 [36] F. Tombari, S. Salti, and L. Di Stefano, “Performance evaluation of 3d keypoint detectors,” International Journal of Computer Vision, vol. 102, no. 13, pp. 198–220, 2013.
 [37] Y. Guo, M. Bennamoun, F. Sohel, M. Lu, J. Wan, and N. M. Kwok, “A comprehensive performance evaluation of 3d local feature descriptors,” International Journal of Computer Vision, vol. 116, no. 1, pp. 66–89, 2016.
 [38] J. W. Weibull, Evolutionary game theory. MIT press, 1997.
 [39] J. Matas and O. Chum, “Randomized ransac with sequential probability ratio test,” in Proceedings of the IEEE International Conference on Computer Vision, vol. 2. IEEE, 2005, pp. 1727–1732.
 [40] O. Chum, T. Werner, and J. Matas, “Twoview geometry estimation unaffected by a dominant plane,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2005, pp. 772–779 vol. 1.
 [41] A. P. Dempster, N. M. Laird, and D. B. Rubin, “Maximum likelihood from incomplete data via the em algorithm,” Journal of the Royal Statistical Society, vol. 39, no. 1, pp. 1–38, 1977.
 [42] N. Otsu, “A threshold selection method from graylevel histograms,” IEEE Transactions on Systems, Man, and Cybernetics, vol. 9, no. 1, pp. 62–66, 1979.
 [43] J. Yang, Z. Cao, and Q. Zhang, “A fast and robust local descriptor for 3d point cloud registration,” Information Sciences, vol. 346, pp. 163–179, 2016.
 [44] J. Yang, Q. Zhang, and Z. Cao, “Multiattribute statistics histograms for accurate and robust pairwise registration of range images,” Neurocomputing, vol. 251, pp. 54–67, 2017.

 [45] J. M. Morel and G. Yu, “Asift: A new framework for fully affine invariant image comparison,” SIAM Journal on Imaging Sciences, vol. 2, no. 2, pp. 438–469, 2009.
 [46] T. Lindeberg, “Feature detection with automatic scale selection,” International Journal of Computer Vision, vol. 30, no. 2, pp. 79–116, 1998.
 [47] A. Alahi, R. Ortiz, and P. Vandergheynst, “Freak: Fast retina keypoint,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2012, pp. 510–517.