I Introduction
Finding feature correspondence between two or more feature sets [1] is a fundamental problem in computer vision and pattern recognition, which also relates to operations research as it involves solving assignment problems. As a building block, it facilitates various problems, e.g. 3D reconstruction [35], CAD [44, 43], visual tracking [29], and common object discovery [18], among many other applications.
Most existing methods for feature correspondence are devoted to the pairwise case, i.e. matching two feature sets at a time [19]. Nevertheless, real-world problems far more commonly involve a batch of feature sets.
This paper focuses on the following problem: given multiple (more than two) feature sets, how to find their one-to-one correspondences based on the unary node-wise attributes, without considering the higher-order compatibility between nodes. For the classical two-set setting, the problem can usually be transformed into a linear assignment problem whose global optimum can be found in polynomial time using the Hungarian method [25]. This paper instead explores the setting where a collection of feature sets is involved in feature matching.
II Related Work
We mention a few relevant concepts about the correspondence problem, mainly in the context of computer vision.
II-A Point registration and feature matching
For point registration, a parametric transformation is often assumed [12, 14], which serves as a regularization or prior term in the objective to account for the motion in an image sequence. The registration problem usually involves two ‘chicken-and-egg’ steps: i) finding the correspondence between two point sets; ii) estimating the parameters of the transformation based on the correspondence. For instance, starting from an initial correspondence, the iterative closest point (ICP) methods [9, 10] iterate between finding the correspondence via nearest neighbor search and updating the transformation by least squares. Many other methods, such as Robust Point Matching (RPM) [12] and Coherent Point Drift (CPD) [38], are designed to perform pairwise registration. In contrast, multi-view methods, e.g. [14], aim to find the correspondence and the geometric transformation over a batch of point sets.
Similar to the problem considered in this paper, there are several methods that only explore the unary feature attributes associated with the nodes, without imposing geometrical priors. Two examples are [2, 3]. The former formulates feature matching among multiple sets as a low-rank and sparse matrix decomposition task; the latter proposes an optimization algorithm to find consistent correspondences over feature sets.
II-B Graph matching
In contrast to point registration, no parametric transformation is imposed in graph matching [4, 6], thus we call it a non-parametric model. Moreover, compared with feature matching, graph matching additionally incorporates edge attributes, which can encode second-order [30] or even higher-order [22, 27] geometrical information for matching. This lifts the order of the matching problem and leads to a quadratic assignment programming formulation [21, 32], which is known to be NP-hard in general; only a few special cases can be solved in polynomial time [15], such as planar graphs [7], bounded valence graphs [39], and tree structures [40]. Note that there are several state-of-the-art multiple graph matching methods [23, 24, 33, 31, 37, 36] for which the multiple feature set matching problem can be viewed as a special case obtained by ignoring the edge attributes. However, as will be shown later in this paper, our method specifically leverages the special structure of the unary feature problem, which enables the proposed clustering [45] based approach. These methods will also be compared in our tests.
Note that incorporating higher-order information or a transformation prior, as done in graph matching or point registration, may be beneficial in certain practical tasks, yet this is a parallel problem, as those methods use different assumptions and information.
II-C Matching cost modeling
It is theoretically proved in [13] that the graph edit cost is critical to graph edit distance based methods. In complex tasks, setting the cost manually is difficult, or even impossible. To address this issue, [42] learns the edit cost within a probabilistic framework, so as to reduce the intra-class edit distance and increase the inter-class one. Alternatively, [41] proposes to use self-organizing maps to learn the edit cost. For graph matching, recent work leverages various machine learning algorithms to compute the optimal affinity matrix [11, 5]; these methods in general fall into supervised [5], unsupervised [11], or semi-supervised [17] learning paradigms, depending on the extent to which supervision information is used. Cost modeling is orthogonal to this paper, and we assume the cost is given.
II-D Correspondence consistency over multiple node sets
Matching consistency across multiple node sets is a widely recognized concept, e.g. [23, 31]. The key idea is illustrated in Fig.1. Consider the set of pairwise correspondences computed independently by a two-set matching solver over all pairs of feature sets, which we call the matching configuration in line with [31]. Each pairwise correspondence is a node-correspondence permutation matrix between two feature sets: an element of this matrix is 1 if the node of the first feature set indexed by its row corresponds to the node of the second feature set indexed by its column, and 0 otherwise. In such a configuration, inconsistency often exists across different transitive paths of node correspondence. We now introduce two definitions concerning consistency as presented in [31, 24] (slightly rewritten to better fit our problem).
Definition 1.
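Given the feature sets and the pairwise matching configuration, the unary consistency of a feature set measures, in terms of the Frobenius norm, how well each pairwise matching agrees with the matching obtained by composing the two matchings that pass through that feature set.
Definition 2.
Given the feature sets and the matching configuration, for any pair of feature sets, the pairwise consistency of their matching measures how well that matching agrees with the matchings composed through each of the feature sets.
Both measures are normalized using the number of feature points in each feature set. In the next section, we present our method, in which the above two definitions are used.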
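For concreteness, the two measures can be sketched in the general form used in [31]. The notation below is an assumption introduced here for illustration: $N$ feature sets, $n$ feature points per set, $\mathbf{X}_{ij}$ for the permutation matrix between sets $i$ and $j$, and $\mathbb{X}$ for the matching configuration. The exact normalization adopted in this paper may differ slightly.

```latex
% Assumed notation, following the general form of the definitions in [31]
C_u(k, \mathbb{X}) = 1 - \frac{\sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \left\| \mathbf{X}_{ij} - \mathbf{X}_{ik}\mathbf{X}_{kj} \right\|_F / 2}{n N (N-1) / 2} \in (0, 1]

C_p(\mathbf{X}_{ij}, \mathbb{X}) = 1 - \frac{\sum_{k=1}^{N} \left\| \mathbf{X}_{ij} - \mathbf{X}_{ik}\mathbf{X}_{kj} \right\|_F / 2}{n N} \in (0, 1]
```

Under this reading, a fully consistent configuration, where $\mathbf{X}_{ij} = \mathbf{X}_{ik}\mathbf{X}_{kj}$ for all $i, j, k$, attains the maximum value of 1 for both measures.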
III Proposed Method
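For two-set feature matching between feature sets of $n_1$ and $n_2$ nodes, taking only the unary attributes into consideration, the matching problem can be formulated as a linear assignment problem. Assuming $n_1 \le n_2$ without loss of generality:
$$\min_{\mathbf{X}} \sum_{i=1}^{n_1} \sum_{j=1}^{n_2} c_{ij} x_{ij} \quad \text{s.t.} \quad \sum_{j=1}^{n_2} x_{ij} = 1 \;\; \forall i, \quad \sum_{i=1}^{n_1} x_{ij} \le 1 \;\; \forall j, \quad x_{ij} \in \{0, 1\} \tag{1}$$
Or in a more compact form:
$$\min_{\mathbf{X}} \operatorname{vec}(\mathbf{C})^{\top} \operatorname{vec}(\mathbf{X}) \quad \text{s.t.} \quad \mathbf{X}\mathbf{1}_{n_2} = \mathbf{1}_{n_1}, \quad \mathbf{X}^{\top}\mathbf{1}_{n_1} \le \mathbf{1}_{n_2}, \quad \mathbf{X} \in \{0, 1\}^{n_1 \times n_2} \tag{2}$$
The operation $\operatorname{vec}(\cdot)$ stacks the columns of the input matrix into a column vector. The superscript $\top$ denotes the transpose of a given matrix or vector, $c_{ij}$ is the element in matrix $\mathbf{C}$, and $\mathbf{1}_{n}$ is the all-ones vector of length $n$. Here $\mathbf{X}$ is a partial permutation matrix.
The cost matrix $\mathbf{C}$ can be modeled in different ways depending on the application. In line with [1], we compute it as an exponential function of the feature vector distance. Formally, each feature set is described by the feature vectors of its features, one vector per feature. The cost of assigning a feature in one feature set to a feature in the other is then computed as an exponential function of the distance between their feature vectors, measured by a vector norm and controlled by a scale parameter. The scale parameter is fixed in this paper, and the performance is found to be insensitive to it.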
The above linear assignment problem has been well studied, and polynomial-time solvers that attain the global optimum have been devised, such as the Hungarian method [25] and its Jonker-Volgenant alternative [26].
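The two-set matching step can be illustrated with a minimal, dependency-free sketch. The cost function `exp_cost` (with its `sigma` parameter) is a hypothetical instance of an exponential cost chosen for illustration, not necessarily the exact cost used in this paper, and `hungarian` is a textbook potentials-based implementation of the Hungarian method for a rectangular cost matrix with no more rows than columns.

```python
import math


def exp_cost(a, b, sigma=1.0):
    """Hypothetical exponential cost: rises from 0 towards 1 with distance."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return 1.0 - math.exp(-d2 / sigma ** 2)


def hungarian(cost):
    """Minimum-cost assignment for an n x m cost matrix with n <= m.

    Returns `match`, where match[i] is the column assigned to row i.
    """
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    u = [0.0] * (n + 1)   # row potentials (1-based)
    v = [0.0] * (m + 1)   # column potentials (1-based)
    p = [0] * (m + 1)     # p[j]: row currently matched to column j (0 = none)
    way = [0] * (m + 1)   # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0 != 0:        # flip the matching along the augmenting path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    match = [0] * n
    for j in range(1, m + 1):
        if p[j] != 0:
            match[p[j] - 1] = j - 1
    return match


# Toy example: 3 features matched against 4 (the last one acts as an outlier).
set_a = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
set_b = [(0.1, 0.9), (0.05, 0.0), (1.1, 0.1), (5.0, 5.0)]
C = [[exp_cost(a, b) for b in set_b] for a in set_a]
match = hungarian(C)
```

In practice an optimized library solver would normally replace the hand-written routine; the sketch only makes the structure of the problem explicit.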
As a common preprocessing step, we replace the inequality constraint with an equality constraint as follows:
$$\min_{\mathbf{X}} \operatorname{vec}(\mathbf{C})^{\top} \operatorname{vec}(\mathbf{X}) \quad \text{s.t.} \quad \mathbf{X}\mathbf{1} = \mathbf{1}, \quad \mathbf{X}^{\top}\mathbf{1} = \mathbf{1}, \quad \mathbf{X} \in \{0, 1\}^{n_2 \times n_2} \tag{3}$$
where $\mathbf{1}$ is the all-ones vector and $n_1 \le n_2$ are the sizes of the two feature sets. This can be fulfilled by adding dummy nodes to the feature set of smaller size (i.e. adding slack variables to the assignment matrix and augmenting the cost/affinity matrix with zeros) when $n_1 < n_2$, so that the assignment matrix $\mathbf{X}$ becomes square. This is a standard technique from linear programming and is adopted in the literature, e.g. [24, 16, 31], as a robust means of handling unmatchable nodes.
However, directly applying the Hungarian method to obtain the mathematical global optimum may not lead to the perfect matching in real-world problems. There are two reasons for this ambiguity: i) modeling the cost/affinity function is nontrivial, and it is difficult to fit a score function that is fully unbiased with respect to the matching accuracy; ii) noise further biases the correlation between the matching accuracy and the score function, especially when only two feature sets are given. In contrast, a collection of feature sets is expected to provide additional global information that helps disambiguate local noise through information fusion.
As a result, we consider the multiple feature set matching problem as a clustering task in the feature space: each feature correspondence across the sets can be viewed as one cluster comprised of the corresponding feature points from each set. From this perspective, the matching problem can be viewed as a procedure that groups the feature nodes of each set into different clusters. Unlike the classical clustering problem, there is one combinatorial constraint imposed on the clustering procedure: any two feature nodes in one set cannot be assigned to the same cluster, since we assume one-to-one node correspondence. Hence traditional clustering methods such as k-means cannot be directly applied.
To address this constrained clustering problem, we propose an iterative algorithm as described in Alg.1. We first generate a group of initial two-set feature correspondences between a reference feature set and each of the other feature sets. Using the reference feature set and these correspondences, we obtain a series of aligned feature matrices by reordering the features of each set according to its correspondence with the reference. We then choose a certain feature set and compute the mean of the remaining aligned feature sets. Next, we compute the node-to-node cost between the chosen feature set and this mean, employ the Hungarian method to compute their correspondence, and update the alignment of the chosen feature set accordingly. As the iteration continues, we update each feature set in turn while fixing the others. The assignment procedure in one iteration is illustrated in Fig.2.
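The alternating procedure above can be sketched as follows. This is an illustrative sketch rather than the exact Alg.1: the function names are hypothetical, all sets are assumed to have the same size (e.g. after dummy-node padding), the squared Euclidean distance stands in for the exponential cost, and an exhaustive search over permutations stands in for the Hungarian method, which is only viable for very small sets.

```python
import itertools


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(rows, cols):
    """Exhaustive stand-in for the Hungarian method (tiny sets only).

    Returns perm such that rows[i] is matched to cols[perm[i]].
    """
    n = len(rows)
    best = min(
        itertools.permutations(range(n)),
        key=lambda p: sum(sq_dist(rows[i], cols[p[i]]) for i in range(n)),
    )
    return list(best)


def match_cluster_fast(sets, ref=0, n_iters=3):
    """perms[k][i] is the index, in set k, of the node assigned to cluster i."""
    n, dim = len(sets[ref]), len(sets[ref][0])
    # Step 1: initial two-set matchings of every set against the reference.
    perms = [assign(sets[ref], feats) for feats in sets]
    for _ in range(n_iters):
        for k in range(len(sets)):
            # Step 2: cluster centres = mean of the other sets after alignment.
            others = [
                [sets[m][perms[m][i]] for i in range(n)]
                for m in range(len(sets)) if m != k
            ]
            centres = [
                [sum(a[i][d] for a in others) / len(others) for d in range(dim)]
                for i in range(n)
            ]
            # Step 3: re-match set k to the centres, keeping the others fixed.
            perms[k] = assign(centres, sets[k])
    return perms


# Three noisy, shuffled copies of the same four 2-D features.
sets = [
    [[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]],
    [[0.2, 9.9], [0.1, 0.1], [10.1, 10.2], [9.8, 0.1]],
    [[10.1, 9.9], [9.9, -0.1], [-0.1, 0.2], [0.1, 10.1]],
]
perms = match_cluster_fast(sets)
```

Because every set is expressed through the same cluster indices, the pairwise correspondence between any two sets `a` and `b` follows directly (node `perms[a][i]` corresponds to node `perms[b][i]`), so the resulting configuration is consistent by construction.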
We denote Alg.1 with a superscript ‘fast’ because it is initialized with a number of two-set matchings that is linear in the number of feature sets. However, as discussed for a similar situation in [24], since our method is an alternating updating procedure, the final solution can be sensitive to the quality of the initial point. Moreover, the updating order may also influence the iteration path.
One useful observation in [31, 24] is that the matching accuracy, though not observable at test time, is correlated with the pairwise consistency defined in Def.(2). Moreover, the most consistent feature set according to Def.(1) is expected to generate the most accurate initial matching set from the matching configuration. We therefore adopt this strategy and describe the improved algorithm in Alg.2. The expense is the additional time cost of computing the whole matching configuration, which requires a number of pairwise Hungarian runs that is quadratic in the number of feature sets, and of computing the unary and pairwise consistency. One illustration of the usefulness of this strategy is plotted in Fig.3, in comparison with random selection (matchCluster$^{\text{fast}}$).
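As a rough illustration of consistency-driven reference selection, the sketch below scores each feature set by how often a pairwise matching agrees with the two-step path through that set, and picks the highest-scoring set. It is a simplified, agreement-counting variant assumed here for illustration (matchings are stored as index lists rather than permutation matrices, and no Frobenius norm is used); the function names and the data layout `X[i][j]` are hypothetical.

```python
from itertools import combinations


def unary_consistency(k, X):
    """Fraction of correspondences X[i][j] that agree with the path i -> k -> j.

    X[i][j][a] is the node of set j matched to node a of set i.
    """
    n_sets, n_nodes = len(X), len(X[0][0])
    pairs = list(combinations(range(n_sets), 2))
    agree = 0
    for i, j in pairs:
        for a in range(n_nodes):
            if X[i][j][a] == X[k][j][X[i][k][a]]:
                agree += 1
    return agree / (n_nodes * len(pairs))


def most_consistent_set(X):
    """Return the index of the most consistent set and all scores."""
    scores = [unary_consistency(k, X) for k in range(len(X))]
    best = max(range(len(X)), key=scores.__getitem__)
    return best, scores


# Four sets of three nodes whose true correspondences are all the identity.
n_sets, n_nodes = 4, 3
identity = list(range(n_nodes))
X = [[identity[:] for _ in range(n_sets)] for _ in range(n_sets)]
X[0][1] = [1, 0, 2]  # corrupt the matching between sets 0 and 1
X[1][0] = [1, 0, 2]  # its inverse (a swap is its own inverse)
best, scores = most_consistent_set(X)
```

In this toy configuration the two sets involved in the corrupted matching receive lower scores, so one of the unaffected sets is chosen as the reference.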
IV Experiment
IV-A Test data settings and compared methods
IV-A1 Synthetic feature set matching
The random graph test provides a controlled setting in which to evaluate our methods and peer methods. For each trial, a reference feature set is created by assigning a random attribute vector to each of its nodes; the number of nodes and the attribute dimension are kept fixed across all synthetic tests. The attributes are uniformly sampled from a fixed interval. The ‘perturbed’ feature sets are then created by adding a Gaussian deformation disturbance $\boldsymbol{\varepsilon}$ to the attributes, i.e. $\mathbf{f}^{p} = \mathbf{f}^{r} + \boldsymbol{\varepsilon}$, where the superscripts ‘p’ and ‘r’ stand for ‘perturbed’ and ‘reference’ respectively. Optionally, outliers sampled from the same distribution as the inliers are added to each ‘perturbed’ feature set.
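A minimal sketch of such a synthetic generator is given below. The function name, the sampling interval [0, 1], the zero-mean noise, the random shuffling of node order, and every parameter value in the usage line are assumptions made for illustration; they are not the settings used in the experiments.

```python
import random


def make_synthetic(n_sets, n_inliers, dim, noise_std, n_outliers=0, seed=0):
    """Generate a reference set plus perturbed, shuffled copies of it.

    Returns (reference, sets, truths), where truths[s][i] is the position of
    reference node i inside sets[s].
    """
    rng = random.Random(seed)

    def draw():
        # Assumed interval: attributes uniform in [0, 1].
        return [rng.uniform(0.0, 1.0) for _ in range(dim)]

    reference = [draw() for _ in range(n_inliers)]
    sets, truths = [], []
    for _ in range(n_sets):
        # Inliers: reference attributes plus (assumed zero-mean) Gaussian noise.
        points = [[x + rng.gauss(0.0, noise_std) for x in f] for f in reference]
        # Outliers: drawn from the same distribution as the inliers.
        points += [draw() for _ in range(n_outliers)]
        order = list(range(len(points)))
        rng.shuffle(order)  # hide the ground-truth node order
        sets.append([points[j] for j in order])
        truths.append([order.index(i) for i in range(n_inliers)])
    return reference, sets, truths


reference, sets, truths = make_synthetic(
    n_sets=3, n_inliers=5, dim=4, noise_std=0.0, n_outliers=2
)
```

Returning the ground-truth positions alongside the data makes it straightforward to score a solver by the fraction of inlier correspondences it recovers.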
IV-A2 CMU hotel sequence
The CMU motion sequences (http://vasc.ri.cmu.edu//idb/html/motion/) are widely used for feature matching [19] and graph matching [22, 23, 24, 31]. Here we use them to test feature matching solvers. In our experiment, we focus on the hotel sequence (111 frames), because on the other sequence, house, the compared methods always produce perfect matching results. We use 15, 20, and 25 feature points from each frame respectively, where the unary feature vector is computed by the SIFT descriptor [34].
IV-A3 Willow-ObjectClass
This dataset, released in [5], is constructed using images from Caltech-256 and PASCAL VOC2007. The object categories contain different numbers of images: 109 Face, 50 Duck, 66 Wine bottle, 40 Motorbike, and 40 Car images. For each image, 10 feature points are manually labeled on the target object. We focus our testing on the face category, because the matching accuracy for the other categories is much lower (below 0.25) if only the unary SIFT feature is considered. This is because the objects are captured from different angles, with different textures and colors, and in different poses.
IV-A4 Comparing methods
We compare several state-of-the-art general multi-view matching methods, including matchOpt [23, 24], matchSync [8], and Composition based Affinity Optimization (CAO) together with its consistency-enforcing variant [33, 31]. In addition, we include the baseline two-set matching by the Hungarian method. There are two ways of applying the Hungarian method: i) Hung, which performs the Hungarian method independently on each pair of feature sets and thus directly produces the whole matching configuration; ii) randomly selecting one feature set and performing the Hungarian method independently between it and each of the other feature sets, after which the remaining pairwise matchings are computed by composing the matchings through the selected set. Note that the former has a time cost quadratic in the number of feature sets, while the latter is linear. For our methods, we also test two versions: i) matchCluster, which decides the base feature set and the alternating order by the consistency in Def.(1) and Def.(2) respectively; ii) the fast version matchCluster$^{\text{fast}}$, which randomly sets the base feature set and the alternating order. Note that we do not include any point registration method such as [12, 38, 14] in our evaluation, because we focus on non-parametric node correspondences and, for a fair comparison, impose no parametric transformation prior or regularization.
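The linear-cost baseline, which derives every pairwise matching from matchings against one selected set, can be sketched as below. Matchings are stored as index lists, and the helper names are hypothetical; the sketch only shows the composition step, not the Hungarian runs that produce the matchings against the selected set.

```python
def invert(perm):
    """Invert a matching given as an index list (perm[a] = b becomes inv[b] = a)."""
    inv = [0] * len(perm)
    for a, b in enumerate(perm):
        inv[b] = a
    return inv


def compose_via_reference(to_ref, from_ref):
    """Chain set i -> reference -> set j.

    to_ref[a]   : reference node matched to node a of set i
    from_ref[b] : node of set j matched to reference node b
    """
    return [from_ref[b] for b in to_ref]


# Matchings from the selected (reference) set to two other sets.
ref_to_1 = [2, 0, 1]
ref_to_2 = [1, 2, 0]
# Induced matching from set 1 to set 2, obtained without another Hungarian run.
match_1_to_2 = compose_via_reference(invert(ref_to_1), ref_to_2)
```

Only one Hungarian run per non-reference set is needed, which is what makes this baseline linear in the number of feature sets; the price is that any error in a matching against the selected set propagates to every pair that is composed through it.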
IV-B Results and discussion
The matching accuracy and runtime on the synthetic random tests are plotted in Fig.4 and Fig.5. The former concerns the performance as a function of different disturbances, and the latter concerns the performance as the number of feature sets changes. The matching accuracy and runtime on the CMU Hotel sequence and Willow-ObjectClass Face are illustrated in Fig.6 and Fig.7, respectively. Fig.8 plots the visual matching illustration for the face and hotel image collections.
We make several observations based on the results:
1) The synthetic tests suggest that our methods (in green) are competitive, especially in the presence of many outliers;
2) In the real image tests, matchOpt performs even better than ours, while ours perform competitively against the other state-of-the-art methods. No outliers are added in the real image tests;
3) matchCluster outperforms its variant matchCluster$^{\text{fast}}$, especially in the real image tests, while in the synthetic tests the accuracy margin is much smaller;
4) Though being the fastest, Hung always obtains the worst accuracy. This is reasonable, as the information across feature sets is not fully exploited by Hung.
Our analysis focuses on the first three observations, as 4) is obvious:
For 1) and 2), this perhaps suggests that our method is better suited to two cases: i) when the deformation noise associated with the feature sets is evenly distributed around the template feature set, which is the setting of our synthetic test; ii) when there are many outliers. For 3), we think the reason is that the real images are more heterogeneous, which results in initial pairwise matchings of varying quality. In our synthetic tests, by contrast, the noise is evenly distributed over the disturbed feature sets, so their pairwise matching quality is relatively homogeneous. This reduces the room for improvement from adaptively deciding the updating order.
V Conclusion and future work
We propose a novel method for matching feature sets with two advantages: i) the consistency of correspondences across the multiple feature sets is satisfied; ii) the global information of all feature sets is jointly explored. We show that, on both simulated feature sets and real image sequences, the proposed method performs competitively with state-of-the-art methods.
Future work involves designing and evaluating cost functions such as the K-nearest neighbor distance from the cluster. We will also apply our method to matching graphs by approximately extracting node signatures, e.g. the distribution of the lengths of the shortest paths from a node to the other nodes.
Acknowledgement This work is supported by STCSM (15JC1401700, 14XD1402100, 13511504501).
References
[1] G. Scott and H. Longuet-Higgins, An algorithm for associating the features of two images, in Proc. the Royal Society of London. Series B: Biological Sciences, 1991.
[2] Z. Zeng, T.-H. Chan, K. Jia, and D. Xu, Finding correspondence from multiple images via sparse and low-rank decomposition, in ECCV, 2012.
[3] J.-G. Yu, G.-S. Xia, A. Samal, and J. Tian, Globally Consistent Correspondence of Multiple Feature Sets Using Proximal Gauss-Seidel Relaxation, in Pattern Recognition, 2016.
[4] D. Conte, P. Foggia, C. Sansone, and M. Vento, Thirty years of graph matching in pattern recognition, IJPRAI, 2004.
[5] M. Cho, K. Alahari, and J. Ponce, Learning graphs to match, in ICCV, 2013.
[6] P. Foggia, G. Percannella, and M. Vento, Graph matching and learning in pattern recognition in the last 10 years, IJPRAI, 33(1), 2014.
[7] D. Eppstein, Subgraph isomorphism in planar graphs and related problems, in SODA, 1995.
[8] D. Pachauri, R. Kondor, and S. Vikas, Solving the multi-way matching problem by permutation synchronization, in NIPS, 2013.
[9] Y. Chen and G. Medioni, Object modeling by registration of multiple range images, IVC, 1992.
[10] Z. Zhang, Iterative point matching for registration of free-form curves and surfaces, IJCV, 1994.
[11] M. Leordeanu, R. Sukthankar, and M. Hebert, Unsupervised learning for graph matching, Int. J. Comput. Vis., pages 28–45, 2012.
[12] H. Chui and A. Rangarajan, A new point matching algorithm for non-rigid registration, CVIU, 2003.
[13] H. Bunke, Error correcting graph matching: on the influence of the underlying cost function, PAMI, 21(9), 1999.
[14] J. Yan, J. Wang, H. Zha, X. Yang, and S. Chu, Multi-view point registration via alternating optimization, in AAAI, 2015.
[15] Y. Aflalo, A. Bronstein, and R. Kimmel, On convex relaxation of graph isomorphism, PNAS, 112(10):2942–2947, 2015.
[16] A. Wong and M. You, Entropy and distance of random graphs with application to structural pattern recognition, PAMI, 1985.
[17] M. Leordeanu, A. Zanfir, and C. Sminchisescu, Semi-supervised learning and optimization for hypergraph matching, in ICCV, 2011.
[18] M. Cho, S. Kwak, C. Schmid, and J. Ponce, Unsupervised object discovery and localization in the wild: Part-based matching with bottom-up region proposals, in CVPR, 2015.
[19] J. Maciel and J.P. Costeira, A global solution to sparse correspondence problems, PAMI, 2003.
[20] X. Zhou, M. Zhu, and K. Daniilidis, Multi-image matching via fast alternating minimization, in ICCV, 2015.
[21] E. M. Loiola, N. Abreu, P. O. Boaventura-Netto, P. Hahn, and T. Querido, A survey for the quadratic assignment problem, EJOR, pages 657–90, 2007.
[22] J. Yan, C. Zhang, H. Zha, W. Liu, X. Yang, and S. Chu, Discrete hypergraph matching, in CVPR, 2015.
[23] J. Yan, Y. Tian, H. Zha, X. Yang, Y. Zhang, and S. Chu, Joint optimization for consistent multiple graph matching, in ICCV, 2013.
[24] J. Yan, J. Wang, H. Zha, and X. Yang, Consistency-driven alternating optimization for multigraph matching: A unified approach, TIP, 2015.
[25] J. Munkres, Algorithms for the assignment and transportation problems, Journal of the Society for Industrial & Applied Mathematics, vol. 5, no. 1, pp. 32–38, 1957.
[26] R. Jonker and A. Volgenant, A shortest augmenting path algorithm for dense and sparse linear assignment problems, Computing, vol. 38, no. 4, pp. 325–340, 1987.
[27] Q. Ngoc, A. Gautier, and M. Hein, A flexible tensor block coordinate ascent scheme for hypergraph matching, in CVPR, 2015.
[28] S. Gold and A. Rangarajan, A graduated assignment algorithm for graph matching, IEEE Transaction on PAMI, 1996.
[29] C.-H. Lee and A. Joshi, Correspondence problem in image sequence analysis, Pattern Recognition, 26(1):47–61, 1993.
[30] Y. Tian, J. Yan, H. Zhang, Y. Zhang, X. Yang, and H. Zha, On the convergence of graph matching: Graduated assignment revisited, in ECCV, 2012.
[31] J. Yan, M. Cho, H. Zha, X. Yang, and S. Chu, Multi-graph matching via affinity optimization with graduated consistency regularization, TPAMI, 2016.
[32] M. Leordeanu, M. Hebert, and R. Sukthankar, An integer projected fixed point method for graph matching and map inference, in NIPS, 2009.
[33] J. Yan, Y. Li, W. Liu, H. Zha, X. Yang, and S. Chu, Graduated consistency-regularized optimization for multi-graph matching, in ECCV, 2014.
[34] D. Lowe, Object recognition from local scale-invariant features, in ICCV, 1999.
[35] J. Yan, Y. Li, E. Zheng, and Y. Liu, An accelerated human motion tracking system based on voxel reconstruction under complex environments, in ACCV, 2009.
[36] X. Shi, H. Ling, W. Hu, J. Xing, and Y. Zhang, Tensor Power Iteration for Multi-Graph Matching, in CVPR, 2016.
[37] X. Shi, H. Ling, J. Xing, and W. Hu, Multi-target Tracking by Rank-1 Tensor Approximation, in CVPR, 2013.
[38] A. Myronenko and X. Song, Point set registration: Coherent point drift, TPAMI, 2010.
[39] E. M. Luks, Isomorphism of graphs of bounded valence can be tested in polynomial time, Journal of Computer and System Sciences, 1982.
[40] A. V. Aho, M. Ganapathi, and S. W. Tjiang, Code generation using tree matching and dynamic programming, TOPLAS, 1989.
[41] M. Neuhaus and H. Bunke, Self-organizing maps for learning the edit costs in graph matching, IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 35(3):503–514, 2005.
[42] M. Neuhaus and H. Bunke, Automatic learning of cost functions for graph edit distance, Information Sciences, 177(1):239–247, 2007.
[43] Z. Niu, R. Martin, M. Sabin, F. Langbein, and H. Bucklow, Applying database optimization technologies to feature recognition in CAD, Computer-aided design and applications, 12(3):373–382, 2015.
[44] Z. Niu, R. Martin, M. Sabin, F. Langbein, and M. Sabin, Rapidly finding CAD features using database optimization, Computer-aided design and applications, 69:35–50, 2015.
[45] Y. Yan, E. Ricci, R. Subramanian, G. Liu, O. Lanz, and N. Sebe, A Multi-task Learning Framework for Head Pose Estimation under Target Motion, PAMI, 2016.