1 Introduction
High level information contained in images can be generally described using a sparse set of features. These features, encoded into feature descriptors [17, 5], can then be used to compare and match images [31, 21] to perform image classification [8, 14], stereo correspondence [20], or motion tracking [26, 25].
These techniques rely on the detection and characterization of features in images. The vast majority of the literature in the field focuses on the localization of feature points in space or in scale space [17, 7, 30]; however, there is little published work on establishing meaningful topological relations between the detected feature points, and more generally on feature topology in images. One reason for this is that many image matching problems can be solved without this knowledge. For example, in image classification the bag of visual words paradigm [8] uses feature histograms which do not consider the spatial arrangement of the feature points. Moreover, salient point detection methods are normally a preprocessing step and therefore yield an overdetection of salient points, which is subsequently refined. Efforts to incorporate spatial information in computer vision, for example in classification problems, are commonly based on spatial pyramids [16, 29] (which consist of a hierarchical refinement of regular grids to achieve spatial localization of feature points) and are used to provide structurally discriminant image representations. However, these techniques do not explicitly capture image structure.
To compensate for the lack of feature topology, context-aware feature point descriptors have been proposed [12, 22, 15]. These descriptors can capture the appearance of a feature in relation to its surroundings, which has been shown to increase accuracy in image matching; still, they cannot establish the spatial relation with nearby feature points.
Recently, graph-based approaches have gained the interest of the community [26, 23, 27]. If images can be represented through graphs, the graph-matching problem can be solved efficiently using graph factorization, discrete optimization [10] or, thanks to the relatively low dimensionality, brute-force search in some cases. Most graph-matching methods build graphs from a set of feature points (detected with any of the methods cited above) and simply compute edges from feature points by Delaunay triangulation [32], or by nearest neighbours [2]. These approaches have two main drawbacks: first, the added computational cost of computing the edges after the feature points have been detected; and second, the absence of any mechanism enforcing that edges have a meaningful relation with the underlying structure or with true relations between feature points other than spatial proximity.
All the previous approaches that use graph representations assume that the information needed for matching is embedded in feature descriptors at the nodes and that spatial relations between these descriptors (encoded into edges) are supplementary or just computationally convenient. An exception, for the specific application of shape classification using binary images, is the construction of graphs from image skeletons [9, 28, 4]. The resulting graphs, which run along the medial line of the shape, are descriptive of the structure of the binary object; however, by construction their nodes are far from edges, corners or, more generally, any salient feature points, which makes this class of approaches poorly suited to establishing correspondences or, more generally, to describing images through features.
Frequently, high level information which is not localized at a single point, such as contours, is sought. Contours are normally computed using image segmentation techniques. Image segmentation is out of the scope of this paper but the interested reader can find many comprehensive reviews in the literature [13, 3].
In this paper, we propose a novel method to extract image structure as a connected graph, named the oversegmenting graph. The nodes of the oversegmenting graph are located at salient image points, the edges approximate an oversegmentation of the image as the graph resolution increases, and a saliency measure is associated with each edge. Oversegmenting graphs provide a sparse, topological representation of image data where both nodes and edges carry meaningful information about the image structure and the computation is reduced to a one-dimensional problem; hence they can be computed very efficiently.
The remainder of the paper is organised as follows. Section 2 describes related work on superpixel clustering and variational image meshing. The proposed method is described in Sec. 3, including the salient point detection (Sec. 3.1), the iterative adaptation process (Sec. 3.3) and the computation of the oversegmenting graph (Sec. 3.4). The experiments are described in Sec. 4 and the results, including a qualitative comparison with other related methods, are provided in Sec. 5.
2 Related Work
The proposed work is related to two different topics in computer vision and computational imaging: superpixel clustering and image meshing. These methods are briefly introduced here, and a more detailed comparison with the proposed method is given in Sec. 5.3.
Superpixel clustering methods have gained popularity, particularly when image contours are needed as a preprocessing step. Such methods provide an oversegmentation of image regions, i.e. an overdetection of image contours. In summary, they aim at grouping connected pixels into clusters which share similar intensities. Connectivity can be imposed implicitly or enforced explicitly. Arguably, the most popular superpixel method to date is SLIC superpixels [1], in which pixels are associated with the closest seed point, and closeness is measured in terms of Euclidean and colour distances. This method is particularly efficient because, for a given pixel, distance is only computed to a small number of neighbouring seeds; hence the algorithm is linear in the number of pixels, independently of the number of seeds. At each iteration, seeds are replaced by the centroid of the current superpixel, until a maximum number of iterations is reached or until convergence.
Image meshing consists of extracting an (ideally) uniform triangular mesh from an image, so that image contours are aligned with triangle edges, normally with the aim of using the resulting mesh as a discretized domain for computational modelling using the Finite Element Method (FEM) [6, 24], for example to simulate flow, mechanics, or other physical phenomena. This meshing process is normally carried out in two steps: a first segmentation step, in which the domain of interest is extracted from the image, followed by a meshing step, in which a binary image of the extracted domain is meshed. A method to carry out the image meshing in one step, called image-based variational meshing, was recently proposed [11]. This method adapts a triangular mesh to image data directly, by defining an objective function related to the goodness-of-fit of the mesh to the image data and searching for the optimum of that function.
Our proposed oversegmenting graphs use distances similar to those used for superpixel clustering but aim at finding salient points rather than at clustering pixels. The search is carried out along graph edges, instead of over an $N$D image domain, thus reducing computational complexity and yielding a measure of saliency. As in image meshing, in our proposed method a graph is adapted to image data. However, the adaptation process does not target aligning triangle edges with image contours but rather the opposite: finding points along graph edges that intersect image edges. Additionally, our proposed method does not require a variational optimization process and computations are independent for each edge.
3 Methods
The proposed method is generic for $N$-dimensional ($N$D) image data. For simplicity, it will be presented for 2D images, and remarks on other dimensions will be made in the Discussion section.
In summary, an initial graph is iteratively adapted to image features, computing a saliency measure for each feature along the way. Then the oversegmenting graph is computed as the dual of the adapted graph. This process is illustrated in Fig. 1, and described more formally in Sec. 3.1, 3.2 and 3.4 below.
3.1 Salient Point Detection
Let $G = (V, E)$ be a graph where $e_{ij} \in E$ is a graph edge defined between the nodes $v_i, v_j \in V$. Initially, $G$ is a homogeneous triangulated mesh that covers the image extent, as shown in Fig. 1(a). For each edge $e_{ij}$, we first search for the point $p_{ij}$ along $e_{ij}$ that is on an image feature point. The point $p_{ij}$ parts $e_{ij}$ into two segments, one from $v_i$ to $p_{ij}$ and one from $p_{ij}$ to $v_j$, and through these, we propose two methods to find $p_{ij}$, described in Sec. 3.1.1 and 3.1.2 respectively. Examples of the behaviour of these two methods are illustrated in Fig. 2.
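As an illustration of the initialization step, the following minimal sketch builds a regular triangulated mesh over the image extent. It is not the authors' implementation: the function name `initial_triangular_mesh`, the row-by-row node numbering and the choice of splitting every grid cell along the same diagonal are assumptions, made so that every interior node has six connected edges.

```python
def initial_triangular_mesh(width, height, cols, rows):
    """Build a regular triangulated mesh covering a width x height image.

    Hypothetical helper: nodes lie on a cols x rows grid and each grid cell
    is split along the same diagonal, so interior nodes have degree six.
    Returns (nodes, edges, triangles).
    """
    if cols < 2 or rows < 2:
        raise ValueError("need at least a 2 x 2 grid of nodes")

    # Node id -> (x, y) position, numbered row by row.
    nodes = {}
    for r in range(rows):
        for c in range(cols):
            x = c * (width - 1) / (cols - 1)
            y = r * (height - 1) / (rows - 1)
            nodes[r * cols + c] = (x, y)

    # Two triangles per grid cell, both using the a-e diagonal.
    triangles = []
    for r in range(rows - 1):
        for c in range(cols - 1):
            a = r * cols + c
            b, d, e = a + 1, a + cols, a + cols + 1
            triangles.append((a, b, e))
            triangles.append((a, e, d))

    # Unique undirected edges, stored as (smaller id, larger id).
    edge_set = set()
    for i, j, k in triangles:
        for u, v in ((i, j), (j, k), (i, k)):
            edge_set.add((min(u, v), max(u, v)))

    return nodes, sorted(edge_set), triangles
```

The edge list returned here is the set over which the per-edge salient point search described next would be run.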
3.1.1 Distance-Based Feature Point Detection
This approach searches for the feature point in a similar way as SLIC superpixels [1] search for boundaries between pixel clusters. For each edge $e_{ij}$ with end nodes $v_i$ and $v_j$, we define that a point $p$ on the edge is on an image feature point if there is a distance measure $d$ for which
(1) $d(p, v_i) = d(p, v_j)$
We compute the distance measure $d$ as a weighted sum of the distance in space and the distance in image intensity. Distance in space, $d_s$, is defined as the Euclidean distance on the graph edge:
(2) $d_s(p, v_i) = \lVert p - v_i \rVert$
and image intensity distance, $d_I$, is defined as the absolute difference in image intensity $I$ from the edge nodes:
(3) $d_I(p, v_i) = \lvert I(p) - I(v_i) \rvert$
In order to ensure independence from edge length and local intensity range, the distance in space is normalised by the edge length, and the intensity distance is normalised by an intensity normalisation parameter. These two distance measures can be combined using a weighting factor:
(4) 
The above equations can be analogously described for the other edge node. As pointed out by Achanta et al. [1], the variability of intensities throughout the image and between images makes it difficult to calculate the intensity normalisation parameter. As a consequence, it can be set to an arbitrary constant value, and the contribution of the intensity distance can then be fixed for each image and application using the weighting factor that controls the trade-off between the two distances. In practice, setting the normalisation parameter to the estimated image intensity range will set each term to be between 0 and 1, independently of the edge lengths and the image intensity along the edges. Lower values of the weighting factor enforce image boundary adaptation while larger values push the salient point towards the centre of the edge. Using this formulation, the most salient image point on the graph edge can be found at the crossing of the two distance curves, one from each edge node, as shown in the example in Fig. 2 (top row, central column).
3.1.2 Robust Feature Point Detection
The intensity distance measure presented in Sec. 3.1.1 is effective for pixel clustering but can be very sensitive to noise (e.g. Fig. 2). For this reason, we introduce a robust alternative to find the most salient point along the graph edge. We define the integrated intensity values along the edge, starting from each of the two edge nodes, as follows:
(5) 
where is a weighting function which, for simplicity, we take as (or for the other node). If we consider that the edge crosses only one image contour, and that this crossing occurs at the edge point $p$, then the two integrated values from (5), written here as $\bar{I}_i(p)$ and $\bar{I}_j(p)$, represent, respectively, the average value of the image along the edge at each side of the contour. As a result, the squared difference between the two will be maximum when the correct salient location is found, because that location yields the most distinct average values at each side. More formally, saliency along the edge can be described by the function $s$:
(6) $s(p) = \left( \bar{I}_i(p) - \bar{I}_j(p) \right)^2$
On homogeneous regions, the desired behaviour is that the salient point is found at the center of the edge and with a low saliency value. To achieve this, we add a regularization term, $r$, which is a downward-opening parabola valued 0 at the nodes and 1 at the center, and therefore pushes the maximum towards the center of the edge:
(7) $r(t) = 4\,t\,(1 - t)$, where $t \in [0, 1]$ is the normalised position of the point along the edge.
The complete saliency measure is
(8) 
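The salient point $p^{*}$ can be found by maximizing the complete saliency measure of (8), written here as $S$, over the points $p$ of the edge $e$:
(9) $p^{*} = \arg\max_{p \in e} S(p)$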
In practice, for each edge, the image is sampled at regular intervals depending on the edge length in relation to the image resolution. For the images used in this paper, between 10 and 30 samples along each edge were used. As a result, (9) can be solved by exhaustive search with very little computational cost.
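The exhaustive search over edge samples can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the integrated intensities are taken as plain means of the samples on each side of a candidate split, the regularization parabola is added to the squared difference with a hypothetical weight `reg_weight`, and the split position is approximated by `k / n`. The exact weighting function of (5) and the exact combination in (8) are not reproduced here.

```python
def robust_salient_point(samples, reg_weight=0.1):
    """Return (split_index, score) of the most salient split along one edge.

    samples: intensities sampled at regular intervals from one edge node
        to the other.
    reg_weight: hypothetical weight of the regularisation parabola
        (an assumption, not a value from the paper).
    """
    n = len(samples)
    if n < 3:
        raise ValueError("need at least three samples along the edge")

    # Prefix sums give the mean on each side of every split in O(n) overall.
    prefix = [0.0]
    for value in samples:
        prefix.append(prefix[-1] + value)

    best_index, best_score = None, float("-inf")
    for k in range(1, n):  # split between samples[:k] and samples[k:]
        mean_first = prefix[k] / k
        mean_second = (prefix[n] - prefix[k]) / (n - k)
        contrast = (mean_first - mean_second) ** 2

        t = k / n  # approximate normalised position of the split
        regulariser = 4.0 * t * (1.0 - t)  # 0 at the nodes, 1 at the centre

        score = contrast + reg_weight * regulariser
        if score > best_score:
            best_index, best_score = k, score

    return best_index, best_score
```

With a step profile such as `[0, 0, 0, 0, 0, 0, 1, 1, 1, 1]` the split is found at the step, while on a constant profile the regularizer alone places it at the middle of the edge with a low score.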
3.2 Saliency Measure
As described above, the graph adaptation process consists of a search for feature points in 1D. As a result, feature saliency is determined by the shape of the distance functions at the feature point. We define saliency in two different ways for the two feature point search strategies proposed:

Distance-based feature point detection (Sec. 3.1.1):
Salient features such as image contours will yield steeper crossings of the distance curves, simply because changes in intensity along the graph edge will be higher, as can be seen in the noiseless case in the top central graph in Fig. 2. In this case, saliency is computed using the highest slope value of the two intensity distance curves at the point where they cross, i.e.:
(10) $s = \max\left( d_I'(p, v_i),\; d_I'(p, v_j) \right)$
where $d_I'(p, v)$ is the slope of the intensity distance curve $d_I(p, v)$ at the crossing point $p$.
Note that we only consider image intensity distance to compute saliency, because feature saliency should be independent of the location of the graph nodes relative to the salient point.
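The distance-based search of Sec. 3.1.1 and the slope-based saliency described above can be sketched together on a sampled edge. The sketch rests on several assumptions that are not taken from the paper: the two normalised distances are combined as a convex combination controlled by a hypothetical `spatial_weight`, the crossing is taken as the first sample where the distance to the first node reaches the distance to the last node, and slopes are estimated by finite differences and compared by magnitude.

```python
def distance_based_salient_point(samples, intensity_range=1.0, spatial_weight=0.5):
    """Return (crossing_index, saliency) for one edge sampled at regular intervals.

    intensity_range: normalisation of the intensity distance (assumed constant).
    spatial_weight: hypothetical trade-off between spatial and intensity terms.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples along the edge")
    last = n - 1

    def intensity_distance(k, node):
        # Absolute intensity difference to an edge node, normalised.
        return abs(samples[k] - samples[node]) / intensity_range

    def combined_distance(k, node):
        # Spatial distance along the edge, normalised by the edge length.
        spatial = abs(k - node) / last
        return (spatial_weight * spatial
                + (1.0 - spatial_weight) * intensity_distance(k, node))

    # First sample where the curve from the first node reaches the other curve.
    crossing = last
    for k in range(1, n):
        if combined_distance(k, 0) >= combined_distance(k, last):
            crossing = k
            break

    # Finite-difference slopes of the two intensity distance curves,
    # expressed per unit of normalised edge length.
    slope_first = (intensity_distance(crossing, 0)
                   - intensity_distance(crossing - 1, 0)) * last
    slope_last = (intensity_distance(crossing, last)
                  - intensity_distance(crossing - 1, last)) * last

    return crossing, max(abs(slope_first), abs(slope_last))
```

Only the intensity distance enters the saliency value, in line with the note above, whereas both terms are used to locate the crossing.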
3.3 Iterative Graph Adaptation
Once the salient points have been found with either the Distance-based or the Robust feature point detection method, the nodes of the adapted graph are updated to the centroid of the salient points adjacent to each node. The process is described in Alg. 1. After adaptation, the average Euclidean distance between the graph nodes before and after adaptation is used as a measure of residual. This process can be carried out iteratively until the residual falls below a given threshold or after a specific number of iterations.
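One adaptation step can be sketched as below. This follows the textual description rather than Alg. 1 itself, which is not reproduced in this text; the function name `adapt_nodes` and the data layout (dictionaries keyed by node id and by edge) are assumptions, and no special treatment of nodes on the image border is included.

```python
import math


def adapt_nodes(nodes, edges, salient_points):
    """Move every node to the centroid of the salient points on its edges.

    nodes: dict node_id -> (x, y)
    edges: list of (i, j) node id pairs
    salient_points: dict (i, j) -> (x, y), one salient point per edge
    Returns (new_nodes, residual), where residual is the average Euclidean
    displacement of the nodes.
    """
    # Accumulate, per node, the sum of adjacent salient points and their count.
    sums = {n: [0.0, 0.0, 0] for n in nodes}
    for i, j in edges:
        px, py = salient_points[(i, j)]
        for n in (i, j):
            sums[n][0] += px
            sums[n][1] += py
            sums[n][2] += 1

    new_nodes = {}
    for n, (sx, sy, count) in sums.items():
        # Isolated nodes (no edges) are left where they are.
        new_nodes[n] = (sx / count, sy / count) if count else nodes[n]

    residual = sum(math.dist(nodes[n], new_nodes[n]) for n in nodes) / len(nodes)
    return new_nodes, residual
```

In an iterative setting this function would be called in a loop, recomputing the salient points on the moved edges each time, until the residual is small enough or an iteration limit is reached.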
3.4 Dual Graphs
Edges of the adapted graph run across image contours but the desired behaviour of oversegmenting graphs is that edges run along image contours. For this, the oversegmenting graph is computed as the dual of the adapted graph.
Obtaining the dual graph (Fig. 1(e)) from the adapted graph (Fig. 1(d)) is done as follows. The vertices of the dual graph are simply the salient points found in the adapted graph. Dual edges connect every two salient points that lie on two edges sharing a node in the adapted graph. The edge saliency associated with each dual edge can be computed from the saliency measures of the two salient points it connects:
(11) 
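The construction of the dual graph can be sketched as follows, under two assumptions that should be read as an interpretation rather than as the authors' definition. First, two salient points are connected when their edges belong to the same triangle of the adapted graph, which is what produces the partition into hexagons and triangles mentioned in Sec. 5.3.1. Second, since the combination rule of (11) is not reproduced here, the saliency of a dual edge is computed with a user-supplied `combine` function whose default (the arithmetic mean) is only a placeholder.

```python
def dual_graph(triangles, saliency, combine=lambda a, b: 0.5 * (a + b)):
    """Build dual edges between salient points of edges in the same triangle.

    triangles: list of (i, j, k) node ids of the adapted graph
    saliency: dict (min_id, max_id) -> saliency of the salient point on that edge
    combine: placeholder rule for the dual edge saliency (an assumption)
    Returns a dict mapping (edge_a, edge_b) -> combined saliency, where each
    dual vertex is identified by the adapted-graph edge that carries it.
    """
    def edge_key(u, v):
        return (u, v) if u < v else (v, u)

    dual_edges = {}
    for i, j, k in triangles:
        sides = [edge_key(i, j), edge_key(j, k), edge_key(i, k)]
        # Connect the three salient points of the triangle pairwise.
        for a in range(3):
            for b in range(a + 1, 3):
                pair = tuple(sorted((sides[a], sides[b])))
                dual_edges[pair] = combine(saliency[pair[0]], saliency[pair[1]])
    return dual_edges
```

Because each pair of adjacent edges belongs to exactly one triangle, every dual edge is generated once, and the cost stays linear in the number of triangles.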
3.5 Algorithmic Complexity
As there is one salient point calculation per graph edge, independent of the other edges, the algorithm is linear in the number of edges. As the number of edges is in turn linear in the number of nodes, the complexity of the algorithm is $O(N)$, where $N$ is the number of nodes.
Since at each iteration each edge is processed independently of the other edges, the algorithm lends itself to an efficient parallel implementation.
4 Materials and Experiments
We carry out experiments on synthetic images and on 200 images from the Berkeley Segmentation Data Set (BSDS300) [19]. As is commonly done [1, 18], we evaluate the performance of our method through the adherence to boundaries. We do this by approximating the boundary recall for the salient points as follows: a salient point is considered to lie on the boundary if its distance to the boundary (computed through the morphological distance) is within a threshold, which is set to two pixels as in [1]. Using this convention, for each dual graph we define the following sets:
(12) 
The boundary recall is defined as the fraction of salient points whose saliency is beyond a given saliency threshold that lie within the distance threshold from the image contour, and can be calculated using the cardinality of the above sets:
(13) 
Note that the definition of recall in (13) is different to the standard boundary recall definition for superpixels (i.e. fraction of boundary pixels that are coincident with superpixel edges), and therefore the two are not directly comparable.
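The recall measure described in words above can be sketched as follows. The sketch is an assumption-laden illustration: the distance to the boundary is computed by brute force over a list of boundary pixel coordinates instead of the morphological distance used in the paper, a point counts as a hit when its distance is less than or equal to the tolerance, and the function returns `None` when no salient point exceeds the saliency threshold.

```python
import math


def boundary_recall(points, saliencies, boundary_pixels,
                    saliency_threshold, tolerance=2.0):
    """Fraction of sufficiently salient points that lie close to a boundary.

    points: list of (x, y) salient point positions
    saliencies: list of saliency values, one per point
    boundary_pixels: list of (x, y) ground-truth boundary positions
    """
    # Keep only the salient points above the saliency threshold.
    selected = [p for p, s in zip(points, saliencies) if s > saliency_threshold]
    if not selected:
        return None  # recall is undefined without any selected point

    def distance_to_boundary(p):
        # Brute-force stand-in for a distance transform of the boundary map.
        return min(math.dist(p, b) for b in boundary_pixels)

    hits = sum(1 for p in selected if distance_to_boundary(p) <= tolerance)
    return hits / len(selected)
```

For full-size images a precomputed distance map would be preferable to the brute-force search, which is kept here only to make the example self-contained.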
For the experiments on synthetic data we generated six synthetic binary images, shown in Fig. 3, to investigate the performance of the proposed method using the two feature point detection strategies (Distance-based and Robust), for image contours of different shapes and different noise levels. Additive centred Gaussian noise , with between and the maximum image intensity value, was used.
In our experiments on natural images from BSDS300 we analysed the recall for different number of nodes in the graph and different values of . We also provide a qualitative comparison between the proposed method and two related methods: (SLIC) superpixel clustering and variational image meshing.
5 Results
5.1 Results on Synthetic Images
Figure 4 shows the boundary recall curves for the two proposed methods (SLIC distance on top and robust saliency on the bottom) used to compute the oversegmenting graphs from the synthetic images from Fig. 3. The curves were obtained as an average over different values of the number of graph nodes (). The colour code indicates the amount of centred Gaussian noise added, ranging from in black to the maximum image intensity in light grey. The horizontal axis indicates the saliency threshold (in logarithmic units) beyond which salient points were considered for the boundary recall.
It can be observed that the curves using the SLIC distance follow a similar trend to the curves using the robust saliency but are displaced towards lower saliency values. In other words, using the robust saliency measure the boundary recall obtained for the same is between a and a greater.
These results are consistent with the qualitative results shown in Fig. 5. Six representative examples of oversegmenting graphs obtained using the two proposed methods on input images with varying geometry and amount of noise are shown. All these examples were produced using and graph nodes. The resulting graphs are colourcoded by saliency (white indicates low saliency and black high saliency).
It can be observed that, while for smaller amounts of noise (first row of Fig. 5, ) the results with the two methods are fairly similar, for larger amounts of noise the results using the robust saliency measure outperform those obtained with the SLIC distance, both in terms of saliency encoding (a more compact saliency can be observed along the true image edges) and in terms of adaptation ability. This is most obvious in the Donut figure at the bottom, where the results using the SLIC distance not only fail to highlight the true image shape accurately, but the adapted edges also do not follow the donut shape.
5.2 Results on the BSDS300 Images
Figure 6 shows the boundary recall curves for the two proposed methods (SLIC distance on the top, robust saliency on the bottom) in natural images from the BSDS300 database. The horizontal axis shows the number of graph nodes used, and the colour code indicates the saliency threshold beyond which salient points are considered to compute the boundary recall.
For the two methods, as expected, higher saliency thresholds yield a higher recall for all numbers of nodes in the graph. Consistent with the results on synthetic images, the robust saliency measure yields higher recall by approximately .
Figure 7 shows the resulting oversegmenting graphs on four images from the BSDS300 database using the robust saliency measure. Each image shows the graph at three different resolutions (100, 400 and 900 grid nodes). The graph edges are colourcoded by the saliency, in logarithmic scale. It can be observed that the high saliency edges are consistent with apparent image contours, and that particularly at fine scales the graph follows image contours.
5.3 Comparison to Other Related Methods
The proposed technique shares some characteristics with two different methods in computer vision and image processing. In this section we discuss similarities and differences with them.
5.3.1 Relation to Superpixel Clustering
Superpixel clustering methods produce an oversegmentation of the input image by grouping together (clustering) pixels with similar intensities that are connected. Because of our choice of initial graph as a uniform triangular topology, the dual oversegmenting graph yields a partition of the space into hexagons and triangles (e.g. Figs. 1 and 7) and therefore the compactness of these regions is guaranteed by definition. In the case of superpixels, connectivity is normally only enforced [1, 18] (although in practice it can be ensured given a sufficiently high regularization). While superpixels aim at creating groups of pixels that snap to image boundaries, our proposed method aims at snapping points to image boundaries and connecting those points, hence in practice snapping edges to image boundaries. This difference is of importance for three reasons. First, our proposed method does a 1D search regardless of the image dimensionality, while superpixel methods do an $N$D search. Second, superpixel methods label pixels in the image (without any explicit measure of how different two neighbouring regions are), while our method labels edges in a graph and provides a saliency value for each of these edges, which measures how different the two image regions at each side of the graph edge are. Third, superpixel methods fill the entire image space with pixel clusters, while our proposed method only does so if the initial graph covers the entire image; it is not required to do so, which can be an advantage for certain applications.
Although superpixels and oversegmenting graphs are different in nature, they can yield visually similar results. Figure 8 shows a qualitative comparison of SLIC superpixels [1] with our oversegmenting graphs computed at different resolutions for three example images from the BSDS300 database. It can be observed that our proposed method appears to converge to SLIC superpixels as the number of grid nodes increases, and also provides a saliency measure for graph edges that is consistent with image contours.
5.3.2 Relation to Image Meshing
Image meshing consists of adapting a (usually triangular) mesh to image features. Similarities with our proposed method are relevant because triangle edges and vertices can be regarded as edges and nodes of a graph. Indeed, image meshing can be regarded as an example application of oversegmenting graphs. The main difference between image meshing and oversegmenting graphs is that meshing aims at discretizing the image space into triangular cells that are as regular as possible and adapt well to image features, normally to perform a finite element analysis on the mesh. As a result, the focus is on adapting the triangles to homogeneous image regions rather than adapting edges to image boundaries. For this reason, the meshing problem is formulated in 2D and is computationally more complex than our proposed 1D approach. To the best of our knowledge, image meshing has normally been formulated as a variational problem that requires the optimization of a cost function, as opposed to our approach, which does not.
In order to provide a qualitative comparison of image meshing and our proposed method, oversegmenting graphs can yield a triangulated output by passing the nodes from the adapted graph to the dual graph and connecting these nodes to all nodes in the dual graph that belong to the same edge as the salient point. Figure 9 shows an example of image-based meshing from [11], and the result of applying our proposed method to the same image. In order to produce triangles in the dual graph, nodes from the adapted graph were passed on to the dual graph and connected to all salient nodes originating from the same triangle.
Figure 9(a) shows the original input image, taken from [11]. The original adapted mesh from their paper is shown in Fig. 9(b), side by side with the resulting oversegmenting graph in Fig. 9(c). Since the edges and nodes added to make a triangulated graph do not have any associated saliency measure, the graph is represented in grey, as is the original image. Figures 9(d) and 9(e) show the discretized space using the triangulation, colouring each triangle with the mean intensity of the pixel data within the triangle. Our proposed triangulation has the ability to adapt to the shape, but it produces triangles which are not as uniform as those of the variational meshing approach, mainly because triangle uniformity is not enforced explicitly in our case.
6 Discussion and Conclusions
We have presented a novel graph adaptation method, named oversegmenting graphs, which aims at adapting a graph to image data. Starting from an initial graph, salient image points are searched for along edges, and the graph nodes are iteratively moved to the centroid of connected salient points. The oversegmenting graph is the dual of the adapted graph. The adaptation process lends itself naturally to a measurement of the edge saliency.
We have proposed two methods to find salient points along edges: a method inspired by SLIC superpixels [1], and a novel method which provides robustness against noise. Overall, the robust saliency measure was shown to outperform the SLIC-inspired (Distance-based) method, achieving higher boundary recall and qualitatively better image adaptation. This improvement was particularly obvious on synthetic images with large additive noise, owing to the averaging effect of the robust saliency measure. The effect, although present, was less significant with natural images from the BSDS300, which are not particularly noisy.
Extension of the proposed method to 3D or higher dimensions is straightforward and requires no modification other than using a 3D graph initially. This is because all computations are carried out edgewise or nodewise. Similarly, the proposed method has been described for scalar (grey scale) images, unlike most superpixel methods which are described for colour (RGB) images. Extension to colour images would only involve modifying the saliency point search. This is trivial for the SLIC distance (which is already described in the original SLIC paper [1]). Extension of the robust saliency measure to colour images could also be further investigated.
The proposed method was presented using an initial graph with uniform triangular topology; however, the method is not limited by the type of initial graph. Nevertheless, because the adaptation process moves every graph node to the centroid of the salient points found on the edges sharing that node, it is desirable (unless there is some problem-specific reason not to) that all nodes have the same number of connected edges so that every salient point contributes equally to moving all connected nodes.
We have shown that oversegmenting graphs, with the associated saliency measure, can provide a sparse representation of shapes in images. Such representation has potential for image matching, image classification, and more generally image processing methods that require a description of the topology of content in an image and where speed and computational efficiency are needed.
References
 [1] R. Achanta, A. Shaji, K. Smith, A. Lucchi, P. Fua, and S. Süsstrunk. SLIC superpixels compared to state-of-the-art superpixel methods. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(11):2274–2281, 11 2012.
 [2] W. Aguilar, Y. Frauel, F. Escolano, M. E. Martinez-Perez, A. Espinosa-Romero, and M. A. Lozano. A robust Graph Transformation Matching for non-rigid registration. Image and Vision Computing, 27(7):897–910, 2009.
 [3] P. Arbeláez, M. Maire, C. Fowlkes, and J. Malik. Contour Detection and Hierarchical Image Segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(5):898–916, 5 2011.
 [4] E. Baseski, A. Erdem, and S. Tari. Dissimilarity between two skeletal trees in a context. Pattern Recognition, 42(3):370–385, 3 2009.
 [5] H. Bay, A. Ess, T. Tuytelaars, and L. Van Gool. Speeded-Up Robust Features (SURF). Computer Vision and Image Understanding, 110(3):346–359, 2008.
 [6] S. K. Boyd and R. Müller. Smooth surface meshing for automated finite element model generation from 3D image data. Journal of Biomechanics, 39(7):1287–1295, 1 2006.
 [7] K. Chatfield, V. S. Lempitsky, A. Vedaldi, and A. Zisserman. The devil is in the details: an evaluation of recent feature encoding methods. BMVC, 2(4), 2011.
 [8] G. Csurka, C. Dance, L. Fan, J. Willamowski, and C. Bray. Visual categorization with bags of keypoints. In Workshop on statistical learning in computer vision, ECCV, pages 1–2, 2004.
 [9] C. Di Ruberto. Recognition of shapes by attributed skeletal graphs. Pattern Recognition, 37(1):21–31, 2004.
 [10] P. F. Felzenszwalb and R. Zabih. Dynamic Programming and Graph Algorithms in Computer Vision. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(4):721–740, 4 2011.
 [11] O. Goksel and S. E. Salcudean. Image-Based Variational Meshing. IEEE Transactions on Medical Imaging, 30(1):11–21, 1 2011.
 [12] M. P. Heinrich, M. Jenkinson, M. Bhushan, T. Matin, F. V. Gleeson, S. M. Brady, and J. A. Schnabel. MIND: Modality independent neighbourhood descriptor for multimodal deformable registration. Medical Image Analysis, 16(7):1423–1435, 2012.
 [13] D. E. Ilea and P. F. Whelan. Image segmentation based on the integration of colour-texture descriptors - A review. Pattern Recognition, 44(10-11):2479–2501, 10 2011.
 [14] H. Jegou, M. Douze, C. Schmid, and P. Perez. Aggregating local descriptors into a compact image representation. In 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 3304–3311. IEEE, 6 2010.
 [15] D. Jiang, Y. Shi, D. Yao, M. Wang, and Z. Song. miLBP: a robust and fast modality-independent 3D LBP for multimodal deformable registration. International Journal of Computer Assisted Radiology and Surgery, 11(6):997–1005, 6 2016.
 [16] S. Lazebnik, C. Schmid, and J. Ponce. Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories. In 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Volume 2 (CVPR’06), volume 2, pages 2169–2178. IEEE.
 [17] D. G. Lowe. Distinctive Image Features from Scale-Invariant Keypoints. International Journal of Computer Vision, 60(2):91–110, 11 2004.
 [18] V. Machairas, M. Faessel, D. Cárdenas-Peña, T. Chabardes, T. Walter, and E. Decencière. Waterpixels. IEEE Transactions on Image Processing, 24(11):3707–3716, 11 2015.
 [19] D. Martin, C. Fowlkes, D. Tal, and J. Malik. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001, volume 2, pages 416–423. IEEE Comput. Soc.
 [20] K. Mikolajczyk and C. Schmid. A performance evaluation of local descriptors. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(10):1615–1630, 10 2005.
 [21] F. Remondino, M. G. Spera, E. Nocerino, F. Menna, and F. Nex. State of the art in high density image matching. The Photogrammetric Record, 29(146):144–166, 6 2014.
 [22] H. Rivaz, Z. Karimaghaloo, and D. L. Collins. Self-similarity weighted mutual information: A new nonrigid image registration metric. Medical Image Analysis, 18(2):343–358, 2014.
 [23] F. Serratosa. Fast computation of Bipartite graph matching. Pattern Recognition Letters, 45:244–250, 8 2014.
 [24] H. Si. TetGen, a Delaunay-Based Quality Tetrahedral Mesh Generator. ACM Transactions on Mathematical Software, 41(2):1–36, 2 2015.
 [25] G. Takacs, V. Chandrasekhar, S. Tsai, D. Chen, R. Grzeszczuk, and B. Girod. Unified Real-Time Tracking and Recognition with Rotation-Invariant Fast Features. In 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 934–941. IEEE, 6 2010.
 [26] L. Torresani, V. Kolmogorov, and C. Rother. Feature Correspondence Via Graph Matching: Models and Global Optimization. In Computer Vision – ECCV 2008, pages 596–609. Springer Berlin Heidelberg, Berlin, Heidelberg, 2008.
 [27] M. Vento. A long trip in the charming world of graphs for Pattern Recognition. Pattern Recognition, 48(2):291–301, 2 2015.
 [28] Xiang Bai and L. Latecki. Path Similarity Skeleton Graph Matching. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(7):1282–1292, 7 2008.
 [29] Yangqing Jia, Chang Huang, and T. Darrell. Beyond spatial pyramids: Receptive field learning for pooled image features. In 2012 IEEE Conference on Computer Vision and Pattern Recognition, pages 3370–3377. IEEE, 6 2012.
 [30] Yongzhen Huang, Zifeng Wu, Liang Wang, and Tieniu Tan. Feature Coding in Image Classification: A Comprehensive Study. IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(3):493–506, 3 2014.
 [31] D. Zhang and G. Lu. Review of shape representation and description techniques. Pattern Recognition, 37(1):1–19, 1 2004.
 [32] F. Zhou and F. De la Torre. Factorized Graph Matching. IEEE Transactions on Pattern Analysis and Machine Intelligence, 38(9):1774–1789, 9 2016.