KerGM: Kernelized Graph Matching

11/25/2019 ∙ by Zhen Zhang, et al. ∙ Washington University in St. Louis ∙ William & Mary

Graph matching plays a central role in such fields as computer vision, pattern recognition, and bioinformatics. Graph matching problems can be cast as two types of quadratic assignment problems (QAPs): Koopmans-Beckmann's QAP or Lawler's QAP. In our paper, we provide a unifying view for these two problems by introducing new rules for array operations in Hilbert spaces. Consequently, Lawler's QAP can be considered as the Koopmans-Beckmann's alignment between two arrays in reproducing kernel Hilbert spaces (RKHS), making it possible to efficiently solve the problem without computing a huge affinity matrix. Furthermore, we develop the entropy-regularized Frank-Wolfe (EnFW) algorithm for optimizing QAPs, which has the same convergence rate as the original FW algorithm while dramatically reducing the computational burden for each outer iteration. We conduct extensive experiments to evaluate our approach, and show that our algorithm significantly outperforms the state-of-the-art in both matching accuracy and scalability.


1 Introduction

Graph matching (GM), which aims at finding the optimal correspondence between nodes of two given graphs, is a longstanding problem due to its nonconvex objective function and binary constraints. It arises in many applications, ranging from recognizing actions Brendel and Todorovic (2011); Gaur et al. (2011) to identifying functional orthologs of proteins Elmsallati et al. (2015); Wu et al. (2019). Typically, GM problems can be formulated as two kinds of quadratic assignment problems (QAPs): Koopmans-Beckmann’s QAP Koopmans and Beckmann (1957) or Lawler’s QAP Lawler (1963). Koopmans-Beckmann’s QAP is the structural alignment between two weighted adjacency matrices, which, as a result, can be written as the standard Frobenius inner product between two $n \times n$ matrices, where $n$ denotes the number of nodes. However, Koopmans-Beckmann’s QAP cannot incorporate complex edge attribute information, which is usually of great importance in characterizing the relation between nodes. Lawler’s QAP can tackle this issue, because it attempts to maximize the overall similarity that well encodes the attribute information. However, the key concern of Lawler’s QAP is that it needs to estimate the $n^2 \times n^2$ pairwise affinity matrix, limiting its application to very small graphs.

In our work, we derive an equivalent formulation of Lawler’s QAP, based on a very mild assumption that edge affinities are characterized by kernels Hofmann et al. (2008); Rasmussen (2003). After introducing new rules for array operations in Hilbert spaces, named $\mathcal{H}$-operations, we rewrite Lawler’s QAP as the Koopmans-Beckmann’s alignment between two arrays in a reproducing kernel Hilbert space (RKHS), which allows us to solve it without computing the huge affinity matrix. Taking advantage of the $\mathcal{H}$-operations, we develop a path-following strategy for mitigating the local maxima issue of QAPs. In addition to the kernelized graph matching (KerGM) formulation, we propose a numerical optimization algorithm, the entropy-regularized Frank-Wolfe (EnFW) algorithm, for solving large-scale QAPs. The EnFW has the same convergence rate as the original Frank-Wolfe algorithm, with far less computational burden in each iteration. Extensive experimental results show that our KerGM, together with the EnFW algorithm, achieves superior performance in both matching accuracy and scalability.

Related Work: In the past forty years, a myriad of graph matching algorithms have been proposed Conte et al. (2004), most of which focused on solving QAPs. Previous work Almohamad and Duffuaa (1993); Gold and Rangarajan (1996); Kushinsky et al. (2019) approximated the quadratic term with a linear one, which consequently can be solved by standard linear programming solvers. In Schellewald et al. (2001), several convex relaxation methods are proposed and compared. It is known that convex relaxations can achieve global convergence, but usually perform poorly because the final projection step separates the solution from the original QAP. Concave relaxations Maron and Lipman (2018); Maciel and Costeira (2003) can avoid this problem since their outputs are just permutation matrices. However, concave programming Chinchuluun et al. (2005) is NP-hard, which limits its applications. In Zaslavskiy et al. (2009), a seminal work termed the "path-following algorithm" was proposed, which leverages both the above relaxations by iteratively solving a series of optimization problems that gradually change from convex to concave. In Liu and Qiao (2014); Wang et al. (2018, 2016); Yu et al. (2018), the path-following strategy was further extended and improved. However, all the above algorithms, when applied to Lawler’s QAP, need to compute the affinity matrix. To tackle this challenge, in Zhou and De la Torre (2012), the authors elegantly factorized the affinity matrix into the Kronecker product of smaller matrices. However, this approach still cannot be well applied to large dense graphs, since it scales cubically with the number of edges. Beyond solving the QAP, there are interesting works on graph matching from other perspectives, such as probabilistic matching Zass and Shashua (2008), hypergraph matching Lee et al. (2011), and multigraph matching Yan et al. (2015). We refer to Yan et al. (2016) for a survey of recent advances.

Organization: In Section 2, we introduce the background, including Koopmans-Beckmann’s and Lawler’s QAPs, and kernel functions and their reproducing kernel Hilbert spaces. In Section 3, we present the proposed rules for array operations in Hilbert spaces. Section 4 and Section 5 form the core of our work, where we develop the kernelized graph matching, together with the entropy-regularized Frank-Wolfe optimization algorithm. In Section 6, we report the experimental results. In the supplementary material, we provide proofs of all mathematical results in the paper, along with further technical discussions and more experimental results.

2 Background

2.1 Quadratic Assignment Problems for Graph Matching

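Let $\mathcal{G} = \{A, V, P, E, Q\}$ be an undirected, attributed graph of $n$ nodes and $m$ edges, where $A \in \mathbb{R}^{n \times n}$ is the adjacency matrix, $V = \{v_i\}_{i=1}^{n}$ and $P = [p_1, \ldots, p_n] \in \mathbb{R}^{d_N \times n}$ are the respective node set and node attributes matrix, and $E = \{e_{ij} \mid v_i \text{ and } v_j \text{ are connected}\}$ and $Q = [q_{ij} \mid e_{ij} \in E] \in \mathbb{R}^{d_E \times m}$ are the respective edge set and edge attributes matrix. Given two graphs $\mathcal{G}_1 = \{A_1, V_1, P_1, E_1, Q_1\}$ and $\mathcal{G}_2 = \{A_2, V_2, P_2, E_2, Q_2\}$ of $n$ nodes (we assume $\mathcal{G}_1$ and $\mathcal{G}_2$ have the same number of nodes; if not, we add dummy nodes), the GM problem aims to find a correspondence between nodes in $V_1$ and $V_2$ which is optimal in some sense.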

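For Koopmans-Beckmann’s QAP Koopmans and Beckmann (1957), the optimality refers to the Frobenius inner product maximization between two adjacency matrices after permutation, i.e.,

$$\max_{X} \; \langle A_1 X, X A_2 \rangle_F \quad \text{s.t.} \quad X \in \mathcal{P} = \{X \in \{0,1\}^{n \times n} \mid X \mathbf{1} = \mathbf{1},\ X^\top \mathbf{1} = \mathbf{1}\}, \tag{1}$$

where $\langle A, B \rangle_F = \mathrm{tr}(A^\top B)$ is the Frobenius inner product and $\mathcal{P}$ is the set of permutation matrices. The issue with (1) is that it ignores the complex edge attributes, which are usually of particular importance in characterizing graphs.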
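To make the Koopmans-Beckmann objective concrete, the following is a minimal sketch that evaluates $\langle A_1 X, X A_2 \rangle_F$ for every permutation of a tiny graph and keeps the best one. The toy adjacency matrices and all function names are illustrative assumptions, and the exhaustive search is only meant to show what is being maximized; it is not a practical solver.

```python
from itertools import permutations

def matmul(A, B):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def frobenius(A, B):
    """Frobenius inner product <A, B>_F = sum_ij A_ij * B_ij."""
    return sum(a * b for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def perm_matrix(perm):
    """X[i][a] = 1 iff node i of graph 1 is matched to node a = perm[i] of graph 2."""
    n = len(perm)
    return [[1 if perm[i] == a else 0 for a in range(n)] for i in range(n)]

def kb_objective(A1, A2, X):
    """Koopmans-Beckmann objective <A1 X, X A2>_F."""
    return frobenius(matmul(A1, X), matmul(X, A2))

def brute_force_kb(A1, A2):
    """Enumerate all n! permutations; only feasible for tiny n."""
    n = len(A1)
    return max(permutations(range(n)),
               key=lambda p: kb_objective(A1, A2, perm_matrix(p)))

# Hypothetical toy data: a weighted path 0-1-2 and a relabelled copy of it.
A1 = [[0, 2, 0], [2, 0, 1], [0, 1, 0]]
true_perm = (2, 0, 1)
A2 = [[0] * 3 for _ in range(3)]
for i in range(3):
    for j in range(3):
        A2[true_perm[i]][true_perm[j]] = A1[i][j]

best = brute_force_kb(A1, A2)
best_value = kb_objective(A1, A2, perm_matrix(best))
```

Because the two edge weights differ, the relabelling is the only permutation that aligns every edge with an edge of the same weight, so the search recovers it.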

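For Lawler’s QAP Lawler (1963), the optimality refers to the similarity maximization between the node attribute sets and between the edge attribute sets, i.e.,

$$\max_{X \in \mathcal{P}} \; \sum_{v^1_i \in V_1,\, v^2_a \in V_2} k^N(p^1_i, p^2_a)\, X_{ia} + \sum_{e^1_{ij} \in E_1,\, e^2_{ab} \in E_2} k^E(q^1_{ij}, q^2_{ab})\, X_{ia} X_{jb}, \tag{2}$$

where $k^N$ and $k^E$ are the node and edge similarity measurements, respectively. Furthermore, (2) can be rewritten in compact form:

$$\max_{X \in \mathcal{P}} \; \langle K^N, X \rangle_F + \mathrm{vec}(X)^\top K\, \mathrm{vec}(X), \tag{3}$$

where $K^N \in \mathbb{R}^{n \times n}$ is the node affinity matrix, $K$ is an $n^2 \times n^2$ matrix, defined such that $K_{ia,jb} = k^E(q^1_{ij}, q^2_{ab})$ if $i \neq j$, $a \neq b$, $e^1_{ij} \in E_1$, and $e^2_{ab} \in E_2$; otherwise, $K_{ia,jb} = 0$. It is well known that Koopmans-Beckmann’s QAP is a special case of Lawler’s QAP if we set $K = A_2 \otimes A_1$ and $K^N = 0_{n \times n}$. The issue with (3) is that the size of $K$ scales quartically with respect to $n$, which precludes its application to large graphs. In our work, we will show that Lawler’s QAP can be written in the Koopmans-Beckmann’s form, which can avoid computing $K$.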
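The special-case relation between the two QAPs can be checked numerically on a toy example. The sketch below assumes a column-stacking `vec` (so that entry $X_{ia}$ sits at index $a n + i$) and builds the full Kronecker-product affinity matrix explicitly; the data and helper names are hypothetical. It also makes the size problem visible: even for $n = 3$ the affinity matrix is already $9 \times 9$.

```python
def matmul(A, B):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def frobenius(A, B):
    """Frobenius inner product of two matrices."""
    return sum(a * b for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def kron(A, B):
    """Kronecker product A (x) B of an n x n and an m x m matrix."""
    n, m = len(A), len(B)
    return [[A[r // m][c // m] * B[r % m][c % m] for c in range(n * m)]
            for r in range(n * m)]

def vec(X):
    """Column-stacking vectorization: X[i][a] lands at index a * n + i."""
    n = len(X)
    return [X[i][a] for a in range(n) for i in range(n)]

def quadratic_form(K, x):
    """x^T K x for a dense matrix K."""
    return sum(x[r] * K[r][c] * x[c] for r in range(len(x)) for c in range(len(x)))

# Hypothetical toy adjacency matrices and a permutation matrix X.
A1 = [[0, 2, 0], [2, 0, 1], [0, 1, 0]]
A2 = [[0, 1, 2], [1, 0, 0], [2, 0, 0]]
X = [[0, 0, 1], [1, 0, 0], [0, 1, 0]]

K = kron(A2, A1)                                     # n^2 x n^2 affinity matrix
lawler_value = quadratic_form(K, vec(X))             # vec(X)^T K vec(X)
kb_value = frobenius(matmul(A1, X), matmul(X, A2))   # <A1 X, X A2>_F
```

For $n = 1000$ nodes the same matrix would have $10^{12}$ entries, which is the storage cost the kernelized formulation is designed to avoid.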

2.2 Kernels and reproducing kernel Hilbert spaces

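Given any set $\mathcal{X}$, a kernel $k: \mathcal{X} \times \mathcal{X} \to \mathbb{R}$ is a function for quantitatively measuring the affinity between objects in $\mathcal{X}$. It satisfies that there exist a Hilbert space, $\mathcal{H}$, and an (implicit) feature map $\psi: \mathcal{X} \to \mathcal{H}$, such that $k(x_1, x_2) = \langle \psi(x_1), \psi(x_2) \rangle_{\mathcal{H}}$, $\forall\, x_1, x_2 \in \mathcal{X}$. The space $\mathcal{H}$ is the reproducing kernel Hilbert space associated with $k$.

Note that if $\mathcal{X}$ is a Euclidean space, i.e., $\mathcal{X} = \mathbb{R}^d$, many similarity measurement functions are kernels, such as $\exp(-\|x_1 - x_2\|_2^2)$, $\exp(-\|x_1 - x_2\|_2)$, and $\langle x_1, x_2 \rangle$, $\forall\, x_1, x_2 \in \mathbb{R}^d$.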

3 $\mathcal{H}$-operations for arrays in Hilbert spaces

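Let $\mathcal{H}$ be any Hilbert space, coupled with the inner product $\langle \cdot, \cdot \rangle_{\mathcal{H}}$ taking values in $\mathbb{R}$. Let $\mathcal{H}^{n \times n}$ be the set of all $n \times n$ arrays in $\mathcal{H}$, and let $\Psi, \Xi \in \mathcal{H}^{n \times n}$, i.e., $\Psi_{ij}, \Xi_{ij} \in \mathcal{H}$, $\forall\, i, j = 1, \ldots, n$. Analogous to matrix operations in Euclidean spaces, we make the following addition, transpose, and multiplication rules ($\mathcal{H}$-operations), i.e., $\forall\, X \in \mathbb{R}^{n \times n}$, we have

  1. $\Psi + \Xi,\ \Psi^\top \in \mathcal{H}^{n \times n}$, where $[\Psi + \Xi]_{ij} = \Psi_{ij} + \Xi_{ij}$ and $[\Psi^\top]_{ij} = \Psi_{ji}$.

  2. $\Psi * \Xi \in \mathbb{R}^{n \times n}$, where $[\Psi * \Xi]_{ij} = \sum_{k=1}^{n} \langle \Psi_{ik}, \Xi_{kj} \rangle_{\mathcal{H}}$.

  3. $\Psi \odot X,\ X \odot \Psi \in \mathcal{H}^{n \times n}$, where $[\Psi \odot X]_{ij} = \sum_{k=1}^{n} \Psi_{ik} X_{kj}$ and $[X \odot \Psi]_{ij} = \sum_{k=1}^{n} X_{ik} \Psi_{kj}$.

Note that if $\mathcal{H} = \mathbb{R}$, all the above degenerate to the common operations for matrices in Euclidean spaces. In Fig. 1, we visualize the operation $\Psi \odot X$, where we let $\mathcal{H} = \mathbb{R}^d$, let $\Psi$ be an $n \times n$ array in $\mathbb{R}^d$, and let $X$ be a permutation matrix. It is easy to see that $\Psi \odot X$ is just $\Psi$ after column-permutation.

Figure 1: Visualization of the operation $\Psi \odot X$.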
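For the finite-dimensional case $\mathcal{H} = \mathbb{R}^d$, the three rules can be written down directly. The sketch below stores a Hilbert array as a nested list whose entries are length-$d$ lists; the function names and the toy array are assumptions made for illustration. It checks the column-permutation behaviour described above and that $\mathrm{tr}(\Psi^\top * \Psi)$ equals the sum of squared entry norms.

```python
def dot(u, v):
    """Inner product in H = R^d."""
    return sum(a * b for a, b in zip(u, v))

def h_transpose(Psi):
    """[Psi^T]_ij = Psi_ji."""
    n = len(Psi)
    return [[Psi[j][i] for j in range(n)] for i in range(n)]

def h_star(Psi, Xi):
    """[Psi * Xi]_ij = sum_k <Psi_ik, Xi_kj>; the result is a real matrix."""
    n = len(Psi)
    return [[sum(dot(Psi[i][k], Xi[k][j]) for k in range(n)) for j in range(n)]
            for i in range(n)]

def h_odot_right(Psi, X):
    """[Psi (.) X]_ij = sum_k Psi_ik * X_kj; the result is again an array in H."""
    n, d = len(Psi), len(Psi[0][0])
    return [[[sum(Psi[i][k][c] * X[k][j] for k in range(n)) for c in range(d)]
             for j in range(n)] for i in range(n)]

def h_odot_left(X, Psi):
    """[X (.) Psi]_ij = sum_k X_ik * Psi_kj."""
    n, d = len(Psi), len(Psi[0][0])
    return [[[sum(X[i][k] * Psi[k][j][c] for k in range(n)) for c in range(d)]
             for j in range(n)] for i in range(n)]

def h_frobenius(Psi, Xi):
    """Entrywise sum of inner products, sum_ij <Psi_ij, Xi_ij>."""
    return sum(dot(p, x) for rp, rx in zip(Psi, Xi) for p, x in zip(rp, rx))

# Hypothetical 3 x 3 array in R^2 and a permutation matrix.
Psi = [[[10 * i + j, -(10 * i + j)] for j in range(3)] for i in range(3)]
X = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]

permuted = h_odot_right(Psi, X)
trace_form = sum(h_star(h_transpose(Psi), Psi)[j][j] for j in range(3))
```

Here column 1 of `permuted` is column 0 of `Psi`, and so on, because `X[k][j] = 1` moves column `k` to position `j`.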

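As presented in the following corollary, the multiplication $\odot$ satisfies the associative law.

Corollary 1.

$\forall\, X, Y \in \mathbb{R}^{n \times n}$, $\Psi \odot X \odot Y = \Psi \odot (XY)$ and $Y \odot (X \odot \Psi) = (YX) \odot \Psi$.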

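Based on the $\mathcal{H}$-operations, we can construct the Frobenius inner product on $\mathcal{H}^{n \times n}$.

Proposition 1.

Define the function $\langle \cdot, \cdot \rangle_{F_\mathcal{H}}: \mathcal{H}^{n \times n} \times \mathcal{H}^{n \times n} \to \mathbb{R}$ such that $\langle \Psi, \Xi \rangle_{F_\mathcal{H}} = \mathrm{tr}(\Psi^\top * \Xi) = \sum_{i,j=1}^{n} \langle \Psi_{ij}, \Xi_{ij} \rangle_{\mathcal{H}}$, $\forall\, \Psi, \Xi \in \mathcal{H}^{n \times n}$. Then $\langle \cdot, \cdot \rangle_{F_\mathcal{H}}$ is an inner product on $\mathcal{H}^{n \times n}$.

As an immediate result, the function $\| \cdot \|_{F_\mathcal{H}}$, defined such that $\|\Psi\|_{F_\mathcal{H}} = \sqrt{\langle \Psi, \Psi \rangle_{F_\mathcal{H}}}$, is the Frobenius norm on $\mathcal{H}^{n \times n}$. Next, we introduce two properties of $\langle \cdot, \cdot \rangle_{F_\mathcal{H}}$, which play important roles in developing the convex-concave relaxation of Lawler’s graph matching problem.

Corollary 2.

$\langle \Psi \odot X, \Xi \rangle_{F_\mathcal{H}} = \langle \Psi, \Xi \odot X^\top \rangle_{F_\mathcal{H}}$ and $\langle X \odot \Psi, \Xi \rangle_{F_\mathcal{H}} = \langle \Psi, X^\top \odot \Xi \rangle_{F_\mathcal{H}}$.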
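The first identity in Corollary 2 follows from expanding both sides entrywise. The derivation below is a sketch using only the multiplication rule $[\Psi \odot X]_{ij} = \sum_k \Psi_{ik} X_{kj}$ and the bilinearity of the inner product on $\mathcal{H}$; the second identity follows in the same way.

```latex
\begin{aligned}
\langle \Psi \odot X, \Xi \rangle_{F_\mathcal{H}}
  &= \sum_{i,j} \Big\langle \sum_{k} \Psi_{ik} X_{kj},\ \Xi_{ij} \Big\rangle_{\mathcal{H}}
   = \sum_{i,k} \Big\langle \Psi_{ik},\ \sum_{j} \Xi_{ij} X_{kj} \Big\rangle_{\mathcal{H}} \\
  &= \sum_{i,k} \big\langle \Psi_{ik},\ [\Xi \odot X^\top]_{ik} \big\rangle_{\mathcal{H}}
   = \langle \Psi, \Xi \odot X^\top \rangle_{F_\mathcal{H}} .
\end{aligned}
```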

4 Kernelized graph matching

Before deriving kernelized graph matching, we first present an assumption.

Assumption 1.

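We assume that the edge affinity function $k^E: \mathbb{R}^{d_E} \times \mathbb{R}^{d_E} \to \mathbb{R}$ is a kernel. That is, there exist both an RKHS, $\mathcal{H}_k$, and an (implicit) feature map, $\psi: \mathbb{R}^{d_E} \to \mathcal{H}_k$, such that $k^E(q_1, q_2) = \langle \psi(q_1), \psi(q_2) \rangle_{\mathcal{H}_k}$, $\forall\, q_1, q_2 \in \mathbb{R}^{d_E}$.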

Note that Assumption 1 is rather mild, since kernel functions are powerful and popular in quantifying the similarity between attributes Zhang et al. (2018), Kriege and Mutzel (2012).

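For any graph $\mathcal{G} = \{A, V, P, E, Q\}$, we can construct an array, $\Psi \in \mathcal{H}_k^{n \times n}$:

$$\Psi_{ij} = \begin{cases} \psi(q_{ij}) \in \mathcal{H}_k, & \text{if } (v_i, v_j) \in E, \\ 0_{\mathcal{H}_k} \in \mathcal{H}_k, & \text{otherwise}. \end{cases} \tag{4}$$

Given two graphs $\mathcal{G}_1$ and $\mathcal{G}_2$, let $\Psi^{(1)}$ and $\Psi^{(2)}$ be the corresponding Hilbert arrays of $\mathcal{G}_1$ and $\mathcal{G}_2$, respectively. Then the edge similarity term in Lawler’s QAP (see (2)) can be written as

$$\sum_{e^1_{ij} \in E_1,\, e^2_{ab} \in E_2} k^E(q^1_{ij}, q^2_{ab})\, X_{ia} X_{jb} = \sum_{i,j,a,b} \big\langle \Psi^{(1)}_{ij} X_{jb},\ X_{ia} \Psi^{(2)}_{ab} \big\rangle_{\mathcal{H}_k} = \big\langle \Psi^{(1)} \odot X,\ X \odot \Psi^{(2)} \big\rangle_{F_\mathcal{H}},$$

which shares a similar form with (1), and can be considered as the Koopmans-Beckmann’s alignment between the Hilbert arrays $\Psi^{(1)}$ and $\Psi^{(2)}$. The last term in the display above is just the Frobenius inner product between two Hilbert arrays after permutation. Adding the node affinity term, we write Lawler’s QAP as follows (for convenience in developing the path-following strategy, we write it in the minimization form):

$$\min_{X} \; J_{gm}(X) = -\langle K^N, X \rangle_F - \big\langle \Psi^{(1)} \odot X,\ X \odot \Psi^{(2)} \big\rangle_{F_\mathcal{H}} \quad \text{s.t.} \quad X \in \mathcal{P}. \tag{5}$$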
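The rewriting of the edge similarity term can be checked on a small example where the feature map is known explicitly. The sketch below assumes the linear kernel on $\mathbb{R}^2$, whose feature map is the identity, so that the Hilbert array simply stores the edge attributes (and a zero vector for non-edges). The two toy graphs, the matching matrix, and all function names are hypothetical.

```python
def dot(u, v):
    """Linear kernel k(u, v) = <u, v>; its feature map is the identity."""
    return sum(a * b for a, b in zip(u, v))

def lawler_edge_term(Q1, Q2, X, kernel):
    """Direct O(n^4) sum over all pairs of existing edges."""
    n = len(X)
    total = 0
    for i in range(n):
        for j in range(n):
            for a in range(n):
                for b in range(n):
                    if Q1[i][j] is not None and Q2[a][b] is not None:
                        total += kernel(Q1[i][j], Q2[a][b]) * X[i][a] * X[j][b]
    return total

def hilbert_array(Q, dim):
    """Edge features where an edge exists, the zero vector otherwise."""
    return [[list(q) if q is not None else [0] * dim for q in row] for row in Q]

def h_odot_right(Psi, X):
    """[Psi (.) X]_ij = sum_k Psi_ik * X_kj."""
    n, d = len(Psi), len(Psi[0][0])
    return [[[sum(Psi[i][k][c] * X[k][j] for k in range(n)) for c in range(d)]
             for j in range(n)] for i in range(n)]

def h_odot_left(X, Psi):
    """[X (.) Psi]_ij = sum_k X_ik * Psi_kj."""
    n, d = len(Psi), len(Psi[0][0])
    return [[[sum(X[i][k] * Psi[k][j][c] for k in range(n)) for c in range(d)]
             for j in range(n)] for i in range(n)]

def h_frobenius(Psi, Xi):
    """sum_ij <Psi_ij, Xi_ij>."""
    return sum(dot(p, x) for rp, rx in zip(Psi, Xi) for p, x in zip(rp, rx))

# Hypothetical edge attributes; None marks a missing edge.
Q1 = [[None, (1, 2), (3, 0)], [(1, 2), None, None], [(3, 0), None, None]]
Q2 = [[None, None, (2, 1)], [None, None, (0, 3)], [(2, 1), (0, 3), None]]
X = [[0, 0, 1], [1, 0, 0], [0, 1, 0]]

Psi1, Psi2 = hilbert_array(Q1, 2), hilbert_array(Q2, 2)
sum_form = lawler_edge_term(Q1, Q2, X, dot)
array_form = h_frobenius(h_odot_right(Psi1, X), h_odot_left(X, Psi2))
```

The array form never touches an $n^2 \times n^2$ matrix; it only needs the two $n \times n$ arrays of features.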

4.1 Convex and concave relaxations

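The form (5) inspires an intuitive way to develop convex and concave relaxations. To do this, we first introduce an auxiliary function $J_{aux}(X) = \frac{1}{2} \langle \Psi^{(1)} \odot X, \Psi^{(1)} \odot X \rangle_{F_\mathcal{H}} + \frac{1}{2} \langle X \odot \Psi^{(2)}, X \odot \Psi^{(2)} \rangle_{F_\mathcal{H}}$. Applying Corollary 1 and 2, for any $X \in \mathcal{P}$, which satisfies $X X^\top = X^\top X = I$, we have

$$J_{aux}(X) = \frac{1}{2} \|\Psi^{(1)}\|^2_{F_\mathcal{H}} + \frac{1}{2} \|\Psi^{(2)}\|^2_{F_\mathcal{H}},$$

which is always a constant. Introducing $J_{aux}(X)$ to (5), we obtain convex and concave relaxations: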
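The constancy of the auxiliary function on permutation matrices can be spelled out in one line per term. The following derivation is a sketch that applies Corollary 2 first and Corollary 1 second, and then uses $X X^\top = I$; the second term is handled symmetrically with $X^\top X = I$.

```latex
\begin{aligned}
\langle \Psi^{(1)} \odot X,\ \Psi^{(1)} \odot X \rangle_{F_\mathcal{H}}
  &= \langle \Psi^{(1)},\ (\Psi^{(1)} \odot X) \odot X^\top \rangle_{F_\mathcal{H}}
     && \text{(Corollary 2)} \\
  &= \langle \Psi^{(1)},\ \Psi^{(1)} \odot (X X^\top) \rangle_{F_\mathcal{H}}
     && \text{(Corollary 1)} \\
  &= \langle \Psi^{(1)},\ \Psi^{(1)} \rangle_{F_\mathcal{H}}
   = \|\Psi^{(1)}\|^2_{F_\mathcal{H}} .
\end{aligned}
```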

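$$J_{vex}(X) = J_{gm}(X) + J_{aux}(X) = -\langle K^N, X \rangle_F + \frac{1}{2} \big\| \Psi^{(1)} \odot X - X \odot \Psi^{(2)} \big\|^2_{F_\mathcal{H}}, \tag{6}$$

$$J_{cav}(X) = J_{gm}(X) - J_{aux}(X) = -\langle K^N, X \rangle_F - \frac{1}{2} \big\| \Psi^{(1)} \odot X + X \odot \Psi^{(2)} \big\|^2_{F_\mathcal{H}}. \tag{7}$$

The convexity of $J_{vex}$ is easy to conclude, because the composite function of the squared norm, $\| \cdot \|^2_{F_\mathcal{H}}$, and the linear transformation, $X \mapsto \Psi^{(1)} \odot X - X \odot \Psi^{(2)}$, is convex. A similar argument gives the concavity of $J_{cav}$.

It is interesting to see that the term $\frac{1}{2} \| \Psi^{(1)} \odot X - X \odot \Psi^{(2)} \|^2_{F_\mathcal{H}}$ in (6) is just the distance between Hilbert arrays. If we set the map $\psi(x) = x$, then the convex relaxation of (1) is recovered (see Aflalo et al. (2015)).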

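Path following strategy: Leveraging these two relaxations Zaslavskiy et al. (2009), we minimize $J_{gm}$ by successively optimizing a series of sub-problems parameterized by $\alpha \in [0, 1]$:

$$\min_{X} \; J_\alpha(X) = (1 - \alpha) J_{vex}(X) + \alpha J_{cav}(X) \quad \text{s.t.} \quad X \in \mathcal{D} = \{X \in \mathbb{R}_+^{n \times n} \mid X \mathbf{1} = \mathbf{1},\ X^\top \mathbf{1} = \mathbf{1}\}, \tag{8}$$

where $\mathcal{D}$ is the doubly stochastic relaxation of the permutation matrix set, $\mathcal{P}$. We start at $\alpha = 0$ and find the unique minimum. Then we gradually increase $\alpha$ until $\alpha = 1$. That is, we optimize $J_{\alpha + \Delta\alpha}$ with the local minimizer of $J_\alpha$ as the initial point. Finally, we output the local minimizer of $J_{\alpha = 1}$. We refer to Zaslavskiy et al. (2009), Zhou and De la Torre (2012), and Wang et al. (2016) for detailed descriptions and improvements.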
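The warm-start structure of the path-following strategy can be sketched independently of graph matching. The example below is a deliberately simplified one-dimensional stand-in: the "convex" and "concave" functions, the interval $[0, 1]$ in place of the doubly stochastic set, and the projected gradient solver are all assumptions chosen so the loop is easy to follow. Only the outer loop mirrors the strategy described above.

```python
def path_following(local_solver, x0, alphas):
    """Solve the sub-problems in order, warm-starting each from the previous minimizer."""
    x = x0
    for alpha in alphas:
        x = local_solver(alpha, x)
    return x

def toy_solver(alpha, x, step=0.1, iters=500):
    """Projected gradient descent on [0, 1] for a toy objective
    J_alpha(x) = (1 - alpha) * (x - 0.4)**2 + alpha * (-(x - 0.5)**2)."""
    for _ in range(iters):
        grad = 2.0 * (1.0 - alpha) * (x - 0.4) - 2.0 * alpha * (x - 0.5)
        x = min(1.0, max(0.0, x - step * grad))  # projection onto [0, 1]
    return x

alphas = [i / 10 for i in range(11)]          # alpha = 0, 0.1, ..., 1
convex_minimizer = toy_solver(0.0, 1.0)       # unique minimum of the convex problem
final_x = path_following(toy_solver, 1.0, alphas)
```

In this toy problem the minimizer starts in the interior at $\alpha = 0$ and is driven to an endpoint of the interval as the objective becomes concave, which is the one-dimensional analogue of ending at a vertex of the feasible set.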

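Gradients computation: If we use first-order optimization methods, we need only the gradients:

$$\nabla J_\alpha(X) = (1 - 2\alpha) \big[ (\Psi^{(1)} * \Psi^{(1)}) X + X (\Psi^{(2)} * \Psi^{(2)}) \big] - 2\, (\Psi^{(1)} \odot X) * \Psi^{(2)} - K^N, \tag{9}$$

where $[\Psi^{(1)} * \Psi^{(1)}]_{ij} = \sum_{k} k^E(q^1_{ik}, q^1_{kj})$, $\forall\, i, j$; $[\Psi^{(2)} * \Psi^{(2)}]_{ab} = \sum_{c} k^E(q^2_{ac}, q^2_{cb})$, $\forall\, a, b$; and $[(\Psi^{(1)} \odot X) * \Psi^{(2)}]_{ia} = \sum_{j,b} X_{jb}\, k^E(q^1_{ij}, q^2_{ab})$, $\forall\, i, a$. In the supplementary material, we provide compact matrix multiplication forms for computing (9).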

4.2 Approximate explicit feature maps

Based on the above discussion, we significantly reduce the space cost of Lawler’s QAP by avoiding computing the affinity matrix $K$. However, the time cost of computing the gradient with (9) is $O(n^4)$, which can be further reduced by employing the approximate explicit feature maps Rahimi and Recht (2008); Wu et al. (2016).

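For the kernel $k^E: \mathbb{R}^{d_E} \times \mathbb{R}^{d_E} \to \mathbb{R}$, we may find an explicit feature map $\hat{\psi}: \mathbb{R}^{d_E} \to \mathbb{R}^{D}$, such that

$$k^E(q_1, q_2) = \langle \psi(q_1), \psi(q_2) \rangle_{\mathcal{H}_k} \approx \langle \hat{\psi}(q_1), \hat{\psi}(q_2) \rangle, \quad \forall\, q_1, q_2 \in \mathbb{R}^{d_E}. \tag{10}$$

For example, if $k^E(q_1, q_2) = \exp(-\gamma \|q_1 - q_2\|_2^2)$, then $\hat{\psi}$ is the Fourier random feature map Rahimi and Recht (2008):

$$\hat{\psi}(q) = \sqrt{\tfrac{2}{D}} \big[ \cos(\omega_1^\top q + b_1), \ldots, \cos(\omega_D^\top q + b_D) \big]^\top, \tag{11}$$

where, in the standard construction, the frequencies $\omega_i$ are drawn from $N(0, 2\gamma I)$ and the phases $b_i$ uniformly from $[0, 2\pi]$.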
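A minimal sketch of a random Fourier feature map for the Gaussian kernel $\exp(-\gamma \|q_1 - q_2\|_2^2)$ is given below. It uses the standard Rahimi-Recht parameterization (frequencies from $N(0, 2\gamma I)$, phases uniform on $[0, 2\pi]$), which is an assumption here and may differ in scaling from the authors' implementation; the function names, the test points, and the choice $D = 4000$ are likewise illustrative.

```python
import math
import random

def make_fourier_features(dim_in, D, gamma, rng):
    """Return a feature map psi_hat: R^dim_in -> R^D with
    <psi_hat(q1), psi_hat(q2)> ~= exp(-gamma * ||q1 - q2||^2)."""
    sigma = math.sqrt(2.0 * gamma)  # frequency standard deviation
    omegas = [[rng.gauss(0.0, sigma) for _ in range(dim_in)] for _ in range(D)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(D)]
    scale = math.sqrt(2.0 / D)

    def psi_hat(q):
        return [scale * math.cos(sum(w * x for w, x in zip(omega, q)) + b)
                for omega, b in zip(omegas, phases)]

    return psi_hat

def gaussian_kernel(q1, q2, gamma):
    """Exact kernel value, used only for comparison."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(q1, q2)))

rng = random.Random(0)
psi_hat = make_fourier_features(dim_in=2, D=4000, gamma=1.0, rng=rng)

q1, q2 = [0.2, 0.7], [0.5, 0.1]
approx = sum(a * b for a, b in zip(psi_hat(q1), psi_hat(q2)))
exact = gaussian_kernel(q1, q2, gamma=1.0)
```

The approximation error shrinks roughly like $1/\sqrt{D}$, so a large $D$ is used here only to make the comparison tight; it is not a recommendation for practice.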

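Note that in practice, the performance of $\hat{\psi}$ is good enough for relatively small values of $D$ Zhang et al. (2018). By virtue of explicit feature maps, we obtain a new graph representation $\hat{\Psi} \in (\mathbb{R}^{D})^{n \times n}$:

$$\hat{\Psi}_{ij} = \begin{cases} \hat{\psi}(q_{ij}) \in \mathbb{R}^{D}, & \text{if } (v_i, v_j) \in E, \\ 0 \in \mathbb{R}^{D}, & \text{otherwise}. \end{cases} \tag{12}$$

Its space cost is $O(D n^2)$. Now computing the gradient-related terms $\Psi^{(1)} * \Psi^{(1)}$, $\Psi^{(2)} * \Psi^{(2)}$, and $(\Psi^{(1)} \odot X) * \Psi^{(2)}$ in (9) becomes rather simple. We first slice $\hat{\Psi}$ into $D$ matrices $\hat{\Psi}(:,:,i) \in \mathbb{R}^{n \times n}$, $i = 1, \ldots, D$. Then it can be easily shown that

$$\Psi * \Psi \approx \sum_{i=1}^{D} \hat{\Psi}(:,:,i)\, \hat{\Psi}(:,:,i), \qquad (\Psi^{(1)} \odot X) * \Psi^{(2)} \approx \sum_{i=1}^{D} \hat{\Psi}^{(1)}(:,:,i)\, X\, \hat{\Psi}^{(2)}(:,:,i), \tag{13}$$

whose first and second terms respectively involve $D$ and $2D$ matrix multiplications of size $n \times n$. Hence, the time complexity is reduced to $O(D n^3)$. Moreover, gradient computations with (13) are highly parallelizable, which also contributes to scalability.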
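When the feature map is explicit, the slice-based formulas can be compared against an entrywise evaluation of the gradient. The sketch below assumes symmetric feature arrays with zero diagonal (undirected graphs without self-loops), random toy features, and hypothetical function names; it computes the gradient once from $D$ matrix slices and once with the nested sums over kernel values, and the two should agree up to rounding.

```python
import random

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def combine(A, B, ca=1.0, cb=1.0):
    """Entrywise ca * A + cb * B."""
    return [[ca * a + cb * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def slices(P):
    """Slice an n x n x D array into D real n x n matrices."""
    n, D = len(P), len(P[0][0])
    return [[[P[i][j][c] for j in range(n)] for i in range(n)] for c in range(D)]

def gradient_sliced(P1, P2, X, KN, alpha):
    """Gradient from D slice products, O(D n^3) arithmetic."""
    n = len(X)
    G1 = G2 = C = [[0.0] * n for _ in range(n)]
    for S1, S2 in zip(slices(P1), slices(P2)):
        G1 = combine(G1, matmul(S1, S1))
        G2 = combine(G2, matmul(S2, S2))
        C = combine(C, matmul(matmul(S1, X), S2))
    lin = combine(matmul(G1, X), matmul(X, G2))
    return combine(combine(lin, C, 1.0 - 2.0 * alpha, -2.0), KN, 1.0, -1.0)

def gradient_direct(P1, P2, X, KN, alpha):
    """Entrywise gradient with kernel values k(u, v) = <u, v>, O(n^4) arithmetic."""
    n = len(X)
    k = lambda u, v: sum(a * b for a, b in zip(u, v))
    grad = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for a in range(n):
            g1x = sum(k(P1[i][c], P1[c][j]) * X[j][a] for j in range(n) for c in range(n))
            xg2 = sum(X[i][b] * k(P2[b][c], P2[c][a]) for b in range(n) for c in range(n))
            cross = sum(X[j][b] * k(P1[i][j], P2[a][b]) for j in range(n) for b in range(n))
            grad[i][a] = (1.0 - 2.0 * alpha) * (g1x + xg2) - 2.0 * cross - KN[i][a]
    return grad

def random_symmetric_array(n, D, rng):
    """Toy feature array: symmetric, zero diagonal."""
    P = [[[0.0] * D for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            P[i][j] = P[j][i] = [rng.random() for _ in range(D)]
    return P

rng = random.Random(1)
n, D = 4, 3
P1, P2 = random_symmetric_array(n, D, rng), random_symmetric_array(n, D, rng)
X = [[rng.random() for _ in range(n)] for _ in range(n)]
KN = [[rng.random() for _ in range(n)] for _ in range(n)]

fast = gradient_sliced(P1, P2, X, KN, alpha=0.3)
slow = gradient_direct(P1, P2, X, KN, alpha=0.3)
max_diff = max(abs(f - s) for rf, rs in zip(fast, slow) for f, s in zip(rf, rs))
```

Each slice product is independent of the others, which is why this form parallelizes well.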

5 Entropy-regularized Frank-Wolfe optimization algorithm

The state-of-the-art method for optimizing problem (8) is the Frank-Wolfe algorithm Maron and Lipman (2018); Leordeanu et al. (2009); Vogelstein et al. (2015); Zhou and De la Torre (2015), whose every iteration involves linear programming to obtain the optimal direction $Y^*$, i.e.,

$$Y^* = \arg\min_{Y \in \mathcal{D}} \; \langle \nabla J_\alpha(X), Y \rangle_F, \tag{14}$$

which is usually solved by the Hungarian algorithm Kuhn (1955). Optimizing $J_\alpha$ may need to call the Hungarian algorithm many times, which is quite time-consuming for large graphs.

Figure 2: Hungarian vs Sinkhorn.

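In this work, instead of minimizing $J_\alpha$ in (8), we consider the following problem,

$$\min_{X} \; F_\alpha(X) = J_\alpha(X) + \lambda H(X) \quad \text{s.t.} \quad X \in \mathcal{D}_n, \tag{15}$$

where $\mathcal{D}_n = \{X \in \mathbb{R}_+^{n \times n} \mid X \mathbf{1} = \frac{1}{n} \mathbf{1},\ X^\top \mathbf{1} = \frac{1}{n} \mathbf{1}\}$, $H(X) = \sum_{i,j} X_{ij} \log X_{ij}$ is the negative entropy, and the node affinity matrix $K^N$ in $J_\alpha$ (see (5) and (8)) is normalized as $K^N \to \frac{1}{n} K^N$ to balance the node and edge affinity terms. The observation is that if $\lambda$ is set to be small enough, the solution of (15), after being multiplied by $n$, will approximate that of the original QAP (8) as much as possible. We design the entropy-regularized Frank-Wolfe algorithm ("EnFW" for short) for optimizing (15), in each outer iteration of which we solve the following nonlinear problem.

$$\min_{Y \in \mathcal{D}_n} \; \langle \nabla J_\alpha(X), Y \rangle_F + \lambda H(Y). \tag{16}$$

Note that (16) can be solved extremely efficiently by the Sinkhorn-Knopp algorithm Cuturi (2013). Theoretically, the Sinkhorn-Knopp algorithm converges at the linear rate, i.e., $0 < \limsup_{k \to \infty} \|Y_{k+1} - Y^*\| / \|Y_k - Y^*\| < 1$. An empirical comparison between the runtimes of these two algorithms is shown in Fig. 2, where we can see that the Sinkhorn-Knopp algorithm for solving (16) is much faster than the Hungarian algorithm for solving (14).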

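The EnFW algorithm description: We first give necessary definitions. Write the quadratic function $J_\alpha(X + s(Y - X)) = J_\alpha(X) + s \langle \nabla J_\alpha(X), Y - X \rangle_F + \frac{1}{2} \mathrm{vec}(Y - X)^\top \nabla^2 J_\alpha(X)\, \mathrm{vec}(Y - X)\, s^2$. Then, we define the coefficient of the quadratic term as

$$Q(X, Y) = \frac{1}{2} \mathrm{vec}(Y - X)^\top \nabla^2 J_\alpha(X)\, \mathrm{vec}(Y - X) = J_\alpha(Y) - J_\alpha(X) - \langle \nabla J_\alpha(X), Y - X \rangle_F, \tag{17}$$

where the second equality holds because $J_\alpha$ is a quadratic function. Next, similar to the original FW algorithm, we define the nonnegative gap function $g(X)$ as

$$g(X) = \langle \nabla J_\alpha(X), X \rangle_F + \lambda H(X) - \min_{Y \in \mathcal{D}_n} \big[ \langle \nabla J_\alpha(X), Y \rangle_F + \lambda H(Y) \big]. \tag{18}$$

Proposition 2.

If $X^*$ is an optimal solution of (15), then $g(X^*) = 0$.

Therefore, the gap function characterizes a necessary condition for optimal solutions. Note that for any $X \in \mathcal{D}_n$, if $g(X) = 0$, then we say "$X$ is a first-order stationary point". Now with the definitions of $Q(X, Y)$ and $g(X)$, we detail the EnFW procedure in Algorithm 1.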

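1: Initialize $X_0 \in \mathcal{D}_n$
2: while not converged do
3:    Compute the gradient $\nabla J_\alpha(X_t)$ based on (9) or (13),
4:    Obtain the optimal direction $Y_t$ by solving (16), i.e., $Y_t = \arg\min_{Y \in \mathcal{D}_n} \langle \nabla J_\alpha(X_t), Y \rangle_F + \lambda H(Y)$,
5:    Compute $G_t = g(X_t)$ and $Q_t = Q(X_t, Y_t)$,
6:    Determine the stepsize $s_t$: if $Q_t \le 0$, $s_t = 1$; else $s_t = \min\{G_t / (2 Q_t),\ 1\}$,
7:    Update $X_{t+1} = X_t + s_t (Y_t - X_t)$.
8: end
9: Output the solution $X^*_\alpha$.
Algorithm 1 The EnFW optimization algorithm for minimizing (15)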
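The loop structure of an entropy-regularized Frank-Wolfe iteration can be sketched on a small quadratic objective. Everything specific below is an assumption made for illustration: the objective is the unattributed special case $J_\alpha(X) = -\langle A_1 X, X A_2 \rangle_F + \frac{1-2\alpha}{2}(\|A_1 X\|_F^2 + \|X A_2\|_F^2)$ without a node term, the direction is found by a fixed number of Sinkhorn iterations, the curvature term is computed as $J(Y) - J(X) - \langle \nabla J(X), Y - X \rangle$ (exact for a quadratic), and the stepsize is the minimizer of the upper bound $F(X) - s\,G + s^2 Q$ on $[0, 1]$. It is a sketch of the technique, not the authors' implementation.

```python
import math

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def inner(A, B):
    return sum(a * b for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def combine(A, B, ca=1.0, cb=1.0):
    """Entrywise ca * A + cb * B."""
    return [[ca * a + cb * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def neg_entropy(X):
    """H(X) = sum X_ij log X_ij, with 0 log 0 = 0."""
    return sum(x * math.log(x) for row in X for x in row if x > 0.0)

def sinkhorn(C, lam, iters=500):
    """Approximate minimizer of <C, Y> + lam * H(Y) with marginals 1/n."""
    n = len(C)
    shift = min(min(row) for row in C)
    K = [[math.exp(-(c - shift) / lam) for c in row] for row in C]
    u, v = [1.0] * n, [1.0] * n
    for _ in range(iters):
        u = [(1.0 / n) / sum(K[i][j] * v[j] for j in range(n)) for i in range(n)]
        v = [(1.0 / n) / sum(K[i][j] * u[i] for i in range(n)) for j in range(n)]
    return [[u[i] * K[i][j] * v[j] for j in range(n)] for i in range(n)]

def make_objective(A1, A2, alpha):
    """Toy quadratic J_alpha and its gradient for symmetric A1, A2."""
    A1sq, A2sq = matmul(A1, A1), matmul(A2, A2)
    def J(X):
        A1X, XA2 = matmul(A1, X), matmul(X, A2)
        aux = 0.5 * (inner(A1X, A1X) + inner(XA2, XA2))
        return -inner(A1X, XA2) + (1.0 - 2.0 * alpha) * aux
    def grad(X):
        lin = combine(matmul(A1sq, X), matmul(X, A2sq))
        return combine(lin, matmul(matmul(A1, X), A2), 1.0 - 2.0 * alpha, -2.0)
    return J, grad

def enfw(J, grad, lam, X, max_iter=100, tol=1e-10):
    values = [J(X) + lam * neg_entropy(X)]
    for _ in range(max_iter):
        g = grad(X)
        Y = sinkhorn(g, lam)                       # direction from the regularized problem
        D = combine(Y, X, 1.0, -1.0)               # D = Y - X
        slope = inner(g, D)
        gap = -slope + lam * (neg_entropy(X) - neg_entropy(Y))
        if gap <= tol:                             # (approximately) stationary
            break
        Q = J(Y) - J(X) - slope                    # curvature along D
        s = 1.0 if Q <= 0.0 else min(gap / (2.0 * Q), 1.0)
        X = combine(X, D, 1.0, s)
        values.append(J(X) + lam * neg_entropy(X))
    return X, values

# Hypothetical toy graphs: A2 is A1 with nodes relabelled 0->2, 1->0, 2->1.
A1 = [[0.0, 1.0, 0.0], [1.0, 0.0, 2.0], [0.0, 2.0, 0.0]]
A2 = [[0.0, 2.0, 1.0], [2.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
J, grad = make_objective(A1, A2, alpha=0.0)
X0 = [[1.0 / 9.0] * 3 for _ in range(3)]
X_opt, values = enfw(J, grad, lam=0.05, X=X0)
```

The list `values` records the regularized objective after every update; with this stepsize rule each accepted step cannot increase it, because the entropy term is convex along the segment from `X` to `Y`.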

After obtaining the optimal solution path $X^*_\alpha$, $\alpha = 0 : \Delta\alpha : 1$, we discretize $X^*_1$ by the Hungarian Kuhn (1955) or the greedy discretization algorithm Cho et al. (2010) to get the binary matching matrix. We next highlight the differences between the EnFW algorithm and the original FW algorithm: (i) We find the optimal direction by solving a nonlinear convex problem (16) with the efficient Sinkhorn-Knopp algorithm, instead of solving the linear problem (14). (ii) We give an explicit formula for computing the stepsize $s$, instead of making a line search on $[0, 1]$ for optimizing $F_\alpha(X + s(Y - X))$ or estimating the Lipschitz constant of $\nabla F_\alpha$ Pedregosa et al. (2018).
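One way to see where an explicit stepsize can come from is the following short derivation. It is a sketch based only on the quadratic expansion of $J_\alpha$, the convexity of the negative entropy $H$, and the definitions of the curvature coefficient $Q(X, Y)$ and the gap $g(X)$ with $Y$ solving (16); it is not taken from the paper's proofs.

```latex
\begin{aligned}
F_\alpha\big(X + s(Y - X)\big)
  &= J_\alpha(X) + s \langle \nabla J_\alpha(X), Y - X \rangle_F + s^2 Q(X, Y)
     + \lambda H\big((1 - s) X + s Y\big) \\
  &\le J_\alpha(X) + s \langle \nabla J_\alpha(X), Y - X \rangle_F + s^2 Q(X, Y)
     + \lambda \big[(1 - s) H(X) + s H(Y)\big] \\
  &= F_\alpha(X) - s\, g(X) + s^2 Q(X, Y), \qquad s \in [0, 1].
\end{aligned}
```

Minimizing the right-hand side over $s \in [0, 1]$ gives $s = 1$ when $Q(X, Y) \le 0$ and $s = \min\{ g(X) / (2 Q(X, Y)),\ 1 \}$ otherwise, and in both cases the bound is no larger than $F_\alpha(X)$.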

5.1 Convergence analysis

In this part, we present the convergence properties of the proposed EnFW algorithm, including the sequentially decreasing property of the objective function and the convergence rates.

Theorem 1.

The generated objective function value sequence, $\{F_\alpha(X_t)\}_{t \ge 0}$, will decreasingly converge. The generated points sequence, $\{X_t\}_{t \ge 0}$, will weakly converge to the first-order stationary point, at the rate $O(1/\sqrt{t})$, i.e.,

(19)

where , and

is the largest absolute eigenvalue of

.

If $J_\alpha$ is convex, which happens when $\alpha$ is small (see (8)), then we have a tighter bound $O(1/t)$.

Theorem 2.

If is convex, we have .

Note that in both cases, convex and non-convex, our EnFW achieves the same (up to a constant coefficient) convergence rate with the original FW algorithm (see Jaggi (2013) and Pedregosa et al. (2018)). Thanks to the efficiency of the Sinkhorn-Knopp algorithm, we need much less time to finish each iteration. Therefore, our optimization algorithm is more computationally efficient than the original FW algorithm.

6 Experiments

In this section, we conduct extensive experiments to demonstrate the matching performance and scalability of our kernelized graph matching framework. We implement all the algorithms using Matlab on an Intel i7-7820HQ, 2.90 GHz CPU with 64 GB RAM. Our codes and data are available at https://github.com/ZhenZhang19920330/KerGM_Code.

Notations: We use $\text{KerGM}_{\text{I}}$ to denote our algorithm when we use exact edge affinity kernels, and use $\text{KerGM}_{\text{II}}$ to denote it when we use approximate explicit feature maps.

Baseline methods: We compare our algorithm with many state-of-the-art graph (network) matching algorithms: (i) Integer projected fixed point method (IPFP) Leordeanu et al. (2009), (ii) Spectral matching with affine constraints (SMAC) Cour et al. (2007), (iii) Probabilistic graph matching (PM) Zass and Shashua (2008), (iv) Re-weighted random walk matching (RRWM) Cho et al. (2010), (v) Factorized graph matching (FGM) Zhou and De la Torre (2012), (vi) Branch path following for graph matching (BPFG) Wang et al. (2016), (vii) Graduated assignment graph matching (GAGM) Gold and Rangarajan (1996), (viii) Global network alignment using multiscale spectral signatures (GHOST) Patro and Kingsford (2012), (ix) Triangle alignment (TAME) Mohammadi et al. (2017), and (x) Maximizing accuracy in global network alignment (MAGNA) Saraph and Milenković (2014). Note that GHOST, TAME, and MAGNA are popular protein-protein interaction (PPI) network aligners.

Settings: For all the baseline methods, we used the parameters recommended in the public code. For our method, if not specified, we set the regularization parameter (see (15)) and the path following parameters . We use the Hungarian algorithm for final discretization. We refer to the supplementary material for other implementation details.

6.1 Synthetic datasets

We evaluate algorithms on the synthetic Erdős–Rényi Erdős and Rényi (1959) random graphs, following the experimental protocol in Gold and Rangarajan (1996); Zhou and De la Torre (2012); Cho et al. (2010). For each trial, we generate two graphs: the reference graph $\mathcal{G}_1$ and the perturbed graph $\mathcal{G}_2$, each of which has $n_{in}$ inlier nodes and $n_{out}$ outlier nodes. Each edge in $\mathcal{G}_1$ is randomly generated with probability $\rho \in [0, 1]$. The edges $e^1_{ij} \in E_1$ are associated with the edge attributes $q^1_{ij}$. The corresponding edge $e^2_{p(i)p(j)} \in E_2$ has the attribute $q^2_{p(i)p(j)} = q^1_{ij} + \epsilon$, where $p$ is a permutation map for inlier nodes, and $\epsilon \sim N(0, \sigma^2)$ is the Gaussian noise. For the baseline methods, the edge affinity value between $q^1_{ij}$ and $q^2_{ab}$ is computed as . For our method, we use the Fourier random features (11) to approximate the Gaussian kernel, and represent each graph by an array in $\mathbb{R}^{D}$. We set the parameter and the dimension .
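A reference graph and its perturbed, relabelled copy can be generated along the lines of the protocol above. The sketch below makes several assumptions that are not stated in the text: edge attributes are scalars drawn uniformly from $[0, 1]$, outlier nodes are omitted, and the function name and parameter values are illustrative.

```python
import random

def make_graph_pair(n, rho, sigma, rng):
    """Return (Q1, Q2, perm): symmetric edge-attribute tables of a random graph and
    of its noisy copy under the node relabelling i -> perm[i]; None marks a non-edge."""
    perm = list(range(n))
    rng.shuffle(perm)
    Q1 = [[None] * n for _ in range(n)]
    Q2 = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < rho:                  # keep edge (i, j) with probability rho
                q = rng.random()                    # assumption: uniform attribute in [0, 1)
                Q1[i][j] = Q1[j][i] = q
                a, b = perm[i], perm[j]
                Q2[a][b] = Q2[b][a] = q + rng.gauss(0.0, sigma)  # additive Gaussian noise
    return Q1, Q2, perm

rng = random.Random(0)
Q1, Q2, perm = make_graph_pair(n=6, rho=1.0, sigma=0.0, rng=rng)    # exact, fully connected
S1, S2, _ = make_graph_pair(n=6, rho=0.5, sigma=0.1, rng=rng)       # sparse and noisy
```

Setting `rho=1.0` and `sigma=0.0` gives the exact-matching, fully connected case, which is convenient for checking that a matcher recovers `perm`.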

Comparing matching accuracy. We perform the comparison under three parameter settings, in all of which we use the same number of inlier nodes, $n_{in}$. Note that, different from the standard protocol Zhou and De la Torre (2012), we use relatively large graphs to highlight the advantage of our method. (i) We change the number of outlier nodes, $n_{out}$, from 0 to 50 while fixing the noise, $\sigma$, and the edge density, $\rho$. (ii) We change $\sigma$ from 0 to 0.2 while fixing $n_{out}$ and $\rho$. (iii) We change $\rho$ from 0.3 to 1 while fixing $n_{out}$ and $\sigma$. For all cases in these settings, we repeat the experiments 100 times and report the average accuracy and standard error in Fig. 3 (a). Clearly, our method outperforms all the baseline methods with statistical significance.

Comparing scalability. To fairly compare the scalability of different algorithms, we consider the exact matching between fully connected graphs, i.e., $n_{out} = 0$, $\sigma = 0$, and $\rho = 1$. We change the number of nodes, $n$, from 50 to 2000, and report the CPU time of each algorithm in Fig. 3 (b). We can see that all the baseline methods can handle only graphs with fewer than 200 nodes because of the expensive space cost of the matrix $K$ (see (3)). However, KerGM can finish Lawler’s graph matching problem with 2000 nodes in reasonable time.

Analyzing parameter sensitivity. To analyze the parameter sensitivity of KerGM, we vary the regularization parameter, $\lambda$, and the dimension, $D$, of Fourier random features. We conduct large subgraph matching experiments by setting , , , and . We repeat the experiments 50 times and report the average accuracies and standard errors. In Fig. 4, we show the results under different $\lambda$ and different $D$. We can see that (i) smaller $\lambda$ leads to better performance, which can be easily understood because the entropy regularizer will perturb the original optimal solution, and (ii) the dimension $D$ does not much affect KerGM, which implies that in practice, we can use relatively small $D$ for reducing the time and space complexity.

Figure 3: Comparison of graph matching on synthetic datasets.

Figure 4: (a) Parameter sensitivity study of the regularizer $\lambda$. (b) Parameter sensitivity study of the dimension, $D$, of the random Fourier feature.

6.2 Image datasets

The CMU House Sequence dataset has 111 frames of a house, each of which has 30 labeled landmarks. We follow the experimental protocol in Zhou and De la Torre (2012); Wang et al. (2016). We match all the image pairs, spaced by 0:10:90 frames. We consider two node settings: and . We build graphs by using Delaunay triangulation Lee and Schachter (1980) to connect landmarks. The edge attributes are the pairwise distances between nodes. For all methods, we compute the edge affinity as . In Fig. 5, we report the average matching accuracy and objective function (3) value ratio for every gap. It can be seen that on this dataset, KerGM and FGM achieve the best performance, and are slightly better than BPFG when outliers exist, i.e., .

Figure 5: Comparison of graph matching on the CMU house dataset.

The Pascal dataset Leordeanu et al. (2012) has 20 pairs of motorbike images and 30 pairs of car images. For each pair, the detected feature points and manually labeled correspondences are provided. Following Zhou and De la Torre (2012); Wang et al. (2016), we randomly select 0:2:20 outliers from the background to compare different methods. For each node, $v_i$, its attribute, $p_i$, is assigned as the orientation of the normal vector at that point to the contour where the point was sampled. Nodes are connected by Delaunay triangulation Lee and Schachter (1980). For each edge, $e_{ij}$, its attribute $q_{ij}$ equals $[d_{ij}, \theta_{ij}]^\top$, where $d_{ij}$ is the distance between $v_i$ and $v_j$, and $\theta_{ij}$ is the absolute angle between the edge and the horizontal line. For all methods, the node affinity is computed as . The edge affinity is computed as . Fig. 6 (a) shows a matching result of KerGM. In Fig. 6 (b), we report the matching accuracies and CPU running time. From the perspective of matching accuracy, KerGM, BPFG, and FGM consistently outperform other methods. When the number of outliers increases, KerGM and BPFG perform slightly better than FGM. However, from the perspective of running time, the time cost of BPFG is much higher than that of the others.

Figure 6: (a) A matching example for a pair of motorbike images generated by KerGM, where green and red lines respectively indicate correct and incorrect matches. (b) Comparison of graph matching on the Pascal dataset.

6.3 The protein-protein interaction network dataset

The S. cerevisiae (yeast) PPI network Collins et al. (2007) dataset is popularly used to evaluate PPI network aligners because it has known true node correspondences. It consists of an unweighted high-confidence PPI network with 1004 proteins (nodes) and 8323 PPIs (edges), and five noisy PPI networks generated by adding 5%, 10%, 15%, 20%, 25% low-confidence PPIs. We do graph matching between the high-confidence network and every noisy network. To apply KerGM, we generate edge attributes by the heat diffusion matrix Hu et al. (2014); Chung and Graham (1997), $H_t = \exp(-tL) = \sum_{i} \exp(-\lambda_i t)\, u_i u_i^\top$, where $L$ is the normalized Laplacian matrix Chung and Graham (1997), and $\{(\lambda_i, u_i)\}$ are eigenpairs of $L$. The edge attributes vector is assigned as . We use the Fourier random features (11), and set and . We compare KerGM (to the best of our knowledge, KerGM is the first one that uses Lawler’s graph matching formulation to solve the PPI network alignment problem) with the state-of-the-art PPI aligners: TAME, GHOST, and MAGNA. In Fig. 7, we report the matching accuracies. Clearly, KerGM significantly outperforms the baselines. Especially when the noise levels are 20% or 25%, KerGM’s accuracies are more than 50 percentage points higher than those of other algorithms.

Figure 7: Results on PPI networks.

7 Conclusion

In this work, based on a mild assumption regarding edge affinity values, we provided KerGM, a unifying framework for Koopmans-Beckmann’s and Lawler’s QAPs, within which both QAPs can be considered as the alignment between arrays in RKHS. Then we derived convex and concave relaxations and the corresponding path-following strategy. To make KerGM more scalable to large graphs, we developed the computationally efficient entropy-regularized Frank-Wolfe optimization algorithm. KerGM achieved promising performance on both image and biology datasets. Thanks to its scalability, we believe KerGM can be potentially useful for many applications in the real world.

8 Acknowledgment

This work was supported in part by the AFOSR grant FA9550-16-1-0386.

References

  • [1] Y. Aflalo, A. Bronstein, and R. Kimmel (2015) On convex relaxation of graph isomorphism. Proceedings of the National Academy of Sciences 112 (10), pp. 2942–2947. Cited by: §4.1.
  • [2] H. Almohamad and S. O. Duffuaa (1993) A linear programming approach for the weighted graph matching problem. IEEE Transactions on pattern analysis and machine intelligence 15 (5), pp. 522–525. Cited by: §1.
  • [3] W. Brendel and S. Todorovic (2011) Learning spatiotemporal graphs of human activities. In 2011 International Conference on Computer Vision, pp. 778–785. Cited by: §1.
  • [4] A. Chinchuluun, E. Rentsen, and P. M. Pardalos (2005) A numerical method for concave programming problems. In Continuous Optimization, pp. 251–273. Cited by: §1.
  • [5] M. Cho, J. Lee, and K. M. Lee (2010) Reweighted random walks for graph matching. In European conference on Computer vision, pp. 492–505. Cited by: §5, §6.1, §6.
  • [6] F. R. Chung and F. C. Graham (1997) Spectral graph theory. American Mathematical Soc.. Cited by: §6.3.
  • [7] S. R. Collins, P. Kemmeren, X. Zhao, J. F. Greenblatt, F. Spencer, F. C. Holstege, J. S. Weissman, and N. J. Krogan (2007) Toward a comprehensive atlas of the physical interactome of saccharomyces cerevisiae. Molecular & Cellular Proteomics 6 (3), pp. 439–450. Cited by: §6.3.
  • [8] D. Conte, P. Foggia, C. Sansone, and M. Vento (2004) Thirty years of graph matching in pattern recognition. International journal of pattern recognition and artificial intelligence 18 (03), pp. 265–298. Cited by: §1.
  • [9] T. Cour, P. Srinivasan, and J. Shi (2007) Balanced graph matching. In Advances in Neural Information Processing Systems, pp. 313–320. Cited by: §6.
  • [10] M. Cuturi (2013) Sinkhorn distances: lightspeed computation of optimal transport. In Advances in neural information processing systems, pp. 2292–2300. Cited by: §5.
  • [11] A. Elmsallati, C. Clark, and J. Kalita (2015) Global alignment of protein-protein interaction networks: a survey. IEEE/ACM transactions on computational biology and bioinformatics 13 (4), pp. 689–705. Cited by: §1.
  • [12] P. Erdős and A. Rényi (1959) On random graphs i. Publ. Math. Debrecen 6, pp. 290–297. Cited by: §6.1.
  • [13] U. Gaur, Y. Zhu, B. Song, and A. Roy-Chowdhury (2011) A “string of feature graphs” model for recognition of complex activities in natural videos. In 2011 International Conference on Computer Vision, pp. 2595–2602. Cited by: §1.
  • [14] S. Gold and A. Rangarajan (1996) A graduated assignment algorithm for graph matching. IEEE Transactions on pattern analysis and machine intelligence 18 (4), pp. 377–388. Cited by: §1, §6.1, §6.
  • [15] T. Hofmann, B. Schölkopf, and A. J. Smola (2008) Kernel methods in machine learning. The annals of statistics, pp. 1171–1220. Cited by: §1.
  • [16] N. Hu, R. M. Rustamov, and L. Guibas (2014) Stable and informative spectral signatures for graph matching. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2305–2312. Cited by: §6.3.
  • [17] M. Jaggi (2013) Revisiting frank-wolfe: projection-free sparse convex optimization.. In ICML (1), pp. 427–435. Cited by: §5.1.
  • [18] T. C. Koopmans and M. Beckmann (1957) Assignment problems and the location of economic activities. Econometrica: journal of the Econometric Society, pp. 53–76. Cited by: §1, §2.1.
  • [19] N. M. Kriege and P. Mutzel (2012) Subgraph matching kernels for attributed graphs. In ICML, Cited by: §4.
  • [20] H. W. Kuhn (1955) The hungarian method for the assignment problem. Naval research logistics quarterly 2 (1-2), pp. 83–97. Cited by: §5, §5.
  • [21] Y. Kushinsky, H. Maron, N. Dym, and Y. Lipman (2019) Sinkhorn algorithm for lifted assignment problems. SIAM Journal on Imaging Sciences 12 (2), pp. 716–735. Cited by: §1.
  • [22] E. L. Lawler (1963) The quadratic assignment problem. Management science 9 (4), pp. 586–599. Cited by: §1, §2.1.
  • [23] D. T. Lee and B. J. Schachter (1980-06-01) Two algorithms for constructing a delaunay triangulation. International Journal of Computer & Information Sciences 9 (3), pp. 219–242. External Links: ISSN 1573-7640, Document Cited by: §6.2, §6.2.
  • [24] J. Lee, M. Cho, and K. M. Lee (2011) Hyper-graph matching via reweighted random walks. In CVPR 2011, pp. 1633–1640. Cited by: §1.
  • [25] M. Leordeanu, M. Hebert, and R. Sukthankar (2009) An integer projected fixed point method for graph matching and map inference. In Advances in neural information processing systems, pp. 1114–1122. Cited by: §5, §6.
  • [26] M. Leordeanu, R. Sukthankar, and M. Hebert (2012) Unsupervised learning for graph matching. International journal of computer vision 96 (1), pp. 28–45. Cited by: §6.2.
  • [27] Z. Liu and H. Qiao (2014) GNCCP—graduated nonconvexity and concavity procedure. IEEE transactions on pattern analysis and machine intelligence 36 (6), pp. 1258–1267. Cited by: §1.
  • [28] J. Maciel and J. P. Costeira (2003) A global solution to sparse correspondence problems. IEEE Transactions on Pattern Analysis & Machine Intelligence (2), pp. 187–199. Cited by: §1.
  • [29] H. Maron and Y. Lipman (2018) (Probably) concave graph matching. In Advances in Neural Information Processing Systems, pp. 408–418. Cited by: §1, §5.
  • [30] S. Mohammadi, D. F. Gleich, T. G. Kolda, and A. Grama (2017) Triangular alignment tame: a tensor-based approach for higher-order network alignment. IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB) 14 (6), pp. 1446–1458. Cited by: §6.
  • [31] R. Patro and C. Kingsford (2012) Global network alignment using multiscale spectral signatures. Bioinformatics 28 (23), pp. 3105–3114. Cited by: §6.
  • [32] F. Pedregosa, A. Askari, G. Negiar, and M. Jaggi (2018) Step-size adaptivity in projection-free optimization. arXiv preprint arXiv:1806.05123. Cited by: §5.1, §5.
  • [33] A. Rahimi and B. Recht (2008) Random features for large-scale kernel machines. In Advances in neural information processing systems, pp. 1177–1184. Cited by: §4.2, §4.2.
  • [34] C. E. Rasmussen (2003) Gaussian processes in machine learning. In Summer School on Machine Learning, pp. 63–71. Cited by: §1.
  • [35] V. Saraph and T. Milenković (2014) MAGNA: maximizing accuracy in global network alignment. Bioinformatics 30 (20), pp. 2931–2940. Cited by: §6.
  • [36] C. Schellewald, S. Roth, and C. Schnörr (2001) Evaluation of convex optimization techniques for the weighted graph-matching problem in computer vision. In Joint Pattern Recognition Symposium, pp. 361–368. Cited by: §1.
  • [37] J. T. Vogelstein, J. M. Conroy, V. Lyzinski, L. J. Podrazik, S. G. Kratzer, E. T. Harley, D. E. Fishkind, R. J. Vogelstein, and C. E. Priebe (2015) Fast approximate quadratic programming for graph matching. PLOS one 10 (4), pp. e0121002. Cited by: §5.
  • [38] T. Wang, H. Ling, C. Lang, and S. Feng (2018) Graph matching with adaptive and branching path following. IEEE transactions on pattern analysis and machine intelligence 40 (12), pp. 2853–2867. Cited by: §1.
  • [39] T. Wang, H. Ling, C. Lang, and J. Wu (2016) Branching path following for graph matching. In European Conference on Computer Vision, pp. 508–523. Cited by: §1, §4.1, §6.2, §6.2, §6.
  • [40] L. Wu, I. E. Yen, J. Chen, and R. Yan (2016) Revisiting random binning features: fast convergence and strong parallelizability. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1265–1274. Cited by: §4.2.
  • [41] L. Wu, I. E. Yen, Z. Zhang, K. Xu, L. Zhao, X. Peng, Y. Xia, and C. Aggarwal (2019) Scalable global alignment graph kernel using random features: from node embedding to graph embedding. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1418–1428. Cited by: §1.
  • [42] J. Yan, J. Wang, H. Zha, X. Yang, and S. Chu (2015) Consistency-driven alternating optimization for multigraph matching: a unified approach. IEEE Transactions on Image Processing 24 (3), pp. 994–1009. Cited by: §1.
  • [43] J. Yan, X. Yin, W. Lin, C. Deng, H. Zha, and X. Yang (2016) A short survey of recent advances in graph matching. In Proceedings of the 2016 ACM on International Conference on Multimedia Retrieval, pp. 167–174. Cited by: §1.
  • [44] T. Yu, J. Yan, Y. Wang, W. Liu, et al. (2018) Generalizing graph matching beyond quadratic assignment model. In Advances in Neural Information Processing Systems, pp. 853–863. Cited by: §1.
  • [45] M. Zaslavskiy, F. Bach, and J. Vert (2009) A path following algorithm for the graph matching problem. IEEE Transactions on Pattern Analysis and Machine Intelligence 31 (12), pp. 2227–2242. Cited by: §1, §4.1.
  • [46] R. Zass and A. Shashua (2008) Probabilistic graph and hypergraph matching. In 2008 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8. Cited by: §1, §6.
  • [47] Z. Zhang, M. Wang, Y. Xiang, Y. Huang, and A. Nehorai (2018) Retgk: graph kernels based on return probabilities of random walks. In Advances in Neural Information Processing Systems, pp. 3964–3974. Cited by: §4.2, §4.
  • [48] F. Zhou and F. De la Torre (2012) Factorized graph matching. In 2012 IEEE Conference on Computer Vision and Pattern Recognition, pp. 127–134. Cited by: §1, §4.1, §6.1, §6.1, §6.2, §6.2, §6.
  • [49] F. Zhou and F. De la Torre (2015) Factorized graph matching. IEEE transactions on pattern analysis and machine intelligence 38 (9), pp. 1774–1789. Cited by: §5.