An Efficient Multilinear Optimization Framework for Hypergraph Matching

11/09/2015 ∙ by Quynh Nguyen, et al. ∙ Universität Saarland

Hypergraph matching has recently become a popular approach for solving correspondence problems in computer vision as it allows to integrate higher-order geometric information. Hypergraph matching can be formulated as a third-order optimization problem subject to the assignment constraints which turns out to be NP-hard. In recent work, we have proposed an algorithm for hypergraph matching which first lifts the third-order problem to a fourth-order problem and then solves the fourth-order problem via optimization of the corresponding multilinear form. This leads to a tensor block coordinate ascent scheme which has the guarantee of providing monotonic ascent in the original matching score function and leads to state-of-the-art performance both in terms of achieved matching score and accuracy. In this paper we show that the lifting step to a fourth-order problem can be avoided yielding a third-order scheme with the same guarantees and performance but being two times faster. Moreover, we introduce a homotopy type method which further improves the performance.







1 Introduction

Graph matching is among the most challenging tasks of graph processing and lies at the heart of various fields in pattern recognition, machine learning and computer vision. In computer vision, it has been used for solving several types of problems, for instance object recognition [32], feature correspondence [11, 42], texture regularity discovery [20], shape matching [16, 38, 46], object tracking [2] and surface registration [48]. Graph matching also finds applications in document processing tasks like optical character recognition [26, 17], image analysis (2D and 3D) [43, 33, 6, 37], and bioinformatics [39, 44, 41].

In this paper, we focus on the application of graph matching to the feature correspondence problem. Given sets of points/features extracted from two images, the task is to find for each point/feature in the first image a corresponding point/feature in the second image while maximizing the similarity between matched pairs. The simplest, first-order approach is to match each feature in the first set to its nearest neighbor in the second set by comparing their local descriptors. This naive approach fails in the presence of repeated patterns, textures or non-discriminative local appearances. To overcome this problem, second-order methods [27, 28, 29, 24, 9, 14, 42, 49, 50] try to enforce geometric consistency between pairs of correspondences, for instance using distances between points. However, pairwise relations are often not enough to capture the entire geometric structure of a point set. Therefore, higher-order approaches [47, 8, 15, 25, 48] that take into account higher-order relations between features/points have been proposed. This paper falls into this line of research. In particular, we cast the correspondence problem as a hypergraph matching problem using higher-order similarities instead of the unary or pairwise ones used by previous methods. Depending on the chosen similarities, this formulation allows for both scaling and rotation invariance. However, instead of concentrating on how to build these similarities, see [15, 7, 19, 35], the focus of this paper is how to solve the resulting optimization problem.

Compared to graph matching, algorithms for hypergraph matching are less studied in the computer vision literature. The reason might be the difficult optimization problem which requires not only to deal with the combinatorial constraints but also to maximize a high degree polynomial function. Most prior work has relaxed the constraints in order to use concepts from continuous optimization [15, 25, 47, 8]. Our main idea is to reformulate the problem. Instead of maximizing a third-degree polynomial we maximize the corresponding multilinear form. This allows the direct integration of assignment constraints and yields a simple yet efficient block-coordinate ascent framework. The idea of this paper is based on our previous work [36], where the third-order problem has been lifted to a fourth-order problem in order to show the equivalence of maximizing the score function and its associated multilinear form. In this paper we show by a careful analysis that the lifting to a fourth-order problem can be avoided. The resulting third-order algorithms based on maximization of the third-order multilinear form have the same guarantees and properties as in [36] and achieve similar state-of-the-art matching score and matching accuracy while being two times faster. Moreover, we provide in this paper a quite generic framework, which allows the adaptation of our whole approach to matching problems with different assignment constraints, as well as to other general third-order optimization problems with combinatorial constraints.

Extensive experiments on both synthetic and realistic datasets show that all our algorithms, including the third-order and fourth-order ones, significantly outperform the current state-of-the-art in terms of both matching score and matching accuracy, in particular for very challenging settings where different kinds of noise are present in the data. In terms of running time, our algorithms are also on average significantly faster than previous approaches. All proofs and additional experimental results can be found in the appendix.

1.1 Related Work

The graph resp. hypergraph matching problem is known to be NP-hard except for special cases where polynomial-time algorithms exist, e.g. for planar graphs [21]. In order to make the problem computationally tractable, a myriad of approximate algorithms have been proposed over the last three decades, aiming at an acceptable trade-off between the complexity of the algorithm and matching accuracy. They can be categorized from different perspectives and we refer to [13, 18] for an extensive review. In this section, we review only those approximate algorithms that use Lawler's formulations for both graph and hypergraph matching as they are closely related to our work. In particular, the graph matching problem can be formulated as a quadratic assignment problem (QAP)

$$\max_{x \in M} \sum_{i,j=1}^{n} A_{ij}\, x_i x_j,$$

while the hypergraph matching problem is formulated as a higher-order assignment problem (HAP)

$$\max_{x \in M} \sum_{i,j,k=1}^{n} \mathcal{F}_{ijk}\, x_i x_j x_k.$$

In these formulations, $A$ and $\mathcal{F}$ refer to the affinity matrix and affinity tensor respectively, and $M$ is some matching constraint set depending on the specific application. Also, depending on whether the problem is graph matching or hypergraph matching, the corresponding algorithms will be called second-order methods or higher-order methods.

Among recent second-order methods, Leordeanu and Hebert [27] introduce the Spectral Matching (SM) algorithm, and Cour et al. [14] introduce Spectral Matching with Affine Constraint (SMAC). Both algorithms are based on the best rank-1 approximation of the affinity matrix. Torresani et al. [42] design a complex objective function which can be efficiently optimized by dual decomposition. Leordeanu et al. [29] propose the Integer Projected Fixed Point (IPFP) algorithm, which optimizes the quadratic objective function via a gradient-type algorithm interleaved with projection onto the assignment constraints using the Hungarian method. Lee et al. [24] tackle the graph matching problem using stochastic sampling, whereas Cho et al. [10] introduce a reweighted random walk (RRWM) where the reweighting is done in order to enforce the matching constraints. Zaslavskiy et al. [46] propose a path-following algorithm based on convex-concave relaxations. Zhou and Torre [49] factorize the affinity matrix and then compute a convex-concave relaxation of the objective function which is finally solved by a path-following method. Along this line there is also the work of Liu et al. [31] and Zhou and Torre [50] extending previous methods [49, 46] to deal with directed graph matching. Recently, Cho et al. [12] propose a novel max pooling matching algorithm, in which they tweak the power method to better cope with noise in the affinities. Although the algorithm comes without theoretical guarantees, it turns out to perform very well in practice, in particular when there is a large number of outliers.

The hypergraph matching problem (i.e. the HAP) is much less studied in the literature. Duchenne et al. [15] formulate the hypergraph matching problem as a tensor eigenvalue problem and propose a higher-order power method for solving it. Zass and Shashua [47] introduce a probabilistic view on the problem with their Hypergraph Matching Method (HGM). Their idea is to marginalize the tensor to a vector and then solve the lower-dimensional problem. Chertok and Keller [8] extend this idea and marginalize the tensor to a matrix, leading to a quadratic assignment problem which is then tackled using spectral methods. Since both methods are based on tensor marginalization, some part of the higher-order information is lost. Moreover, they cannot handle the one-to-one matching constraint during the iterations since it is only considered at the final discretization step. Lee et al. [25] extend the reweighted random walk approach of [24] to hypergraph matching. Their algorithm aims at enforcing the matching constraint via a bi-stochastic normalization scheme performed at each iteration. In [48], Zeng et al. propose to use pseudo-boolean optimization [4] for 3D surface matching, where higher-order terms are decomposed into second-order terms and the quadratic pseudo-boolean optimization (QPBO) algorithm [22] is employed to solve the problem. In [36] we have proposed a tensor block coordinate ascent framework, see also [45], for hypergraph matching based on a multilinear reformulation of the HAP, where the third-order problem is lifted to a fourth-order problem and one directly optimizes over the set of assignment matrices.

1.2 Notation

Vectors are denoted by bold lowercase letters (e.g. $\mathbf{x}$). The elements of vectors are denoted by subscripts, while superscripts denote iteration numbers. Tensors are denoted by calligraphic uppercase letters (e.g. $\mathcal{F}$), while their associated multilinear forms are denoted by the corresponding standard uppercase letter (e.g. F). The symbol $\otimes$ is used to denote the tensor product.

2 Hypergraph Matching Formulation

Hypergraphs are powerful in matching problems because they allow modeling of relations involving groups of more than two vertices, so that higher-order information can be integrated. We concentrate our study on 3-uniform hypergraphs, i.e. each hyperedge describes a relation between up to 3 vertices. Every 3-uniform hypergraph can be represented by a third-order tensor. Let $P$ and $Q$ be two point sets with $n_1 = |P| \le n_2 = |Q|$. The matching problem consists of finding a binary assignment matrix $X \in \{0,1\}^{n_1 \times n_2}$ such that $X_{ia} = 1$ if vertex $i$ of $P$ matches vertex $a$ of $Q$ and $X_{ia} = 0$ otherwise. The set of all possible assignment matrices that assign each vertex of $P$ to exactly one vertex of $Q$ is given by

$$M = \Big\{ X \in \{0,1\}^{n_1 \times n_2} \;\Big|\; \sum_{a=1}^{n_2} X_{ia} = 1 \ \text{for all}\ i = 1, \dots, n_1 \Big\}.$$

Suppose that we have a function which maps each pair of triples $(i_1, i_2, i_3)$ in $P$ and $(a_1, a_2, a_3)$ in $Q$ to a similarity weight. For example, the weight is calculated as the similarity of the corresponding angles of the two triangles formed by $(i_1, i_2, i_3)$ and $(a_1, a_2, a_3)$ as shown in Fig. 1. Then, the score of a matching $X \in M$ is defined as follows [15]

$$S(X) = \sum_{i_1, i_2, i_3 = 1}^{n_1} \; \sum_{a_1, a_2, a_3 = 1}^{n_2} \mathcal{F}_{(i_1, a_1), (i_2, a_2), (i_3, a_3)}\, X_{i_1 a_1} X_{i_2 a_2} X_{i_3 a_3}.$$

For ease of notation, we introduce a linear ordering of the candidate correspondences so that each $X \in M$ can be rewritten as a vector $x \in \{0,1\}^{n}$ with $n = n_1 n_2$, and $\mathcal{F}$ as a tensor in $\mathbb{R}^{n \times n \times n}$. With this convention, the score function can be computed in the following way

$$S(x) = \sum_{i,j,k=1}^{n} \mathcal{F}_{ijk}\, x_i x_j x_k. \tag{1}$$
We recall that a third-order tensor is called symmetric if its entries $\mathcal{F}_{ijk}$ are invariant under any permutation of the indices $(i,j,k)$. In particular, the tensor $x \otimes x \otimes x$ is symmetric, and thus the non-symmetric part of $\mathcal{F}$ is "averaged" out in the computation of $S$. Therefore, without loss of generality, we can assume that $\mathcal{F}$ is symmetric (note that $\mathcal{F}$ can always be symmetrized, without changing the score function, by averaging its entries over all six permutations of the indices). In this paper, we consider only terms of order 3, i.e. $\mathcal{F}_{ijk} = 0$ if $i = j$, $j = k$ or $i = k$. However, note that it is possible to integrate first-order terms on the main diagonal $\mathcal{F}_{iii}$, and pairwise potentials $\mathcal{F}_{ijj}$ for $i \neq j$. This is also the major advantage of hypergraph matching over conventional graph matching methods, because one can combine local appearance similarities, pairwise similarities (e.g. distances) and higher-order invariants (e.g. angles of triplets of points) in a unified framework.
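The averaging argument above is easy to check numerically. The following minimal NumPy sketch (illustrative only, not the authors' implementation; the function names are ours) symmetrizes a tensor by averaging over all six index permutations and verifies that the score in Eq. (1) is unchanged:

```python
import itertools
import numpy as np

def symmetrize(F):
    """Average a third-order tensor over all six permutations of its indices."""
    return sum(np.transpose(F, p) for p in itertools.permutations(range(3))) / 6.0

def score(F, x):
    """S(x) = sum_{ijk} F_ijk x_i x_j x_k, cf. Eq. (1)."""
    return np.einsum('ijk,i,j,k->', F, x, x, x)

rng = np.random.default_rng(0)
F = rng.random((4, 4, 4))   # generally non-symmetric
x = rng.random(4)
# x (x) x (x) x is symmetric, so the non-symmetric part of F is averaged out:
assert np.isclose(score(F, x), score(symmetrize(F), x))
```

This is why assuming a symmetric tensor costs no generality: the score only "sees" the symmetric part.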

Fig. 1: Illustration of the construction of the affinity tensor. Each entry is computed by comparing the angles of the two triangles formed by the three candidate correspondences $(i_1, a_1)$, $(i_2, a_2)$ and $(i_3, a_3)$.
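The angle-based construction illustrated in Fig. 1 can be sketched as follows. This is a hedged toy implementation, not the paper's exact construction: we assume a Gaussian similarity on angle differences with a bandwidth `sigma` of our own choosing, and we build a dense tensor, which is only practical for tiny problems (real implementations sample triples and use sparse storage):

```python
import itertools
import numpy as np

def angles(p, q, r):
    """Interior angles of the triangle (p, q, r), in radians."""
    def ang(a, b, c):  # angle at vertex a
        u, v = b - a, c - a
        cosv = u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
        return np.arccos(np.clip(cosv, -1.0, 1.0))
    return np.array([ang(p, q, r), ang(q, r, p), ang(r, p, q)])

def affinity_tensor(P, Q, sigma=0.5):
    """Dense angle-based affinity tensor over candidate correspondences.

    Entry ((i1,a1), (i2,a2), (i3,a3)) compares the angles of triangle
    (i1, i2, i3) in P with those of (a1, a2, a3) in Q via a Gaussian kernel.
    """
    n1, n2 = len(P), len(Q)
    n = n1 * n2
    F = np.zeros((n, n, n))
    for i1, i2, i3 in itertools.permutations(range(n1), 3):
        aP = angles(P[i1], P[i2], P[i3])
        for a1, a2, a3 in itertools.permutations(range(n2), 3):
            aQ = angles(Q[a1], Q[a2], Q[a3])
            d2 = np.sum((aP - aQ) ** 2)
            F[i1 * n2 + a1, i2 * n2 + a2, i3 * n2 + a3] = np.exp(-d2 / sigma ** 2)
    return F
```

Since interior angles are invariant under rotation, translation and scaling of a point set, a transformed copy of $P$ yields affinity close to 1 exactly on the true correspondences.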

3 Mathematical Foundations for Multilinear Optimization Framework

In this section, we derive the basis for our tensor block coordinate ascent framework for solving hypergraph matching problems. The general idea is to optimize, instead of the score function, the associated third-order multilinear form, which is defined as

$$F(x, y, z) = \sum_{i,j,k=1}^{n} \mathcal{F}_{ijk}\, x_i y_j z_k. \tag{2}$$

It turns out that the score function in Eq. (1) is a special case of the multilinear form when all the arguments are the same, i.e. $S(x) = F(x, x, x)$. The multilinear form is called symmetric if it is invariant under any permutation of its arguments, i.e. $F(x, y, z) = F(y, x, z) = F(z, y, x)$ for all $x, y, z$. Note that if the tensor $\mathcal{F}$ is symmetric then F is also symmetric. In the following, we write $F(\cdot, y, z)$ to denote the vector in $\mathbb{R}^{n}$ whose $i$-th entry is $F(e_i, y, z)$ for all $i$, and write $F(\cdot, \cdot, z)$ to denote the matrix in $\mathbb{R}^{n \times n}$ whose $(i,j)$-th entry is $F(e_i, e_j, z)$. Note that the positions of the dots do not matter in the case of a symmetric multilinear form because the function is invariant under any permutation of its arguments.
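In NumPy, the multilinear form and the partial contractions $F(\cdot, y, z)$ and $F(\cdot, \cdot, z)$ are each a single `einsum` call. The following illustrative sketch (function names are ours) also verifies the relation $S(x) = F(x, x, x)$ and the linearity in each argument:

```python
import numpy as np

def multilinear(F, x, y, z):
    """F(x, y, z) = sum_{ijk} F_ijk x_i y_j z_k, cf. Eq. (2)."""
    return np.einsum('ijk,i,j,k->', F, x, y, z)

def contract_to_vector(F, y, z):
    """The vector F(., y, z), whose i-th entry is F(e_i, y, z)."""
    return np.einsum('ijk,j,k->i', F, y, z)

def contract_to_matrix(F, z):
    """The matrix F(., ., z), whose (i, j)-th entry is F(e_i, e_j, z)."""
    return np.einsum('ijk,k->ij', F, z)

rng = np.random.default_rng(1)
F = rng.random((5, 5, 5))
x, y, z = rng.random(5), rng.random(5), rng.random(5)
# the score function is the multilinear form with all arguments equal
assert np.isclose(multilinear(F, x, x, x), np.einsum('ijk,i,j,k->', F, x, x, x))
# fixing two arguments leaves a linear function of the remaining one
assert np.isclose(contract_to_vector(F, y, z) @ x, multilinear(F, x, y, z))
```

The second assertion is the observation the block coordinate updates below rely on: with two arguments fixed, maximizing over the third is a linear problem.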

The hypergraph matching problem can be written as the maximization of a third-order score function subject to the assignment constraints,

$$\max_{x \in M} S(x) = \max_{x \in M} \sum_{i,j,k=1}^{n} \mathcal{F}_{ijk}\, x_i x_j x_k. \tag{3}$$
In [36] we propose to solve instead the equivalent fourth-order problem

$$\max_{x \in M} S^4(x) = \max_{x \in M} \sum_{i,j,k,l=1}^{n} \mathcal{F}^4_{ijkl}\, x_i x_j x_k x_l, \tag{4}$$

where the new symmetric fourth-order tensor $\mathcal{F}^4$ is obtained by lifting the third-order tensor $\mathcal{F}$ via

$$\mathcal{F}^4_{ijkl} = \tfrac{1}{4}\big( \mathcal{F}_{jkl} + \mathcal{F}_{ikl} + \mathcal{F}_{ijl} + \mathcal{F}_{ijk} \big), \tag{5}$$

so that $S^4(x) = \langle \mathbf{1}, x \rangle\, S(x)$, which differs from $S(x)$ only by the constant factor $\langle \mathbf{1}, x \rangle = n_1$ on the set of assignment matrices.
The reason for this lifting step is discussed below. The main idea of [36] is then to optimize instead of the score function the associated multilinear form. The following theorem establishes equivalence of these problems under the assumption that the score function is convex on .

Theorem 3.1 (Nguyen et al. [36])

Let $\mathcal{F}^4$ be a symmetric fourth-order tensor and suppose the corresponding score function $S^4$ is convex. Then it holds for any compact constraint set $D \subset \mathbb{R}^{n}$,

$$\max_{x \in D} S^4(x) = \max_{x, y, z, t \in D} F^4(x, y, z, t).$$

Note that a given score function need not be convex. In [36] we show that every fourth-order score function can be modified so that it becomes convex, while solving (4) for the original and the modified score function turns out to be equivalent. The maximization of the multilinear form can then be done in two ways. Either one fixes all but one or all but two arguments and maximizes the remaining arguments over the set of assignment matrices . As shown in [36], this scheme boils down to solving a sequence of Linear Assignment Problems (LAPs) or a sequence of (approximate) QAPs which can be done efficiently by existing lower-order methods.

The convexity of the score function is crucial for the proof of Theorem 3.1. The lifting step (5) has been proposed in [36] because non-trivial third-order score functions are not convex on $\mathbb{R}^{n}$, as the following lemma shows.

Lemma 3.2 (Nguyen et al. [36])

Let $S$ be a third-order score function defined as in Eq. (1). If $S$ is not constant zero, then $S$ is not convex.

In the following we show that a statement similar to Theorem 3.1 holds also in the third-order case. This allows us to propose algorithms directly for the third-order case without the lifting step, leading to algorithms that are two times faster. The new key insight to establish this result is that convexity on the whole of $\mathbb{R}^{n}$ is not required. If one just requires that the third-order score function is convex on the positive orthant, then Lemma 3.2 is no longer true. In fact, convexity is only required in a "pointwise" fashion, which we make precise in Theorem 3.3. Similar to our previous Theorem 3.1, we state the main Theorem 3.3 of this paper for general compact sets, which include the set of assignment matrices as a special case. Although this general form of the theorem may allow generalizations of our proposed framework to other applications, in this paper we only focus on the case $D = M$ for the application of hypergraph matching.

Theorem 3.3

Let $D \subset \mathbb{R}^{n}$ be a compact set, $F$ a symmetric third-order multilinear form and $S$ the associated third-order score function, i.e. $S(x) = F(x, x, x)$. Then we have the following chain of implications: 1) $\Rightarrow$ 2) $\Rightarrow$ 3) $\Rightarrow$ 4), where:

  1. The matrix $F(\cdot, \cdot, z)$ is positive semidefinite for all $z \in D$.

  2. $2\, F(x, y, z) \le F(x, x, z) + F(y, y, z)$ for all $x, y, z \in D$.

  3. It holds for all $x, y, z \in D$ that

$$F(x, y, z) \le \max\{ S(x), S(y), S(z) \}. \tag{6}$$

  4. The optimization of the multilinear form is equivalent to the optimization of its associated score function:

$$\max_{x \in D} S(x) = \max_{x, y, z \in D} F(x, y, z).$$

As a twice continuously differentiable function is convex if and only if its Hessian is positive semidefinite on its domain, and the Hessian of $S$ at $z$ is a multiple of $F(\cdot, \cdot, z)$, the first requirement is much weaker than requiring convexity on the whole of $\mathbb{R}^{n}$. Thus Theorem 3.3 allows us to establish inequality (6) for a nontrivial class of third-order score functions, which is the key result necessary to establish the equivalence of the maximization of the score function and the maximization of the multilinear form. This is also the key requirement to show monotonic ascent for our block coordinate optimization scheme. In Proposition 3.4 we show how to modify an existing third-order score function so that it satisfies condition 1) for $D = M$. Again this modification turns out to be constant on $M$ and thus leads to an equivalent optimization problem. This altogether leads to two new algorithms for the hypergraph matching problem (1) similar to our previous work [36]. The key idea of both algorithms is to use block coordinate ascent updates, where all but one argument or all but two arguments of the multilinear form are fixed and one optimizes over the remaining ones in either case. The inequality (6) then allows us to connect the solutions of the two optimization problems. In both variants, we directly optimize over the discrete set of possible matchings, that is, there is no relaxation involved. Moreover, we theoretically prove monotonic ascent for both methods. In all our experiments, the proposed third-order methods achieve matching score and matching accuracy competitive with the fourth-order ones [36] while outperforming other state-of-the-art approaches. On the other hand, we achieve a speed-up of factor two over the fourth-order algorithms due to the ability to work directly with the original tensor.

Compared to [36] we avoid the lifting step while maintaining the same theoretical guarantees. Moreover, we use the weaker conditions on the score function in Theorem 3.3 to propose two variants of our algorithms based on homotopy methods as presented in Section 3.3.

In general, the multilinear form F might not fulfill the conditions of Theorem 3.3. Therefore, we propose to modify F by adding to it some multilinear form G so that the first statement of Theorem 3.3 is satisfied for the new function. At the same time the modification does not affect the solution of our problem since the added term is constant on the set of assignment matrices. This will be detailed later on in Proposition 3.4. While finding the best possible modification is difficult, one can consider some canonical choices. As a first choice, one can implicitly modify F by considering a new score function of the form $\tilde{S}(x) = S(x) + \alpha \|x\|^2$ and then show that for some sufficiently large $\alpha > 0$ the Hessian of $\tilde{S}$ is positive semidefinite everywhere in $M$. However, the modified score function is then in-homogeneous, and thus there does not exist any multilinear form $\tilde{F}$ such that $\tilde{S}(x) = \tilde{F}(x, x, x)$.

A homogeneous modification would be $\tilde{S}(x) = S(x) + \alpha \sum_{i=1}^{n} x_i^3$. Indeed, one can associate to $\sum_{i=1}^{n} x_i^3$ the multilinear form $\sum_{i=1}^{n} \langle e_i, x \rangle \langle e_i, y \rangle \langle e_i, z \rangle$, where $e_i$ is the $i$-th standard basis vector of $\mathbb{R}^{n}$. Moreover, it holds that $\sum_{i=1}^{n} x_i^3 = n_1$ for every $x \in M$. In this paper, we consider a family of modifications which includes the one above by perturbing the set of basis vectors as $\{ e_i + \beta \mathbf{1} \}_{i=1}^{n}$, with $\mathbf{1}$ being the all-ones vector. Note that the modification above is included in this family as the special case $\beta = 0$. The main idea is that one can later on optimize $\beta$ to find the best lower bound on the required amount of modification. That is to say, we aim at a minimal amount of modification such that the conditions from Theorem 3.3 are satisfied. As shown in the proof of Proposition 3.4, a specific value of $\beta$ yields the best obtainable bound, and we use this value in the rest of the paper. In particular, we define the symmetric third-order tensor $\mathcal{G}$ as

$$\mathcal{G} = \sum_{i=1}^{n} (e_i + \beta \mathbf{1}) \otimes (e_i + \beta \mathbf{1}) \otimes (e_i + \beta \mathbf{1}), \tag{8}$$

and the associated multilinear form as

$$G(x, y, z) = \sum_{i=1}^{n} \langle e_i + \beta \mathbf{1}, x \rangle \langle e_i + \beta \mathbf{1}, y \rangle \langle e_i + \beta \mathbf{1}, z \rangle.$$

Proposition 3.4

Let $F$ be a symmetric third-order multilinear form and $G$ as defined in Eq. (8). We consider the new multilinear form $F_\alpha$ defined by

$$F_\alpha(x, y, z) = F(x, y, z) + \alpha\, G(x, y, z).$$

For all $\alpha$ sufficiently large (the proof provides an explicit bound) it holds:

  1. $F_\alpha$ is a symmetric third-order multilinear form.

  2. The matrix $F_\alpha(\cdot, \cdot, z)$ is positive semidefinite for all $z \in M$.

  3. The new problem $\max_{x \in M} S_\alpha(x)$ is equivalent to the original one $\max_{x \in M} S(x)$, since the added term is constant on $M$.

Discussion. In [34] a general convexification strategy for arbitrary score functions has been proposed where, similar to our modification, the added term is constant on the set of assignment matrices $M$. However, as the added term is in-homogeneous, it cannot be extended to a symmetric multilinear form and thus does not fit our framework. In second-order graph matching, several methods also use convexified score functions in various ways [46, 49, 50]. However, for none of these methods is it obvious how to extend it to a third-order approach for hypergraph matching.

Once $F_\alpha$ is symmetric and $F_\alpha(\cdot, \cdot, z)$ is positive semidefinite at every $z$ in $M$, it follows from Theorem 3.3 that

$$\max_{x \in M} S_\alpha(x) = \max_{x, y, z \in M} F_\alpha(x, y, z).$$

Therefore, block coordinate ascent schemes that optimize over the assignment constraints can be derived in a similar fashion to [36]. In particular, we propose below two variants, one for solving

$$\max_{x, y, z \in M} F_\alpha(x, y, z),$$

which leads to Algorithm 1, and the other for solving

$$\max_{x, y \in M} F_\alpha(x, x, y),$$

which leads to Algorithm 2. Both variants come along with theoretical guarantees including strict monotonic ascent and finite convergence.

3.1 Tensor Block Coordinate Ascent via a Sequence of Linear Assignment Problems

The first algorithm uses a block coordinate ascent scheme to optimize the multilinear form, in which two arguments are fixed at each iteration and the function is maximized over the remaining one. This boils down to solving a sequence of LAPs, which can be solved to global optimality by the Hungarian method [23, 5]. As we are optimizing the multilinear form, we need to guarantee that our algorithm finally produces a homogeneous solution, i.e. one with $x = y = z$. Moreover, we require that our algorithms achieve ascent not only in the multilinear form but also in the original score function. These are the two main reasons why we use inequality (6) at step 7) of Algorithm 1 to get ascent in the score function. The following theorem summarizes the properties of Algorithm 1.
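A single step of this scheme — two arguments fixed, the third maximized over $M$ — reduces to a LAP on an $n_1 \times n_2$ profit matrix. A minimal sketch using SciPy's Hungarian solver (the function name `lap_update` is ours, not the authors'):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def lap_update(F, y, z, n1, n2):
    """Maximize F(x, y, z) over assignment matrices x, with y and z fixed.

    The objective is linear in x with coefficient vector d = F(., y, z),
    so the update is a linear assignment problem.
    """
    d = np.einsum('ijk,j,k->i', F, y, z)   # the vector F(., y, z)
    profit = d.reshape(n1, n2)             # profit of matching i -> a
    rows, cols = linear_sum_assignment(profit, maximize=True)
    x = np.zeros(n1 * n2)
    x[rows * n2 + cols] = 1.0
    return x
```

By optimality of the Hungarian solution, this update can never decrease $F(x, y, z)$, which is the monotonic ascent exploited in Theorem 3.5.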

Theorem 3.5

Let $\mathcal{F}$ be a symmetric third-order tensor and $\alpha \ge 0$. Then the following holds for Algorithm 1:

  1. The sequence $F_\alpha(x^k, y^k, z^k)$ for $k = 0, 1, 2, \dots$ is strictly monotonically increasing or terminates.

  2. The sequence of scores $S_\alpha(x^k)$ for $k = 0, 1, 2, \dots$ is strictly monotonically increasing or terminates. For every $k$, $x^k$ is a valid assignment matrix.

  3. The algorithm terminates after a finite number of iterations.

1:  Input: symmetric third-order tensor $\mathcal{F}$, $\alpha \ge 0$, $(x^0, y^0, z^0) \in M^3$, $k = 0$
2:  Output: $x^* \in M$
3:  loop
4:     $\tilde{x} = \arg\max_{u \in M} F_\alpha(u, y^k, z^k)$ {LAP}
5:     $\tilde{y} = \arg\max_{u \in M} F_\alpha(\tilde{x}, u, z^k)$ {LAP}
6:     $\tilde{z} = \arg\max_{u \in M} F_\alpha(\tilde{x}, \tilde{y}, u)$ {LAP}
7:     if  $F_\alpha(\tilde{x}, \tilde{y}, \tilde{z}) = F_\alpha(x^k, y^k, z^k)$ then
8:        $\tilde{u} = \arg\max_{u \in \{\tilde{x}, \tilde{y}, \tilde{z}\}} S_\alpha(u)$
9:        if  $S_\alpha(\tilde{u}) > F_\alpha(x^k, y^k, z^k)$ then
10:          $x^{k+1} = \tilde{u}$, $y^{k+1} = \tilde{u}$
11:          $z^{k+1} = \tilde{u}$
12:        else
13:           return $x^* = \tilde{u}$
14:        end if
15:     else
16:        $x^{k+1} = \tilde{x}$, $y^{k+1} = \tilde{y}$, $z^{k+1} = \tilde{z}$
17:     end if
18:     $k = k + 1$
19:  end loop
Algorithm 1 BCAGM3

We would like to note that all statements of Theorem 3.5 remain valid for $\alpha = 0$ if one of the statements 1)-3) in Theorem 3.3 holds for $F$. In practice, this condition might already be satisfied for some constructed affinity tensors. Thus we adopt the strategy of [36] by first running Algorithm 1 with $\alpha = 0$ until we get no further ascent, and only then setting $\alpha$ to the value prescribed by Proposition 3.4. It turns out that in our experiments the first phase with $\alpha = 0$ often leads automatically to a homogeneous solution. In this case, the algorithm can be stopped, as the following lemma shows.

Lemma 3.6

Suppose Algorithm 1 runs with parameter $\alpha \ge 0$. If $x^k = y^k = z^k$ holds at some iteration $k$, then for all $\alpha' \ge \alpha$, the iterate $x^k$ remains optimal for the local updates of Algorithm 1, i.e. no further ascent is possible.

Lemma 3.6 shows that we can stop Algorithm 1 whenever the first phase already leads to a homogeneous solution, as no further ascent is possible. In particular, there is no need to start a new phase with a larger value of $\alpha$, since the current iterate is already optimal for all local updates of Algorithm 1, i.e. steps (4)-(6), for all $\alpha$. Also, the fact that the iterate is already homogeneous implies that no further ascent can be achieved at step (7).
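Putting the pieces together, the $\alpha = 0$ phase of Algorithm 1 can be sketched as the following simplified loop. This is illustrative only, not the authors' implementation: we use a trivial feasible starting point, a fixed iteration cap, and we stop at a homogeneous iterate as suggested by Lemma 3.6 (the score-function jump of step 7 is omitted for brevity):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def solve_lap(d, n1, n2):
    """Maximize <d, x> over assignment vectors x (a linear assignment problem)."""
    rows, cols = linear_sum_assignment(d.reshape(n1, n2), maximize=True)
    x = np.zeros(n1 * n2)
    x[rows * n2 + cols] = 1.0
    return x

def bcagm3(F, n1, n2, iters=50):
    """Simplified alpha = 0 phase of the block coordinate ascent scheme:
    cycle through the three arguments of the multilinear form; each update
    is a LAP solved by the Hungarian method."""
    x = y = z = solve_lap(np.ones(n1 * n2), n1, n2)  # arbitrary feasible start
    for _ in range(iters):
        x = solve_lap(np.einsum('ijk,j,k->i', F, y, z), n1, n2)
        y = solve_lap(np.einsum('ijk,i,k->j', F, x, z), n1, n2)
        z = solve_lap(np.einsum('ijk,i,j->k', F, x, y), n1, n2)
        if np.array_equal(x, y) and np.array_equal(y, z):
            break  # homogeneous iterate: stop, cf. Lemma 3.6
    return x
```

Every update maximizes the multilinear form in one block, so the sequence of values is non-decreasing, and the returned vector is always a valid assignment.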

3.2 Tensor Block Coordinate Ascent via Alternating between Quadratic and Linear Assignment Problems

The second algorithm uses a block coordinate ascent scheme to optimize the multilinear form where now one resp. two arguments are fixed and we optimize over the remaining ones in an alternating manner. The resulting scheme alternates between QAPs and LAPs. While the LAP can be efficiently solved using the Hungarian algorithm as before, the QAP is NP-hard, so a globally optimal solution is out of reach. However, our algorithm does not require the globally optimal solution at each step in order to maintain its theoretical properties. It is sufficient that the sub-routine method yields monotonic ascent w.r.t. the current variable, that is,

$$\langle \hat{x}, A\, \hat{x} \rangle \ge \langle x^k, A\, x^k \rangle,$$

where $x^k$ is the current iterate, $\hat{x}$ is the output of the sub-routine, and the nonnegative symmetric matrix $A$ is $F_\alpha(\cdot, \cdot, z^k)$ in our algorithm. As in Algorithm 1, we go back to the optimization of the score function in step 6) using inequality (6). The following theorem summarizes the properties of Algorithm 2.
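Since the theory only requires monotonic ascent from the sub-routine, any heuristic QAP solver can be made admissible by falling back to the current iterate whenever the heuristic fails to improve the quadratic objective. A minimal wrapper (our own sketch, not part of the paper):

```python
import numpy as np

def ascent_qap_step(qap_solver, A, x_curr):
    """Wrap a heuristic QAP solver so a step never decreases <x, A x>.

    qap_solver(A, x_curr) may return any feasible assignment vector; if its
    objective is worse than the current iterate's, we keep the current
    iterate, which is all the ascent condition above requires.
    """
    x_new = qap_solver(A, x_curr)
    if x_new @ A @ x_new >= x_curr @ A @ x_curr:
        return x_new
    return x_curr
```

This is why heuristics such as max pooling matching or IPFP can be plugged in without endangering the convergence guarantees.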

Theorem 3.7

Let $\mathcal{F}$ be a symmetric third-order tensor and $\alpha \ge 0$. Let the sub-routine be an algorithm for the QAP which yields monotonic ascent w.r.t. the current iterate. The following holds for Algorithm 2:

  1. The sequence $F_\alpha(x^k, x^k, z^k)$ for $k = 0, 1, 2, \dots$ is strictly monotonically increasing or terminates.

  2. The sequence of scores $S_\alpha(x^k)$ for $k = 0, 1, 2, \dots$ is strictly monotonically increasing or terminates. For every $k$, $x^k$ is a valid assignment matrix.

  3. The algorithm terminates after a finite number of iterations.

1:  Input: symmetric third-order tensor $\mathcal{F}$, $\alpha \ge 0$, $(x^0, z^0) \in M^2$, $k = 0$; a sub-routine method that provides monotonic ascent for the QAP, i.e. $\langle \hat{x}, A\, \hat{x} \rangle \ge \langle x^k, A\, x^k \rangle$
2:  Output: $x^* \in M$
3:  loop
4:     $\tilde{x} = $ sub-routine for $\max_{u \in M} F_\alpha(u, u, z^k)$, initialized with $x^k$ {QAP}
5:     $\tilde{z} = \arg\max_{u \in M} F_\alpha(\tilde{x}, \tilde{x}, u)$ {LAP}
6:     if  $F_\alpha(\tilde{x}, \tilde{x}, \tilde{z}) = F_\alpha(x^k, x^k, z^k)$ then
7:        $\tilde{u} = \arg\max_{u \in \{\tilde{x}, \tilde{z}\}} S_\alpha(u)$
8:        if  $S_\alpha(\tilde{u}) > F_\alpha(x^k, x^k, z^k)$ then
9:           $x^{k+1} = \tilde{u}$
10:          $z^{k+1} = \tilde{u}$
11:        else
12:           return $x^* = \tilde{u}$
13:        end if
14:     else
15:        $x^{k+1} = \tilde{x}$, $z^{k+1} = \tilde{z}$
16:     end if
17:     $k = k + 1$
18:  end loop
Algorithm 2 BCAGM3+

In analogy to Theorem 3.5, all statements of Theorem 3.7 remain valid for $\alpha = 0$ if one of the statements 1)-3) in Theorem 3.3 holds for $F$. Thus we use the same initialization strategy with $\alpha = 0$ as described above. There are several methods available which we could use for the sub-routine in the algorithm. We decided to use the recent max pooling algorithm [12] and the IPFP algorithm [29], and then use the Hungarian algorithm to turn their output into a valid assignment matrix in $M$ at each step. It turns out that the combination of our tensor block coordinate ascent scheme with these algorithms as sub-routines yields very good performance on all datasets. In the case of Algorithm 2, a statement as in Lemma 3.6 is not possible, as the sub-routine usually delivers only a local solution to the subproblem, and Lemma 3.6 relies on the fact that the subproblem can be solved globally optimally. However, we observe in our experiments that Algorithm 2 almost never achieves further ascent once its iterate is homogeneous, thus we recommend stopping Algorithm 2 in this case.

3.3 A Homotopy Tensor Block Coordinate Ascent Scheme

Both algorithms consist of two phases: the first phase uses $\alpha = 0$ and the second phase uses the value of $\alpha$ prescribed by Proposition 3.4. However, if the value of $\alpha$ in the second phase is too large, often no further improvement is achieved. This phenomenon can be explained by the fact that a large modification term tends to homogenize the variables more quickly, i.e. to drive the iterates towards $x = y = z$, which makes the algorithm get stuck faster at a critical point.

To fix this, one first observes that inequality (6) lies at the heart of both algorithms. On the one hand, it serves as a sufficient condition for the equivalence of maximizing the score function and the multilinear form. On the other hand, it might help the algorithms to jump to better solutions in case they reach a stationary state. Both methods guarantee this inequality to be satisfied for all feasible tuples at the same time by using a rather large value of $\alpha$. However, this might be too conservative, as the algorithm itself typically converges rather quickly and thus visits only a small number of feasible tuples. Thus, we propose below to satisfy inequality (6) progressively as the algorithm proceeds. This is done by updating $\alpha$ accordingly during the algorithm, which yields a homotopy method.

Proposition 3.8

Let $F$ be a symmetric third-order multilinear form and $G$ as defined in Eq. (8). For each tuple $(x, y, z) \in M^3$, let $\alpha_0(x, y, z)$ be defined as

$$\alpha_0(x, y, z) = \frac{F(x, y, z) - \max\{ S(x), S(y), S(z) \}}{G(x, x, x) - G(x, y, z)}.$$

Then the following holds:

  1. $F_\alpha = F + \alpha G$ is a symmetric multilinear form.

  2. For all non-homogeneous tuples $(x, y, z) \in M^3$, the following inequality holds

$$F_\alpha(x, y, z) \le \max\{ S_\alpha(x), S_\alpha(y), S_\alpha(z) \} \tag{10}$$

    if and only if $\alpha \ge \alpha_0(x, y, z)$.

  3. For every homogeneous tuple, i.e. $x = y = z$, inequality (10) holds for every $\alpha \ge 0$.

Proposition 3.8 suggests that if $\alpha$ is chosen such that

$$\alpha \ge \max\big\{ \alpha_0(x, y, z) \;\big|\; (x, y, z) \in M^3 \ \text{non-homogeneous} \big\}, \tag{11}$$

then inequality (6) is satisfied for all $(x, y, z) \in M^3$, for which it follows from Theorem 3.3 that

$$\max_{x \in M} S_\alpha(x) = \max_{x, y, z \in M} F_\alpha(x, y, z).$$

Thus, Algorithm 1 or 2 can be applied again to solve the problem. However, the computation of the optimal bound given in (11) is not feasible, as the number of feasible tuples grows exponentially with the size of the problem. Therefore, we adaptively update $\alpha$ as follows. First, we use block coordinate ascent steps as in the previous methods with $\alpha = 0$. When such a step achieves no further ascent and inequality (6) is violated, we slightly increase $\alpha$ so that the inequality becomes satisfied for the current iterate, and the algorithm is continued. The whole scheme is shown in Algorithm 3. Note that a small $\epsilon$ is used in step 13) to prove convergence, which is analyzed below.
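For a single tuple, the minimal admissible $\alpha$ in the spirit of Proposition 3.8 is cheap to compute. The sketch below is illustrative, not the paper's exact bound: it assumes the simplest member of the modification family ($\beta = 0$, i.e. $G(x, y, z) = \sum_i x_i y_i z_i$, which on binary assignment vectors is constant for homogeneous tuples), and the function name is ours:

```python
import numpy as np

def S(F, u):
    """Score S(u) = F(u, u, u)."""
    return np.einsum('ijk,i,j,k->', F, u, u, u)

def min_alpha(F, x, y, z, eps=1e-8):
    """Smallest alpha making inequality (6) hold for the tuple (x, y, z),
    under the simple modification G(x, y, z) = sum_i x_i * y_i * z_i
    (beta = 0 member of the family; for binary assignment vectors
    G(u, u, u) = sum_i u_i is the same constant for every u in M)."""
    Fxyz = np.einsum('ijk,i,j,k->', F, x, y, z)
    best = max(S(F, u) for u in (x, y, z))
    c = np.sum(x)            # G(u, u, u) on assignment vectors
    g = np.sum(x * y * z)    # G(x, y, z)
    if g >= c:               # homogeneous tuple: inequality already holds
        return 0.0
    return max(0.0, (Fxyz - best) / (c - g)) + eps
```

Adding a small `eps` on top of the per-tuple bound mirrors the role of $\epsilon$ in step 13) of Algorithm 3: it makes $\alpha$ strictly increase at each update, which is what the finite-convergence argument needs.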

1:  Input: symmetric third-order tensor $\mathcal{F}$, $(x^0, y^0, z^0) \in M^3$, $\alpha = 0$, $\epsilon > 0$, $k = 0$
2:  Output: $x^* \in M$
3:  loop
4:     $\tilde{x} = \arg\max_{u \in M} F_\alpha(u, y^k, z^k)$ {LAP}
5:     $\tilde{y} = \arg\max_{u \in M} F_\alpha(\tilde{x}, u, z^k)$ {LAP}
6:     $\tilde{z} = \arg\max_{u \in M} F_\alpha(\tilde{x}, \tilde{y}, u)$ {LAP}
7:     if  $F_\alpha(\tilde{x}, \tilde{y}, \tilde{z}) = F_\alpha(x^k, y^k, z^k)$ then
8:        $\tilde{u} = \arg\max_{u \in \{\tilde{x}, \tilde{y}, \tilde{z}\}} S_\alpha(u)$
9:        if  $S_\alpha(\tilde{u}) > F_\alpha(x^k, y^k, z^k)$ then
10:          $x^{k+1} = \tilde{u}$, $y^{k+1} = \tilde{u}$
11:          $z^{k+1} = \tilde{u}$
12:        else if  $\alpha$ does not satisfy Eq. (11) for $(\tilde{x}, \tilde{y}, \tilde{z})$ then
13:           $\alpha = \alpha_0(\tilde{x}, \tilde{y}, \tilde{z}) + \epsilon$
14:           $x^{k+1} = \tilde{x}$, $y^{k+1} = \tilde{y}$, $z^{k+1} = \tilde{z}$
15:        else
16:           return $x^* = \tilde{u}$
17:        end if
18:     else
19:        $x^{k+1} = \tilde{x}$, $y^{k+1} = \tilde{y}$, $z^{k+1} = \tilde{z}$
20:     end if
21:     $k = k + 1$
22:  end loop
Algorithm 3 Adapt-BCAGM3

Note that $\alpha$ is always strictly increasing whenever Algorithm 3 enters step 12). This can be seen as follows. If the current value of $\alpha$ were greater than or equal to $\alpha_0(\tilde{x}, \tilde{y}, \tilde{z})$, then inequality (10) would hold for the current tuple $(\tilde{x}, \tilde{y}, \tilde{z})$, which implies the algorithm should not have entered step 12), leading to a contradiction.

For any fixed value of $\alpha$, Algorithm 3 works exactly in the same fashion as Algorithm 1; thus, it also yields strict monotonic ascent. The only difference is that the first algorithm updates $\alpha$ only once, with a potentially large value, whereas the new one splits this update into multiple phases.

In case $\alpha$ exceeds the bound given in (11) at some iteration, the "strict" inequality will hold for all non-homogeneous tuples, in which case $\alpha$ will no longer be updated. This, combined with the fact that the algorithm yields strict monotonic ascent for fixed $\alpha$ and the objective is bounded from above, guarantees that the whole scheme converges in a finite number of iterations.

It should be emphasized that for any value of $\alpha$, the optimization of $S_\alpha$ and $S$ on $M$ is always equivalent, but the equivalence of the optimization of the multilinear form and that of the score function is only guaranteed when $\alpha$ satisfies Eq. (11).

Finally, we adopt the same homotopy approach as described above for Algorithm 2 to obtain Algorithm 4. All the properties of Algorithm 3 transfer to Algorithm 4. In particular, the algorithm achieves strict monotonic ascent within each phase and has finite termination.

1:  Input:
2:  Output:
3:  loop
6:     if  then
8:        if