MINA: Convex Mixed-Integer Programming for Non-Rigid Shape Alignment

02/28/2020 ∙ by Florian Bernard, et al. ∙ 0

We present a convex mixed-integer programming formulation for non-rigid shape matching. To this end, we propose a novel shape deformation model based on an efficient low-dimensional discrete model, so that finding a globally optimal solution is tractable in (most) practical cases. Our approach combines several favourable properties: it is independent of the initialisation, it is much more efficient to solve to global optimality compared to analogous quadratic assignment problem formulations, and it is highly flexible in terms of the variants of matching problems it can handle. Experimentally we demonstrate that our approach outperforms existing methods for sparse shape matching, that it can be used for initialising dense shape matching methods, and we showcase its flexibility on several examples.

READ FULL TEXT VIEW PDF
POST COMMENT

Comments

There are no comments yet.

Authors

page 8

page 14

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

Finding correspondences in geometric data is a long-standing problem in vision, graphics, and beyond. The applications range from the creation of statistical shape models, 3D reconstruction, object tracking, or recognition, to more recent settings such as the alignment of geometric data to enable the training of deep learning models. In this work we consider the problem of finding correspondences between two given shapes, known as

shape matching.

We assume that one shape is a geometrically transformed version of the other shape. With that, matching shape to shape can be phrased as finding a transformation (which belongs to a particular class of transformations) such that the transformed shape best aligns with . Formally, this can be written as the optimisation problem

(1)

where is a suitable metric that quantifies the discrepancy between both shapes. The particular shape matching setting depends on the choice of the metric and the class of transformations . For example, rigid shape matching refers to , where is the special Euclidean group in dimension . In this work we study the non-rigid shape matching problem, where comprises non-rigid deformations (to be defined in Sec. 3).

Although many previous works have addressed non-rigid shape matching, there are several open challenges: (i) due to the non-convex nature of Problem (1) for virtually all relevant choices of and , existing methods cannot guarantee to find global optima. Hence, these methods heavily depend on the initial choice of . (ii) Oftentimes, non-rigid shape matching methods require that both shapes have the same representation (e.g. meshes). (iii) Existing approaches have a limited flexibility in terms of the matching formulation that can be handled, e.g. they can only handle bijective matchings, or they cannot guarantee injectivity, they do not allow for additional constraints (e.g. bounding the maximum distortion of a matching), or they cannot deal with shapes that have different topologies. (iv) Moreover, existing formulations that purely aim for preserving pairwise distances when finding a matching (see quadratic assignment problem in Sec. 2) are not guaranteed to maintain the orientation of the surface.

Our contribution.

Our main idea is to formulate non-rigid shape matching in terms of a convex mixed-integer programming (MIP) problem, while addressing (i)-(iv). We summarise our main contributions as follows:

  • We propose a low-dimensional discrete model for non-rigid shape matching that is highly flexible as it allows to tackle a wide range of matching formulations.

  • Although solving MIP problems to global optimality has worst-case time complexity that is exponential in the number of integer variables, our proposed formulation only requires a small number of integer variables that is independent of the shape resolution.

  • Our formulation does not require an initialisation and it is oftentimes possible to (certifiably) find a globally optimal solution in practice.

2 Related Work & Background

Due to the vast amount of literature related to shape matching and correspondence problems, it is beyond the scope of this paper to provide an exhaustive background of related work. A broad overview of the topic is for example presented in [47]. In the following we summarise works that we consider most relevant.

Rigid shape matching.

Finding a rotation and translation that aligns two shapes is known as rigid shape matching. The Procrustes problem [42] considers the setting when the correspondences between points on both shapes are known, which admits an efficient closed-form solution. However, rigid shape matching becomes significantly harder if the correspondences are unknown. Most commonly, this is addressed via local optimisation. A popular approach is the Iterative Closest Point (ICP) algorithm [7], which also comes in various variants, such as a probabilistic formulation [30]. These methods have in common that they do not guarantee to find a globally optimal solution and therefore their outcome is highly dependent on a good initialisation. Contrary to these local methods, for the rigid shape matching problem there are also global approaches, e.g. based on a semidefinite programming relaxation [26], or on branch and bound algorithms [31, 52].

A downside of these shape matching approaches is that they have the strong assumption that both shapes can be aligned based on a rigid-body transformation. However, in practice this assumption is oftentimes violated, so that non-rigid shape matching approaches are more appropriate, which we will discuss next.

Functional maps.

A popular paradigm for isometric shape matching are functional maps (FM) [32, 18, 34], which define a framework for transferring a function from a source to a target shape. Although FM were shown to be a powerful tool for isometric shape matching, they also have some shortcomings: they are sensitive to noise and suffer from symmetries, point-to-point maps obtained from FM are neither guaranteed to be smooth nor injective, and they are not suitable for severe non-isometries. Rigid shape matching methods applied to spectral embeddings (obtained via FM) can also be used for isometric matching, such as done in PM-SDP [26]. However, in this case the mentioned shortcomings also apply.

Quadratic assignment problem.

Another popular approach for non-rigid shape matching are formulations based on the quadratic assignment problem (QAP) (or graph matching) [25], which aim for a non-rigid deformation with small distortion. In a discrete setting this can be phrased as matching vertices between two shapes in such a way that pairwise geodesic distances (or similar quantities) are (approximately) preserved by the vertex-to-vertex correspondences. The QAP is known to be NP-hard [35]

, so that most solution approaches are based on heuristic approaches without formal guarantees, such as e.g. 

[22]. There are also more principled methods based on convex relaxations, including lifting-free [53, 13, 6] and lifting-based relaxations [40, 46, 20]. However, they do not guarantee to find a globally optimal solution of the original non-convex problem as they rely on some kind of rounding procedure to obtain a binary solution. Globally optimal QAP solvers are based on combinatorial search, e.g. via branch and bound [4], and these methods scale exponentially in the number of variables. Similarly as in the QAP, our method also takes the spatial context of matchings into account, but we demonstrate that in practice our proposed formulation is significantly faster to solve, cf. Fig. 1. We believe that this is because our formulation has a special structure (based on our sparse deformation model) that can more efficiently be leveraged by combinatorial solvers.

Figure 1: Runtime comparison between our formulation, a QAP and a reduced QAP (cf. search space reduction in the Supp. Mat.) when solved with MOSEK [2].
Global non-rigid matching.

It was shown that certain matching problems can be solved globally optimal by finding shortest paths in a graph, or based on dynamic programming. These include matching 2D shapes (contours) to a 2D image [11, 14, 41], or matching a 2D contour to a 3D shape [21]. As for example pointed out in [5], non-rigidly matching two objects in 3D is a significantly more difficult problem as it does not allow such a formulation. In [51]

the elastic matching of two 3D meshes is addressed based on a linear programming formulation. However, the formulation is sensitive to the mesh triangulation, requires a large number of binary variables, and due to the non-tightness of relaxation it relies on sophisticated rounding techniques after which global optimality cannot be guaranteed anymore. In 

[10] the authors propose a convex formulation for nonrigid registration that is solved via message passing. This approach requires an extrinsic term in order to disambiguate intrinsic symmetries, which in practice means that an initial alignment between both shapes is indispensable, thereby mitigating the advantages of a convex formulation.

Local non-rigid matching.

In a similar spirit, local refinement techniques also rely on a good matching initialisation. Such methods include [49, 48] and [27], where a given initial matching is gradually refined. While  [49] relies on a QAP formulation, in [27] a spectral method based on FM is used for a hierarchical upsampling. In Sec. 4.3 we show that our method can be used as initialisation for such methods.

In [44] the authors propose a non-rigid deformation model based on per-triangle affine transformations. Within this framework they also pose a correspondence problem, which, however, requires a good initial alignment between both shapes in order to make the optimisation problem well-posed. Moreover, since the problem is non-convex, in general one only finds local optima. In our work we leverage a similar deformation model, but (i) we phrase the problem in a well-posed way without requiring an initial shape alignment, and (ii) we perform a global optimisation.

Learning-based matching.

Shape matching has also been tackled with machine learning techniques, e.g. with random forests 

[38], supervised deep functional maps [23], deep functional maps trained in self- or unsupervised settings [17, 39], or using PointNet [33] for learning point cloud correspondences [16]

. Undeniably, machine learning has the potential to address many open challenges in shape matching, e.g. for learning appropriate shape representations. In the past it was demonstrated that combinatorial shape matching benefits from learned deep features 

[5]

, and reversely, that embedding combinatorial optimisation solvers into neural networks (“differentiable programming”) opens up new possibilities for tackling a range of interesting matching problems 

[28]. We believe that in the future our method may also be amenable to utilise such synergies, and therefore consider it to be orthogonal to learning-based methods.

Convex mixed-integer programming.

Mixed-integer programming refers to optimisation problems that involve both continuous and discrete variables. Their advantage is that they are extremely flexible and allow to model a wide range of complex problems. For example, they can be used to discretise difficult non-convex problems, such as formulations that impose rotation matrix constraints, or for phrasing matching problems with binary variables. However, the downside is that MIP problems have a search space that has exponential size in the number of discrete variables, so that in general it is very hard to solve large problems to global optimality. Convex mixed-integer programming refers to a subclass of MIP problems that are convex for fixed integer variables. A major advantage is that for this class of problems there exist efficient branch and bound solvers that globally optimise such problems. Albeit the fact that these solvers have a worst-case runtime that is exponential in the number of integer variables, in this work we demonstrate that solving non-rigid shape matching using a convex MIP reformulation is tractable in (most) practical scenarios.

3 Non-Rigid Shape Matching

First, we summarise our notation. For an integer we define . For a matrix and the index set we use to denote the rows of selected by . and denote the

-dimensional vector of all ones and the

-dimensional identity matrix,

denotes the Frobenius norm, and matrix and vector inequalities are understood element-wise.

Let and be triangular surface meshes that are discretisations of Riemannian -manifolds embedded in 3D space. Note that later in Sec. 4.4 we will also address the case when is a point cloud. Our aim is to find a non-rigid deformation that transforms shape to , so that it aligns well with shape , cf. Problem (1). For notational convenience we use and to refer to the matrices containing the and 3D vertex positions of shapes and , respectively. Moreover, let be a matrix that encodes the triangular faces of , where is the number of triangles.

3.1 Non-Rigid Deformation Model

We model the non-rigid deformation of by applying an affine transformation to each triangle. In conjunction with suitable mesh consistency constraints, the individual per-triangle affine deformations globally constitute a non-rigid deformation. Although related deformation models have been introduced before [43, 44, 5, 45], they have not been used for a global optimisation of non-rigid shape matching.

Affine per-triangle transformations.

For the -th vertex we define the non-rigid deformation in terms of its adjacent triangle as

(2)

where is the centroid of the -th triangle in the undeformed shape ,

is a linear transformation, and

is a translation. As such, we first centre a given vertex, apply a linear transformation, undo the centring, and eventually translate it to its global position.

Mesh consistency constraints.

In order to ensure a consistent mesh deformation, we impose the constraints

(3)

where is the set of all triangles in that are adjacent to vertex , cf. Fig. 2. The purpose of the constraints (3) is to enforce that a given vertex is transformed to the same place, no matter which transformation of its adjacent triangles is applied, and thereby ensuring that the triangle topology is preserved by the deformation.

Figure 2: Illustration of the deformation , where each triangle undergoes an affine transformation. The consistency constraint imposes that the transformed is the same, no matter whether it is transformed by or for and .

3.2 Mixed-Integer Non-Rigid Shape Alignment

Low-dimensional correspondence model.

Our non-rigid deformation is indirectly defined by a low-dimensional discrete model. To this end, subsets of the shape vertices are used as control points, which are represented by the matrices and . Here, and denote the total number of control points for each shape, and and denote the index sets that select the control points from the original shapes. A similar approach has been pursued in [45] for the interactive manipulation of shapes, where, however it is assumed that for each control point of the corresponding control point of is already known. In contrast, we are interested in the much more difficult shape matching problem, where the correspondence between control points is unknown, and, moreover, there may not even exist an exact counterpart in for each point in .

Convex polyhedral surface approximation.

We propose to address the issue that there may not exist exact counterparts between control points as follows: rather than matching control points of directly to control points of , we match the points of to convex polyhedra that locally approximate the surface of , see Fig. 3. To this end, we associate a convex polyhedron with each control point of , which we represent using the matrix for . Here, each row of contains one of the vertices (corner points) of the -th polyhedron on . As such, any point that lies inside the convex polyhedron can be specified as a convex combination of rows of , i.e. , where the -dimensional vector satisfies the convex combination constraints and . We note that this point-to-polyhedron matching is a strict generalisation of point-to-point matching, since the latter is achieved for . Using this formulation allows to find a matching between and even when there exists only an approximate counterpart between the control points on both shapes. For details how we obtain the polyhedra see the Supp. Mat.

Figure 3: Illustration of control points of (left) that are matched to convex polyhedra of (right). Colours indicate correspondences between control points and convex polyhedra.
Correspondence term.

We tackle the non-rigid shape matching problem by establishing correspondences between the control points and the convex polyhedra of , while at the same time ensuring that the resulting non-rigid deformation is “regular”. For now, let us assume that and that each control point of is matched to one of the convex polyhedra of . Moreover, we also allow that more than one control point of can be matched to the same convex polyhedron of . We model these matching constraints using the matrix , where we define

(4)

An element means that the -th control point of is matched to the -th convex polyhedron of . Later, in Sec. 4.4, we will also present more general formulations that allow to also match shapes when some control points on do not have a counterpart on .

Since there are control points of , where each of them is matched to one of the convex polyhedra on , for each we introduce a convex combination weight vector . Here, is the index of the control point of and is the index of the convex polyhedron of . By defining , , as well as the matrix of convex combination weights

(5)

we model our correspondence term as

(6)

In addition we impose , , and for . As such, we can effectively enforce the convex combination and matching constraints using linear equalities. With that, the -dimensional matrix contains points that lie inside the convex polyhedra, where each of its rows correspond to the respective row of the transformed control point matrix .

Moreover, to avoid that multiple control points are assigned to the same vertex of a convex polyhedron, we impose the “soft-injectivity” constraints . The soft-injectivity constraint enforces that the sum of weights in each column of is at most one. As such, if a (single) element in a column is exactly one, only this control point is assigned to the respective vertex of the convex polyhedron. If elements in a column of are strictly smaller than one, all respective control points are assigned to non-extreme points of the polyhedron, thereby preventing that multiple control points are matched to the same vertex of a polyhedron.

We use the notation to refer to the four constraints introduced in this paragraph. In overall, the correspondence term has the purpose to minimise the discrepancy between the control points of the transformed shape and their corresponding convex polyhedra of .

Deformation regularisers.

For regularising the deformation we decompose each linear transformation in (2) into the sum of a rotation matrix and a (small) general linear part , so that in (2) now becomes

(7)

The purpose of using the additive factorisation (with small) is to ensure that the global shape deformation (approximately) preserves the morphology of . This has a similar effect as the as-rigid-as-possible (ARAP) model [43], but requires only a single rotation matrix compared to rotation matrices as used in ARAP. In order to keep the linear part small, we impose the rigidity loss as

(8)

Moreover, for achieving a locally smooth deformation, we introduce the smoothness loss

(9)

where denotes the set of all neighbouring triangle pairs in , and is a scalar weight. For , so that triangles and are neighbours, we define the -th smoothness residual as

(10)

The purpose of the residual is to quantify the difference between transforming the triangle centroid using the transformation of the same triangle, and using the transformation defined for its neighbour triangle .

Optimisation problem.

Based on the introduced terms and constraints our mixed-integer non-rigid alignment (MINA) formulation reads

(11)

We assume that all weights . The mesh consistency and the constraints are affine in the variables and , and all objective function terms are compositions of affine transformations with the Frobenius norm, so that they are convex. However, due to the binary constraints imposed upon , and the non-convex quadratic equality constraints , the overall problem is non-convex.

Convex mixed-integer formulation.

To transform Problem (11) into a convex MIP problem, we use a piece-wise linear approximation of the constraint based on binary variables, see the Supp. Mat. and [12]. To keep the number of binary variables small, we use an efficient Gray encoding for the piece-wise linear approximation, cf. [50], so that the number of binary variables is logarithmic in the number of discretisation bins . The main idea here is to utilise a more efficient representation that requires fewer binary variables and thus admits a more efficient optimisation. In particular, this results in binary variables, in contrast to binary variables for a naive linear encoding. In Fig. 4 we compare our used logarithmic encoding with a linear one, where it can be seen that the logarithmic one requires less computation time, and that the determinant of the resulting matrix is already very close to for .

Figure 4: Runtime, relative speed-up w.r.t. linear, and determinants of for linear and logarithmic encodings of the discretisation.

Figure 5: Correspondences obtained from our method for several shape matching instances from the TOSCA dataset [9]. Correspondences between in the top row and in the bottom row are indicated by dots with corresponding colours.

Figure 6: Each plot summarises the percentage of correct matchings for different TOSCA shape classes. The horizontal axis shows the geodesic error threshold, and the vertical axis shows the percentage of matches that are smaller than or equal to this error.

4 Experiments

In this section we present an experimental evaluation of our proposed MINA approach. To this end, we compare it to other sparse correspondence methods, we analyse the gaps to global optimality, we demonstrate that MINA can be used as initialisation for dense shape matching, and we showcase its flexibility on several exemplary settings. We provide additional implementation details in the Supp. Mat..

4.1 Sparse Shape Matching

In this section we compare our method with other approaches that perform a sparse matching between a pair of shapes. In particular, we consider the convex matching method PM-SDP [26] in a rigid setting, the sparse game-theoretic approach by Rodola et al. [37], the coherent point drift (CPD) algorithm [30] (randomly initialised), and the convex relaxation by Chen & Koltun [10]. As such, we cover a wide range of shape matching paradigms, including convex relaxations for rigid (PM-SDP) and non-rigid (Chen & Koltun) shape matching, a local non-rigid method (CPD), and a sparse method that considers a quadratic assignment problem formulation (Rodola et al.). In this set of experiments we use the sparse points from [19] for matching pairs of shapes from the TOSCA dataset [9]. Hence, we directly match control points on to control points on when using our MINA method (i.e.  for ).

In Fig. 5 we show correspondences obtained from our method for various shape matching pairs. In Fig. 6 we show quantitative results, where we summarise the percentage of correct matches (relative to the number of given control points) for each shape class in the TOSCA dataset. It can be seen that our MINA method generally outperforms the other sparse matching approaches. The lower scores for smaller geodesic thresholds arise due to our sparse modelling, since matchings can only be as accurate as the sparse control points allow for. Since the method by Rodola et al. [37] does not match all of the given points, the respective curves do not reach 100%. Moreover, the performance of PM-SDP indicates that a rigid matching setting is too restrictive. Additional results can be found in the Supp. Mat.

4.2 Global Optimality Analysis

Here, we analyse the gaps to global optimality dependent on the processing time for the TOSCA shape matching instances in Sec. 4.1. To this end, we define

(12)

where is the total number of shape matching pairs ( for TOSCA), denotes the total solver time for the -th shape matching problem, and is the relative gap of the -th problem that is defined as (see [2]). Here, and are the upper and lower bounds of the objective value of the MIP formulation of Problem (11), respectively, and is a small number. In Fig. 7 (left) it can be seen that after h (our time budget for the MIP solver) the value of reaches , i.e. on average the solutions are close to being globally optimal (a value of means that all instances are solved to global optimality). After h, for of the cases we certify global optimality, see Fig. 7 (right).

   

Figure 7: Optimality gaps (left) and proportion of instances solved to global optimality (right) dependent on the solver runtime.

Reference    random    PM-SDP    Rodola et al.    MINA   

Figure 8: Comparison of different sparse matchings (random, PM-SDP [26], Rodola et al. [37], MINA) that are used as initialisation for PMF [49] to obtain a dense matching. The first row shows the reference shape and the other rows show the colour-coded dense correspondences for the respective methods.

4.3 Dense Shape Matching

Next, we demonstrate that our method can be used to obtain a suitable initialisation for dense correspondence methods. Since dense non-rigid matching approaches are highly initialisation-dependent (even the convex approach [10] requires a good initial alignment, cf. Sec. 2), it is crucial that they are provided with a good initialisation.

For these experiments we use the product manifold filter (PMF) [49] for obtaining a dense matching from a given sparse matching. To obtain the initial sparse matching, in addition to our MINA approach, we consider a random matching, a rigid alignment obtained via PM-SDP [26], and the approach by Rodola et al. [37]. Unlike in the previous section, here we extract the sparse points that we want to match based on geodesic farthest point sampling (FPS), which obtains an (approximately) uniform sampling of control points on the shapes.

In Fig. 8 we show results for the PMF-based densification for shapes from the TOSCA dataset [9] (cat, dog, wolf, human), the SHREC watertight dataset [15] (glasses, teddy, pigs), the FAUST dataset [8] (human) and the SCAPE dataset [1] (human). Using a random initialisation (second row) fails in all cases and therefore confirms the dependence of PMF to its initialisation. Although PM-SDP finds a global optimum (of a convex relaxation), the rigid deformation model is too restricted and therefore does not produce reliable dense correspondences for non-rigid shape matching (third row). The method by Rodola et al. [37] works well for several cases (fourth row), but due to its initialisation-dependence and potential orientation flips it also leads to several wrong matchings. We find that for various types of matching problems, including strong non-rigid deformations (cat in the first column), or inter-object matching (wolf-dog in the second column), our MINA method provides the most reliable initialisation (last row). Although in many cases MINA is able to properly handle self-symmetries, such symmetries form a particular difficulty for all considered methods and therefore may lead to wrong matchings (last two columns). Another difficulty are drastic non-rigid deformations (dog in the fourth last column).

4.4 Flexibility of MINA Formulation

Next, we demonstrate the flexibility of our MINA model by addressing several variants of shape matching formulations in a proof-of-concept manner.

Outlier rejection.

So far, we assumed that there exists a corresponding convex polyhedron on for each control point . In order to allow that some control points of are not matched to a convex polyhedron on

, we propose to use an outlier rejection mechanism where up to

of the control points can remain unmatched. To this end, we replace the correspondence term in (6) with

(13)

where is a sparse error variable with . Here, we use to denote the row-wise -norm that counts the total number of non-zero rows. To model the -norm as MIP, we introduce the outlier indicator variable , where we impose . Moreover, we make use of the fact that both shapes are spatially bounded, which implies a bounded correspondence error. With that, we can enforce sparsity of with the linear constraint for a sufficiently large (positive) number . As such, whenever , the -th control point does not contribute any error towards the term since will compensate for the discrepancy between and . In Fig. 9 we compare our original formulation with the outlier rejection mechanism, which makes it possible to match pairs of shapes even when the control points are inconsistent between both shapes.

no outlier rejection                         with outlier rejection

 

                                                                                               

Figure 9: Left: without outlier rejection the dog’s upper thigh is matched to the wolf’s neck (red arrows). Right: our outlier rejection variant (13) effectively disregards this control point inconsistency (the unmatched point on is shown in black).
Shape to point cloud matching.

We used MINA for matching a human body mesh (from [1]) to a real point cloud that we acquired using a TreedyScan Full Body Scanner

. The raw point cloud was cropped using a manually specified bounding box, downsampled to about 10k points, and denoised. The control points where sampled using geodesic FPS, where we used a nearest neighbour graph for computing geodesics (and estimating normals) on the point cloud. For this experiment we enforce that

is an injective matching, i.e. we impose . In Fig. LABEL:fig:teaser (middle left) we show the resulting matching, which confirms that our method works well in this setting.

Different topologies.

We used MINA for matching two human shapes with different topologies, as shown in Fig. LABEL:fig:teaser (middle right), where the hands in are not touching, whereas the hands in are touching, as indicated by the geodesic paths between both hands shown as red lines. Here, we used geodesic FPS to sample the control points.

Partial shape matching:

We also match a partial shape to a full shape, which we show in Fig. LABEL:fig:teaser (right). Here, we used geodesic FPS to sample the control points and we enforce that is an injective matching, as above.

Bounded distortion matching.

Our formulation also allows to bound the maximum distortion of a matching. This can be implemented by imposing linear constraints for those and where the geodesic distance between points on and points on exceed the maximum allowed distortion. With that, at most one of the matchings or is allowed.

5 Discussion & Limitations

Although our proposed MINA method has a range of desirable properties, including its high flexibility, its tractability (in practice) due to a low-dimensional matching representation, or its initialisation independence, there are also open points that we aim to address in the future. In Sec. 4.4 we demonstrated that MINA enables matching a mesh to real-world point cloud data. Considering severely cluttered data, cf. [36], or matching shapes with other data representations (e.g. polygon soups) are interesting next steps. A prominent strength of our formulation is that solely using geometric properties already achieves good results. However, additionally incorporating feature descriptors, as commonly done for shape matching, is straightforward and may be useful for further boosting the matching performance.

Scalability.

Our MINA formulation allows to solve non-rigid shape matching problems with being of order . Ideally one would be able to address matching problems with a much denser sampling of control points, so that more severe non-rigid deformations can be modelled accurately. Although we gained a significant scalability improvement compared to a QAP formulation, cf. Fig. 1, a further reduction of the computational time would be beneficial.

Multi matching.

The presented MINA formulation is phrased for matching pairs of shapes. We believe that multi matching problems would also benefit from related formulations. One potential way for achieving this is to consider all pairwise matching problems (in a symmetric fashion), and coupling these using cycle-consistency constraints.

6 Conclusion

We have presented a convex mixed-integer programming formulation for non-rigid shape matching problems, and we have demonstrated that finding the global optimum is tractable in (most) practical scenarios (see Fig. 7). In overall, our formulation comes with a range of benefits: (i) it is more efficient to solve to global optimality compared to the frequently used QAP formulation (Fig. 1), (ii) it is initialisation independent, (iii) it is able to obtain suitable initialisations for dense shape matching methods (Sec. 4.3), and (iv) it is highly flexible in the type of non-rigid shape matching problems it can handle (Sec. 4.4

). Although MIP formulations are oftentimes evaded for matching problems in computer vision (due to their high computational complexity), in this work we have shown that a suitable problem-specific modelling indeed allows to solve non-rigid shape matching problems as MIP.

Acknowledgement: This work was funded by the ERC Consolidator Grant 4DRepLy.

References

  • [1] Dragomir Anguelov, Praveen Srinivasan, Daphne Koller, Sebastian Thrun, Jim Rodgers, and James Davis. Scape: Shape completion and animation of people. In SIGGRAPH, 2005.
  • [2] MOSEK ApS. The MOSEK optimization toolbox for MATLAB manual. Version 9.0., 2019.
  • [3] MOSEK ApS. MOSEK Modeling Cookbook. Release 3.2.1, 2020.
  • [4] Mokhtar S Bazaraa and Alwalid N Elshafei. An exact branch-and-bound procedure for the quadratic-assignment problem. Naval Research Logistics Quarterly, 26(1):109–121, 1979.
  • [5] Florian Bernard, Frank R Schmidt, Johan Thunberg, and Daniel Cremers. A combinatorial solution to non-rigid 3d shape-to-image matching. In CVPR, 2017.
  • [6] Florian Bernard, Christian Theobalt, and Michael Moeller. DS*: Tighter Lifting-Free Convex Relaxations for Quadratic Matching Problems. In CVPR, 2018.
  • [7] Paul J Besl and Neil D McKay. Method for registration of 3-d shapes. In Sensor fusion IV: control paradigms and data structures, volume 1611, pages 586–606. International Society for Optics and Photonics, 1992.
  • [8] Federica Bogo, Javier Romero, Matthew Loper, and Michael J. Black. FAUST: Dataset and evaluation for 3D mesh registration. In CVPR, 2014.
  • [9] Alexander Bronstein, Michael Bronstein, and Ron Kimmel. Numerical Geometry of Non-Rigid Shapes. Springer Publishing Company, Incorporated, 1 edition, 2008.
  • [10] Qifeng Chen and Vladlen Koltun. Robust nonrigid registration by convex optimization. In CVPR, 2015.
  • [11] James Coughlan, Alan Yuille, Camper English, and Dan Snow. Efficient deformable template detection and localization without user initialization. Computer Vision and Image Understanding, 78(3):303–319, 2000.
  • [12] Hongkai Dai, Gregory Izatt, and Russ Tedrake. Global inverse kinematics via mixed-integer convex optimization. The International Journal of Robotics Research, May 2017.
  • [13] Nadav Dym, Haggai Maron, and Yaron Lipman. DS++ - A flexible, scalable and provably tight relaxation for matching problems. ACM Transactions on Graphics (TOG), 36(6), 2017.
  • [14] Pedro F Felzenszwalb. Representation and detection of deformable shapes. TPAMI, 27(2):208–220, 2005.
  • [15] Daniela Giorgi, Silvia Biasotti, and Laura Paraboschi. Shape retrieval contest 2007: Watertight models track, 2007.
  • [16] Thibault Groueix, Matthew Fisher, Vladimir G Kim, Bryan C Russell, and Mathieu Aubry. Unsupervised cycle-consistent deformation for shape matching. In Computer Graphics Forum, volume 38, pages 123–133, 2019.
  • [17] Oshri Halimi, Or Litany, Emanuele Rodola, Alex M Bronstein, and Ron Kimmel. Unsupervised learning of dense shape correspondence. In CVPR, 2019.
  • [18] Ruqi Huang and Maks Ovsjanikov. Adjoint map representation for shape analysis and matching. In Computer Graphics Forum, volume 36, pages 151–163, 2017.
  • [19] Vladimir G. Kim, Yaron Lipman, and Thomas Funkhouser. Blended Intrinsic Maps. ACM Transactions on Graphics (TOG), 30(4), 2011.
  • [20] Yam Kushinsky, Haggai Maron, Nadav Dym, and Yaron Lipman. Sinkhorn algorithm for lifted assignment problems. SIAM Journal on Imaging Sciences, 12(2):716–735, 2019.
  • [21] Zorah Lähner, Emanuele Rodola, Frank R Schmidt, Michael M Bronstein, and Daniel Cremers. Efficient globally optimal 2d-to-3d deformable shape matching. In CVPR, 2016.
  • [22] D Khuê Lê-Huu and Nikos Paragios. Alternating direction graph matching. In CVPR, 2017.
  • [23] Or Litany, Tal Remez, Emanuele Rodola, Alex Bronstein, and Michael Bronstein. Deep functional maps: Structured prediction for dense shape correspondence. In CVPR, 2017.
  • [24] Johan Lofberg. Yalmip: A toolbox for modeling and optimization in matlab. In ICRA, 2004.
  • [25] Eliane Maria Loiola, Nair Maria Maia de Abreu, Paulo Oswaldo Boaventura Netto, Peter Hahn, and Tania Maia Querido. A survey for the quadratic assignment problem. European Journal of Operational Research, 176(2):657–690, 2007.
  • [26] Haggai Maron, Nadav Dym, Itay Kezurer, Shahar Kovalsky, and Yaron Lipman. Point registration via efficient convex relaxation. ACM Transactions on Graphics (TOG), 35(4):73, 2016.
  • [27] Simone Melzi, Jing Ren, Emanuele Rodolà, Abhishek Sharma, Peter Wonka, and Maks Ovsjanikov. Zoomout: Spectral upsampling for efficient shape correspondence. ACM Transactions on Graphics (TOG), 38(6), 2019.
  • [28] Gonzalo Mena, David Belanger, Scott Linderman, and Jasper Snoek. Learning latent permutations with gumbel-sinkhorn networks. In ICLR, 2018.
  • [29] James Munkres. Algorithms for the Assignment and Transportation Problems. Journal of the Society for Industrial and Applied Mathematics, 5(1):32–38, Mar. 1957.
  • [30] Andriy Myronenko and Xubo Song. Point set registration: Coherent point drift. TPAMI, 32(12):2262–2275, 2010.
  • [31] Carl Olsson, Fredrik Kahl, and Magnus Oskarsson. Branch-and-bound methods for euclidean registration problems. TPAMI, 31(5):783–794, 2008.
  • [32] Maks Ovsjanikov, Mirela Ben-Chen, Justin Solomon, Adrian Butscher, and Leonidas Guibas. Functional maps: a flexible representation of maps between shapes. ACM Transactions on Graphics (TOG), 31(4):30, 2012.
  • [33] Charles R Qi, Hao Su, Kaichun Mo, and Leonidas J Guibas. Pointnet: Deep learning on point sets for 3d classification and segmentation. In CVPR, 2017.
  • [34] Jing Ren, Adrien Poulenard, Peter Wonka, and Maks Ovsjanikov. Continuous and orientation-preserving correspondences via functional maps. In SIGGRAPH Asia, 2018.
  • [35] F Rendl, P Pardalos, and H Wolkowicz. The quadratic assignment problem: A survey and recent developments. In Proceedings of the DIMACS workshop on quadratic assignment problems, volume 16, pages 1–42, 1994.
  • [36] Emanuele Rodolà, Andrea Albarelli, Filippo Bergamasco, and Andrea Torsello. A scale independent selection process for 3d object recognition in cluttered scenes. IJCV, 102(1-3):129–145, 2013.
  • [37] Emanuele Rodola, Alex M Bronstein, Andrea Albarelli, Filippo Bergamasco, and Andrea Torsello. A game-theoretic approach to deformable shape matching. In CVPR, 2012.
  • [38] Emanuele Rodolà, Samuel Rota Bulo, Thomas Windheuser, Matthias Vestner, and Daniel Cremers. Dense non-rigid shape correspondence using random forests. In CVPR, 2014.
  • [39] Jean-Michel Roufosse, Abhishek Sharma, and Maks Ovsjanikov. Unsupervised deep learning for structured shape matching. In ICCV, 2019.
  • [40] Christian Schellewald and Christoph Schnörr. Probabilistic subgraph matching based on convex relaxation. In EMMCVPR, 2005.
  • [41] Thomas Schoenemann and Daniel Cremers. A combinatorial solution for model-based image segmentation and real-time tracking. TPAMI, 32(7):1153–1164, 2009.
  • [42] Peter H Schönemann. A generalized solution of the orthogonal procrustes problem. Psychometrika, 31(1):1–10, 1966.
  • [43] Olga Sorkine and Marc Alexa. As-rigid-as-possible surface modeling. In Symposium on Geometry processing, 2007.
  • [44] Robert W Sumner and Jovan Popović. Deformation transfer for triangle meshes. ACM Transactions on graphics (TOG), 23(3):399–405, 2004.
  • [45] Robert W Sumner, Johannes Schmid, and Mark Pauly. Embedded deformation for shape manipulation. ACM Transactions on Graphics (TOG), 26(3):80, 2007.
  • [46] Paul Swoboda, Carsten Rother, Hassan Abu Alhaija, Dagmar Kainmüller, and Bogdan Savchynskyy. Study of lagrangean decomposition and dual ascent solvers for graph matching. In CVPR, 2017.
  • [47] Oliver Van Kaick, Hao Zhang, Ghassan Hamarneh, and Daniel Cohen-Or. A survey on shape correspondence. In Computer Graphics Forum, volume 30, pages 1681–1707. Wiley Online Library, 2011.
  • [48] Matthias Vestner, Zorah Lähner, Amit Boyarski, Or Litany, Ron Slossberg, Tal Remez, Emanuele Rodola, Alex Bronstein, Michael Bronstein, Ron Kimmel, and Daniel Cremers. Efficient deformable shape correspondence via kernel matching. In 3DV, 2017.
  • [49] Matthias Vestner, Roee Litman, Emanuele Rodolà, Alex Bronstein, and Daniel Cremers.

    Product manifold filter: Non-rigid shape correspondence via kernel density estimation in the product space.

    In CVPR, 2017.
  • [50] Juan Pablo Vielma and George L Nemhauser. Modeling disjunctive constraints with a logarithmic number of binary variables and constraints. Mathematical Programming, 128(1-2):49–72, 2011.
  • [51] Thomas Windheuser, Ulrich Schlickewei, Frank R Schmidt, and Daniel Cremers. Geometrically consistent elastic matching of 3d shapes: A linear programming solution. In ICCV, 2011.
  • [52] Jiaolong Yang, Hongdong Li, and Yunde Jia. Go-icp: Solving 3d registration efficiently and globally optimally. In CVPR, 2013.
  • [53] Feng Zhou and Fernando De la Torre. Factorized Graph Matching. TPAMI, 38(9):1774–1789, 2016.

Appendix A Obtaining Convex Polyhedra on

Given the -th control point of , we obtain its associated convex polyhedron using a neighhourhood propagation strategy. To this end, we define a planarity criterion using the maximum of the mean absolute deviation (MAD) of the surface normals at the points in . For a given matrix and its column mean , the MAD is defined as . As such, starting with , we consider the vertices of the -ring of the -th vertex as , where we increase as long as . Here, denotes the matrix of the normals of the -ring of the -th vertex and specifies a user-defined threshold. Once we have determined the largest such that the -ring neighborhood is sufficiently planar (below the threshold ), we discard all points in the rows of that are interior vertices of the convex polygon defined by (as they are redundant). In Fig. 3 (right) we show so-obtained convex polyhedra.

Appendix B Piece-wise Linear Approximation of Constraints

The constraint can be expressed as the orthogonality constraint in combination with . Hence, the constraint comprises exactly quadratic equality constraints, which form a non-convex set. In order to define a piece-wise linear approximation we use specially-ordered set of type 2 (sos2) variables. An sos2 variable is a non-negative vector where at most two consecutive element can be non-zero. With that, such a variable allows to encode a non-convex quadratic function in terms of a piece-wise linear one, so that in the end all quadratic constraints become linear, and the sos2 constraints are imposed based on (few) binary variables.

For illustrative purposes, we will now provide a simple example for a piece-wise linear approximation of a quadratic function. Let us consider the function on the interval . First, we split the domain into bins, so that we evaluate at these discrete positions, and then compute all values that fall in-between the sampled points as linear approximation between its two neighbour sample points. Let , and let be a vector that contains the discretised domain, so that defines the elementwise square of . Moreover, let be a non-negative sos2 variable that sums to one (as mentioned, sos2 means that only two consecutive elements can be non-zero). Then, we can approximate

(14)

For example, for , we obtain the sos2 variable (since ). With that, we obtain . The important property is that (14) allows to approximate the quadratic function based on a representation that is linear in the variables and . In addition to [12] and [50], we refer the interested reader to [3, Ch. 9.1.11]111also available online at
https://docs.mosek.com/modeling-cookbook/mio.html#continuous-piecewise-linear-functions
, where sos2 constraints as well as the idea of using a logarithmic Gray encoding are explained.

Appendix C Search Space Reduction

In addition to using a logarithmic encoding of the discretisation variables, we also impose further constraints upon the matching matrix , so that the size of the overall search space can be reduced. A similar idea has also been pursued in [26]

, where a scalar criterion based on the average geodesic distance (ADG) was used. In contrast, rather than using a single scalar value for each vertex, we propose to leverage a more powerful approach that considers more descriptive statistics of geodesic distances, see Fig. 

10. To this end, for each control point we compute evenly spaced percentiles from to of the geodesic distance from this control point to all other points. Let and denote the so-obtained percentile matrices, where the columns are the ordered percentiles from to . As such, the matrices and can be seen as features of the respective shapes extracted at the control points. Whenever two control points correspond to each other, the features and should be similar, so that is small. Based on this observation, we use the feature distances and sequentially solve linear assignment problems (LAP) [29] to match features. The idea of solving a sequence of LAPs is to not only find the single best matching of features, but rather finding multiple solutions , so that the nonzero elements in define the allowed matchings in . Here, the matrix is obtained by performing a feature matching using when forbidding all previous matchings . As such, when optimising MINA, we constrain all elements of to be zero for those elements where is zero. Using this procedure is advantageous over simple thresholding of , since on the one hand feasibility is guaranteed, and on the other hand the number of allowed matchings is equal for all control points.

    (ADG)     (ours)

Figure 10: Shape (left) is matched to , where a search space reduction using ADG [26] leads to wrong matchings (middle), whereas ours produces correct correspondences (right).

Appendix D Further Implementation Details

We have implemented MINA in the optimisation modelling toolbox Yalmip [24], which uses the conic mixed-integer branch and bound solver MOSEK [2] as backend (with default parameters). In all experiments we used and , where we account for different problem sizes by multiplying each with , where denotes the total number of elements that the norm is applied to. We set the weights for the smoothness term to , where for by we denote the length of the common edge of triangles . With that, we achieve that the deformation of two adjacent triangles is more flexible when their common edge is small. We set the planarity threshold to . For keeping the number of variables small, for each convex polyhedron we only keep the respective control point as well as four additional points obtained via farthest point sampling (FPS) using geodesic distances as metric. Note that this results in convex polyhedra that are either a single point (if none of the -rings of the -th control point satisfies the planarity criterion), or is a matrix. Since the non-rigid deformation induced by a sparse set of matched control points is relatively coarse, rather than modelling with the original mesh resolution we use downsampled meshes with about faces, similarly as in [45]. We set , , and use bins for the discretisation.

Next, we provide additional details on shape to point cloud matching and the relation between partial shape matching and outlier rejection.

Shape to point cloud matching.

The main difference when is represented as a point cloud rather than a mesh is that we need to use a different approach for computing geodesic distances and normals (required for sampling control points, for the definition of the convex polyhedra as described in Sec. A, and for the search space reduction described in Sec. C). In our case we compute geodesic distances and normals based on a nearest neighbour graph, where we use the nearest neighbours. After this information is obtained, the overall optimisation problem is equivalent to the one when is a mesh, since the only information of that is explicitly used in our optimisation problem formulation are the convex polyhedra.

Relation between partial shape matching and outlier rejection.

In our considered partial shape matching setting we match all control points of the partial shape to the full shape . This is in contrast to our outlier rejection setting, where we allow that some control points of are not matched to . However, although for partial shape matching we do not use outlier rejection, we mention that principally it could be used for matching a full shape to a partial one.

Appendix E Additional TOSCA Results

In Fig. 11 we report runtime statistics over all shape matching instances from the TOSCA datasets for all considered methods. On this dataset, the median processing time of our method is min, whereas the other methods require less than one minute.

Figure 11: Runtime statistics for the TOSCA dataset. Note that the vertical axis is shown in log-scale.

In Fig. 12 we present further results where also the deformed shape is shown.


 

Figure 12: Correspondences obtained from our method for several shape matching instances from the TOSCA dataset [9]. Correspondences are indicated by dots with corresponding colours. In each triplet of rows we show , , and the deformed shape from top to bottom. Note that the deformation is not always able to obtain a good alignment (particularly for severe non-rigid transformations, e.g. the cat or the gorilla), but the correspondences are still reasonable in many cases.