A Message Passing Algorithm for the Minimum Cost Multicut Problem

12/16/2016 ∙ by Paul Swoboda, et al. ∙ 0

We propose a dual decomposition and linear program relaxation of the NP-hard minimum cost multicut problem. Unlike other polyhedral relaxations of the multicut polytope, it is amenable to efficient optimization by message passing. Like other polyhedral relaxations, it can be tightened efficiently by cutting planes. We define an algorithm that alternates between message passing and efficient separation of cycle and odd-wheel inequalities. This algorithm is more efficient than state-of-the-art algorithms based on linear programming, including algorithms written in the framework of leading commercial software, as we show in experiments with large instances of the problem from applications in computer vision, biomedical image analysis and data mining.




1 Introduction

Decomposing a graph into meaningful clusters is a fundamental primitive in computer vision, biomedical image analysis and data mining. In settings where no information is given about the number or size of clusters, and information is given only about the pairwise similarity or dissimilarity of nodes, a canonical mathematical abstraction is the minimum cost multicut (or correlation clustering) problem [14]. The feasible solutions of this problem, multicuts, relate one-to-one to the decompositions of the graph. A multicut is the set of edges that straddle distinct clusters. The cost of a multicut is the sum of costs attributed to its edges.

In the field of computer vision, the minimum cost multicut problem has been applied in [3, 4, 39, 6] to the task of unsupervised image segmentation defined by the BSDS data sets and benchmarks [30]. In the field of biomedical image analysis, the minimum cost multicut problem has been applied to an image segmentation task for connectomics [5]. In the field of data mining, applications include [7, 33, 12, 13]. As the minimum cost multicut problem is NP-hard [9, 16], even for planar graphs [8], large and complex instances with millions of edges, especially those for connectomics, pose a challenge for existing algorithms.

Related Work. Due to the importance of multicuts for applications, many algorithms for the minimum cost multicut problem have been proposed. They are grouped below into three categories: primal feasible local search algorithms, linear programming algorithms and fusion algorithms.

Primal feasible local search algorithms [35, 31, 20, 18, 19] attempt to improve an initial feasible solution by means of local transformations from a set that can be indexed or searched efficiently. Local search algorithms are practical for large instances, as the cost of all operations is small compared to the cost of solving the entire problem at once. On the downside, the feasible solution that is output typically depends on the initialization. Moreover, the optimality of the output is not certified, as no lower bound is computed.

Linear programming algorithms [24, 25, 27, 32, 38] operate on an outer polyhedral relaxation of the feasible set. Their output is independent of their initialization and provides a lower bound. This lower bound can be used directly inside a branch-and-bound search for certified optimal solutions. Alternatively, the LP relaxation can be tightened by cutting planes. Several classes of planes are known that define a facet of the multicut polytope and can be separated efficiently [14]. On the downside, algorithms for general LPs that are agnostic to the structure of the multicut problem scale super-linearly with the size of the instance.

Fusion algorithms attempt to combine feasible solutions of subproblems obtained by combinatorial or random procedures into successively better multicuts. The fusion process can either rely on column generation [39], binary quadratic programming [11] or any algorithm for solving integer LPs [10]. In particular, [39] provides dual lower bounds but is restricted to planar graphs. [11, 10] explore the primal solution space in a clever way, but do not output dual information.

Outline. Below, a discussion of preliminaries (Sec. 2) is followed by the definition of our proposed decomposition (Sec. 3) and algorithm (Sec. 4) for the minimum cost multicut problem. Our approach combines the efficiency of local search with the lower bounds of LPs and the subproblems of fusion, as we show in experiments with large and diverse instances of the problem (Sec. 5). All code and data will be made publicly available upon acceptance of the paper.

2 Preliminaries

2.1 Minimum Cost Multicut Problem

A decomposition (or clustering) of a graph is a partition of the node set such that and every cluster , is connected. The multicut induced by a decomposition is the subset of those edges that straddle distinct clusters (cf. Fig. 1). Such edges are said to be cut. Every multicut induced by any decomposition of is called a multicut of . We denote by the set of all multicuts of .

Given, for every edge , a cost of this edge being cut, the instance of the minimum cost multicut problem w.r.t. these costs is the optimization problem (1) whose feasible solutions are all multicuts of . For any edge , negative costs favour the nodes and to be in distinct components. Positive costs favour these nodes to lie in the same component.
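As an illustration of these definitions, the following Python sketch computes the multicut induced by a decomposition and its cost. The function names and the toy instance are hypothetical and serve only as an example; clusters are encoded by an assumed node-to-cluster map.

```python
def induced_multicut(edges, cluster_of):
    """Return the edges whose endpoints lie in distinct clusters."""
    return {(u, v) for (u, v) in edges if cluster_of[u] != cluster_of[v]}


def multicut_cost(cost, multicut):
    """Sum the costs of all edges that are cut."""
    return sum(cost[e] for e in multicut)


# Toy instance (assumed): a positive cost favours joining the endpoints,
# a negative cost favours separating them.
cost = {("a", "b"): 2.0, ("b", "c"): -1.0, ("a", "c"): -3.0, ("c", "d"): 1.5}
cluster_of = {"a": 0, "b": 0, "c": 1, "d": 1}

cut = induced_multicut(cost.keys(), cluster_of)
total = multicut_cost(cost, cut)
```

In this toy instance, both clusters are connected, the edges bc and ac are cut, and the cost of the multicut is the sum of their costs.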


Figure 1: Depicted above is a decomposition of a graph into three components (green). The multicut induced by this decomposition consists of the edges that straddle distinct components (red).

This problem is NP-hard [9, 16], even for planar graphs [8]. Below, we recapitulate its formulation as a binary LP and then turn to LP relaxations: For any 01-labeling of the edges of , the subset of those edges labeled 1 is a multicut of if and only if satisfies the system (3) of cycle inequalities [14]. Hence, (1) can be stated equivalently in the form of the binary LP (2)–(4).
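For 01-labelings, the cycle inequalities admit a simple combinatorial check: a labeling violates some cycle inequality if and only if an edge labeled 1 has endpoints that are connected by a path of edges labeled 0. The following Python sketch tests this with a disjoint set data structure. The function name and the input encoding are assumptions made for this example.

```python
def is_multicut(nodes, labeling):
    """Check whether the edges labeled 1 form a multicut.

    labeling maps each edge (u, v) to 0 (uncut) or 1 (cut).
    """
    parent = {v: v for v in nodes}

    def find(v):
        # Find the representative of v, with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    # Merge the endpoints of every uncut edge.
    for (u, v), x in labeling.items():
        if x == 0:
            parent[find(u)] = find(v)

    # No cut edge may connect two nodes of the same component.
    return all(find(u) != find(v) for (u, v), x in labeling.items() if x == 1)


triangle = [0, 1, 2]
one_cut = {(0, 1): 1, (1, 2): 0, (0, 2): 0}
two_cut = {(0, 1): 1, (1, 2): 1, (0, 2): 0}
```

On a triangle, cutting exactly one edge is infeasible, whereas cutting two edges separates one node from the other two.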

subject to (3)

An LP relaxation is obtained by replacing the integrality constraints (4) by with . This results in an outer relaxation of the multicut polytope , which is the convex hull of the characteristic functions of all multicuts of . The LP relaxation obtained for , i.e., with only the cycle inequalities, will not in general be tight.

A tighter LP relaxation is obtained by enforcing also the odd wheel inequalities [14]. A -wheel is a cycle in with nodes all of which are connected to an additional node that is not in the cycle and is called the center of the -wheel (cf. Fig. 3). For any odd number , any -wheel of , the cycle and the center of the -wheel, every characteristic function of a multicut of satisfies the odd wheel inequality


For completeness, we note that other inequalities known to further tighten the LP relaxation can be included in our algorithm, e.g., the bicycle inequalities [14] defined on graphs as in Fig. 3. We do not, however, consider inequalities other than cycle and odd wheel inequalities in the algorithm we propose.
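The odd wheel inequality can be checked numerically on small wheels. The Python sketch below assumes the form in which this inequality is commonly stated: the number of cut cycle edges minus the number of cut edges incident to the center is at most ⌊k/2⌋. The function name and the node numbering are hypothetical. The sketch enumerates all node labelings of a k-wheel, each of which induces a multicut, and reports the largest left-hand side.

```python
from itertools import product


def max_wheel_lhs(k):
    """Maximize (#cut cycle edges - #cut spokes) over all multicuts of a k-wheel."""
    center = k  # cycle nodes are 0..k-1, the center is node k
    rim = [(i, (i + 1) % k) for i in range(k)]
    spokes = [(center, i) for i in range(k)]
    best = -k
    # Every multicut is induced by some node labeling, and vice versa.
    for labels in product(range(k + 1), repeat=k + 1):
        def cut(e):
            return labels[e[0]] != labels[e[1]]
        lhs = sum(cut(e) for e in rim) - sum(cut(e) for e in spokes)
        best = max(best, lhs)
    return best


bound_3 = max_wheel_lhs(3)
bound_5 = max_wheel_lhs(5)
```

For the 3-wheel and the 5-wheel, the maximum equals ⌊k/2⌋, so the inequality is satisfied by every multicut and holds with equality for some.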

Figure 2: Odd Wheel

Figure 3: Odd Bicycle Wheel

2.2 Integer relaxed pairwise separable LPs

LP relaxations of the multicut problem can in principle be solved with algorithms for general LPs which are available in excellent software such as CPlex [2] and Gurobi [22]. However, these algorithms scale super-linearly with the size of the problem and are hence impractical for large instances.

We define in Sec. 3 an LP relaxation of the multicut problem in the form of an IRPS-LP (Def. 1). IRPS-LPs are a special case of dual decomposition [21]. In Def. 1, every defines a subproblem, and every edge defines a dependency of subproblems. Def. 1 is more specific in that, firstly, the subproblems are binary and, secondly, the linear constraints (9) that describe the dependence of subproblems are defined by 01-matrices that map 01-vectors to 01-vectors. IRPS-LPs are amenable to efficient optimization by message passing in the framework of [36].


Definition 1 (IRPS-LP [36]).

Let and let be a graph with . For every , let , let , and let . Let . For every , let , and such that


Then, the LP written below is called integer relaxed pairwise separable w.r.t. the graph .

subject to (9)

3 Dual Decomposition

Figure 4: Depicted above is a triangulated cycle (black) covered by three triangles (red, green and blue)

A straightforward decomposition of the minimum cost multicut problem (2)–(4) in the form of an IRPS-LP (Def. 1) consists of one subproblem for every edge, one subproblem for every cycle inequality and one subproblem for every odd-wheel inequality. From a computational perspective, it is, however, advantageous to triangulate cycles and odd wheels, and to consider the resulting smaller subproblems. Below, three classes of subproblems are defined rigorously.

Edge Subproblems.

For every edge , we consider a subproblem with the feasible set , encoding whether edge is cut (1) or uncut (0).

Triangle Subproblems

For every cycle , we consider the triangles to , as depicted in Fig. 4. If some edge of a triangle is not in , we add it to with cost zero, i.e., we triangulate the cycle in . For each triangle , we introduce a subproblem whose feasible set consists of the five feasible multicuts of the triangle, i.e., .

Lollipop Subproblems

For every odd number and every -wheel of consisting of a center node and cycle nodes , we introduce two classes of subproblems. For the 5-wheel depicted in Fig. 3, these subproblems are depicted in Fig. 5.

For every , we add the triangle subproblem , as described in the previous section.

For every , we add the subproblem for the lollipop graph that consists of the triangle and the additional edge . The feasible set of a lollipop graph has ten elements, five feasible multicuts of the triangle times two for accounting for the additional edge.
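The sizes of these feasible sets can be verified by enumeration. The Python sketch below lists the feasible edge labelings of a triangle (all labelings except those that cut exactly one edge) and of a lollipop graph (a triangle labeling combined with either label of the additional edge). The function names and the tuple encoding are assumptions made for this example.

```python
from itertools import product


def triangle_multicuts():
    """All 01-labelings of the three triangle edges that are multicuts."""
    # Cutting exactly one edge of a triangle violates a cycle inequality.
    return [x for x in product((0, 1), repeat=3) if sum(x) != 1]


def lollipop_multicuts():
    """Triangle multicuts extended by the label of the additional edge."""
    # The additional edge lies on no cycle, so it can be cut or uncut freely.
    return [t + (s,) for t in triangle_multicuts() for s in (0, 1)]


n_triangle = len(triangle_multicuts())
n_lollipop = len(lollipop_multicuts())
```

The enumeration yields five labelings for the triangle and ten for the lollipop graph, in agreement with the counts stated above.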

3.1 Dependencies

The dependencies between triangle subproblems and edge subproblems are expressed below in the form of a linear system. It fits into the form (9) of an IRPS-LP.

The dependency between a lollipop subproblem with edge set and a triangle subproblem with edge set is stated below as a linear system with sums over edges not shared between and . This linear system has the form (9) of an IRPS-LP.

3.2 Remarks

Remark 1. The triangulation of cycles can be understood as the construction of a junction tree [37] in such a way that the minimum cost multicut problem over the cycle can be solved by dynamic programming. The triangulation of cycles can also be understood as a tightening of an outer polyhedral relaxation of the multicut polytope: A cycle inequality (3) defines a facet of the multicut polytope if and only if the cycle is chordless [14]. By triangulating a cycle, we obtain a set of minimal chordless cycles (triangles) whose cycle inequalities together imply that of the entire cycle.

Remark 2. Technically, we would not have needed to include triangle subproblems for odd wheels. Instead, we could have introduced dependencies between lollipops directly in the form of an IRPS-LP. However, by introducing triangle factors in addition and by expressing dependencies between lollipops and triangles, we couple lollipop factors from different odd wheels more tightly whenever they share the same triangles.

Figure 5: Depicted above is a triangulation of the odd wheel from Figure 3. It consists of the triangles and the lollipop graphs .

4 Algorithm

We now define an algorithm for the minimum cost multicut problem (2)–(4). This algorithm takes an instance of the problem as input and alternates for a fixed number of iterations between two main procedures.

The first procedure, defined in Sec. 4.1, solves an instance of a dual of the IRPS-LP relaxation defined in the previous section. The output consists of a lower bound and a re-parameterization of the instance of the minimum cost multicut problem given as input. The second procedure tightens the IRPS-LP relaxation by adding subproblems for cycle inequalities (3) and odd wheel inequalities (5) violated by the current solution. Separation procedures for finding such violated inequalities, more efficiently than in cutting plane algorithms for the primal [24, 25, 27], are defined in Sec. 4.2.

To find feasible solutions of the instance of the minimum cost multicut problem given as input, we apply a state-of-the-art local search algorithm on the computed re-parameterizations, a procedure commonly referred to as rounding (Sec. 4.3).

4.1 Message Passing

Like other algorithms based on dual decomposition, the algorithm we propose does not solve the IRPS-LP directly, in the primal domain, but optimizes a dual of (8)–(9). Specifically, it operates on a space of re-parametrizations of the problem defined below: For any two dependent subproblems , we can change the costs and by an arbitrary vector according to the update rules


We refer to any update of according to the rules (10)–(11) as message passing. Message passing does not change the cost of any primal feasible solution, as


Message passing does, however, change the dual lower bound to (8) given by


The maximum of over all costs obtainable by message passing is equal to the minimum of (8), by linear programming duality. We seek to alter the costs by means of message passing so as to maximize the lower bound . For the general IRPS-LP, a framework of algorithms to achieve this goal is defined in [36]. For the minimum cost multicut problem, we define and implement Alg. 1 within this framework. The specifics of this algorithm for the minimum cost multicut problem are discussed below. General properties of message passing for IRPS-LPs are discussed in [36].
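The effect of message passing can be illustrated on a single triangle. The Python sketch below is a minimal example under assumed conventions, not the implementation of Alg. 1: costs of edge subproblems are stored as one number per edge (the cost of cutting it), costs of the triangle subproblem as a table over its five feasible labelings, and a message shifts an amount delta of cost from an edge to the triangle. All names and the toy costs are hypothetical.

```python
from itertools import product

# The five feasible labelings of a triangle.
TRIANGLE = [x for x in product((0, 1), repeat=3) if sum(x) != 1]


def lower_bound(edge_cost, tri_cost):
    """Sum of the minima of all subproblems, each minimized independently."""
    edge_part = sum(min(0.0, c) for c in edge_cost)
    return edge_part + min(tri_cost[x] for x in TRIANGLE)


def primal_cost(edge_cost, tri_cost, x):
    """Cost of a labeling x that all subproblems agree on."""
    return sum(c * xi for c, xi in zip(edge_cost, x)) + tri_cost[x]


def pass_message(edge_cost, tri_cost, i, delta):
    """Shift delta of the cost of cutting edge i from the edge to the triangle."""
    new_edge = list(edge_cost)
    new_edge[i] -= delta
    new_tri = {x: c + delta * x[i] for x, c in tri_cost.items()}
    return new_edge, new_tri


edge_cost = [1.0, 1.0, -3.0]            # toy costs (assumed)
tri_cost = {x: 0.0 for x in TRIANGLE}   # the triangle starts with zero cost
lb_before = lower_bound(edge_cost, tri_cost)

e, t = edge_cost, tri_cost
for i in range(3):
    e, t = pass_message(e, t, i, e[i])  # move the entire edge cost
lb_after = lower_bound(e, t)
```

In this example, the lower bound increases from -3 to -2, which is the cost of an optimal multicut of the triangle, while the cost of every feasible labeling remains unchanged.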

Data: , ,
for  do
       if  is an edge subproblem : then
             Receive messages:
             for  do
             end for
            Send messages:
             for  do
             end for
       end if
      if  is a triangle subproblem with edges : then
             Receive messages:
             for lollipops with  do
             end for
            Send messages:
             for lollipops with  do
             end for
            for lollipops with  do
             end for
       end if
end for
Algorithm 1 Message passing for the multicut problem

Factor Order.

Alg. 1 iterates through all edge and triangle subproblems. The order is specified as follows: We assume that a node order is given. With respect to this node order, edges are ordered lexicographically. For every triangle and its edge set with , we define the ordering constraint . For every lollipop graph and its edge set with , we define the ordering constraint . The strict partial order defined by these constraints is extended to a total order by topological sorting.

Message Passing Description.

When an edge subproblem is visited, Alg. 1 receives messages from all dependent triangle subproblems. Having received a message from triangle , the costs satisfy the condition

In other words, the cost of the triangle factor has no preference for either or . Sending messages from is analogous: Having sent messages from , we have , i.e., there is again no preference for either or .

When we visit a triangle subproblem , we proceed analogously with all dependent lollipop subproblems: Once messages have been received, lollipop subproblems have no preference for incident edges. Once messages have been sent, this holds true for the triangle subproblems.

Once Alg. 1 has visited all subproblems and terminates, we reverse the order of subproblems and invoke Alg. 1 again. This double call of Alg. 1 is repeated for a fixed number of iterations that is a parameter of our algorithm.

4.2 Separation

Applying Alg. 1 with all cycles and all odd wheels of a graph is impractical, as the number of triangles for cycle inequalities (3) is cubic, and the number of lollipop graphs for odd wheels (5) is quartic in . In order to arrive at a practical algorithm, we take a cutting plane approach in which we separate and add subproblems for violated cycle and odd wheel inequalities periodically. Initially, contains only one element for every edge , and is empty.

In the primal, given some fractional , it is common to look for maximally violated inequalities (3) and (5). This is possible in polynomial time via shortest path computations [14, 17]. In our dual formulation, we have no primal solution to search for violated inequalities. Here, a suitable criterion is to consider those additional triangle or lollipop subproblems that necessarily increase the dual lower bound by some constant . Among these subproblems, we choose those for which the increase is maximal and add them to the graph . A similar dual cutting plane approach has been shown to be useful for graphical models in [34]. As we discuss below, separation is more efficient in the dual than in the primal.

4.2.1 Cycle Inequalities

for  do
       if  then
       end if
end for
for  do
       if  and find(u) = find(v) then
       end if
end for
Algorithm 2 Separation of cycle inequalities (3)

We characterize those cycles whose subproblem increases the dual lower bound by at least .

Proposition 1.

Let be a cycle with and for . Then, the dual lower bound can be increased by by including a triangulation of .

In order to find such cycles, we apply Alg. 2. This algorithm first records in a disjoint set data structure whether distinct nodes are connected via edges with weight . Then, it visits all edges with . If the endpoints of are connected by a path along which all edges have weight at least , it searches for a shortest such path by means of breadth first search.
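The two phases described above can be sketched in Python as follows. This is an illustration under assumptions, not the implementation of Alg. 2: the threshold is called eps, edges with cost at least eps are treated as connecting, edges with cost at most -eps are tested, and the function name and input encoding are hypothetical.

```python
from collections import deque


def separate_cycles(nodes, cost, eps):
    """Return node paths u..v that close a cycle with a repulsive edge (u, v)."""
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    # Phase 1: record connectivity via edges with cost >= eps.
    adj = {v: [] for v in nodes}
    for (u, v), c in cost.items():
        if c >= eps:
            parent[find(u)] = find(v)
            adj[u].append(v)
            adj[v].append(u)

    # Phase 2: for every edge with cost <= -eps whose endpoints are
    # connected, find a shortest connecting path by breadth first search.
    cycles = []
    for (u, v), c in cost.items():
        if c <= -eps and find(u) == find(v):
            pred = {u: None}
            queue = deque([u])
            while queue and v not in pred:
                w = queue.popleft()
                for n in adj[w]:
                    if n not in pred:
                        pred[n] = w
                        queue.append(n)
            path = [v]
            while path[-1] != u:
                path.append(pred[path[-1]])
            cycles.append(path[::-1])
    return cycles


cost = {(0, 1): 2.0, (1, 2): 2.0, (0, 2): -1.0, (2, 3): -1.0}
found = separate_cycles([0, 1, 2, 3], cost, eps=0.5)
```

The connectivity test with the disjoint set data structure avoids a breadth first search for edges whose endpoints cannot be connected, such as the edge (2, 3) in this toy instance.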

In the primal, finding a maximally violated cycle inequality (3) is more expensive, requiring, for every edge , the search for a -path with minimum cost  [14] by, e.g., Dijkstra’s algorithm.

4.2.2 Odd Wheel Inequalities

Data: Triangles , costs ,
for  do
       , ,
       for triangles  do
             if (16) holds true then
             end if
       end for
      for  do
             if  then
                   if  is a simple cycle in  then
                   end if
             end if
       end for
end for
Algorithm 3 Separation of odd wheel inequalities (5)

We characterize those odd wheels whose lollipop subproblem increases the lower bound by at least .

Proposition 2.

Let an odd wheel with center node and cycle nodes . Adding the lollipop subproblems for increases by at least if the costs of each triangle are such that the minimal cost of any edge labeling of the triangle cutting precisely one edge incident to is smaller by than the minimal cost of any edge labeling of the triangle cutting zero or two edges incident to . That is:


In order to find such odd wheels, we apply Alg. 3. This algorithm builds on our observation that we need to look only at triangles whose subproblem has already been added. Hence, Alg. 3 visits each node and builds a bipartite graph as follows. (An example is depicted in Fig. 6 for a 5-wheel and (16) holding true for all triangles of the wheel.) For each triangle such that (16) holds true, four nodes are added to , two copies of each original node. These are joined by edges . If a path from to exists in , we have found a violated odd wheel inequality (5). As is bipartite, a -path in corresponds to an odd cycle in . As before, the search for paths is accelerated by connectivity tests via a disjoint set data structure and is carried out by breadth first search.
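The search for an odd cycle in the bipartite graph can be sketched in Python as follows. The sketch assumes that the triangles satisfying (16) for a fixed center node have already been reduced to a list of edges between cycle nodes; the function name, the encoding of the two copies of a node as pairs (v, 0) and (v, 1), and the toy inputs are hypothetical. It does not test whether the walk found is a simple cycle, which Alg. 3 does in addition.

```python
from collections import deque


def odd_closed_walk(start, edges):
    """Find a closed walk of odd length through start, or return None."""
    # Every edge {u, v} joins copy 0 of one endpoint to copy 1 of the other.
    adj = {}
    for u, v in edges:
        for a, b in (((u, 0), (v, 1)), ((u, 1), (v, 0))):
            adj.setdefault(a, []).append(b)
            adj.setdefault(b, []).append(a)

    # A path between the two copies of start has odd length.
    src, dst = (start, 0), (start, 1)
    pred = {src: None}
    queue = deque([src])
    while queue and dst not in pred:
        w = queue.popleft()
        for n in adj.get(w, []):
            if n not in pred:
                pred[n] = w
                queue.append(n)
    if dst not in pred:
        return None

    walk = [dst]
    while walk[-1] != src:
        walk.append(pred[walk[-1]])
    return [v for v, _ in reversed(walk)]


five_cycle = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
four_cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
walk = odd_closed_walk(0, five_cycle)
```

For the 5-cycle, the walk visits all five nodes and returns to its start, whereas no such walk exists for the 4-cycle, which contains no odd cycle.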

In the primal, finding a maximally violated odd wheel inequality (5) entails the same construction of the bipartite graph for each node  [17]. However, a shortest path search w.r.t. edge costs needs to be carried out by Dijkstra’s algorithm instead of breadth first search. Further complication in the primal comes from the fact that a separation algorithm needs to visit all in order to compute the shortest -path in .

Figure 6: Depicted above is the bipartite graph constructed by Alg. 3 for separating the 5-wheel depicted in Fig. 3.

4.3 Rounding

Our message passing Alg. 1 improves a dual lower bound on (2), but does not provide a feasible solution of (2)–(4). In order to obtain a feasible multicut, we apply a local search algorithm defined in [26], namely greedy additive edge contraction (GAEC), followed by Kernighan-Lin with joins (KLj). GAEC computes a multicut by greedily contracting those edges for which the join decreases the cost maximally. It stops as soon as no contraction of any edge strictly decreases the cost. KLj attempts to improve a given multicut recursively by applying transformations from three classes: (1) moving nodes between two components, (2) moving nodes from a given component to a newly forming one or (3) joining two components. GAEC and KLj are local search algorithms that output a feasible multicut that need not be optimal.
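The greedy contraction step can be sketched in Python as follows. This is a simplified illustration of greedy additive edge contraction under assumptions, not the implementation of [26]: it recomputes the edge costs between clusters after every contraction instead of using a priority queue, and the function name and toy instance are hypothetical.

```python
def gaec(nodes, cost):
    """Greedily contract the edge with the largest positive cost."""
    cluster = {v: v for v in nodes}  # node -> representative of its cluster
    weights = {}
    for (u, v), c in cost.items():
        key = frozenset((u, v))
        weights[key] = weights.get(key, 0.0) + c

    while weights:
        key, c = max(weights.items(), key=lambda kv: kv[1])
        if c <= 0:
            break  # no contraction strictly decreases the cost
        a, b = tuple(key)
        # Merge cluster b into cluster a.
        for v in cluster:
            if cluster[v] == b:
                cluster[v] = a
        # Redirect edges of b to a and add up the costs of parallel edges.
        merged = {}
        for k, w in weights.items():
            if k == key:
                continue
            k2 = frozenset(a if x == b else x for x in k)
            merged[k2] = merged.get(k2, 0.0) + w
        weights = merged
    return cluster


cost = {(0, 1): 2.0, (1, 2): 2.0, (0, 2): -1.0, (2, 3): -1.5}
cluster = gaec([0, 1, 2, 3], cost)
```

In this toy instance, nodes 0, 1 and 2 are joined, because the combined cost of the edges between the contracted cluster and node 2 remains positive, while node 3 is kept separate.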

We apply GAEC and KLj not only to the instance of the minimum cost multicut problem given as input but also to the re-parameterization of this instance output by Alg. 1. The rationale for doing so comes from LP duality:

Proposition 3.

Assume maximizes the dual lower bound and the relaxation is tight, i.e.


Moreover, let such that is an optimal multicut of . Then,


Having run Alg. 1 for a while, we expect to fulfill the sign condition of Prop. 3 approximately. Therefore, the sign of will be a good hint of the edge being cut. Thus, informally, we expect local search algorithms operating on the re-parameterized instance of the problem to yield better feasible multicuts than local search algorithms operating on the given instance.

For MAP-inference in discrete graphical models, it is known from [28, 29] that primal rounding can be improved greatly when applied to cost functions re-parameterized by message passing.

5 Experiments


We compare against several state-of-the-art algorithms.

  • The algorithm MC-ILP [25] is an efficient implementation of a cutting plane algorithm that solves (2) by separating cycle inequalities (3). CPlex [2] is used to solve the underlying ILPs. The integrality conditions (4) are passed directly to the solver. According to [25], this is beneficial due to the excellent branch-and-cut capabilities of CPlex [2].

  • Cut, Glue & Cut [11], abbreviated as CGC, is a move making algorithm using planar max-cut subproblems to improve multicuts.

  • Fusion moves for correlation clustering [10], abbreviated as CC-Fusion, fuses multicuts generated by various proposal generators with the help of auxiliary multicut problems, solved in turn by MC-ILP. We use randomized hierarchical clustering and randomized watersheds as proposal generators, identified by the suffixes -RHC and -RWS. We use parameters for the proposal generators as recommended by the authors [10].

  • MP-C denotes Algorithm 1 when we only separate for cycle inequalities (3) by Algorithm 2, while MP-COW denotes that we additionally separate for odd wheel inequalities (5) by Algorithm 3. We search for triangles and lollipops to add every 10th iteration.

  • KL is the GAEC and KLj implementation [26] described in Section 4.3 for computing multicuts. We let KL run every 100th iteration of MP-C and MP-COW on the current reparametrized edge costs.

MC-ILP, CGC and CC-Fusion are implemented as part of the OpenGM suite [23]. Only MC-ILP and our solvers MP-C and MP-COW generate non-trivial dual lower bounds. CGC also outputs dual lower bounds, but these are equivalent to the trivial lower bound , where edge weights are as given by the problem. It has been shown that CGC, CC-Fusion and KL outperform other primal heuristics [10], hence we do not compare to any other heuristic algorithm. MC-ILP also outperforms the LP-based solver [32], due to the latter using the slower COIN-OR CLP [15] solver internally, hence we exclude it from the comparison as well.

All solvers were run on a laptop computer with an i5-5200 CPU at 2.2 GHz and 8 GB of RAM.


We compare on 8 datasets of diverse origin.

  • image-seg consists of images of the Berkeley segmentation dataset [30], presegmented with superpixels, for which pairwise affinity values have been computed as in [4].

  • The knott-3d-{150|300|450|550} datasets come from a neural circuit reconstruction problem of tissue [5] with , , and voxels. The data is presegmented into supervoxels.

  • modularity clustering aims to cluster a social network into subgroups based on affinity between individual persons.

  • CREMI-{small|large} datasets were constructed as part of the CREMI [1] challenge, which aims to reconstruct neural circuits of the adult fly brain. The images are taken by electron microscopy. The -small instances are cropped versions of the -large ones. To our knowledge, the CREMI-large dataset contains the largest multicut problems approached with LP-based methods.

The image-seg, knott-3d and modularity clustering datasets were taken from the OpenGM benchmark [23], while the CREMI datasets were kindly provided by their authors and are not yet published.

The datasets consist of 100, 8, 8, 8, 8, 6, 3 and 3 instances, respectively, 144 in total. Dataset details can be found in Table 1.


We have set a time limit of one hour for all algorithms. In Table 1, results averaged over all instances of each dataset are reported. In Figure 7, primal solution energy and dual lower bound (where applicable), averaged over all instances of each dataset, are plotted against runtime.

As can be seen from Table 1, except for dataset CREMI-large, our solver MP-COW gives dual bounds that are within 0.0045%, 1.9%, 0.0061%, 0.0068%, 0.0017%, 0.0007% and 0.0083% of the dual lower bound obtained by MC-ILP, which uses the advanced branch-and-cut facilities CPlex [2] provides. For CREMI-large, only our solvers MP-C and MP-COW output dual lower bounds, as MC-ILP did not finish a single iteration after one hour. As can be seen from Fig. 7, our lower bound usually converges faster than MC-ILP’s. We conjecture that MP-C and MP-COW inside a branch-and-bound solver can significantly extend the reach of exact methods for the multicut problem.

Strangely, KL does not perform well on image-seg, even though the lower bounds we achieve with MP-C and MP-COW are not far from the optimal lower bounds computed by MC-ILP. On the other hand, MP-C and MP-COW give much better dual and primal results for modularity clustering early on. Generally, when compared to MC-ILP’s primal convergence, we give much lower values early on, and for the large-scale datasets knott-3d-550, CREMI-small and CREMI-large, MC-ILP’s primal solutions are no longer useful.

Unlike MC-ILP, our reparametrized costs can be used to improve heuristic primal algorithms. An example of this can be seen in Fig. 8, where reparametrized costs improve KL’s solutions.

Dataset / Algorithm MP-C MP-COW CGC MC-ILP CC-Fusion-RWS CC-Fusion-RHC
image-seg #I 100 UB 4730.66 4732.66 4600.81 4434.91 4447.06 4436.33
#V LB 4434.69 4434.71 4129.70 4434.91
#E time(s) 21.92 41.35 0.14 11.89 1.19 1.30
modularity clustering #I 6 UB -0.49 -0.49 -0.30 -0.44 0.00 -0.44
#V LB -0.53 -0.53 -0.79 -0.52
#E time(s) 0.68 0.80 0.15 2911.10 0.00 17.78
knott-3d-150 #I 8 UB -4570.61 -4570.26 -4220.66 -4571.69 -4534.76 -4552.51
#V LB -4572.65 -4571.97 -4855.18 -4571.69
#E time(s) 0.84 3.81 0.04 2.37 0.26 0.53
knott-3d-300 #I 8 UB -27285.41 -27285.15 -24864.59 -27302.78 -27242.03 -27247.29
#V LB -27307.36 -27304.64 -28901.58 -27302.78
#E time(s) 475.63 747.81 2.73 227.33 2.96 8.15
knott-3d-450 #I 8 UB -78426.70 -78426.70 -70865.27 -78391.32 -78386.14 -78381.06
#V LB -78527.23 -78523.89 -83272.85 -78522.51
#E time(s) 3649.05 3640.66 31.56 1840.47 16.52 119.23
knott-3d-550 #I 8 UB -136439.78 -136439.78 -123841.47 -135766.90 -136464.05 -136395.89
#V LB -136814.88 -136803.34 -144703.64 -136755.36
#E time(s) 3783.33 3745.76 102.42 3683.22 72.94 594.60
CREMI-small #I 3 UB -213167.45 -213167.45 -194616.60 -209594.49 -168905.17 -213117.84
#V LB -213225.65 -213225.67 -215473.98 -213208.94
#E time(s) 3645.51 3661.71 319.01 2775.81 3543.61 2555.48
CREMI-large #I 3 UB -3886840.98 -3886840.98 -3772597.37 -3619190.20
#V LB -3892753.21 -3893090.30
#E time(s) 3667.67 3806.33 5978.08 23139.40
Table 1: Primal solution energy (UB)/dual lower bound (LB)/runtime in seconds averaged over all instances of datasets. #I means number of instances in dataset, #V and #E mean number of vertices and edges in multicut instances. signifies method did not finish one iteration after one hour, so was excluded from comparison. means method does not output dual lower bound. Bold numbers signify lowest primal solution energy, highest lower bound, fastest runtime.

Figure 7: Averaged runtime plots for image-seg, modularity clustering, knott-3d-150, knott-3d-300, knott-3d-450, knott-3d-550, CREMI-small and CREMI-large datasets. Continuous lines denote dual lower bounds and dashed ones primal energies. Values are averaged over all instances of the dataset. The x-axis is logarithmic.
Figure 8: Instance gm_knott_3d_072 from dataset knott-3d-300 where reparametrized costs improve KL’s solutions.

We have shown that LP-based methods are feasible for solving large scale multicut problems on commodity hardware and one does not have to resort to heuristic primal algorithms. We achieve dual bounds very close to those computed by state-of-the-art branch-and-cut solvers. Additionally, our method usually gives much faster dual bound convergence, resulting in superior solutions when terminated early. Also the primal heuristic GAEC + KLj can be improved when run on costs as computed by our method.

It remains an interesting task to integrate primal heuristics more tightly into our message passing approach and further improve the dual lower bound by e.g. embedding our solver into branch and cut.

6 Acknowledgments

The authors would like to thank Vladimir Kolmogorov for helpful discussions. This work is partially funded by the European Research Council under the European Union’s Seventh Framework Programme (FP7/2007-2013)/ERC grant agreement no 616160.


  • [1] CREMI MICCAI Challenge on circuit reconstruction from Electron Microscopy Images. https://cremi.org.
  • [2] IBM ILOG CPLEX Optimizer. http://www-01.ibm.com/software/integration/optimization/cplex-optimizer/.
  • [3] A. Alush and J. Goldberger. Break and conquer: Efficient correlation clustering for image segmentation. In E. R. Hancock and M. Pelillo, editors, SIMBAD, volume 7953 of Lecture Notes in Computer Science, pages 134–147. Springer, 2013.
  • [4] B. Andres, J. H. Kappes, T. Beier, U. Köthe, and F. A. Hamprecht. Probabilistic image segmentation with closedness constraints. In D. N. Metaxas, L. Quan, A. Sanfeliu, and L. J. V. Gool, editors, ICCV, pages 2611–2618. IEEE Computer Society, 2011.
  • [5] B. Andres, T. Kröger, K. L. Briggman, W. Denk, N. Korogod, G. Knott, U. Köthe, and F. A. Hamprecht. Globally optimal closed-surface segmentation for connectomics. In A. W. Fitzgibbon, S. Lazebnik, P. Perona, Y. Sato, and C. Schmid, editors, ECCV (3), volume 7574 of Lecture Notes in Computer Science, pages 778–791. Springer, 2012.
  • [6] B. Andres, J. Yarkony, B. S. Manjunath, S. Kirchhoff, E. Turetken, C. C. Fowlkes, and H. Pfister. Segmenting planar superpixel adjacency graphs w.r.t. non-planar superpixel affinity graphs. In Energy Minimization Methods in Computer Vision and Pattern Recognition (EMMCVPR), 2013.
  • [7] A. Arasu, C. Ré, and D. Suciu. Large-scale deduplication with constraints using dedupalog. In Y. E. Ioannidis, D. L. Lee, and R. T. Ng, editors, ICDE, pages 952–963. IEEE Computer Society, 2009.
  • [8] Y. Bachrach, P. Kohli, V. Kolmogorov, and M. Zadimoghaddam. Optimal coalition structure generation in cooperative graph games. In Proceedings of the Twenty-Seventh AAAI Conference on Artificial Intelligence, July 14-18, 2013, Bellevue, Washington, USA, 2013.
  • [9] N. Bansal, A. Blum, and S. Chawla. Correlation clustering. Machine Learning, 56(1):89–113, 2004.
  • [10] T. Beier, F. A. Hamprecht, and J. H. Kappes. Fusion moves for correlation clustering. In CVPR, pages 3507–3516. IEEE Computer Society, 2015.
  • [11] T. Beier, T. Kröger, J. H. Kappes, U. Köthe, and F. A. Hamprecht. Cut, glue & cut: A fast, approximate solver for multicut partitioning. In CVPR. Proceedings, 2014.
  • [12] Y. Chen, S. Sanghavi, and H. Xu. Clustering sparse graphs. In P. L. Bartlett, F. C. N. Pereira, C. J. C. Burges, L. Bottou, and K. Q. Weinberger, editors, NIPS, pages 2213–2221, 2012.
  • [13] F. Chierichetti, N. Dalvi, and R. Kumar. Correlation clustering in mapreduce. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’14, pages 641–650, New York, NY, USA, 2014. ACM.
  • [14] S. Chopra and M. R. Rao. The partition problem. Mathematical Programming, 59(1):87–115, 1993.
  • [15] COIN-OR CLP, 2016. http://www.coin-or.org/projects/Clp.xml.
  • [16] E. D. Demaine, D. Emanuel, A. Fiat, and N. Immorlica. Correlation clustering in general weighted graphs. Theor. Comput. Sci., 361(2):172–187, Sept. 2006.
  • [17] M. M. Deza and M. Laurent. Geometry of Cuts and Metrics. Springer Publishing Company, Incorporated, 1st edition, 2009.
  • [18] M. Elsner and E. Charniak. You talking to me? A corpus and algorithm for conversation disentanglement. In K. McKeown, J. D. Moore, S. Teufel, J. Allan, and S. Furui, editors, ACL, pages 834–842. The Association for Computer Linguistics, 2008.
  • [19] M. Elsner and W. Schudy. Bounding and comparing methods for correlation clustering beyond ILP. In Proceedings of the Workshop on Integer Linear Programming for Natural Language Processing, ILP ’09, pages 19–27, Stroudsburg, PA, USA, 2009. Association for Computational Linguistics.
  • [20] A. Gionis, H. Mannila, and P. Tsaparas. Clustering aggregation. ACM Trans. Knowl. Discov. Data, 1(1):4, 2007.
  • [21] M. Guignard and S. Kim. Lagrangean decomposition for integer programming: theory and applications. Revue française d’automatique, d’informatique et de recherche opérationnelle. Recherche opérationnelle, 21(4):307–323, 1987.
  • [22] Gurobi Optimization, Inc., 2015. http://www.gurobi.com.
  • [23] J. H. Kappes, B. Andres, F. A. Hamprecht, C. Schnörr, S. Nowozin, D. Batra, S. Kim, B. X. Kausler, T. Kröger, J. Lellmann, N. Komodakis, B. Savchynskyy, and C. Rother. A comparative study of modern inference techniques for structured discrete energy minimization problems. International Journal of Computer Vision, 115(2):155–184, 2015.
  • [24] J. H. Kappes, M. Speth, B. Andres, G. Reinelt, and C. Schnörr. Globally optimal image partitioning by multicuts. In EMMCVPR. Springer, 2011.
  • [25] J. H. Kappes, M. Speth, G. Reinelt, and C. Schnörr. Higher-order segmentation via multicuts. CoRR, abs/1305.6387, 2013.
  • [26] M. Keuper, E. Levinkov, N. Bonneel, G. Lavoué, T. Brox, and B. Andres. Efficient decomposition of image and mesh graphs by lifted multicuts. In ICCV, 2015.
  • [27] S. Kim, S. Nowozin, P. Kohli, and C. D. Yoo. Higher-order correlation clustering for image segmentation. In J. Shawe-Taylor, R. S. Zemel, P. L. Bartlett, F. C. N. Pereira, and K. Q. Weinberger, editors, NIPS, pages 1530–1538, 2011.
  • [28] V. Kolmogorov. Convergent tree-reweighted message passing for energy minimization. IEEE Trans. Pattern Anal. Mach. Intell., 28(10):1568–1583, 2006.
  • [29] V. Kolmogorov. A new look at reweighted message passing. IEEE Trans. Pattern Anal. Mach. Intell., 37(5):919–930, 2015.
  • [30] D. Martin, C. Fowlkes, D. Tal, and J. Malik. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In Computer Vision, 2001. ICCV 2001. Proceedings. Eighth IEEE International Conference on, volume 2, pages 416–423 vol.2, 2001.
  • [31] V. Ng and C. Cardie. Improving machine learning approaches to coreference resolution. Proceedings of the 40th Annual Meeting on Association for Computational Linguistics - ACL ’02, (July):104, 2001.
  • [32] S. Nowozin and S. Jegelka. Solution stability in linear programming relaxations: graph partitioning and unsupervised learning. In A. P. Danyluk, L. Bottou, and M. L. Littman, editors, ICML, volume 382 of ACM International Conference Proceeding Series, pages 769–776. ACM, 2009.
  • [33] E. Sadikov, J. Madhavan, L. Wang, and A. Halevy. Clustering query refinements by user intent. In World Wide Web Conference (WWW). ACM Press, April 2010.
  • [34] D. Sontag, D. K. Choe, and Y. Li. Efficiently searching for frustrated cycles in MAP inference. In UAI, pages 795–804. AUAI Press, 2012.
  • [35] W. M. Soon, H. T. Ng, and D. C. Y. Lim. A machine learning approach to coreference resolution of noun phrases. Computational Linguistics, 27(4):521–544, 2001.
  • [36] P. Swoboda, J. Kuske, and B. Savchynskyy. A dual ascent framework for Lagrangean decomposition of combinatorial problems. CoRR, 2016.
  • [37] M. J. Wainwright and M. I. Jordan. Graphical models, exponential families, and variational inference. Foundations and Trends in Machine Learning, 1(1-2):1–305, 2008.
  • [38] J. Yarkony, T. Beier, P. Baldi, and F. A. Hamprecht. Parallel multicut segmentation via dual decomposition. In New Frontiers in Mining Complex Patterns - Third International Workshop, NFMCP 2014, Held in Conjunction with ECML-PKDD 2014, Nancy, France, September 19, 2014, Revised Selected Papers, pages 56–68, 2014.
  • [39] J. Yarkony, A. Ihler, and C. C. Fowlkes. Fast Planar Correlation Clustering for Image Segmentation, pages 568–581. Springer Berlin Heidelberg, Berlin, Heidelberg, 2012.

7 Appendix

7.1 Proofs

Proof of Proposition 1

Let be a cycle with and for . Then, the dual lower bound can be increased by by including a triangulation of .


Let cycle have vertices and assume that with for notational purposes. After triangulation, triangle factors on vertices will be present in the model. Let the current reparametrization be .

The triangle factors corresponding to cycle will enforce the cycle inequality (3)


It holds that


where the dual lower bound on cycle . The first inequality above is due to either in the optimal solution or one being one due to (19). The second inequality is due to the fact that (i)  by linear programming duality and (ii) the triangle factors enforce more inequalities than only (19). ∎
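The two ingredients of this argument, triangulating a cycle and checking a cycle inequality, can be illustrated with a minimal Python sketch. It rests on assumptions not taken from the text: the cycle is given as an ordered vertex list, the triangulation is a fan rooted at the first vertex (the exact vertex indices used in the proof are not recoverable here), and the function names `fan_triangulation` and `violates_cycle_inequality` are hypothetical. The check uses the standard form of the cycle inequality, in which no edge of a cycle may be cut by more than the other edges of that cycle combined.

```python
from typing import List, Sequence, Tuple


def fan_triangulation(cycle: Sequence[int]) -> List[Tuple[int, int, int]]:
    """Return triangles (v1, v_i, v_{i+1}) covering a cycle.

    Assumption: a fan rooted at the first vertex; any other
    triangulation of the cycle would serve the same purpose.
    """
    if len(cycle) < 3:
        raise ValueError("a cycle needs at least three vertices")
    root = cycle[0]
    return [(root, cycle[i], cycle[i + 1]) for i in range(1, len(cycle) - 1)]


def violates_cycle_inequality(x: Sequence[float], tol: float = 1e-9) -> bool:
    """Check x_e <= sum of x over the other cycle edges, for every edge e.

    x[i] is the (possibly fractional) cut indicator of the i-th cycle edge.
    It suffices to test the edge with the largest value.
    """
    largest = max(x)
    return largest > sum(x) - largest + tol


# A cycle on five vertices is covered by three triangles.
triangles = fan_triangulation([0, 1, 2, 3, 4])

# Cutting exactly one edge of a cycle is infeasible for a multicut.
one_cut = violates_cycle_inequality([1, 0, 0, 0])
two_cuts = violates_cycle_inequality([1, 1, 0, 0])
```

For a binary labeling, the check reduces to asking whether exactly one edge of the cycle is cut.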

Proof of Proposition 2

Let an odd wheel with center node and cycle nodes . Adding the lollipop subproblems for increases by at least if the costs of each triangle are such that the minimal cost of any edge labeling of the triangle cutting precisely one edge incident to is smaller by than the minimal cost of any edge labeling of the triangle cutting or edges incident to . That is:


Condition (16) means that in all triangles in the odd wheel , the minimal assignment with regard to the current reparametrization has exactly one edge incident to . All other assignments have cost greater by at least . As is odd, there is no possibility to combine those local assignments to a global assignment on .

On the other hand, our construction of lollipop factors ensures exactness on odd wheels. As at least one triangle must then be assigned costs that are not locally optimal and which is larger by than its minimal reparametrized cost, the result follows. ∎
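The parity argument in this proof can be checked by brute force on small wheels. The sketch below is illustrative only and assumes a wheel whose cycle nodes are indexed 0 to k-1, so that triangle i contains the spokes to cycle nodes i and i+1 (modulo k); the function name `consistent_spoke_labelings` is hypothetical. It enumerates all 0/1 labelings of the spokes and keeps those in which every triangle cuts exactly one of its two spokes.

```python
from itertools import product
from typing import List, Tuple


def consistent_spoke_labelings(k: int) -> List[Tuple[int, ...]]:
    """Spoke labelings in which each triangle cuts exactly one spoke.

    x[i] == 1 means the spoke from the center to cycle node i is cut.
    Triangle i contains spokes i and (i + 1) % k.
    """
    return [
        x
        for x in product((0, 1), repeat=k)
        if all(x[i] + x[(i + 1) % k] == 1 for i in range(k))
    ]


# Odd wheels admit no such labeling; even wheels admit the two
# alternating ones.
odd_case = consistent_spoke_labelings(5)
even_case = consistent_spoke_labelings(4)
```

Requiring exactly one cut spoke per triangle forces the spoke labels to alternate around the cycle, which closes up only when the number of cycle nodes is even.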

Proof of Proposition 3

Assume maximizes the dual lower bound and the relaxation is tight, i.e.


Moreover, let such that is an optimal multicut of . Then,


The claim follows from the complementary slackness conditions in linear programming duality. ∎

7.2 Detailed experimental evaluation

Table 2 gives a detailed per-instance evaluation of all algorithms considered in the experimental section.
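One way to read the upper bounds (UB) and lower bounds (LB) reported per instance is through the relative gap between them. The following Python sketch is an assumption about how a reader might post-process the table, not part of the evaluation itself; the function name `relative_gap` is hypothetical, and the example values are the first UB and LB entries listed for instance 101087.bmp, assumed here to belong to the same algorithm.

```python
def relative_gap(ub: float, lb: float) -> float:
    """Relative gap (UB - LB) / |UB| between an upper and a lower bound."""
    if lb > ub:
        raise ValueError("lower bound exceeds upper bound")
    if ub == 0.0:
        # Avoid division by zero: the gap is zero only if both bounds are zero.
        return 0.0 if lb == 0.0 else float("inf")
    return (ub - lb) / abs(ub)


# First UB and LB entries of the rows for 101087.bmp.
gap = relative_gap(2906.16, 2788.69)

# A solver that proves optimality reports UB == LB, i.e. a zero gap.
closed = relative_gap(2789.90, 2789.90)
```

A gap of zero certifies that the reported multicut is optimal, while a positive gap bounds how far the reported cost can be from the optimum.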

Instance MP-C MC-COW CGC MC-ILP CC-Fusion-RWS CC-Fusion-RHC
101087.bmp UB 2906.16 2906.16 2853.56 2789.90 2800.22 2789.90
LB 2788.69 2788.95 2622.38 2789.90
runtime(s) 0.31 0.59 0.02 5.11 0.63 0.81
102061.bmp UB 3017.57 3017.57 3090.33 2943.77 2963.42 2944.46
LB 2932.80 2933.39 2750.99 2943.77
runtime(s) 3.26 5.83 0.05 8.75 0.61 0.96
103070.bmp UB 4437.16 4444.27 4457.13 4199.38 4205.03 4200.64
LB 4196.88 4196.94 3842.84 4199.38
runtime(s) 11.84 15.93 0.15 6.97 1.32 0.98
105025.bmp UB 6332.73 6333.88 6290.71 6055.33 6070.84 6061.05
LB 6045.21 6046.08 5506.01 6055.33
runtime(s) 33.75 55.69 0.35 32.15 1.91 1.59
106024.bmp UB 1832.16 1832.16 1654.27 1599.25 1618.18 1599.83
LB 1597.07 1597.21 1466.60 1599.25
runtime(s) 0.64 4.09 0.02 4.76 0.18 0.25
108005.bmp UB 6839.76 6841.08 6855.33 6578.03 6584.62 6578.18
LB 6567.48 6569.63 6151.29 6578.03
runtime(s) 4.58 22.38 0.26 10.86 2.49 1.54
108070.bmp UB 9082.04 9083.42 8612.32 8422.24 8445.73 8424.36
LB 8414.64 8414.99 7818.70 8422.24
runtime(s) 28.23 52.32 0.41 26.67 3.35 1.75
108082.bmp UB 5125.72 5127.99 5090.88 4800.15 4815.78 4806.04
LB 4786.46 4789.19 4380.66 4800.15
runtime(s) 6.85 25.07 0.16 13.51 1.20 1.89
109053.bmp UB 4575.76 4579.97 4616.61 4421.13 4424.05 4421.13
LB 4412.45 4413.08 4021.22 4421.13
runtime(s) 20.74 71.75 0.17 9.64 1.50 0.84
119082.bmp UB 4512.04 4512.04 4642.96 4530.71 4535.85 4532.29
LB 4526.91 4526.91 4346.24 4530.71
runtime(s) 0.10 0.10 0.02 0.48 1.53 1.57
12084.bmp UB 7502.34 7502.34 7443.28 7284.45 7301.17 7287.68
LB 7276.26 7277.38 6941.02 7284.45
runtime(s) 3.48 8.10 0.15 2.47 2.20 3.99
123074.bmp UB 4200.21 4204.24 4031.03 3842.74 3856.82 3847.83
LB 3829.51 3829.04 3439.47 3842.74
runtime(s) 35.27 24.55 0.14 23.01 0.47 0.86
126007.bmp UB 2756.86 2760.64 2747.72 2684.83 2706.76 2685.26
LB 2677.02 2677.08 2512.08 2684.83
runtime(s) 0.23 0.53 0.01 0.79 0.38 0.82
130026.bmp UB 6066.91 6066.91 5580.58 5350.83 5369.95 5354.31
LB 5331.00 5336.93 4828.82 5350.83
runtime(s) 36.91 167.54 0.26 19.66 0.99 1.40
134035.bmp UB 6840.69 6840.69 6679.89 6578.98 6595.87 6579.62
LB 6562.20 6565.03 6166.95 6578.98
runtime(s) 8.44 43.34 0.19 28.82 1.81 1.33
14037.bmp UB 1566.34 1582.13 1431.56 1383.14 1393.66 1383.14
LB 1375.55 1375.55 1274.27 1383.14
runtime(s) 0.09 0.17 0.01 0.25 0.13 0.22
143090.bmp UB 1728.22 1728.22 1807.41 1714.38 1725.88 1715.76
LB 1712.44 1712.26 1595.54 1714.38
runtime(s) 0.43 2.49 0.01 0.56 0.44 0.34
145086.bmp UB 3806.23 3808.75 3407.83 3322.21 3329.14 3322.59
LB 3319.36 3319.34 3197.53 3322.21
runtime(s) 0.32 0.45 0.01 0.83 0.41 1.59
147091.bmp UB 4092.25 4092.25 4129.72 3973.71 3982.30 3975.15
LB 3968.35 3970.58 3734.67 3973.71
runtime(s) 1.50 10.96 0.10 13.24 0.90 0.83
148026.bmp UB 8411.55 8411.55 8436.70 8205.98 8226.20 8207.72
LB 8198.62 8199.84 7780.68 8205.98
runtime(s) 6.48 29.65 0.15 4.49 3.10 2.73
148089.bmp UB 6680.96 6682.82 6666.83 6439.58 6455.48 6440.33
LB 6432.06 6431.52 6030.94 6439.58
runtime(s) 10.44 15.57 0.17 18.64 2.00 1.53
156065.bmp UB 5798.42 5801.59 5429.45 5234.15 5248.18 5234.76
LB 5224.22 5225.10 4857.14 5234.15
runtime(s) 10.19 35.24 0.15 18.51 1.49 1.12
157055.bmp UB 4797.86 4798.32 4768.87 4685.17 4696.42 4685.17
LB 4679.04 4679.39 4472.39 4685.17
runtime(s) 0.49 3.10 0.02 2.60 1.37 1.48
159008.bmp UB 4688.87 4694.40 4814.85 4540.87 4569.12 4541.22
LB 4534.88 4535.76 4217.95 4540.87
runtime(s) 3.53 21.00 0.10 16.64 0.83 1.29
160068.bmp UB 3263.91 3265.02 3264.31 3089.32 3103.23 3089.32
LB 3088.23 3088.17 2866.45 3089.32
runtime(s) 1.18 2.28 0.05 1.80 0.61 1.07
16077.bmp UB 4440.51 4443.05 4408.62 4227.88 4236.46 4228.78
LB 4224.10 4224.60 3921.75 4227.88
runtime(s) 4.37 13.81 0.07 3.22 1.18 1.63
163085.bmp UB 4493.17 4493.17 4577.62 4381.13 4406.98 4384.59
LB 4370.36 4371.00 3983.52 4381.13
runtime(s) 21.91 35.30 0.15 9.82 0.91 1.30
167062.bmp UB 1623.20 1623.20 1281.48 1273.72 1275.87 1275.67
LB 1273.01 1273.29 1233.39 1273.72
runtime(s) 0.10 0.66 0.00 1.00 0.11 0.17
167083.bmp UB 8979.42 8979.42 8545.37 8331.63 8344.35 8331.90
LB 8325.80 8328.03 7921.06 8331.63
runtime(s) 9.83 53.28 0.23 11.74 2.27 1.80
170057.bmp UB 3602.31 3602.31 3355.95 3266.17 3273.20 3266.73
LB 3260.19 3260.98 2989.38 3266.17
runtime(s) 21.07 26.30 0.07 18.20 0.53 0.67
175032.bmp UB 11863.40 11863.40 11926.00 11542.63 11566.74 11547.67
LB 11525.63 11526.57 10543.16 11542.63
runtime(s) 623.66 632.98 1.27 165.83 3.25 3.98
175043.bmp UB 8022.65 8033.50 8224.34 7816.92 7844.49 7822.01
LB 7809.60 7811.17 7136.44 7816.92