Engineering Kernelization for Maximum Cut

Kernelization is a general theoretical framework for preprocessing instances of NP-hard problems into (generally smaller) instances with bounded size, via the repeated application of data reduction rules. For the fundamental Max Cut problem, kernelization algorithms are theoretically highly efficient for various parameterizations. However, the efficacy of these reduction rules in practice---to aid solving highly challenging benchmark instances to optimality---remains entirely unexplored. We engineer a new suite of efficient data reduction rules that subsume most of the previously published rules, and demonstrate their significant impact on benchmark data sets, including synthetic instances, and data sets from the VLSI and image segmentation application domains. Our experiments reveal that current state-of-the-art solvers can be sped up by up to multiple orders of magnitude when combined with our data reduction rules. On social and biological networks in particular, kernelization enables us to solve four instances that were previously unsolved in a ten-hour time limit with state-of-the-art solvers; three of these instances are now solved in less than two seconds.




1 Introduction

The (unweighted) Max Cut problem is to partition the vertex set of a given graph $G = (V, E)$ into two sets $S \subseteq V$ and $V \setminus S$ so as to maximize the total number of edges between those two sets. Such a partition is called a maximum cut. Computing a maximum cut of a graph is a well-known problem in computer science; it is one of Karp’s 21 NP-complete problems [26]. While signed and weighted variants are often considered throughout the literature [4, 5, 6, 9, 13, 23, 24], the simpler (unweighted) case still presents a significant challenge for researchers, and solving it quickly is of paramount importance to all variants. Max Cut variants have many applications, including social network modeling [23], statistical physics [4], portfolio risk analysis [24], VLSI design [6, 9], network design [5], and image segmentation [13].
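To make the problem statement concrete, the following minimal sketch computes a maximum cut by exhaustive search. It is only an illustration for tiny graphs, not one of the solvers discussed in this paper; the function name `max_cut_brute_force` and the vertex numbering 0..n-1 are assumptions made for this example.

```python
from itertools import product

def max_cut_brute_force(n, edges):
    """Return (cut size, side assignment) of a maximum cut on vertices 0..n-1.

    Exhaustive search over all bipartitions; exponential in n.
    """
    best_size, best_side = -1, None
    # Vertex 0 is pinned to side 0: swapping the two sides gives the same cut.
    for rest in product((0, 1), repeat=max(n - 1, 0)):
        side = (0,) * min(n, 1) + rest
        size = sum(1 for u, v in edges if side[u] != side[v])
        if size > best_size:
            best_size, best_side = size, side
    return best_size, best_side

triangle = [(0, 1), (1, 2), (0, 2)]
four_cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
triangle_size, _ = max_cut_brute_force(3, triangle)
cycle_size, cycle_side = max_cut_brute_force(4, four_cycle)
```

An odd cycle such as the triangle can never have all of its edges cut, whereas the even cycle is bipartite and all four edges are cut.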

Theoretical approaches to solving Max Cut primarily focus on producing efficient parameterized algorithms through data reduction rules, which reduce the input size in polynomial time while maintaining the ability to compute an optimal solution to the original input. If the resulting (irreducible) graph has size bounded by a function of a given parameter, then it is called a kernel. Recent works focus on parameters measuring the distance between the maximum cut size of the input graph and a lower bound guaranteed for all graphs. The algorithm then must decide if the input graph admits a cut whose size exceeds the lower bound by at least $k$, for a given integer $k$. Two such lower bounds are the Edwards-Erdős bound [15, 16] and the spanning tree bound. Crowston et al. [11] were the first to show that unweighted Max Cut is fixed-parameter tractable when parameterized by distance $k$ above the Edwards-Erdős bound. Moreover, they show the problem admits a polynomial-size kernel with $O(k^5)$ vertices. Their result was extended to the more general Signed Max Cut problem, and the kernel size was decreased to $O(k^3)$ vertices [10]. Finally, Etscheid and Mnich [17] improved the kernel size to an optimal $O(k)$ vertices even for signed graphs, and showed how to compute it in linear time.

Many practical approaches exist to compute a maximum cut or (alternatively) a large cut. Two state-of-the-art exact solvers are Biq Mac (a solver for binary quadratic and Max-Cut problems) by Rendl et al. [31], and LocalSolver [8, 22], a powerful generic local search solver that also verifies optimality of a cut. Many heuristic (inexact) solvers are also available, including those using unconstrained binary quadratic optimization [35], local search [7], tabu search [27], and simulated annealing [3].

Curiously, data reduction, which has shown promise at preprocessing large instances of other fundamental NP-hard problems [2, 25, 28], is currently not used in implementations of Max Cut solvers. To the best of our knowledge, no research has been done on the efficiency of data reduction for Max Cut, in particular with the goal of achieving small kernels in practice.

Our Results. We introduce new data reduction rules for the Max Cut problem, and show that nearly all previous reduction rules for the Max Cut problem can be encompassed by only four reduction rules. Furthermore, we engineer efficient implementations of these reduction rules and show through extensive experiments that kernelization achieves a significant reduction on sparse graphs. Our experiments reveal that current state-of-the-art solvers can be sped up by up to multiple orders of magnitude when combined with our data reduction rules. We achieve speedups on all instances tested. On social and biological networks in particular, kernelization enables us to solve four instances that were previously unsolved in a ten-hour time limit with state-of-the-art solvers; three of these instances are now solved in less than two seconds with our kernelization.

2 Preliminaries

Throughout this paper, we consider finite, simple and undirected graphs $G = (V, E)$ together with additive edge weight functions $\omega$. For each vertex $v \in V$ let $N(v)$ denote its neighbors; its degree in $G$ is $\deg(v) = |N(v)|$. The neighborhood of a set $S \subseteq V$ is $N(S) = \bigcup_{v \in S} N(v) \setminus S$. For a vertex set $S \subseteq V$, let $G[S]$ denote the subgraph of $G$ induced by $S$. To specify the vertex and edge sets of a specific graph $G$, we use $V(G)$ and $E(G)$, respectively. The set of edges between the vertices of different vertex sets $S, T \subseteq V$ is written as $E(S, T)$.

For an integer , a path of length in is a sequence of distinct vertices such that for . A path with is called a cycle of . Graph is connected if there is a path from to for any pair of distinct vertices in ; and disconnected otherwise. A connected component of  is an inclusion-maximal connected subgraph of . For vertex sets , the set of external vertices is , which is the set of vertices in that have some neighbor in outside . In similar fashion,  defines the set of internal vertices.

A clique is a complete subgraph, and a near-clique is a clique minus a single edge. A clique tree is a connected graph whose biconnected components are cliques, and a clique forest is a graph whose connected components are clique trees. In such graphs, we use the term block to refer to a biconnected component, bridge, or isolated vertex. The class of clique-cycle forests is defined as follows. A clique is a clique-cycle forest, and so is a cycle. The disjoint union of two clique-cycle forests is a clique-cycle forest. In addition, a graph formed from a clique-cycle forest by identifying two vertices, each from a different (connected) component, is also a clique-cycle forest.

The Max Cut problem is to find a vertex set $S \subseteq V$, such that $|E(S, V \setminus S)|$ is maximized. We denote the cardinality of a maximum cut by $\beta(G)$. At times, we may need to reason about a maximum cut given a fixed partitioning of a subset of $G$’s vertices. A partition of vertices $V' \subseteq V$ is given as a 2-coloring $\delta: V' \to \{0, 1\}$. We let $\beta_\delta(G)$ denote the size of a maximum cut of $G$, given that $V'$ is partitioned according to $\delta$. The Weighted Max Cut problem is to find a vertex set $S \subseteq V$ of a given graph $G$ with additive weight function $\omega$ such that $\omega(E(S, V \setminus S))$ is maximum. The weight of a maximum cut is then given by $\beta(G, \omega)$. We denote instances of the Max Cut decision problem as $(G, k)$, where $G$ is a graph and $k$ is a non-negative integer. If the size of a maximum cut in $G$ is at least $k$, then $(G, k)$ is a “yes”-instance; otherwise, it is a “no”-instance.

We address two more variations of Max Cut in this paper. The Vertex-Weighted Max Cut problem takes as input a graph and two vertex weight functions ; the objective is to compute a bipartition  that maximizes . The Signed Max Cut problem takes as input a graph together with an edge labeling ; the goal is to find an which maximizes the quantity , where  and for . Similarly, for the neighborhood of a vertex (set), we use the notations  and . We call a triangle positive if its number of “−”-edges is even. Any Max Cut instance can be transformed into a Signed Max Cut instance by labeling all edges with “−”.

Let denote the set of input instances for a decision problem. A parameterized problem is fixed-parameter tractable if there is an algorithm  (called a fixed-parameter algorithm) that decides membership in for any input pair in time  for some computable function .

A data reduction rule (often shortened to reduction rule) for a parameterized problem is a function that maps an instance of to an equivalent instance of such that is computable in time polynomial in and . We call two instances of equivalent if either both or none belong to . Observe that for two equivalent “yes”-instances and , the relationship  holds for some .

2.1 Related Work

Several studies have provided fixed-parameter algorithms for the Max Cut problem [10, 11, 17, 29]. Among these, a fair number of kernelization rules have been introduced with the goal of effectively reducing Max Cut instances [10, 11, 17, 29, 30, 18]. Those reductions typically place constraints on the subgraphs, such as being clique forests or clique-cycle forests. Later, we propose a new set of reductions that does not need this property and covers most of the known reductions [11, 17, 29, 18]. There are other reduction rules that are fairly simplistic and focus on very narrow cases [30]. We now explain the Edwards-Erdős bound and the spanning tree bound.

Edwards-Erdős Bound. For a connected graph $G = (V, E)$, the Edwards-Erdős bound [15, 16] is defined as $EE(G) = |E|/2 + (|V| - 1)/4$. A linear-time algorithm that computes a cut satisfying the Edwards-Erdős bound for any given graph is provided by Van Ngoc and Tuza [34]. The Max Cut Above Edwards-Erdős (Max Cut AEE) problem asks for a graph $G$ and integer $k$ if $G$ admits a cut of size at least $EE(G) + k$. All kernelization rules for Max Cut AEE require a set $S \subseteq V$ such that $G - S$ is a clique forest. Etscheid and Mnich [17] propose an algorithm that computes such a set $S$ of at most $3k$ vertices in time $O(k \cdot |E|)$.

Spanning Tree Bound. Another approach is based on utilizing the spanning forest of a graph [29]. For a given integer $k$, one asks whether a connected graph $G = (V, E)$ admits a cut of size at least $|V| - 1 + k$. This decision problem is denoted as Max Cut AST (Max Cut Above Spanning Tree). For sparse graphs, this bound is larger than the Edwards-Erdős bound. The reductions for the problem require a set $S \subseteq V$ such that $G - S$ is a clique-cycle forest.
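The two lower bounds can be compared on small connected graphs. The sketch below assumes the standard forms of the bounds for a connected graph with n vertices and m edges, namely m/2 + (n - 1)/4 (Edwards-Erdős) and n - 1 (spanning tree); the function names and the example graphs are hypothetical and chosen only for illustration.

```python
from itertools import product

def edwards_erdos_bound(n, m):
    """Edwards-Erdos lower bound on the maximum cut of a connected graph."""
    return m / 2 + (n - 1) / 4

def spanning_tree_bound(n):
    """Spanning tree lower bound: every spanning tree is bipartite."""
    return n - 1

def max_cut_size(n, edges):
    """Exhaustive maximum cut size; only feasible for tiny graphs."""
    return max(sum(s[u] != s[v] for u, v in edges)
               for s in product((0, 1), repeat=n))

graphs = {
    "K4": (4, [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]),
    "C5": (5, [(i, (i + 1) % 5) for i in range(5)]),
    "P4": (4, [(0, 1), (1, 2), (2, 3)]),
}

report = {}
for name, (n, edges) in graphs.items():
    report[name] = (edwards_erdos_bound(n, len(edges)),
                    spanning_tree_bound(n),
                    max_cut_size(n, edges))
```

On the sparse graphs C5 and P4 the spanning tree bound is the larger of the two, while on the dense graph K4 the Edwards-Erdős bound is larger. Comparing the two formulas shows that the spanning tree bound wins exactly when m < 1.5 (n - 1).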

3 New Data Reduction Rules

We now introduce our new data reduction rules and prove their correctness. The main feature of our new rules is that they do not depend on the computation of a clique-forest to determine if they can be applied. Furthermore, our new rules subsume almost all rules from previous works [10, 11, 17, 29, 18] with the exception of Reduction Rules 10 and 11 by Crowston et al. [10]. We provide details in [19]. For an overview of how rules are subsumed, consult Table 1. Hence, our algorithm will only apply the rules proposed in this section. We provide proofs for the rules that proved most useful in our experimental evaluation.

Source [18] [11] [10] [17] [29]
Rule A 5 6 7 8 9 9 6 7 8 9 10 11 12 13
Table 1: Reduction rules from previous work subsumed by our new rules. A ✓ in row and column  means that the rule from row subsumes the rule from column . If there are multiple ✓s in a column (say, rows and in column ), then rules and combined subsume rule .

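Reduction Rule 1. Let $G = (V, E)$ be a graph and let $S \subseteq V$ induce a clique in $G$. If $S$ has at most $\lceil |S|/2 \rceil$ external vertices, then $\beta(G) = \beta(G') + \beta(K_{|S|})$ for the graph $G'$ obtained from $G$ by deleting the internal vertices of $S$ and all edges of $G[S]$.


Note that any partition of the clique $K_{|S|}$ into two vertex sets of size $\lceil |S|/2 \rceil$ and $\lfloor |S|/2 \rfloor$ is a maximum cut of $K_{|S|}$. Suppose we fix the partitions of the at most $\lceil |S|/2 \rceil$ external vertices of $S$. Then the at least $\lfloor |S|/2 \rfloor$ internal vertices can be assigned to the partitions so they each contain $\lceil |S|/2 \rceil$ and $\lfloor |S|/2 \rfloor$ vertices. Thus, regardless of how $G'$ is partitioned, the size of a maximum cut of $G[S]$ remains the same. ∎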
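The clique rule can be checked by brute force on small examples. The sketch below assumes the rule in the following form: if a clique S has at most ⌈|S|/2⌉ external vertices, then deleting its internal vertices and all edges inside S lowers the maximum cut by exactly ⌊|S|/2⌋ · ⌈|S|/2⌉, the maximum cut of a clique on |S| vertices. The helper names `beta` and `apply_clique_rule` are hypothetical.

```python
from itertools import product

def beta(vertices, edges):
    """Maximum cut size by exhaustive search (tiny graphs only)."""
    vs = sorted(vertices)
    best = 0
    for bits in product((0, 1), repeat=len(vs)):
        side = dict(zip(vs, bits))
        best = max(best, sum(side[u] != side[v] for u, v in edges))
    return best

def apply_clique_rule(vertices, edges, S):
    """Return (V', E', offset) if the rule applies to clique S, else None."""
    S = set(S)
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    assert all(S - {v} <= adj[v] for v in S), "S must induce a clique"
    external = {v for v in S if adj[v] - S}   # vertices with a neighbor outside S
    if len(external) > (len(S) + 1) // 2:     # more than ceil(|S|/2) external vertices
        return None
    internal = S - external
    new_vertices = set(vertices) - internal
    new_edges = [(u, v) for u, v in edges if not (u in S and v in S)]
    offset = (len(S) // 2) * ((len(S) + 1) // 2)  # maximum cut of a clique on |S| vertices
    return new_vertices, new_edges, offset

# A triangle {0, 1, 2} whose only external vertex 2 is attached to the path 2-3-4.
V = {0, 1, 2, 3, 4}
E = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]
V2, E2, offset = apply_clique_rule(V, E, {0, 1, 2})
```

Here the kernel consists of the path 2-3-4 only, and the cut value of the triangle is accounted for by the returned offset.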

We can exhaustively apply Reduction Rule 1 in  time by scanning over all vertices in the graph. When scanning vertex , we check whether induces a clique. This finds all cliques with at least one internal vertex. Checking whether Reduction Rule 1 is applicable is then straightforward by counting the number of vertices with degree higher than the size of the clique.

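Reduction Rule 2. Let $(a', a, b, b')$ be an induced 3-path in a graph $G$ with $N(a) = \{a', b\}$ and $N(b) = \{a, b'\}$. Construct $G'$ from $G$ by adding a new edge $\{a', b'\}$ and removing the vertices $a$ and $b$. Then $\beta(G) = \beta(G') + 2$.


Let $P$ denote the path $(a', a, b, b')$ and let $\delta$ be an assignment of vertices to the partitions of a cut in $G$. We distinguish two cases:

  • Case $\delta(a') = \delta(b')$: If $\delta(a) = \delta(b) = \delta(a')$, then no edges of $P$ are cut. Notice that this cut is not maximum since moving $a$ between partitions increases the cut size by two. Otherwise, exactly two edges in $P$ are cut. In $G'$, the edge between $a'$ and $b'$ is not cut, so $\beta(G) = \beta(G') + 2$.

  • Case $\delta(a') \neq \delta(b')$: By choosing $\delta(a) = \delta(b')$ and $\delta(b) = \delta(a')$, all three edges in $P$ are cut. In $G'$, the edge between $a'$ and $b'$ is cut, so $\beta(G) = \beta(G') + 2$. ∎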
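The case analysis above can be confirmed by brute force. The sketch assumes the rule in this form: for an induced path (a', a, b, b') in which a and b have degree two, replacing a and b by the single edge {a', b'} lowers the maximum cut by exactly two. The function names are hypothetical.

```python
from itertools import product

def beta(vertices, edges):
    """Maximum cut size by exhaustive search (tiny graphs only)."""
    vs = sorted(vertices)
    best = 0
    for bits in product((0, 1), repeat=len(vs)):
        side = dict(zip(vs, bits))
        best = max(best, sum(side[u] != side[v] for u, v in edges))
    return best

def apply_path_rule(vertices, edges, a_prime, a, b, b_prime):
    """Contract the induced 3-path (a', a, b, b'); return (V', E', 2) or None."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    if adj[a] != {a_prime, b} or adj[b] != {a, b_prime}:
        return None                      # a and b must have degree two
    if a_prime == b_prime or b_prime in adj[a_prime]:
        return None                      # the path must be induced
    new_vertices = set(vertices) - {a, b}
    new_edges = [(u, v) for u, v in edges if a not in (u, v) and b not in (u, v)]
    new_edges.append((a_prime, b_prime))
    return new_vertices, new_edges, 2

def cycle(n):
    return set(range(n)), [(i, (i + 1) % n) for i in range(n)]

results = {}
for n in (5, 6):
    V, E = cycle(n)
    V2, E2, offset = apply_path_rule(V, E, 0, 1, 2, 3)
    results[n] = (beta(V, E), beta(V2, E2), offset)
```

On the 5-cycle the rule leaves a triangle, and on the 6-cycle it leaves a 4-cycle. On the 4-cycle it does not apply, because the end vertices of the path are adjacent.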

Reduction Rule 3. Let be a graph and let induce a near-clique in . Let be the graph obtained from by adding the missing edge so that induces a clique in . If is odd or , then .


Let be the edge added to the graph and any 2-coloring of . We show that a maximum cut of  exists such that and are in the same partition. As has one less edge than , this means that , which implies that .

Define for . Without loss of generality, assume . Note that, given the partition for , maximizing the cut of means minimizing . We distinguish three cases:

  • : By adding and to , decreases. The rest of the internal vertices have to be distributed among and such that is minimized

  • : By adding and to , stays . If is odd, then is the minimal value possible and is even. So the remaining internal vertices can be distributed evenly between and . If is even, then an odd number of internal vertices are left (and at least one by the definition of the rule) which can be distributed to balance and .

  • : By adding and to , becomes 2. If is odd, then an odd number of internal vertices is left to assign to such that becomes 1. If is even then there is an even number of internal vertices left which can be distributed to balance and .∎

Since some cliques are irreducible by currently known rules, it may be beneficial to also apply Reduction Rule 3 ‘in reverse’. Although this ‘reverse’ reduction neither reduces the vertex set nor (as our experiments suggest) leads to applications of other rules, it can undo unfruitful additions of edges made by Reduction Rule 3 and may remove other edges from the graph.

Reduction Rule 4. Let be a graph and let induce a clique in . If is odd or , an edge between two vertices of is removable. That is, for , .


Follows from the correctness of Reduction Rule 3. ∎

The following reduction rule is closely related to the upcoming generalization of Reduction Rule 8 by Crowston et al. [10]. It is able to further reduce the case where for a clique of . In comparison, the generalization of Reduction Rule 8 from [10] is able to handle the case . Due to the degree by which these rules are similar, they are also merged together in our implementation, as the techniques to handle both are the same.

Reduction Rule 5. Let induce a clique in a graph , where  and for all . Create from  by removing an arbitrary vertex of . Then .


Let and be any 2-coloring of . Note that – the removal of disconnects from the remainder of the graph.

Define and for . We distribute the vertices in among and such that is maximized. Notice that every vertex in is connected to all other vertices in . The size of any cut is therefore , where and denote the number of vertices from that we want to insert into  and , respectively. This can be rewritten as . As all other parts are constant, this reduces to maximizing . As is constant, is maximized when is minimized.

Because , it is always possible to distribute the vertices of such that , which then maximizes . Removing any vertex from will change the cut by : without loss of generality, let . Then is odd and , which maximizes the cut. Then, . ∎

The following algorithm identifies all candidates of Reduction Rule 1 in linear time. First, we order the adjacencies of all vertices. That is, for every vertex , the vertices in are sorted according to a numeric identifier assigned to every vertex. For this, we create an auxiliary array of empty lists of size . We then traverse the vertices for every vertex and insert each pair in a list identified by indexing the auxiliary array with . We then iterate once over the array from the lowest identifier to the highest and recreate the graph with sorted adjacencies. In total, this process takes time.
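The linear-time sorting of all adjacency lists described above can be sketched as follows. Vertices are assumed to be numbered 0..n-1 and the graph is assumed to be given as a list of adjacency lists; the function name `sort_adjacencies` is hypothetical.

```python
def sort_adjacencies(adj):
    """Sort every adjacency list in O(|V| + |E|) total time with one bucket pass.

    adj[v] is the (unsorted) list of neighbors of vertex v, vertices are 0..n-1.
    """
    n = len(adj)
    buckets = [[] for _ in range(n)]       # auxiliary array of empty lists
    for v in range(n):
        for w in adj[v]:
            buckets[w].append(v)           # file the pair (v, w) under index w
    sorted_adj = [[] for _ in range(n)]
    for w in range(n):                     # lowest identifier first
        for v in buckets[w]:
            sorted_adj[v].append(w)        # neighbors arrive in increasing order
    return sorted_adj

unsorted_graph = [[2, 1], [2, 0], [3, 1, 0], [2]]
sorted_graph = sort_adjacencies(unsorted_graph)
```

Because the buckets are visited in increasing order of identifier, every adjacency list is rebuilt in sorted order without any comparison-based sort.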

For any clique of , we have to check if for all pairs of vertices from that holds (neighborhood condition). Our algorithm uses tries [20, 12] to find all candidates. A trie supports two operations, Insert(key,val) and Retrieve(key). The key parameter is an array of integers and val is a single integer. Function Retrieve returns all inserted values by Insert that have the same key. Internally, a trie stores the inserted elements as a tree, where every node corresponds to one integer of the key and every prefix is stored only once. That means that two keys sharing a prefix share the same path through the trie until the position where they differ.

For each vertex , we use the ordered set as key and as the val parameter. Notice that is already sorted. The key can be then computed through an insertion of into the sequence in time . After Insert(,) is done for every vertex , each trie leaf contains all vertices that satisfy the condition of Reduction Rule 1. Meaning, for every vertex pair of a trie leaf, the neighborhood condition is met. We then verify whether the vertex set of a leaf is a clique, in  time. As each such set  is considered exactly once and the graph is fully partitioned, this requires  time in total. As a last step, we check whether by using the observation that . In Sect. 4, we describe a timestamping system that assists the above procedure in not having to repeatedly check the same structures after any amount of vertices and edges are added or removed from . However, in those later applicability checks, we disregard sorting the adjacencies of all vertices in linear time again. Rather we simply use a comparison based sort on the adjacencies.
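The trie-based candidate search can be sketched as follows. This is a minimal illustration and not the implementation used in our experiments: the class `Trie`, its method `groups`, and the function `closed_neighborhood_groups` are hypothetical names, the key of a vertex is assumed to be its sorted neighborhood with the vertex itself inserted, and adjacency lists are assumed to be sorted already.

```python
from bisect import insort

class Trie:
    """Trie over integer sequences; each node stores the values inserted at it."""

    def __init__(self):
        self.children = {}
        self.values = []

    def insert(self, key, val):
        node = self
        for x in key:                      # shared prefixes share one path
            node = node.children.setdefault(x, Trie())
        node.values.append(val)

    def retrieve(self, key):
        node = self
        for x in key:
            node = node.children.get(x)
            if node is None:
                return []
        return list(node.values)

    def groups(self):
        """Yield the value list of every node that stores at least one value."""
        stack = [self]
        while stack:
            node = stack.pop()
            if node.values:
                yield list(node.values)
            stack.extend(node.children.values())

def closed_neighborhood_groups(sorted_adj):
    """Group vertices whose neighborhood plus the vertex itself is identical."""
    trie = Trie()
    for v, nbrs in enumerate(sorted_adj):
        key = list(nbrs)
        insort(key, v)                     # insert v into its sorted neighborhood
        trie.insert(key, v)
    return sorted(trie.groups())

# Triangle {0, 1, 2} with a pendant vertex 3 attached to vertex 2.
example = [[1, 2], [0, 2], [0, 1, 3], [2]]
groups = closed_neighborhood_groups(example)
```

Vertices 0 and 1 share the key (0, 1, 2) and therefore end up in the same group. Two distinct vertices with the same key are necessarily adjacent, so each group is a set of pairwise adjacent vertices.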

The next reduction rule is our only rule whose application turns unweighted instances into instances of Weighted Max Cut. Our experiments show that this can reduce the kernel size significantly. This is noteworthy, given that existing solvers for Max Cut usually support weighted instances.

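Reduction Rule 6. Let $G = (V, E)$ be a graph, $\omega$ a weight function, and $(a, b, c)$ be an induced 2-path with $N(b) = \{a, c\}$. Let $e_1$ be the edge between vertex $a$ and $b$; let $e_2$ be the one between $b$ and $c$. Construct $G'$ from $G$ by deleting vertex $b$ and adding a new edge $e' = \{a, c\}$ with $\omega'(e') = \max\{\omega(e_1), \omega(e_2)\} - \max\{0, \omega(e_1) + \omega(e_2)\}$. Then $\beta(G, \omega) = \beta(G', \omega') + \max\{0, \omega(e_1) + \omega(e_2)\}$.


Let $\delta$ be a maximum cut of $G$ and consider the following two cases:

  • $\delta(a) = \delta(c)$: If $\omega(e_1) + \omega(e_2) > 0$, then $\delta(b) \neq \delta(a)$. Otherwise, $\delta(b) = \delta(a)$. In total, the path contributes $\max\{0, \omega(e_1) + \omega(e_2)\}$ to the cut. In $G'$, the edge between $a$ and $c$ is not cut, so $\beta(G, \omega) = \beta(G', \omega') + \max\{0, \omega(e_1) + \omega(e_2)\}$.

  • $\delta(a) \neq \delta(c)$: If $\omega(e_1) > \omega(e_2)$, then $\delta(b) = \delta(c)$. Otherwise, $\delta(b) = \delta(a)$. In total, the path contributes $\max\{\omega(e_1), \omega(e_2)\}$ to the cut. In $G'$, the edge between $a$ and $c$ is cut and contributes $\omega'(e') = \max\{\omega(e_1), \omega(e_2)\} - \max\{0, \omega(e_1) + \omega(e_2)\}$ to the cut, so again $\beta(G, \omega) = \beta(G', \omega') + \max\{0, \omega(e_1) + \omega(e_2)\}$. ∎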
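Weighted path compression can be checked by brute force on small weighted graphs. The sketch assumes the following form of the rule, which follows from the two cases above: a degree-two vertex b between non-adjacent vertices a and c is replaced by an edge {a, c} of weight max(w1, w2) - max(0, w1 + w2), and the maximum cut weight drops by max(0, w1 + w2). Graphs are represented as dictionaries from two-element frozensets to weights; all names are hypothetical.

```python
from itertools import product

def weighted_max_cut(weights):
    """Maximum cut weight by exhaustive search; weights maps frozenset({u, v}) -> w."""
    vs = sorted({v for e in weights for v in e})
    best = 0
    for bits in product((0, 1), repeat=len(vs)):
        side = dict(zip(vs, bits))
        total = sum(w for e, w in weights.items()
                    if len({side[v] for v in e}) == 2)   # edge is cut
        best = max(best, total)
    return best

def compress_path(weights, a, b, c):
    """Remove the degree-two vertex b of the induced path (a, b, c).

    Returns (new_weights, offset) with
    weighted_max_cut(weights) == weighted_max_cut(new_weights) + offset.
    """
    e1, e2, e_new = frozenset((a, b)), frozenset((b, c)), frozenset((a, c))
    incident = {e for e in weights if b in e}
    assert incident == {e1, e2}, "b must have exactly the neighbors a and c"
    assert e_new not in weights, "the path must be induced"
    w1, w2 = weights[e1], weights[e2]
    offset = max(0, w1 + w2)               # contribution when a and c are on the same side
    new_weights = {e: w for e, w in weights.items() if b not in e}
    new_weights[e_new] = max(w1, w2) - offset
    return new_weights, offset

def four_cycle(w01, w12, w23, w30):
    return {frozenset((0, 1)): w01, frozenset((1, 2)): w12,
            frozenset((2, 3)): w23, frozenset((3, 0)): w30}

unit = four_cycle(1, 1, 1, 1)
unit_kernel, unit_offset = compress_path(unit, 0, 1, 2)
```

For the unweighted 4-cycle the new edge receives weight -1, which shows why this rule turns unweighted instances into weighted ones with possibly negative weights.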

Our next two rules (Reduction Rules 1 and 1) generalize Reduction Rule 8 by Crowston et al. [10], which we restate for completeness.

Reduction Rule 8. ([10], Reduction Rule 8)

Let be a signed graph, a set of vertices such that is a clique forest, and a block in . If there is a  such that , and  for all . Construct the graph from by removing any two vertices , then .

Note that, for unsigned graphs, and  for every vertex .

Here, different choices of  lead to different applications of this rule. Our generalizations do not require such a set anymore and can find all possible applications for any choice of .

Reduction Rule 1. Let be the vertex set of a clique in with and for all . Construct the graph by deleting two arbitrary vertices  from . Then .

We show the correctness of Reduction Rule 1 by reducing it to Reduction Rule 8 by Crowston et al. [10].


Let and . Since is a clique, is a clique forest. From it follows that . Also, and , so all conditions for Reduction Rule 1 are satisfied.

It remains to show that . Note that and . By Reduction Rule 1, we know that , therefore we have that


Where (1) follows from . ∎

Reduction Rule 7. Let induce a clique in a signed graph such that  and , , and  for all . Construct by deleting two arbitrary vertices  from . Then .

Proof (Sketch).

The proof for this rule is almost identical to the proof of Reduction Rule 1. ∎

Using an almost equivalent approach as we did for Reduction Rule 1, we can find all candidates of this reduction rule in linear time.

In order to also reduce weighted instances to some degree, we use a simple weighted scaling of two reduction rules. That is, we extend their applicability from an unweighted subgraph to a subgraph where all edges have the same weight . We do this for Reduction Rules 1 and 1.

Reduction Rule 1. Let be a weighted graph and let induce a clique with for every edge for some constant . Let with for every . If , then .

Reduction Rule 1. Let be a weighted graph and let induce a near-clique in . Furthermore, let for every edge  for some constant . Let be the graph obtained from by adding the edge so that induces a clique in . Set , and  for . If is odd or , then .

4 Implementation

4.1 Kernelization Framework

We now discuss our overall kernelization framework in detail. Our algorithm begins by generating an unweighted instance by replacing every weighted edge by an unweighted subgraph with a specific structure. Afterwards, we apply our full set of unweighted reduction rules: 1, 1 (together with 1), 1, and 1. As already mentioned earlier, Reduction Rule 1 is the unweighted version of 1. We then create a signed instance of the graph by exhaustively executing weighted path compression using Reduction Rule 1 with the restriction that the resulting weights are or . We then exhaustively apply Reduction Rule 1. Once the signed reductions are done, we apply Reduction Rule 1 to fully compress all paths into weighted edges. This is then succeeded by Reduction Rule 1 and 1. We then transform the instance into an unweighted one and apply Reduction Rule 1 in order to avoid cyclic interactions between itself and Reduction Rule 1. Finally, if a weighted solver is to be used on the kernel, we exhaustively perform Reduction Rule 1 to produce a weighted kernel. Note that different permutations of the order in which reduction rules are applied can lead to different results.

4.2 Timestamping

Next we describe how to avoid unnecessary checks for the applicability of reduction rules. For this purpose, let the time of the most recent change in the neighborhood of a vertex be and let the variable describe the current time. Initially, and . Every time a reduction rule performs a change on , set  and increment . For each individual Reduction Rule , we also maintain a timestamp (initialized with ), indicating the upper bound up to which all vertices have already been processed. Hence, all vertices with do not need to be checked again by Reduction Rule . Note that timestamping only works for “local” reduction rules, that is, rules whose applicability can be determined by investigating the neighborhood of a vertex. Therefore, we only use this technique for Reduction Rules 1 and 1.
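The timestamping scheme can be sketched as follows. The class name `Timestamps`, its method names, and the concrete initial values (vertex timestamps 0, current time 1, rule timestamps -1) are assumptions made for this illustration and need not match our implementation.

```python
class Timestamps:
    """Track which vertices a local reduction rule still has to (re)check."""

    def __init__(self, num_vertices, rules):
        self.now = 1                              # current time (assumed start value)
        self.changed = [0] * num_vertices         # time of last neighborhood change
        self.done = {rule: -1 for rule in rules}  # per-rule "processed up to" time

    def touch(self, v):
        """Record that a reduction changed the neighborhood of vertex v."""
        self.changed[v] = self.now
        self.now += 1

    def pending(self, rule):
        """Vertices whose neighborhood changed after the rule last processed them."""
        return [v for v, t in enumerate(self.changed) if t > self.done[rule]]

    def mark_processed(self, rule):
        """The rule has now examined every vertex changed before the current time."""
        self.done[rule] = self.now - 1

ts = Timestamps(4, ["clique", "path"])
first_pass = ts.pending("clique")      # initially every vertex must be checked
ts.mark_processed("clique")
after_pass = ts.pending("clique")      # nothing changed since the pass
ts.touch(2)                            # some reduction modified N(2)
recheck = ts.pending("clique")         # only vertex 2 needs another look
```

Each rule keeps its own timestamp, so a rule that has not run yet still sees every vertex as pending, while a rule that just finished a pass only revisits vertices touched afterwards.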

5 Experimental Evaluation

5.1 Methodology and Setup

All of our experiments were run on a machine with four Octa-Core Intel Xeon E5-4640 processors running at 2.40 GHz with GB of main memory. The machine runs Ubuntu 18.04. All algorithms were implemented in C++ and compiled using gcc version 7.3.0 with optimization flag -O3. We use the following state-of-the-art Weighted Max Cut solvers for comparisons: the exact solvers LocalSolver [8] (heuristically finds a large cut, and can then verify if it is maximum) and Biq Mac [31], as well as the heuristic solver MqLib [14]. MqLib is unable to determine on its own when it reaches a maximum cut and always exhausts the given time limit. We also evaluated an implementation of the reduction rules used by Etscheid and Mnich [17]; however, preliminary experiments indicated that it performs worse than current state-of-the-art solvers. In the following, for a graph $G$, $G_{\mathrm{ker}}$ denotes the graph after all reductions have been applied exhaustively. To measure the quality of a kernel, we denote the kernelization efficiency by $e(G) = 1 - |V(G_{\mathrm{ker}})| / |V(G)|$. Note that $e(G)$ is 1 when all vertices are removed after applying all reduction rules, and 0 if no vertices are removed.

For our experiments we use four different datasets: First, we use random instances from four different graph models that were generated using the KaGen graph generator [21, 33]. In particular, we used Erdős-Rényi graphs (GNM), random geometric graphs (RGG2D), random hyperbolic graphs (RHG) and Barabási-Albert graphs (BA). The main purpose of these instances is to study the effectiveness of individual reduction rules for a variety of graph densities and degree distributions. To analyze the practical impact of our algorithm on current-state-of-the-art solvers we use a selection of sparse real-world instances by Rossi and Ahmed [32], as well as instances from VLSI design (g00*) and image segmentation (imgseg-*) by Dunning et al. [14]. Note that the original instances by Dunning et al. [14] use floating-point weights that we scaled to integer weights. Finally, we evaluate denser instances taken from the rudy category of the Biq Mac Library [1]. We further subdivide these instances into medium- and large-sized instances.

5.2 Performance of Individual Rules

To analyze the impact of each individual reduction rule, we measure the size of the kernel our algorithm produces before and after its removal. Fig. 1 shows our results on RGG2D and GNM graphs with vertices and varying density. We have settled on those two types of graphs as they represent different ends of the spectrum of kernelization efficiency. In particular, kernelization performs well on instances that are sparse and have a non-uniform degree distribution. Such properties are given by the random geometric graph model used for generating the RGG2D instances. Likewise, kernelization performs poorly on the uniform random graphs that make up the GNM instances. We excluded Reduction Rule 1 from these experiments as it only removes edges and thus leads to no difference in the kernelization efficiency.

Looking at Fig. 1, we can see that Reduction Rule 1 gives the most significant reduction in size. Its absence always diminishes the result more than any other rule. In particular, we see a difference in efficiency of up to (RGG2D) and (GNM) when removing Reduction Rule 1. The second most impactful rule for the RGG2D instances is Reduction Rule 1 with a difference of only up to . For the GNM instances Reduction Rule 1 is second with a difference of up to . However, note that Reduction Rules 1 and 1 lead to no difference in efficiency on these instances. Thus, we can conclude that depending on the graph type, different reduction rules have varying importance. Furthermore, our simple Reduction Rule 1 seems to have the most significant impact on the overall kernelization efficiency. Note that this is in line with the theoretical results from Table 1, which states that Reduction Rule 1 covers most of the previously published reduction rules and Reduction Rule 1 still covers many but less rules from previous work.


Figure 1: Tests consist of 150 synthetic instances. We compare the kernelization efficiency of our full algorithm to the efficiency of our algorithm without a particular reduction rule.

5.3 Exactly Computing a Maximum Cut

To examine the improvements kernelization brings for medium-sized instances, we compare the time required to obtain a maximum cut for both the kernelized and the original instance. We performed these experiments using both LocalSolver and Biq Mac. Note that we did not use MqLib as it is not able to verify the optimality of the cut it computes. The results of our experiments for our set of real-world instances are given in Table 2 (with weighted path compression) and Table 3 (without weighted path compression). Since the image segmentation instances are already weighted, they are omitted from Table 3. It is noteworthy that we do not include the results for the rudy instances from the Biq Mac library. These instances feature a uniform edge distribution and an overall average degree of at least . Our preliminary experiments indicated that kernelization provides little to no reduction in size for these instances. Therefore, we omit them from further evaluation and focus on more sparse graphs.

| Name | \|V(G)\| | e(G) | LS, original | LS, kernelized [speedup] | BM, original | BM, kernelized [speedup] |
|---|---|---|---|---|---|---|
| ca-CSphd | 1 882 | 0.99 | 24.07 | 0.32 [75.40] | - | 0.06 [∞] |
| ego-facebook | 2 888 | 1.00 | 20.09 | 0.09 [228.91] | - | 0.01 [∞] |
| ENZYMES_g295 | 123 | 0.86 | 1.22 | 0.33 [3.70] | 0.82 | 0.13 [6.57] |
| road-euroroad | 1 174 | 0.79 | - | - | - | - |
| bio-yeast | 1 458 | 0.81 | - | - | - | 32 726.75 [∞] |
| rt-twitter-copen | 761 | 0.85 | - | 834.71 [∞] | - | 1.77 [∞] |
| bio-diseasome | 516 | 0.93 | - | 4.91 [∞] | - | 0.07 [∞] |
| ca-netscience | 379 | 0.77 | - | 956.03 [∞] | - | 0.67 [∞] |
| soc-firm-hi-tech | 33 | 0.36 | 4.67 | 1.61 [2.90] | 0.09 | 0.06 [1.41] |
| g000302 | 317 | 0.21 | 0.58 | 0.49 [1.17] | 1.88 | 0.74 [2.53] |
| g001918 | 777 | 0.12 | 1.47 | 1.41 [1.04] | 31.11 | 17.45 [1.78] |
| g000981 | 110 | 0.28 | 10.73 | 4.73 [2.27] | 531.47 | 21.53 [24.68] |
| g001207 | 84 | 0.19 | 1.10 | 0.16 [6.88] | 53.20 | 0.06 [962.38] |
| g000292 | 212 | 0.03 | 0.45 | 0.45 [1.01] | 0.43 | 0.37 [1.14] |
| imgseg_271031 | 900 | 0.99 | 10.66 | 0.19 [55.94] | - | 0.17 [∞] |
| imgseg_105019 | 3 548 | 0.93 | 234.01 | 22.68 [10.32] | f | 13 748.62 [∞] |
| imgseg_35058 | 1 274 | 0.37 | 34.93 | 24.71 [1.41] | - | - |
| imgseg_374020 | 5 735 | 0.82 | 1 739.11 | 72.23 [24.08] | f | - |
| imgseg_106025 | 1 565 | 0.68 | 159.31 | 34.05 [4.68] | - | - |

Table 2: Impact of kernelization on the computation of a maximum cut by LocalSolver (LS) and Biq Mac (BM). Times are given in seconds. Kernelization time is included in the timings for the kernelized instances. Values in brackets give the speedup of the run on the kernelized instance over the run on the original instance. Times labeled with “-” exceeded the ten-hour time limit and an “f” indicates the solver crashed.

| Name | \|V(G)\| | e(G) | LS, original | LS, kernelized [speedup] | BM, original | BM, kernelized [speedup] |
|---|---|---|---|---|---|---|
| ca-CSphd | 1 882 | 0.98 | 24.79 | 1.12 [22.23] | - | 0.32 [∞] |
| ego-facebook | 2 888 | 0.93 | 20.39 | 1.72 [11.83] | 967.99 | 1.42 [682.04] |
| ENZYMES_g295 | 123 | 0.82 | 1.83 | 0.36 [5.09] | 0.96 | 0.37 [2.60] |
| road-euroroad | 1 174 | 0.69 | - | - | - | - |
| bio-yeast | 1 458 | 0.72 | - | - | - | - |
| rt-twitter-copen | 761 | 0.80 | - | 409.47 [∞] | - | 101.14 [∞] |
| bio-diseasome | 516 | 0.93 | - | 6.66 [∞] | - | 0.35 [∞] |
| ca-netscience | 379 | 0.67 | - | 4 116.61 [∞] | - | 2.10 [∞] |
| soc-firm-hi-tech | 33 | 0.30 | 4.92 | 2.34 [2.10] | 0.29 | 0.31 [0.94] |
| g000302 | 317 | 0.10 | 0.71 | 0.50 [1.41] | 1.28 | 0.89 [1.44] |
| g001918 | 777 | 0.06 | 1.67 | 1.51 [1.10] | 14.90 | 11.69 [1.27] |
| g000981 | 110 | 0.22 | 11.32 | 1.97 [5.74] | 0.98 | 0.44 [2.23] |
| g001207 | 84 | 0.17 | 1.56 | 0.15 [10.11] | 0.47 | 0.37 [1.28] |
| g000292 | 212 | 0.01 | 0.69 | 0.51 [1.35] | 0.56 | 0.62 [0.91] |

Table 3: Impact of kernelization on the computation of a maximum cut by LocalSolver (LS) and Biq Mac (BM). Times are given in seconds. Kernelization time is included in the solving times for the kernelized instances. Values in brackets give the speedup of the run on the kernelized instance over the run on the original instance. Times labeled with “-” exceeded the ten-hour time limit. Weighted path compression by Reduction Rule 1 is not used at the end; the kernel is unweighted.

First, we notice that kernelization is able to provide moderate to significant speedups for all instances that we have tested. In particular, we obtain a speedup between and for instances that were previously solvable by LocalSolver. Likewise, for the instances that Biq Mac is able to process, we achieve a speedup of up to three orders of magnitude. Furthermore, kernelization allows these solvers to compute a maximum cut for a majority of the previously infeasible instances in less than minutes.

To examine the impact of allowing a weighted kernel, we now compare the performance of our algorithm using weighted path compression (Table 2) with the unweighted version (Table 3). We can see that including weighted path compression yields significantly better speedups, especially for the sparse real-world instances by Rossi and Ahmed [32]. For example, on ego-facebook we achieve a speedup of with compression and without.

Finally, it is also noteworthy that we get significant improvements for the weighted instances from VLSI design and image segmentation. By examining the performance of each individual reduction rule, we can see that this is solely due to Reduction Rule 1. These findings could improve the work by de Sousa et al. [13], which also affects the work by Dunning et al. [14]. In conclusion, our novel reduction rules give us a simple but powerful tool for speeding up existing state-of-the-art solvers for computing maximum cuts. Moreover, as mentioned previously, even our simple weighted path compression by itself has a significant impact.
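To illustrate why allowing edge weights enables path compression, the following minimal sketch eliminates a single degree-2 vertex from a weighted graph. It is a generic textbook-style elimination and not necessarily the exact formulation of the reduction rule used in this paper; the function names and the edge-dictionary representation are our own assumptions. The idea is that a degree-2 vertex v with neighbours a and b and edge weights w1, w2 contributes max(0, w1 + w2) if a and b end up on the same side and max(w1, w2) otherwise, which can be encoded by one edge {a, b} plus a constant offset.

```python
from itertools import product


def max_cut_value(n, edges):
    """Exact weighted max cut by brute force over all 2^n side assignments."""
    best = float("-inf")
    for sides in product((0, 1), repeat=n):
        value = sum(w for (u, v), w in edges.items() if sides[u] != sides[v])
        best = max(best, value)
    return best


def compress_degree_two(edges, v):
    """Eliminate a degree-2 vertex v from a simple weighted graph.

    `edges` maps sorted vertex pairs (x, y) with x < y to weights.
    Returns (new_edges, offset) such that
        maxcut(old graph) == maxcut(new graph) + offset.
    """
    incident = [(e, w) for e, w in edges.items() if v in e]
    if len(incident) != 2:
        raise ValueError("vertex must have degree exactly 2")
    (e1, w1), (e2, w2) = incident
    a = e1[0] if e1[1] == v else e1[1]
    b = e2[0] if e2[1] == v else e2[1]
    # Neighbours on the same side: v gains max(0, w1 + w2).
    # Neighbours on different sides: v gains max(w1, w2).
    offset = max(0, w1 + w2)
    bridge = max(w1, w2) - offset
    new_edges = {e: w for e, w in edges.items() if v not in e}
    key = (min(a, b), max(a, b))
    # Merge with an existing a-b edge if there is one.
    new_edges[key] = new_edges.get(key, 0) + bridge
    return new_edges, offset


# Unit-weight 5-cycle: eliminating vertex 1 leaves a weighted 4-cycle.
cycle = {(0, 1): 1, (1, 2): 1, (2, 3): 1, (3, 4): 1, (0, 4): 1}
reduced, offset = compress_degree_two(cycle, 1)
print(reduced, offset)
print(max_cut_value(5, cycle), max_cut_value(5, reduced) + offset)
```

Note that even for a unit-weight input the replacement edge receives weight -1, so the reduced instance is weighted. This is in line with the observation above that permitting a weighted kernel allows more aggressive compression.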

5.4 Analysis on Large Instances

We now examine the performance of our kernelization framework and its impact on existing solvers for large graph instances with up to millions of vertices. For this purpose, we compared the cut size over time achieved by LocalSolver and MqLib with and without our kernelization. Note that we did not use Biq Mac as it was not able to handle instances with more than 3 000 vertices. Our results using a three-hour time limit for each solver are given in Table 4. Furthermore, we present convergence plots in Fig. 2.

inf-road_central 14 081 816 1.20 0.59 362.32 inf% 2.70%
inf-power 4 941 1.33 0.62 0.04 1.64% 0.45%
web-google 1 299 2.13 0.79 0.01 0.69% 0.19%
ca-MathSciNet 332 689 2.47 0.63 8.02 1.33% 0.55%
ca-IMDB 896 305 4.22 0.42 27.55 0.97% 0.32%
web-Stanford 281 903 7.07 0.18 105.17 0.34% 0.30%
web-it-2004 509 338 14.09 0.91 22.10 0.08% 0.02%
ca-coauthors-dblp 540 486 28.20 0.25 72.39 0.05% 0.04%
Table 4: Evaluation of large graph instances. A three-hour time limit was used and five iterations were performed. The last two columns indicate the percentage by which the size of the largest computed cut is larger on the kernelized graph than on the non-kernelized one, for LocalSolver and MqLib, respectively.
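The percentage columns described in the caption can be read as a relative improvement in cut size. A minimal sketch of that arithmetic follows; the function name and the cut sizes in the example are hypothetical and chosen only for illustration, they are not taken from the experiments.

```python
def relative_improvement(cut_plain: float, cut_kernel: float) -> float:
    """Percentage by which the cut found on the kernelized graph exceeds
    the cut found on the non-kernelized graph."""
    if cut_plain <= 0:
        raise ValueError("baseline cut size must be positive")
    return 100.0 * (cut_kernel - cut_plain) / cut_plain


# Hypothetical cut sizes, used only to illustrate the computation.
print(f"{relative_improvement(200_000, 201_100):.2f}%")
```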

First, we note that the time to compute the actual kernel is relatively small. In particular, we are able to compute a kernel for a graph with about 14 million vertices (inf-road_central) in just over six minutes. Furthermore, we achieve an efficiency between 0.18 and 0.91 across all tested instances. When looking at the convergence plots (Fig. 2), we can observe that the additional preprocessing time of kernelization is quickly compensated by a significantly steeper increase in cut size compared to the unkernelized version. Furthermore, for instances where a kernel can be computed very quickly, such as web-google, we find a better solution almost instantaneously. Overall, kernelization followed by the local search heuristic always yields better results than the local search heuristic alone. However, the final improvement in the size of the largest cut found by LocalSolver and MqLib is generally small for the given time limit of three hours.


Figure 2: Convergence of LocalSolver on large instances. The dashed line represents the size of the cut for the non-kernelized graph, while the full line does so for the kernelized graph.

6 Conclusions

We engineered new efficient data reduction rules for Max Cut and showed that these rules subsume most existing rules. Our extensive experiments show that kernelization has a significant impact in practice. In particular, our experiments reveal that current state-of-the-art solvers can be sped up by up to multiple orders of magnitude when combined with our data reduction rules.

Developing new reduction rules is an important direction for future research. Of particular interest are reduction rules for Weighted Max Cut, where reduction rules yield a weighted kernel.


References

  • [1] BiqMac Library, 2018. [Online; accessed 2-September-2018].
  • [2] Faisal N. Abu-Khzam, Michael R. Fellows, Michael A. Langston, and W. Henry Suters. Crown structures for vertex cover kernelization. Theory Comput. Syst., 41(3):411–430, 2007. doi:10.1007/s00224-007-1328-0.
  • [3] Emely Arráiz and Oswaldo Olivo. Competitive simulated annealing and tabu search algorithms for the Max-Cut problem. In Proceedings of the 11th Annual Conference on Genetic and Evolutionary Computation, GECCO ’09, pages 1797–1798, New York, NY, USA, 2009. ACM.
  • [4] Francisco Barahona. On the computational complexity of Ising spin glass models. J. Phys. A: Mathematical and General, 15(10):3241, 1982. doi:10.1088/0305-4470/15/10/028.
  • [5] Francisco Barahona. Network design using cut inequalities. SIAM J. Optim., 6(3):823–837, 1996. doi:10.1137/S1052623494279134.
  • [6] Francisco Barahona, Martin Grötschel, Michael Jünger, and Gerhard Reinelt. An application of combinatorial optimization to statistical physics and circuit layout design. Oper. Res., 36(3):493–513, 1988. doi:10.1287/opre.36.3.493.
  • [7] Una Benlic and Jin-Kao Hao. Breakout local search for the Max-Cut problem. Engineering Applications of Artificial Intelligence, 26(3):1162–1173, 2013.
  • [8] Thierry Benoist, Bertrand Estellon, Frédéric Gardi, Romain Megel, and Karim Nouioua. LocalSolver 1.x: a black-box local-search solver for 0-1 programming. 4OR, 9(3):299, 2011. [used in this work: LocalSolver 8.0]. doi:10.1007/s10288-011-0165-9.
  • [9] Charles Chiang, Andrew B Kahng, Subarnarekha Sinha, Xu Xu, and Alexander Z Zelikovsky. Fast and efficient bright-field AAPSM conflict detection and correction. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 26(1):115–126, 2007. doi:10.1109/TCAD.2006.882642.
  • [10] Robert Crowston, Gregory Gutin, Mark Jones, and Gabriele Muciaccia. Maximum balanced subgraph problem parameterized above lower bound. Theoret. Comput. Sci., 513:53–64, 2013. doi:10.1016/j.tcs.2013.10.026.
  • [11] Robert Crowston, Mark Jones, and Matthias Mnich. Max-Cut parameterized above the Edwards-Erdős bound. Algorithmica, 72(3):734–757, 2015. doi:10.1007/s00453-014-9870-z.
  • [12] Rene De La Briandais. File searching using variable length keys. In Papers presented at the the March 3-5, 1959, western joint computer conference, pages 295–298. ACM, 1959. doi:10.1145/1457838.1457895.
  • [13] Samuel de Sousa, Yll Haxhimusa, and Walter G Kropatsch. Estimation of distribution algorithm for the max-cut problem. In International Workshop on Graph-Based Representations in Pattern Recognition, volume 7877 of LNCS, pages 244–253. Springer, 2013.
  • [14] Iain Dunning, Swati Gupta, and John Silberholz. What works best when? A systematic evaluation of heuristics for Max-Cut and QUBO. INFORMS J. Comput., 30(3):608–624, 2018. doi:10.1287/ijoc.2017.0798.
  • [15] Christopher S Edwards. Some extremal properties of bipartite subgraphs. Canad. J. Math., 25(3):475–485, 1973. doi:10.4153/CJM-1973-048-x.
  • [16] Christopher S Edwards. An improved lower bound for the number of edges in a largest bipartite subgraph. In Proc. Second Czechoslovak Symposium on Graph Theory, Prague, pages 167–181, 1975.
  • [17] Michael Etscheid and Matthias Mnich. Linear kernels and linear-time algorithms for finding large cuts. Algorithmica, 80(9):2574–2615, 2018. doi:10.1007/s00453-017-0388-z.
  • [18] Luerbio Faria, Sulamita Klein, Ignasi Sau, and Rubens Sucupira. Improved kernels for signed max cut parameterized above lower bound on (r, ℓ)-graphs. Discrete Math. & Theoret. Comput. Sci., 19(1), 2017. doi:10.23638/DMTCS-19-1-14.
  • [19] Damir Ferizovic. A Practical Analysis of Kernelization Techniques for the Maximum Cut Problem. Master’s Thesis, Karlsruhe Institute of Technology, 2019.
  • [20] Edward Fredkin. Trie memory. Comm. ACM, 3(9):490–499, 1960. doi:10.1145/367390.367400.
  • [21] Daniel Funke, Sebastian Lamm, Ulrich Meyer, Manuel Penschuck, Peter Sanders, Christian Schulz, Darren Strash, and Moritz von Looz. Communication-free massively distributed graph generation. Journal of Parallel and Distributed Computing, 131:200–217, 2019. doi:10.1016/j.jpdc.2019.03.011.
  • [22] Frédéric Gardi, Thierry Benoist, Julien Darlay, Bertrand Estellon, and Romain Megel. Mathematical Programming Solver Based on Local Search. FOCUS Series in Computer Engineering. ISTE Wiley, 2014. doi:10.1002/9781118966464.
  • [23] Frank Harary. On the measurement of structural balance. Behavioral Sci., 4(4):316–323, 1959. doi:10.1002/bs.3830040405.
  • [24] Frank Harary, Meng-Hiot Lim, and Donald C Wunsch. Signed graphs for portfolio analysis in risk management. IMA J. Mgmt. Math., 13(3):201–210, 2002. doi:10.1093/imaman/13.3.201.
  • [25] Demian Hespe, Christian Schulz, and Darren Strash. Scalable kernelization for maximum independent sets. In Proc. ALENEX 2018, pages 223–237, 2018. doi:10.1137/1.9781611975055.19.
  • [26] Richard M Karp. Reducibility among combinatorial problems. In Complexity of Computer Computations, The IBM Research Symposia Series, pages 85–103. Springer, 1972. doi:10.1007/978-1-4684-2001-2_9.
  • [27] Gary A. Kochenberger, Jin-Kao Hao, Zhipeng Lü, Haibo Wang, and Fred Glover. Solving large scale Max Cut problems via tabu search. Journal of Heuristics, 19(4):565–571, Aug 2013. doi:10.1007/s10732-011-9189-8.
  • [28] Sebastian Lamm, Peter Sanders, Christian Schulz, Darren Strash, and Renato F Werneck. Finding near-optimal independent sets at scale. J. Heuristics, 23(4):207–229, 2017. doi:10.1007/s10732-017-9337-x.
  • [29] Jayakrishnan Madathil, Saket Saurabh, and Meirav Zehavi. Max-Cut Above Spanning Tree is fixed-parameter tractable. In Proc. CSR 2018, volume 10846 of LNCS, pages 244–256. Springer, 2018.
  • [30] Elena Prieto. The method of extremal structure on the k-Maximum Cut problem. In Proc. CATS 2005, volume 41, pages 119–126. ACM, 2005.
  • [31] Franz Rendl, Giovanni Rinaldi, and Angelika Wiegele. Solving Max-Cut to optimality by intersecting semidefinite and polyhedral relaxations. Math. Prog., 121(2):307, 2010. doi:10.1007/s10107-008-0235-8.
  • [32] Ryan A Rossi and Nesreen K Ahmed. The network data repository with interactive graph analytics and visualization. In Proc. AAAI 2015, volume 15, pages 4292–4293, 2015.
  • [33] Peter Sanders and Christian Schulz. Scalable generation of scale-free graphs. Inf. Proc. Lett., 116(7):489–491, 2016. doi:10.1016/j.ipl.2016.02.004.
  • [34] Nguyen Van Ngoc and Zsolt Tuza. Linear-time approximation algorithms for the max cut problem. Combinatorics, Probability Comput., 2(2):201–210, 1993.
  • [35] Yang Wang and Zhipeng Lü. Probabilistic GRASP-tabu search algorithms for the UBQP problem. Computers & Operations Research, 40(12):3100–3107, 2013. doi:10.1016/j.cor.2011.12.006.