I Introduction
Hypergraph partitioning (HGP) is an NP-hard problem [1] that occurs in many computer science applications where it is necessary to reduce large problems into a number of smaller, computationally tractable subproblems. Common applications include very large scale integration (VLSI) design [2] and scientific computing [3].
Hypergraphs are a generalisation of graphs where each hyperedge may connect more than two vertices. Formally, a hypergraph can be defined [4, 5] as H = (V, E), where:

V and E are finite sets of vertices and hyperedges, respectively, with each hyperedge e ∈ E a subset of V.

Edges and vertices may have associated weights: c(v) denotes the weight of a vertex v ∈ V and ω(e) denotes the weight of a hyperedge e ∈ E.
A hyperedge e is said to be incident on a vertex v if, and only if, v ∈ e. Vertices u and v are said to be adjacent in a hypergraph if, and only if, there exists a hyperedge e ∈ E such that u ∈ e and v ∈ e. The degree of a vertex v is the number of distinct hyperedges in E that are incident on v, and the length of a hyperedge e is defined as its cardinality |e|.
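The definitions above can be sketched directly; a minimal Python illustration (the class and method names are ours, purely for exposition):

```python
# Minimal sketch of the hypergraph definitions above; names are illustrative.
class Hypergraph:
    def __init__(self, edges):
        # edges: iterable of vertex sets, e.g. [{0, 1, 2}, {1, 3}]
        self.edges = [frozenset(e) for e in edges]

    def degree(self, v):
        # number of distinct hyperedges incident on v
        return sum(1 for e in self.edges if v in e)

    def length(self, e):
        # the length of a hyperedge is its cardinality |e|
        return len(e)

    def adjacent(self, u, v):
        # u and v are adjacent iff some hyperedge contains both
        return any(u in e and v in e for e in self.edges)

h = Hypergraph([{0, 1, 2}, {1, 3}, {2, 3}])
```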
The k-way HGP problem is to partition the set of vertices V into k approximately equal disjoint subsets whilst minimising an objective function. Typically this is the cutsize: the sum of the weights of those hyperedges that span different subsets. However, minimising cutsize often leads to an uneven distribution of the cut hyperedges between partitions. Alternatives are the sum of external degrees and the (λ − 1) connectivity metric, which accounts for the number of subsets connected by each cut hyperedge [5].
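The two objectives can be sketched as follows, assuming a partition is given as a vertex-to-subset map (function names are illustrative, not drawn from any particular tool):

```python
# Sketch of the cutsize and (lambda - 1) connectivity objectives.
def cutsize(edges, weights, part):
    # part maps vertex -> subset index; a hyperedge is cut if it spans > 1 subset
    return sum(w for e, w in zip(edges, weights)
               if len({part[v] for v in e}) > 1)

def connectivity_metric(edges, weights, part):
    # (lambda - 1): each hyperedge contributes w * (subsets it connects - 1)
    return sum(w * (len({part[v] for v in e}) - 1)
               for e, w in zip(edges, weights))
```

For a bipartition (k = 2) every cut hyperedge connects exactly two subsets, so the two measures coincide, which is why a single objective suffices in the experiments later in the article.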
Current state-of-the-art algorithms, including MLPart [6], hMetis [7], PaToH [8], Zoltan [9], Parkway [10], UMPa [11], and KaHyPar [12], use a multilevel approach as illustrated in Algorithm 1. The approach recursively coarsens a hypergraph by contracting a single pair of vertices at each level until only a small number of hypernodes remain. During coarsening, KaHyPar, hMetis, and PaToH use a greedy heavy-edge rating function; however, more sophisticated techniques respecting the community structure have recently been explored [13]. Various methods may be used to generate the assignment of supernodes to partitions in the initial partitioning phase. This assignment is further improved using the Fiduccia–Mattheyses [14] (FM) move-based local search algorithm. The uncoarsening phase recursively selects a node to expand (i.e., a supernode is expanded back into the pair of vertices it represents) and then uses FM to refine the partitions to which those nodes are assigned. Using a larger number of levels [15] and performing repeated iterations of the entire multilevel partitioning, known as V-cycles [7], can improve the solution quality, albeit at a computational cost.
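As an illustration, the standard heavy-edge rating scores a candidate vertex pair by the total weight of the hyperedges they share, scaled down for longer hyperedges; a minimal sketch (function and variable names are ours, not KaHyPar's):

```python
# Sketch of a greedy heavy-edge rating for a candidate contraction (u, v):
# sum of w(e) / (|e| - 1) over all hyperedges containing both u and v.
def heavy_edge_rating(u, v, edges, weights):
    return sum(w / (len(e) - 1)
               for e, w in zip(edges, weights)
               if u in e and v in e and len(e) > 1)
```

A coarsener would contract the pair with the highest rating, merging their weights, then repeat on the contracted hypergraph.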
Direct k-way partitioning (Algorithm 1) has the potential advantage of allowing the search algorithm to take a global view. This can result in better solutions for large hypergraphs and tighter balance constraints [16]. However, for scalability reasons recursive bisection approaches are more widely used.
Despite their sophistication, it is notable that these approaches stop coarsening at some predefined threshold t of remaining supernodes. Most implementations, such as hMetis, PaToH, and KaHyPar, use default thresholds of t = 150k, resulting in hypergraphs with around 300 vertices for initial bipartitioning. This value may result in fast and reasonably effective heuristic algorithms, but does not necessarily correspond to a good tradeoff between scale and information content.
Karypis and Kumar [17] showed that a good partitioning of the coarsest hypergraph generally leads to a good partitioning of the original hypergraph. This can reduce the amount of time spent on refinement in the uncoarsening phase. However, it is important to note that the initial hypergraph partitioning with the smallest cutsize may not necessarily lead to the smallest final cutsize after refinement is performed during uncoarsening [18]. Since information may be hidden from the global optimisation algorithm during compression, the more the hypergraph is coarsened the greater this effect may be.
Many approaches have been developed to perform the initial partitioning, ranging from random assignment [6] to the use of various greedy growing techniques [8], recursive bisection [7], and evolutionary algorithms (EAs) [19]. Greedy growth algorithms quickly produce balanced partitions, but are sensitive to the initial randomly chosen vertex [8]. Since the initial partitioning usually takes place on very small hypergraphs these algorithms can be rerun multiple times. The best partitioning found is subsequently propagated for refinement during the uncoarsening phase [8].
It is difficult to generalise measures to select the optimal algorithm to use for a given problem instance, i.e., the algorithm selection problem [20]. Therefore, a portfolio approach is used in practice by PaToH, hMetis, and KaHyPar [21]. For example, PaToH uses 11 different random and greedy growth heuristic algorithms [22]. The KaHyPar ‘Pool’ portfolio approach to initial partitioning also uses a range of simple algorithms, including fully random, breadth-first search (BFS), label propagation, and nine variants of greedy hypergraph growing. Each algorithm is executed a number of times, then the partition with the smallest cutsize and lowest imbalance is presented for uncoarsening, where it is projected back to the original hypergraph. This approach has been extensively parameter tuned [21], finding that 20 runs of each algorithm produces the overall best results at t = 150, with partitions that are only marginally worse than at t = 75, yet significantly faster. Over a wide range of hypergraphs this approach has recently been shown to identify similar or better partitions in a faster time than the most popular general purpose HGP algorithms, hMetis and PaToH [12, 16], neither of which are open source.
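The Pool-style selection described above can be sketched as follows; `evaluate` is a hypothetical stand-in for running one heuristic and measuring the resulting partition, not a real KaHyPar API:

```python
# Illustrative sketch of a portfolio for initial partitioning: run each
# heuristic a fixed number of times and keep the partition with the smallest
# cutsize, breaking ties on imbalance. `evaluate` returns
# (cutsize, imbalance, partition) for one run of heuristic h.
def pool_portfolio(heuristics, evaluate, n_runs):
    best_key, best_part = None, None
    for h in heuristics:
        for _ in range(n_runs):
            cut, imbalance, part = evaluate(h)
            key = (cut, imbalance)            # lexicographic: cutsize first
            if best_key is None or key < best_key:
                best_key, best_part = key, part
    return best_part
```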
In this article, we examine the case where there exists a large computational budget and many evaluations can be performed on less coarsened hypergraphs to identify the best final partitions, i.e., the potential for a larger threshold t and more evaluations exists. We explore the use of EAs to perform the initial partitioning within the state-of-the-art, open source (GPLv3), Karlsruhe n-level hypergraph partitioning framework, KaHyPar, from https://github.com/SebastianSchlag/kahypar.
In particular, the following contributions are made:

We characterise the ‘searchability’ of the space of initial partitions at different levels of coarsening.

Based on that analysis, we identify a role for EAs in terms of the level of coarsening, and hence the speed vs. quality of solutions produced. We also identify some key algorithm characteristics.

We develop a novel memetic algorithm and demonstrate that this discovers significantly better final solutions across a range of classes of hypergraphs and across a range of different coarsening thresholds.

Finally, we develop an adaptive mechanism for deciding when to perform initial partitioning based on the rate of change of information content in the hypergraph as it is coarsened. We show that this also gives significant performance improvements.
In the remainder of this article, Section II discusses the related work. Section III describes the test framework, the memetic-EA initial partitioner, and comparison metrics. Section IV presents a landscape analysis with respect to EA design at different levels of coarsening. Section V presents the results of parameter sensitivity testing. Section VI introduces and presents results from a novel adaptive coarsening algorithm to identify the EA niche. Finally, Section VII summarises the conclusions.
II Related Work
Many EAs have been applied to the more well-known problem of graph partitioning; see Kim et al. [23] for an overview. Soper et al. [19] were the first to use an EA within a multilevel approach. They introduced variation operators that modify the edge weights of the graph depending on the input partitions; these are subsequently presented to a multilevel partitioner, which uses the weights to obtain a new partition.
More recently, Benlic and Hao [24] used a memetic algorithm within a multilevel approach to solve the perfectly balanced graph partitioning problem. They hypothesised that a large number of vertices will always be grouped together among high quality partitions and introduced a multi-parent crossover operator, with the offspring being refined by a perturbation-based tabu search algorithm.
Sanders and Schulz [25] used an EA within a multilevel approach and showed that the use of edge weight perturbations decreases the overall quality of the underlying graph partitioner; they subsequently introduced new crossover and mutation operators that avoid randomly perturbing the edge weights. Their algorithm has recently been incorporated within a faster parallelised approach [26].
In addition to performing the initial partitioning, EAs can also be used in other areas of the multilevel approach. For example, Küçükpetek et al. [27] used an EA to perform the coarsening phase in a multilevel graph partitioning algorithm.
Merz and Freisleben [28] showed that the fitness landscape depends on the structure of the graph and, perhaps unintuitively, that the landscape can become smoother as the average degree increases. Consequently, Pope et al. [29] proposed the use of genetic programming as a meta-level algorithm to select the best combination of existing algorithms for coarsening, partitioning, and refinement, based on the characteristics of the graph being solved.
The most popular chromosome representation is group-number encoding, wherein each gene represents the partition group to which a given vertex is assigned, i.e., there are as many genes as there are vertices and as many alleles as there are partitions. This has led to a wide variety of proposed crossover and normalisation schemes since different assignments of allele values to groups still represent the same solution. For example, Mühlenbein and Mahnig [30] used the simple normalisation technique of inverting each candidate and selecting the one with the smallest Hamming distance.
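For bipartitioning, where a candidate and its complement encode the same solution, this inversion-based normalisation can be sketched as:

```python
# Sketch of inversion-based normalisation for group-number encoding, k = 2:
# align a candidate to a reference by keeping whichever of the candidate and
# its complement is closer in Hamming distance.
def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def normalise(candidate, reference):
    inverted = [1 - g for g in candidate]
    if hamming(candidate, reference) <= hamming(inverted, reference):
        return candidate
    return inverted
```

Without such alignment, crossover between two encodings of the same bipartition can produce offspring far worse than either parent.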
EAs have been relatively underexplored for the more general case of HGP; however, there has been a small amount of prior work on VLSI circuit partitioning. For example, Schwarz and Očenášek [31] briefly studied several EAs, including the Bayesian optimisation algorithm, for direct (i.e., not multilevel) small VLSI partitioning. Kim et al. [32] explored a memetic algorithm using a modified FM for local optimisation and reported smaller bipartition cutsizes on a number of benchmark circuits when compared with hMetis. Notably, Areibi and Yang [2]
explored VLSI design via the use of memetic algorithms using FM for local optimisation within a multilevel approach and reported improvements of 35% over a simple genetic algorithm. This has since been implemented in hardware using reconfigurable computing
[33]. Significantly, none of these algorithms are considered to be competitive with state-of-the-art hypergraph partitioning tools.

Recently, a memetic EA has been introduced to build on the KaHyPar framework [34]. This algorithm runs a steady-state EA with a population at the original, uncoarsened level. The initial population is seeded using a variant of KaHyPar. Each generation, binary tournament selection is used to choose two parents, then variation operators are applied to the fitter of those, running a number of V-cycles of coarsening–initial partitioning–uncoarsening, using different randomisation seeds. The recombination operator only runs V-cycles on the subset of original-level vertices that are in different partitions in the two parents. Two mutation operators were defined: one starting from the original level, and another which preserves more locality by skipping the coarsening phase and starting from the initial partition corresponding to the fitter parent (these are cached to save time). To maintain diversity, a variant of restricted tournament selection is used, and the authors introduce a novel distance measure that they claim is better suited to this problem domain than Hamming distance.
The work presented here and that in [34] share the idea that the memetic algorithm should work at a less coarsened level. However, there are key differences: in [34] the EA works at the wholly uncoarsened level, which can mean millions of vertices/genes. Therefore, to make the search tractable, the subspace in which search occurs (via the V-cycles) is restricted and initial partitioning is run at a highly coarsened level.
III Methodology
III-A Test Framework
To ensure the comparability of results, we use the KaHyPar n-level hypergraph partitioner [21, 12, 16]. This is a mature toolkit in which considerable attention has been paid to parameter tuning, so no further optimisation was applied. We also use a selection of the hypergraphs used previously for benchmarking KaHyPar, available from http://doi.org/10.5281/zenodo.30176. Specifically, we use: the 10 largest from the well-known ISPD98 VLSI circuits [35]; and 10 each randomly selected from the University of Florida sparse matrix collection (SPM) [36] (Airfoil_2d, Reuters911, usroads, stokes128, Andrews, Baumann, HTC_336_9129, NotreDame_actors, Stanford, nasasrb) and the 2014 international SAT competition (SAT) [37] (gss20s100, MD5282, ctl_4291_567_5_unsat_pre, aaai10planningipc5pathways17step21, slpsynthesisaestop29, hwmcc10timeframeexpansionk45pdtvisns, dated1011u, atco_enc1_opt2_05_4, UCG1510p1, openstacksp30_3.085).
Since KaHyPar is currently the best general state-of-the-art hypergraph partitioner [12, 16], and recursive bipartitioning can scale more effectively with increasing k, here we use an initial testing regime of k = 2 and ε = 0.1. For benchmark comparisons, we use the KaHyPar Pool portfolio algorithm described above, and compare results at equivalent numbers of evaluations. An evaluation consists of generating an initial partitioning followed by an application of the FM algorithm. However, it should be noted that one evaluation of an algorithm in the Pool (e.g., a BFS) has a longer wall-clock time than an EA evaluation. The total partitioning times for the experiments reported here are longer for the Pool when compared at the same threshold. For k = 2, the (λ − 1) and hyperedge cutsize metrics are identical [4], and so here we use this as the objective function.
III-B Representation, Algorithm Operators, and Parameters
We adopt a simple vertex-to-cluster encoding of the coarsened hypernodes, and use a (μ + λ) EA where each subsequent generation consists of the fittest μ from the parental population and λ offspring. Each offspring is created as the product of two (independently) randomly selected parents. Uniform crossover is applied with 80% probability. Symmetry in the fitness landscape can severely obstruct the evolutionary search [38], so we apply parental alignment (normalisation) during crossover: if the Hamming distance between the parents exceeds half the genome length, then the gene values of one parent are inverted. A self-adaptive mutation scheme is then applied, setting genes to random values. Following Serpell and Smith [39], each candidate maintains its own mutation rate. This is initially inherited from the fitter of its parents, and then with 10% probability may be randomly reset to one of 10 possible values before applying mutation at the resulting rate. If an offspring has an imbalance greater than ε, a repair mechanism is invoked, randomly moving vertices from the largest to the smallest partition. Lamarckian evolution is performed by subsequently applying the FM local search algorithm using default [12] KaHyPar settings, with the offspring acquiring any modifications. See Algorithm 2.

III-C Comparison Metrics and Statistical Analysis of Results
The distribution of values observed from repeated runs was not normally distributed—especially when there is a ‘hard’ lower or upper limit. We therefore apply nonparametric tests.
For each run, we recorded two values: the initial cutsize, found by a search algorithm operating at the coarsest level, and the final cutsize, the value at the original level, i.e., after uncoarsening has taken place. Since these values depend on the coarsening threshold t and the choice of algorithm, we denote them accordingly. In some cases below we also report the best-case cutsize: the value observed at whichever coarsening threshold gave the best results for a given dataset.
To measure the performance of different algorithms across the full range of thresholds, we also present the area under the curve (AUC) results, estimated from the experiments at individual thresholds using a composite Simpson’s rule. When comparing methods on a single problem, we use the Wilcoxon rank-sum test, with the null hypothesis that all observed results come from the same distribution.
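The AUC estimate can be illustrated with a small pure-Python composite Simpson's rule (assuming evenly spaced thresholds and an even number of intervals; this is a generic sketch, not the exact implementation used):

```python
# Composite Simpson's rule over samples ys taken at evenly spaced xs.
# Requires an even number of intervals (odd number of sample points).
def simpson_auc(xs, ys):
    n = len(xs) - 1                      # number of intervals
    assert n >= 2 and n % 2 == 0, "need an even number of intervals"
    h = (xs[-1] - xs[0]) / n             # uniform spacing
    total = ys[0] + ys[-1]
    total += 4 * sum(ys[i] for i in range(1, n, 2))   # odd-index points
    total += 2 * sum(ys[i] for i in range(2, n, 2))   # interior even points
    return total * h / 3
```

Simpson's rule is exact for quadratics, so it gives a reasonable low-cost estimate of area under a smooth cutsize-versus-threshold curve.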
To draw any firm overall conclusions about the performance of the two approaches, we follow the recommendations in [40]
for comparing algorithms over multiple data sets. First, we examine the results to ensure that for each algorithm–hypergraph combination the arithmetic mean is a reliable estimate of performance, i.e., that the distribution of observations from the 20 runs is unimodal with low standard deviation. This results in a pair of values (one per algorithm) for each hypergraph, to which the Wilcoxon signed-ranks test can be applied with the null hypothesis that, taken across all hypergraphs, there is no difference in performance.
Finally, runtimes are recorded as total wall-clock time for the whole process because the time taken in each phase is heavily linked to the results of the previous stage.
IV Landscape Analysis at Different Levels
One of the tenets of the multilevel approach to solving HGP is that the sheer size of the search space makes it impractical to solve at the original, uncoarsened level, and that therefore it is better to conduct the search for a good initial partitioning within a much smaller space. It has also been suggested that the graph-partitioning counterparts become easier to search as the level of coarsening increases [28]. Nevertheless, there is clearly a tradeoff. It is inevitable that the coarsening process reduces the information content, so the mapping between quality of initial and final cuts becomes more noisy, especially given the greedy uncoarsening process.
To investigate the nature of the search spaces at different levels of coarsening, we used KaHyPar to generate 10000 random starting points, applied FM to each, and stored the resulting local optima. For each problem we then identified the (usually singleton) set of ‘quasi-global’ optima. For each local optimum, we measured its Hamming distance (and that of its inverse) to each of the global optima, and recorded the smallest distance (scaled to [0, 1]), together with the relative cutsize, i.e., its cutsize divided by the landscape’s estimated global minimum. This was done at t = 150 and t = 15000 for four hypergraphs from each of the ISPD98, SPM, and SAT collections.
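The distance measure and the fitness-distance fit used in this analysis can be sketched as follows (pure Python; function names are ours):

```python
# Sketch of the landscape-analysis measurements: the Hamming distance of a
# local optimum (or its inverse, which encodes the same bipartition) to the
# nearest global optimum, scaled to [0, 1]; plus a least-squares linear fit
# reporting the coefficient of determination R^2.
def scaled_distance(candidate, global_optima):
    n = len(candidate)
    best = n
    for g in global_optima:
        d = sum(x != y for x, y in zip(candidate, g))
        best = min(best, d, n - d)       # consider the inverse encoding too
    return best / n

def fdc_r_squared(distances, relative_cutsizes):
    # linear regression relative_cutsize ~ a * distance + b; returns R^2
    n = len(distances)
    mx = sum(distances) / n
    my = sum(relative_cutsizes) / n
    sxx = sum((x - mx) ** 2 for x in distances)
    sxy = sum((x - mx) * (y - my) for x, y in zip(distances, relative_cutsizes))
    syy = sum((y - my) ** 2 for y in relative_cutsizes)
    return (sxy * sxy) / (sxx * syy) if sxx and syy else 0.0
```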
Landscapes were examined through a combination of visual analytics (scatter and kernel-density-estimate (KDE) plots) and a model of the fitness-distance correlation (FDC). The FDC model is a linear regression of local optima cutsize as a function of distance to the nearest global optimum. The proportion of observed variation in relative cutsize that can be described by the model was recorded, i.e., the coefficient of determination (R²).

This analysis showed a significant similarity between problems, with the exception of Stanford, where coarsening stops prematurely. Fig. 1 shows KDE plots for the two thresholds overlaid with the FDC results for two typical hypergraphs. Note that the scales were chosen to permit comparison between different thresholds, and so significant numbers of local optima with high relative cutsizes are not shown. This is why the linear regression lines lie above the main cloud of visible points. The results of this analysis, and the implications for search algorithm design, are:

On some problems the coarsening process was observed to stop prematurely, and at different values when repeated (e.g., between 34000 and 65000 hypernodes for Stanford). This suggests that search algorithms should be designed to cope with large search spaces.

The FM process greatly reduced cutsizes and there was no correlation between the cutsizes of solutions before and after improvement. This suggests a lack of global structure of the landscape as a whole, i.e., considering all points rather than just local optima. This indicates algorithms should incorporate local search.

All search landscapes contained large numbers of distinct local optima. Only a few tens of duplicates were found; more than one copy of the global optimum was found in only 2 of the 24 runs, and never at the larger threshold. It was common to see cutsizes an order of magnitude worse than the quasi-global optimum. This suggests that it is worth devoting computational effort to finding good starting points for the search process.

On all landscapes there was a positive FDC, i.e., the global optimum was likely to be near other good local optima. This mirrors previous findings on the related graph partitioning problem [41, 28]. This suggests benefits for search algorithms that can exploit this information, such as population-based search with some form of recombination.

This effect was noticeably more present on the larger landscapes, i.e., at higher coarsening thresholds. This suggests that there may be a role for population-based search in partitioning at less coarse levels than is possible with single-member search algorithms such as BFS.

There was almost always a ‘gap’ between the best solution found and the next best. The lack of duplicates makes it unlikely the global optima had large basins of attraction. Given the numbers of ‘good’ local optima found just beyond this gap, this suggests a concentric structure. This may be because points “in the gap” are infeasible, or because the basins of attraction of the good-but-not-optimal local optima are large. Again this suggests a role for recombination, but as this has less effect as populations converge, it also suggests a changing role for mutation during search. Self-adaptation of mutation rates has often been shown successful in a wide range of domains [42] and simple approaches can be shown theoretically to be capable of overcoming both fitness and entropic barriers in combinatorial landscapes [43].
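The self-adaptive scheme alluded to here can be sketched as follows; the particular set of 10 candidate rates is an illustrative assumption, not the values used in our experiments:

```python
# Sketch of self-adaptive mutation (after Serpell and Smith): each candidate
# carries its own mutation rate, inherited from the fitter parent; with some
# probability the rate is reset to one of a fixed set of values before
# mutation is applied at the resulting rate.
import random

RATES = [0.0005, 0.001, 0.002, 0.005, 0.01, 0.02, 0.05, 0.1, 0.2, 0.4]

def self_adaptive_mutate(genes, rate, k, reset_prob=0.1, rng=random):
    if rng.random() < reset_prob:
        rate = rng.choice(RATES)             # innovate a new mutation rate
    mutated = [rng.randrange(k) if rng.random() < rate else g for g in genes]
    return mutated, rate                     # the offspring keeps its rate
```

Because the rate travels with the candidate, selection implicitly favours rates that are productive at the current stage of the search, allowing large moves early and fine-tuning as the population converges.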
V Sensitivity to EA design choices
V-A Population Seeding
The landscape analysis suggests that for some hypergraphs there is good reason to devote significant effort to finding good starting points for search. To examine this hypothesis, and conversely, whether seeding is detrimental when those conditions do not apply, we exploit the portfolio of algorithms in the Pool as a selection of heuristics for quickly finding approximate solutions. To examine the performance of the EA(100+1000) with different amounts of initial seeding, experiments were run with the EA seeded with S × μ Pool evaluations: for example, when S = 10, the first 1000 evaluations are generated from the Pool before the EA begins.
In Fig. 2 the cutsizes of the best solutions discovered are shown for the ibm18, Reuters911, Stanford, and usroads hypergraphs at coarsening threshold t = 15000. All results are averages of 20 runs. On both ibm18 and Reuters911, the EA quickly identifies better solutions than the Pool algorithm regardless of the seeding strategy, showing that the evolutionary search is able to effectively follow a gradient in the fitness landscape. However, on Stanford and usroads, the EA without seeding (S = 0) performs very poorly, being an order of magnitude worse than the Pool after 30000 evaluations. Given that so many local optima are present in such a fitness landscape, starting with fully random solutions (S = 0) or only a few good solutions (S = 1, S = 10) can cause the EA to converge prematurely. Only by starting the EA at a suitable point in the landscape, here after 10000 Pool evaluations (S = 100), is it able to consistently find very good solutions regardless of the effectiveness of coarsening. Further increasing the amount of seeding (S = 200) did not result in additional improvements. In all following experiments we therefore use S = 100, i.e., 10000 initial Pool evaluations.
The top-right KDE plot in Fig. 1 suggests a reason for these observations. The huge majority of local optima lie far from the global optimum and, considering the high-density contours, there is little or no slope to guide the search towards the global optimum. Although there is a correlation between local optima cutsize and distance from the global optimum, this gradient only emerges when enough seeds have been considered to sample the lower-density contours of the KDE.
V-B Population Size
EA sensitivity to μ and λ was explored by repeating the previous experiments across the spectrum of coarsening levels on the same 12 hypergraphs. A μ:λ ratio of 1:10 was employed as this is a commonly used setting, especially with self-adaptive mutation [39]. The EA(10+100) was found to produce significantly worse final cutsizes than the EA(100+1000). However, the EA(50+500) and EA(200+2000) were not significantly different from the EA(100+1000). This shows that the EA is reasonably robust to these parameters and justifies the use of the fixed values 100+1000 here. However, as shown in Table I, the optimum coarsening threshold differs for each hypergraph. Therefore, adaptive population sizing schemes, which have been shown to increase EA performance [44], could further optimise wall-clock partitioning time.
V-C Variation Operators
Further experimentation on less coarsened hypergraphs (t = 15000) confirmed results widely reported for graph partitioning [23]: both the use of uniform crossover and parental alignment significantly improved performance. This finding remained consistent even with the use of self-adaptive mutation. For example, the EA(100+1000) with crossover produced initial cutsizes on average 30% smaller than without on ibm18 after 30000 evaluations.
Estimation of distribution algorithms (EDAs) have been used to generate many state-of-the-art results by replacing recombination and mutation with a process of building and then sampling probabilistic graphical models (PGMs) of the current populations. We adapted Pelikan’s implementations of the Bayesian optimisation algorithm (BOA) [45] to work within our seeding regime, and to explicitly exploit the representation’s symmetry during model building. With small t no significant differences in performance were observed. However, the scalability of the model building process was an issue with large t. Runs on a MacBook Pro with a 2.8GHz 4-core Intel i7 processor with 16GB RAM were halted after 6 hours stuck in initial model building for both decision tree and graph-based variants of BOA, even after restricting the space of PGMs to bivariate models. Simplifying still further to a univariate model removed the ability to accurately capture interactions: runs with S = 100 initial seeding produced significantly larger mean initial cutsizes after 30000 evaluations on the 4 hypergraphs in Fig. 2: 2422, 3154, 210, and 128 on ibm18, Reuters911, Stanford, and usroads, respectively.

V-D Search at Different Coarsening Levels
The more coarsening performed on a hypergraph before partitioning, the more information is potentially hidden from the optimisation algorithm, i.e., it must move larger blocks. However, the less coarsening performed, the larger the search space and potentially the worse the optimisation algorithm will perform. To explore this relationship between algorithm and coarsening threshold, we examine the results of initial and final partitioning by the Pool and the EA with S = 100 seeding across a spectrum of coarsening levels. For each of the three classes of hypergraph, we perform experiments across the spectrum of coarsening thresholds on 4 of the 10 selected benchmark hypergraphs: ibm15–18 from ISPD98; gss20s100, aaai, MD5282, and slp from the SAT collection; and SPMs Airfoil_2d, Reuters911, Stanford, and usroads. Additionally we ran tests at t = 150 and t = 15000 on all 30 hypergraphs. Results presented are an average of 20 runs of each algorithm run to 30000 initial partitioning evaluations at each coarsening threshold; thresholds are sampled in intervals of 250 at the smaller values, and of 5000 above that. The initial and final cutsizes can be seen in Fig. 3.
V-D1 Overall Performance
Using the AUC metric to compare performance across all coarsening thresholds, initial cutsizes found by the EA were smaller than those found by the Pool on all 12 problems. The same is seen for final cutsizes, with the exception of Stanford, where it should be noted that the coarsening algorithm produces large hypergraphs (around 200000 pins) even at the smallest threshold.
V-D2 Highly Coarsened Hypergraphs
The nature of the search landscapes for highly coarsened hypergraphs results in little difference between the algorithms. No statistically significant difference between algorithms was observed on any of the 30 benchmarks for either initial or final cutsizes.
V-D3 Less Coarsened Hypergraphs
The difference between the algorithms becomes more significant the less coarsening is performed. For example, at t = 15000 the EA mean best initial cutsizes are significantly smaller than the Pool’s on all 10 of the ISPD98 hypergraphs (Wilcoxon rank-sum test). Furthermore, these improvements in initial partitioning lead to smaller final cutsizes. The mean and median are lower for the EA than the Pool algorithm on all 10 of the ISPD98 hypergraphs, but not significantly different at the 95% confidence interval on ibm10 and ibm11. On ibm18, the EA mean initial and final cutsizes were 20% and 16% smaller than the Pool’s.

Similar improvements to initial partitioning are found by the EA on the SPM hypergraphs. For example, with t = 15000, the EA mean initial cutsizes on 8 of the 10 SPM hypergraphs are significantly smaller than the Pool’s (Wilcoxon rank-sum test); no significant difference was observed on the nasasrb and Andrews hypergraphs. Interestingly, despite the improvement in initial partitioning, this only resulted in significant differences in final cutsizes on the Airfoil_2d, Reuters911, and usroads hypergraphs, where the EA resulted in improvements to mean final cutsize of 0.7%, 4%, and 15%, respectively. At this setting, no coarsening is performed on either the Airfoil_2d or Reuters911 hypergraphs and therefore the cutsizes are entirely a result of the memetic EA.
For the SAT hypergraphs at t = 15000, both the mean EA initial and final cutsizes are significantly smaller than the Pool’s on 6 of the hypergraphs, with no significant difference on the other 4, again showing that the EA performs a more effective search on larger hypergraphs.
Performing Wilcoxon signed-ranks tests of the initial partitionings across all runs on the 10 ISPD98 hypergraphs confirms that the EA has a significantly lower cutsize than the Pool at t = 15000. Moreover, this also translates to significant improvements in the final partitioning. Similar results were found when repeating the class tests for the 10 SPM hypergraphs and the 10 SAT hypergraphs.
V-D4 Optimum Coarsened Hypergraphs
Table I shows the smallest (average) final cutsizes discovered by the Pool and EA across all coarsening thresholds on the 4 hypergraphs from each benchmark set. This shows that when the optimum coarsening threshold for each algorithm–problem combination is known, the smallest final cutsize discovered by the EA is less than the Pool algorithm on all 4 of the largest ISPD98 hypergraphs. On the SAT hypergraphs, the best EA final cutsizes are on average smaller by 5.8% on gss20, 2.2% on aaai10, 2.75% on MD5282, and 2.6% on slpsynthesis. These improvements are statistically significant for all but ibm15 and Stanford. The improvements were achieved by the EA carrying out a more effective search at the same or higher coarsening threshold than the Pool and therefore being able to take advantage of any additional information in the larger initial hypergraph.
Also shown in Table I is the average total EA partitioning time, T_EA, relative to that taken by the Pool, T_Pool. As can be seen, the EA is faster on 7 of the 12 hypergraphs despite operating on a similar or larger initial hypergraph.
Hypergraph      | t* (Pool) | t* (EA) | Final cutsize (Pool) | Final cutsize (EA) | T_EA / T_Pool
ibm15           | 1000      | 3250    | 2649                 | 2632               | 2.69
ibm16           | 3250      | 25000   | 1762                 | 1720               | 3.15
ibm17           | 15000     | 15000   | 2276                 | 2244               | 0.74
ibm18           | 3000      | 3250    | 1612                 | 1564               | 0.57
Airfoil_2d      | 15000     | 15000   | 312                  | 311                | 0.66
Reuters911      | 5000      | 10000   | 3199                 | 3125               | 0.60
Stanford        | 500       | 250     | 30                   | 29                 | 0.40
usroads         | 750       | 2250    | 80                   | 79                 | 1.87
aaai10planning  | 5000      | 5000    | 2312                 | 2261               | 0.65
gss20s100       | 1250      | 30000   | 1002                 | 944                | 9.67
MD5282          | 500       | 10000   | 3580                 | 3483               | 6.41
slpsynthesis    | 2500      | 4500    | 2618                 | 2549               | 0.96
V-D.5 Summary

The results for all 30 hypergraphs at the coarsest level (threshold of 150) show no significant difference between the algorithms.

However, with larger initial hypergraphs (threshold of 15000), the EA significantly outperforms the Pool.

Furthermore, the wall-clock time of the Pool algorithm was significantly higher than the EA's.
Moreover, the results confirm our hypothesis that if initial partitioning is done on large hypergraphs, the picture changes dramatically. Taken as a whole, for the 12 instances where the spectrum of coarsening thresholds was explored:

The EA significantly outperforms the Pool algorithm over all coarsening thresholds (AUC metric).

The final cutsizes of the EA at the largest threshold are significantly smaller for all 12 hypergraphs than those of the Pool algorithm at the default threshold of 150.

Taking the optimum threshold for each algorithm-problem combination and comparing the best-case cutsizes across the 12 problems, the EA results are significantly better than the Pool algorithm's.
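As an illustration of the AUC-style comparison in the first bullet above, one simple formulation is to integrate each algorithm's final cutsize over the (log-scaled) range of coarsening thresholds and compare the resulting areas. The trapezoidal construction and all numbers below are illustrative assumptions, not the paper's exact metric or data:

```python
import numpy as np

def auc(thresholds, cutsizes):
    """Trapezoidal area of cutsize over log10(threshold)."""
    x = np.log10(np.asarray(thresholds, dtype=float))
    y = np.asarray(cutsizes, dtype=float)
    return float(np.sum((y[1:] + y[:-1]) / 2.0 * np.diff(x)))

# Hypothetical final cutsizes at a spread of coarsening thresholds.
thresholds = [150, 500, 1000, 3250, 15000]
ea_cuts    = [1650, 1620, 1600, 1564, 1570]
pool_cuts  = [1660, 1640, 1630, 1612, 1650]

auc_ea, auc_pool = auc(thresholds, ea_cuts), auc(thresholds, pool_cuts)
print(auc_ea < auc_pool)  # smaller area = better across the whole range
```

Using the log of the threshold weights each decade of the sweep equally, so performance at small thresholds is not swamped by the much wider numeric range at large ones.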
VI Adaptive Coarsening to Identify the EA Niche
The less coarsening is performed, the more information may be available to the initial partitioning algorithm, potentially enabling higher quality partitions. This is particularly evident in a number of the hypergraphs in Fig. 3 when observing the final cutsizes at large coarsening thresholds; see, for example, ibm18. However, for each algorithm there exists a point at which further increases in the size of the search space result in declining performance; see, for example, the algorithm cutsizes on the ibm18 hypergraph at the largest thresholds in Fig. 3. Simply selecting a larger fixed threshold does not help, since the 'optimal' threshold is clearly hypergraph-dependent.
From Fig. 3 it can be seen that the total pin count (the sum over all hyperedges of the number of vertices each contains) initially declines roughly linearly with the number of hypernodes before reaching a point of exponential decay. This suggests that for each hypergraph there may exist a tipping point balancing maximal information content against maximal hypergraph compression, akin to 'knee-points' in Pareto fronts. We therefore propose an adaptive coarsening scheme that halts hypernode contraction in response to the changing characteristics of the hypergraph.
VI-A Algorithm
We perform a linear piecewise approximation of the pin-count curve based on a sliding window of observations, and seek to identify the knee-point at which the linear approximation is least representative of the curve. Coarsening occurs as normal until the number of hypernodes falls below an initial bound. Thereafter, a linear regression is performed on the pin count, which is sampled after each batch of hypernode contractions and calculated over the most recent window of samples. Coarsening is terminated and initial hypergraph partitioning performed as usual when the correlation coefficient falls below a cut-off or the original threshold is reached. See Algorithm 3.
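A minimal sketch of this stopping rule is given below. The window size, correlation cut-off, synthetic pin-count curve, and function names are illustrative assumptions, not the tuned values found by the grid search:

```python
import numpy as np

def should_stop(pin_counts, window=10, r_min=0.98):
    """Return True when coarsening should halt.

    pin_counts: total pin count (sum of hyperedge lengths), sampled
    after each batch of contractions, most recent value last.
    A linear fit over the last `window` samples is judged by its
    Pearson correlation; when |r| drops below `r_min`, the curve has
    left its linear regime and entered the knee-point decay.
    """
    if len(pin_counts) < window:
        return False                          # not enough samples yet
    y = np.asarray(pin_counts[-window:], dtype=float)
    x = np.arange(window, dtype=float)
    r = np.corrcoef(x, y)[0, 1]               # goodness of a linear fit
    return abs(r) < r_min

# Synthetic pin-count curve: a linear decline (40 samples) followed by
# exponential decay, mimicking the shape described above.
curve = list(np.linspace(10_000, 6_000, 40))
curve += list(6_000 * np.exp(-0.5 * np.arange(1, 15)))

stopped_at = None
for i in range(1, len(curve) + 1):
    if should_stop(curve[:i]):
        stopped_at = i                        # first sample past the knee
        break
print("halted after sample", stopped_at)
```

On this synthetic curve the correlation stays at |r| ≈ 1 while the decline is linear, and the rule fires on the first window that includes a post-knee sample; the check is a handful of arithmetic operations per batch, consistent with the 'computationally cheap' claim.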
A grid search of these parameters was performed to minimise the final EA(100+1000) cutsizes on the 12 hypergraphs for which partitioning was previously performed across the range of coarsening thresholds, and the best-performing parameter values were identified.
VI-B Results
Results show that over a wide range of different hypergraphs this simple adaptive threshold can identify better places to stop coarsening, albeit with some large variations:

Across all 30 hypergraphs there was an overall reduction in the mean final cutsize of 1.6% compared with the results achieved at a threshold of 150, and a 1.25% reduction compared with the results at 15000.

The mean final cutsize is smaller on 22 of the 30 hypergraphs when using the adaptive threshold compared with the EA at a threshold of 150. This difference is statistically significant on 6 of the 10 ISPD98 hypergraphs, 2 of the 10 SPM hypergraphs (Reuters911 and usroads), and 2 of the 10 SAT hypergraphs (gss20s100 and UCG1510p1). Similar improvements are found when compared with the Pool at a threshold of 150.

Excluding the 12 hypergraphs used for training the coarsening parameters, the EA achieves an overall reduction in the mean final cutsize of 1.8% compared with the results achieved at a threshold of 150.

Taken hypergraph-by-hypergraph, the mean final cutsize is smaller on 13 of the 18 hypergraphs. There is no significant difference compared with a threshold of 15000, and yet overall the average wall-clock time was faster.

Total partitioning time at a threshold of 150 is of course much shorter than for the adaptively coarsened hypergraphs, but at the cost of larger cutsizes. This demonstrates the existence of the aforementioned knee-points.
The use of a range of visual analytics tools failed to uncover any obvious relationships between the characteristics of the uncoarsened hypergraphs and the magnitude and direction of the performance difference arising from adaptive coarsening.
VII Conclusions
Our analysis of the state of the art in hypergraph partitioning algorithms reveals that, despite considerable sophistication, all algorithms use a somewhat arbitrary threshold for determining the size of the initial partitioning problem to be solved. This is perhaps driven by the poor scalability of the search algorithms involved, such as BFS.
However, experimental analysis of the ‘searchability’ of initial partition landscapes at different coarsening thresholds shows that larger landscapes may have properties that can be exploited by populationbased search, and we derive some guidelines for algorithm design based on that analysis.
Experimental results confirm our hypothesis that there is a valuable 'niche' for EA-based search that leads to statistically significant reductions in final cutsize: up to 20% compared to the default settings (Pool algorithm at a threshold of 150). Searching effectively in larger search spaces comes at a cost of approximately tenfold in runtime, but this may well be warranted in many contexts, such as 'one-off' design, or where subsequent processing is needed within the partitions.
Sensitivity analysis confirmed the guidelines derived from landscape analysis: recombination is useful, population size is not critical, and it is worth devoting a significant proportion of the computational budget to seeding the EA-based search.
Examining the search performance of different algorithms at different coarsening levels, we observe that there is a 'sweet-spot' for EA-based search that is instance-dependent. We identify a novel, computationally cheap method for halting coarsening by monitoring the rate of change in information content as the hypergraph is contracted. This gives results as good as stopping at a predefined, arbitrary larger threshold, with runtimes reduced 7.5-fold.
We do not claim to have developed the ‘best’ EA to work in that niche. Rather, the aim of this paper was to establish the presence of a valuable role for EAs in hypergraph partitioning, working at a less coarsened level than currently used. In future work we will focus on (i) improved adaptive coarsening schemes, and (ii) tighter integration and reuse of information from the FM local search with the EA search processes and EDA modelbuilding.
Acknowledgments
The authors would like to thank the Karlsruhe Institute of Technology for KaHyPar and benchmark hypergraphs, and Martin Pelikan for his implementations of the BOA algorithm.
References
 [1] T. Lengauer, Combinatorial algorithms for integrated circuit layout. New York, NY, USA: John Wiley & Sons, 1990.
 [2] S. Areibi and Z. Yang, “Effective memetic algorithms for VLSI design = genetic algorithms + local search + multilevel clustering,” Evol. Comput., vol. 12, no. 3, pp. 327–353, Fall 2004.
 [3] O. Selvitopi, S. Acer, and C. Aykanat, “A recursive hypergraph bipartitioning framework for reducing bandwidth and latency costs simultaneously,” IEEE Trans. Parallel Distrib. Syst., vol. 28, no. 2, pp. 345–358, Feb. 2017.
 [4] A. Trifunović, “Parallel algorithms for hypergraph partitioning,” Ph.D. dissertation, Department of Computing, Imperial College of Science, Technology and Medicine, University of London, London, UK, 2006.
 [5] F. Lotfifar, “Hypergraph partitioning in the cloud,” Ph.D. dissertation, School of Engineering and Computing Sciences, Durham University, Durham, UK, 2016.
 [6] C. J. Alpert, J.-H. Huang, and A. B. Kahng, “Multilevel circuit partitioning,” IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 17, no. 8, pp. 655–667, Aug. 1998.
 [7] G. Karypis, R. Aggarwal, V. Kumar, and S. Shekhar, “Multilevel hypergraph partitioning: Applications in VLSI domain,” IEEE Trans. VLSI Syst., vol. 8, no. 1, pp. 69–79, Mar. 1999.

 [8] U. V. Çatalyürek and C. Aykanat, “Hypergraph-partitioning-based decomposition for parallel sparse-matrix vector multiplication,” IEEE Trans. Parallel Distrib. Syst., vol. 11, no. 7, pp. 673–693, Jul. 1999.
 [9] K. D. Devine, E. G. Boman, R. T. Heaphy, R. H. Bisseling, and U. V. Çatalyürek, “Parallel hypergraph partitioning for scientific computing,” in Proc. IEEE Int. Parallel Distrib. Process. Symp., P. Spirakis and H. J. Siegel, Eds. Piscataway, NJ, USA: IEEE Press, 2006, p. 10.
 [10] A. Trifunović and W. J. Knottenbelt, “Parallel multilevel algorithms for hypergraph partitioning,” J. Parallel Distrib. Comput., vol. 68, no. 5, pp. 563–581, May 2008.
 [11] U. V. Çatalyürek, M. Deveci, K. Kaya, and B. Uçar, “UMPa: A multiobjective, multilevel partitioner for communication minimization,” in Contemporary Mathematics: Graph Partitioning and Graph Clustering, D. A. Bader, H. Meyerhenke, P. Sanders, and D. Wagner, Eds. Providence, RI, USA: AMS, 2013, vol. 588, pp. 53–66.
 [12] S. Schlag et al., “k-way hypergraph partitioning via n-level recursive bisection,” in Proc. ALENEX, M. Goodrich and M. Mitzenmacher, Eds. Philadelphia, PA, USA: SIAM, 2016, pp. 53–67.
 [13] T. Heuer and S. Schlag, “Improving coarsening schemes for hypergraph partitioning by exploiting community structure,” in 16th Int. Symp. Experimental Algorithms, (SEA 2017), ser. Leibniz International Proceedings in Informatics (LIPIcs), C. S. Iliopoulos, S. P. Pissis, S. J. Puglisi, and R. Raman, Eds., vol. 75. Dagstuhl, Germany: Schloss Dagstuhl–LeibnizZentrum fuer Informatik, 2017, pp. 21:1–21:19.
 [14] C. M. Fiduccia and R. M. Mattheyses, “A linear time heuristic for improving network partitions,” in Proc. IEEE Design Autom. Conf., J. S. Crabbe, Ed. Piscataway, NJ, USA: IEEE Press, 1982, pp. 175–181.
 [15] V. Osipov and P. Sanders, “n-level graph partitioning,” in Proc. Euro. Symp. Algor., ser. LNCS, M. de Berg and U. Meyer, Eds., vol. 6346. Berlin, Germany: Springer, 2010, pp. 278–289.
 [16] Y. Akhremtsev, T. Heuer, P. Sanders, and S. Schlag, “Engineering a direct k-way hypergraph partitioning algorithm,” in Proc. ALENEX, S. Fekete and V. Ramachandran, Eds. Philadelphia, PA, USA: SIAM, 2017, pp. 28–42.
 [17] G. Karypis and V. Kumar, “A fast and high quality multilevel scheme for partitioning irregular graphs,” SIAM J. Sci. Comput., vol. 20, no. 1, pp. 359–392, Aug. 1998.

 [18] G. Karypis, “Multilevel hypergraph partitioning,” in Multilevel Optimization in VLSICAD, ser. Combinatorial Optimization, J. Cong and J. R. Shinnerl, Eds. New York, NY, USA: Springer US, 2003, vol. 14, ch. 3, pp. 125–154.
 [19] A. J. Soper, C. Walshaw, and M. Cross, “A combined evolutionary search and multilevel optimisation approach to graph-partitioning,” J. Global Optim., vol. 29, no. 2, pp. 225–241, Jun. 2004.
 [20] L. Kotthoff, “Algorithm selection for combinatorial search problems: A survey,” AI Mag., vol. 35, no. 3, pp. 48–60, Fall 2014.
 [21] T. Heuer, “Engineering initial partitioning algorithms for direct k-way hypergraph partitioning,” Bachelor thesis, Department of Informatics, Karlsruhe Institute of Technology, Karlsruhe, Germany, 2015.
 [22] U. V. Çatalyürek and C. Aykanat, “PaToH: Partitioning tool for hypergraphs,” http://bmi.osu.edu/umit/PaToH/manual.pdf, pp. 22–23, 2011.
 [23] J. Kim, I. Hwang, Y.-H. Kim, and B.-R. Moon, “Genetic approaches for graph partitioning: A survey,” in Proc. GECCO, N. Krasnogor, Ed. New York, NY, USA: ACM, 2011, pp. 473–480.
 [24] U. Benlic and J. K. Hao, “A multilevel memetic approach for improving graph k-partitions,” IEEE Trans. Evol. Comput., vol. 15, no. 5, pp. 624–642, Oct. 2011.
 [25] P. Sanders and C. Schulz, “Distributed evolutionary graph partitioning,” in Proc. ALENEX, D. A. Bader and P. Mutzel, Eds. Philadelphia, PA, USA: SIAM, 2012, pp. 16–29.
 [26] H. Meyerhenke, P. Sanders, and C. Schulz, “Parallel graph partitioning for complex networks,” IEEE Trans. Parallel Distrib. Syst., vol. 28, no. 9, pp. 2625–2638, Sep. 2017.
 [27] S. Küçükpetek, F. Polat, and H. J. Oğuztüzün, “Multilevel graph partitioning: An evolutionary approach,” J. Oper. Res. Soc., vol. 56, no. 5, pp. 549–562, May 2005.
 [28] P. Merz and B. Freisleben, “Fitness landscapes, memetic algorithms, and greedy operators for graph bipartitioning,” Evol. Comput., vol. 8, no. 1, pp. 61–91, Spring 2000.
 [29] A. S. Pope, D. R. Tauritz, and A. D. Kent, “Evolving multilevel graph partitioning algorithms,” in Proc. IEEE Symp. Series Comput. Intell., Y. Jin and S. Kollias, Eds. Piscataway, NJ, USA: IEEE Press, 2016, pp. 1–8.
 [30] H. Mühlenbein and T. Mahnig, “Evolutionary optimization and the estimation of search distributions with applications to graph bipartitioning,” Int. J. Approx. Reason., vol. 31, no. 3, pp. 157–192, Nov. 2002.
 [31] J. Schwarz and J. Očenášek, “Experimental study: Hypergraph partitioning based on the simple and advanced genetic algorithm BMDA and BOA,” in Proc. 5th Int. Mendel Conf. Soft Comput. (MENDEL’99), 1999, pp. 124–130.
 [32] J.-P. Kim, Y.-H. Kim, and B.-R. Moon, “A hybrid genetic approach for circuit bipartitioning,” in Proc. GECCO, ser. LNCS, K. Deb, Ed. Berlin, Germany: Springer, 2004, vol. 3103, pp. 1054–1064.
 [33] S. Coe, S. Areibi, and M. Moussa, “A hardware memetic accelerator for VLSI circuit partitioning,” Comput. Elect. Eng., vol. 33, no. 4, pp. 233–248, Jul. 2007.
 [34] R. Andre, S. Schlag, and C. Schulz, “Memetic multilevel hypergraph partitioning,” in Proc. GECCO, K. Takadama, Ed. New York, NY, USA: ACM, 2018, pp. 347–354.
 [35] C. J. Alpert, “The ISPD98 circuit benchmark suite,” in Proc. Int. Symp. Phys. Design, M. Sarrafzadeh, Ed. New York, NY, USA: ACM, 1998, pp. 80–85.
 [36] T. A. Davis and Y. Hu, “The University of Florida sparse matrix collection,” ACM Trans. Math. Softw., vol. 38, no. 1, pp. 1–25, Nov. 2011.
 [37] A. Belov, D. Diepold, M. Heule, and M. Järvisalo, “SAT competition 2014,” http://satcompetition.org/2014/, 2014.
 [38] S. S. Choi, Y. K. Kwon, and B. R. Moon, “Properties of symmetric fitness functions,” IEEE Trans. Evol. Comput., vol. 11, no. 6, pp. 743–757, Dec. 2007.
 [39] M. Serpell and J. E. Smith, “Selfadaptation of mutation operator and probability for permutation representations in genetic algorithms,” Evol. Comput., vol. 18, no. 3, pp. 491–514, Fall 2010.

[40]
J. Dems̆ar, “Statistical comparisons of classifiers over multiple data sets,”
J. Mach. Learn. Res., vol. 7, pp. 1–30, Jan. 2006.  [41] K. D. Boese, A. B. Kahng, and S. Muddu, “A new adaptive multistart technique for combinatorial global optimizations,” Oper. Res. Lett., vol. 16, no. 2, pp. 101–113, Sep. 1994.
 [42] S. MeyerNieberg and H.G. Beyer, “Selfadaptation in evolutionary algorithms,” in Parameter setting in evolutionary algorithms, ser. Studies in Computational Intelligence, F. Lobo, C. Lima, and Z. Michalewicz, Eds. Berlin, Germany: Springer, 2007, vol. 54, pp. 47–75.
 [43] J. E. Smith, “Parameter perturbation mechanisms in binary coded GAs with selfadaptive mutation,” in Foundations of Genetic Algorithms 7, C. Potta, R. Poli, J. Rowe, and K. DeJong, Eds. San Francisco, CA, USA: Morgan Kaufmann, 2003, pp. 329–346.
 [44] M. Z. Ali, N. H. Awad, P. N. Suganthan, and R. G. Reynolds, “An adaptive multipopulation differential evolution with dynamic population reduction,” IEEE Trans. Cybern., vol. 47, no. 9, pp. 2768–2779, Sep. 2017.
 [45] M. Pelikan, D. E. Goldberg, and E. CantúPaz, “BOA: The Bayesian optimization algorithm,” in Proc. GECCO, D. E. Goldberg, Ed. San Francisco, CA, USA: Morgan Kaufmann, 1999, pp. 525–532.