I Introduction
Many real-world applications model problems as hypergraphs, where connections between vertices are multidimensional: each vertex can communicate directly with n vertices in a group (called a “hyperedge”), and each vertex can belong to multiple groups. Hypergraph partitioning [1, 2, 3, 4, 5, 6, 7, 8] deals with the problem of optimally dividing a hypergraph into a set of equally sized components. Applications of hypergraph partitioning arise in diverse areas such as VLSI design placement [9], optimizing the task and data placement in workflows [10], minimizing the number of transactions in distributed data placement [11], optimizing storage sharding in distributed databases [7], and as a necessary preprocessing step in distributed hypergraph processing [12].
Formally, dividing the hypergraph is denoted as the balanced k-way hypergraph partitioning problem. The goal is to divide the hypergraph into $k$ equally sized partitions such that the number of times neighboring vertices are assigned to different partitions is minimal (this is denoted as the (k-1) metric, as introduced later). The balanced k-way hypergraph partitioning problem is NP-hard. Hence, a heuristic approach is needed to solve the problem for massive hypergraphs.
In the literature, several heuristic hypergraph partitioning algorithms have been proposed, but they have shortcomings. Streaming hypergraph partitioning [8] considers one vertex at a time from a stream of hypergraph vertices. Based on a scoring function, it greedily assigns each vertex from the stream to the partition that yields the best score. While this algorithm has low runtime, it does not consider all relationships between all vertices when deciding on the partitioning, so partitioning quality suffers. A recent hypergraph partitioning algorithm, Social Hash Partitioner [7], considers the complete hypergraph at once. It iteratively performs random permutations of the current partitioning followed by a greedy optimization to choose a better permutation over a worse one. While this approach generally converges to some form of an improved partitioning and is highly scalable, we argue that random permutations may not be the most effective choice for the partitioning heuristic.
In the related field of graph partitioning, a recently proposed algorithm uses neighborhood expansion to exploit structural properties of natural graphs [13]. Graphs can be regarded as special cases of hypergraphs in which each hyperedge contains exactly two vertices. However, the original neighborhood expansion algorithm for graphs cannot be directly applied to hypergraphs. As hyperedges may contain a very large number of vertices, the neighborhood of a single vertex can be huge, rendering neighborhood expansion infeasible.
In this paper, we are the first to successfully apply neighborhood expansion to hypergraph partitioning. We propose HYPE, a hypergraph partitioning algorithm specifically tailored to real-world hypergraphs with skewed degree distribution. HYPE grows $k$ partitions based on the neighborhood relations in the hypergraph. We evaluate the performance of HYPE on a set of real-world hypergraphs, including a novel hypergraph data set consisting of authors and subreddits from the online board Reddit. Reddit is, to the best of our knowledge, the largest real-world hypergraph that has been considered in evaluating hypergraph partitioning algorithms up to now. In our evaluations, we show that HYPE can partition very large hypergraphs efficiently with high quality. HYPE is 39% faster and yields 95% better partitioning quality than streaming partitioning [8]. We released the source code of HYPE as an open source project: https://github.com/mayerrn/HYPE
II Problem Formulation
In this section, we formulate the hypergraph partitioning problem addressed in this paper.
Problem Formulation: The hypergraph is given as $G = (V, E)$ with the set of vertices $V$ and the set of hyperedges $E \subseteq 2^V$. Given a vertex $v \in V$, we denote the set of adjacent vertices, i.e., the set of neighbors of $v$, as $N(v)$. The goal is to partition the hypergraph into $k$ partitions $P = \{p_1, \dots, p_k\}$ by assigning vertices to partitions. The assignment function $A: V \to P$ defines for each vertex in $V$ to which partition it is assigned. We write $A(v) = p_i$ if vertex $v$ is assigned to partition $p_i$. Each hyperedge spans between 1 and $k$ partitions. The optimization objective is the $(k-1)$ cut that sums over each hyperedge the number of times it is assigned to more than one partition, i.e., $\sum_{e \in E} \left( |\{p \in P \mid \exists v \in e: A(v) = p\}| - 1 \right)$. We require that the assignment of vertices to partitions is balanced in the number of vertices assigned to a single partition, i.e., $\forall p \in P: |\{v \in V \mid A(v) = p\}| \le (1 + \lambda) \cdot |V| / k$ for a small balancing factor $\lambda$. This problem is denoted as the balanced k-way hypergraph partitioning problem and it is NP-hard [14].
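To make the objective concrete, the following minimal sketch computes the (k-1) cut and checks the balance constraint for a toy hypergraph. It assumes the usual reading of the metric (each hyperedge contributes the number of partitions it spans minus one) and a balance bound of (1 + slack) * |V| / k. The function names, the `slack` parameter, and the sample data are illustrative and do not come from the HYPE source code.

```python
from collections import Counter

def k_minus_1_cut(hyperedges, assignment):
    """Sum over all (non-empty) hyperedges of: partitions spanned minus one."""
    return sum(len({assignment[v] for v in e}) - 1 for e in hyperedges)

def is_balanced(assignment, k, slack):
    """True if no partition holds more than (1 + slack) * |V| / k vertices."""
    sizes = Counter(assignment.values())
    limit = (1 + slack) * len(assignment) / k
    return all(sizes[p] <= limit for p in range(k))

# Toy hypergraph: six vertices, four hyperedges, two partitions (0 and 1).
hyperedges = [{0, 1, 2}, {2, 3}, {3, 4, 5}, {0, 5}]
assignment = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}

cut = k_minus_1_cut(hyperedges, assignment)    # hyperedges {2,3} and {0,5} are cut
balanced = is_balanced(assignment, k=2, slack=0.0)
```

Only the two hyperedges that straddle both partitions contribute to the cut, so a partitioning that keeps whole hyperedges together scores lower.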
Reddit Hypergraph Example: We give an example of the Reddit hypergraph in Figure 1. The Reddit social network consists of a large number of subreddits where each subreddit concerns a certain topic. For example, in the subreddit /r/learnprogramming, authors write comments related to the topic of learning to program. Each author writes comments in an arbitrary number of subreddits and each subreddit is authored by an arbitrary number of users. One way to build a hypergraph out of the Reddit data set (see Section IV) is to use subreddits as vertices and authors as hyperedges that connect these vertices. This hypergraph representation provides valuable information about the similarity of subreddits. For instance, if many authors are active in two subreddits (e.g. /r/Arduino and /r/ArduinoProjects), i.e., many hyperedges overlap significantly in two vertices, it is likely that the two subreddits concern similar content.
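The construction described above can be sketched in a few lines. The sketch assumes the raw data is available as (author, subreddit) pairs, one per comment; the record layout, the function names, and the sample records are made up for illustration and are not taken from the Reddit data set.

```python
from collections import defaultdict

def build_hypergraph(comments):
    """Map each author (hyperedge) to the set of subreddits (vertices) they wrote in."""
    hyperedges = defaultdict(set)
    for author, subreddit in comments:
        hyperedges[author].add(subreddit)
    return dict(hyperedges)

def shared_authors(hyperedges, a, b):
    """Count the hyperedges (authors) that contain both vertices a and b."""
    return sum(1 for subs in hyperedges.values() if a in subs and b in subs)

# Hypothetical sample records: (author, subreddit).
comments = [
    ("alice", "Arduino"), ("alice", "ArduinoProjects"),
    ("bob", "Arduino"), ("bob", "ArduinoProjects"),
    ("carol", "Baby"), ("carol", "Arduino"),
]
hypergraph = build_hypergraph(comments)
overlap = shared_authors(hypergraph, "Arduino", "ArduinoProjects")
```

A high overlap count between two subreddits is the signal that the paragraph above interprets as topical similarity.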
The size of the hyperedges and the degree of the vertices resemble a power-law distribution for real-world hypergraphs such as Reddit, StackOverflow, and Github. Hence, most vertices have a small degree and most hyperedges have a small size. These parts of the graph with small degrees build relatively independent communities. In the Reddit graph, there are local communities such as people who write in the /r/Arduino and /r/ArduinoProjects subreddits but not in, say, the /r/Baby subreddit. This property of strong local communities is well observed for graphs [15], and it holds for real-world hypergraphs as well. Power-law distributions are long-tailed, i.e., there are some hub vertices or hyperedges with substantial degrees or sizes. In the graph literature, it has been shown that focusing on optimal partitioning of the local communities at the expense of optimal placement of the hubs is a robust and effective partitioning strategy [16, 17, 18]. In Section III, we show how we exploit this observation in the HYPE partitioner.
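| Symbol | Description |
|---|---|
| $G = (V, E)$ | Hypergraph |
| $V$ | Set of vertices |
| $E$ | Set of hyperedges |
| $N(v)$ | Set of adjacent vertices of vertex $v$ |
| $N(X)$ | Union of all sets $N(v)$ for $v \in X$ |
| $P$ | Set of partitions with $\lvert P \rvert = k$ |
| $A: V \to P$ | Function assigning vertices to partitions |
| $\lambda$ | Balancing factor |
| $C_i$ | Current core set of vertices |
| $F$ | Current fringe |
| $d_{ext}(v, F)$ | External neighbors score of vertex $v$ |
| $s$ | Maximal size of the fringe |
| $r$ | Number of fringe candidate vertices |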
III The Algorithm
In the following, we first explain the idea of neighborhood expansion in Section III-A. We introduce our novel hypergraph partitioning algorithm HYPE in Section III-B and discuss the balancing of hypergraph partitions in Section III-C. Finally, we present a more formal pseudocode notation of the HYPE algorithm in Section III-D and perform a complexity analysis in Section III-E.
III-A Neighborhood Expansion Idea
A practical method for high-quality graph partitioning is neighborhood expansion [13]. The idea is to grow a core set of vertices via the neighborhood relation given by the graph structure. By exploiting the graph structure, the locality of vertices in the partition is maximized, i.e., neighboring vertices in the graph tend to reside on the same partition. The algorithm grows the core set one vertex at a time until the desired partition size is achieved. In order to partition the graph into $k$ partitions, the procedure of growing a core set is repeated $k$ times, once for each partition.
We aim to adapt neighborhood expansion to hypergraph partitioning. To this end, we have to overcome a set of challenges that are related to the different structure of hypergraphs compared to normal graphs. In particular, the number of neighbors of a vertex quickly explodes, as hyperedges contain multiple neighboring vertices at once. Before we explain those challenges and our approach to tackling them in detail, we sketch the basic idea of neighborhood expansion in the following.
Figure 2 sketches the general framework for growing the core set $C_i$ of partition $p_i$. There are three sets: the vertex universe, the fringe, and the core set. The vertex universe is the set of remaining vertices that can potentially be added to the fringe $F$, i.e., the vertices that are in neither the fringe nor any core set. The fringe $F$ is the set of vertices that are currently considered for the core set. The core $C_i$ is the set of vertices that are assigned to partition $p_i$. The three sets are non-overlapping, i.e., pairwise disjoint.
Initially, the core consists of seed vertices that are taken as a starting point for growing the partition. Based on these seed vertices, the fringe contains a subset of all neighboring vertices. In graph partitioning [13], the fringe contains not a subset but all vertices that are in a neighborhood relation to one of the vertices in the core set. In Figure 2, a fringe candidate vertex is moved from the vertex universe to the fringe and then to the core set. In other words, any strategy based on neighborhood expansion must define the two functions upd8_fringe() and upd8_core().
As we develop a hypergraph partitioning algorithm based on the neighborhood expansion framework, we define the neighborhood relation and the three vertex sets accordingly. However, migrating the idea of neighborhood expansion from graph to hypergraph partitioning is challenging. The number of neighbors of a specific vertex in a typical hypergraph is much larger than in a typical graph. The reason is that the number of neighbors is not only proportional to the number of incident hyperedges but also to the size of these hyperedges. For example, suppose you are writing a comment in the /r/Python subreddit. Suddenly, hundreds of thousands of other authors in /r/Python are your direct neighbors. In other words, the neighborhood relation is group-based rather than pairwise, which leads to massive neighborhoods. The large number of neighbors changes the runtime behavior and efficiency of neighborhood expansion. For instance, in such a hypergraph, the fringe would suddenly contain a large fraction of the vertices in the hypergraph. But selecting a vertex from the fringe requires a number of comparisons that grows with the size of the fringe. This leads to high runtime overhead and does not scale to massive graphs. HYPE alleviates this problem by reducing the search space significantly, as described next.
III-B HYPE Algorithm
The HYPE algorithm grows the core set $C_i$ for partition $p_i$ one vertex at a time. We move one vertex from the fringe to the core set and update the fringe with fresh vertices from the vertex universe. The decision of which vertex to include in the core and fringe sets is a critical design choice that affects both algorithm runtime and partitioning quality.
In this section, we explain the HYPE algorithm in detail, including a discussion of the design choices. In doing so, we first provide the basic algorithm in Section III-B1 and then discuss optimizations of the algorithm in Section III-B2. These optimizations tremendously reduce the algorithm runtime without compromising on partitioning quality.
III-B1 Basic Algorithm
The approach of growing a core set is repeated sequentially for each partition $p_i \in P$. It consists of a four-step process. We initialize the computation with step 1 and iterate steps 2 and 3 until the algorithm terminates as defined in step 4.
1. Initialize the core set.
2. Move vertex from vertex universe into fringe.
3. Move vertex from fringe into core set.
4. Terminate the expansion.
We now examine these steps in detail. A formal algorithmic description is given in Section III-D.
1. Initialize the core set
The core set must initially contain at least one vertex in order to grow via the neighborhood expansion. In general, there are many different ways to initialize the core set. This problem is similar to the problem of initializing a cluster center for iterative clustering algorithms such as K-Means [19]. Here, the de facto standard is to select random points as cluster centers [20, 21]. In fact, a comparison of several initialization methods for K-Means shows that the random method leads to robust clustering quality [22]. As the problem of selecting an initial “seed” vertex from which the core set grows is similar to this problem, we perform random initialization.
2. Move vertex to fringe
The function upd8_fringe() determines which vertices move from the vertex universe to the fringe $F$. The vertex universe consists of all vertices that are neither in the fringe $F$ nor in the core set $C_j$ of any previous execution of the algorithm for any partition $p_j$ with $j < i$.
The standard strategy of neighborhood expansion is to fill the fringe with all vertices that are neighbors to any core vertex, i.e., $F = N(C_i) \setminus C_i$. But for real-world hypergraphs, this quickly overloads the fringe with a large number of vertices from the vertex universe. To prevent this, we restrict the fringe to contain only $s$ vertices, i.e., $|F| \le s$. In Figure 3, we validate experimentally that setting $s$ to a small constant value keeps partitioning quality high but reduces runtime by a large factor. For brevity, we omit the discussion of similar results observed for other hypergraphs.
But how do we select the next vertex to be added to the fringe? Out of the vertex universe, we select a vertex to be included in the fringe using a scoring metric as described in the next paragraphs. The intuition behind this scoring metric is to find the vertex that preserves the highest locality when assigned to the core set.
To this end, we define locality as the frequency with which, for a given vertex $v$, a neighboring vertex $u \in N(v)$ resides on the same partition. High locality leads to low cut sizes and good partitioning quality. To improve locality, our goal is to grow the core set into the smaller local communities and assign all vertices of these smaller communities to the same partition. If a high proportion of the neighbors of vertex $v$ is already assigned to the core set, assigning vertex $v$ to the core set as well will improve locality.
In Figure 4, we see an example hypergraph that has the typical properties of real-world hypergraphs: the size of the hyperedges follows a power-law distribution. To grow the fringe, there are three options, one for each of the three candidate vertices shown in the figure. Intuitively, we want to grow the fringe towards the local communities to preserve locality. We achieve this by selecting vertices based on the external neighbors with respect to the fringe $F$. In other words, we want to add vertices to the fringe that have a high number of neighbors in the fringe or the core set, and a low number of neighbors in the remaining vertex universe. A vertex with a low external neighbors score tends to have high locality in the fringe and the core set. Formally, we write $d_{ext}(v, F)$ to denote the number of neighboring vertices of $v$ that are not already contained in the fringe $F$, as defined in Equation 1.
$d_{ext}(v, F) = |N(v) \setminus F|$   (1)
We denote the vertices for which we calculate the external neighbors score as fringe candidate vertices. For each execution of upd8_fringe(), we select $r$ fringe candidate vertices. The fringe contains up to $s$ vertices. Hence, after assigning one vertex to the core set, we take the top $s$ vertices out of the available fringe candidate vertices as the new fringe.
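The scoring and truncation step can be illustrated with a short sketch. It reads the external neighbors score as the number of neighbors of a vertex that are not contained in the fringe, and it keeps the `s` lowest-scoring candidates. The helper names, the tie-breaking by vertex id, and the toy data are assumptions made for this example; the sketch scans all hyperedges for simplicity, which a real implementation would avoid.

```python
import heapq

def neighbors(v, hyperedges):
    """N(v): all vertices that share at least one hyperedge with v, excluding v."""
    result = set()
    for e in hyperedges:
        if v in e:
            result |= e
    result.discard(v)
    return result

def external_neighbors_score(v, fringe, hyperedges):
    """Number of neighbors of v that are not contained in the fringe."""
    return len(neighbors(v, hyperedges) - fringe)

def top_s(candidates, fringe, hyperedges, s):
    """Keep the s candidates with the lowest score (ties broken by vertex id)."""
    return heapq.nsmallest(
        s, candidates,
        key=lambda v: (external_neighbors_score(v, fringe, hyperedges), v))

hyperedges = [{0, 1, 2}, {2, 3}, {3, 4, 5, 6, 7}]
fringe = {1, 2}
new_fringe = top_s([0, 3, 4], fringe, hyperedges, s=2)
```

Vertex 0 has all of its neighbors inside the fringe and therefore ranks first, while vertices inside the large hyperedge receive high scores.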
In Section III-B2, we describe three optimizations on the upd8_fringe() function that reduce the runtime while keeping the partitioning quality intact.
3. Move vertex to core set
Next, we describe the function upd8_core() that moves a vertex $v$ from the fringe $F$ to the core set $C_i$. The function simply selects the vertex with the smallest (cached) external neighbors score. This vertex is then moved to the core, i.e., $C_i \leftarrow C_i \cup \{v\}$. This decision is final: once assigned to the core $C_i$, vertex $v$ will never be assigned to any other core $C_j$ when considering a partition $p_j$ with $j > i$. Hence, we remove the vertex from the remaining set of vertices in the vertex universe. If the fringe cannot be filled with enough neighbors, we add a random vertex from the vertex universe to the fringe and proceed with the given algorithm.
4. Terminate the expansion
We terminate the algorithm as soon as the core set contains $|V|/k$ vertices. This leads to perfectly balanced partitions with respect to the number of vertices. Upon termination, we release the vertices in the fringe and store the vertices in the core set in a separate partitioning file for partition $p_i$. After this, we restart expansion for the next partition $p_{i+1}$ or terminate if all vertices have been assigned to partitions. In Section III-C, we discuss other possible balancing schemes.
| Dataset | Vertices | Hyperedges | #Vertices | #Hyperedges | #Edges |
|---|---|---|---|---|---|
| Github [23] | Users | Projects | 177,386 | 56,519 | 440,237 |
| StackOverflow [23] | Users | Posts | 641,876 | 545,196 | 1,301,942 |
| Reddit | Subreddits | Authors | 430,156 | 21,169,586 | 179,686,265 |
| RedditL | Comments | Authors & Subreddits | 2,862,748,675 | 21,599,742 | 5,725,497,350 |
III-B2 Optimization of Fringe Updates
When moving a vertex from the vertex universe to the fringe, we have to be careful in order to efficiently select a good vertex. Calculating a score for all vertices in the vertex universe would be much too expensive. For example, suppose we add a vertex that lies in the huge hyperedge of Figure 4 to the fringe. Suddenly, all vertices in that hyperedge could become fringe candidate vertices for which we would have to calculate the external neighbors score. Note that to calculate the external neighbors score, we must perform set operations that may touch an arbitrarily large portion of the global hypergraph.
Our strategy to address this issue involves three steps: (a) select the best fringe candidate vertices from the vertex universe in an efficient manner by traversing small hyperedges first, (b) reduce the number of fringe candidate vertices to $r$, and (c) reduce the computational overhead of calculating the score for a fringe candidate vertex by employing a scoring cache. These three optimizations help us to limit the overhead per decision of which vertex to include in the fringe. Next, we describe the optimizations in detail.
Maximize the chance to select good fringe candidate vertices
First, we describe how we maximize the chance to select good fringe candidate vertices from the vertex universe. The optimal vertex to add to the fringe is the vertex $v$ in the vertex universe with the minimal external neighbors score $d_{ext}(v, F)$. Vertices that reside in large hyperedges (see Figure 4) have a high number of neighbors. Hence, it is unlikely that these vertices have a low external neighbors score with respect to the fringe $F$. For example, in Figure 4, two of the three candidate vertices have 18 neighbors each, whereas the third has only 4 neighbors. Thus, the third vertex has a much higher chance of being the vertex with the minimal external neighbors score. Based on this observation, we optimize the selection of fringe candidate vertices by ordering all hyperedges that are incident to the fringe by size and considering only vertices in the smallest hyperedge for inclusion into the fringe. When we cannot retrieve $r$ vertices from the smallest hyperedge (because it does not contain enough vertices that are not already in $F$ or $C_i$), we proceed with the next larger hyperedge.
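The traversal order described above can be sketched as follows. The function collects up to `r` candidates by visiting the hyperedges that touch a given set of vertices in ascending order of size. The parameter names `frontier` (the vertices whose incident hyperedges are traversed) and `excluded` (vertices already in the fringe or a core set) are hypothetical, as is the tie-breaking by index; hyperedges are identified by their list position.

```python
def select_candidates(frontier, excluded, hyperedges, r):
    """Return up to r candidate vertices, visiting the smallest hyperedges first."""
    # Hyperedges that share at least one vertex with the frontier.
    incident = [i for i, e in enumerate(hyperedges) if e & frontier]
    # Ascending by size; the index breaks ties deterministically.
    incident.sort(key=lambda i: (len(hyperedges[i]), i))
    candidates = []
    for i in incident:
        for v in sorted(hyperedges[i]):
            if v not in excluded and v not in candidates:
                candidates.append(v)
                if len(candidates) == r:
                    return candidates
    return candidates

# Vertex 0 lies in one large hyperedge and two small ones.
hyperedges = [set(range(8)), {0, 8}, {0, 9, 10}]
two = select_candidates(frontier={0}, excluded={0}, hyperedges=hyperedges, r=2)
four = select_candidates(frontier={0}, excluded={0}, hyperedges=hyperedges, r=4)
```

With `r=2` the large hyperedge is never touched; it is only reached once the small hyperedges cannot supply enough candidates.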
Reduce the number of fringe candidate vertices to be selected
Next, we limit the number of fringe candidate vertices to $r = 2$ vertices. From these vertices, we select the vertex with the smaller external neighbors score. The basic principle of selecting the best out of two random choices is known in the literature as “the power of two random choices” [24] and has inspired our design. We experimentally validated that considering more than two options, i.e., $r > 2$, does not significantly improve the decision quality, cf. Figure 5. Clearly, the lower the number of fringe candidate vertices, the lower the runtime of the algorithm. Interestingly, using two choices, i.e., $r = 2$, leads to better partitioning quality than all other settings of $r$. Apparently, a higher value for $r$ forces the algorithm to consider fringe candidate vertices from larger hyperedges, which distracts the algorithm from the smaller hyperedges.
Reduce the overhead to compute an external neighbors score for fringe candidate vertices
The external neighbors score requires calculation of the set intersection between two potentially large sets (see Equation 1). This calculation must be done for all fringe candidate vertices. To prevent recomputation, we use a caching mechanism. The score is calculated only when the vertex is accessed for the first time (lazy caching policy). While this means that the cached score may become stale as more and more vertices are included in the fringe, our evaluation results show that partitioning quality stays the same when using caching (see Figure 6). At the same time, avoiding repeated score computations improves runtime by up to 20%.
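A lazy cache of this kind can be sketched as a thin wrapper around an arbitrary scoring function. The class name, the `computations` counter, and the toy adjacency below are illustrative assumptions; the sketch also demonstrates the staleness that the text accepts as a trade-off.

```python
class LazyScoreCache:
    """Compute a vertex score on first access and reuse it afterwards."""

    def __init__(self, score_fn):
        self._score_fn = score_fn
        self._scores = {}
        self.computations = 0          # number of real score calculations

    def get(self, v):
        if v not in self._scores:
            self._scores[v] = self._score_fn(v)
            self.computations += 1
        return self._scores[v]

    def clear(self):
        """Drop all cached scores, e.g. before growing the next partition."""
        self._scores.clear()

# Hypothetical adjacency and fringe for a single vertex 0.
neighbors_of = {0: {1, 2, 3}}
fringe = {1}
cache = LazyScoreCache(lambda v: len(neighbors_of[v] - fringe))

first = cache.get(0)      # computed now: neighbors 2 and 3 are outside the fringe
fringe.add(2)             # the fringe changes ...
second = cache.get(0)     # ... but the cached (stale) score is returned
cache.clear()
fresh = cache.get(0)      # recomputed against the current fringe
```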
III-C Balancing Considerations
The default balancing objective of the HYPE algorithm leads to a balanced number of vertices on each partition. For $n$ vertices and $k$ partitions, the algorithm repeats the neighborhood expansion, one vertex at a time, until there are exactly $n/k$ vertices per partition. Vertex balancing is the standard method for distributed graph processing systems such as Pregel [25], considering that the workload per partition is roughly proportional to the number of vertices per partition. Therefore, many practical algorithms such as the popular multilevel k-way hypergraph partitioning algorithm [1] focus on vertex balancing.
However, some applications of hypergraph partitioning may benefit from balancing the sum of vertices and hyperedges [8]. More precisely, for $n$ vertices and $m$ hyperedges, the algorithm should partition the hypergraph such that each partition is responsible for $(n + m)/k$ vertices or hyperedges. In the following, we discuss two ideas for how HYPE can achieve this. First, we assign a weight to each vertex $v$ that is based on the set of incident hyperedges of $v$. Then, we repeat the neighborhood expansion algorithm, assigning vertices until each partition holds at most its share of the total weight. The rationale behind this method is the law of large numbers: it is not likely that a single vertex assignment will suddenly introduce a huge imbalance in relation to the already assigned vertices. Second, to achieve perfect edge balancing, we can flip the hypergraph, i.e., view each original vertex as a hyperedge and each original hyperedge as a vertex. When balancing the number of vertices in the flipped graph, we actually balance the number of hyperedges in the original graph. After termination of the algorithm, we flip the hypergraph back to the original representation. We leave an investigation of other balancing constraints as future work.
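Flipping a hypergraph amounts to inverting the incidence relation. The sketch below assumes the hypergraph is stored as a dictionary from hyperedge id to vertex set; this representation and the function name are choices made for the example only.

```python
from collections import defaultdict

def flip(hyperedges):
    """Swap roles: each vertex becomes a hyperedge containing its former hyperedges."""
    flipped = defaultdict(set)
    for edge_id, vertices in hyperedges.items():
        for v in vertices:
            flipped[v].add(edge_id)
    return dict(flipped)

original = {"e1": {"a", "b"}, "e2": {"b", "c", "d"}}
dual = flip(original)        # vertices of the dual are e1 and e2
restored = flip(dual)        # flipping twice recovers the original
```

Balancing the two vertices `e1` and `e2` of the flipped hypergraph corresponds to balancing the hyperedges of the original one. Flipping back is lossless as long as no hyperedge is empty.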
III-D HYPE Pseudocode
Algorithm 1 lists the main loop. We repeat the following method for all partitions $p_i \in P$. After some housekeeping tasks such as filling the core set $C_i$ of partition $p_i$ with a random vertex (line 3), initializing the fringe (line 5), and clearing the cache (line 6), we repeat the main loop (lines 7–8) until the core set has exactly $|V|/k$ vertices. The loop body consists of the two functions upd8_fringe() and upd8_core() that are described next.
Algorithm 2 lists the upd8_fringe() function. The function consists of three steps: determine the fringe candidate vertices (lines 3–10), update the cache (lines 12–14), and update the fringe (lines 16–20). The first step calculates the fringe candidate vertices by first sorting the hyperedges that are incident to the core set by size (ascending) and then traversing these hyperedges for vertices that can still be added to the fringe (i.e., are not already assigned to the fringe or any core set). The second step updates the cache with the current external neighbors score with respect to the current fringe $F$. The third step sets the fringe to the set of top $s$ vertices with respect to the external neighbors score. If the fringe is still empty after these steps, we initialize it with a random vertex.
Algorithm 3 lists the upd8_core() function. We move the vertex with the minimal cached external neighbors score into the core $C_i$ and remove this vertex from the fringe $F$ and the vertex universe.
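Since the pseudocode listings are not reproduced here, the following is a simplified, self-contained sketch of how the three algorithms could fit together. It is not the authors' C++ implementation: the default `s=10` is an arbitrary illustrative value, incident hyperedges are re-sorted in every iteration for brevity, vertex ids are assumed to be sortable, and vertices that appear in no hyperedge are ignored.

```python
import random

def hype_sketch(hyperedges, k, s=10, r=2, seed=0):
    """Simplified neighborhood-expansion partitioner; returns {vertex: partition}."""
    rng = random.Random(seed)
    edges = [set(e) for e in hyperedges]
    incident = {}
    for idx, e in enumerate(edges):
        for v in e:
            incident.setdefault(v, []).append(idx)
    universe = set(incident)   # unassigned vertices that are not in the fringe
    n = len(universe)
    assignment = {}

    def score(v, fringe):
        # External neighbors score: neighbors of v outside the fringe.
        nbrs = set().union(*(edges[i] for i in incident[v])) - {v}
        return len(nbrs - fringe)

    def candidates(core):
        # Up to r universe vertices, taken from the smallest core-incident hyperedges.
        found = []
        by_size = sorted({i for v in core for i in incident[v]},
                         key=lambda i: (len(edges[i]), i))
        for i in by_size:
            for v in sorted(edges[i]):
                if v in universe and v not in found:
                    found.append(v)
                    if len(found) == r:
                        return found
        return found

    for part in range(k):
        target = n // k + (1 if part < n % k else 0)
        if target == 0:
            continue
        core, fringe, cache = set(), set(), {}
        start = rng.choice(sorted(universe))       # random seed vertex
        universe.remove(start)
        core.add(start)
        while len(core) < target:
            # upd8_fringe: pick candidates, score them lazily, keep the best s.
            cand = candidates(core)
            if not cand and not fringe:
                cand = [rng.choice(sorted(universe))]
            for v in cand:
                if v not in cache:
                    cache[v] = score(v, fringe)
            ranked = sorted(fringe | set(cand), key=lambda v: (cache[v], v))
            kept = set(ranked[:s])
            universe |= fringe - kept              # demoted vertices return
            universe -= kept
            fringe = kept
            # upd8_core: move the vertex with the smallest cached score.
            best = min(fringe, key=lambda v: (cache[v], v))
            fringe.remove(best)
            core.add(best)
        for v in core:
            assignment[v] = part
        universe |= fringe                         # release the fringe
    return assignment

# Two small communities {0..3} and {4..7} joined by the hyperedge {3, 4}.
toy = [{0, 1, 2}, {1, 2, 3}, {0, 3}, {4, 5, 6}, {5, 6, 7}, {4, 7}, {3, 4}]
result = hype_sketch(toy, k=2)
```

Every partition receives its target number of vertices regardless of the random seed, which mirrors the perfect vertex balancing discussed in the text; the quality of the resulting cut depends on the seed vertices.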
III-E Complexity Analysis
For the following analysis, we denote the number of vertices as $n$ and the number of hyperedges as $m$. Algorithm 1 repeats, for $k$ partitions, the procedure of moving vertices from the vertex universe to the fringe and from the fringe to the core. Next, we analyze the runtime of the procedures upd8_fringe() and upd8_core().
The function upd8_fringe() in Algorithm 2 consists of three steps. First, it determines $r$ fringe candidate vertices from the vertex universe (lines 3–12). As we set $r$ to a small constant ($r = 2$), this step is very fast in practice, with the following caveat: the algorithm needs to sort the incident hyperedges by size (line 6). The computational complexity of sorting all hyperedges is $O(m \log m)$. It is sufficient to sort the hyperedges only once at the beginning of the HYPE algorithm. Recall that the algorithm fills the fringe with $r$ new candidate vertices. In the worst case, the fringe is incident to all hyperedges in the hypergraph. Therefore, selecting the fringe candidate vertices is in $O(m)$, as it may iterate over all hyperedges.
This is a pessimistic estimate: in practical cases it is sufficient to check the first few smallest hyperedges to find the $r$ candidates. Second, the algorithm updates the cache with fresh external neighbors scores for new candidate vertices (lines 14–16). It calculates the external neighbors score at most once for every candidate vertex (it is just read from the cache if needed again later). As there are only $r$ fringe candidate vertices in each execution of upd8_fringe(), the total number of external neighbors score calculations is limited to . The overhead of calculating the external neighbors score of a vertex with respect to a set of fringe vertices is (cf. Equation 1), but is a constant (). Hence, the total computational complexity of updating the cache is . Third, the algorithm updates the fringe with vertices from the fringe candidates (lines 18–22). As both the fringe and the fringe candidates have constant sizes $s$ and $r$, the complexity of the third step is $O(1)$.
The function upd8_core() in Algorithm 3 selects the vertex with the minimal cached external neighbors score from the constant-sized fringe. Thus, the complexity is $O(1)$, including the housekeeping tasks in lines 3–5.
In total, the worst-case computational complexity of the HYPE algorithm is . As highlighted above, in practice we observe that only a small, constant number of hyperedges is checked in order to find the $r$ candidate vertices, so that we observe a complexity of .
IV Evaluations
In this section, we evaluate the performance of HYPE on several real-world hypergraphs.
Experimental Setup
All experiments were performed on a shared-memory machine with 4 x Intel(R) Xeon(R) CPU E7-4850 v4 @ 2.10GHz (4 x 16 cores) and 1 TB RAM. The source code of our HYPE partitioner is written in C++.
Hypergraph Data Sets
We perform the experiments on different real-world hypergraphs, i.e., Github [23], StackOverflow [23], Reddit, and RedditL (footnote 1: https://www.reddit.com/r/datasets/comments/3bxlg7/i_have_every_publicly_available_reddit_comment/), as listed in Table II. All of the hypergraphs show a power-law distribution of vertex and hyperedge degrees. In addition to the number of vertices and hyperedges, we report the number of edges. An edge represents an assignment of a vertex to a hyperedge.
We highlight that for this paper, we crawled two large real-world hypergraphs from the Reddit data set using the relations between authors, subreddits, and comments.
Benchmarks
We choose our benchmarks for evaluating HYPE based on 3 categories.
Group (I) consists of a wide range of hierarchical partitioners such as hMETIS [1], Mondriaan [2], Parkway [3], PaToH [4], Zoltan [5], and KaHyPar [6]. As no partitioner in group (I) consistently outperforms the other partitioners in terms of partitioning quality, scalability, and partitioning time, we decided on the well-established and widely used hypergraph partitioner hMETIS. Several recent papers show that hMETIS leads to competitive partitioning performance with respect to the (k-1) metric [7, 26, 8]. Hence, we chose hMETIS as the representative partitioner from group (I) in this paper. We run hMETIS in two different settings: with and without enforced vertex balancing. Due to its high partitioning quality, hMETIS serves as the benchmark for partitioning quality on small to medium hypergraphs. We used hMETIS in version 2.0pre1 with multilevel recursive bisection.
Group (II) consists of the recently proposed Social Hash Partitioner (SHP) [7]. The authors released the raw source code of SHP. Yet, we could not reproduce their results, as neither configuration files and parameters nor scripts, execution instructions, or documentation were provided. However, in our evaluations we partitioned hypergraphs of similar size as SHP in a similar runtime. In terms of partitioning quality, SHP reports a similar quality to partitioners from group (I), so that we can also implicitly get a fair comparison to SHP in our given evaluation setup.
Group (III) comprises multiple streaming partitioning strategies proposed by Alistarh et al. [8]. The greedy MinMax strategy consistently outperformed all other strategies according to their paper. Hence, we choose MinMax as the representative partitioner from group (III) in this paper. Moreover, we designed a novel vertex-balanced variant of MinMax that outperforms the original approach with respect to the (k-1) metric (footnote 2: we allowed a slack parameter of up to 100 vertices, cf. [8]). We denote this variant as MinMax NB (node balanced) in contrast to the standard MinMax EB (edge balanced).
Experiments
In all experiments, we capture the following metrics. (1) The (k-1) metric to evaluate the quality of the hypergraph partitioning. This is the default metric for partitioning quality [1, 5]. Other partitioning quality metrics such as the hyperedge cut and the sum of external degrees performed similarly in our experiments (footnote 3: the close relationship between these metrics is well known in the literature [7]). (2) The runtime of the algorithm to partition the whole input hypergraph, in order to evaluate the speed of the partitioning algorithms. (3) The vertex imbalance as a metric to capture the fairness of the hypergraph partitioning. We compute vertex imbalance as the normalized deviation between the maximal and the minimal number of vertices assigned to any partition.
For each experiment, we increase the number of partitions from 2 up to 128 in exponential steps.
IV-A Performance Evaluations
The performance evaluations show that HYPE outperforms all of our baseline algorithms in terms of quality and balancing. Its runtime is independent of the number of partitions, so that it is faster than the streaming partitioning algorithm MinMax when the number of partitions is large. Further, the hierarchical partitioning algorithm hMETIS does not scale to very large hypergraphs, i.e., it cannot partition the Reddit hypergraph. We discuss the detailed results next.
Github
Figure (a) shows the partitioning quality on the Github hypergraph. In the (k-1) metric, HYPE performs up to 22% better than hMETIS, up to 39% better than hMETIS with load balancing turned on, up to 45% better than MinMax hyperedge-balanced, and up to 34% better than MinMax vertex-balanced. For a very low number of partitions, i.e., 2, 4, or 8 partitions, hMETIS leads to better results than HYPE. For all other settings of k, HYPE produces the best partitioning quality.
Partitioning runtime is depicted in Figure (b). hMETIS took 54 to 105 seconds to partition the Github hypergraph, which is orders of magnitude slower than MinMax and HYPE (up to 145× slower than HYPE). The partitioning runtime of HYPE is independent of the number of partitions, as each partition is filled with vertices sequentially until it is full. In contrast, the partitioning runtime of MinMax depends on the number of partitions, as MinMax uses a scoring function that computes, for each vertex, a score for each partition and then assigns the vertex to the partition where its score is best. Hence, for up to 32 partitions, HYPE has a higher runtime than MinMax (up to 2.7× higher), whereas for 64 and 128 partitions, the runtime of HYPE is lower (up to 2.4× lower).
In terms of balancing, cf. Figure (c), HYPE shows perfect vertex balancing, while MinMax vertex-balanced has a slight imbalance of up to 5%. Unsurprisingly, MinMax hyperedge-balanced has poor vertex balancing, underlining our need to implement a vertex-balanced version of it for this paper. In hMETIS, when vertex balancing is turned on, the maximum imbalance was 8%, whereas without that flag, vertex imbalance was tremendously higher (up to 40% imbalance). The balancing results are similar for all other hypergraphs, so we do not discuss balancing in the remaining results.
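To illustrate why filling partitions one after another decouples the work from the number of partitions, consider the following deliberately simplified Python sketch. It is not the HYPE algorithm: the rule for picking the next vertex (`min`) is a placeholder assumption standing in for a real scoring rule, and all names are hypothetical. What it does show is that every vertex is assigned exactly once, so the number of assignment steps is n regardless of k.

```python
def sequential_fill(vertices, hyperedges, k):
    """Fill partitions one after another; each vertex is assigned exactly once."""
    # Map every vertex to the indices of its incident hyperedges.
    incident = {v: [] for v in vertices}
    for i, edge in enumerate(hyperedges):
        for v in edge:
            incident[v].append(i)

    unassigned = set(vertices)
    assignment = {}
    n = len(vertices)
    for p in range(k):
        # Partition sizes differ by at most one vertex.
        capacity = n // k + (1 if p < n % k else 0)
        fringe = set()  # unassigned vertices adjacent to the current partition
        for _ in range(capacity):
            # ASSUMPTION: min() is a placeholder for a real scoring rule.
            # If the fringe is empty, start from an arbitrary unassigned vertex.
            v = min(fringe) if fringe else min(unassigned)
            fringe.discard(v)
            unassigned.remove(v)
            assignment[v] = p
            # Expand the fringe with the unassigned neighbors of v.
            for i in incident[v]:
                for u in hyperedges[i]:
                    if u in unassigned:
                        fringe.add(u)
    return assignment


# Toy hypergraph (hypothetical data) with two loosely connected groups.
toy_edges = [{0, 1, 2}, {1, 2}, {3, 4, 5}, {4, 5}, {2, 3}]
print(sequential_fill(list(range(6)), toy_edges, k=2))
```

In this sketch, increasing k only changes where the partition boundaries fall in the sequence of assignments, whereas a per-vertex scoring loop over all partitions performs k score evaluations per vertex.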
StackOverflow
Figure (a) shows the partitioning quality on the StackOverflow hypergraph. In the (k-1) metric, HYPE performs up to 38% better than hMETIS, up to 40% better than hMETIS with load balancing turned on, up to 47% better than MinMax hyperedge-balanced, and up to 35% better than MinMax vertex-balanced. For a very low number of partitions, i.e., 2, 4, or 8 partitions, hMETIS leads to better results than HYPE. For all other settings of k, HYPE produces the best partitioning quality. In particular, hMETIS shows a sharp increase in the (k-1) metric when the number of partitions exceeds 16.
Partitioning runtime is depicted in Figure (b). hMETIS took between 189 and 866 seconds to partition the StackOverflow hypergraph, which is orders of magnitude slower than MinMax and HYPE (up to 274× slower than HYPE). The comparison between HYPE and MinMax on StackOverflow leads to similar results as on the Github hypergraph: with up to 32 partitions, MinMax is faster (up to 4.1× faster), but with 64 and 128 partitions, HYPE is faster (up to 1.8× faster). Again, the reason is that the runtime of HYPE is independent of the number of partitions.
Reddit
Figure (a) shows the partitioning quality on the Reddit hypergraph. For this hypergraph, we could not produce results with the hMETIS partitioner, as it crashed or ran for days without returning a result. Consistent with the experiments reported in [7], many partitioners from group (I) are not able to partition such large hypergraphs. Hence, in the following, we only report results for MinMax and HYPE.
On the Reddit hypergraph, the advantage of exploiting local communities in HYPE pays off to the full extent: HYPE outperforms the streaming partitioner MinMax, which ignores the overall hypergraph structure, by orders of magnitude. For 2, 4, and 8 partitions, HYPE achieved an improvement of 95% compared to MinMax hyperedge-balanced, and 93% compared to MinMax vertex-balanced in the (k-1) metric. Thus, HYPE leads to a partitioning quality that is up to 20× better than when using MinMax. For 16 partitions, HYPE performs 93% and 91% better, for 32 partitions 91% and 88%, for 64 partitions 88% and 84%, and for 128 partitions 83% and 80% better than the MinMax hyperedge-balanced and MinMax vertex-balanced partitioners, respectively.
Comparing the partitioning runtime of HYPE and MinMax in Figure (b), we see again that the runtime of HYPE is independent of the number of partitions, whereas the runtime of MinMax grows with the number of partitions because of its scoring scheme. While at 2 partitions MinMax is up to 4× faster than HYPE, HYPE becomes faster than MinMax from 64 partitions onward, by up to 36% at 128 partitions. As with the other hypergraphs, MinMax vertex-balanced is slightly slower than MinMax hyperedge-balanced, as the hyperedge balance can change significantly after assigning a single vertex. This often forces assignment to a single partition (the least loaded) such that the partitions remain more or less balanced. In such cases, the forced partitioning decisions for hyperedge-balanced partitioning can be performed very quickly.
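The relation between the percentage improvements and the "times better" factor can be made explicit with a small calculation. The sketch below assumes that "X% better" means the (k-1) score of HYPE is X% lower than that of MinMax, which is consistent with 95% corresponding to a factor of 20; the helper name is hypothetical.

```python
def improvement_to_factor(improvement):
    """Convert a relative reduction (e.g. 0.95 for 95%) into a ratio of scores."""
    if not 0.0 <= improvement < 1.0:
        raise ValueError("improvement must lie in [0, 1)")
    return 1.0 / (1.0 - improvement)


# Improvements reported above: (vs. hyperedge-balanced, vs. vertex-balanced).
reported = {
    "2, 4, 8": (0.95, 0.93),
    "16": (0.93, 0.91),
    "32": (0.91, 0.88),
    "64": (0.88, 0.84),
    "128": (0.83, 0.80),
}

for k, (vs_edge_balanced, vs_vertex_balanced) in reported.items():
    print(f"k = {k}: "
          f"{improvement_to_factor(vs_edge_balanced):.1f}x vs. hyperedge-balanced, "
          f"{improvement_to_factor(vs_vertex_balanced):.1f}x vs. vertex-balanced")
```

Because the mapping is nonlinear, a drop from 95% to 83% in relative improvement corresponds to a much larger drop in the ratio of scores.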
RedditL
In Figure 10, we compare partitioning quality and runtime of HYPE against MinMax on the large RedditL hypergraph for a fixed number of partitions. MinMax requires more than 31 hours to partition RedditL, compared to 19 hours for HYPE. Despite being faster than MinMax, HYPE also achieves a much better partitioning quality: MinMax has a (k-1) score of 68,709,969 compared to HYPE’s 8,357,200, i.e., a score that is roughly 8.2× higher. Note that MinMax already belongs to the fastest high-quality partitioners. However, HYPE is able to outperform MinMax in runtime because its runtime does not depend on the number of partitions.
IV-B Discussion of the Results
We conclude that HYPE shows very promising performance in hypergraph partitioning. First, it is able to partition very large hypergraphs, which cannot be partitioned by algorithms from group (I). Second, except for very small numbers of partitions (2, 4, or 8), where hMETIS is ahead, it provides better partitioning quality than both hMETIS and MinMax. On top of that, the HYPE algorithm is comparatively easy to implement and to manage because all system parameters are fixed.
V Related Work
In the last decades, research on hypergraph partitioning was driven by the need to place transistors on chips in very-large-scale integration (VLSI) design [9], as logic circuits can be modeled as large hypergraphs that are divided among chips. The most popular hypergraph partitioning algorithm from that area is hMETIS [1], which is based on a multilevel contraction algorithm and produces good partitioning quality for medium-sized hypergraphs on the order of up to 100,000 edges.
However, multilevel partitioning algorithms do not scale to large hypergraphs, as shown in our evaluations. Parallel implementations of multilevel partitioning have been proposed [5], but the problem of high computational complexity and memory consumption remains. For instance, Zoltan [5] is a parallel multilevel hypergraph partitioning algorithm. The hypergraphs evaluated with Zoltan are relatively small (on the order of 30 million edges or fewer), even though up to 64 parallel machines were used to process them. Other algorithms of that group are Mondriaan [2], Parkway [3], PaToH [4], and KaHyPar [6]. For hypergraphs with hundreds of millions of edges, these algorithms are not practical, as they take hours or even days to complete, if they terminate at all.
The poor scalability of multilevel partitioning algorithms led to the development of more scalable partitioners. Social Hash Partitioner (SHP) achieves scalability to very large hypergraphs (up to 10 billion edges) by means of massive parallelization [7]. SHP performs random swaps of vertices between partitions and greedily chooses the best swaps. Random swaps fit well with the objective of parallelization and distribution in SHP, but may not be the most efficient heuristic. Investigating the phenomenon of scalability versus efficiency [27], we conceived the idea of devising an efficient hypergraph partitioning algorithm.
Another approach that is suitable for partitioning very large hypergraphs is streaming partitioning. Streaming hypergraph partitioning takes one vertex at a time from a stream of vertices and calculates a score for each possible placement of that vertex on each of the partitions. The vertex is then placed on the partition where its placement score is best, and it cannot be moved afterwards. Alistarh et al. [8] proposed different heuristic scoring functions, among which greedily assigning vertices to the partition with the largest overlap of incident hyperedges is considered best. There are two issues with the streaming approach. First, by only taking into account a single vertex at a time and placing it, information about the neighborhood of that vertex is not exploited, although it is available in the hypergraph. Second, the complexity of the algorithm depends on the number of partitions, as the scoring function is computed for each vertex on each partition. For a large number of partitions, streaming partitioning becomes slow.
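The streaming scheme described above can be sketched in a few lines of Python. This is a minimal illustration of greedy overlap-based placement, not the exact MinMax scoring from [8]: the hard per-partition capacity, the tie-breaking by lower load, and all names are assumptions made for this sketch. The inner loop over all partitions is the reason the cost per vertex grows with k.

```python
def greedy_stream(vertex_stream, k, capacity):
    """Place each streamed vertex on the partition with the largest hyperedge overlap.

    vertex_stream yields (vertex, incident_hyperedge_ids) pairs.
    ASSUMPTION: `capacity` is a hard limit on vertices per partition.
    """
    edges_in_part = [set() for _ in range(k)]  # hyperedges already seen per partition
    load = [0] * k
    assignment = {}
    for v, incident in vertex_stream:
        best_part, best_key = None, None
        for p in range(k):  # one score per partition: cost grows with k
            if load[p] >= capacity:
                continue
            overlap = len(edges_in_part[p].intersection(incident))
            key = (overlap, -load[p])  # ASSUMPTION: ties go to the less loaded partition
            if best_key is None or key > best_key:
                best_part, best_key = p, key
        if best_part is None:
            raise ValueError("all partitions are full")
        assignment[v] = best_part  # the decision is final
        edges_in_part[best_part].update(incident)
        load[best_part] += 1
    return assignment


# Toy stream (hypothetical data): vertices 0 and 1 share hyperedge 0, 2 and 3 share hyperedge 1.
stream = [(0, [0]), (1, [0]), (2, [1]), (3, [1])]
print(greedy_stream(stream, k=2, capacity=2))
```

Note that each decision only sees the hyperedges of vertices that arrived earlier, which is the first limitation mentioned above.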
A closely related problem is balanced k-way graph partitioning, which faces similar challenges such as billion-scale graph data and the need for fast algorithms. Multilevel graph partitioners such as METIS [28] and ParMETIS [29] do not scale very well. Spinner [30] is a highly scalable graph partitioner that, like SHP, performs iterative random permutations and greedy selection of the best permutation. There is a large number of streaming graph partitioning algorithms, such as HDRF [17], H-load [31], and ADWISE [32]. The “neighborhood heuristic” by Zhang et al. [13] follows a completely different approach by exploiting the graph structure when making partitioning decisions. The algorithm grows a core set by successively adding neighbors of the core set to a fringe set. However, the given heuristic cannot be applied directly to hypergraph partitioning, as the calculation of scores is far too expensive in hypergraphs (see Section III). While hypergraphs can be transformed into bipartite graphs, graph partitioning algorithms cannot be used to perform hypergraph partitioning. First, the bipartite graph representation contains one artificial vertex per hyperedge, which destroys the vertex balancing requirement of hypergraph partitioning. Second, the (k-1) metric is ignored by graph partitioning algorithms.
In recent years, several distributed hypergraph systems have emerged that fueled the need for efficient partitioning of massive hypergraphs. These systems are inspired by distributed graph processing systems and apply the vertex-centric programming model from graph processing to hypergraph processing. For instance, HyperX [12] allows applications to specify vertex and hyperedge programs, which are then executed iteratively by the system. Also, Mesh [33] builds upon the popular GraphX system [34] and shows promising performance. These systems show a significant reduction of processing latency with improved partitioning quality.
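The balancing problem of the bipartite representation can be seen on a tiny example. The following Python sketch builds the bipartite graph with one artificial node per hyperedge and then inspects a split that is perfectly balanced over the bipartite nodes; the function name, the node labels, and the toy data are hypothetical.

```python
def to_bipartite(vertices, hyperedges):
    """Return (nodes, edges) with one artificial node per hyperedge."""
    nodes = [("v", v) for v in vertices]
    edges = []
    for i, hyperedge in enumerate(hyperedges):
        nodes.append(("e", i))  # artificial node standing for hyperedge i
        for v in sorted(hyperedge):
            edges.append((("v", v), ("e", i)))
    return nodes, edges


# Toy hypergraph (hypothetical data): 4 vertices, 2 hyperedges.
nodes, edges = to_bipartite(range(4), [{0, 1, 2}, {2, 3}])
print(len(nodes), len(edges))

# A split that is balanced over the 6 bipartite nodes (3 nodes per side) ...
side_a = {("v", 0), ("v", 1), ("v", 2)}
side_b = {("v", 3), ("e", 0), ("e", 1)}

# ... is not balanced over the original hypergraph vertices.
original_a = sum(1 for kind, _ in side_a if kind == "v")
original_b = sum(1 for kind, _ in side_b if kind == "v")
print(original_a, original_b)
```

A graph partitioner that balances node counts has no notion of which nodes are artificial, so it may accept such a split even though the original vertices are distributed 3 to 1.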
VI Conclusions
In this paper, we propose HYPE, an effective and efficient partitioner for real-world hypergraphs. The partitioner grows core sets in a sequential manner using a neighborhood expansion algorithm with several optimizations to reduce the search space. Due to the simplicity of the design and its focus on the hypergraph structure, HYPE is able to partition the large Reddit hypergraph with billions of edges in less than a day. This is one of the largest real-world hypergraphs whose partitioning has been reported in the literature. HYPE not only improves partitioning quality compared to streaming hypergraph partitioning, but also reduces runtime.
A promising line of future research on HYPE is to explore how to grow the core sets in parallel. In this scenario, several core sets “compete” for inclusion of attractive vertices, so the crucial questions are how to minimize the number of “collisions” and how to deal with collisions when they happen.
References
 [1] G. Karypis and V. Kumar, “Multilevel k-way hypergraph partitioning,” in Proc. of the 36th ACM/IEEE Design Automation Conference, 1999.
 [2] B. Vastenhouw and R. H. Bisseling, “A two-dimensional data distribution method for parallel sparse matrix-vector multiplication,” SIAM Review, vol. 47, no. 1, pp. 67–95, 2005.
 [3] A. Trifunović and W. J. Knottenbelt, “Parallel multilevel algorithms for hypergraph partitioning,” Journal of Parallel and Distributed Computing, vol. 68, no. 5, pp. 563–581, 2008.
 [4] U. V. Catalyurek and C. Aykanat, “Hypergraph-partitioning-based decomposition for parallel sparse-matrix vector multiplication,” IEEE Trans. on Parallel and Distributed Systems, vol. 10, no. 7, pp. 673–693, Jul 1999.
 [5] K. D. Devine, E. G. Boman, R. T. Heaphy, R. H. Bisseling, and U. V. Catalyurek, “Parallel hypergraph partitioning for scientific computing,” in Proc. 20th Int. Parallel Distributed Processing Symposium, 2006.
 [6] T. Heuer and S. Schlag, “Improving coarsening schemes for hypergraph partitioning by exploiting community structure,” in 16th Int. Symposium on Experimental Algorithms, (SEA 2017), 2017.
 [7] I. Kabiljo, B. Karrer, M. Pundir, S. Pupyrev, and A. Shalita, “Social hash partitioner: a scalable distributed hypergraph partitioner,” Proc. of the VLDB Endowment, vol. 10, no. 11, pp. 1418–1429, 2017.
 [8] D. Alistarh, J. Iglesias, and M. Vojnovic, “Streaming min-max hypergraph partitioning,” in Proc. of the 28th Int. Conf. on Neural Information Processing Systems - Volume 2, 2015.
 [9] G. Karypis, R. Aggarwal, V. Kumar, and S. Shekhar, “Multilevel hypergraph partitioning: applications in VLSI domain,” IEEE Trans. on Very Large Scale Integration Systems, vol. 7, no. 1, pp. 69–79, 1999.
 [10] U. V. Çatalyürek, K. Kaya, and B. Uçar, “Integrated data placement and task assignment for scientific workflows in clouds,” in Proc. of the 4th Int. Workshop on Data-intensive Distributed Computing, 2011.
 [11] C. Curino, E. Jones, Y. Zhang, and S. Madden, “Schism: A workload-driven approach to database replication and partitioning,” Proc. VLDB Endow., vol. 3, no. 12, pp. 48–57, Sep. 2010.
 [12] J. Huang, R. Zhang, and J. X. Yu, “Scalable hypergraph learning and processing,” in Proc. of the IEEE Int. Conf. on Data Mining, 2015.
 [13] C. Zhang, F. Wei, Q. Liu, Z. G. Tang, and Z. Li, “Graph edge partitioning via neighborhood heuristic,” in Proc. of the 23rd ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining, 2017.
 [14] K. Andreev and H. Racke, “Balanced graph partitioning,” Theory of Computing Systems, vol. 39, no. 6, pp. 929–939, Nov 2006.
 [15] D. J. Watts and S. H. Strogatz, “Collective dynamics of small-world networks,” Nature, vol. 393, no. 6684, p. 440, 1998.
 [16] J. E. Gonzalez, Y. Low, H. Gu, D. Bickson, and C. Guestrin, “Powergraph: distributed graph-parallel computation on natural graphs.” in OSDI, 2012.
 [17] F. Petroni, L. Querzoni, K. Daudjee, S. Kamali, and G. Iacoboni, “Hdrf: Stream-based partitioning for power-law graphs,” in Proc. of the 24th ACM Int. on Conf. on Information and Knowledge Management, 2015.
 [18] R. Albert, H. Jeong, and A.-L. Barabási, “Error and attack tolerance of complex networks,” Nature, vol. 406, no. 6794, p. 378, 2000.
 [19] J. A. Hartigan and M. A. Wong, “Algorithm AS 136: A k-means clustering algorithm,” Journal of the Royal Statistical Society. Series C (Applied Statistics), vol. 28, no. 1, pp. 100–108, 1979.
 [20] P. S. Bradley and U. M. Fayyad, “Refining initial points for k-means clustering.” in ICML, vol. 98. Citeseer, 1998, pp. 91–99.
 [21] S. S. Khan and A. Ahmad, “Cluster center initialization algorithm for k-means clustering,” Pattern Recognition Letters, vol. 25, no. 11, pp. 1293–1302, 2004.
 [22] J. M. Pena, J. A. Lozano, and P. Larranaga, “An empirical comparison of four initialization methods for the k-means algorithm,” Pattern Recognition Letters, vol. 20, no. 10, pp. 1027–1040, 1999.
 [23] J. Kunegis, “Konect: the koblenz network collection,” in Proc. of the 22nd Int. Conf. on World Wide Web, 2013.
 [24] A. W. Richa, M. Mitzenmacher, and R. Sitaraman, “The power of two random choices: A survey of techniques and results,” Combinatorial Optimization, vol. 9, pp. 255–304, 2001.
 [25] G. Malewicz, M. H. Austern, A. J. Bik, J. C. Dehnert, I. Horn, N. Leiser, and G. Czajkowski, “Pregel: a system for large-scale graph processing,” in Proc. of the SIGMOD Int. Conf. on Management of data, 2010.
 [26] W. Yang, G. Wang, K.-K. R. Choo, and S. Chen, “Hepart: A balanced hypergraph partitioning algorithm for big data applications,” Future Generation Computer Systems, vol. 83, pp. 250–268, 2018.
 [27] F. McSherry, M. Isard, and D. G. Murray, “Scalability! but at what cost?” in Proc. of the 15th USENIX Conf. on Hot Topics in Operating Systems, 2015.
 [28] G. Karypis and V. Kumar, “A fast and high quality multilevel scheme for partitioning irregular graphs,” SIAM Journal on scientific Computing, vol. 20, no. 1, pp. 359–392, 1998.
 [29] ——, “A parallel algorithm for multilevel graph partitioning and sparse matrix ordering,” Journal of Parallel and Distributed Computing, vol. 48, no. 1, pp. 71–95, 1998.
 [30] C. Martella, D. Logothetis, A. Loukas, and G. Siganos, “Spinner: Scalable graph partitioning in the cloud,” in Proc. of the 33rd Int. Conf. on Data Engineering (ICDE), 2017.
 [31] C. Mayer, M. A. Tariq, R. Mayer, and K. Rothermel, “Graph: Traffic-aware graph processing,” IEEE Trans. on Parallel and Distributed Systems, vol. 29, no. 6, pp. 1289–1302, 2018.
 [32] C. Mayer, R. Mayer, M. A. Tariq, H. Geppert, L. Laich, L. Rieger, and K. Rothermel, “Adwise: Adaptive window-based streaming edge partitioning for high-speed graph processing,” in Proc. of the 38th Int. Conf. on Distributed Computing Systems (ICDCS), 2018.
 [33] B. Heintz, S. Singh, C. Tesdahl, and A. Chandra, “Mesh: A flexible distributed hypergraph processing system,” 2016.
 [34] J. E. Gonzalez, R. S. Xin, A. Dave, D. Crankshaw, M. J. Franklin, and I. Stoica, “Graphx: Graph processing in a distributed dataflow framework.” in OSDI, 2014.