I. Introduction
Graphs are remarkably capable of capturing complex data dependencies present in various real-world domains, including biological networks, web graphs, social networks, and transportation routes, to mention just a few. Extracting useful information and gleaning insights from such graphs (a process denoted as graph analytics) is becoming central to our modern life. For instance, Facebook continually mines gigantic social networks to determine shared connections, detect communities, and propagate advertisements for massive numbers of users. Other common graph analytics applications include scene reconstruction from image collections and topical news recommendations from microblogging services (e.g., Twitter).
The scale of the graphs generated by such applications is significantly increasing, yielding graphs with millions of vertices and edges referred to as big graphs, which typically cannot be mined efficiently on a single machine [1]. Consequently, for efficient execution of such applications, large-scale clusters are usually needed [2, 3]. Cloud computing services like Amazon EC2 [4], Microsoft Azure [5], and Google App Engine [6] offer unprecedented levels of on-demand access to computing and storage resources, thereby allowing large-scale graph analytics to be effectively pursued.
In addition to resource requirements, efficient graph analytics necessitates platforms tailored specifically for big graphs. The current graph analytics platforms can be examined from different angles. More precisely, in terms of computing needs they can be divided into three types: 1) single-node non-scalable solutions such as Gunrock [7], Ligra [8], Polymer [9], and Galois [10], 2) single-node out-of-core solutions such as GraphChi [11], X-Stream [12], GridGraph [13], and Mosaic [14], and 3) distributed solutions such as LA3 [1], Apache Giraph [15], Google Pregel [16], GraphLab [17], PowerGraph [3], PowerLyra [18], and PowerSwitch [19].
Graph analytics platforms can also be categorized based on programming models, namely: 1) vertex-centric as in Pregel [16] and Giraph [15], 2) edge-centric as in X-Stream [12], 3) subgraph-centric as in GoFFish [20], and 4) graph-centric as in Giraph++ [21]. In addition, they can be classified in terms of two major execution models, namely: 1) synchronous model, where vertices progress in a lockstep fashion such as in Giraph [15], and 2) asynchronous model, where vertices can change values anytime and be several steps apart during execution such as in PowerGraph [3].

Distributed graph analytics platforms, irrespective of their programming and execution models, necessitate partitioning input graphs. A popular technique to partition graphs is referred to as balanced graph partitioning, which tries to divide any input graph into a set of roughly equal subgraphs (or partitions), while reducing edge cuts among pairs of partitions so as to minimize overall communication cost. Balanced graph partitioning is NP-hard [22]; however, it can be approached effectively using a vertex-centric approach. To exemplify, Ja-Be-Ja [23], Fennel [24], and Spinner [25] suggest vertex-centric algorithms for balanced graph partitioning.
In this paper, we propose Revolver, an asynchronous single-node graph partitioning algorithm, which adopts a vertex-centric view of graphs. The vertex-centric approach enables Revolver to partition any graph in a parallel fashion. In addition, the asynchronous execution model allows it to incorporate the most recent partitioning results into the ongoing computations and, subsequently, produce load-balanced partitions. Furthermore, asynchrony permits it to skip the strict barrier requirements of the synchronous execution model (e.g., the famous BSP framework [26]), thus converging quickly.
At its core, Revolver utilizes Label Propagation (LP) [27] to train Learning Automata (LA) [28] for graph partitioning. LA is a subclass of Reinforcement Learning (RL), which focuses on training autonomous agents in an interactive environment in order to optimize cumulative rewards. LA have applications in evolutionary optimization [29], Cloud and Grid computing [30, 31, 32], social networks [33], image processing [34], and data clustering [35], to mention a few.
In addition to using LA, Revolver introduces: 1) a new, highly accurate normalized LP and 2) a weighted learning automaton that particularly suits graph workloads; moreover, 3) it produces highly balanced partitions compared to existing graph partitioners.
The paper is organized as follows. We formally define the graph partitioning problem in Section II. A background on LP and LA algorithms is provided in Section III. In Section IV, our weighted learning automaton and partitioning algorithm are presented. The evaluation methodology and results are presented in Section V. Finally, we conclude in Section VI.
II. Problem Definition
To begin with, let us assume a directed graph G = (V, E), where V is the set of vertices and E is the set of outgoing (directed) edges. An edge-centric k-way graph partitioning algorithm divides G into k distinct partitions of almost equal size. Consequently, a balanced partitioning assignment can be defined as:

(1)  |E_ℓ| ≤ ε · |E| / k,  ∀ ℓ ∈ {1, …, k}

where E_ℓ is the set of outgoing edges that belongs to V_ℓ, with V_ℓ being the subset of vertices assigned to the ℓ-th partition, and ε is the imbalance ratio. Also, to guarantee having non-empty partitions, ε should satisfy the following inequality:

(2)  1 ≤ ε < k
Upon running a graph application (e.g., PageRank) in a distributed fashion, the biggest partition bounds the amount of computation, and the number of inter-partition edges bounds the amount of communication under each processing step. Hence, a balanced partitioning of a graph workload potentially lowers the runtime of any distributed graph analytics platform via imposing near-uniform utilization across machines while reducing communication. Examples of balanced partitioners are Kernighan-Lin [36], Spectral partitioning [37], and Metis [38]. Besides, Ja-Be-Ja [23] (a local search partitioner), Fennel [24] (a streaming balanced partitioner), and Spinner [25] (an LP-based partitioner, which is currently deemed the state-of-the-art) are three vertex-centric balanced partitioners.
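To make the balance constraint concrete, here is a minimal Python sketch of the check implied by (1); the function and argument names are our own illustration, not part of Revolver:

```python
def is_balanced(partition_edge_counts, total_edges, k, eps):
    """Check the balanced k-way condition: every partition's edge
    count must stay within eps * |E| / k, with eps the imbalance ratio."""
    capacity = eps * total_edges / k
    return all(load <= capacity for load in partition_edge_counts)
```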
III. Background
III-A. Label propagation-based graph partitioning
LP [39] is an iterative semi-supervised machine learning algorithm that infers unlabeled data from available labeled data. It has been used for community detection [27] and balanced graph partitioning [25]. Spinner [25] uses LP to solve k-way graph partitioning via producing k scores, one per partition, for every vertex. Afterwards, it chooses for each vertex the partition with the maximum score as a candidate partition. Each vertex will then migrate to its candidate partition only if the specified balance across partitions is not impacted. To elaborate, the scoring function of Spinner [25] is as follows:

(3)  score(v, ℓ) = Σ_{u ∈ N(v)} w(u, v) · δ(α(u), ℓ) − π(ℓ)

(4)  w(u, v) = 1 if exactly one of (u, v), (v, u) ∈ E, and w(u, v) = 2 if both

(5)  π(ℓ) = b_ℓ / C

where in (3), an edge is a pair (u, v) with u, v ∈ V, N(v) is the neighborhood of vertex v, δ is the Kronecker delta where δ(i, j) = 1 if i = j and 0 otherwise, α is a labeling function α: V → {1, …, k} such that α(v) is the partition label of v, w is the weighing function computed in (4), π is a penalty function with b_ℓ being the load of partition ℓ, and C is the partition capacity.
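As a hedged illustration of Spinner's scheme (the exact symbols were lost in extraction), the unnormalized score and the migration probability can be sketched in Python; all names are ours, and unit edge weights are assumed:

```python
def spinner_score(neighbor_labels, penalty, label):
    # Unnormalized LP score: neighbors already in `label`, minus the
    # load-based penalty of that partition (unit edge weights assumed).
    return sum(1 for l in neighbor_labels if l == label) - penalty[label]

def migration_probability(remaining_capacity, demanded_edges):
    # Probability that a candidate vertex may migrate: the target
    # partition's remaining capacity over the total demanded edges.
    if demanded_edges == 0:
        return 1.0
    return min(1.0, remaining_capacity / demanded_edges)
```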
For each vertex v in Spinner, the partition with the maximum score is considered as v's candidate partition (say, ℓ), and v may migrate to ℓ only if the probability of migration to ℓ is greater than a random number generated between 0 and 1. More precisely, the probability of migration of v to ℓ is calculated based on ℓ's remaining capacity divided by the number of candidate edges in ℓ, i.e., the total degree of the set of vertices that want to migrate to ℓ.

III-B. Mathematical framework of learning automata
A learning automaton [28] is a probabilistic decision-making algorithm that belongs to the family of Reinforcement Learning (RL). It draws an action using its probability distribution and applies it to the environment. By taking a sequence of actions and receiving reactions, the learning automaton learns an optimal action. The common type of learning automaton is the variable-structure learning automaton, which is defined using the quadruple ⟨α, p, β, T⟩, where n is the learning step and: 1) α = {α_1, …, α_r} is the set of actions with r being the number of actions, 2) p(n) = {p_1(n), …, p_r(n)} such that Σ_i p_i(n) = 1 is the probability vector, where the action taken in step n is chosen using a roulette wheel [40], 3) β is the reinforcement signal, and 4) T is the linear learning algorithm where p(n + 1) = T(α(n), p(n), β(n)) in step n.

In step n of T, if action α_i receives a reward signal, probability vector p is updated as follows:
(6)  p_i(n + 1) = p_i(n) + a · (1 − p_i(n)),  and  p_j(n + 1) = (1 − a) · p_j(n)  ∀ j ≠ i
Otherwise, if action α_i receives a penalty signal, probability vector p is updated as follows:
(7)  p_i(n + 1) = (1 − b) · p_i(n),  and  p_j(n + 1) = b / (r − 1) + (1 − b) · p_j(n)  ∀ j ≠ i
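As an illustration, the linear reward-penalty updates in (6) and (7) can be written in Python; this is the standard textbook scheme, with a and b the reward and penalty parameters:

```python
def la_update(p, i, reward, a=0.1, b=0.1):
    # Linear reward-penalty update for a variable-structure learning
    # automaton; p is the probability vector, i the chosen action.
    r = len(p)
    q = []
    for j in range(r):
        if reward:
            q.append(p[j] + a * (1 - p[j]) if j == i else (1 - a) * p[j])
        else:
            q.append((1 - b) * p[j] if j == i else b / (r - 1) + (1 - b) * p[j])
    return q
```

Note that both branches preserve the probability mass: the vector still sums to 1 after an update.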
IV. Revolver
Revolver is an application of Reinforcement Learning (RL) to graph partitioning [41]. It uses weighted learning automata (Section IV-A) to partition a graph. In Revolver, a new normalized Label Propagation (LP) algorithm (Section IV-B) is extrapolated to form an objective function, which produces weights that express the quality of the partitions assigned by Learning Automata (LA). Moreover, Revolver partitions the graph in a vertex-centric manner. In particular, each vertex pulls information from its neighboring vertices to calculate a score for each partition, before pushing the calculated scores (as weights) back to them. Subsequently, reinforcement signals are computed at each vertex based on the accumulated weights gathered from all the vertex's neighbors. Finally, the probabilities of actions associated with partitions are updated using weights and reinforcement signals accordingly.
In the following subsections, we introduce our extension of LA for graph partitioning, namely, weighted LA. Afterwards, we introduce our new normalized LP formulas, before showing how graph partitioning can be solved using LA. Lastly, we discuss how LP is used to train LA for graph partitioning.
IV-A. Weighted Learning Automata
A typical learning automaton uses (6) or (7) for updating its probability vector. However, in a complex environment such as multi-million-node graphs with a large number of actions (or partitions in a partitioning problem), this updating strategy tends to fail for two reasons: 1) at any given step, (6) or (7) can reinforce only one of the actions, meaning in a conventional learning automaton there is only one reward signal and the rest are penalty signals, and 2) given that the initial probability vector is initialized with 1/r, for a large number of actions this value will converge to zero (i.e., lim_{r→∞} 1/r = 0), which means a considerable amount of time will be required to reach a consensus on a single action.
The above limitations motivate us to propose the weighted learning automaton, which is able to distribute the reinforcement signals among the entire set of actions rather than concentrating them on a single action. Consequently, a weight vector W is added to support the weighted probability updates for reward and penalty signals. Therefore, while updating the i-th element of the probability vector using its weight and reinforcement signal, the rest of the elements should also be updated using their weights and the negation of the reinforcement signal.
Thus, the weighted learning automaton can be defined using the quintuple ⟨α, p, β, W, T⟩, where n is the learning step and: α, p, and β are the same as before (see Section III-B), W(n) = {w_1(n), …, w_r(n)} is the set of weights for reward and penalty signals (more on this shortly), and T is the linear learning algorithm, where p(n + 1) = T(α(n), p(n), β(n), W(n)) in step n.
In step n of T, if action α_i receives a reward signal, probability vector p is updated as follows:
(8)  p_i(n + 1) = p_i(n) + a · w_i(n) · (1 − p_i(n)),  and  p_j(n + 1) = (1 − a · w_j(n)) · p_j(n)  ∀ j ≠ i
Otherwise, if action α_i receives a penalty signal, probability vector p is updated as follows:
(9)  p_i(n + 1) = (1 − b · w_i(n)) · p_i(n),  and  p_j(n + 1) = b · w_j(n) / (r − 1) + (1 − b · w_j(n)) · p_j(n)  ∀ j ≠ i
where a and b are the reward and penalty parameters, respectively, and W contains weights for the reward and penalty reinforcement signals. To guarantee the correctness of the calculations, the sum of the weights for reward signals and the sum of the weights for penalty signals should both be 1. As such, compared to the original learning automaton, where the probability vector is updated via multiple passes of (6) or (7) (having only one reward reinforcement signal), in the weighted learning automaton, (8) or (9) is executed r times in total (once for each action) so as to apply different reward or penalty reinforcement signals and keep the sum of probabilities equal to 1.
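Since the exact weighted update was partially lost in extraction, the following Python sketch illustrates one plausible reading of the description above: each action carries its own reward or penalty signal scaled by its weight, followed by a renormalization so that probabilities sum to 1. All names, and the explicit renormalization, are our own.

```python
def weighted_la_update(p, w, beta, a=1.0, b=0.1):
    # Weighted LA update (sketch): every action j carries its own
    # reinforcement signal beta[j] (True = reward) scaled by weight w[j];
    # a final renormalization keeps the probabilities summing to 1.
    q = list(p)
    for j in range(len(p)):
        if beta[j]:
            q[j] = q[j] + a * w[j] * (1 - q[j])   # weighted reward, cf. (8)
        else:
            q[j] = (1 - b * w[j]) * q[j]          # weighted penalty, cf. (9)
    s = sum(q)
    return [x / s for x in q]
```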
IV-B. Normalized k-way label propagation for graph partitioning
In a multi-term LP, a dominant term easily causes huge variations in score computations. Normalizing the terms is a solution to this problem. Thus, we propose a normalized LP, which consists of a normalized weighing term and a normalized penalty term, defined as follows:
(10)  score(v, ℓ) = ŵ(v, ℓ) − π̂(ℓ)

(11)  ŵ(v, ℓ) = Σ_{u ∈ N(v)} w(u, v) · δ(α(u), ℓ) / Σ_{u ∈ N(v)} w(u, v)

(12)  π̂(ℓ) = b_ℓ / Σ_{j=1}^{k} b_j
where in (10), a score is produced for the ℓ-th partition of vertex v, the weighing term ŵ is normalized based on the total weight of v's neighborhood N(v), the penalty term π̂ (which produces penalties for partitions) is normalized based on the total load of the system (note that if there is a negative penalty, penalties are augmented with respect to the minimum negative value before normalization), δ is the Kronecker delta, and w and α are defined in Section III-A.
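A small Python sketch of the normalized scoring idea, assuming unit edge weights; the function and variable names are illustrative:

```python
def normalized_lp_score(neighbor_labels, loads, label):
    # Normalized LP score: the weighing term is normalized by the
    # neighborhood size, the penalty term by the total system load.
    w_hat = (sum(1 for l in neighbor_labels if l == label) / len(neighbor_labels)
             if neighbor_labels else 0.0)
    total_load = sum(loads.values())
    pi_hat = loads[label] / total_load if total_load else 0.0
    return w_hat - pi_hat
```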
IV-C. How to use reinforcement learning for graph partitioning
To partition a graph G into k disjoint partitions, a network of LA is utilized, where a learning automaton is associated with each vertex and the available range of partitions constitutes the action set of the learning automaton. The mapping is shown in Figure 1, where a network of LA is created from a hypothetical graph G.
The way LA is laid out to solve a partitioning problem goes as follows: 1) a network of LA is created with one LA per vertex in G, 2) a learning automaton can find neighboring LA using the subset of edges that belongs to its vertex, 3) in each step, the network of LA determines the partitions for vertices in parallel; the action set of a learning automaton is the same as the range of available partitions, and the probability of actions is initialized to 1/k, 4) scores that are generated from multiple passes of (10) are evaluated by (13) to form the weight vector W, 5) the reinforcement signal is, subsequently, constructed from the weight vector values and is used to measure the merit of the current partitioning configuration, alongside giving LA feedback via updating probabilities using (8) and (9), and finally 6) LA will learn how to partition the graph by taking a series of actions and receiving reinforcement signals.
IV-D. How to use label propagation to train a reinforcement learning algorithm
IV-D1. LA action selection
Each step of Revolver starts with LA taking actions to determine the partitions of vertices locally (see Figure 2). Each vertex has an analogous learning automaton in the network of LA with an action set equal to the available partitions. The LA determines the candidate partition for the vertex using a roulette wheel populated by its probability vector. Actions with larger probabilities will have a higher chance of being selected by the automaton.
IV-D2. Calculating vertex migration probability
After LA take actions and select candidate partitions for vertices, the remaining load of a partition is calculated using the edges currently assigned to it, and the demanded load is computed using the number of candidate edges targeting it (see Section III-A). Lastly, to calculate the probability for a vertex to migrate to a candidate partition, we simply divide the remaining load by the demanded load.
IV-D3. Computing vertex score
The normalized LP in (10) is used to calculate a score for each partition of a vertex based on the partitions of the vertex's neighboring vertices and the current load of the partition (see Figure 2). Contrary to other vertex-centric graph partitioning algorithms like Spinner [25], which migrates a vertex to the partition with the maximum score, Revolver extracts the index of the maximum score (i.e., the argmax over the partition scores) for future training of LA (see Figure 2).
IV-D4. Executing vertex migration
The decision for migrating a vertex to a new partition is made by comparing the selected action against the current partition. If the two differ, a random number is generated and compared against the migration probability of the candidate partition to determine whether to move the vertex to the new partition or not (see Figure 2).
IV-D5. Evaluating the objective function
A vertex receives the maximum score label of its neighbors, and its learning automaton updates its corresponding weight vector as follows (see Figure 2):
(13) 
where in (13), w_ℓ is the ℓ-th element of the weight vector belonging to vertex v, LA(u) extracts the partition label assigned to vertex u by its learning automaton (i.e., LA(u) = ℓ if the learning automaton assigns partition ℓ to vertex u), δ is the Kronecker delta, and w and α are defined in Section III-A.
In (13), the learning automaton associated with v will receive a reinforcement signal proportional to the score in two cases: 1) if the selected action of v is equal to the maximum score label, or 2) if the migration probability of the selected action (or partition) is positive. Hence, these two cases try to reinforce actions associated with the highest score while preserving the balance by taking into account the probability of migration.
IV-D6. Constructing the reinforcement signals
The weight vector W is populated by the vertices belonging to the neighborhood. This vector shows the quality of partitions, where higher weights represent the more promising partitions. To differentiate between favorable and unfavorable partitions while constructing the reinforcement signal β, we divide W into two parts using its mean. Specifically, if a weight is larger than the mean of the weights, its action receives a reward signal; otherwise, it receives a penalty signal (see Figure 2).
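The mean-split described above can be sketched as follows (illustrative names):

```python
def build_reinforcement_signals(weights):
    # Split the weight vector by its mean: partitions whose weight
    # exceeds the mean get a reward signal (True), the rest a penalty.
    mean = sum(weights) / len(weights)
    return [w > mean for w in weights]
```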
IV-D7. Updating learning automata probability vector
IV-D8. Updating remaining capacity
Calculating the remaining capacity is a simple subtraction of the current load of a partition from the partition capacity at the end of each step.
IV-D9. Checking convergence
Finally, Revolver halts if the score has not improved for a specified number of consecutive steps, i.e., if the score difference between steps stays below a minimum threshold.
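The halting criterion can be sketched as follows; the defaults mirror the experimental settings reported later (a 5-step window and a 0.001 minimum score difference), and the function name is ours:

```python
def has_converged(score_history, window=5, min_delta=0.001):
    # Halt when the score has improved by less than min_delta for
    # `window` consecutive steps.
    if len(score_history) <= window:
        return False
    recent = score_history[-(window + 1):]
    return all(recent[i + 1] - recent[i] < min_delta for i in range(window))
```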
V. Experiments
V-A. Experimental Environment
We conducted our experiments on a cluster of 96 nodes (we implemented two versions of Revolver, a distributed one and a parallel one, as noted shortly), each with 64 GB of RAM and a 28-core Intel Xeon E5-2690 CPU running at 2.60 GHz (Broadwell). The operating system of each node is Red Hat Enterprise Linux Server 7.3 (Maipo) with Linux kernel 3.10.0. Communication between nodes is achieved using Intel Omni-Path with a channel speed of 100 Gb/s.
V-B. Datasets
Table I reports the selected graphs along with their numbers of vertices and edges, densities, and skewnesses. In Table I, the density of a graph is calculated using |E| / |V|² (reported in units of 10⁻⁵), and Pearson's 1st skewness coefficient is computed using (mean − mode) / σ, where the mean, mode, and standard deviation σ are taken over the out-degree distribution. Negative or positive values of the skewness illustrate to what degree a graph is skewed (i.e., whether toward the left or right of the out-degree distribution). Of the nine graphs, SO [42] and EU [43] are almost skew-free, USA [44] is uniquely left-skewed, and the rest are right-skewed, whereby they follow a power-law distribution. Note that we use different graphs with varied degrees of skewness so as to comprehensively demonstrate the performance of Revolver.

Graph                        |V|      |E|      Density  Skew
Wiki-topcats (WIKI) [42]     1.79M    28.51M   0.88     +0.35
UK-2007@1M (UK) [43]         1.00M    41.24M   4.12     +0.81
USA-road (USA) [44]          23.9M    58.33M   0.01     −0.59
Stackoverflow (SO) [42]      2.60M    63.49M   0.93     +0.08
LiveJournal (LJ) [42]        4.84M    68.99M   0.29     +0.36
ENwiki-2013 (EN) [43]        4.20M    101.3M   0.57     +0.35
Orkut (OK) [42]              3.07M    117.1M   1.24     +0.29
Hollywood (HLWD) [43]        2.18M    228.9M   4.81     +0.32
EU-2015-host (EU) [43]       11.2M    386.9M   0.30     +0.07
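The two dataset statistics can be computed as follows; we assume density is |E| / |V|², scaled by 10⁵ to match the magnitudes reported in Table I:

```python
import statistics

def density(num_vertices, num_edges):
    # Graph density |E| / |V|^2 (scaled by 1e5, matching Table I's magnitudes).
    return num_edges / num_vertices ** 2 * 1e5

def pearson_skewness(out_degrees):
    # Pearson's first skewness coefficient: (mean - mode) / stdev.
    mean = statistics.mean(out_degrees)
    mode = statistics.mode(out_degrees)
    std = statistics.pstdev(out_degrees)
    return (mean - mode) / std
```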
V-C. Implementation Details
We implemented two versions of Revolver, a multithreaded asynchronous one in C/C++ and a synchronous one in Giraph [15]. To encourage reproducibility and extensibility, we made them both open-source (Revolver's code is available at: https://github.com/hmofrad/revolver). However, we report the results of the asynchronous version for two reasons: 1) Revolver can benefit from incremental changes in partitions offered by the asynchronous computation model during the partitioning process, and 2) Revolver's C/C++ implementation efficiently balances the vertices among working threads via allocating each subset of vertices to a separate thread. In particular, the vertices of every graph are divided into chunks of size |V| / t (with t being the number of threads), and each chunk is assigned a separate thread on a separate core.
V-D. Algorithms
We compared Revolver against a number of partitioning algorithms: 1) Spinner [25], a vertex-centric graph partitioning algorithm, which uses an LP scoring function to determine a suitable partition for every vertex, 2) Hash partitioning, where hash(id) mod k is used to hash a vertex with numerical id to its designated partition, and 3) Range partitioning, where consecutive ranges of vertex ids are mapped to consecutive partitions.
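For reference, the two baselines reduce to one-liners; this sketch assumes integer vertex ids and the standard formulations (hash(id) mod k and a range mapping), since the exact formulas were lost in extraction:

```python
def hash_partition(vertex_id, k):
    # Hash partitioning: spread ids across partitions via hash(id) mod k.
    return hash(vertex_id) % k

def range_partition(vertex_id, num_vertices, k):
    # Range partitioning: map consecutive id ranges to consecutive partitions.
    return min(vertex_id * k // num_vertices, k - 1)
```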
V-E. Performance Metrics
To demonstrate the quality of partitioning, we borrowed two metrics from [25], namely: 1) local edges, which is the number of edges with both ends at the same partition divided by the total number of edges, and 2) max normalized load, defined as Max Load / Expected Load, where Max Load is the number of edges assigned to the highest loaded partition (see Section II) and Expected Load is |E| / k. Also, we note that the fraction of edge cuts is equal to 1 minus the fraction of local edges. Clearly, with a set of machines assuming a one-to-one mapping from nodes to partitions, these metrics illustrate the degree at which a given workload can uniformly harness the available computing resources and stress the communication medium at run time. To this end, we indicate that the execution time of the partitioning algorithm as a performance metric is less informative in this context, since a faster runtime does not necessarily lead to better partitions.
Communication becomes a bottleneck if an algorithm needs to perform a massive amount of message passing after each execution step. To this end, the metrics local edges and edge cuts represent the amount of intra-partition and inter-partition interactions, respectively, and assess the degree to which an application may require sending/receiving internal or external messages. Furthermore, for an iterative application, the runtime of a single step of execution is bounded by the computation done at the highest loaded machine, which adds up to the latency of the system as well. As such, the metric max normalized load captures the extent to which the computation time is affected by the highest loaded partition.
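Both metrics can be computed directly from a partition assignment; a sketch with illustrative names:

```python
def local_edges_ratio(edges, assignment):
    # Fraction of edges whose two endpoints land in the same partition.
    local = sum(1 for u, v in edges if assignment[u] == assignment[v])
    return local / len(edges)

def max_normalized_load(partition_loads, k):
    # Max Load divided by Expected Load (= |E| / k).
    expected = sum(partition_loads) / k
    return max(partition_loads) / expected
```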
V-F. Experimental Settings
In Figure 3, we report the average local edges and max normalized load for 10 individual runs of each algorithm across different numbers of partitions: 2, 4, 8, 16, 32, 64, 128, 192, and 256. The results shown for Spinner are collected using Spinner's original implementation in Giraph [25]. Moreover, for a fair comparison, the same parameter settings as in [25] are used to run Revolver and Spinner. Specifically, we set the max number of steps to 290, the max number of consecutive iterations for halting to 5, the min halting score difference to 0.001, and the imbalance ratio to 0.05. In addition, the LA reward and penalty parameters a and b are set to 1 and 0.1, respectively. We note that the max normalized load for Range partitioning is markedly worse (e.g., Revolver achieves a 60-fold improvement compared to Range with 256 partitions on the EU graph); hence, we removed it from Figure 3 (except for the USA graph) to avoid distorting the plot scale.
V-G. Analysis of Local Edges
We now discuss the results shown in Figure 3. To start with, the clustered bars refer to local edges (left axis) and the lines denote max normalized load (right axis).
V-G1. Right-skewed graphs
WIKI, OK, LJ, EN, and HLWD are right-skewed graphs (see Table I). Compared to Spinner, Hash, and Range, for WIKI (Figure 3A) and OK (Figure 3G), Revolver produces the best local edges with 2–64 partitions. Furthermore, it achieves the best local edges for LJ (Figure 3F), while maintaining a 5% improvement versus Spinner across all partitions. It also provides the best local edges for EN (Figure 3E) with 2–16 partitions. Lastly, compared to Spinner and Hash, Revolver produces the best local edges for HLWD (Figure 3H) with 2–128 partitions (almost 5% improvement). In conclusion, Revolver's adaptive strategy makes it a decent choice for partitioning right-skewed graphs, especially under smaller numbers of partitions.
V-G2. Highly right-skewed graphs
UK is a highly right-skewed graph. Range produces the best local edges for UK, while Revolver accomplishes better max normalized load on this graph (Figure 3B). The right skewness of UK (Pearson's first skewness coefficient = +0.81) indicates that the mean of the out-degree distribution is far greater than its mode, entailing that most vertices have degrees less than the mean. Range partitions the graph based on ranges of vertices and can exploit this feature of UK. Also, in comparing Revolver against Spinner, Revolver achieves better results with 256 partitions.
V-G3. Skew-free graphs
For skew-free graphs like SO and EU, Revolver produces the best local edges for SO (Figure 3D) with 4–16 and 128–256 partitions, while Range provides the best local edges for EU (Figure 3I). Compared to EU, SO is a denser graph (see Table I), and Revolver can produce better localized partitions for almost any number of partitions, which shows it can effectively partition this type of graph, independent of the number of partitions.
V-G4. Left-skewed graphs
For left-skewed graphs like USA (Figure 3C), Range produces better local edges and a comparable max normalized load against Revolver. USA is a highly left-skewed graph (Pearson's first skewness coefficient = −0.59) (see Table I), which implies that out-degree edges are evenly distributed across vertices. Clearly, Range perfectly benefits from this characteristic and, accordingly, achieves superior results.
V-G5. Impact of graph density and skewness on partitioning
To summarize the results of local edges, Revolver effectively partitioned both right-skewed and skew-free graphs because its partitioning strategy is not highly dependent on the way edges are distributed among vertices. Range exclusively excelled on highly right-skewed, dense skew-free, and highly left-skewed graphs where edges are distributed evenly among vertices. For these kinds of graphs, Range's partitioning strategy simply extracts partitions from ranges of consecutive vertices while enjoying a balanced distribution of out-degree edges.
V-H. Analysis of Max Normalized Load
V-H1. Trade-off between local edges and max normalized load
From Figure 3A all the way to Figure 3I (lines), Revolver always produces significantly better max normalized load compared to the other algorithms, irrespective of the type of the graph. Unlike Spinner, Revolver's normalized LP does not allow the penalty function to vary the score independently and, subsequently, create unbalanced partitions. Revolver and Spinner leverage a 5% imbalance ratio, yet the largest partition produced by Spinner is always bigger than the allowed extra capacity. Evidently, this explains why Spinner accomplishes better local edges (i.e., because larger partitions will have more local edges). On the other hand, Hash produces comparably balanced partitions for these graphs with 2–64 partitions, while it always generates the worst local edges. Lastly, although Range provides the best local edges for the UK, USA, and EU graphs, it achieves the worst max normalized load among all graphs except USA (e.g., a max normalized load 1.6–60 times worse than Revolver for 2–256 partitions on EU). Range is highly dependent on the way edges are distributed among vertices. The reason why Range outperforms the other algorithms on USA is that USA is a sparse, left-skewed graph with edges laid out evenly across vertices.
V-H2. The impact of asynchronous processing
Since Revolver adopts an asynchronous computational model, the process of computing the scores of partitions and migrating a vertex to a new partition is executed on-the-fly, whereby the loads of the source and destination partitions are exchanged progressively. This relaxes the migration condition and enables Revolver to attain better max normalized load via utilizing the most recent changes in the partitioning configuration. Compared to Spinner, which is implemented synchronously, the asynchronous model of Revolver has a significant impact on the load distribution, as shown in Figures 3A–3I (e.g., up to 28% improvement in max normalized load on EU).
V-I. The Scalability Feature of Learning Automata

In a partitioning problem, as the number of partitions increases, the complexity of the problem grows as well. This is an inevitable outcome of the curse of dimensionality. In LA, as the number of actions increases, the initial probabilities decrease, which makes it harder to find optimal actions. Our weighted LA (see (8) and (9)) is designed to account for any increase in the dimensionality of the problem by having a weight for each element in the probability vector (which separates probability updates from the space complexity of the problem). Consequently, our weighted updating strategy guarantees a fair distribution of probabilities among the elements of the probability vector. This unique feature makes LA scalable and resistant to increases in the number of partitions.

V-J. Convergence Characteristics of Revolver
As Revolver and Spinner have clear advantages over Range and Hash (Sections V-G and V-H), in Figure 4, we draw the convergence characteristics of local edges and max normalized load (left and right y-axes) of Revolver and Spinner over LJ with 8 partitions (other graphs show a similar pattern to Figure 4 and are not shown due to space limitations).
Figure 4 demonstrates an interesting observation about local edges. Specifically, Spinner's local edges become almost fixed after step 100, while Revolver keeps increasing local edges up to the end (step 300). This clearly indicates the strength of Revolver's adaptive strategy, which continuously allows reaching a consensus and does not get trapped in a local minimum. On the flip side, the greedy strategy of Spinner gets it trapped early on during execution. Moreover, there is a 5% difference between the local edges produced by Revolver and Spinner, which further illustrates Revolver's superiority. In addition, Spinner stops increasing local edges when it fully utilizes the 5% extra capacity, while Revolver continues enhancing local edges, even without exhausting the entire available extra capacity.
Comparing local edges and max normalized load, when Revolver hits the plateau of local edges after 30 steps, it harnesses up to 2% extra capacity, whereas Spinner's local edges only improve when more extra capacity is used (the two middle lines of Figure 4 show this, where Spinner's local edges improve as a function of max normalized load).
Figure 4 also illuminates another pattern. In particular, Spinner tends to utilize its entire extra capacity in the first 75 steps, while Revolver barely consumes up to 2% extra capacity during the whole run. The huge gap between Revolver's and Spinner's max normalized load is due to the fact that the asynchronous computation model of Revolver helps LA create balanced partitions while utilizing significantly less extra capacity. In contrast, Spinner fails to achieve balanced partitions because of its strict synchronous model.
VI. Conclusions
In this work, we proposed Revolver, an asynchronous reinforcement learning algorithm capable of partitioning multi-million-node graphs. In Revolver, each vertex is assigned to an independent learning automaton to determine the corresponding suitable partition. In addition, a normalized label propagation algorithm is incorporated to assess partitioning results and provide feedback to learning automata. Experimental results show that Revolver can provide locality-preserving partitions without sacrificing load balance.
VII. Acknowledgments
This publication was made possible by NPRP grant #713302483 from the Qatar National Research Fund (a member of Qatar Foundation). This research was supported in part by the University of Pittsburgh Center for Research Computing through the resources provided.
References
 [1] Y. Ahmad, O. Khattab, A. Malik, A. Musleh, M. Hammoud, M. Kutlu, M. Shehata, and T. Elsayed, “LA3: a scalable link- and locality-aware linear algebra-based graph analytics system,” Proceedings of the VLDB Endowment, vol. 11, no. 8, pp. 920–933, 2018.
 [2] M. Redekopp, Y. Simmhan, and V. K. Prasanna, “Performance analysis of vertex centric graph algorithms on the azure cloud platform,” in Workshop on Parallel Algorithms and Software for Analysis of Massive Graphs. Citeseer, 2011.
 [3] J. E. Gonzalez, Y. Low, H. Gu, D. Bickson, and C. Guestrin, “PowerGraph: Distributed graph-parallel computation on natural graphs,” in OSDI, vol. 12, no. 1, 2012, p. 2.
 [4] Amazon.com, Inc. (2018) Amazon elastic compute cloud (amazon ec2). [Online]. Available: https://aws.amazon.com/ec2/
 [5] Microsoft Corporation. (2018) Microsoft azure cloud services. [Online]. Available: https://azure.microsoft.com/
 [6] Google LLC. (2018) Google app engine platform. [Online]. Available: https://cloud.google.com/appengine/
 [7] Y. Wang, A. Davidson, Y. Pan, Y. Wu, A. Riffel, and J. D. Owens, “Gunrock: A high-performance graph processing library on the GPU,” in PPoPP, 2016, p. 11.
 [8] J. Shun and G. E. Blelloch, “Ligra: a lightweight graph processing framework for shared memory,” in ACM Sigplan Notices, vol. 48, no. 8, 2013, pp. 135–146.
 [9] K. Zhang, R. Chen, and H. Chen, “NUMA-aware graph-structured analytics,” in ACM SIGPLAN Notices, vol. 50, no. 8, 2015, pp. 183–193.
 [10] D. Nguyen, A. Lenharth, and K. Pingali, “A lightweight infrastructure for graph analytics,” in 24th ACM SOSP, 2013, pp. 456–471.
 [11] A. Kyrola, G. E. Blelloch, C. Guestrin et al., “GraphChi: Large-scale graph computation on just a PC,” in OSDI, vol. 12, 2012, pp. 31–46.
 [12] A. Roy, I. Mihailovic, and W. Zwaenepoel, “X-Stream: Edge-centric graph processing using streaming partitions,” in SOSP, 2013, p. 472.
 [13] X. Zhu, W. Han, and W. Chen, “GridGraph: Large-scale graph processing on a single machine using 2-level hierarchical partitioning,” in USENIX Annual Technical Conference, 2015, pp. 375–386.
 [14] S. Maass, C. Min, S. Kashyap, W. Kang, M. Kumar, and T. Kim, “Mosaic: Processing a trillion-edge graph on a single machine,” in 12th ACM EuroSys, 2017, pp. 527–543.
 [15] A. Ching, S. Edunov, M. Kabiljo, D. Logothetis, and S. Muthukrishnan, “One trillion edges: Graph processing at Facebook-scale,” Proceedings of the VLDB Endowment, vol. 8, no. 12, pp. 1804–1815, 2015.
 [16] G. Malewicz, M. H. Austern, A. J. Bik, J. C. Dehnert, I. Horn, N. Leiser, and G. Czajkowski, “Pregel: a system for large-scale graph processing,” in 2010 ACM SIGMOD International Conference on Management of Data, 2010, pp. 135–146.
 [17] Y. Low, D. Bickson, J. Gonzalez, C. Guestrin, A. Kyrola, and J. M. Hellerstein, “Distributed GraphLab: a framework for machine learning and data mining in the cloud,” Proceedings of the VLDB Endowment, vol. 5, no. 8, pp. 716–727, 2012.
 [18] R. Chen, J. Shi, Y. Chen, and H. Chen, “PowerLyra: Differentiated graph computation and partitioning on skewed graphs,” in 10th EuroSys, 2015.
 [19] C. Xie, R. Chen, H. Guan, B. Zang, and H. Chen, “Sync or async: Time to fuse for distributed graphparallel computation,” ACM SIGPLAN Notices, vol. 50, no. 8, pp. 194–204, 2015.
 [20] Y. Simmhan, A. Kumbhare, C. Wickramaarachchi, S. Nagarkar, S. Ravi, C. Raghavendra, and V. Prasanna, “GoFFish: A subgraph-centric framework for large-scale graph analytics,” in European Conference on Parallel Processing. Springer, 2014, pp. 451–462.
 [21] Y. Tian, A. Balmin, S. A. Corsten, S. Tatikonda, and J. McPherson, “From think like a vertex to think like a graph,” Proceedings of the VLDB Endowment, vol. 7, no. 3, pp. 193–204, 2013.
 [22] M. Hammoud and M. F. Sakr, “Distributed programming for the cloud: Models, challenges, and analytics engines.” 2014.
 [23] F. Rahimian, A. H. Payberah, S. Girdzijauskas, M. Jelasity, and S. Haridi, “Ja-be-Ja: A distributed algorithm for balanced graph partitioning,” in 7th IEEE SASO, 2013, pp. 51–60.
 [24] C. Tsourakakis, C. Gkantsidis, B. Radunovic, and M. Vojnovic, “Fennel: Streaming graph partitioning for massive scale graphs,” in 7th ACM WSDM, 2014, pp. 333–342.
 [25] C. Martella, D. Logothetis, A. Loukas, and G. Siganos, “Spinner: Scalable graph partitioning in the cloud,” in ICDE, 2017, p. 1083.
 [26] L. G. Valiant, “A bridging model for parallel computation,” Communications of the ACM, vol. 33, no. 8, pp. 103–111, 1990.
 [27] M. J. Barber and J. W. Clark, “Detecting network communities by propagating labels under constraints,” Physical Review E, vol. 80, 2009.
 [28] K. S. Narendra and M. A. Thathachar, Learning automata: an introduction. Courier Corporation, 2012.
 [29] M. Hasanzadeh, M. R. Meybodi, and M. M. Ebadzadeh, “Adaptive cooperative particle swarm optimizer,” Applied Intelligence, vol. 39, no. 2, pp. 397–420, 2013.
 [30] M. Hasanzadeh and M. R. Meybodi, “Grid resource discovery based on distributed learning automata,” Computing, vol. 96, pp. 909–922, 2014.
 [31] M. Hasanzadeh and M. R. Meybodi, “Distributed optimization grid resource discovery,” The Journal of Supercomputing, vol. 71, no. 1, pp. 87–120, 2015.
 [32] A. Jobava, A. Yazidi, B. J. Oommen, and K. Begnum, “On achieving intelligent trafficaware consolidation of virtual machines in a data center using learning automata,” Journal of Computational Science, vol. 24, pp. 290–312, 2018.
 [33] A. Rezvanian and M. R. Meybodi, “A new learning automata based sampling algorithm for social networks,” International Journal of Communication Systems, vol. 30, no. 5, pp. 1–21, 2017.
 [34] M. H. Mofrad, S. Sadeghi, A. Rezvanian, and M. R. Meybodi, “Cellular edge detection: Combining cellular automata and cellular learning automata,” AEU - International Journal of Electronics and Communications, vol. 69, no. 9, pp. 1282–1290, 2015.
 [35] M. HasanzadehMofrad and A. Rezvanian, “Learning automata clustering,” Journal of Computational Science, vol. 24, pp. 379–388, 2018.
 [36] B. W. Kernighan and S. Lin, “An efficient heuristic procedure for partitioning graphs,” The Bell System Technical Journal, vol. 49, no. 2, pp. 291–307, 1970.
 [37] P. K. Chan, M. D. Schlag, and J. Y. Zien, “Spectral k-way ratio-cut partitioning and clustering,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 13, no. 9, pp. 1088–1096, 1994.
 [38] G. Karypis and V. Kumar, “A fast and high quality multilevel scheme for partitioning irregular graphs,” SIAM Journal on scientific Computing, vol. 20, no. 1, pp. 359–392, 1998.
 [39] U. N. Raghavan, R. Albert, and S. Kumara, “Near linear time algorithm to detect community structures in largescale networks,” Physical review E, vol. 76, no. 3, p. 036106, 2007.
 [40] D. E. Goldberg, “Probability matching, the magnitude of reinforcement, and classifier system bidding,” Machine Learning, vol. 5, no. 4, pp. 407–425, 1990.
 [41] M. H. Mofrad, R. Melhem, and M. Hammoud, “Revolver: vertex-centric graph partitioning using reinforcement learning,” in 2018 IEEE 11th International Conference on Cloud Computing (CLOUD). IEEE, 2018, pp. 818–821.
 [42] J. Leskovec and A. Krevl, “SNAP Datasets: Stanford large network dataset collection,” http://snap.stanford.edu/data.
 [43] P. Boldi and S. Vigna, “The webgraph framework i: Compression techniques,” in 13th ACM WWW, 2004, pp. 595–601.
 [44] C. Demetrescu, A. V. Goldberg, and D. S. Johnson, “9th DIMACS implementation challenge,” http://www.diag.uniroma1.it/challenge9/.