2PS: High-Quality Edge Partitioning with Two-Phase Streaming

01/20/2020
by Ruben Mayer, et al.

Graph partitioning is an important preprocessing step for distributed graph processing. In edge partitioning, the edge set of a given graph is split into k equally sized partitions such that the replication of vertices across partitions is minimized. Streaming is a viable approach to partitioning graphs that exceed the memory capacity of a single server. The graph is ingested as a stream of edges, and each edge is immediately and irrevocably assigned to a partition based on a scoring function. However, streaming partitioning suffers from the uninformed assignment problem: when the early edges in the stream are partitioned, no information is available about the remaining edges. As a consequence, edge assignments are often driven by balancing considerations, and the achieved replication factor is comparatively high. In this paper, we propose 2PS, a novel two-phase streaming algorithm for high-quality edge partitioning. In the first phase, vertices are separated into clusters by a lightweight streaming clustering algorithm. In the second phase, the graph is re-streamed and edge partitioning is performed while taking into account the clustering of the vertices from the first phase. Our evaluations show that 2PS can achieve a replication factor comparable to that of heavy-weight random-access partitioners while inducing orders of magnitude lower memory overhead.
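To make the two-phase idea concrete, the following is a minimal Python sketch, not the 2PS implementation itself. The function names (`cluster_vertices`, `partition_edges`, `replication_factor`), the volume-bounded clustering heuristic, the scoring rule, and the parameters `max_volume` and `capacity` are all illustrative assumptions; the abstract does not specify the actual clustering algorithm or scoring function. The sketch keeps the edge list in memory for brevity, whereas a real streaming partitioner would read the edges from disk twice.

```python
from collections import defaultdict


def cluster_vertices(edges, max_volume):
    """Phase 1 (assumed heuristic): one pass that groups vertices into clusters."""
    degree = defaultdict(int)   # vertex -> degree observed so far
    cluster = {}                # vertex -> cluster id
    volume = defaultdict(int)   # cluster id -> sum of member degrees
    for u, v in edges:
        for x in (u, v):
            if x not in cluster:
                cluster[x] = x  # each vertex starts in its own cluster
            degree[x] += 1
            volume[cluster[x]] += 1
        cu, cv = cluster[u], cluster[v]
        if cu == cv:
            continue
        # Move the endpoint of the smaller cluster into the larger one,
        # as long as the target cluster stays below the volume bound.
        if volume[cu] <= volume[cv]:
            mover, src, dst = u, cu, cv
        else:
            mover, src, dst = v, cv, cu
        if volume[dst] + degree[mover] <= max_volume:
            volume[src] -= degree[mover]
            volume[dst] += degree[mover]
            cluster[mover] = dst
    return cluster


def partition_edges(edges, cluster, k, capacity):
    """Phase 2 (assumed scoring): re-stream the edges and assign each to a partition.

    Requires k * capacity >= number of edges so that an open partition always exists.
    """
    home = {}                     # cluster id -> preferred partition
    loads = [0] * k               # edges assigned to each partition
    replicas = defaultdict(set)   # vertex -> partitions holding a copy
    assignment = []               # (u, v, partition) in stream order
    for u, v in edges:
        for c in (cluster[u], cluster[v]):
            if c not in home:
                # Map an unseen cluster to the currently least-loaded partition.
                home[c] = min(range(k), key=lambda p: loads[p])
        open_parts = [p for p in range(k) if loads[p] < capacity]

        def score(p):
            s = (p in replicas[u]) + (p in replicas[v])            # avoid new replicas
            s += (p == home[cluster[u]]) + (p == home[cluster[v]])  # follow the clustering
            return (s, -loads[p])                                   # break ties by load

        best = max(open_parts, key=score)
        loads[best] += 1
        replicas[u].add(best)
        replicas[v].add(best)
        assignment.append((u, v, best))
    return assignment, replicas


def replication_factor(replicas):
    """Average number of partitions on which a vertex is replicated."""
    return sum(len(parts) for parts in replicas.values()) / len(replicas)


# Two triangles joined by a single bridge edge, split into k = 2 partitions.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
cluster = cluster_vertices(edges, max_volume=6)
assignment, replicas = partition_edges(edges, cluster, k=2, capacity=4)
print(assignment, replication_factor(replicas))
```

The point of the second pass is that every edge is scored with clustering information gathered from the whole stream, so early edges are no longer assigned blindly. The state kept between passes is one cluster id and one degree counter per vertex, which grows with the number of vertices rather than the number of edges.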
