Loom: Query-aware Partitioning of Online Graphs

11/17/2017
by Hugo Firth, et al.

As with general graph processing systems, partitioning data over a cluster of machines improves the scalability of graph database management systems. However, these systems incur additional network cost during the execution of a query workload, due to inter-partition traversals. Workload-agnostic partitioning algorithms typically minimise the likelihood of any edge crossing partition boundaries. However, these partitioners are sub-optimal with respect to many workloads, especially queries, which may require more frequent traversal of specific subsets of inter-partition edges. Furthermore, they are largely unsuited to operating incrementally on dynamic, growing graphs. We present a new graph partitioning algorithm, Loom, that operates on a stream of graph updates and continuously allocates the new vertices and edges to partitions, taking into account a query workload of graph pattern expressions along with their relative frequencies. First, we capture the most common patterns of edge traversals which occur when executing queries. We then compare sub-graphs, which present themselves incrementally in the graph update stream, against these common patterns. Finally, we attempt to allocate each match to a single partition, reducing the number of inter-partition edges within frequently traversed sub-graphs and improving average query performance. Loom is extensively evaluated over several large test graphs with realistic query workloads and various orderings of the graph updates. We demonstrate that, given a workload, our prototype produces partitionings of significantly better quality than the existing streaming graph partitioning algorithms Fennel and LDG.
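
To make the workload-agnostic baseline concrete, the following is a minimal sketch of the LDG (Linear Deterministic Greedy) streaming heuristic as it is commonly described: each arriving vertex goes to the partition holding most of its already-placed neighbours, discounted by how full that partition is. This is not Loom itself, and it is not taken from the paper. The function names, the fixed `capacity` bound, the tie-breaking rule, and the `traversal_freq` mapping are all assumptions made for illustration. The second function shows why a workload matters: the same partitioning can be scored by a plain edge cut or by a cut weighted with hypothetical per-edge traversal frequencies.

```python
def ldg_partition(vertex_stream, k, capacity):
    """Assign streamed vertices to k partitions using the LDG heuristic.

    vertex_stream yields (vertex, neighbours) pairs. capacity is an assumed
    fixed upper bound on vertices per partition; k * capacity must be at
    least the number of streamed vertices.
    """
    assignment = {}
    sizes = [0] * k
    for vertex, neighbours in vertex_stream:
        # Count this vertex's already-placed neighbours in each partition.
        placed = [0] * k
        for n in neighbours:
            if n in assignment:
                placed[assignment[n]] += 1

        def score(p):
            # Neighbour count, linearly discounted by partition fullness.
            return placed[p] * (1 - sizes[p] / capacity)

        candidates = [p for p in range(k) if sizes[p] < capacity]
        # Assumed tie-break: prefer the least-loaded partition.
        best = max(candidates, key=lambda p: (score(p), -sizes[p]))
        assignment[vertex] = best
        sizes[best] += 1
    return assignment


def weighted_cut(edges, assignment, traversal_freq=None):
    """Sum the weights of inter-partition edges.

    With traversal_freq=None every cut edge costs 1 (plain edge cut).
    Otherwise traversal_freq is a hypothetical {(u, v): frequency} mapping
    describing how often a workload traverses each edge.
    """
    total = 0.0
    for u, v in edges:
        if assignment[u] != assignment[v]:
            if traversal_freq is None:
                total += 1.0
            else:
                total += traversal_freq.get((u, v), 0.0)
    return total


# Usage: two triangles joined by a single edge (c, d).
adjacency = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"],
    "d": ["c", "e", "f"], "e": ["d", "f"], "f": ["d", "e"],
}
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "f")]
stream = [(v, adjacency[v]) for v in "abcdef"]
assignment = ldg_partition(stream, k=2, capacity=3)
plain_cut = weighted_cut(edges, assignment)
```

A query-aware partitioner in the spirit of the abstract would aim to lower the weighted figure rather than the plain one, by keeping frequently traversed sub-graphs inside a single partition.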

research
03/28/2022

WawPart: Workload-Aware Partitioning of Knowledge Graphs

Large-scale datasets in the form of knowledge graphs are often used in n...
research
05/30/2018

Q-Graph: Preserving Query Locality in Multi-Query Graph Processing

Arising user-centric graph applications such as route planning and perso...
research
11/26/2019

Prediction of Horizontal Data Partitioning Through Query Execution Cost Estimation

The excessively increased volume of data in modern data management syste...
research
03/27/2021

Cache-Efficient Fork-Processing Patterns on Large Graphs

As large graph processing emerges, we observe a costly fork-processing p...
research
03/14/2018

Local Partition in Rich Graphs

Local graph partitioning is a key graph mining tool that allows research...
research
05/12/2022

Query Complexity Based Optimal Processing of Raw Data

The paper aims to find an efficient way for processing large datasets ha...
research
09/21/2022

Evaluating Continuous Basic Graph Patterns over Dynamic Link Data Graphs

In this paper, we investigate the problem of evaluating Basic Graph Patt...
