Graphyti: A Semi-External Memory Graph Library for FlashGraph

07/07/2019
by Disa Mhembere, et al.
Graph datasets exceed the in-memory capacity of most standalone machines. Traditionally, graph frameworks have overcome memory limitations through scale-out, distributed computing. Emerging frameworks avoid the network bottleneck of distributed data with Semi-External Memory (SEM), which uses a single multicore node and operates on graphs larger than memory. In SEM, O(m) data resides on disk and O(n) data in memory, for a graph with n vertices and m edges. For developers, this adds complexity because they must explicitly encode I/O within applications. We present principles that application developers must adopt to achieve state-of-the-art performance while minimizing I/O and memory for algorithms in SEM. We present them in Graphyti, an extensible parallel SEM graph library built on FlashGraph and available in Python via pip. In SEM, Graphyti achieves 80% performance of FlashGraph, which outperforms distributed engines, such as PowerGraph and Galois.
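To make the SEM memory split concrete, here is a minimal sketch in plain Python. It is not Graphyti's or FlashGraph's API; the function names (`write_edges`, `stream_edges`, `sem_out_degrees`) and the binary edge-file layout are assumptions made purely for illustration. The O(m) edge list lives in a file and is scanned in bounded chunks, while only an O(n) per-vertex array is held in memory.

```python
import struct
from array import array

# Hypothetical on-disk layout: one directed edge = two little-endian uint32s.
EDGE = struct.Struct("<II")


def write_edges(path, edges):
    """Store the O(m) edge list on disk as fixed-width binary records."""
    with open(path, "wb") as f:
        for src, dst in edges:
            f.write(EDGE.pack(src, dst))


def stream_edges(path, chunk_edges=1024):
    """Yield (src, dst) pairs from disk, reading at most chunk_edges at a time."""
    with open(path, "rb") as f:
        while True:
            buf = f.read(EDGE.size * chunk_edges)
            if not buf:
                break
            yield from EDGE.iter_unpack(buf)


def sem_out_degrees(path, n):
    """Compute out-degrees with O(n) in-memory state and edges left on disk."""
    degree = array("I", [0]) * n  # the only per-graph state kept in memory
    for src, _dst in stream_edges(path):
        degree[src] += 1
    return degree
```

The explicit `stream_edges` loop is the kind of I/O the abstract says developers must encode themselves in SEM; a real engine would add asynchronous, parallel I/O and caching on top of this pattern.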

