Efficient Graph Computation for Node2Vec

05/01/2018
by Dongyan Zhou, et al.

Node2Vec is a state-of-the-art general-purpose feature learning method for network analysis. However, current solutions cannot run Node2Vec on large-scale graphs with billions of vertices and edges, which are common in real-world applications. The existing distributed Node2Vec implementation on Spark incurs significant space and time overhead: it runs out of memory even for mid-sized graphs with millions of vertices. Moreover, it considers at most 30 edges per vertex when generating random walks, which degrades result quality. In this paper, we propose Fast-Node2Vec, a family of efficient Node2Vec random walk algorithms on a Pregel-like graph computation framework. Fast-Node2Vec computes transition probabilities during random walks, rather than in advance, to reduce memory consumption and computation overhead for large-scale graphs. The Pregel-like scheme avoids the space and time overhead of Spark's read-only RDD structures and shuffle operations. We also propose a number of optimization techniques to further reduce the computation overhead for popular vertices with large degrees. Empirical evaluation shows that Fast-Node2Vec is capable of computing Node2Vec on graphs with billions of vertices and edges on a mid-sized machine cluster. Compared to Spark-Node2Vec, Fast-Node2Vec achieves speedups of 7.7x to 122x.
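
To make the idea of computing transition probabilities during the walk more concrete, here is a minimal single-machine sketch in Python. It is not the paper's Pregel-based implementation. It assumes an unweighted graph stored as an adjacency dictionary and uses the standard Node2Vec bias with return parameter p and in-out parameter q. The names `node2vec_step` and `node2vec_walk` are hypothetical and chosen only for illustration. The point it shows is that the weights for a step depend only on the previous and current vertices, so they can be derived when the step is taken instead of being stored for every edge pair.

```python
import random


def node2vec_step(adj, prev, cur, p, q, rng):
    """Choose the next vertex, computing the biased weights on demand.

    adj  : dict mapping each vertex to a list of its neighbors (assumed unweighted)
    prev : vertex visited before cur, or None on the first step
    """
    neighbors = adj[cur]
    if prev is None:
        # First step: no history yet, so pick a neighbor uniformly.
        return rng.choice(neighbors)

    prev_neighbors = set(adj[prev])
    weights = []
    for x in neighbors:
        if x == prev:
            weights.append(1.0 / p)   # distance 0 from prev: return to it
        elif x in prev_neighbors:
            weights.append(1.0)       # distance 1 from prev: stay close
        else:
            weights.append(1.0 / q)   # distance 2 from prev: move outward
    # random.choices normalizes the weights internally.
    return rng.choices(neighbors, weights=weights, k=1)[0]


def node2vec_walk(adj, start, length, p=1.0, q=1.0, seed=0):
    """Generate one walk of at most `length` vertices starting at `start`."""
    rng = random.Random(seed)
    walk = [start]
    prev = None
    while len(walk) < length:
        cur = walk[-1]
        if not adj[cur]:
            break  # dead end: the vertex has no neighbors
        nxt = node2vec_step(adj, prev, cur, p, q, rng)
        walk.append(nxt)
        prev = cur
    return walk


if __name__ == "__main__":
    # Tiny illustrative graph; not data from the paper.
    graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
    print(node2vec_walk(graph, start=0, length=8, p=0.5, q=2.0, seed=42))
```

In this sketch, the only state kept between steps is the previous vertex, so memory grows with the graph itself rather than with the number of (previous, current, next) combinations. The cost is that the weights are recomputed at every step, which becomes expensive at vertices with very large degree; that is the case the abstract says the additional optimizations address.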


