Large-Scale Network Embedding in Apache Spark

06/20/2021
by Wenqing Lin, et al.

Network embedding has been widely used in social recommendation and network analysis, for example in recommendation systems and graph-based anomaly detection. However, most previous approaches cannot handle large graphs efficiently because (i) computation on graphs is often costly and (ii) the graph or the intermediate vector results can be prohibitively large, making them difficult to process on a single machine. In this paper, we propose an efficient and effective distributed algorithm for network embedding on large graphs using Apache Spark. The algorithm recursively partitions a graph into several small subgraphs to capture the internal and external structural information of nodes, and then computes the network embedding for each subgraph in parallel. Finally, by aggregating the outputs on all subgraphs, we obtain the node embeddings at linear cost. Our experiments demonstrate that the proposed approach handles graphs with billions of edges within a few hours and is at least 4 times faster than state-of-the-art approaches. It also achieves improvements of up to 4.25% and 4.27% on link prediction and node classification tasks, respectively. We also deploy the proposed algorithms in two online games of Tencent for friend recommendation and item recommendation, where they outperform competing methods by up to 91.11% in running time and up to 12.80% in the corresponding evaluation metrics.
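The partition, embed-in-parallel, and aggregate pipeline described in the abstract can be illustrated with a small single-machine sketch. This is not the authors' algorithm and does not use Spark. The function names (`partition`, `embed_subgraph`, `embed_graph`), the halving-by-node-order partitioner, and the two-dimensional "internal degree, external degree" vector are all assumptions made purely for illustration. The toy vector stands in for a real embedding and only hints at the idea of capturing a node's internal and external structure.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(nodes, max_size):
    """Recursively halve a node list until every part has at most max_size nodes.

    Hypothetical partitioner: splits by position only, ignoring graph structure.
    """
    if len(nodes) <= max_size:
        return [nodes]
    mid = len(nodes) // 2
    return partition(nodes[:mid], max_size) + partition(nodes[mid:], max_size)


def embed_subgraph(part, edges):
    """Toy stand-in for an embedding: [internal degree, external degree] per node."""
    members = set(part)
    vectors = {v: [0, 0] for v in part}
    for u, v in edges:
        # Visit each undirected edge from both endpoints.
        for a, b in ((u, v), (v, u)):
            if a in members:
                # Slot 0 counts neighbors inside the part, slot 1 those outside.
                vectors[a][0 if b in members else 1] += 1
    return vectors


def embed_graph(edges, max_size=2, workers=4):
    """Partition the graph, embed each part concurrently, then merge the outputs."""
    nodes = sorted({v for edge in edges for v in edge})
    parts = partition(nodes, max_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda p: embed_subgraph(p, edges), parts))
    embeddings = {}
    for local in results:
        # Parts are disjoint, so merging is one pass over all node vectors.
        embeddings.update(local)
    return embeddings


if __name__ == "__main__":
    # A 4-cycle: 0-1-2-3-0, split into the parts [0, 1] and [2, 3].
    cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
    print(embed_graph(cycle))
```

Because the parts are disjoint, the final merge touches each node vector once, which mirrors the linear-cost aggregation step at a toy scale. In a distributed setting the thread pool would be replaced by the cluster's own parallel map over partitions.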
