A Distributed Multi-GPU System for Large-Scale Node Embedding at Tencent

05/28/2020
by Wanjing Wei, et al.

Scaling node embedding systems to efficiently process real-world networks, which often contain hundreds of billions of edges with high-dimensional node features, remains a challenging problem. In this paper, we present a high-performance multi-GPU node embedding system that uses hybrid model-data parallel training. We propose a hierarchical data partitioning strategy and an embedding training pipeline to optimize both communication and memory usage on a GPU cluster. Because the random walk engine and the embedding training engine are decoupled, the two stages can be scheduled flexibly to fully utilize all computing resources on a GPU cluster. We evaluate the system on real-world and synthesized networks with various node embedding tasks. Using 40 NVIDIA V100 GPUs on a network with over two hundred billion edges and one billion nodes, our implementation requires only 200 seconds to finish one training epoch. We also achieve a 5.9x-14.4x speedup on average over the current state-of-the-art multi-GPU single-node embedding system, with competitive or better accuracy on open datasets.
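To make the idea of hierarchical data partitioning more concrete, the following is a minimal sketch and not the scheme used in the paper, whose details are not given in the abstract. It assumes a two-level modulo mapping from node IDs to a (machine, local GPU) pair and groups edges into blocks keyed by the partitions of their endpoints. The names `node_partition` and `bucket_edges`, and the modulo rule itself, are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def node_partition(node_id, num_machines, gpus_per_machine):
    """Map a node ID to (machine, local_gpu).

    Hypothetical two-level scheme: a global partition index is taken by
    modulo, then split into a machine index and a GPU index on that machine.
    """
    global_part = node_id % (num_machines * gpus_per_machine)
    return divmod(global_part, gpus_per_machine)


def bucket_edges(edges, num_machines, gpus_per_machine):
    """Group edges into blocks keyed by (source partition, target partition).

    Each block only touches the embeddings of two node partitions, so a
    training step on one block needs at most two embedding shards in memory.
    """
    buckets = defaultdict(list)
    for src, dst in edges:
        key = (
            node_partition(src, num_machines, gpus_per_machine),
            node_partition(dst, num_machines, gpus_per_machine),
        )
        buckets[key].append((src, dst))
    return dict(buckets)


# Toy graph with six nodes, partitioned across 2 machines x 2 GPUs (assumed sizes).
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (4, 5)]
buckets = bucket_edges(edges, num_machines=2, gpus_per_machine=2)

for key, block in sorted(buckets.items()):
    print(key, block)
```

In a sketch like this, blocks whose two partitions live on the same machine could be exchanged over the faster intra-machine links, while only cross-machine blocks would need network transfers; how the actual system schedules this is described in the full text, not here.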

research
03/02/2019

GraphVite: A High-Performance CPU-GPU Hybrid System for Node Embedding

Learning continuous representations of nodes is attracting growing inter...
research
08/27/2023

SPEED: Streaming Partition and Parallel Acceleration for Temporal Interaction Graph Embedding

Temporal Interaction Graphs (TIGs) are widely employed to model intricat...
research
10/12/2021

GraPE: fast and scalable Graph Processing and Embedding

Graph Representation Learning methods have enabled a wide range of learn...
research
04/03/2022

BigDL 2.0: Seamless Scaling of AI Pipelines from Laptops to Distributed Cluster

Most AI projects start with a Python notebook running on a single laptop...
research
06/21/2022

Nimble GNN Embedding with Tensor-Train Decomposition

This paper describes a new method for representing embedding tables of g...
research
10/03/2017

Supervised Q-walk for Learning Vector Representation of Nodes in Networks

Automatic feature learning algorithms are at the forefront of modern day...
research
03/12/2020

Distributed Hierarchical GPU Parameter Server for Massive Scale Deep Learning Ads Systems

Neural networks of ads systems usually take input from multiple resource...
