OpenGraphGym-MG: Using Reinforcement Learning to Solve Large Graph Optimization Problems on MultiGPU Systems

05/18/2021
by Weijian Zheng, et al.

Large-scale graph optimization problems arise in many fields. This paper presents an extensible, high-performance framework, OpenGraphGym-MG, that uses deep reinforcement learning (RL) and graph embedding to solve large graph optimization problems on multiple GPUs. The paper uses a common RL algorithm (deep Q-learning) and a representative graph embedding (structure2vec) to demonstrate the extensibility of the framework and, most importantly, to illustrate its novel optimization techniques, such as spatial parallelism, graph-level and node-level batched processing, distributed sparse graph storage, efficient parallel RL training and inference algorithms, repeated gradient descent iterations, and adaptive multiple-node selection. A comprehensive analysis of parallel efficiency and memory cost shows that the parallel RL training and inference algorithms are efficient and highly scalable across multiple GPUs. The study also conducts a range of large graph experiments, with both generated graphs (over 30 million edges) and real-world graphs, using a single compute node (with six GPUs) of the Summit supercomputer. Both RL training and inference scale well: as the number of GPUs increases from one to six on large graphs with more than 30 million edges, the time for a single RL training step drops from 316.4 s to 54.5 s, and the time for a single RL inference step drops from 23.8 s to 3.4 s. These single-node results lay a solid foundation for future work on graph optimization problems that use a large number of GPUs across multiple Summit nodes.
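
The scaling figures quoted in the abstract can be turned into speedup and parallel efficiency with a few lines of arithmetic. The sketch below is illustrative only: the function names and the `timings` dictionary are hypothetical, and the only inputs are the four rounded timings and the GPU count stated in the abstract.

```python
# Speedup and parallel efficiency from the timings quoted in the abstract.
# The helper names and the `timings` layout are assumptions for illustration;
# the numbers are the rounded values reported in the abstract.

def speedup(t_base: float, t_parallel: float) -> float:
    """Ratio of the one-GPU time to the multi-GPU time."""
    return t_base / t_parallel


def parallel_efficiency(t_base: float, t_parallel: float, n_workers: int) -> float:
    """Speedup divided by the number of workers (1.0 means ideal linear scaling)."""
    return speedup(t_base, t_parallel) / n_workers


N_GPUS = 6  # one Summit compute node, as stated in the abstract

# (time on 1 GPU, time on 6 GPUs), in seconds, for a single step
timings = {
    "training step": (316.4, 54.5),
    "inference step": (23.8, 3.4),
}

results = {}
for name, (t1, t6) in timings.items():
    s = speedup(t1, t6)
    e = parallel_efficiency(t1, t6, N_GPUS)
    results[name] = (s, e)
    print(f"{name}: speedup {s:.2f}x, efficiency {e:.1%}")
```

With these inputs the training step comes out at roughly a 5.8x speedup on six GPUs and the inference step at roughly 7.0x. The inference ratio exceeds the GPU count; the abstract does not explain this, and because the inputs are rounded, the derived values should be read as approximate.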


