ButterFly BFS – An Efficient Communication Pattern for Multi Node Traversals

03/25/2021

Breadth-First Search (BFS) is a building block for a wide array of graph analytics and is used across many network analysis domains: social, road, transportation, communication, and more. Over the last two decades, network sizes have continued to grow, and the popularity of BFS has brought with it a need for significantly faster traversals. BFS algorithms have therefore been designed to exploit both shared-memory and shared-nothing systems, including accelerators such as GPUs. GPUs offer extremely fast traversals but, because of their limited memory, can process only smaller graphs. In contrast, CPU shared-memory systems can scale to graphs with several billion edges but lack the compute resources needed for fast traversals. This paper introduces ButterFly BFS, a multi-GPU traversal algorithm that allows significantly larger networks to be analyzed at high rates. ButterFly BFS scales to graphs of similar size to those processed by shared-memory systems while improving performance by more than 10X compared to CPUs. We evaluate our new algorithm on an NVIDIA DGX-2 server with 16 V100 GPUs and show that it scales as the number of GPUs increases. We achieve roughly 70% of linear speedup, which is non-trivial for BFS. For a scale 29 Kronecker graph with an edge factor of 8, our new algorithm traverses the graph at a rate of over 300 GTEP/s, a high traversal rate for a single server.
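The abstract does not spell out the communication pattern itself, so the following is only a minimal CPU-side sketch of the general idea, assuming the classic butterfly (recursive-doubling) exchange: with P = 2^k ranks, rank r swaps its data with rank r XOR 2^i in round i, so every rank holds the merged frontier after log2(P) rounds. The names `butterfly_allgather` and `multi_rank_bfs`, the `v % num_ranks` vertex partition, and the shared `dist` array are illustrative assumptions, not the paper's implementation.

```python
from collections import defaultdict


def butterfly_allgather(local_sets):
    """Simulate a butterfly (recursive-doubling) all-gather over P = 2^k ranks.

    Returns the per-rank data after the exchange and the number of rounds.
    """
    p = len(local_sets)
    assert p > 0 and p & (p - 1) == 0, "sketch assumes a power-of-two rank count"
    data = [set(s) for s in local_sets]
    step, rounds = 1, 0
    while step < p:
        # In each round, rank r merges what it holds with its partner r XOR step.
        data = [data[r] | data[r ^ step] for r in range(p)]
        step <<= 1
        rounds += 1
    return data, rounds


def multi_rank_bfs(edges, num_vertices, source, num_ranks):
    """Level-synchronous BFS over simulated ranks (undirected graph)."""
    # Assumption: 1D partition where vertex v is owned by rank v % num_ranks.
    adj = [defaultdict(list) for _ in range(num_ranks)]
    for u, v in edges:
        adj[u % num_ranks][u].append(v)
        adj[v % num_ranks][v].append(u)

    # A single shared array stands in for the per-rank replicas of the distances.
    dist = [-1] * num_vertices
    dist[source] = 0
    frontier = {source}
    level = 0
    while frontier:
        level += 1
        # Each rank expands only the frontier vertices it owns.
        found = [set() for _ in range(num_ranks)]
        for r in range(num_ranks):
            for u in frontier:
                if u % num_ranks == r:
                    for v in adj[r][u]:
                        if dist[v] == -1:
                            found[r].add(v)
        # Butterfly exchange: afterwards every rank holds the full next frontier.
        gathered, _ = butterfly_allgather(found)
        frontier = gathered[0]
        for v in frontier:
            dist[v] = level
    return dist


if __name__ == "__main__":
    sample_edges = [(0, 1), (1, 2), (2, 3), (0, 4), (5, 6)]
    print(multi_rank_bfs(sample_edges, num_vertices=8, source=0, num_ranks=4))
```

With 16 ranks, matching the GPU count of the DGX-2 mentioned above, this exchange completes in 4 rounds. A real multi-GPU version would replace the Python sets with device-resident frontier buffers and peer-to-peer transfers.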
