Performance of All-Pairs Shortest-Paths Solvers with Apache Spark

02/12/2019
by Frank Schoeneman, et al.

Algorithms for computing All-Pairs Shortest-Paths (APSP) are critical building blocks underlying many practical applications. The standard sequential algorithms, such as Floyd-Warshall and Johnson, quickly become infeasible for large input graphs, necessitating parallel approaches. In this work, we provide a detailed analysis of parallel APSP performance on distributed-memory clusters with Apache Spark. The Spark model allows for a portable and easy-to-deploy distributed implementation, and hence is attractive from the end-user point of view. We propose four different APSP implementations for large undirected weighted graphs, which differ in complexity and in their degree of reliance on techniques outside the pure Spark API. We demonstrate that Spark is able to handle APSP problems with over 200,000 vertices on a 1024-core cluster, and can compete with a naive MPI-based solution. However, our best-performing solver requires auxiliary shared persistent storage, and is over two times slower than an optimized MPI-based solver.
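For readers unfamiliar with the sequential baseline named in the abstract, here is a minimal textbook sketch of Floyd-Warshall for an undirected weighted graph. It is not the authors' code and not one of their four Spark solvers; the function name `floyd_warshall` and the edge-list input format are assumptions made for illustration. The triple loop performs on the order of n^3 relaxations over an n-by-n distance matrix, which is why the sequential approach stops scaling as the graph grows.

```python
import math


def floyd_warshall(n, edges):
    """Return the n x n matrix of shortest-path distances.

    n     -- number of vertices, labeled 0..n-1
    edges -- iterable of (u, v, weight) tuples for an undirected graph
    """
    # Start with "no path known" everywhere except the diagonal.
    dist = [[math.inf] * n for _ in range(n)]
    for v in range(n):
        dist[v][v] = 0.0

    # Undirected graph: record each edge in both directions,
    # keeping the lightest edge if duplicates occur.
    for u, v, w in edges:
        dist[u][v] = min(dist[u][v], w)
        dist[v][u] = min(dist[v][u], w)

    # Allow vertex k as an intermediate stop, one k at a time.
    for k in range(n):
        row_k = dist[k]
        for i in range(n):
            d_ik = dist[i][k]
            if d_ik == math.inf:
                continue  # i cannot reach k, so k cannot improve row i
            row_i = dist[i]
            for j in range(n):
                candidate = d_ik + row_k[j]
                if candidate < row_i[j]:
                    row_i[j] = candidate
    return dist


if __name__ == "__main__":
    # Small example: vertex 3 is isolated.
    example_edges = [(0, 1, 1.0), (1, 2, 2.0), (0, 2, 5.0)]
    for row in floyd_warshall(4, example_edges):
        print(row)
```

Because the distance matrix alone has n^2 entries, a graph at the scale mentioned in the abstract (over 200,000 vertices) needs tens of billions of entries, which is the motivation for distributing both the storage and the computation.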

research
04/15/2018

A Deterministic Distributed Algorithm for Exact Weighted All-Pairs Shortest Paths in Õ(n^3/2) Rounds

We present a deterministic distributed algorithm to compute all-pairs sh...
research
05/21/2018

Distributed Algorithms for Directed Betweenness Centrality and All Pairs Shortest Paths

The betweenness centrality (BC) of a node in a network (or graph) is a m...
research
01/07/2021

A Deterministic Parallel APSP Algorithm and its Applications

In this paper we show a deterministic parallel all-pairs shortest paths ...
research
05/28/2022

Towards Distributed 2-Approximation Steiner Minimal Trees in Billion-edge Graphs

Given an edge-weighted graph and a set of known seed vertices, a network...
research
03/01/2023

Parallel and Distributed Exact Single-Source Shortest Paths with Negative Edge Weights

This paper presents parallel and distributed algorithms for single-sourc...
research
07/06/2021

An MPI-based Algorithm for Mapping Complex Networks onto Hierarchical Architectures

Processing massive application graphs on distributed memory systems requ...
research
02/05/2018

Shortest k-Disjoint Paths via Determinants

The well-known k-disjoint path problem (k-DPP) asks for pairwise vertex-...
