Massively Parallel Algorithms for High-Dimensional Euclidean Minimum Spanning Tree

08/01/2023
by Rajesh Jayaram, et al.

We study the classic Euclidean Minimum Spanning Tree (MST) problem in the Massively Parallel Computation (MPC) model. Given a set X ⊂ ℝ^d of n points, the goal is to produce a spanning tree for X with weight within a small factor of optimal. Euclidean MST is one of the most fundamental hierarchical geometric clustering algorithms, and with the proliferation of enormous high-dimensional data sets, such as massive transformer-based embeddings, there is now a critical demand for efficient distributed algorithms to cluster such data sets. In low-dimensional space, where d = O(1), Andoni, Nikolov, Onak, and Yaroslavtsev [STOC '14] gave a constant-round MPC algorithm that obtains a high-accuracy (1+ϵ)-approximate solution. However, the situation is much more challenging for high-dimensional spaces: the best-known algorithm to obtain a constant approximation requires O(log n) rounds. Recently, Chen, Jayaram, Levi, and Waingarten [STOC '22] gave an Õ(log n)-approximation algorithm in a constant number of rounds based on embeddings into tree metrics. However, to date, no known algorithm achieves both a constant number of rounds and a constant-factor approximation. In this paper, we make strong progress on this front by giving a constant-factor approximation in Õ(log log n) rounds of the MPC model. In contrast to tree-embedding-based approaches, which necessarily must pay Ω(log n) distortion, our algorithm is based on a new combination of graph-based distributed MST algorithms and geometric space partitions. Additionally, although the approximate MST we return can have a large depth, we show that it can be modified to obtain an Õ(log log n)-round constant-factor approximation to the Euclidean Traveling Salesman Problem (TSP) in the MPC model. Previously, only an O(log n)-round algorithm was known for the problem.
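
To make the problem statement concrete, here is a minimal sequential sketch in Python. It is not the MPC algorithm from the paper: it computes an exact Euclidean MST with the textbook Prim's algorithm and then uses the standard preorder-walk argument to turn the tree into a tour of length at most twice the MST weight. The function names `euclidean_mst`, `tour_from_mst`, and `tour_length` are hypothetical and chosen only for illustration.

```python
import math

def euclidean_mst(points):
    """Exact Euclidean MST via Prim's algorithm, O(n^2 * d) time.

    Sequential baseline for illustration only; NOT the paper's MPC algorithm.
    Returns (edges, total_weight), where edges are pairs of point indices.
    """
    n = len(points)
    if n == 0:
        return [], 0.0
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known distance from each point to the tree
    parent = [-1] * n       # tree endpoint realizing that distance
    best[0] = 0.0
    edges, total = [], 0.0
    for _ in range(n):
        # Pick the point outside the tree that is closest to the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] != -1:
            edges.append((parent[u], u))
            total += best[u]
        # Relax distances of the remaining points against the new tree vertex.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges, total

def tour_from_mst(n, edges):
    """Visit order given by a depth-first preorder walk of the tree from vertex 0.

    By the triangle inequality, the closed tour through this order has length
    at most twice the tree's weight (the classic MST-doubling bound).
    """
    if n == 0:
        return []
    adj = [[] for _ in range(n)]
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    order, seen, stack = [], [False] * n, [0]
    while stack:
        u = stack.pop()
        if seen[u]:
            continue
        seen[u] = True
        order.append(u)
        stack.extend(reversed(adj[u]))
    return order

def tour_length(points, order):
    """Length of the closed tour that visits points in the given order."""
    k = len(order)
    return sum(math.dist(points[order[i]], points[order[(i + 1) % k]])
               for i in range(k))
```

For example, on the four corners of the unit square, `euclidean_mst` returns three unit-length edges, and the tour derived from that tree stays within twice the tree weight. The quadratic running time and single-machine memory use of this baseline are exactly what the distributed setting aims to avoid.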


research
12/13/2022

Streaming Euclidean MST to a Constant Factor

We study streaming algorithms for the fundamental geometric problem of c...
research
05/18/2019

Massively Parallel Computation via Remote Memory Access

We introduce the Adaptive Massively Parallel Computation (AMPC) model, w...
research
12/09/2019

A Deterministic Algorithm for the MST Problem in Constant Rounds of Congested Clique

In this paper, we show that the Minimum Spanning Tree problem can be sol...
research
10/04/2017

Massively Parallel Algorithms and Hardness for Single-Linkage Clustering Under ℓ_p-Distances

We present massively parallel (MPC) algorithms and hardness of approxima...
research
06/04/2021

Massively Parallel and Dynamic Algorithms for Minimum Size Clustering

In this paper, we study the r-gather problem, a natural formulation of m...
research
02/17/2020

How fast can you update your MST? (Dynamic algorithms for cluster computing)

Imagine a large graph that is being processed by a cluster of computers,...
research
04/19/2022

Massively Parallel Computation and Sublinear-Time Algorithms for Embedded Planar Graphs

While algorithms for planar graphs have received a lot of attention, few...
