New Streaming Algorithms for High Dimensional EMD and MST

by Xi Chen, et al.

We study streaming algorithms for two fundamental geometric problems: computing the cost of a Minimum Spanning Tree (MST) of an n-point set X ⊂ {1,2,…,Δ}^d, and computing the Earth Mover Distance (EMD) between two multi-sets A, B ⊂ {1,2,…,Δ}^d of size n. We consider the turnstile model, where points can be added and removed. We give a one-pass streaming algorithm for MST and a two-pass streaming algorithm for EMD, both achieving an approximation factor of Õ(log n) and using only polylog(n, d, Δ) space. Furthermore, our algorithm for EMD can be compressed to a single pass at the cost of a small additive error. Previously, the best known sublinear-space streaming algorithms for either problem achieved an approximation of O(min{log n, log(Δd)} log n) [Andoni-Indyk-Krauthgamer '08, Backurs-Dong-Indyk-Razenshteyn-Wagner '20]. For MST, we also prove that any constant-space streaming algorithm can only achieve an approximation of Ω(log n), analogous to the Ω(log n) lower bound for EMD of [Andoni-Indyk-Krauthgamer '08]. Our algorithms are based on an improved analysis of a recursive space partitioning method known generically as the Quadtree. Specifically, we show that the Quadtree achieves an Õ(log n) approximation for both EMD and MST, improving on the O(min{log n, log(Δd)} log n) approximation established in those earlier works.
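To make the Quadtree idea concrete, here is a minimal offline sketch of the classical randomly shifted quadtree estimator for EMD: impose nested grids of halving cell size on the shifted points and, at each level, charge the cell side length for every point that cannot be matched inside its own cell. This is the textbook estimator, not the streaming algorithm or the refined analysis of the paper. The function name `quadtree_emd_estimate` and the `seed` parameter are hypothetical, and the sketch keeps all cell counts in memory, which a polylog-space streaming algorithm could not afford.

```python
import random
from collections import Counter


def quadtree_emd_estimate(A, B, delta, seed=0):
    """Textbook quadtree estimate of EMD between equal-size multisets
    A, B of integer points in {1, ..., delta}^d (illustrative only)."""
    assert len(A) == len(B) and len(A) > 0
    d = len(A[0])
    rng = random.Random(seed)

    # Smallest power of two that is at least delta.
    side = 1
    while side < delta:
        side *= 2

    # One uniformly random shift shared by every level of the tree.
    shift = [rng.randrange(side) for _ in range(d)]

    estimate = 0
    cell = side  # side length of grid cells at the current level
    while cell >= 1:
        # Signed count per cell: +1 for each point of A, -1 for each of B.
        counts = Counter()
        for p in A:
            counts[tuple((x + s) // cell for x, s in zip(p, shift))] += 1
        for p in B:
            counts[tuple((x + s) // cell for x, s in zip(p, shift))] -= 1
        # Points left unmatched inside their cell must travel roughly
        # one cell side length to be matched at a coarser level.
        estimate += cell * sum(abs(c) for c in counts.values())
        cell //= 2
    return estimate
```

Identical multisets always give an estimate of 0, and any two different multisets give a positive value, because the finest level has cells of side 1 and therefore separates distinct points. How far the estimate can deviate from the true EMD is exactly the approximation factor that the abstract discusses.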






