# New Streaming Algorithms for High Dimensional EMD and MST

We study streaming algorithms for two fundamental geometric problems: computing the cost of a Minimum Spanning Tree (MST) of an n-point set X ⊂ {1,2,…,Δ}^d, and computing the Earth Mover Distance (EMD) between two multi-sets A, B ⊂ {1,2,…,Δ}^d of size n. We consider the turnstile model, where points can be added and removed. We give a one-pass streaming algorithm for MST and a two-pass streaming algorithm for EMD, both achieving an approximation factor of Õ(log n) while using only polylog(n, d, Δ) space. Furthermore, our algorithm for EMD can be compressed to a single pass at the cost of a small additive error. Previously, the best known sublinear-space streaming algorithms for either problem achieved an approximation of O(min{log n, log(Δd)} log n) [Andoni-Indyk-Krauthgamer '08, Backurs-Dong-Indyk-Razenshteyn-Wagner '20]. For MST, we also prove that any constant-space streaming algorithm can only achieve an approximation of Ω(log n), analogous to the Ω(log n) lower bound for EMD of [Andoni-Indyk-Krauthgamer '08]. Our algorithms are based on an improved analysis of a recursive space partitioning method known generically as the Quadtree. Specifically, we show that the Quadtree achieves an Õ(log n) approximation for both EMD and MST, improving on the O(min{log n, log(Δd)} log n) approximation of [Andoni-Indyk-Krauthgamer '08, Backurs-Dong-Indyk-Razenshteyn-Wagner '20].
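To make the Quadtree idea concrete, here is a minimal offline sketch of a textbook randomly shifted grid estimate for EMD. It is not the streaming algorithm or the refined analysis described in the abstract: the function name `quadtree_emd_estimate`, the choice of one uniform random shift, and the weighting of each level by its cell side length are all assumptions made for illustration. At every level the points are bucketed into grid cells, and the imbalance between the two multi-sets in each cell is charged the side length of that cell.

```python
import random
from collections import Counter


def quadtree_emd_estimate(A, B, delta, seed=0):
    """Rough quadtree-style EMD estimate for equal-size multisets A, B
    of points in {1, ..., delta}^d (illustrative sketch, not the paper's method)."""
    if len(A) != len(B):
        raise ValueError("A and B must have the same size")
    if not A:
        return 0
    d = len(A[0])
    rng = random.Random(seed)

    # Top-level cell side: smallest power of two that is >= delta.
    top_side = 1
    while top_side < delta:
        top_side *= 2

    # One random shift per coordinate, shared by every level (assumed scheme).
    shift = [rng.randrange(top_side) for _ in range(d)]

    total = 0
    side = 1
    while side <= top_side:
        # Signed count per cell: +1 for each point of A, -1 for each point of B.
        imbalance = Counter()
        for p in A:
            imbalance[tuple((p[j] + shift[j]) // side for j in range(d))] += 1
        for q in B:
            imbalance[tuple((q[j] + shift[j]) // side for j in range(d))] -= 1
        # Points left unmatched inside a cell must travel on the order of `side`.
        total += side * sum(abs(v) for v in imbalance.values())
        side *= 2
    return total
```

Calling `quadtree_emd_estimate(A, B, delta)` on two lists of coordinate tuples returns a non-negative number that is zero when the multi-sets coincide. The per-cell signed counts are linear in the input, which is the kind of statistic that turnstile sketches can maintain; how the paper actually sketches and samples them is not shown here.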
