Tight Bounds on the Round Complexity of the Distributed Maximum Coverage Problem

01/09/2018
by   Sepehr Assadi, et al.

We study the maximum k-set coverage problem in the following distributed setting. A collection of sets S_1,...,S_m over a universe [n] is partitioned across p machines, and the goal is to find k sets whose union covers the largest number of elements. The computation proceeds in synchronous rounds. In each round, all machines simultaneously send a message to a central coordinator, who then communicates back to all machines a summary to guide the computation for the next round. At the end, the coordinator outputs the answer. The main measures of efficiency in this setting are the approximation ratio of the returned solution, the communication cost of each machine, and the number of rounds of computation. Our main result is an asymptotically tight bound on the tradeoff between these measures for the distributed maximum coverage problem. We first show that any r-round protocol for this problem either incurs a communication cost of k · m^Ω(1/r) or only achieves an approximation factor of k^Ω(1/r). This implies that any protocol that simultaneously achieves a good approximation ratio (O(1) approximation) and a good communication cost (O(n) communication per machine) essentially requires a logarithmic (in k) number of rounds. We complement our lower bound result by showing that there exists an r-round protocol that achieves an e/(e-1)-approximation (essentially best possible) with a communication cost of k · m^O(1/r), as well as an r-round protocol that achieves a k^O(1/r)-approximation with only O(n) communication per machine (essentially best possible). We further use our results in this distributed setting to obtain new bounds for the maximum coverage problem in two other main models of computation for massive datasets, namely, the dynamic streaming model and the MapReduce model.
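
To make the problem statement concrete, the following is a minimal sketch of the classical sequential greedy algorithm for maximum k-set coverage, which is known to achieve an e/(e-1)-approximation on a single machine. It is not the distributed protocol from the paper, whose details are not given in the abstract. The function name `greedy_max_coverage` and the toy instance are assumptions introduced purely for illustration.

```python
from typing import Dict, List, Set


def greedy_max_coverage(sets: Dict[str, Set[int]], k: int) -> List[str]:
    """Pick up to k sets greedily by marginal coverage (sequential baseline).

    Illustrative only: this is the textbook single-machine greedy algorithm,
    not the r-round distributed protocol described in the abstract.
    """
    covered: Set[int] = set()
    chosen: List[str] = []
    for _ in range(k):
        best_name = None
        best_gain = 0
        # Iterate in sorted order so that ties are broken deterministically.
        for name in sorted(sets):
            if name in chosen:
                continue
            gain = len(sets[name] - covered)  # newly covered elements
            if gain > best_gain:
                best_name, best_gain = name, gain
        if best_name is None:
            break  # no remaining set covers anything new
        chosen.append(best_name)
        covered |= sets[best_name]
    return chosen


if __name__ == "__main__":
    # Hypothetical toy instance over the universe {1, ..., 7}.
    toy = {"S1": {1, 2, 3}, "S2": {3, 4}, "S3": {4, 5, 6, 7}, "S4": {1, 7}}
    picked = greedy_max_coverage(toy, k=2)
    print(picked, len(set().union(*(toy[s] for s in picked))))
```

In the distributed setting above, no single machine holds all of S_1,...,S_m, so a greedy pass like this cannot be run directly; the tradeoff studied in the paper concerns how many rounds and how much communication are needed to approach this quality.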


research
04/18/2022

Distributed MST Computation in the Sleeping Model: Awake-Optimal Algorithms and Lower Bounds

We study the distributed minimum spanning tree (MST) problem, a fundamen...
research
05/21/2018

Silence

The cost of communication is a substantial factor affecting the scalabil...
research
05/15/2020

Efficient Distributed Algorithms for the K-Nearest Neighbors Problem

The K-nearest neighbors is a basic problem in machine learning with nume...
research
11/20/2017

Schlegel Diagram and Optimizable Immediate Snapshot Protocol

In the topological study of distributed systems, the immediate snapshot ...
research
06/03/2021

Interactive Communication in Bilateral Trade

We define a model of interactive communication where two agents with pri...
research
02/17/2018

Approximate Set Union Via Approximate Randomization

We develop a randomized approximation algorithm for the size of set uni...
research
04/22/2020

Derivation of Heard-Of Predicates From Elementary Behavioral Patterns

There are many models of distributed computing, and no unifying mathemat...
