k-Center Clustering with Outliers in the MPC and Streaming Model

02/24/2023
by Mark de Berg, et al.

Given a point set P ⊆ X of size n in a metric space (X,dist) of doubling dimension d and two parameters k ∈ N and z ∈ N, the k-center problem with z outliers asks for a set C^* ⊆ X of k centers such that the maximum distance of all but z points of P to their nearest center in C^* is minimized. An (ϵ,k,z)-coreset for this problem is a weighted point set P^* such that an optimal solution for the k-center problem with z outliers on P^* gives a (1±ϵ)-approximation for the k-center problem with z outliers on P. We study the construction of such coresets in the Massively Parallel Computing (MPC) model, and in the insertion-only as well as the fully dynamic streaming model. We obtain the following results, for any given 0 < ϵ ≤ 1. In all cases, the size of the computed coreset is O(k/ϵ^d + z).

- In the MPC model, we present a deterministic 2-round and a randomized 1-round algorithm. Additionally, we provide a deterministic algorithm that obtains a trade-off between the number of rounds, R, and the storage per machine.
- For the insertion-only streaming model, we present an algorithm and a tight lower bound to support it.
- We also discuss the dynamic streaming model, which allows both insertions and deletions in the data stream. In this model, we present the first algorithm and a lower bound.
- Finally, we consider the sliding window model, where we are interested in maintaining an (ϵ,k,z)-coreset for the last W points in the stream. Here we present a tight lower bound that confirms the optimality of the previous work by De Berg, Monemizadeh, and Zhong (ESA 2020).
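To make the objective concrete, the following minimal sketch evaluates the cost of a given center set under the k-center-with-outliers definition above: the largest nearest-center distance that remains after the z farthest points are discarded. It is an illustration only, not the paper's algorithm. The function name `kcenter_outlier_cost` is hypothetical, and Euclidean distance is assumed as the metric, whereas the paper works in a general metric space of bounded doubling dimension.

```python
import math


def kcenter_outlier_cost(points, centers, z):
    """Cost of `centers` on `points` when the z farthest points are ignored.

    Assumption: points and centers are coordinate tuples and the metric is
    Euclidean (math.dist). The paper allows any doubling metric.
    """
    # Distance from each point to its nearest center, largest first.
    nearest = sorted(
        (min(math.dist(p, c) for c in centers) for p in points),
        reverse=True,
    )
    # Dropping the z largest distances leaves nearest[z] as the maximum.
    # If every point may be treated as an outlier, the cost is 0.
    return nearest[z] if z < len(nearest) else 0.0


# Hypothetical example data: two tight groups plus one far-away point.
points = [(0, 0), (1, 0), (10, 0), (11, 0), (100, 0)]
centers = [(0, 0), (10, 0)]

cost_no_outliers = kcenter_outlier_cost(points, centers, z=0)
cost_one_outlier = kcenter_outlier_cost(points, centers, z=1)
```

With z = 0 the far-away point dominates the cost, while allowing a single outlier reduces the cost to the radius of the two tight groups. This is why the z term appears separately in the coreset size bound: potential outliers must be retained explicitly rather than merged into nearby representatives.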


research
09/24/2021

k-Center Clustering with Outliers in the Sliding-Window Model

The k-center problem for a point set P asks for a collection of k congru...
research
02/26/2018

Improved MapReduce and Streaming Algorithms for k-Center Clustering (with Outliers)

We present efficient MapReduce and Streaming algorithms for the k-center...
research
10/15/2018

Small Space Stream Summary for Matroid Center

In the matroid center problem, which generalizes the k-center problem, w...
research
02/18/2020

Coreset-based Strategies for Robust Center-type Problems

Given a dataset V of points from some metric space, the popular k-center...
research
02/17/2020

How fast can you update your MST? (Dynamic algorithms for cluster computing)

Imagine a large graph that is being processed by a cluster of computers,...
research
01/07/2022

k-Center Clustering with Outliers in Sliding Windows

Metric k-center clustering is a fundamental unsupervised learning primit...
research
09/30/2018

Streaming Algorithms for Planar Convex Hulls

Many classical algorithms are known for computing the convex hull of a s...
