On Approximability of Clustering Problems Without Candidate Centers

09/30/2020
by Vincent Cohen-Addad, et al.

The k-means objective is arguably the most widely used cost function for modeling clustering tasks in a metric space. In practice and historically, k-means is thought of in a continuous setting, namely where the centers can be located anywhere in the metric space. For example, the popular Lloyd's heuristic locates a center at the mean of each cluster. Despite persistent efforts on understanding the approximability of k-means, and other classic clustering problems such as k-median and k-minsum, our knowledge of the hardness of approximation factors of these problems remains quite poor. In this paper, we significantly improve upon the hardness of approximation factors known in the literature for these objectives. We show that if the input lies in a general metric space, it is NP-hard to approximate:

∙ Continuous k-median to a factor of 2-o(1); this improves upon the previous inapproximability factor of 1.36 shown by Guha and Khuller (J. Algorithms '99).
∙ Continuous k-means to a factor of 4-o(1); this improves upon the previous inapproximability factor of 2.10 shown by Guha and Khuller (J. Algorithms '99).
∙ k-minsum to a factor of 1.415; this improves upon the APX-hardness shown by Guruswami and Indyk (SODA '03).

Our results shed new and perhaps counter-intuitive light on the differences between clustering problems in the continuous setting versus the discrete setting (where the candidate centers are given as part of the input).
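To make the continuous/discrete distinction concrete, here is a minimal sketch in the Euclidean plane. It is an illustration only and not the paper's construction: the paper's hardness results concern general metric spaces, and the helper names (`kmeans_cost`, `mean`, `best_discrete_centers`) and the four-point instance are assumptions introduced for this example. With one center, the continuous solution uses the mean of the points, as Lloyd's heuristic would, while the discrete solution must pick a center from a given candidate set (here, the input points themselves).

```python
from itertools import combinations

def sq_dist(p, q):
    # Squared Euclidean distance between two points given as tuples.
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans_cost(points, centers):
    # k-means objective: each point pays the squared distance to its nearest center.
    return sum(min(sq_dist(p, c) for c in centers) for p in points)

def mean(cluster):
    # Coordinate-wise mean; this is where Lloyd's heuristic places a center.
    n = len(cluster)
    return tuple(sum(coords) / n for coords in zip(*cluster))

def best_discrete_centers(points, candidates, k):
    # Brute force over all k-subsets of the candidate centers (hypothetical helper,
    # exponential in k; only meant for tiny illustrative instances).
    return min(combinations(candidates, k), key=lambda cs: kmeans_cost(points, cs))

# Hypothetical instance: the four corners of a square, with k = 1.
points = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]

# Continuous setting: the center may be placed anywhere, so use the mean.
continuous_center = mean(points)
continuous_cost = kmeans_cost(points, [continuous_center])

# Discrete setting: the center must be one of the candidates (the input points).
discrete_centers = best_discrete_centers(points, points, 1)
discrete_cost = kmeans_cost(points, discrete_centers)

print(continuous_center, continuous_cost)  # (1.0, 1.0) 8.0
print(discrete_centers, discrete_cost)     # ((0.0, 0.0),) 16.0
```

On this instance the discrete optimum costs twice the continuous optimum, which shows that the two settings can differ even on small inputs.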


research
11/21/2021

Johnson Coverage Hypothesis: Inapproximability of k-means and k-median in L_p metrics

K-median and k-means are the two most popular objectives for clustering ...
research
06/30/2022

Approximation Algorithms for Continuous Clustering and Facility Location Problems

We consider the approximability of center-based clustering problems wher...
research
05/05/2021

Universal Algorithms for Clustering

This paper presents universal algorithms for clustering problems, includ...
research
06/26/2022

k-Median Clustering via Metric Embedding: Towards Better Initialization with Differential Privacy

When designing clustering algorithms, the choice of initial centers is c...
research
06/23/2014

Further heuristics for k-means: The merge-and-split heuristic and the (k,l)-means

Finding the optimal k-means clustering is NP-hard in general and many he...
research
06/03/2013

Distributed k-Means and k-Median Clustering on General Topologies

This paper provides new algorithms for distributed clustering for two po...
research
05/30/2019

Sequential no-Substitution k-Median-Clustering

We study the sample-based k-median clustering objective under a sequenti...
