Exact Exponential Algorithms for Clustering Problems

08/14/2022
by   Fedor V. Fomin, et al.

In this paper we initiate a systematic study of exact algorithms for two well-known clustering problems, namely k-Median and k-Means. In k-Median, the input consists of a set X of n points belonging to a metric space, and the task is to select a subset C ⊆ X of k points as centers, such that the sum of the distances from every point to its nearest center is minimized. In k-Means, the objective is to minimize the sum of squares of the distances instead. It is easy to design an algorithm running in time max_{k ≤ n} \binom{n}{k} · n^{O(1)} = O^*(2^n), where the O^*(·) notation hides polynomial factors in n. We design the first non-trivial exact algorithms for these problems. In particular, we obtain an O^*((1.89)^n)-time exact algorithm for k-Median that works for any value of k. Our algorithm is quite general in that it does not use any properties of the underlying (metric) space: it does not even require the distances to satisfy the triangle inequality. In particular, the same algorithm also works for k-Means. We complement this result by showing that the running time of our algorithm is asymptotically optimal, up to the base of the exponent. That is, unless ETH fails, there is no algorithm for these problems running in time 2^{o(n)} · n^{O(1)}. Finally, we consider the "supplier" versions of these clustering problems, where, in addition to the set X, we are given a set F of m candidate centers, and the objective is to find a subset of k centers from F. The goal is still to minimize the k-Median/k-Means/k-Center objective. For these versions we give O(2^n (mn)^{O(1)})-time algorithms using subset convolution. We complement this result by showing that, under the Set Cover Conjecture, the supplier versions of these problems do not admit an exact algorithm running in time 2^{(1-ε)n} (mn)^{O(1)}.
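To make the trivial baseline concrete, the following is a minimal sketch of the brute-force approach the abstract calls "easy": try every k-subset of X as the center set and keep the cheapest. It is not the paper's O^*((1.89)^n) algorithm. The function name `brute_force_clustering`, the `power` parameter (1 for k-Median, 2 for k-Means), and the toy one-dimensional instance are assumptions made for illustration only.

```python
from itertools import combinations


def brute_force_clustering(dist, k, power=1):
    """Exhaustive baseline: try all C(n, k) center sets chosen from the points.

    dist  -- n x n table of non-negative pairwise distances (hypothetical input
             format; the triangle inequality is not required)
    k     -- number of centers to select
    power -- 1 gives the k-Median objective, 2 gives the k-Means objective
    """
    n = len(dist)
    best_cost, best_centers = float("inf"), None
    for centers in combinations(range(n), k):
        # Each point pays the distance to its nearest chosen center,
        # raised to `power`.
        cost = sum(min(dist[p][c] for c in centers) ** power for p in range(n))
        if cost < best_cost:
            best_cost, best_centers = cost, centers
    return best_cost, best_centers


# Toy instance (assumed for illustration): five points on a line.
points = [0, 1, 2, 10, 11]
dist = [[abs(a - b) for b in points] for a in points]

print(brute_force_clustering(dist, k=2))           # k-Median
print(brute_force_clustering(dist, k=2, power=2))  # k-Means
```

Each candidate set is evaluated in polynomial time, so the total work is \binom{n}{k} · n^{O(1)}, which is at most O^*(2^n) over all k, matching the bound stated above.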

research
10/30/2018

Coresets for k-Means and k-Median Clustering and their Applications

In this paper, we show the existence of small coresets for the problem...
research
02/25/2022

Towards Optimal Lower Bounds for k-median and k-means Coresets

Given a set of points in a metric space, the (k,z)-clustering problem co...
research
10/29/2017

If it ain't broke, don't fix it: Sparse metric repair

Many modern data-intensive computational problems either require, or ben...
research
08/16/2023

A Quantum Approximation Scheme for k-Means

We give a quantum approximation scheme (i.e., (1 + ε)-approximation for ...
research
08/24/2021

Linear-Size Universal Discretization of Geometric Center-Based Problems in Fixed Dimensions

Many geometric optimization problems can be reduced to finding points in...
research
06/14/2021

Coresets for constrained k-median and k-means clustering in low dimensional Euclidean space

We study (Euclidean) k-median and k-means with constraints in the stream...
research
04/11/2020

Submodular Clustering in Low Dimensions

We study a clustering problem where the goal is to maximize the coverage...
