Improved Approximations for Euclidean k-means and k-median, via Nested Quasi-Independent Sets

04/11/2022
by Vincent Cohen-Addad, et al.

Motivated by data analysis and machine learning applications, we consider the popular high-dimensional Euclidean k-median and k-means problems. We propose a new primal-dual algorithm, inspired by the classic algorithm of Jain and Vazirani and the recent algorithm of Ahmadian, Norouzi-Fard, Svensson, and Ward. Our algorithm achieves approximation ratios of 2.406 and 5.912 for Euclidean k-median and k-means, respectively, improving upon the 2.633 approximation ratio of Ahmadian et al. and the 6.1291 approximation ratio of Grandoni, Ostrovsky, Rabani, Schulman, and Venkat. Our techniques exploit the Euclidean metric much more strongly than previous work on Euclidean clustering. In addition, we introduce a new method of removing excess centers using a variant of independent sets over graphs that we dub a "nested quasi-independent set". This technique may in turn be of interest for other optimization problems in Euclidean and ℓ_p metric spaces.
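For readers unfamiliar with the two objectives, the following is a minimal sketch of the standard cost functions that the approximation ratios refer to. It is not the primal-dual algorithm of the paper. The function names and the toy data are hypothetical and chosen only for illustration. Under the usual definitions, k-median sums the Euclidean distance from each point to its nearest center, k-means sums the squared distance, and an α-approximation returns k centers whose cost is at most α times the optimum.

```python
import math

def nearest_distance(point, centers):
    """Euclidean distance from `point` to its closest center."""
    return min(math.dist(point, c) for c in centers)

def k_median_cost(points, centers):
    """k-median objective: sum of distances to the nearest center."""
    return sum(nearest_distance(p, centers) for p in points)

def k_means_cost(points, centers):
    """k-means objective: sum of squared distances to the nearest center."""
    return sum(nearest_distance(p, centers) ** 2 for p in points)

# Hypothetical toy instance: four points in the plane, k = 2 centers.
points = [(0.0, 0.0), (0.0, 2.0), (4.0, 0.0), (4.0, 3.0)]
centers = [(0.0, 0.0), (4.0, 0.0)]

print(k_median_cost(points, centers))  # distances 0 + 2 + 0 + 3
print(k_means_cost(points, centers))   # squared distances 0 + 4 + 0 + 9
```

The squared distance in k-means penalizes far-away points more heavily, which is one reason the two problems admit different approximation ratios.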


research
08/30/2022

On the Fixed-Parameter Tractability of Capacitated Clustering

We study the complexity of the classic capacitated k-median and k-means ...
research
12/19/2018

Approximation Schemes for Capacitated Clustering in Doubling Metrics

Motivated by applications in redistricting, we consider the uniform capa...
research
11/09/2020

Hardness of Approximation of Euclidean k-Median

The Euclidean k-median problem is defined in the following manner: given...
research
07/15/2021

A Refined Approximation for Euclidean k-Means

In the Euclidean k-Means problem we are given a collection of n points D...
research
04/28/2019

Tight FPT Approximations for k-Median and k-Means

We investigate the fine-grained complexity of approximating the classica...
research
07/12/2018

Turning Big data into tiny data: Constant-size coresets for k-means, PCA and projective clustering

We develop and analyze a method to reduce the size of a very large set o...
research
12/27/2018

Hierarchical Clustering for Euclidean Data

Recent works on Hierarchical Clustering (HC), a well-studied problem in ...
