DeepAI AI Chat
Log In Sign Up

Learning-Augmented k-means Clustering

10/27/2021
by   Jon Ergun, et al.
MIT
Carnegie Mellon University
0

k-means clustering is a well-studied problem due to its wide applicability. Unfortunately, there exist strong theoretical limits on the performance of any algorithm for the k-means problem on worst-case inputs. To overcome this barrier, we consider a scenario where "advice" is provided to help perform clustering. Specifically, we consider the k-means problem augmented with a predictor that, given any point, returns its cluster label in an approximately optimal clustering up to some, possibly adversarial, error. We present an algorithm whose performance improves along with the accuracy of the predictor, even though naïvely following the accurate predictor can still lead to a high clustering cost. Thus if the predictor is sufficiently accurate, we can retrieve a close to optimal clustering with nearly optimal runtime, breaking known computational barriers for algorithms that do not have access to such advice. We evaluate our algorithms on real datasets and show significant improvements in the quality of clustering.

READ FULL TEXT

page 1

page 2

page 3

page 4

10/31/2022

Improved Learning-augmented Algorithms for k-means and k-medians Clustering

We consider the problem of clustering in the learning-augmented setting,...
02/28/2020

Explainable k-Means and k-Medians Clustering

Clustering is a popular form of unsupervised learning for geometric data...
06/15/2018

Query K-means Clustering and the Double Dixie Cup Problem

We consider the problem of approximate K-means clustering with outliers ...
12/04/2017

Clustering Stable Instances of Euclidean k-means

The Euclidean k-means problem is arguably the most widely-studied cluste...
05/22/2017

Improved Clustering with Augmented k-means

Identifying a set of homogeneous clusters in a heterogeneous dataset is ...
08/04/2020

Out-of-Distribution Generalization with Maximal Invariant Predictor

Out-of-Distribution (OOD) generalization problem is a problem of seeking...
06/28/2021

Robust Learning-Augmented Caching: An Experimental Study

Effective caching is crucial for the performance of modern-day computing...