Nearly-Tight and Oblivious Algorithms for Explainable Clustering

06/30/2021
by   Buddhima Gamlath, et al.

We study the problem of explainable clustering in the setting first formalized by Moshkovitz, Dasgupta, Rashtchian, and Frost (ICML 2020). A k-clustering is said to be explainable if it is given by a decision tree where each internal node splits data points with a threshold cut in a single dimension (feature), and each of the k leaves corresponds to a cluster. We give an algorithm that outputs an explainable clustering that loses at most a factor of O(log^2 k) compared to an optimal (not necessarily explainable) clustering for the k-medians objective, and a factor of O(k log^2 k) for the k-means objective. This improves over the previous best upper bounds of O(k) and O(k^2), respectively, and nearly matches the previous Ω(log k) lower bound for k-medians and our new Ω(k) lower bound for k-means. The algorithm is remarkably simple. In particular, given an initial not necessarily explainable clustering in ℝ^d, it is oblivious to the data points and runs in time O(dk log^2 k), independent of the number of data points n. Our upper and lower bounds also generalize to objectives given by higher ℓ_p-norms.
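To make the notion of a threshold tree concrete, here is a minimal Python sketch. It is not the algorithm from the paper: the abstract does not specify the sampling procedure, so the rule used here (pick a dimension with probability proportional to the extent of the centers, then a uniform threshold, separately at each node) is an assumption made for illustration. All names (`Node`, `build_tree`, `assign`, `l1_cost`) are hypothetical. The sketch shares one property with the abstract's description: the tree is built from the k reference centers alone and never inspects the data points. It also assumes the centers are distinct.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Sequence

Point = Sequence[float]


@dataclass
class Node:
    # Internal node: routes x left if x[dim] <= threshold, otherwise right.
    # Leaf: `center` holds the index of the single center it contains.
    dim: Optional[int] = None
    threshold: Optional[float] = None
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    center: Optional[int] = None


def build_tree(centers: List[Point], idx: List[int], rng: random.Random) -> Node:
    """Split the centers in `idx` with random axis-aligned threshold cuts.

    Assumed sampling rule (illustrative only): dimension chosen with
    probability proportional to the centers' extent, threshold uniform.
    """
    if len(idx) == 1:
        return Node(center=idx[0])
    d = len(centers[0])
    lo = [min(centers[i][j] for i in idx) for j in range(d)]
    hi = [max(centers[i][j] for i in idx) for j in range(d)]
    widths = [hi[j] - lo[j] for j in range(d)]  # not all zero if centers are distinct
    while True:
        dim = rng.choices(range(d), weights=widths)[0]
        theta = rng.uniform(lo[dim], hi[dim])
        left = [i for i in idx if centers[i][dim] <= theta]
        right = [i for i in idx if centers[i][dim] > theta]
        if left and right:  # keep only cuts that separate at least two centers
            break
    return Node(dim, theta,
                build_tree(centers, left, rng),
                build_tree(centers, right, rng))


def assign(tree: Node, x: Point) -> int:
    """Follow the threshold cuts from the root to a leaf; return its center index."""
    node = tree
    while node.center is None:
        node = node.left if x[node.dim] <= node.threshold else node.right
    return node.center


def leaves(tree: Node) -> List[int]:
    """Center indices stored at the leaves, left to right."""
    if tree.center is not None:
        return [tree.center]
    return leaves(tree.left) + leaves(tree.right)


def l1_cost(points: List[Point], centers: List[Point], labels: List[int]) -> float:
    """k-medians style cost: sum of l1 distances from each point to its assigned center."""
    return sum(sum(abs(a - b) for a, b in zip(p, centers[c]))
               for p, c in zip(points, labels))


# Hypothetical reference centers and data points.
centers = [(0.0, 0.0), (4.0, 0.5), (1.0, 5.0), (6.0, 6.0)]
points = [(0.2, 0.1), (3.8, 1.0), (1.5, 4.0), (5.0, 5.5), (2.5, 2.5)]

tree = build_tree(centers, list(range(len(centers))), random.Random(0))

# Explainable assignment (via the tree) versus unconstrained nearest-center assignment.
tree_labels = [assign(tree, p) for p in points]
nearest_labels = [
    min(range(len(centers)),
        key=lambda c: sum(abs(a - b) for a, b in zip(p, centers[c])))
    for p in points
]
tree_cost = l1_cost(points, centers, tree_labels)
nearest_cost = l1_cost(points, centers, nearest_labels)
print(tree_cost, nearest_cost)
```

With the centers held fixed, the tree assignment can never cost less than the nearest-center assignment. The ratio of the two costs is the kind of loss that the bounds in the abstract control, although the guarantees stated there apply to the paper's algorithm and not to this sketch.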


