
Nearly-Tight and Oblivious Algorithms for Explainable Clustering
We study the problem of explainable clustering in the setting first formalized by Moshkovitz, Dasgupta, Rashtchian, and Frost (ICML 2020). A k-clustering is said to be explainable if it is given by a decision tree where each internal node splits data points with a threshold cut in a single dimension (feature), and each of the k leaves corresponds to a cluster. We give an algorithm that outputs an explainable clustering that loses at most a factor of O(log^2 k) compared to an optimal (not necessarily explainable) clustering for the k-medians objective, and a factor of O(k log^2 k) for the k-means objective. This improves over the previous best upper bounds of O(k) and O(k^2), respectively, and nearly matches the previous Ω(log k) lower bound for k-medians and our new Ω(k) lower bound for k-means. The algorithm is remarkably simple. In particular, given an initial not necessarily explainable clustering in ℝ^d, it is oblivious to the data points and runs in time O(dk log^2 k), independent of the number of data points n. Our upper and lower bounds also generalize to objectives given by higher ℓ_p-norms.
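To illustrate the threshold-tree structure the abstract describes, here is a minimal sketch of an oblivious explainable clustering: given a set of reference centers (from any initial clustering), it recursively separates them with random axis-aligned threshold cuts until each leaf holds one center, never touching the data points. The uniform choice of cut below is a simplification for illustration only; the paper's algorithm draws cuts from a carefully designed distribution to achieve its approximation guarantees, and all function names here are hypothetical.

```python
import random


def build_threshold_tree(centers, rng=None):
    """Recursively split a list of distinct centers (tuples in R^d) with
    axis-aligned threshold cuts until each leaf contains one center.
    Oblivious: only the centers are inspected, never the data points.
    NOTE: the uniform random cut is a simplification; the paper uses a
    tailored random distribution over cuts to bound the cost."""
    rng = rng or random.Random(0)
    if len(centers) == 1:
        return {"center": centers[0]}  # leaf = one cluster
    d = len(centers[0])
    while True:
        i = rng.randrange(d)                       # random coordinate
        lo = min(c[i] for c in centers)
        hi = max(c[i] for c in centers)
        theta = rng.uniform(lo, hi)                # random threshold
        left = [c for c in centers if c[i] <= theta]
        right = [c for c in centers if c[i] > theta]
        if left and right:                         # cut must separate centers
            return {
                "dim": i,
                "threshold": theta,
                "left": build_threshold_tree(left, rng),
                "right": build_threshold_tree(right, rng),
            }


def assign(tree, point):
    """Route a data point down the threshold tree to its cluster's center."""
    while "center" not in tree:
        side = "left" if point[tree["dim"]] <= tree["threshold"] else "right"
        tree = tree[side]
    return tree["center"]
```

Because data points are only consulted at assignment time, the tree itself can be built in time depending on k and d alone, mirroring the O(dk log^2 k) obliviousness claim in the abstract (the construction above does not reproduce that exact running time).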