Shallow decision trees for explainable k-means clustering

12/29/2021
by Eduardo Laber, et al.

A number of recent works have employed decision trees to construct explainable partitions that aim to minimize the k-means cost function. These works, however, largely ignore metrics related to the depths of the leaves in the resulting tree, which is perhaps surprising given that the explainability of a decision tree depends on these depths. To fill this gap in the literature, we propose an efficient algorithm that takes these metrics into account. In experiments on 16 datasets, our algorithm yields better results than decision-tree clustering algorithms such as the ones presented in <cit.>, <cit.>, <cit.> and <cit.>, typically achieving lower or equivalent costs with considerably shallower trees. We also show, through a simple adaptation of existing techniques, that the problem of building explainable partitions induced by binary trees for the k-means cost function does not admit a (1+ϵ)-approximation in polynomial time unless P=NP, which justifies the quest for approximation algorithms and/or heuristics.
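To make the two quantities discussed in the abstract concrete, the following minimal sketch evaluates a partition induced by a small axis-aligned binary tree: it computes the k-means cost of the leaves and reports their depths. It is not the algorithm proposed in the paper. The `Node` class, the helper names, the toy points, and the choice of maximum depth as the reported depth metric are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Point = Sequence[float]


@dataclass
class Node:
    """Hypothetical tree node. An internal node tests point[feature] <= threshold;
    a node with no children is a leaf and defines one cluster."""
    feature: int = -1
    threshold: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    def is_leaf(self) -> bool:
        return self.left is None and self.right is None


def leaves_with_depth(node: Node, points: List[Point], depth: int = 0) -> List[Tuple[int, List[Point]]]:
    """Route points down the tree and return (depth, points) for every leaf."""
    if node.is_leaf():
        return [(depth, list(points))]
    low = [p for p in points if p[node.feature] <= node.threshold]
    high = [p for p in points if p[node.feature] > node.threshold]
    return (leaves_with_depth(node.left, low, depth + 1)
            + leaves_with_depth(node.right, high, depth + 1))


def kmeans_cost(clusters: List[List[Point]]) -> float:
    """Sum of squared Euclidean distances from each point to its cluster mean."""
    total = 0.0
    for cluster in clusters:
        if not cluster:
            continue  # an empty leaf contributes nothing
        dim = len(cluster[0])
        mean = [sum(p[j] for p in cluster) / len(cluster) for j in range(dim)]
        total += sum((p[j] - mean[j]) ** 2 for p in cluster for j in range(dim))
    return total


# Toy data (assumed): two pairs of nearby points and one isolated point.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0), (5.0, 10.0)]

# A three-leaf tree: split on the second coordinate, then on the first.
tree = Node(feature=1, threshold=5.0,
            left=Node(feature=0, threshold=5.0, left=Node(), right=Node()),
            right=Node())

leaves = leaves_with_depth(tree, points)
cost = kmeans_cost([cluster for _, cluster in leaves])
depths = [depth for depth, _ in leaves]
max_depth = max(depths)
print(cost, depths, max_depth)
```

Each leaf corresponds to a cluster that can be described by the short sequence of threshold tests on the path from the root, so shallower leaves give shorter explanations. Comparing `cost` and the leaf depths across candidate trees illustrates the trade-off the abstract refers to.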


research
01/05/2021

On the price of explainability for some clustering problems

The price of explainability for a clustering task can be defined as the ...
research
11/04/2021

Explainable k-means. Don't be greedy, plant bigger trees!

We provide a new bi-criteria Õ(log^2 k) competitive algorithm for explai...
research
12/19/2019

Meta Decision Trees for Explainable Recommendation Systems

We tackle the problem of building explainable recommendation systems tha...
research
09/22/2022

XClusters: Explainability-first Clustering

We study the problem of explainability-first clustering where explainabi...
research
08/20/2022

The computational complexity of some explainable clustering problems

We study the computational complexity of some explainable clustering pro...
research
06/20/2019

ID3 Learns Juntas for Smoothed Product Distributions

In recent years, there are many attempts to understand popular heuristic...
research
08/13/2021

An Information-theoretic Perspective of Hierarchical Clustering

A combinatorial cost function for hierarchical clustering was introduced...
