Algorithm-Agnostic Interpretations for Clustering

09/21/2022
by Christian A. Scholbeck, et al.

Clustering outcomes for high-dimensional data are typically interpreted via post-processing that involves dimension reduction and subsequent visualization. Such post-processing destroys the meaning of the data and obfuscates interpretations. We propose algorithm-agnostic interpretation methods to explain clustering outcomes in reduced dimensions while preserving the integrity of the data. The permutation feature importance for clustering represents a general framework based on shuffling feature values and measuring changes in cluster assignments through custom score functions. The individual conditional expectation for clustering indicates observation-wise changes in the cluster assignment due to changes in the data. The partial dependence for clustering evaluates average changes in cluster assignments for the entire feature space. All methods can be used with any clustering algorithm able to reassign instances through soft or hard labels. In contrast to common post-processing methods such as principal component analysis, the introduced methods maintain the original structure of the features.
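The three methods summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes hard labels from a hypothetical nearest-centroid `assign` function standing in for any clustering algorithm that can reassign instances, and it assumes the share of observations whose label changes as the score function (the abstract leaves the score function open). All function names, the toy data, and the grid values are illustrative assumptions.

```python
import random


def assign(point, centroids):
    """Hard label: index of the nearest centroid (squared Euclidean distance)."""
    dists = [sum((p - c) ** 2 for p, c in zip(point, centroid))
             for centroid in centroids]
    return dists.index(min(dists))


def permutation_importance(data, centroids, feature, n_repeats=20, seed=0):
    """Mean share of observations whose label changes when `feature` is shuffled."""
    rng = random.Random(seed)
    baseline = [assign(x, centroids) for x in data]
    total = 0.0
    for _ in range(n_repeats):
        column = [x[feature] for x in data]
        rng.shuffle(column)  # break the link between this feature and the rest
        shuffled = [x[:feature] + [v] + x[feature + 1:]
                    for x, v in zip(data, column)]
        labels = [assign(x, centroids) for x in shuffled]
        total += sum(a != b for a, b in zip(baseline, labels)) / len(data)
    return total / n_repeats


def ice_curves(data, centroids, feature, grid):
    """Per observation: the label obtained when `feature` is set to each grid value."""
    return [[assign(x[:feature] + [g] + x[feature + 1:], centroids) for g in grid]
            for x in data]


def partial_dependence(curves, n_clusters):
    """Average the ICE curves: share of observations per cluster at each grid value."""
    n = len(curves)
    return [[sum(curve[j] == k for curve in curves) / n for k in range(n_clusters)]
            for j in range(len(curves[0]))]


# Toy data (assumption): two groups separated along feature 0; feature 1 is noise.
rng = random.Random(0)
data = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(30)]
data += [[rng.gauss(10.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(30)]
centroids = [[0.0, 0.0], [10.0, 0.0]]  # stands in for a fitted clustering

importances = [permutation_importance(data, centroids, f) for f in range(2)]
curves = ice_curves(data, centroids, feature=0, grid=[0.0, 2.5, 7.5, 10.0])
pd_curve = partial_dependence(curves, n_clusters=2)

print("importance per feature:", importances)
print("partial dependence for feature 0:", pd_curve)
```

In this toy setup only feature 0 separates the two centroids, so shuffling it changes many labels while shuffling feature 1 changes none. With soft labels, the same structure would apply with cluster membership probabilities averaged in place of label indicators.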


