Predictive K-means with local models

12/16/2020
by Vincent Lemaire, et al.

Supervised classification can be effective for prediction but is sometimes weak on interpretability or explainability (XAI). Clustering, on the other hand, tends to isolate categories or profiles that can be meaningful, but there is no guarantee that they are useful for label prediction. Predictive clustering seeks the best of both worlds. Starting from labeled data, it looks for clusters that are as pure as possible with regard to the class labels. One technique consists in tweaking a clustering algorithm so that data points sharing the same label tend to aggregate together. With distance-based algorithms such as k-means, one solution is to modify the distance used by the algorithm so that it incorporates information about the labels of the data points. In this paper, we propose another method, which relies on a change of representation guided by class densities and then carries out clustering in this new representation space. We present two new algorithms using this technique and show on a variety of data sets that they are competitive in prediction performance with pure supervised classifiers while offering interpretability of the clusters discovered.
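The general idea of clustering in a label-informed representation can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not specify how the class densities are estimated, so the sketch assumes a Gaussian naive Bayes model from scikit-learn and uses its posterior class probabilities as a stand-in for the density-guided representation. The function names `class_density_representation` and `predictive_kmeans` are hypothetical, chosen only for this illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.naive_bayes import GaussianNB


def class_density_representation(X, y):
    """Map each point to a vector of per-class probabilities.

    Assumption: Gaussian naive Bayes posteriors stand in for the
    class-density-guided representation mentioned in the abstract.
    """
    model = GaussianNB().fit(X, y)
    return model.predict_proba(X), model


def predictive_kmeans(X, y, n_clusters, seed=0):
    """Run standard k-means in the supervised representation space,
    then label each cluster by the majority class of its members."""
    Z, model = class_density_representation(X, y)
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(Z)
    cluster_to_label = {}
    for c in range(n_clusters):
        members = y[km.labels_ == c]
        if members.size == 0:
            continue  # defensive: skip a cluster with no members
        values, counts = np.unique(members, return_counts=True)
        cluster_to_label[c] = values[np.argmax(counts)]
    return km, model, cluster_to_label


if __name__ == "__main__":
    X, y = load_iris(return_X_y=True)
    km, model, cluster_to_label = predictive_kmeans(X, y, n_clusters=3)
    # Predict each point's label from the majority class of its cluster.
    predicted = np.array([cluster_to_label[c] for c in km.labels_])
    print("training accuracy via cluster labels:", np.mean(predicted == y))
```

Because the clustering step is plain k-means, each cluster can still be inspected through its centroid in the probability space, which is one way such a model might remain readable; the choice of density estimator and the number of clusters are left open here.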


research
05/12/2023

Rethinking k-means from manifold learning perspective

Although numerous clustering algorithms have been developed, many existi...
research
07/05/2019

Hybridized Threshold Clustering for Massive Data

As the size n of datasets become massive, many commonly-used clustering ...
research
04/25/2020

Clustering by Constructing Hyper-Planes

As a kind of basic machine learning method, clustering algorithms group ...
research
07/23/2021

Text Classification and Clustering with Annealing Soft Nearest Neighbor Loss

We define disentanglement as how far class-different data points from ea...
research
01/26/2022

Multi-objective Semi-supervised Clustering for Finding Predictive Clusters

This study concentrates on clustering problems and aims to find compact ...
research
02/12/2022

Towards Continuous Consistency Axiom

Development of new algorithms in the area of machine learning, especiall...
research
07/25/2022

Orthogonalization of data via Gromov-Wasserstein type feedback for clustering and visualization

In this paper we propose an adaptive approach for clustering and visuali...
