Hierarchical clustering: visualization, feature importance and model selection

11/30/2021
by Luben M. C. Cabezas, et al.

We propose methods for the analysis of hierarchical clustering that fully use the multi-resolution structure provided by a dendrogram. Specifically, we propose a loss for choosing between clustering methods, a feature importance score, and a graphical tool for visualizing the segmentation of features in a dendrogram. Current approaches to these tasks lose information because they require the user to generate a single partition of the instances by cutting the dendrogram at a specified level. Our proposed methods instead use the full structure of the dendrogram. The key insight behind them is to view a dendrogram as a phylogeny. This analogy permits the assignment of a feature value to each internal node of the tree through ancestral state reconstruction. Experiments on real and simulated datasets provide evidence that the proposed framework performs well. We provide an R package that implements our methods.
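To make the phylogeny analogy concrete, the following is a minimal Python sketch of assigning a feature value to every internal node of a dendrogram. It is not the authors' R package or their reconstruction method: the function name `reconstruct_internal_values` is hypothetical, the toy data are made up for illustration, and the size-weighted mean of the two children is an assumed, deliberately simple stand-in for a proper ancestral state reconstruction.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage


def reconstruct_internal_values(Z, feature):
    """Assign a feature value to every node of the dendrogram encoded by Z.

    Leaves keep their observed values. Each internal node receives the
    size-weighted mean of its two children, i.e. the mean over its leaves.
    ASSUMPTION: this rule is an illustrative stand-in, not the ancestral
    state reconstruction used in the paper.
    """
    n = Z.shape[0] + 1                 # number of leaves (instances)
    values = np.empty(2 * n - 1)       # one slot per leaf and internal node
    sizes = np.ones(2 * n - 1)         # leaves have size 1
    values[:n] = feature
    # Row i of Z merges nodes a and b into the new node n + i.
    for i, (a, b, _dist, count) in enumerate(Z):
        a, b = int(a), int(b)
        node = n + i
        sizes[node] = count
        values[node] = (sizes[a] * values[a] + sizes[b] * values[b]) / count
    return values


# Toy data (hypothetical): two well-separated groups in two dimensions.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 0.3, size=(5, 2)),
    rng.normal(5.0, 0.3, size=(5, 2)),
])

Z = linkage(X, method="average")                    # agglomerative dendrogram
node_values = reconstruct_internal_values(Z, X[:, 0])
```

Every node of the tree now carries a value for the first feature, so the feature can be inspected at all resolutions of the dendrogram instead of at a single cut. Under this stand-in rule the root simply holds the overall mean of the feature.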

research
11/02/2018

Foundations of Comparison-Based Hierarchical Clustering

We address the classical problem of hierarchical clustering, but in a fr...
research
04/17/2014

Hierarchical Quasi-Clustering Methods for Asymmetric Networks

This paper introduces hierarchical quasi-clustering methods, a generaliz...
research
06/03/2022

Interactive Exploration of Large Dendrograms with Prototypes

Hierarchical clustering is one of the standard methods taught for identi...
research
12/16/2015

Blockout: Dynamic Model Selection for Hierarchical Deep Networks

Most deep architectures for image classification--even those that are tr...
research
05/24/2023

Hierarchical clustering with dot products recovers hidden tree structure

In this paper we offer a new perspective on the well established agglome...
research
01/13/2022

The R Package HCV for Hierarchical Clustering from Vertex-links

The HCV package implements the hierarchical clustering for spatial data....
research
09/21/2022

Algorithm-Agnostic Interpretations for Clustering

A clustering outcome for high-dimensional data is typically interpreted ...
