Interactive Exploration of Large Dendrograms with Prototypes

06/03/2022
by Andee Kaplan, et al.

Hierarchical clustering is one of the standard methods taught for identifying and exploring the underlying structures that may be present within a data set. Students are shown examples in which the dendrogram, a visual representation of the hierarchical clustering, reveals a clear clustering structure. In practice, however, data analysts today frequently encounter data sets whose large scale undermines the usefulness of the dendrogram as a visualization tool: densely packed branches obscure structure, and overlapping labels are impossible to read. In this paper, we present a new workflow for performing hierarchical clustering using the R package protoshiny, which aims to restore hierarchical clustering to its former role as an effective and versatile visualization tool. Our proposal combines interactivity with the ability to label internal nodes in a dendrogram with a representative data point (called a prototype). After presenting the workflow, we provide three case studies to demonstrate its utility.
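
To make the idea of a prototype concrete, the following is a minimal Python sketch, not code from protoshiny. It assumes one common definition, the minimax rule, under which the prototype of a cluster is the member whose farthest fellow member is as close as possible. The abstract does not state which rule the package uses, and the function name `minimax_prototype` and the toy data are hypothetical.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def minimax_prototype(points):
    """Return (index, radius) of the minimax prototype of a cluster.

    Assumption: the prototype is the member that minimizes the maximum
    distance to every other member. The abstract itself does not say
    which rule is used to choose the representative point.
    """
    best_index, best_radius = None, float("inf")
    for i, candidate in enumerate(points):
        # Radius needed for a ball centered at this candidate to cover the cluster.
        radius = max(dist(candidate, other) for other in points)
        if radius < best_radius:
            best_index, best_radius = i, radius
    return best_index, best_radius


# Hypothetical toy cluster of four 2-D points.
cluster = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (1.0, 1.0)]
index, radius = minimax_prototype(cluster)
print(cluster[index], radius)
```

Under this assumption, an internal node of the dendrogram could be labeled with the prototype of the points beneath it, so a collapsed branch still shows one actual observation rather than an unreadable stack of leaf labels.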


