Interactive Steering of Hierarchical Clustering

09/21/2020
by Weikai Yang, et al.

Hierarchical clustering is an important technique for organizing big data in exploratory data analysis. However, existing one-size-fits-all hierarchical clustering methods often fail to meet the diverse needs of different users. To address this challenge, we present an interactive steering method to visually supervise constrained hierarchical clustering by utilizing both public knowledge (e.g., Wikipedia) and private knowledge from users. The novelty of our approach includes 1) automatically constructing constraints for hierarchical clustering using knowledge (knowledge-driven) and the intrinsic data distribution (data-driven), and 2) enabling the interactive steering of clustering through a visual interface (user-driven). Our method first maps each data item to the most relevant items in a knowledge base. An initial constraint tree is then extracted using the ant colony optimization algorithm. The algorithm balances the tree width and depth and covers the data items with high confidence. Given the constraint tree, the data items are hierarchically clustered using an evolutionary Bayesian rose tree. To clearly convey the hierarchical clustering results, we developed an uncertainty-aware tree visualization that enables users to quickly locate the most uncertain sub-hierarchies and interactively improve them. A quantitative evaluation and a case study demonstrate that the proposed approach facilitates the building of customized clustering trees efficiently and effectively.
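As a rough illustration of the first step described above (mapping each data item to the most relevant items in a knowledge base), the following minimal Python sketch ranks knowledge-base entries by token overlap. The abstract does not specify how relevance is computed, so the Jaccard similarity, the function names (`tokenize`, `jaccard`, `map_to_knowledge_base`), the `top_k` parameter, and the toy data are all assumptions made for this example, not the authors' method.

```python
def tokenize(text):
    """Lowercase a string and split it into a set of word tokens."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity between two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def map_to_knowledge_base(items, kb_entries, top_k=1):
    """For each data item, return its top_k knowledge-base entries with scores.

    Hypothetical stand-in for the mapping step: relevance here is plain
    token overlap, and the score serves as a crude confidence value.
    """
    kb_tokens = {entry: tokenize(entry) for entry in kb_entries}
    mapping = {}
    for item in items:
        item_tokens = tokenize(item)
        # Sort by descending score, breaking ties alphabetically by entry.
        scored = sorted(
            ((jaccard(item_tokens, toks), entry) for entry, toks in kb_tokens.items()),
            key=lambda pair: (-pair[0], pair[1]),
        )
        mapping[item] = [(entry, score) for score, entry in scored[:top_k]]
    return mapping


# Toy data, invented for illustration only.
items = ["deep neural network training", "stock market prices"]
kb_entries = ["Neural network", "Stock market", "Ant colony optimization"]
mapping = map_to_knowledge_base(items, kb_entries)
```

In a pipeline like the one the abstract outlines, scores of this kind could plausibly indicate how confidently each item is covered when a constraint tree is extracted; how the actual system defines and uses confidence is described in the full text, not here.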

research
02/13/2020

Tree-SNE: Hierarchical Clustering and Visualization Using t-SNE

t-SNE and hierarchical clustering are popular methods of exploratory dat...
research
04/06/2017

An Online Hierarchical Algorithm for Extreme Clustering

Many modern clustering methods scale well to a large number of data item...
research
04/08/2018

A Proposal of Interactive Growing Hierarchical SOM

Self Organizing Map is trained using unsupervised learning to produce a ...
research
04/09/2018

Clustrophile 2: Guided Visual Clustering Analysis

Data clustering is a common unsupervised learning method frequently used...
research
08/08/2022

Clustering Optimisation Method for Highly Connected Biological Data

Currently, data-driven discovery in biological sciences resides in findi...
research
10/08/2020

Clustering Analysis of Interactive Learning Activities Based on Improved BIRCH Algorithm

Group tendency is a research branch of computer assisted learning. The c...
research
12/20/2020

eTREE: Learning Tree-structured Embeddings

Matrix factorization (MF) plays an important role in a wide range of mac...
