Tree Index: A New Cluster Evaluation Technique

03/24/2020
by A. H. Beg, et al.

We introduce a cluster evaluation technique called Tree Index. Rather than reporting a single quantitative score as existing cluster-quality indexes do (where the representation power of a clustering is a cumulative error, similar to vector quantization), Tree Index aims to describe the structural information of the clustering. It finds margins among clusters for easy learning, without the complications of Minimum Description Length. Tree Index builds a decision tree from the clustered data set, using the cluster identifiers as labels, and combines the entropy of each leaf with its depth. Intuitively, a shorter tree with pure leaves generalizes the data well: the clusters are easy to learn because they are well separated, so the labels correspond to meaningful clusters. If the clustering algorithm does not separate the data well, trees learned from its results will be large and overly detailed. We show that, on clustering results obtained by various techniques on a brain dataset, Tree Index discriminates between reasonable and non-sensible clusters. We confirm the effectiveness of Tree Index through graphical visualizations. Tree Index ranks the sensible solutions higher than the non-sensible ones, while existing cluster-quality indexes fail to do so.
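The idea of learning a decision tree on cluster identifiers and then inspecting leaf entropy and leaf depth can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not state how entropy and depth are combined, so the code below only reports the two ingredients separately (size-weighted mean leaf entropy and size-weighted mean leaf depth). The greedy axis-aligned tree learner, the `max_depth` limit, and all function names (`entropy`, `best_split`, `leaves`, `tree_summary`) are assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of cluster labels."""
    n = len(labels)
    h = -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())
    return max(0.0, h)  # avoid returning -0.0 for pure nodes


def best_split(X, y):
    """Return the (feature, threshold) that most reduces entropy, or None."""
    best, best_score = None, entropy(y)
    n = len(y)
    for f in range(len(X[0])):
        values = sorted(set(row[f] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            score = (len(left) * entropy(left) + len(right) * entropy(right)) / n
            if score < best_score - 1e-12:
                best, best_score = (f, t), score
    return best


def leaves(X, y, depth=0, max_depth=8):
    """Grow a tree greedily; return one (depth, entropy, size) tuple per leaf."""
    split = None
    if depth < max_depth and entropy(y) > 0:
        split = best_split(X, y)
    if split is None:
        return [(depth, entropy(y), len(y))]
    f, t = split
    left = [(row, lab) for row, lab in zip(X, y) if row[f] <= t]
    right = [(row, lab) for row, lab in zip(X, y) if row[f] > t]
    return (
        leaves([r for r, _ in left], [l for _, l in left], depth + 1, max_depth)
        + leaves([r for r, _ in right], [l for _, l in right], depth + 1, max_depth)
    )


def tree_summary(X, cluster_ids):
    """Size-weighted mean leaf entropy and mean leaf depth (illustrative only)."""
    leaf_list = leaves(X, cluster_ids)
    n = sum(size for _, _, size in leaf_list)
    mean_entropy = sum(h * size for _, h, size in leaf_list) / n
    mean_depth = sum(d * size for d, _, size in leaf_list) / n
    return mean_entropy, mean_depth


# Toy one-dimensional data with two clusterings of the same points.
X = [[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]]
well_separated = [0, 0, 0, 1, 1, 1]   # clusters match the gap in the data
interleaved = [0, 1, 0, 1, 0, 1]      # clusters ignore the structure

print(tree_summary(X, well_separated))
print(tree_summary(X, interleaved))
```

On this toy data the well-separated labelling is learned by a single split, while the interleaved labelling forces a deeper tree, which mirrors the intuition that poorly separated clusters yield large, overly detailed trees.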


research
07/15/2020

Evaluating and Validating Cluster Results

Clustering is the technique to partition data according to their charact...
research
05/21/2015

Parallel Streaming Signature EM-tree: A Clustering Algorithm for Web Scale Applications

The proliferation of the web presents an unsolved problem of automatical...
research
06/12/2015

Leading Tree in DPCLUS and Its Impact on Building Hierarchies

This paper reveals the tree structure as an intermediate result of clust...
research
12/30/2022

A novel cluster internal evaluation index based on hyper-balls

It is crucial to evaluate the quality and determine the optimal number o...
research
02/08/2020

Index-based Solutions for Efficient Density Peaks Clustering

Density Peaks Clustering (DPC), a novel density-based clustering approac...
research
10/01/2019

Deep Lifetime Clustering

The goal of lifetime clustering is to develop an inductive model that ma...
research
10/26/2020

Data Segmentation via t-SNE, DBSCAN, and Random Forest

This research proposes a data segmentation technique which is easy to in...
