Mutual Information based labelling and comparing clusters

02/27/2017
by Rob Koopman, et al.

After a clustering solution is generated automatically, labelling the clusters becomes important for understanding the results. In this paper, we propose a Mutual Information based method for labelling clusters of journal articles. The topical terms that have the highest Normalised Mutual Information (NMI) with a given cluster are selected as that cluster's labels. We discussed the labelling technique with a domain expert to check that the labels are discriminating not only lexically but also semantically. Based on a common set of topical terms, we also propose generating lexical fingerprints to represent individual clusters. Finally, we visualise and compare the fingerprints of different clusters, taken either from one clustering solution or from different ones.
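
The NMI-based labelling described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the mutual information is normalised or how terms are extracted, so the geometric-mean normalisation, the binary term-occurrence model, the positive-association filter, and the names `nmi_binary` and `label_clusters` are all assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(probs):
    """Shannon entropy (natural log) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def nmi_binary(x, y):
    """NMI between two equal-length binary sequences.

    Assumption: MI is normalised by sqrt(H(X) * H(Y)); the abstract
    does not state which normalisation the paper uses.
    """
    n = len(x)
    joint = Counter(zip(x, y))
    px, py = Counter(x), Counter(y)
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        mi += p_ab * math.log(p_ab * n * n / (px[a] * py[b]))
    hx = entropy(v / n for v in px.values())
    hy = entropy(v / n for v in py.values())
    denom = math.sqrt(hx * hy)
    return mi / denom if denom > 0 else 0.0


def label_clusters(docs, assignments, top_k=2):
    """Pick the top_k terms with the highest NMI for each cluster.

    docs: list of sets of topical terms, one set per article.
    assignments: list of cluster ids, one per article.
    """
    n = len(docs)
    vocab = sorted(set().union(*docs))
    labels = {}
    for cluster in sorted(set(assignments)):
        y = [int(a == cluster) for a in assignments]
        scored = []
        for term in vocab:
            x = [int(term in d) for d in docs]
            # Assumption: keep only terms over-represented in the cluster,
            # because NMI is also high for terms that mark its complement.
            in_rate = sum(a and b for a, b in zip(x, y)) / sum(y)
            if in_rate <= sum(x) / n:
                continue
            scored.append((nmi_binary(x, y), term))
        # Highest NMI first; ties broken alphabetically.
        scored.sort(key=lambda s: (-s[0], s[1]))
        labels[cluster] = [term for _, term in scored[:top_k]]
    return labels


# Toy data, invented purely for illustration.
docs = [
    {"galaxy", "star"},
    {"galaxy", "planet"},
    {"protein", "cell"},
    {"protein", "gene"},
]
assignments = [0, 0, 1, 1]
labels = label_clusters(docs, assignments, top_k=1)
```

In this toy data "galaxy" occurs in exactly the articles of cluster 0 and "protein" in exactly those of cluster 1, so each is the highest-scoring term for its cluster. A real pipeline would need a term-extraction step and a much larger vocabulary, neither of which is specified in the abstract.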

research
10/04/2021

Clustering with Respect to the Information Distance

We discuss the notion of a dense cluster with respect to the information...
research
10/12/2022

Generalised Mutual Information for Discriminative Clustering

In the last decade, recent successes in deep clustering majorly involved...
research
03/23/2021

Pairwise Adjusted Mutual Information

A well-known metric for quantifying the similarity between two clusterin...
research
02/27/2017

Contextualization of topics: Browsing through the universe of bibliographic information

This paper describes how semantic indexing can help to generate a contex...
research
10/03/2019

Information based Deep Clustering: An experimental study

Recently, two methods have shown outstanding performance for clustering ...
research
05/01/2017

Forced to Learn: Discovering Disentangled Representations Without Exhaustive Labels

Learning a better representation with neural networks is a challenging p...
research
04/23/2015

svcR: An R Package for Support Vector Clustering improved with Geometric Hashing applied to Lexical Pattern Discovery

We present a new R package which takes a numerical matrix format as data...
