ClusterNet: A Perception-Based Clustering Model for Scattered Data

by Sebastian Hartwig, et al.

Cluster separation in scatterplots is typically tackled with widely used clustering techniques such as k-means or DBSCAN. However, because these algorithms are based on non-perceptual metrics, their output often does not reflect human cluster perception. To bridge the gap between human cluster perception and machine-computed clusters, we propose a learning strategy that operates directly on scattered data. To learn perceptual cluster separation on this data, we crowdsourced a large-scale dataset consisting of 7,320 point-wise cluster affiliations for bivariate data, labeled by 384 human crowd workers. Based on this data, we trained ClusterNet, a point-based deep learning model that reflects human perception of cluster separability. To train ClusterNet on human-annotated data, we do not render scatterplots on a 2D canvas; instead, we use a PointNet++ architecture, which enables inference directly on point clouds. In this work, we provide details on how we collected our dataset, report statistics of the resulting annotations, and investigate perceptual agreement of cluster separation for real-world data. We further report the training and evaluation protocol of ClusterNet and introduce a novel metric that measures the accuracy between a clustering technique and a group of human annotators. Finally, we compare our approach against existing state-of-the-art clustering techniques.
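
The abstract does not define the proposed metric, so the following is only an illustrative stand-in, not the authors' method. It is a minimal sketch of one common way to score a clustering against several human annotators: compute the plain Rand index against each annotator's point-wise labels and average the results. The function names `rand_index` and `mean_agreement` and the toy labels are assumptions made up for this example.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of point pairs on which two labelings agree.

    A pair counts as an agreement when both labelings put the two points
    in the same cluster, or both put them in different clusters. Cluster
    ids themselves do not need to match between the labelings.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must cover the same points")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        return 1.0  # zero or one point: nothing to disagree on
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def mean_agreement(predicted, annotations):
    """Average Rand index of one clustering against a group of annotators.

    NOTE: stand-in for illustration only; this is not the metric
    introduced in the paper.
    """
    if not annotations:
        raise ValueError("need at least one annotator")
    return sum(rand_index(predicted, a) for a in annotations) / len(annotations)


# Hypothetical toy data: six points labeled by three annotators.
annotators = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],  # this annotator moves the third point
]
algorithm_output = [5, 5, 5, 7, 7, 7]  # arbitrary cluster ids

score = mean_agreement(algorithm_output, annotators)
print(round(score, 3))
```

Because the Rand index only compares pair memberships, it is insensitive to how clusters are numbered, which matters when annotators and algorithms assign ids independently. A chance-corrected variant such as the adjusted Rand index may be preferable when the number of clusters differs widely between labelings.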


CLAMS: A Cluster Ambiguity Measure for Estimating Perceptual Variability in Visual Clustering

Visual clustering is a common perceptual task in scatterplots that suppo...

Cluster validity index based on Jeffrey divergence

Cluster validity indexes are very important tools designed for two purpo...

Unique Metric for Health Analysis with Optimization of Clustering Activity and Cross Comparison of Results from Different Approach

In machine learning and data mining, Cluster analysis is one of the most...

Learning Neural Models for End-to-End Clustering

We propose a novel end-to-end neural network architecture that, once tra...

Modeling the Influence of Visual Density on Cluster Perception in Scatterplots Using Topology

Scatterplots are used for a variety of visual analytics tasks, including...

Meta-Learning to Cluster

Clustering is one of the most fundamental and wide-spread techniques in ...

A Cluster Ranking Model for Full Anaphora Resolution

Anaphora resolution (coreference) systems designed for the CONLL 2012 da...
