COBRA: A Fast and Simple Method for Active Clustering with Pairwise Constraints

01/30/2018
by Toon Van Craenendonck, et al.
Clustering is inherently ill-posed: there often exist multiple valid clusterings of a single dataset, and without any additional information a clustering system has no way of knowing which clustering it should produce. This motivates the use of constraints in clustering, as they allow users to communicate their interests to the clustering system. Active constraint-based clustering algorithms select the most useful constraints to query, aiming to produce a good clustering using as few constraints as possible. We propose COBRA, an active method that first over-clusters the data by running K-means with a deliberately too large K, and subsequently merges the resulting small clusters into larger ones based on pairwise constraints. In its merging step, COBRA keeps the number of pairwise queries low by maximally exploiting constraint transitivity and entailment. We experimentally show that COBRA outperforms the state of the art in terms of clustering quality and runtime, without requiring the number of clusters to be specified in advance.
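
The merging idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the over-clustering step has already produced one representative point per small cluster, it visits pairs in a fixed order of representative distance (a simplification), and the names `merge_super_instances`, `representatives`, and `oracle` are hypothetical. It only shows how transitivity (merged clusters need no further must-link queries) and entailment (a cannot-link between two clusters carries over when either one is merged) can reduce the number of queries.

```python
from itertools import combinations
from math import dist


def merge_super_instances(representatives, oracle):
    """Merge small clusters using pairwise queries (illustrative sketch).

    representatives: list of points (tuples), one per small cluster.
    oracle(i, j): returns True for must-link, False for cannot-link.
    Returns (labels, n_queries), where labels[i] is the merged-cluster id.
    """
    n = len(representatives)
    parent = list(range(n))  # union-find forest over small clusters

    def find(i):
        # Find the root of i, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    cannot = set()  # known cannot-links, stored between current roots
    n_queries = 0

    # Simplifying assumption: visit pairs from closest to farthest.
    pairs = sorted(
        combinations(range(n), 2),
        key=lambda p: dist(representatives[p[0]], representatives[p[1]]),
    )
    for i, j in pairs:
        a, b = find(i), find(j)
        if a == b:
            continue  # must-link already implied by transitivity
        if frozenset((a, b)) in cannot:
            continue  # cannot-link already entailed by earlier answers
        n_queries += 1
        if oracle(i, j):
            parent[b] = a
            # Cannot-links of the absorbed cluster now apply to the merged one.
            cannot = {frozenset(a if r == b else r for r in pair) for pair in cannot}
        else:
            cannot.add(frozenset((a, b)))

    return [find(i) for i in range(n)], n_queries


# Usage with a simulated user who answers from hidden ground-truth labels.
representatives = [(0, 0), (0, 1), (5, 5), (5, 6), (0, 2)]
truth = [0, 0, 1, 1, 0]
labels, n_queries = merge_super_instances(
    representatives, lambda i, j: truth[i] == truth[j]
)
print(labels, n_queries)
```

In this toy run the simulated user is asked about fewer pairs than the ten that exist among five small clusters, because answers already implied by earlier ones are never queried.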

research
03/29/2018

COBRAS: Fast, Iterative, Active Clustering with Pairwise Constraints

Constraint-based clustering algorithms exploit background knowledge to c...
research
02/07/2014

Active Clustering with Model-Based Uncertainty Reduction

Semi-supervised clustering seeks to augment traditional clustering metho...
research
12/29/2022

PCCC: The Pairwise-Confidence-Constraints-Clustering Algorithm

We consider a semi-supervised k-clustering problem where information is ...
research
03/23/2022

Constrained Clustering and Multiple Kernel Learning without Pairwise Constraint Relaxation

Clustering under pairwise constraints is an important knowledge discover...
research
02/20/2023

Active Learning with Positive and Negative Pairwise Feedback

In this paper, we propose a generic framework for active clustering with...
research
05/24/2018

Hierarchical Clustering with Structural Constraints

Hierarchical clustering is a popular unsupervised data analysis method. ...
research
07/24/2019

Constrained K-means with General Pairwise and Cardinality Constraints

In this work, we study constrained clustering, where constraints are uti...
