COBRAS: Fast, Iterative, Active Clustering with Pairwise Constraints

03/29/2018
by Toon Van Craenendonck, et al.

Constraint-based clustering algorithms exploit background knowledge to construct clusterings that are aligned with the interests of a particular user. This background knowledge is often obtained by allowing the clustering system to pose pairwise queries to the user: should these two elements be in the same cluster or not? Active clustering methods aim to minimize the number of queries needed to obtain a good clustering by querying the most informative pairs first. Ideally, a user should be able to answer a couple of these queries, inspect the resulting clustering, and repeat these two steps until a satisfactory result is obtained. We present COBRAS, an approach to active clustering with pairwise constraints that is suited for such an interactive clustering process. A core concept in COBRAS is that of a super-instance: a local region in the data in which all instances are assumed to belong to the same cluster. COBRAS constructs such super-instances in a top-down manner to produce high-quality results early on in the clustering process, and keeps refining these super-instances as more pairwise queries are given to get more detailed clusterings later on. We experimentally demonstrate that COBRAS produces good clusterings at fast run times, making it an excellent candidate for the iterative clustering scenario outlined above.
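
The super-instance idea described above can be illustrated with a deliberately simplified Python sketch. It is not the COBRAS algorithm from the paper. The rule for choosing which super-instance to split (the largest one), the tiny 2-means splitter, the use of the point nearest the centroid as representative, and every name here (`active_cluster`, `split_in_two`, `same_cluster`, `budget`) are assumptions made for illustration. The sketch only shows the general pattern: start from one coarse super-instance, split top-down, and spend pairwise queries on representatives to decide which super-instances belong to the same cluster.

```python
import math
import random


def centroid(points, idx):
    """Mean of the points whose indices are in idx."""
    dim = len(points[0])
    return tuple(sum(points[i][d] for i in idx) / len(idx) for d in range(dim))


def representative(points, idx):
    """Index of the member closest to the centroid (assumed representative)."""
    c = centroid(points, idx)
    return min(idx, key=lambda i: math.dist(points[i], c))


def split_in_two(points, idx, rng, iters=10):
    """Split a super-instance with a tiny 2-means; fall back to halving."""
    half = len(idx) // 2
    left, right = idx[:half], idx[half:]  # fallback for degenerate data
    a, b = rng.sample(idx, 2)
    ca, cb = points[a], points[b]
    for _ in range(iters):
        new_left = [i for i in idx
                    if math.dist(points[i], ca) <= math.dist(points[i], cb)]
        new_right = [i for i in idx
                     if math.dist(points[i], ca) > math.dist(points[i], cb)]
        if not new_left or not new_right:
            break
        left, right = new_left, new_right
        ca, cb = centroid(points, left), centroid(points, right)
    return left, right


def active_cluster(points, same_cluster, budget, seed=0):
    """Refine super-instances top-down until the query budget runs out.

    same_cluster(i, j) is the user/oracle answering one pairwise query.
    Returns (clusters as sorted index lists, number of queries used).
    """
    rng = random.Random(seed)
    # A cluster is a list of super-instances; a super-instance is a list of indices.
    clusters = [[list(range(len(points)))]]
    used = 0
    while True:
        candidates = [(len(s), ci, si)
                      for ci, c in enumerate(clusters)
                      for si, s in enumerate(c) if len(s) > 1]
        if not candidates:
            break  # every super-instance is a single point
        _, ci, si = max(candidates)  # assumption: refine the largest one
        snapshot = [[list(s) for s in c] for c in clusters]
        parent = clusters[ci].pop(si)
        if not clusters[ci]:
            clusters.pop(ci)
        out_of_budget = False
        for part in split_in_two(points, parent, rng):
            rep = representative(points, part)
            home = None
            for c in clusters:
                if used == budget:
                    out_of_budget = True
                    break
                used += 1  # one pairwise query per existing cluster at most
                if same_cluster(rep, representative(points, c[0])):
                    home = c
                    break
            if out_of_budget:
                break
            if home is None:
                clusters.append([part])
            else:
                home.append(part)
        if out_of_budget:
            clusters = snapshot  # keep the last fully resolved clustering
            break
    return [sorted(i for s in c for i in s) for c in clusters], used


# Toy usage: two well-separated groups, with ground truth playing the user.
points = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (5.0, 5.0), (5.2, 5.1), (5.1, 4.8)]
truth = [0, 0, 0, 1, 1, 1]
clusters, used = active_cluster(points, lambda i, j: truth[i] == truth[j], budget=100)
print(clusters, used)
```

Because the loop always holds a complete clustering, it can be stopped after any number of queries and resumed with a larger budget, which mirrors the answer-inspect-repeat workflow the abstract describes. A small budget yields a coarse result, and a larger one yields finer super-instances.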


