A Revenue Function for Comparison-Based Hierarchical Clustering

11/29/2022
by Aishik Mandal, et al.

Comparison-based learning addresses the problem of learning when, instead of explicit features or pairwise similarities, one only has access to comparisons of the form: Object A is more similar to B than to C. Recently, it has been shown that, in hierarchical clustering, single and complete linkage can be directly implemented using only such comparisons, while several algorithms have been proposed to emulate the behaviour of average linkage. Hence, finding hierarchies (or dendrograms) using only comparisons is a well-understood problem. However, evaluating their meaningfulness when neither ground truth nor explicit similarities are available remains an open question. In this paper, we bridge this gap by proposing a new revenue function that allows one to measure the goodness of dendrograms using only comparisons. We show that this function is closely related to Dasgupta's cost for hierarchical clustering, which uses pairwise similarities. On the theoretical side, we use the proposed revenue function to resolve the open problem of whether one can approximately recover a latent hierarchy using few triplet comparisons. On the practical side, we present principled algorithms for comparison-based hierarchical clustering based on the maximisation of the revenue, and we empirically compare them with existing methods.
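For readers unfamiliar with the similarity-based quantity the abstract refers to, the following is a minimal sketch of Dasgupta's cost, together with the triplet comparison form "A is more similar to B than to C". It does not reproduce the paper's revenue function, which the abstract does not define. The nested-tuple tree encoding, the function names (`leaves`, `dasgupta_cost`, `triplet_holds`), and the toy similarity `sim` are all assumptions made for illustration.

```python
from itertools import combinations


def leaves(tree):
    """Return the leaves under a nested-tuple tree; any non-tuple is a leaf."""
    if not isinstance(tree, tuple):
        return [tree]
    out = []
    for child in tree:
        out.extend(leaves(child))
    return out


def dasgupta_cost(tree, sim):
    """Dasgupta's cost: sum over pairs (i, j) of sim(i, j) times the number
    of leaves under the lowest common ancestor of i and j. Lower is better."""
    if not isinstance(tree, tuple):
        return 0.0
    child_leaves = [leaves(c) for c in tree]
    n_here = sum(len(ls) for ls in child_leaves)
    # Pairs that stay inside one child are accounted for recursively.
    cost = sum(dasgupta_cost(c, sim) for c in tree)
    # Pairs separated at this node have this node as lowest common ancestor.
    for a, b in combinations(child_leaves, 2):
        for i in a:
            for j in b:
                cost += sim(i, j) * n_here
    return cost


def triplet_holds(sim, a, b, c):
    """Comparison of the form: a is more similar to b than to c."""
    return sim(a, b) > sim(a, c)


# Toy data (hypothetical): items 0 and 1 form one group, 2 and 3 another.
def sim(i, j):
    return 1.0 if (i // 2) == (j // 2) else 0.1


good = ((0, 1), (2, 3))  # merges similar items first
bad = ((0, 2), (1, 3))   # merges dissimilar items first

print(dasgupta_cost(good, sim), dasgupta_cost(bad, sim))
print(triplet_holds(sim, 0, 1, 2))
```

On this toy example the hierarchy that merges similar items first receives the lower cost. Computing the cost requires the explicit similarities `sim`, which is exactly what the comparison-based setting lacks; there, only answers such as `triplet_holds` would be observable.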

research
11/02/2018

Foundations of Comparison-Based Hierarchical Clustering

We address the classical problem of hierarchical clustering, but in a fr...
research
10/08/2020

Near-Optimal Comparison Based Clustering

The goal of clustering is to group similar objects into meaningful parti...
research
02/18/2011

Active Clustering: Robust and Efficient Hierarchical Clustering using Adaptively Selected Similarities

Hierarchical clustering based on pairwise similarities is a common tool ...
research
07/19/2012

Hierarchical Clustering using Randomly Selected Similarities

The problem of hierarchical clustering items from pairwise similarities ...
research
02/23/2021

Maximizing Agreements for Ranking, Clustering and Hierarchical Clustering via MAX-CUT

In this paper, we study a number of well-known combinatorial optimizatio...
research
09/25/2019

A revenue allocation scheme based on pairwise comparisons

A model of sharing revenues among groups when group members are ranked s...
research
09/20/2019

Online Hierarchical Clustering Approximations

Hierarchical clustering is a widely used approach for clustering dataset...
