Exact Recovery of Clusters in Finite Metric Spaces Using Oracle Queries

01/31/2021
by Marco Bressan, et al.

We investigate the problem of exact cluster recovery using oracle queries. Previous results show that clusters in Euclidean spaces that are convex and separated with a margin can be reconstructed exactly using only O(log n) same-cluster queries, where n is the number of input points. In this work, we study this problem in the more challenging non-convex setting. We introduce a structural characterization of clusters, called (β,γ)-convexity, that can be applied to any finite set of points equipped with a metric (or even a semimetric, as the triangle inequality is not needed). Using (β,γ)-convexity, we can translate natural density properties of clusters (which include, for instance, clusters that are strongly non-convex in R^d) into a graph-theoretic notion of convexity. By exploiting this convexity notion, we design a deterministic algorithm that recovers (β,γ)-convex clusters using O(k^2 log n + k^2 (6/βγ)^dens(X)) same-cluster queries, where k is the number of clusters and dens(X) is the density dimension of the semimetric. We show that an exponential dependence on the density dimension is necessary. We also show that, with O(k^2 + k log n) additional queries to a "cluster separation" oracle, we can recover clusters that have different and arbitrary scales, even when the scale of each cluster is unknown.
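
To make the query model concrete, here is a minimal Python sketch of a same-cluster oracle together with a naive baseline that compares each point against one representative per cluster found so far. This is not the algorithm from the paper: it ignores (β,γ)-convexity entirely and can use up to n·k queries, which is the kind of cost the O(k^2 log n + k^2 (6/βγ)^dens(X)) bound improves on when n is large. The names `SameClusterOracle` and `naive_recover` are hypothetical and exist only for illustration.

```python
from typing import Hashable, List, Sequence


class SameClusterOracle:
    """Hypothetical oracle: answers 'are points i and j in the same cluster?'."""

    def __init__(self, labels: Sequence[Hashable]) -> None:
        self._labels = list(labels)  # hidden ground truth, never shown to the learner
        self.queries = 0             # number of same-cluster queries issued so far

    def same_cluster(self, i: int, j: int) -> bool:
        self.queries += 1
        return self._labels[i] == self._labels[j]


def naive_recover(n: int, oracle: SameClusterOracle) -> List[List[int]]:
    """Naive baseline (NOT the paper's method): at most n * k queries.

    Each point is compared with the first point of every cluster found so far;
    if no cluster matches, the point starts a new cluster.
    """
    clusters: List[List[int]] = []
    for i in range(n):
        for cluster in clusters:
            if oracle.same_cluster(i, cluster[0]):
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters


if __name__ == "__main__":
    oracle = SameClusterOracle([0, 1, 0, 2, 1, 0])
    print(naive_recover(6, oracle), "queries used:", oracle.queries)
```

The baseline recovers the clustering exactly without using any geometry, but its query count grows linearly in n. The result described above replaces that linear dependence with a logarithmic one by exploiting the structure of the (semi)metric.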


