Learning to Cluster via Same-Cluster Queries

by Yi Li, et al.

We study the problem of learning to cluster data points using an oracle that can answer same-cluster queries, that is, questions of whether two given points belong to the same cluster. Unlike previous approaches, we do not assume that the total number of clusters is known in advance, and we do not require the true clusters to be consistent with a predefined objective function such as K-means. These relaxations are critical from a practical perspective, but they also make the problem more challenging. We propose two algorithms with provable theoretical guarantees and verify their effectiveness through an extensive set of experiments on both synthetic and real-world data.
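To make the query model concrete, here is a minimal sketch of a naive baseline. It is not one of the two algorithms proposed in the paper, which the abstract does not describe. The names `cluster_with_queries` and `make_oracle` and the toy labels are hypothetical and introduced only for illustration. Each point is compared against one representative of every cluster found so far, so the number of clusters never has to be known in advance and no objective function is involved.

```python
from typing import Callable, Dict, Hashable, List, Sequence, Tuple


def cluster_with_queries(
    points: Sequence[Hashable],
    same_cluster: Callable[[Hashable, Hashable], bool],
) -> List[List[Hashable]]:
    """Naive baseline: compare each point to one representative per known cluster."""
    clusters: List[List[Hashable]] = []
    for p in points:
        for cluster in clusters:
            # cluster[0] acts as the representative of this cluster.
            if same_cluster(cluster[0], p):
                cluster.append(p)
                break
        else:
            # No representative matched, so p starts a new cluster.
            clusters.append([p])
    return clusters


def make_oracle(
    labels: Dict[Hashable, int],
) -> Tuple[Callable[[Hashable, Hashable], bool], Dict[str, int]]:
    """Build a hypothetical oracle from ground-truth labels and count its calls."""
    counter = {"queries": 0}

    def oracle(x: Hashable, y: Hashable) -> bool:
        counter["queries"] += 1
        return labels[x] == labels[y]

    return oracle, counter


# Toy ground truth (an assumption for this example only).
labels = {"a": 0, "b": 1, "c": 0, "d": 2, "e": 1}
oracle, counter = make_oracle(labels)
result = cluster_with_queries(list(labels), oracle)
print(result)              # [['a', 'c'], ['b', 'e'], ['d']]
print(counter["queries"])  # 6
```

With n points and k true clusters, this baseline issues at most n·k queries, since each point is tested against at most one representative per cluster. It serves only as a reference point for the query setting described above.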




