Optimal Clustering with Noisy Queries via Multi-Armed Bandit

07/12/2022
by Jinghui Xia, et al.

Motivated by many applications, we study clustering with a faulty oracle. In this problem, there are n items belonging to k unknown clusters, and the algorithm may ask the oracle whether two items belong to the same cluster. However, each answer from the oracle is correct only with probability 1/2+δ/2. The goal is to recover the hidden clusters with the minimum number of noisy queries. Previous works have shown that the problem can be solved with O(nk log n/δ^2 + poly(k,1/δ,log n)) queries, while Ω(nk/δ^2) queries are known to be necessary. So, for any values of k and δ, there is still a non-trivial gap between the upper and lower bounds. In this work, we obtain the first matching upper and lower bounds for a wide range of parameters. In particular, we propose a new polynomial-time algorithm that uses O(n(k+log n)/δ^2 + poly(k,1/δ,log n)) queries. Moreover, we prove a new lower bound of Ω(n log n/δ^2), which, combined with the existing Ω(nk/δ^2) bound, matches our upper bound up to an additive poly(k,1/δ,log n) term. To obtain the new results, our main ingredient is an interesting connection between our problem and the multi-armed bandit problem, which might provide useful insights for other similar problems.
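To make the query model concrete, the following is a minimal Python sketch of a simulated faulty oracle. It is not the algorithm from the paper. The names make_faulty_oracle and empirical_accuracy are hypothetical, and the sketch assumes that repeating a query returns the same answer (persistent noise), which the abstract does not state.

```python
import random
from itertools import combinations


def make_faulty_oracle(labels, delta, seed=0):
    """Return query(u, v) that answers "same cluster?" correctly w.p. 1/2 + delta/2.

    Assumption (not stated in the abstract): answers are cached, so asking
    the same pair again returns the same, possibly wrong, answer.
    """
    rng = random.Random(seed)
    cache = {}

    def query(u, v):
        key = (min(u, v), max(u, v))  # treat (u, v) and (v, u) as one query
        if key not in cache:
            truth = labels[u] == labels[v]
            is_correct = rng.random() < 0.5 + delta / 2
            cache[key] = truth if is_correct else not truth
        return cache[key]

    return query


def empirical_accuracy(labels, delta, seed=0):
    """Fraction of all item pairs on which the simulated oracle is correct."""
    query = make_faulty_oracle(labels, delta, seed)
    pairs = list(combinations(range(len(labels)), 2))
    hits = sum(query(u, v) == (labels[u] == labels[v]) for u, v in pairs)
    return hits / len(pairs)
```

Under these assumptions, empirical_accuracy(labels, delta) should be close to 1/2+δ/2 when the number of items is large, and equal to 1 when δ = 1.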


research
08/15/2023

Regret Lower Bounds in Multi-agent Multi-armed Bandit

Multi-armed Bandit motivates methods with provable upper bounds on regre...
research
01/05/2022

Bridging Adversarial and Nonstationary Multi-armed Bandit

In the multi-armed bandit framework, there are two formulations that are...
research
12/19/2017

Approximate Correlation Clustering Using Same-Cluster Queries

Ashtiani et al. (NIPS 2016) introduced a semi-supervised framework for c...
research
11/22/2022

Support Size Estimation: The Power of Conditioning

We consider the problem of estimating the support size of a distribution...
research
06/09/2022

Clustering with Queries under Semi-Random Noise

The seminal paper by Mazumdar and Saha <cit.> introduced an extensive li...
research
01/05/2023

Diagonalization Games

We study several variants of a combinatorial game which is based on Cant...
research
09/24/2019

The Query Complexity of Mastermind with ℓ_p Distances

Consider a variant of the Mastermind game in which queries are ℓ_p dista...
