Medoids in almost linear time via multi-armed bandits

11/02/2017
by Vivek Bagaria, et al.

Computing the medoid of a large number of points in high-dimensional space is an increasingly common operation in many data science problems. We present an algorithm, Med-dit, that uses O(n log n) distance evaluations to compute the medoid with high probability. Med-dit is based on a connection with the multi-armed bandit problem. We evaluate the performance of Med-dit empirically on the Netflix-prize and single-cell RNA-Seq datasets, which contain hundreds of thousands of points in tens of thousands of dimensions, and observe a 5-10x improvement in performance over the current state of the art. Med-dit is available at https://github.com/bagavi/Meddit
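To make the bandit connection concrete, here is a minimal sketch of the idea, not the code from the Med-dit repository. Each point is treated as an arm whose unknown mean is its average distance to all other points; sampling a random reference point gives a noisy estimate of that mean, and confidence intervals decide which arms deserve more samples. The function names `exact_medoid` and `bandit_medoid`, the l1 distance, and the parameters `sigma` (assumed bound on the spread of distances), `delta` and `batch` are all illustrative assumptions.

```python
import numpy as np


def exact_medoid(points):
    """Brute-force baseline: O(n^2) distance evaluations (l1 distance assumed)."""
    dists = np.abs(points[:, None, :] - points[None, :, :]).sum(axis=2)
    return int(np.argmin(dists.mean(axis=1)))


def bandit_medoid(points, sigma, delta=1e-3, batch=16, seed=0):
    """UCB-style medoid search; sigma, delta and batch are hypothetical knobs."""
    rng = np.random.default_rng(seed)
    n = len(points)
    if n == 1:
        return 0
    est = np.zeros(n)                 # running estimate of each mean distance
    pulls = np.zeros(n, dtype=int)    # distance evaluations spent per arm
    exact = np.zeros(n, dtype=bool)   # True once an arm is computed exactly

    def pull(i):
        if pulls[i] + batch >= n:
            # Sampling further would cost as much as the exact mean, so compute it.
            est[i] = np.abs(points - points[i]).sum(axis=1).mean()
            pulls[i] = n
            exact[i] = True
        else:
            # Estimate the mean distance from a batch of random reference points.
            refs = rng.integers(0, n, size=batch)
            d = np.abs(points[refs] - points[i]).sum(axis=1)
            est[i] = (est[i] * pulls[i] + d.sum()) / (pulls[i] + batch)
            pulls[i] += batch

    for i in range(n):
        pull(i)

    while True:
        # Confidence width shrinks with more pulls and is zero for exact arms.
        width = np.where(exact, 0.0, sigma * np.sqrt(2 * np.log(1 / delta) / pulls))
        lcb, ucb = est - width, est + width
        best = int(np.argmin(ucb))
        # Stop once one arm's upper bound is below every other arm's lower bound.
        if ucb[best] <= np.delete(lcb, best).min():
            return best
        # Otherwise spend more evaluations on the most promising arm.
        pull(int(np.argmin(lcb)))
```

Calling `bandit_medoid(points, sigma=8.0)` on a NumPy array of shape `(n, d)` returns the index of the candidate medoid. Points that are clearly far from the centre are ruled out after a few samples, so most of the distance evaluations go to the handful of plausible candidates; the choice of `sigma` trades evaluations against the chance of a wrong answer.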

research
06/11/2020

Bandit-PAM: Almost Linear Time k-Medoids Clustering via Multi-Armed Bandits

Clustering is a ubiquitous task in data science. Compared to the commonl...
research
10/11/2019

Nonparametric Bayesian multi-armed bandits for single cell experiment design

The problem of maximizing cell type discovery under budget constraints i...
research
06/11/2019

Ultra Fast Medoid Identification via Correlated Sequential Halving

The medoid of a set of n points is the point in the set that minimizes t...
research
11/08/2022

Adaptive Data Depth via Multi-Armed Bandits

Data depth, introduced by Tukey (1975), is an important tool in data sci...
research
08/17/2023

Equitable Restless Multi-Armed Bandits: A General Framework Inspired By Digital Health

Restless multi-armed bandits (RMABs) are a popular framework for algorit...
research
12/14/2022

MABSplit: Faster Forest Training Using Multi-Armed Bandits

Random forests are some of the most widely used machine learning models ...
research
03/07/2023

PyXAB – A Python Library for 𝒳-Armed Bandit and Online Blackbox Optimization Algorithms

We introduce a Python open-source library for 𝒳-armed bandit and online ...
