On Sequential Elimination Algorithms for Best-Arm Identification in Multi-Armed Bandits

09/08/2016
by Shahin Shahrampour, et al.

We consider the best-arm identification problem in multi-armed bandits, which focuses purely on exploration. A player is given a fixed budget to explore a finite set of arms, and the rewards of each arm are drawn independently from a fixed, unknown distribution. The player aims to identify the arm with the largest expected reward. We propose a general framework that unifies sequential elimination algorithms, in which arms are dismissed iteratively until a unique arm is left. Our analysis reveals a novel performance measure expressed in terms of the sampling mechanism and the number of arms eliminated at each round. Based on this result, we develop an algorithm that divides the budget according to a nonlinear function of the number of arms remaining at each round. We provide theoretical guarantees for the algorithm, characterizing the suitable nonlinearity for different problem environments, described by the number of competitive arms. Matching the theoretical results, our experiments show that the nonlinear algorithm outperforms the state of the art. We finally study the side-observation model, in which pulling an arm reveals the rewards of its related arms, and we establish improved theoretical guarantees in the pure-exploration setting.
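For concreteness, the sketch below shows one way a sequential elimination scheme with a nonlinear budget split can be implemented in Python. It is a minimal illustration, not the paper's exact specification: the function name, the normalizing constant, and the Bernoulli toy arms are assumptions made here. With exponent p = 1 the split reduces to a Successive-Rejects-style linear allocation, while larger p concentrates more of the budget on the final rounds, when only a few arms remain.

```python
import math
import random


def nonlinear_sequential_elimination(pull_arm, n_arms, budget, p=1.0):
    """Fixed-budget best-arm identification by sequential elimination.

    Runs n_arms - 1 rounds. In round r, every surviving arm is brought up
    to a per-arm pull target that grows as arms are eliminated, with the
    growth rate controlled by the exponent p; the arm with the lowest
    empirical mean is then discarded. Setting p = 1 mirrors the linear
    split of Successive Rejects. The normalizing constant below is an
    illustrative choice that keeps the total number of pulls within the
    budget; it is not necessarily the constant used in the paper.
    """
    # Normalizer so that the per-round targets (roughly) exhaust the budget.
    c_p = 2.0 ** (1 - p) + sum(1.0 / i ** p for i in range(3, n_arms + 1))
    active = list(range(n_arms))
    total = [0.0] * n_arms   # running reward sum per arm
    count = [0] * n_arms     # number of pulls per arm
    used = 0                 # total pulls so far

    for r in range(1, n_arms):                           # n_arms - 1 rounds
        target = math.ceil((budget - n_arms) / (c_p * (n_arms + 1 - r) ** p))
        for i in active:
            while count[i] < target and used < budget:   # never exceed budget
                total[i] += pull_arm(i)
                count[i] += 1
                used += 1
        # Drop the empirically worst surviving arm.
        active.remove(min(active, key=lambda i: total[i] / max(count[i], 1)))

    return active[0]


if __name__ == "__main__":
    # Toy run: five Bernoulli arms with means 0.30 ... 0.70 (arm 4 is best).
    means = [0.30, 0.40, 0.50, 0.60, 0.70]
    rng = random.Random(0)
    best = nonlinear_sequential_elimination(
        pull_arm=lambda i: 1.0 if rng.random() < means[i] else 0.0,
        n_arms=len(means),
        budget=2000,
        p=1.0,
    )
    print("identified arm:", best)
```

In this formulation, the per-arm targets depend on the round index only through the number of surviving arms, so the tuning of p corresponds directly to how aggressively the budget is shifted toward the later, harder comparisons.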

