SPRT-based Efficient Best Arm Identification in Stochastic Bandits

07/22/2022
by Arpan Mukherjee, et al.

This paper investigates the best arm identification (BAI) problem in stochastic multi-armed bandits in the fixed confidence setting. The general class of the exponential family of bandits is considered. The state-of-the-art algorithms for the exponential family of bandits face computational challenges. To mitigate these challenges, a novel framework is proposed, which views the BAI problem as sequential hypothesis testing, and is amenable to tractable analysis for the exponential family of bandits. Based on this framework, a BAI algorithm is designed that leverages the canonical sequential probability ratio tests. This algorithm has three features for both settings: (1) its sample complexity is asymptotically optimal, (2) it is guaranteed to be δ-PAC, and (3) it addresses the computational challenge of the state-of-the-art approaches. Specifically, these approaches, which are focused only on the Gaussian setting, require Thompson sampling from the arm that is deemed the best and a challenger arm. This paper analytically shows that identifying the challenger is computationally expensive and that the proposed algorithm circumvents it. Finally, numerical experiments are provided to support the analysis.
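For readers unfamiliar with the building block named in the abstract, the following is a minimal sketch of Wald's classical sequential probability ratio test between two simple hypotheses. It is not the paper's BAI algorithm. It assumes Gaussian observations with known variance, and the function name `gaussian_sprt`, its parameters, and the error levels `alpha` and `beta` are all hypothetical choices made for illustration.

```python
import math
from typing import Iterable, Optional, Tuple


def gaussian_sprt(
    samples: Iterable[float],
    mu0: float,
    mu1: float,
    sigma: float = 1.0,
    alpha: float = 0.05,
    beta: float = 0.05,
) -> Tuple[Optional[str], int]:
    """Classical Wald SPRT for H0: mean = mu0 versus H1: mean = mu1.

    Assumes i.i.d. Gaussian samples with known standard deviation sigma.
    Returns (decision, samples_used); decision is None if the data ran out
    before either threshold was crossed.
    """
    # Wald's approximate thresholds for target error levels alpha and beta.
    upper = math.log((1 - beta) / alpha)
    lower = math.log(beta / (1 - alpha))

    llr = 0.0  # cumulative log-likelihood ratio of H1 against H0
    n = 0
    for n, x in enumerate(samples, start=1):
        # Per-sample log-likelihood ratio for two Gaussians with equal variance.
        llr += (mu1 - mu0) * (x - (mu0 + mu1) / 2) / sigma**2
        if llr >= upper:
            return "H1", n
        if llr <= lower:
            return "H0", n
    return None, n


if __name__ == "__main__":
    # Illustrative input only: constant observations equal to the H1 mean.
    print(gaussian_sprt([1.0] * 10, mu0=0.0, mu1=1.0))
```

The test stops as soon as the accumulated evidence crosses either threshold, so the number of samples used adapts to how well separated the two hypotheses are. That stopping behaviour is the property that fixed-confidence procedures rely on.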


research
11/14/2021

Mean-based Best Arm Identification in Stochastic Bandits under Reward Contamination

This paper investigates the problem of best arm identification in contam...
research
05/24/2022

Optimality Conditions and Algorithms for Top-K Arm Identification

We consider the top-k arm identification problem for multi-armed bandits...
research
01/10/2023

Best Arm Identification in Stochastic Bandits: Beyond β-optimality

This paper focuses on best arm identification (BAI) in stochastic multi-...
research
05/17/2023

Sequential Best-Arm Identification with Application to Brain-Computer Interface

A brain-computer interface (BCI) is a technology that enables direct com...
research
11/19/2018

Decentralized Exploration in Multi-Armed Bandits

We consider the decentralized exploration problem: a set of players coll...
research
02/15/2023

Best Arm Identification for Stochastic Rising Bandits

Stochastic Rising Bandits is a setting in which the values of the expect...
research
03/18/2021

Top-m identification for linear bandits

Motivated by an application to drug repurposing, we propose the first al...
