Choosing Answers in ε-Best-Answer Identification for Linear Bandits

06/09/2022
by Marc Jourdan, et al.

In pure-exploration problems, information is gathered sequentially to answer a question about the stochastic environment. While best-arm identification for linear bandits has been extensively studied in recent years, few works have been dedicated to identifying one arm that is ε-close to the best one (and not exactly the best one). In this problem with several correct answers, an identification algorithm should focus on one candidate among those answers and verify that it is correct. We demonstrate that picking the answer with the highest mean does not allow an algorithm to reach asymptotic optimality in terms of expected sample complexity. Instead, a furthest answer should be identified. Using that insight to choose the candidate answer carefully, we develop a simple procedure to adapt best-arm identification algorithms to tackle ε-best-answer identification in transductive linear stochastic bandits. Finally, we propose an asymptotically optimal algorithm for this setting, which is shown to achieve competitive empirical performance compared with existing modified best-arm identification algorithms.
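The notion of a "furthest answer" can be illustrated with a small numerical sketch. The snippet below is not the paper's algorithm: it assumes unit-variance Gaussian noise, treats the arms themselves as the candidate answers, uses the standard transportation-cost expression (⟨θ, a−b⟩ + ε)² / (2‖a−b‖²_{V_w⁻¹}) as the distance to the alternative set, and crudely searches over random allocations instead of solving the underlying max-min program. All function names (furthest_answer, distance_to_alternative) are illustrative, not from the paper.

```python
import numpy as np

# Hedged sketch (not the paper's method): approximate the "furthest" eps-good
# answer in a linear bandit with unit-variance Gaussian noise.
# Assumptions made here for illustration:
#  - answers coincide with arms;
#  - answer a is eps-good if <theta, a* - a> <= eps;
#  - the distance from theta to the alternative set of answer a under
#    allocation w is min_{b != a} (<theta, a-b> + eps)^2 / (2 ||a-b||^2_{V_w^-1});
#  - the allocation is only optimized over random draws from the simplex,
#    a crude stand-in for solving the max-min program exactly.

def design_matrix(arms, w, reg=1e-6):
    """V_w = sum_x w_x x x^T, with a tiny ridge term for invertibility."""
    d = arms.shape[1]
    V = reg * np.eye(d)
    for x, wx in zip(arms, w):
        V += wx * np.outer(x, x)
    return V

def distance_to_alternative(theta, arms, a_idx, w, eps):
    """min over b != a of the cost to move theta into 'b beats a by more than eps'."""
    V_inv = np.linalg.inv(design_matrix(arms, w))
    a = arms[a_idx]
    costs = []
    for b_idx, b in enumerate(arms):
        if b_idx == a_idx:
            continue
        gap = theta @ (a - b) + eps          # margin by which a stays eps-good vs b
        norm2 = (a - b) @ V_inv @ (a - b)    # ||a - b||^2 in the V_w^{-1} norm
        costs.append(max(gap, 0.0) ** 2 / (2 * norm2))
    return min(costs)

def furthest_answer(theta, arms, eps, n_alloc=2000, seed=0):
    """Return the eps-good answer whose alternative set is hardest to reach,
    maximizing the distance over randomly sampled allocations."""
    rng = np.random.default_rng(seed)
    means = arms @ theta
    good = np.flatnonzero(means >= means.max() - eps)   # eps-good candidates
    allocations = rng.dirichlet(np.ones(len(arms)), size=n_alloc)
    best_val, best_ans = -np.inf, None
    for a_idx in good:
        val = max(distance_to_alternative(theta, arms, a_idx, w, eps)
                  for w in allocations)
        if val > best_val:
            best_val, best_ans = val, a_idx
    return best_ans, best_val

if __name__ == "__main__":
    # Toy 2-d instance with three arms, two of which are eps-good.
    arms = np.array([[1.0, 0.0], [0.9, 0.4], [0.0, 1.0]])
    theta = np.array([1.0, 0.1])
    ans, val = furthest_answer(theta, arms, eps=0.1)
    print("furthest eps-good answer:", ans, "distance:", val)
```

Under these simplifications, the candidate returned is the ε-good answer whose alternative set is furthest from the current parameter estimate, which is the quantity the abstract argues should guide the choice of answer rather than the answer with the highest mean.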

Related research

07/02/2020
Gamification of Pure Exploration for Linear Bandits
We investigate an active pure-exploration setting, that includes best-ar...

05/22/2022
On Elimination Strategies for Bandit Fixed-Confidence Identification
Elimination algorithms for bandit identification, which prune the plausi...

02/15/2023
Best Arm Identification for Stochastic Rising Bandits
Stochastic Rising Bandits is a setting in which the values of the expect...

02/09/2019
Pure Exploration with Multiple Correct Answers
We determine the sample complexity of pure exploration bandit problems w...

05/29/2017
Improving the Expected Improvement Algorithm
The expected improvement (EI) algorithm is a popular strategy for inform...

11/04/2020
Answer Identification in Collaborative Organizational Group Chat
We present a simple unsupervised approach for answer identification in o...

02/09/2023
Multi-task Representation Learning for Pure Exploration in Linear Bandits
Despite the recent success of representation learning in sequential deci...
