A Survey of Online Experiment Design with the Stochastic Multi-Armed Bandit

10/02/2015
by Giuseppe Burtini, et al.

Adaptive and sequential experiment design is a well-studied area in numerous domains. We survey and synthesize the work on the online statistical learning paradigm known as multi-armed bandits, integrating the existing research as a resource for a certain class of online experiments. We first explore the traditional stochastic model of a multi-armed bandit, then a taxonomic scheme of complications to that model, relating each complication to a specific requirement or consideration of the experiment design context. Finally, at the end of the paper, we present a table of known upper bounds on regret for all studied algorithms, providing both perspectives for future theoretical work and a decision-making tool for practitioners looking for theoretical guarantees.

research
07/23/2013

Modeling Human Decision-making in Generalized Gaussian Multi-armed Bandits

We present a formal model of human decision-making in explore-exploit ta...
research
08/28/2021

Self-fulfilling Bandits: Endogeneity Spillover and Dynamic Selection in Algorithmic Decision-making

In this paper, we study endogeneity problems in algorithmic decision-mak...
research
05/23/2022

Falsification of Multiple Requirements for Cyber-Physical Systems Using Online Generative Adversarial Networks and Multi-Armed Bandits

We consider the problem of falsifying safety requirements of Cyber-Physi...
research
06/11/2023

Multi-Source Test-Time Adaptation as Dueling Bandits for Extractive Question Answering

In this work, we study multi-source test-time model adaptation from user...
research
07/30/2018

Preference-based Online Learning with Dueling Bandits: A Survey

In machine learning, the notion of multi-armed bandits refers to a class...
research
10/16/2021

Statistical Consequences of Dueling Bandits

Multi-Armed-Bandit frameworks have often been used by researchers to ass...
research
11/20/2018

Playing with and against Hedge

Hedge has been proposed as an adaptive scheme, which guides an agent's d...
