The Finite-Horizon Two-Armed Bandit Problem with Binary Responses: A Multidisciplinary Survey of the History, State of the Art, and Myths

06/20/2019
by Peter Jacko, et al.

In this paper we consider the two-armed bandit problem, which often appears naturally in its own right or as a subproblem of multi-armed generalizations, and which serves as a starting point for introducing additional problem features. We consider binary responses because this setting is widely applicable and is one of the most studied. We focus on the undiscounted finite-horizon objective, which is the most relevant in many applications. Because terminology differs across the disciplines that have considered this problem, we attempt to unify it, and we present a unified model cast in the Markov decision process framework, with subject responses modelled using the Bernoulli distribution and the corresponding Beta distribution used for Bayesian updating. We give an extensive account of the history and state of the art of approaches from several disciplines, including design of experiments, Bayesian decision theory, naive designs, reinforcement learning, biostatistics, and combination designs. We evaluate these designs, together with a few newly proposed ones, by accurate computation (using a package newly written by the author in the Julia programming language) in order to compare their performance. We show that the conclusions for moderate horizons (typical in practice) differ from those for small horizons (typical in academic literature reporting computational results). We further list and clarify a number of myths about this problem; for example, we show that, computationally, much larger problems can be designed to Bayes-optimality than is commonly believed.
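
To make the model in the abstract concrete, the following is a minimal Python sketch of backward induction for the undiscounted finite-horizon two-armed bandit with Bernoulli responses and Beta updating. It is not the author's Julia package. The function name `bayes_optimal_value`, the state encoding (successes and failures per arm), and the default uniform Beta(1, 1) prior on both arms are assumptions made for illustration only.

```python
from functools import lru_cache


def bayes_optimal_value(horizon, prior=(1, 1)):
    """Expected number of successes under a Bayes-optimal allocation.

    Assumptions (illustrative, not taken from the paper): both arms share
    the same Beta(a0, b0) prior, and the objective is the undiscounted
    expected total number of successes over `horizon` subjects.
    """
    a0, b0 = prior

    @lru_cache(maxsize=None)
    def value(s1, f1, s2, f2):
        # State: observed successes/failures on arm 1 and on arm 2.
        if s1 + f1 + s2 + f2 == horizon:
            return 0.0  # no subjects left to allocate

        # Posterior mean success probability of each arm (Beta-Bernoulli).
        p1 = (a0 + s1) / (a0 + b0 + s1 + f1)
        p2 = (a0 + s2) / (a0 + b0 + s2 + f2)

        # Expected total reward-to-go when allocating the next subject
        # to arm 1 or to arm 2, then continuing optimally.
        q1 = (p1 * (1.0 + value(s1 + 1, f1, s2, f2))
              + (1.0 - p1) * value(s1, f1 + 1, s2, f2))
        q2 = (p2 * (1.0 + value(s1, f1, s2 + 1, f2))
              + (1.0 - p2) * value(s1, f1, s2, f2 + 1))
        return max(q1, q2)

    return value(0, 0, 0, 0)
```

Calling `bayes_optimal_value(horizon)` returns the Bayes-optimal expected number of successes for the given horizon. The memoised recursion visits a number of states that grows with the fourth power of the horizon and recurses to a depth equal to the horizon, so this sketch is only suitable for small horizons; it says nothing about the problem sizes reachable with the author's implementation.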

research
06/02/2022

A Confirmation of a Conjecture on the Feldman's Two-armed Bandit Problem

Myopic strategy is one of the most important strategies when studying ba...
research
12/14/2020

Bayesian Optimization – Multi-Armed Bandit Problem

In this report, we survey Bayesian Optimization methods focussed on the ...
research
07/13/2019

A new approach to Poissonian two-armed bandit problem

We consider a continuous time two-armed bandit problem in which incomes ...
research
08/15/2019

Exponential two-armed bandit problem

We consider exponential two-armed bandit problem in which incomes are de...
research
03/27/2013

Exploiting correlation and budget constraints in Bayesian multi-armed bandit optimization

We address the problem of finding the maximizer of a nonlinear smooth fu...
research
05/12/2023

Designing Optimal Behavioral Experiments Using Machine Learning

Computational models are powerful tools for understanding human cognitio...
