On conditional versus marginal bias in multi-armed bandits

02/19/2020
by Jaehyeok Shin, et al.

The bias of the sample means of the arms in multi-armed bandits is an important issue in adaptive data analysis that has recently received considerable attention in the literature. Existing results precisely relate the sign and magnitude of the bias to various sources of data adaptivity, but they do not apply to the conditional inference setting, in which the sample means are computed only if some specific conditions are satisfied. In this paper, we characterize the sign of the conditional bias of monotone functions of the rewards, including the sample mean. Our results hold for arbitrary conditioning events and leverage natural monotonicity properties of the data collection policy. We further demonstrate, through several examples from sequential testing and best arm identification, that the signs of the conditional and unconditional bias of the sample mean of an arm can differ, depending on the conditioning event. Our analysis offers new perspectives on the subtleties of assessing the bias in data-adaptive settings.
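
The distinction between marginal and conditional bias can be explored numerically. The following is a minimal Monte Carlo sketch, not the paper's own experiment. Every modelling choice in it is an assumption made for illustration: two arms with standard normal rewards (true mean 0), a greedy sampling policy after one initial pull per arm, a fixed horizon, and the conditioning event "arm 0 ends with the larger sample mean". The function names `run_greedy` and `estimate_biases` are hypothetical.

```python
import random
from statistics import fmean


def run_greedy(horizon, rng):
    """Run one two-armed greedy bandit and return both final sample means.

    Assumption for illustration: both arms yield N(0, 1) rewards.
    """
    sums = [0.0, 0.0]
    counts = [0, 0]
    for t in range(horizon):
        if t < 2:
            arm = t  # pull each arm once to initialise its sample mean
        else:
            means = [sums[k] / counts[k] for k in range(2)]
            arm = 0 if means[0] >= means[1] else 1  # greedy choice
        sums[arm] += rng.gauss(0.0, 1.0)
        counts[arm] += 1
    return [sums[k] / counts[k] for k in range(2)]


def estimate_biases(reps=20000, horizon=20, seed=0):
    """Estimate the marginal and conditional bias of arm 0's sample mean.

    The true mean is 0, so each bias is just an average of sample means.
    The conditioning event (arm 0 has the larger final sample mean) is an
    illustrative choice, not one taken from the paper.
    """
    rng = random.Random(seed)
    all_means = []
    chosen_means = []
    for _ in range(reps):
        m0, m1 = run_greedy(horizon, rng)
        all_means.append(m0)
        if m0 > m1:
            chosen_means.append(m0)
    return {
        "marginal": fmean(all_means),
        "conditional": fmean(chosen_means),
        "event_rate": len(chosen_means) / reps,
    }


if __name__ == "__main__":
    print(estimate_biases())
```

Comparing the `marginal` and `conditional` entries shows how much the estimated bias of the same sample mean shifts once attention is restricted to runs in which the conditioning event occurred. Other conditioning events or sampling policies can be substituted by changing the `if m0 > m1` test or the greedy rule.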
