Asymptotic expansion for batched bandits

04/09/2023
by   Yechan Park, et al.
In bandit algorithms, the adaptive experimental design varies randomly over time, which makes it difficult to apply traditional limit theorems to off-policy evaluation of the treatment effect. Moreover, the normal approximation given by the central limit theorem becomes unsatisfactory because the small sample size of the inferior arm provides too little information. To resolve this issue, we introduce a backward asymptotic expansion method and prove the validity of this scheme based on partial mixing, which was originally introduced for the expansion of the distribution of a functional of a jump-diffusion process in a random environment. In this paper, the theory is generalized to incorporate the backward propagation of random functions in the bandit algorithm. In addition to the analytical validation, simulation studies support the new method. Our formulation is general and applies to nonlinearly parametrized differentiable statistical models with an adaptive design.
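
To make the setting concrete, the following is a minimal sketch of a two-arm, two-batch bandit with Gaussian rewards. It is not the paper's algorithm or its expansion method. The function name `run_batched_bandit`, the epsilon-greedy allocation rule, the batch size, the arm means, and the value of `eps` are all assumptions chosen for illustration. The sketch only shows how the second-batch allocation depends on first-batch data and how few observations the trailing arm can end up with.

```python
import random
import statistics


def run_batched_bandit(rng, n_per_batch=50, mu=(0.0, 0.5), eps=0.1):
    """Simulate one run of a hypothetical two-arm, two-batch bandit.

    Returns the difference in pooled sample means (arm 1 minus arm 0)
    and the number of pulls of each arm.
    """
    rewards = ([], [])

    # Batch 1: alternate arms, so each arm is pulled n_per_batch // 2 times.
    for t in range(n_per_batch):
        arm = t % 2
        rewards[arm].append(rng.gauss(mu[arm], 1.0))

    # Batch 2: the allocation depends on batch-1 data (adaptive design).
    # The arm leading after batch 1 is pulled with probability 1 - eps.
    leader = 1 if statistics.fmean(rewards[1]) > statistics.fmean(rewards[0]) else 0
    for _ in range(n_per_batch):
        arm = leader if rng.random() < 1.0 - eps else 1 - leader
        rewards[arm].append(rng.gauss(mu[arm], 1.0))

    effect = statistics.fmean(rewards[1]) - statistics.fmean(rewards[0])
    return effect, (len(rewards[0]), len(rewards[1]))


# Repeat the experiment to look at the sampling distribution of the estimate
# and at how many observations the less-sampled arm receives.
rng = random.Random(0)
runs = [run_batched_bandit(rng) for _ in range(2000)]
effects = [effect for effect, _ in runs]
min_counts = [min(counts) for _, counts in runs]

print("mean estimated effect:", statistics.fmean(effects))
print("mean pulls of the less-sampled arm:", statistics.fmean(min_counts))
```

Under these assumed settings the less-sampled arm receives far fewer than half of the 100 pulls in a typical run, which is the kind of imbalance the abstract points to when it says the normal approximation becomes unsatisfactory.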

research
02/08/2020

Inference for Batched Bandits

As bandit algorithms are increasingly utilized in scientific studies, th...
research
12/31/2020

Asymptotic expansion of a variation with anticipative weights

Asymptotic expansion of a variation with anticipative weights is derived...
research
02/13/2020

Adaptive Experimental Design for Efficient Treatment Effect Estimation: Randomized Allocation via Contextual Bandit Algorithm

Many scientific experiments have an interest in the estimation of the av...
research
12/22/2022

Small time approximation in Wright-Fisher diffusion

Wright-Fisher model has been widely used to represent random variation i...
research
09/18/2018

Gram Charlier and Edgeworth expansion for sample variance

In this paper, we derive a valid Edgeworth expansions for the Bessel cor...
research
09/16/2021

Policy Choice and Best Arm Identification: Comments on "Adaptive Treatment Assignment in Experiments for Policy Choice"

Adaptive experimental design for efficient decision-making is an importa...
