An Evaluation on Practical Batch Bayesian Sampling Algorithms for Online Adaptive Traffic Experimentation

05/24/2023
by Zezhong Zhang, et al.

To speed up online testing, adaptive traffic experimentation through multi-armed bandit algorithms is rising as an essential complement to fixed-horizon A/B testing. Building on recent research on best-arm identification and statistical inference with adaptively collected data, this paper derives and evaluates four Bayesian batch bandit algorithms (NB-TS, WB-TS, NB-TTTS, WB-TTTS) that adaptively determine traffic allocation. They combine two ways of weighting batches (Naive Batch and Weighted Batch) with two Bayesian sampling strategies (Thompson Sampling and Top-Two Thompson Sampling). All four algorithms operate on summary batch statistics of a reward metric in pilot experiments, and one combination, WB-TTTS, appears to be newly discussed in this paper. The evaluation covers the trustworthiness, sensitivity, and regret of each testing methodology, and spans 4 real-world eBay experiments and 40 reproducible synthetic experiments under both stationary and non-stationary conditions. It reveals that: (a) false positives are inflated when best arms are equivalent, an issue seldom discussed in the literature; (b) to control false positives, a connection between the convergence of posterior optimal probabilities and neutral posterior reshaping is identified; (c) WB-TTTS shows competitive recall, higher precision, and robustness against non-stationary trends; (d) NB-TS excels at minimizing regret, but falls short on precision and robustness; and (e) WB-TTTS is a promising alternative when the regret of A/B testing is affordable, while NB-TS remains a powerful choice when regret matters for pilot experiments.
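To illustrate how such batch-based sampling can work, below is a minimal sketch, not the authors' implementation. It assumes a binary reward metric with Beta(1, 1) priors, takes per-batch summary statistics as (successes, trials) pairs, pools raw counts across batches in the Naive Batch style, and uses the conventional TTTS re-sampling parameter psi = 0.5. The Weighted Batch weighting scheme and the paper's neutral posterior reshaping are omitted here.

import numpy as np

rng = np.random.default_rng(0)

def nb_posteriors(batch_stats):
    """Naive Batch pooling (a sketch of the NB- variants): sum raw
    per-batch counts across batches for each arm.
    batch_stats: list of (successes, trials) arrays, one pair per batch."""
    successes = np.sum([s for s, _ in batch_stats], axis=0)
    trials = np.sum([t for _, t in batch_stats], axis=0)
    return 1.0 + successes, 1.0 + (trials - successes)  # Beta(a, b) per arm

def ts_allocation(alpha, beta, n_draws=10_000):
    """Thompson Sampling: estimate each arm's posterior probability of
    being best by Monte Carlo, and allocate traffic proportionally."""
    samples = rng.beta(alpha, beta, size=(n_draws, len(alpha)))
    wins = np.bincount(samples.argmax(axis=1), minlength=len(alpha))
    return wins / n_draws

def ttts_allocation(alpha, beta, n_draws=10_000, psi=0.5):
    """Top-Two Thompson Sampling: with probability psi keep the TS draw's
    best arm; otherwise re-sample until a different (challenger) arm wins.
    psi = 0.5 is the conventional choice, assumed here."""
    counts = np.zeros(len(alpha))
    for _ in range(n_draws):
        leader = rng.beta(alpha, beta).argmax()
        if rng.random() < psi:
            counts[leader] += 1
        else:
            challenger = leader
            while challenger == leader:
                challenger = rng.beta(alpha, beta).argmax()
            counts[challenger] += 1
    return counts / n_draws

# Hypothetical example: two batches of summary statistics for 3 arms.
batches = [(np.array([12, 15, 9]), np.array([100, 100, 100])),
           (np.array([14, 22, 11]), np.array([120, 120, 120]))]
a, b = nb_posteriors(batches)
print("TS allocation:  ", ts_allocation(a, b))
print("TTTS allocation:", ttts_allocation(a, b))

In use, the returned probabilities would determine the traffic split for the next batch; TTTS spreads more traffic onto challenger arms than plain TS, which is consistent with the precision and robustness trade-offs the abstract reports.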

Related research

02/22/2019
Multi-Armed Bandit Strategies for Non-Stationary Reward Distributions and Delayed Feedback Processes
A survey is performed of various Multi-Armed Bandit (MAB) strategies in ...

07/31/2017
Taming Non-stationary Bandits: A Bayesian Approach
We consider the multi armed bandit problem in non-stationary environment...

05/20/2022
Actively Tracking the Optimal Arm in Non-Stationary Environments with Mandatory Probing
We study a novel multi-armed bandit (MAB) setting which mandates the age...

07/30/2021
Adaptively Optimize Content Recommendation Using Multi Armed Bandit Algorithms in E-commerce
E-commerce sites strive to provide users the most timely relevant inform...

10/30/2021
Efficient Inference Without Trading-off Regret in Bandits: An Allocation Probability Test for Thompson Sampling
Using bandit algorithms to conduct adaptive randomised experiments can m...

03/21/2023
Adaptive Experimentation at Scale: Bayesian Algorithms for Flexible Batches
Standard bandit algorithms that assume continual reallocation of measure...
