A framework for Multi-A(rmed)/B(andit) testing with online FDR control

06/16/2017
by Fanny Yang, et al.

We propose an alternative framework to existing setups for controlling false alarms when multiple A/B tests are run over time. This setup arises in many practical applications, e.g., when pharmaceutical companies test new treatment options against control pills for different diseases, or when internet companies test their default webpages against various alternatives over time. Our framework replaces a sequence of A/B tests with a sequence of best-arm MAB instances, each of which can be continuously monitored by the data scientist. By interleaving the MAB tests with an online false discovery rate (FDR) algorithm, we obtain the best of both worlds: low sample complexity and anytime online FDR control. Our main contributions are: (i) to propose reasonable definitions of a null hypothesis for MAB instances; (ii) to demonstrate how one can derive an always-valid sequential p-value that allows continuous monitoring of each MAB test; and (iii) to show that using the rejection thresholds of online FDR algorithms as the confidence levels for the MAB algorithms results in sample optimality, high power, and low FDR at any point in time. We run extensive simulations to verify our claims, and also report results on real data collected from the New Yorker Cartoon Caption contest.
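To make contribution (iii) concrete, the following is a minimal sketch of the interleaving idea, not the paper's actual procedure. It assumes a LOND-style rule for the online FDR levels, Bernoulli rewards, uniform round-robin sampling in place of an adaptive best-arm algorithm, and a Hoeffding confidence radius made valid at all times by a union bound. All function names and parameters (`lond_level`, `radius`, `run_experiment`, `run_sequence`, `max_rounds`) are hypothetical.

```python
import math
import random


def lond_level(j, num_rejections, alpha=0.1):
    """LOND-style test level for the j-th experiment (j starts at 1).

    Assumption: gamma_j = 6 / (pi^2 j^2), which sums to 1 over j >= 1.
    """
    gamma_j = 6.0 / (math.pi ** 2 * j ** 2)
    return alpha * gamma_j * (num_rejections + 1)


def radius(n, delta):
    """Hoeffding radius for n samples in [0, 1], valid for all n at once.

    Each n is given failure budget delta / (n (n + 1)), which sums to delta.
    """
    return math.sqrt(math.log(2.0 * n * (n + 1) / delta) / (2.0 * n))


def run_experiment(means, delta, rng, max_rounds=5000):
    """Test the null 'the control (arm 0) is at least as good as every arm'.

    Returns the index of an arm whose lower bound exceeds the control's
    upper bound, or None if no rejection happens within max_rounds.
    Sampling is plain round-robin, a stand-in for an adaptive MAB algorithm.
    """
    k = len(means)
    sums = [0.0] * k
    for n in range(1, max_rounds + 1):
        for i in range(k):
            sums[i] += 1.0 if rng.random() < means[i] else 0.0
        r = radius(n, delta / k)  # split the error budget across arms
        control_ucb = sums[0] / n + r
        for i in range(1, k):
            if sums[i] / n - r > control_ucb:
                return i
    return None


def run_sequence(experiments, alpha=0.1, seed=0):
    """Run experiments in order, feeding each online FDR level to the test."""
    rng = random.Random(seed)
    rejections = 0
    results = []
    for j, means in enumerate(experiments, start=1):
        level = lond_level(j, rejections, alpha)
        winner = run_experiment(means, level, rng)
        if winner is not None:
            rejections += 1  # a discovery raises later levels
        results.append(winner)
    return results
```

In this sketch the confidence level of each experiment is exactly the level handed out by the online FDR rule, so each experiment can be monitored after every round, and every rejection increases the levels available to later experiments.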


