A unified framework for bandit multiple testing

07/15/2021
by Ziyu Xu, et al.

In bandit multiple hypothesis testing, each arm corresponds to a different null hypothesis that we wish to test, and the goal is to design adaptive algorithms that correctly identify a large set of interesting arms (true discoveries) while mistakenly identifying only a few uninteresting ones (false discoveries). One common metric in non-bandit multiple testing is the false discovery rate (FDR). We propose a unified, modular framework for bandit FDR control that emphasizes the decoupling of exploration and summarization of evidence. We utilize the powerful martingale-based concept of “e-processes” to ensure FDR control for arbitrary composite nulls, exploration rules, and stopping times in generic problem settings. In particular, valid FDR control holds even if the reward distributions of the arms are dependent, multiple arms are queried simultaneously, or multiple (cooperating or competing) agents are querying arms, which also covers combinatorial semi-bandit type settings. Prior work has considered in great detail the setting where each arm's reward distribution is independent and sub-Gaussian, and a single arm is queried at each step. Our framework recovers matching sample complexity guarantees in this special case, and performs comparably or better in practice. For other settings, sample complexities will depend on the finer details of the problem (composite nulls being tested, exploration algorithm, data dependence structure, stopping rule), and we do not explore these; our contribution is to show that the FDR guarantee is clean and entirely agnostic to these details.
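To make the "summarization of evidence" step concrete, here is a minimal sketch of how per-arm e-values can be turned into a rejection set with FDR control. The abstract does not name the procedure, so this assumes the standard e-BH rule (sort the K e-values in decreasing order and reject the k* largest, where k* is the largest k whose k-th largest e-value is at least K / (k · alpha)). The function name `e_bh` and the example numbers are hypothetical, and how each arm's e-value is computed from its rewards is left out entirely.

```python
from typing import List


def e_bh(e_values: List[float], alpha: float) -> List[int]:
    """Return the indices of the arms rejected by the e-BH rule at level alpha.

    Assumption: e_values[i] is a valid e-value for arm i's null hypothesis,
    for example the current value of that arm's e-process.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")

    num_arms = len(e_values)
    # Arm indices ordered by e-value, largest first.
    order = sorted(range(num_arms), key=lambda i: e_values[i], reverse=True)

    # k_star is the largest rank k whose k-th largest e-value clears K / (k * alpha).
    k_star = 0
    for rank, arm in enumerate(order, start=1):
        if e_values[arm] >= num_arms / (alpha * rank):
            k_star = rank

    # Reject the k_star arms with the largest e-values.
    return sorted(order[:k_star])


if __name__ == "__main__":
    # Hypothetical e-values for four arms.
    print(e_bh([50.0, 1.0, 30.0, 0.5], alpha=0.1))
```

Because the rule only looks at the current e-values, it can be applied at any stopping time and does not depend on how the arms were sampled, which is the decoupling the abstract describes.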

research
02/27/2019

Polynomial-time Algorithms for Combinatorial Pure Exploration with Full-bandit Feedback

We study the problem of stochastic combinatorial pure exploration (CPE),...
research
08/20/2023

Thompson Sampling for Real-Valued Combinatorial Pure Exploration of Multi-Armed Bandit

We study the real-valued combinatorial pure exploration of the multi-arm...
research
02/09/2022

Optimal Clustering with Bandit Feedback

This paper considers the problem of online clustering with bandit feedba...
research
11/01/2022

Beyond the Best: Estimating Distribution Functionals in Infinite-Armed Bandits

In the infinite-armed bandit problem, each arm's average reward is sampl...
research
05/04/2018

Combinatorial Pure Exploration with Continuous and Separable Reward Functions and Its Applications (Extended Version)

We study the Combinatorial Pure Exploration problem with Continuous and ...
research
01/18/2021

Sequential causal inference in a single world of connected units

We consider adaptive designs for a trial involving N individuals that we...
research
10/29/2021

A/B/n Testing with Control in the Presence of Subpopulations

Motivated by A/B/n testing applications, we consider a finite set of dis...
