Sequential Test for the Lowest Mean: From Thompson to Murphy Sampling

06/04/2018
by   Emilie Kaufmann, et al.

Learning the minimum or maximum mean among a finite set of distributions is a fundamental sub-task in planning, game tree search and reinforcement learning. We formalize this learning task as the problem of sequentially testing how the minimum mean among a finite set of distributions compares to a given threshold. We develop refined non-asymptotic lower bounds, which show that optimality mandates very different sampling behavior for a low versus a high true minimum. We show that Thompson Sampling and the intuitive Lower Confidence Bounds policy are each optimal in only one of these cases. We develop a novel approach that we call Murphy Sampling (MS). Even though it entertains exclusively low true minima, we prove that MS is optimal for both possibilities. We then design advanced self-normalized deviation inequalities, which fuel more aggressive stopping rules. We complement our theoretical guarantees with experiments showing that MS works best in practice.
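The abstract does not spell out the three sampling rules, so the following is only a rough sketch under stated assumptions: arms are Gaussian with unit variance, the prior is flat (so the posterior of arm a is N(mean_a, 1/count_a)), and Murphy Sampling is read as Thompson Sampling with the posterior conditioned on the minimum mean lying below the threshold `gamma`. The function names, the rejection-sampling step, the fallback rule and the confidence width in `lcb_arm` are illustrative choices, not taken from the paper.

```python
import numpy as np

def thompson_sample(means, counts, rng):
    # Assumed model: unit-variance Gaussian arms with a flat prior,
    # so each arm's posterior is N(empirical mean, 1 / number of pulls).
    return rng.normal(means, 1.0 / np.sqrt(counts))

def thompson_arm(means, counts, rng):
    # Thompson Sampling: pull the arm whose sampled mean is lowest.
    return int(np.argmin(thompson_sample(means, counts, rng)))

def lcb_arm(means, counts):
    # Lower Confidence Bound policy: pull the arm with the lowest lower bound.
    # The width sqrt(2 log t / n) is a textbook choice (assumption).
    t = counts.sum()
    return int(np.argmin(means - np.sqrt(2.0 * np.log(t) / counts)))

def murphy_sample(means, counts, gamma, rng, max_tries=10_000):
    # Posterior sample conditioned on min mean < gamma, drawn here by
    # simple rejection sampling (hypothetical implementation choice).
    for _ in range(max_tries):
        theta = thompson_sample(means, counts, rng)
        if theta.min() < gamma:
            return theta
    return None  # the conditioning event was too unlikely to hit

def murphy_arm(means, counts, gamma, rng):
    theta = murphy_sample(means, counts, gamma, rng)
    if theta is None:
        # Pragmatic fallback (assumption): use the empirically lowest arm.
        return int(np.argmin(means))
    return int(np.argmin(theta))

rng = np.random.default_rng(0)
means = np.array([0.2, 0.9, 1.5])   # empirical means so far
counts = np.array([5, 5, 5])        # pulls per arm so far
print(thompson_arm(means, counts, rng),
      lcb_arm(means, counts),
      murphy_arm(means, counts, gamma=0.0, rng=rng))
```

In this sketch the only difference between `thompson_arm` and `murphy_arm` is the conditioning step: Murphy Sampling always acts as if some mean were below the threshold, even when the data suggest otherwise.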


