Near Instance Optimal Model Selection for Pure Exploration Linear Bandits

09/10/2021
by Yinglun Zhu, et al.
The model selection problem in the pure exploration linear bandit setting is introduced and studied in both the fixed confidence and fixed budget settings. The model selection problem considers a nested sequence of hypothesis classes of increasing complexities. Our goal is to automatically adapt to the instance-dependent complexity measure of the smallest hypothesis class containing the true model, rather than suffering from the complexity measure related to the largest hypothesis class. We provide evidence showing that a standard doubling trick over dimension fails to achieve the optimal instance-dependent sample complexity. Our algorithms define a new optimization problem based on experimental design that leverages the geometry of the action set to efficiently identify a near-optimal hypothesis class. Our fixed budget algorithm uses a novel application of a selection-validation trick in bandits. This provides a new method for the understudied fixed budget setting in linear bandits (even without the added challenge of model selection). We further generalize the model selection problem to the misspecified regime, adapting our algorithms in both fixed confidence and fixed budget settings.


