Empirical Bayes Regret Minimization

04/04/2019
by Chih-Wei Hsu, et al.

The prevalent approach to bandit algorithm design is to construct algorithms that have low regret by design. While celebrated, this approach is often conservative because it ignores many intricate properties of actual problem instances. In this work, we pioneer the idea of minimizing an empirical approximation to the Bayes regret, the expected regret with respect to a distribution over problems. This approach can be viewed as an instance of learning-to-learn; it is conceptually straightforward and easy to implement. We conduct a comprehensive empirical study of empirical Bayes regret minimization in a wide range of bandit problems, from Bernoulli bandits to structured problems such as generalized linear and Gaussian process bandits. We report significant improvements over state-of-the-art bandit algorithms, often by an order of magnitude, by simply optimizing over a sample of problem instances drawn from the distribution.
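To make the idea of minimizing an empirical approximation to the Bayes regret concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the authors' implementation: the prior (arm means drawn uniformly from [0, 1]), the UCB-style policy with a tunable confidence-width multiplier, the grid search over that multiplier, and all function names (`run_ucb`, `empirical_bayes_regret`, `tune_width`) are hypothetical choices made for this example. Grid search is used only because it is the simplest optimizer to show; the abstract does not specify one.

```python
import math
import random


def run_ucb(means, horizon, width, rng):
    """Run a UCB-style policy on one Bernoulli bandit and return its pseudo-regret."""
    k = len(means)
    pulls = [0] * k
    totals = [0.0] * k
    best = max(means)
    regret = 0.0
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # pull every arm once to initialize the estimates
        else:
            # Index = empirical mean + width * confidence radius.
            arm = max(
                range(k),
                key=lambda i: totals[i] / pulls[i]
                + width * math.sqrt(2.0 * math.log(t) / pulls[i]),
            )
        reward = 1.0 if rng.random() < means[arm] else 0.0
        pulls[arm] += 1
        totals[arm] += reward
        regret += best - means[arm]
    return regret


def empirical_bayes_regret(width, instances, horizon, seed):
    """Average regret of the policy over the sampled problem instances."""
    rng = random.Random(seed)  # same reward randomness for every candidate width
    total = sum(run_ucb(means, horizon, width, rng) for means in instances)
    return total / len(instances)


def tune_width(candidates, num_instances=200, num_arms=2, horizon=200, seed=0):
    """Pick the width with the lowest empirical Bayes regret on a sample of problems."""
    rng = random.Random(seed)
    # Assumed prior for this sketch: arm means are i.i.d. uniform on [0, 1].
    instances = [
        [rng.random() for _ in range(num_arms)] for _ in range(num_instances)
    ]
    scores = {
        w: empirical_bayes_regret(w, instances, horizon, seed + 1)
        for w in candidates
    }
    return min(scores, key=scores.get), scores


if __name__ == "__main__":
    best_width, scores = tune_width([0.1, 0.25, 0.5, 1.0])
    for w, s in sorted(scores.items()):
        print(f"width={w:.2f}  empirical Bayes regret={s:.2f}")
    print("selected width:", best_width)
```

The average over sampled instances stands in for the expectation over the problem distribution, so the selected width is only as good as the sample is representative; a larger `num_instances` gives a less noisy estimate at higher cost.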


