
X-Armed Bandits: Optimizing Quantiles and Other Risks

by Léonard Torossian, et al.
ENS Lyon

We propose and analyze StoROO, an algorithm derived from StoOO for risk optimization of stochastic black-box functions. Motivated by risk-averse decision making in fields such as agriculture, medicine, biology, and finance, we focus not on the mean payoff but on generic functionals of the return distribution, such as quantiles. We provide a generic regret analysis of StoROO. Inspired by the bandit literature and black-box mean optimizers, StoROO relies on the ability to construct confidence intervals for the targeted functional from random-size samples. We explain in detail how to construct these intervals for quantiles, providing tight bounds based on the Kullback-Leibler divergence. The value of these tight bounds is highlighted by numerical experiments that show a dramatic improvement over standard approaches.
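
To make the idea of a Kullback-Leibler confidence interval for a quantile concrete, here is a minimal sketch. It is not the construction from the paper: it assumes a fixed (not random) sample size, i.i.d. samples from a continuous distribution, and a plain Chernoff bound on the empirical CDF evaluated at the true quantile. The function names `kl_bernoulli`, `kl_level_bounds`, and `quantile_confidence_interval` are hypothetical and chosen for illustration only.

```python
import math


def kl_bernoulli(p, q):
    """Kullback-Leibler divergence between Bernoulli(p) and Bernoulli(q)."""
    eps = 1e-12
    q = min(max(q, eps), 1.0 - eps)  # keep the logarithms finite
    value = 0.0
    if p > 0.0:
        value += p * math.log(p / q)
    if p < 1.0:
        value += (1.0 - p) * math.log((1.0 - p) / (1.0 - q))
    return value


def kl_level_bounds(tau, n, delta, iters=60):
    """Range of empirical CDF values at the tau-quantile that the Chernoff
    bound cannot rule out: roughly {p : n * kl(p, tau) <= log(2 / delta)}.
    Assumption: fixed sample size n (no union bound over random sizes)."""
    threshold = math.log(2.0 / delta) / n

    # Lower end: bisection on [0, tau], where kl(., tau) is decreasing.
    if kl_bernoulli(0.0, tau) <= threshold:
        p_lo = 0.0
    else:
        lo, hi = 0.0, tau
        for _ in range(iters):
            mid = 0.5 * (lo + hi)
            if kl_bernoulli(mid, tau) > threshold:
                lo = mid
            else:
                hi = mid
        p_lo = lo  # slightly below the exact root, hence conservative

    # Upper end: bisection on [tau, 1], where kl(., tau) is increasing.
    if kl_bernoulli(1.0, tau) <= threshold:
        p_hi = 1.0
    else:
        lo, hi = tau, 1.0
        for _ in range(iters):
            mid = 0.5 * (lo + hi)
            if kl_bernoulli(mid, tau) > threshold:
                hi = mid
            else:
                lo = mid
        p_hi = hi  # slightly above the exact root, hence conservative

    return p_lo, p_hi


def quantile_confidence_interval(samples, tau, delta):
    """Order-statistic interval for the tau-quantile, intended to hold with
    probability at least 1 - delta under the assumptions stated above."""
    xs = sorted(samples)
    n = len(xs)
    p_lo, p_hi = kl_level_bounds(tau, n, delta)
    k = math.floor(n * p_lo)      # 1-based rank of the lower order statistic
    m = math.floor(n * p_hi) + 1  # 1-based rank of the upper order statistic
    lower = xs[k - 1] if k >= 1 else -math.inf
    upper = xs[m - 1] if m <= n else math.inf
    return lower, upper


if __name__ == "__main__":
    data = list(range(1, 101))  # toy sample, for illustration only
    print(quantile_confidence_interval(data, tau=0.5, delta=0.05))
```

The interval is read off the sorted sample, so no assumption on the shape of the distribution is needed beyond continuity. When the sample is too small to exclude an extreme empirical level, the corresponding endpoint is infinite. Handling random-size samples, as StoROO requires, would need an additional argument that this sketch does not attempt.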



