-
Nonparametric Contextual Bandits in an Unknown Metric Space
Consider a nonparametric contextual multi-arm bandit problem where each ...
-
Thompson Sampling for CVaR Bandits
Risk awareness is an important feature to formulate a variety of real wo...
-
Recovering Bandits
We study the recovering bandits problem, a variant of the stochastic mul...
-
Compliance-Aware Bandits
Motivated by clinical trials, we study bandits with observable non-compl...
-
The Combinatorial Multi-Bandit Problem and its Application to Energy Management
We study a Combinatorial Multi-Bandit Problem motivated by applications ...
-
Efficient and Robust Algorithms for Adversarial Linear Contextual Bandits
We consider an adversarial variant of the classic K-armed linear context...
-
Predictive Bandits
We introduce and study a new class of stochastic bandit problems, referr...
Bandits for BMO Functions
We study the bandit problem where the underlying expected reward is a Bounded Mean Oscillation (BMO) function. BMO functions are allowed to be discontinuous and unbounded, and are useful in modeling signals with infinities in the domain. We develop a toolset for BMO bandits, and provide an algorithm that can achieve poly-log δ-regret – a regret measured against an arm that is optimal after removing a δ-sized portion of the arm space.
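To make the δ-regret benchmark concrete, here is a minimal sketch under stated assumptions; it is not the paper's algorithm. It assumes the arm space is [0, 1) with the uniform measure, uses f(x) = -log(x) as a standard example of an unbounded BMO function, and takes the per-round δ-regret to be max(0, f^δ - f(a_t)), where f^δ is the reward level exceeded only on a δ-sized portion of the arm space. The names `delta_benchmark`, `reward`, and `delta_regret` are hypothetical.

```python
import math

def delta_benchmark(f, delta, n=100_000):
    """Approximate the reward level that f exceeds only on a
    delta-fraction of [0, 1) (uniform measure assumed)."""
    # A midpoint grid avoids evaluating f at the singularity x = 0.
    values = sorted((f((i + 0.5) / n) for i in range(n)), reverse=True)
    k = int(delta * n)  # number of grid cells in the removed top portion
    return values[min(k, n - 1)]

def reward(x):
    # Hypothetical expected reward: unbounded near 0, yet in BMO.
    return -math.log(x)

def delta_regret(pulls, f, delta):
    """Cumulative delta-regret of a sequence of pulled arms
    (assumed form: sum of max(0, benchmark - f(arm)))."""
    bench = delta_benchmark(f, delta)
    return sum(max(0.0, bench - f(x)) for x in pulls)

if __name__ == "__main__":
    # For f(x) = -log(x), the exact benchmark is log(1 / delta).
    print(delta_benchmark(reward, 0.1), math.log(10))
    print(delta_regret([0.05, 0.5, 0.9], reward, 0.1))
```

Because the reward is unbounded, regret against the true supremum would be infinite for any arm; measuring against the level f^δ keeps the benchmark finite, which is the motivation the abstract gives for δ-regret.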