
Regularized Contextual Bandits
We consider the stochastic contextual bandit problem with additional regularization. The motivation comes from problems where the agent's policy must stay close to a baseline policy that is known to perform well on the task. To tackle this problem, we use a nonparametric model and propose an algorithm that splits the context space into bins and solves, simultaneously and independently, a regularized multi-armed bandit instance on each bin. We derive slow and fast rates of convergence, depending on the unknown complexity of the problem. We also consider a new, relevant margin condition to obtain problem-independent convergence rates, yielding intermediate rates that interpolate between the aforementioned slow and fast ones.
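The binning idea above can be sketched in a few lines. The following is a minimal illustrative sketch, not the paper's actual algorithm: the uniform grid, the `RegularizedBinBandit` class, and the baseline-mixing rule used as a stand-in for the regularization are all assumptions made for the example.

```python
import numpy as np

def bin_index(context, bins_per_dim):
    """Map a context in [0, 1)^d to a flat index on a uniform grid of bins."""
    idx = np.minimum((np.asarray(context) * bins_per_dim).astype(int),
                     bins_per_dim - 1)
    return int(np.ravel_multi_index(idx, [bins_per_dim] * len(idx)))

class RegularizedBinBandit:
    """Toy per-bin bandit: empirical-mean arm selection mixed with a fixed
    baseline distribution (the mixing plays the role of the regularization
    keeping the policy close to the baseline)."""
    def __init__(self, n_arms, baseline, mix=0.5, rng=None):
        self.counts = np.zeros(n_arms)
        self.sums = np.zeros(n_arms)
        self.baseline = np.asarray(baseline)  # known, well-performing policy
        self.mix = mix                        # probability of following it
        self.rng = rng or np.random.default_rng(0)

    def select(self):
        # With probability `mix`, follow the baseline policy.
        if self.rng.random() < self.mix:
            return int(self.rng.choice(len(self.counts), p=self.baseline))
        # Otherwise play each arm once, then exploit the empirical mean.
        untried = np.flatnonzero(self.counts == 0)
        if untried.size:
            return int(untried[0])
        return int(np.argmax(self.sums / self.counts))

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.sums[arm] += reward

# One independent bandit instance per bin, created lazily.
bandits = {}
def act(context, n_arms, baseline, bins_per_dim=4):
    b = bin_index(context, bins_per_dim)
    if b not in bandits:
        bandits[b] = RegularizedBinBandit(n_arms, baseline)
    return b, bandits[b].select()
```

Each bin keeps its own statistics, so the instances can be updated simultaneously and independently as contexts arrive, matching the decomposition described in the abstract.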