Make the Minority Great Again: First-Order Regret Bound for Contextual Bandits

02/09/2018
by   Zeyuan Allen-Zhu, et al.
0

Regret bounds in online learning compare the player's performance to L^*, the optimal performance in hindsight with a fixed strategy. Typically such bounds scale with the square root of the time horizon T. The more refined concept of first-order regret bound replaces this with a scaling √(L^*), which may be much smaller than √(T). It is well known that minor variants of standard algorithms satisfy first-order regret bounds in the full information and multi-armed bandit settings. In a COLT 2017 open problem, Agarwal, Krishnamurthy, Langford, Luo, and Schapire raised the issue that existing techniques do not seem sufficient to obtain first-order regret bounds for the contextual bandit problem. In the present paper, we resolve this open problem by presenting a new strategy based on augmenting the policy space.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
03/03/2020

MOTS: Minimax Optimal Thompson Sampling

Thompson sampling is one of the most widely used algorithms for many onl...
research
09/15/2012

Further Optimal Regret Bounds for Thompson Sampling

Thompson Sampling is one of the oldest heuristics for multi-armed bandit...
research
04/24/2022

Complete Policy Regret Bounds for Tallying Bandits

Policy regret is a well established notion of measuring the performance ...
research
02/02/2019

First-Order Regret Analysis of Thompson Sampling

We address online combinatorial optimization when the player has a prior...
research
05/26/2017

Online Auctions and Multi-scale Online Learning

We consider revenue maximization in online auctions and pricing. A selle...
research
02/11/2022

A PDE-Based Analysis of the Symmetric Two-Armed Bernoulli Bandit

This work addresses a version of the two-armed Bernoulli bandit problem ...
research
05/30/2019

Equipping Experts/Bandits with Long-term Memory

We propose the first reduction-based approach to obtaining long-term mem...

Please sign up or login with your details

Forgot password? Click here to reset