Efficient Online Bandit Multiclass Learning with Õ(√(T)) Regret

02/25/2017
by   Alina Beygelzimer, et al.

We present an efficient second-order algorithm with Õ(1/η √T) regret for the bandit online multiclass problem. The regret bound holds simultaneously with respect to a family of loss functions parameterized by η, for a range of η restricted by the norm of the competitor. The family of loss functions ranges from hinge loss (η = 0) to squared hinge loss (η = 1). This resolves the open problem posed by J. Abernethy and A. Rakhlin, "An efficient bandit algorithm for √T-regret in online multiclass prediction?" (COLT, 2009). We evaluate our algorithm experimentally, showing that it also performs favorably against earlier algorithms.
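To make the problem setting concrete, the sketch below shows the generic bandit multiclass protocol the abstract refers to: on each round the learner commits to a single label and observes only whether that label was correct. The update shown is the Banditron-style first-order, importance-weighted update (one of the "earlier algorithms" the paper compares against), not the paper's second-order method; the exploration rate `gamma` and the toy data are illustrative assumptions.

```python
import numpy as np

def bandit_multiclass_loop(X, y, k, gamma=0.1, seed=0):
    """Generic bandit multiclass protocol with a Banditron-style update.

    This is NOT the paper's second-order algorithm; it only illustrates
    the feedback model: the learner sees 1[predicted label == true label],
    never the true label itself.
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = np.zeros((k, d))            # one linear predictor per class
    mistakes = 0
    for x_t, y_t in zip(X, y):
        scores = W @ x_t
        y_hat = int(np.argmax(scores))
        # Explore: mix the greedy label with uniform exploration.
        probs = np.full(k, gamma / k)
        probs[y_hat] += 1.0 - gamma
        y_tilde = int(rng.choice(k, p=probs))
        correct = (y_tilde == y_t)  # the ONLY feedback observed
        mistakes += int(not correct)
        # Importance-weighted (unbiased) gradient estimate.
        U = np.zeros((k, d))
        if correct:
            U[y_tilde] += x_t / probs[y_tilde]
        U[y_hat] -= x_t
        W += U
    return mistakes

# Toy run on trivially separable data: class y has the y-th one-hot feature.
k, T = 3, 300
rng = np.random.default_rng(1)
labels = rng.integers(0, k, size=T)
features = np.eye(k)[labels]
n_mistakes = bandit_multiclass_loop(features, labels, k)
```

Even this first-order baseline drives the mistake count well below T on separable data; the paper's contribution is a second-order update achieving Õ(√T) regret in this same feedback model.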


Related research

Improved Regret Bounds for Online Kernel Selection under Bandit Feedback (03/09/2023)
In this paper, we improve the regret bound for online kernel selection u...

Projection-free Distributed Online Learning with Strongly Convex Losses (03/20/2021)
To efficiently solve distributed online learning problems with complicat...

Semi-bandit Optimization in the Dispersed Setting (04/18/2019)
In this work, we study the problem of online optimization of piecewise L...

Exploiting the Surrogate Gap in Online Multiclass Classification (07/24/2020)
We present Gaptron, a randomized first-order algorithm for online multic...

Multi-Point Bandit Algorithms for Nonstationary Online Nonconvex Optimization (07/31/2019)
Bandit algorithms have been predominantly analyzed in the convex setting...

Adaptation to Easy Data in Prediction with Limited Advice (07/02/2018)
We derive an online learning algorithm with improved regret guarantees f...

Online k-means Clustering (09/15/2019)
We study the problem of online clustering where a clustering algorithm h...
