Experts with Lower-Bounded Loss Feedback: A Unifying Framework

12/17/2020
by Eyal Gofer, et al.

The most prominent feedback models for the best expert problem are the full-information and bandit models. In this work we consider a simple feedback model that generalizes both: on every round, in addition to bandit feedback, the adversary provides a lower bound on the loss of each expert. Such lower bounds may be obtained in various scenarios, for instance, in stock trading or in assessing errors of certain measurement devices. For this model we prove optimal regret bounds (up to logarithmic factors) for modified versions of Exp3, generalizing algorithms and bounds for both the bandit and the full-information settings. Our second-order unified regret analysis simulates a two-step loss update and highlights three Hessian or Hessian-like expressions, which map to the full-information regret, the bandit regret, and a hybrid of both. Our results intersect with those for bandits with graph-structured feedback, in that both settings can accommodate feedback from an arbitrary subset of experts on each round. However, our model also accommodates partial feedback at the single-expert level, by allowing non-trivial lower bounds on each loss.
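To make the feedback model concrete, here is a minimal Python sketch of one round of an Exp3-style update that uses per-expert lower bounds. It is an illustration under stated assumptions, not the authors' algorithm: the abstract does not spell out the modified Exp3, so the function name `exp3_lower_bound_step`, the learning rate `eta`, and the particular loss estimator (lower bound plus an importance-weighted correction for the chosen expert) are all hypothetical choices. The sketch assumes losses in [0, 1] and lower bounds satisfying 0 <= lower_bounds[i] <= losses[i].

```python
import math
import random


def exp3_lower_bound_step(weights, losses, lower_bounds, eta, rng):
    """One round of an Exp3-style update with lower-bound feedback.

    Assumption (not from the paper): the learner estimates each loss as
        lower_bounds[i] + 1[i == chosen] * (losses[i] - lower_bounds[i]) / probs[i]
    which is unbiased because the correction term has expectation
    losses[i] - lower_bounds[i] under the sampling distribution.
    """
    total = sum(weights)
    probs = [w / total for w in weights]

    # Sample one expert; only its true loss is revealed (bandit feedback).
    chosen = rng.choices(range(len(weights)), weights=probs, k=1)[0]

    estimates = []
    for i, (p, lb) in enumerate(zip(probs, lower_bounds)):
        est = lb  # every expert contributes at least its lower bound
        if i == chosen:
            # Importance-weighted correction for the observed expert only.
            est += (losses[i] - lb) / p
        estimates.append(est)

    # Multiplicative-weights update on the estimated losses.
    new_weights = [w * math.exp(-eta * e) for w, e in zip(weights, estimates)]
    return chosen, probs, estimates, new_weights


if __name__ == "__main__":
    rng = random.Random(0)
    weights = [1.0, 1.0, 1.0]
    losses = [0.2, 0.5, 0.9]

    # Trivial lower bounds (all zero): the estimator is the usual bandit one.
    print(exp3_lower_bound_step(weights, losses, [0.0, 0.0, 0.0], 0.1, rng))

    # Tight lower bounds (equal to the losses): full-information feedback.
    print(exp3_lower_bound_step(weights, losses, losses, 0.1, rng))
```

Under this assumed estimator the two extremes behave as the abstract describes: with all lower bounds equal to zero the update reduces to the standard importance-weighted bandit estimate, and with lower bounds equal to the true losses the estimates coincide with the losses, as in the full-information setting. Intermediate lower bounds give partial feedback at the single-expert level.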


research
04/13/2022

Second Order Regret Bounds Against Generalized Expert Sequences under Partial Bandit Feedback

We study the problem of expert advice under partial bandit feedback sett...
research
05/30/2022

Improved Algorithms for Bandit with Graph Feedback via Regret Decomposition

The problem of bandit with graph feedback generalizes both the multi-arm...
research
09/12/2017

Setpoint Tracking with Partially Observed Loads

We use online convex optimization (OCO) for setpoint tracking with uncer...
research
05/15/2023

A Unified Analysis of Nonstochastic Delayed Feedback for Combinatorial Semi-Bandits, Linear Bandits, and MDPs

We derive a new analysis of Follow The Regularized Leader (FTRL) for onl...
research
02/13/2019

Distributed Online Linear Regression

We study online linear regression problems in a distributed setting, whe...
research
04/01/2018

Online learning with graph-structured feedback against adaptive adversaries

We derive upper and lower bounds for the policy regret of T-round online...
research
03/06/2023

Lower Bounds for γ-Regret via the Decision-Estimation Coefficient

In this note, we give a new lower bound for the γ-regret in bandit probl...
