Constant regret for sequence prediction with limited advice

10/05/2022
by El Mehdi Saad, et al.

We investigate the problem of cumulative regret minimization for individual sequence prediction, with respect to the best expert in a finite family of size K, under limited access to information. We assume that in each round the learner can predict using a convex combination of at most p experts, and can then observe a posteriori the losses of at most m experts. We assume that the loss function is range-bounded and exp-concave. In the standard multi-armed bandit setting, where the learner is allowed to play only one expert per round and observe only its feedback, the known optimal regret bounds are of order O(√(KT)). We show that allowing the learner to play one additional expert per round and observe one additional feedback substantially improves the guarantees on regret. We provide a strategy combining only p = 2 experts per round for prediction and observing the losses of m ≥ 2 experts. Its randomized regret (with respect to the internal randomization of the learner's strategy) is of order O((K/m) log(Kδ⁻¹)) with probability 1 − δ, i.e., it is independent of the horizon T ("constant" or "fast rate" regret), provided that p ≥ 2 and m ≥ 3. We prove that this rate is optimal up to a logarithmic factor in K. In the case p = m = 2, we provide an upper bound of order O(K² log(Kδ⁻¹)), with probability 1 − δ. Our strategies do not require any prior knowledge of the horizon T nor of the confidence parameter δ. Finally, we show that if the learner is constrained to observe only one expert's feedback per round, the worst-case regret is of the "slow rate" Ω(√(KT)), suggesting that synchronous observation of at least two experts per round is necessary to achieve constant regret.
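To make the feedback model concrete, here is a minimal simulation sketch of the protocol described above: K experts, a prediction budget of p experts per round, and an observation budget of m loss values per round, with regret measured against the best fixed expert in hindsight. The environment, the uniform-exploration rule, and the simple weighting of the two empirically best experts are all hypothetical placeholders chosen for illustration; they are not the strategy analyzed in the paper.

```python
# Sketch of the limited-advice protocol (NOT the paper's algorithm):
# each round the learner predicts with a convex combination of at most p experts,
# then observes a posteriori the losses of at most m experts.
import numpy as np

rng = np.random.default_rng(0)
K, T, p, m = 10, 5000, 2, 3          # experts, horizon, prediction/observation budgets

# Hypothetical environment: expert k suffers a bounded loss with unknown mean mu[k].
mu = rng.uniform(0.2, 0.8, size=K)

loss_sums = np.zeros(K)              # cumulative observed losses per expert
obs_counts = np.zeros(K)             # number of times each expert was observed
learner_loss = 0.0
expert_losses = np.zeros(K)          # cumulative (oracle) losses of every expert

for t in range(T):
    losses = np.clip(mu + 0.1 * rng.standard_normal(K), 0.0, 1.0)

    # Prediction step: convex combination of at most p experts.
    means = np.where(obs_counts > 0, loss_sums / np.maximum(obs_counts, 1), 0.5)
    chosen = np.argsort(means)[:p]                   # p apparently-best experts
    weights = np.full(p, 1.0 / p)                    # simple uniform mixing (placeholder)
    # For simplicity we charge the learner the mixture of the chosen experts' losses,
    # which upper-bounds the loss of the combined prediction for convex losses.
    learner_loss += float(weights @ losses[chosen])

    # Feedback step: observe the losses of at most m experts.
    observed = rng.choice(K, size=m, replace=False)  # uniform exploration (placeholder)
    loss_sums[observed] += losses[observed]
    obs_counts[observed] += 1

    expert_losses += losses

regret = learner_loss - expert_losses.min()
print(f"regret vs. best fixed expert after T={T} rounds: {regret:.2f}")
```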

