Lower Bounds for γ-Regret via the Decision-Estimation Coefficient

03/06/2023
by Margalit Glasgow, et al.

In this note, we give a new lower bound for the γ-regret in bandit problems, the regret which arises when comparing against a benchmark that is γ times the optimal solution, i.e., $\mathrm{Reg}_\gamma(T) = \sum_{t=1}^{T} \gamma \max_\pi f(\pi) - f(\pi_t)$. The γ-regret arises in structured bandit problems where finding an exact optimum of f is intractable. Our lower bound is given in terms of a modification of the constrained Decision-Estimation Coefficient (DEC) of <cit.> (and closely related to the original offset DEC of <cit.>), which we term the γ-DEC. When restricted to the traditional regret setting where γ = 1, our result removes the logarithmic factors in the lower bound of <cit.>.
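To make the benchmark concrete, here is a minimal sketch (in Python, not from the paper) of how the γ-regret of a sequence of decisions would be computed for a toy bandit instance with a known reward function; the function and variable names are hypothetical.

```python
# Illustrative sketch: gamma-regret of a play sequence against the
# gamma-scaled optimum, assuming a known reward function f over a
# finite decision set. All names here are hypothetical.
def gamma_regret(f_values, chosen, gamma):
    """f_values: dict mapping each decision pi to its reward f(pi).
    chosen: list of decisions pi_t played over T rounds.
    gamma: approximation factor in (0, 1].
    Returns sum over t of [ gamma * max_pi f(pi) - f(pi_t) ]."""
    best = max(f_values.values())
    return sum(gamma * best - f_values[pi] for pi in chosen)

# Toy example: three arms, a learner that settles on a near-optimal arm.
f_values = {"a": 1.0, "b": 0.8, "c": 0.2}
plays = ["c", "b", "b", "a"]
print(gamma_regret(f_values, plays, gamma=0.9))  # 0.7 + 0.1 + 0.1 - 0.1 = 0.8
```

With γ = 1 this reduces to the standard cumulative regret; for γ < 1 the per-round term can even be negative when the learner plays the exact optimum, which is what makes the γ-scaled benchmark meaningful in settings where exact optimization is intractable.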
