Lower Bounds for γ-Regret via the Decision-Estimation Coefficient

03/06/2023
by   Margalit Glasgow, et al.

In this note, we give a new lower bound for the γ-regret in bandit problems, the regret that arises when comparing against a benchmark that is γ times the optimal solution, i.e., $\mathsf{Reg}_\gamma(T) = \sum_{t=1}^{T} \left( \gamma \max_\pi f(\pi) - f(\pi_t) \right)$. The γ-regret arises in structured bandit problems where finding an exact optimum of f is intractable. Our lower bound is given in terms of a modification of the constrained Decision-Estimation Coefficient (DEC) of <cit.> (and closely related to the original offset DEC of <cit.>), which we term the γ-DEC. When restricted to the traditional regret setting where γ = 1, our result removes the logarithmic factors in the lower bound of <cit.>.
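
To make the definition of the γ-regret concrete, the following minimal Python sketch evaluates the sum for a toy problem with a known reward function. The function name `gamma_regret`, the three-decision reward table, and the sequence of played decisions are hypothetical and chosen only for illustration; they do not come from the note.

```python
from typing import Callable, Hashable, Sequence


def gamma_regret(
    f: Callable[[Hashable], float],
    decisions: Sequence[Hashable],
    played: Sequence[Hashable],
    gamma: float,
) -> float:
    """Return sum_t (gamma * max_pi f(pi) - f(pi_t)) over the played decisions."""
    # Value of the best decision in the (finite, illustrative) decision set.
    best = max(f(pi) for pi in decisions)
    # Each round is compared against the scaled benchmark gamma * best.
    return sum(gamma * best - f(pi_t) for pi_t in played)


# Hypothetical toy instance: three decisions with known mean rewards.
rewards = {"a": 1.0, "b": 0.5, "c": 0.25}
played = ["b", "b", "a", "c"]

standard = gamma_regret(rewards.__getitem__, list(rewards), played, gamma=1.0)
relaxed = gamma_regret(rewards.__getitem__, list(rewards), played, gamma=0.5)
print(standard, relaxed)
```

With γ = 1 this reduces to the traditional regret. For γ < 1 the benchmark is weaker, so the γ-regret can be negative whenever the learner outperforms γ times the optimum, as it does in this toy instance.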
