Improved Regret for Zeroth-Order Adversarial Bandit Convex Optimisation

05/31/2020
by Tor Lattimore, et al.

We prove that the information-theoretic upper bound on the minimax regret for adversarial bandit convex optimisation is at most O(d^3 √(n) log(n)), improving on the O(d^9.5 √(n) log(n)^7.5) bound of Bubeck et al. (2017). The proof is based on identifying an improved exploratory distribution for convex functions.
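For context, below is a minimal sketch of the quantity being bounded, assuming the standard definition of adversarial bandit regret: the learner plays x_t in a convex set K ⊆ R^d, the adversary chooses convex losses f_t, and only the scalar f_t(x_t) is observed. This definition is the usual one and is not spelled out in the abstract itself.

```latex
% Assumed (standard) definition of expected adversarial regret over n rounds:
% the learner competes with the best fixed point in hindsight.
R_n \;=\; \max_{x \in K} \, \mathbb{E}\!\left[ \sum_{t=1}^{n} \bigl( f_t(x_t) - f_t(x) \bigr) \right]

% Bound stated in the abstract:        R_n = O\bigl( d^{3}   \sqrt{n} \, \log(n)       \bigr)
% Prior bound (Bubeck et al., 2017):   R_n = O\bigl( d^{9.5} \sqrt{n} \, \log(n)^{7.5} \bigr)
```

Both bounds apply to the same setting; the improvement is in the dependence on the dimension d and on the power of the logarithm.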


Related research

- Minimax Regret for Bandit Convex Optimisation of Ridge Functions (06/01/2021): We analyse adversarial bandit convex optimisation with an adversary that...
- Bandit Convex Optimisation Revisited: FTRL Achieves Õ(t^1/2) Regret (02/01/2023): We show that a kernel estimator using multiple function evaluations can ...
- A Note on Bounding Regret of the C^2UCB Contextual Combinatorial Bandit (02/20/2019): We revisit the proof by Qin et al. (2014) of bounded regret of the C^2UC...
- A Second-Order Method for Stochastic Bandit Convex Optimisation (02/10/2023): We introduce a simple and efficient algorithm for unconstrained zeroth-o...
- Corruption-Robust Contextual Search through Density Updates (06/15/2022): We study the problem of contextual search in the adversarial noise model...
- Derivative-Free Order-Robust Optimisation (10/09/2019): In this paper, we formalise order-robust optimisation as an instance of ...
- Gaussian Imagination in Bandit Learning (01/06/2022): Assuming distributions are Gaussian often facilitates computations that ...
