Derivative-Free Order-Robust Optimisation

10/09/2019
by Victor Gabillon, et al.

In this paper, we formalise order-robust optimisation as an instance of online learning minimising simple regret, and propose Vroom, a zeroth-order optimisation algorithm capable of achieving vanishing regret in non-stationary environments while recovering favourable rates under stochastic reward-generating processes. Our results are the first to target simple-regret definitions in adversarial scenarios, unveiling a challenge that has rarely been considered in prior work.


