Bandit Convex Optimization in Non-stationary Environments

07/29/2019
by Peng Zhao, et al.

Bandit Convex Optimization (BCO) is a fundamental framework for modeling sequential decision-making with partial information, where the only feedback available to the player is the function value at one or two queried points (one-point or two-point feedback). In this paper, we investigate BCO in non-stationary environments and choose dynamic regret as the performance measure, defined as the difference between the cumulative loss incurred by the algorithm and that of any feasible comparator sequence. Let T be the time horizon and P_T be the path-length of the comparator sequence, which reflects the non-stationarity of the environment. We propose a novel algorithm that achieves O(T^{3/4}(1+P_T)^{1/2}) and O(T^{1/2}(1+P_T)^{1/2}) dynamic regret for the one-point and two-point feedback models, respectively. The latter result is optimal, matching the Ω(T^{1/2}(1+P_T)^{1/2}) lower bound established in this paper. Notably, our algorithm is more adaptive to non-stationary environments because it does not require prior knowledge of the path-length P_T, which is generally unknown.
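To make the two quantities in the abstract concrete, the following is a minimal sketch of how dynamic regret and path-length could be computed for a toy problem. It is an illustration, not the paper's algorithm. It assumes the path-length is measured in the Euclidean norm, P_T = sum over t = 2..T of ||u_{t-1} - u_t||_2; the function names and the toy quadratic losses are hypothetical.

```python
import math

def path_length(comparators):
    """Sum of Euclidean distances between consecutive comparators u_1, ..., u_T."""
    return sum(
        math.dist(u_prev, u_next)
        for u_prev, u_next in zip(comparators, comparators[1:])
    )

def dynamic_regret(losses, decisions, comparators):
    """Cumulative loss of the decisions minus that of the comparator sequence."""
    algorithm_loss = sum(f(x) for f, x in zip(losses, decisions))
    comparator_loss = sum(f(u) for f, u in zip(losses, comparators))
    return algorithm_loss - comparator_loss

# Toy non-stationary problem (assumed for illustration): the minimizer of
# f_t(x) = (x - c_t)^2 drifts from 0 to 2 over three rounds.
centers = [0.0, 1.0, 2.0]
losses = [lambda x, c=c: (x[0] - c) ** 2 for c in centers]

decisions = [(0.0,), (0.0,), (0.0,)]    # a player that never moves
comparators = [(c,) for c in centers]   # comparator that tracks the minimizer

toy_path_length = path_length(comparators)
toy_regret = dynamic_regret(losses, decisions, comparators)
```

In this toy run the static player pays for ignoring the drift: the comparator sequence moves a total distance of 2 and incurs zero loss, so the whole cumulative loss of the player counts as dynamic regret.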

research
12/29/2021

Adaptivity and Non-stationarity: Problem-dependent Dynamic Regret for Online Convex Optimization

We investigate online convex optimization in non-stationary environments...
research
10/11/2022

On Adaptivity in Non-stationary Stochastic Optimization With Bandit Feedback

In this paper we study the non-stationary stochastic optimization questi...
research
05/20/2023

Non-stationary Online Convex Optimization with Arbitrary Delays

Online convex optimization (OCO) with arbitrary delays, in which gradien...
research
07/19/2018

Adaptive Variational Particle Filtering in Non-stationary Environments

Online convex optimization is a sequential prediction framework with the...
research
02/05/2018

Wireless Optimisation via Convex Bandits: Unlicensed LTE/WiFi Coexistence

Bandit Convex Optimisation (BCO) is a powerful framework for sequential ...
research
09/19/2020

Recursive Experts: An Efficient Optimal Mixture of Learning Systems in Dynamic Environments

Sequential learning systems are used in a wide variety of problems from ...
research
12/22/2017

Network Utility Maximization in Adversarial Environments

Stochastic models have been dominant in network optimization theory for ...