Achieving Near Instance-Optimality and Minimax-Optimality in Stochastic and Adversarial Linear Bandits Simultaneously

02/11/2021
by   Chung-Wei Lee, et al.
0

In this work, we develop linear bandit algorithms that automatically adapt to different environments. By plugging a novel loss estimator into the optimization problem that characterizes the instance-optimal strategy, our first algorithm not only achieves nearly instance-optimal regret in stochastic environments, but also works in corrupted environments with additional regret being the amount of corruption, while the state-of-the-art (Li et al., 2019) achieves neither instance-optimality nor the optimal dependence on the corruption amount. Moreover, by equipping this algorithm with an adversarial component and carefully-designed testings, our second algorithm additionally enjoys minimax-optimal regret in completely adversarial environments, which is the first of this kind to our knowledge. Finally, all our guarantees hold with high probability, while existing instance-optimal guarantees only hold in expectation.

READ FULL TEXT
research
06/27/2012

An Adaptive Algorithm for Finite Stochastic Partial Monitoring

We present a new anytime algorithm that achieves near-optimal regret for...
research
01/28/2022

Adaptive Best-of-Both-Worlds Algorithm for Heavy-Tailed Multi-Armed Bandits

In this paper, we generalize the concept of heavy-tailed multi-armed ban...
research
07/19/2018

An Optimal Algorithm for Stochastic and Adversarial Bandits

We provide an algorithm that achieves the optimal (up to constants) fini...
research
06/29/2022

Best of Both Worlds Model Selection

We study the problem of model selection in bandit scenarios in the prese...
research
05/18/2022

Slowly Changing Adversarial Bandit Algorithms are Provably Efficient for Discounted MDPs

Reinforcement learning (RL) generalizes bandit problems with additional ...
research
09/04/2019

Stochastic Linear Optimization with Adversarial Corruption

We extend the model of stochastic bandits with adversarial corruption (L...
research
02/21/2019

Certainty Equivalent Control of LQR is Efficient

We study the performance of the certainty equivalent controller on the L...

Please sign up or login with your details

Forgot password? Click here to reset