Stochastic Multi-Armed Bandits with Unrestricted Delay Distributions

06/04/2021
by Tal Lancewicki, et al.

We study the stochastic Multi-Armed Bandit (MAB) problem with random delays in the feedback received by the algorithm. We consider two settings: the reward-dependent delay setting, where realized delays may depend on the stochastic rewards, and the reward-independent delay setting, where they do not. Our main contribution is a set of algorithms that achieve near-optimal regret in each setting, with an additive dependence on the quantiles of the delay distribution. Our results make no assumptions on the delay distributions: in particular, we do not assume that they come from any parametric family, and we allow for unbounded support and expectation. We further allow for infinite delays, where the algorithm might occasionally not observe any feedback at all.
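To make the delayed-feedback setting concrete, the following is a minimal simulation sketch. It is not the algorithm from the paper: it uses a standard UCB1-style index computed only from rewards that have already arrived, and every name in it (`run_delayed_ucb`, `delay_sampler`, the Bernoulli arm means) is a hypothetical choice made for illustration. The delay sampler receives the realized reward, so it can model either the reward-independent or the reward-dependent setting, and it may return `math.inf` to model feedback that never arrives.

```python
import heapq
import math
import random


def run_delayed_ucb(means, delay_sampler, horizon, seed=0):
    """Simulate a UCB1-style learner that only sees feedback after a delay.

    means: Bernoulli mean of each arm (illustrative values, an assumption).
    delay_sampler: callable (rng, arm, reward) -> delay in rounds; returning
        math.inf models feedback that is never observed.
    Returns (pulls, observed): per-arm play counts and arrived-feedback counts.
    """
    rng = random.Random(seed)
    k = len(means)
    pulls = [0] * k          # times each arm was played
    observed = [0] * k       # rewards that have arrived, per arm
    reward_sum = [0.0] * k   # sum of arrived rewards, per arm
    pending = []             # min-heap of (arrival_round, arm, reward)

    for t in range(1, horizon + 1):
        # Deliver every reward whose delay has elapsed by round t.
        while pending and pending[0][0] <= t:
            _, arm, reward = heapq.heappop(pending)
            observed[arm] += 1
            reward_sum[arm] += reward

        def index(a):
            # Index uses observed feedback only; ties go to the least-played arm.
            if observed[a] == 0:
                return (math.inf, -pulls[a])
            mean = reward_sum[a] / observed[a]
            bonus = math.sqrt(2.0 * math.log(t) / observed[a])
            return (mean + bonus, -pulls[a])

        arm = max(range(k), key=index)
        pulls[arm] += 1
        reward = 1.0 if rng.random() < means[arm] else 0.0
        delay = delay_sampler(rng, arm, reward)
        if math.isfinite(delay):
            heapq.heappush(pending, (t + delay, arm, reward))

    return pulls, observed


# Reward-independent delays: uniform on {0, ..., 20} regardless of the reward.
independent = lambda rng, arm, reward: rng.randint(0, 20)
# Reward-dependent delays: successes take longer to report (an assumed model).
dependent = lambda rng, arm, reward: rng.randint(0, 5) + (30 if reward else 0)

pulls_ind, obs_ind = run_delayed_ucb([0.2, 0.8], independent, horizon=2000)
pulls_dep, obs_dep = run_delayed_ucb([0.2, 0.8], dependent, horizon=2000)
```

In the reward-dependent variant above, the rewards that arrive early are disproportionately zeros, so the empirical means built from observed feedback are biased early on. This is one intuition for why the two settings are treated separately.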

research
02/24/2022

Thompson Sampling with Unrestricted Delays

We investigate properties of Thompson Sampling in the stochastic multi-a...
research
07/21/2022

Delayed Feedback in Generalised Linear Bandits Revisited

The stochastic generalised linear bandit is a well-understood model for ...
research
06/18/2020

Stochastic bandits with arm-dependent delays

Significant work has been recently dedicated to the stochastic delayed b...
research
10/26/2021

Scale-Free Adversarial Multi-Armed Bandit with Arbitrary Feedback Delays

We consider the Scale-Free Adversarial Multi Armed Bandit (MAB) problem ...
research
12/02/2022

Multi-Agent Reinforcement Learning with Reward Delays

This paper considers multi-agent reinforcement learning (MARL) where the...
research
01/23/2019

Cooperation Speeds Surfing: Use Co-Bandit!

In this paper, we explore the benefit of cooperation in adversarial band...
research
06/21/2021

Smooth Sequential Optimisation with Delayed Feedback

Stochastic delays in feedback lead to unstable sequential learning using...