Adaptive Algorithms for Multi-armed Bandit with Composite and Anonymous Feedback

12/13/2020
by Siwei Wang, et al.

We study the multi-armed bandit (MAB) problem with composite and anonymous feedback. In this model, the reward of pulling an arm spreads over a period of time (we call this period the reward interval), and the player successively receives partial rewards of the action, convoluted with rewards from pulling other arms. Existing results on this model require prior knowledge of the reward interval size as an input to their algorithms. In this paper, we propose adaptive algorithms for both the stochastic and the adversarial cases that do not require any prior information about the reward interval. For the stochastic case, we prove that our algorithm guarantees a regret that matches the lower bounds (in order). For the adversarial case, we propose the first algorithm to jointly handle a non-oblivious adversary and an unknown reward interval size. We also conduct simulations based on a real-world dataset. The results show that our algorithms outperform existing benchmarks.
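
To make the feedback model concrete, the following is a minimal sketch of how composite and anonymous feedback could be simulated. It is not the paper's algorithm or its exact reward model. The function name `simulate_composite_feedback`, the Bernoulli rewards, and the even split of each reward over the interval are all assumptions chosen for illustration.

```python
import random


def simulate_composite_feedback(pulls, means, interval, seed=0):
    """Return (observed, total) for a sequence of arm pulls.

    pulls:    arm index chosen at each time step
    means:    hypothetical Bernoulli mean reward of each arm
    interval: assumed reward interval length; the reward of a pull is
              split evenly over this many steps (an assumption of this
              sketch, the paper allows more general spreading)
    """
    rng = random.Random(seed)
    horizon = len(pulls)
    observed = [0.0] * horizon  # what the player actually sees per step
    total = 0.0                 # full reward generated by all pulls

    for t, arm in enumerate(pulls):
        reward = 1.0 if rng.random() < means[arm] else 0.0
        total += reward
        # Composite: the reward arrives in pieces over the interval.
        for k in range(interval):
            if t + k < horizon:
                # Anonymous: pieces from different pulls are summed,
                # so the player cannot tell which arm produced them.
                observed[t + k] += reward / interval

    return observed, total


# Arm 0 always pays 1, arm 1 never pays; rewards spread over 2 steps.
observed, total = simulate_composite_feedback(
    pulls=[0, 1, 0, 1], means=[1.0, 0.0], interval=2
)
```

In this toy run the player observes the same aggregated value at every step, even though only arm 0 ever generates reward. Attributing feedback to arms is therefore the core difficulty, and it becomes harder when the interval length is not known in advance.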

research
10/02/2019

Stochastic Bandits with Delayed Composite Anonymous Feedback

We explore a novel setting of the Multi-Armed Bandit (MAB) problem inspi...
research
06/01/2022

Multi-Armed Bandit Problem with Temporally-Partitioned Rewards: When Partial Feedback Counts

There is a rising interest in industrial online applications where data ...
research
06/28/2023

Allocating Divisible Resources on Arms with Unknown and Random Rewards

We consider a decision maker allocating one unit of renewable and divisi...
research
03/04/2020

Bandits with adversarial scaling

We study "adversarial scaling", a multi-armed bandit model where rewards...
research
08/05/2017

Thompson Sampling Guided Stochastic Searching on the Line for Deceptive Environments with Applications to Root-Finding Problems

The multi-armed bandit problem forms the foundation for solving a wide r...
research
02/12/2019

A Problem-Adaptive Algorithm for Resource Allocation

We consider a sequential stochastic resource allocation problem under th...
research
05/16/2019

Adaptive Sensor Placement for Continuous Spaces

We consider the problem of adaptively placing sensors along an interval ...
