An Arm-wise Randomization Approach to Combinatorial Linear Semi-bandits

09/05/2019
by   Kei Takemura, et al.

Combinatorial linear semi-bandits (CLS) are a widely applicable framework for sequential decision-making in which a learner repeatedly chooses a subset of arms from a given set, each arm associated with a feature vector. Existing algorithms for this problem perform poorly in the clustered case, where the feature vectors form many large clusters. This is a critical shortcoming in practice, because such situations arise in many applications, including recommender systems. In this paper, we clarify why this shortcoming occurs and introduce a key technique, arm-wise randomization, to overcome it. We propose two algorithms based on this technique: the perturbed C^2UCB (PC^2UCB) and a Thompson sampling (TS) algorithm. Our empirical evaluation on artificial and real-world datasets demonstrates that the proposed algorithms with arm-wise randomization outperform existing algorithms without it, especially in the clustered case. Our contributions also include theoretical analyses that provide high-probability asymptotic regret bounds for our algorithms.
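The idea behind arm-wise randomization can be sketched in a Thompson-sampling style: instead of drawing one random parameter per round and scoring all arms with it (round-wise randomization), an independent sample is drawn for each arm, so arms in the same feature cluster receive decorrelated scores. The sketch below is an illustrative approximation under simple assumptions (ridge-style statistics `A`, `b`, a top-k oracle), not the paper's exact PC^2UCB or TS algorithm; the function names are ours.

```python
import numpy as np

def armwise_ts_scores(X, A, b, v=1.0, rng=None):
    """Arm-wise randomized scores for one linear semi-bandit round (sketch).

    X : (n_arms, d) arm feature vectors
    A : (d, d) regularized Gram matrix, lambda*I + sum_s x_s x_s^T
    b : (d,) sum of reward-weighted features
    v : exploration scale
    """
    rng = np.random.default_rng() if rng is None else rng
    theta_hat = np.linalg.solve(A, b)   # ridge estimate of the parameter
    cov = v**2 * np.linalg.inv(A)       # posterior-like covariance
    scores = np.empty(len(X))
    for i, x in enumerate(X):
        # Key point: a fresh sample per arm (arm-wise), rather than one
        # sample shared by all arms in the round (round-wise).
        theta_i = rng.multivariate_normal(theta_hat, cov)
        scores[i] = x @ theta_i
    return scores

def choose_super_arm(scores, k):
    # Oracle for the simplest combinatorial constraint: pick the top-k arms.
    return np.argsort(scores)[::-1][:k]
```

With many near-duplicate arms, a round-wise sample ranks a whole cluster identically, while the per-arm samples above break those ties and spread exploration within clusters.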


Related research

- 01/20/2021: Near-Optimal Regret Bounds for Contextual Combinatorial Semi-Bandits with Linear Payoff Functions
- 09/04/2019: Censored Semi-Bandits: A Framework for Resource Allocation with Censored Feedback
- 05/31/2023: Combinatorial Neural Bandits
- 03/30/2023: Contextual Combinatorial Bandits with Probabilistically Triggered Arms
- 04/12/2021: Censored Semi-Bandits for Resource Allocation
- 08/21/2023: Clustered Linear Contextual Bandits with Knapsacks
