Budgeted Combinatorial Multi-Armed Bandits

02/08/2022
by Debojit Das, et al.

We consider a budgeted combinatorial multi-armed bandit setting where, in every round, the algorithm selects a super-arm consisting of one or more arms. The goal is to minimize the total expected regret after all rounds within a limited budget. Existing techniques in this literature either fix the budget per round or fix the number of arms pulled in each round. Our setting is more general: based on the remaining budget and the remaining number of rounds, the algorithm can decide how many arms to pull in each round. First, we propose the CBwK-Greedy-UCB algorithm, which uses a greedy technique, CBwK-Greedy, to allocate the arms to the rounds. Next, we propose a reduction of this problem to Bandits with Knapsacks (BwK) with a single pull. With this reduction, we propose CBwK-LP-UCB, which makes ingenious use of PrimalDualBwK. We rigorously prove regret bounds for CBwK-LP-UCB. We experimentally compare the two algorithms and observe that CBwK-Greedy-UCB performs incrementally better than CBwK-LP-UCB. We also show that for very high budgets, the regret goes to zero.
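
To make the idea of choosing a variable number of arms per round concrete, here is a minimal sketch. It is not the paper's CBwK-Greedy-UCB, whose details the abstract does not give. It assumes a standard UCB1 index, known per-arm costs, and a per-round budget share equal to the remaining budget divided by the remaining rounds. The function names `ucb_indices` and `greedy_super_arm` and the ratio-based selection rule are hypothetical choices made for illustration.

```python
import math


def ucb_indices(means, counts, t):
    """Standard UCB1 index per arm (an assumed index, not taken from the paper).

    Arms that have never been pulled get +inf so they are tried first.
    """
    return [
        m + math.sqrt(2.0 * math.log(t) / n) if n > 0 else math.inf
        for m, n in zip(means, counts)
    ]


def greedy_super_arm(indices, costs, remaining_budget, remaining_rounds):
    """Greedily build a super-arm for one round.

    Assumption: this round may spend an equal share of what is left,
    remaining_budget / remaining_rounds. Arms are considered in descending
    order of index per unit cost and added while they still fit.
    """
    round_budget = remaining_budget / remaining_rounds
    order = sorted(
        range(len(indices)),
        key=lambda i: indices[i] / costs[i],
        reverse=True,
    )
    chosen, spent = [], 0.0
    for i in order:
        if spent + costs[i] <= round_budget:
            chosen.append(i)
            spent += costs[i]
    return chosen


if __name__ == "__main__":
    # Hypothetical statistics for three arms after 30 total pulls.
    idx = ucb_indices(means=[0.5, 0.9, 0.2], counts=[10, 10, 10], t=30)
    print(greedy_super_arm(idx, costs=[1.0, 2.0, 1.0],
                           remaining_budget=20.0, remaining_rounds=10))
```

In this sketch the size of the super-arm is not fixed in advance: it depends on how much budget is left relative to the rounds left, which mirrors the flexibility the abstract describes without claiming to reproduce the authors' allocation rule.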

research
07/09/2017

Nonlinear Sequential Accepts and Rejects for Identification of Top Arms in Stochastic Bandits

We address the M-best-arm identification problem in multi-armed bandits....
research
02/24/2020

Optimal and Greedy Algorithms for Multi-Armed Bandits with Many Arms

We characterize Bayesian regret in a stochastic multi-armed bandit probl...
research
01/13/2022

Contextual Bandits for Advertising Campaigns: A Diffusion-Model Independent Approach (Extended Version)

Motivated by scenarios of information diffusion and advertising in socia...
research
02/25/2021

Combinatorial Bandits under Strategic Manipulations

We study the problem of combinatorial multi-armed bandits (CMAB) under s...
research
10/05/2021

Contextual Combinatorial Volatile Bandits via Gaussian Processes

We consider a contextual bandit problem with a combinatorial action set ...
research
07/16/2022

Collaborative Best Arm Identification with Limited Communication on Non-IID Data

In this paper, we study the tradeoffs between time-speedup and the numbe...
research
06/17/2016

Structured Stochastic Linear Bandits

The stochastic linear bandit problem proceeds in rounds where at each ro...
