Budget-Constrained Bandits over General Cost and Reward Distributions

02/29/2020
by Semih Cayci, et al.

We consider a budget-constrained bandit problem where each arm pull incurs a random cost and yields a random reward in return. The objective is to maximize the total expected reward under a budget constraint on the total cost. The model is general in the sense that it allows correlated and potentially heavy-tailed cost-reward pairs that can take on negative values, as required by many applications. We show that if moments of order (2+γ) for some γ > 0 exist for all cost-reward pairs, O(log B) regret is achievable for a budget B > 0. In order to achieve tight regret bounds, we propose algorithms that exploit the correlation between the cost and reward of each arm by extracting the common information via linear minimum mean-square error estimation. We prove a regret lower bound for this problem, and show that the proposed algorithms achieve tight problem-dependent regret bounds, which are optimal up to a universal constant factor in the case of jointly Gaussian cost and reward pairs.
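
As a rough illustration of the linear minimum mean-square error (LMMSE) idea mentioned in the abstract, the sketch below estimates the coefficient ω = Cov(X, R) / Var(X) from samples of one arm's cost X and reward R, and compares the variance of the residual R − ωX with the variance of R. It is not the authors' algorithm. The Gaussian data model, the slope 0.8, the noise levels, and the helper names `lmmse_coefficient` and `variance` are all assumptions made for this example.

```python
import random


def variance(values):
    """Population variance of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n


def lmmse_coefficient(costs, rewards):
    """Empirical omega = Cov(X, R) / Var(X): slope of the best linear predictor of R from X."""
    n = len(costs)
    mean_x = sum(costs) / n
    mean_r = sum(rewards) / n
    cov_xr = sum((x - mean_x) * (r - mean_r) for x, r in zip(costs, rewards)) / n
    return cov_xr / variance(costs)


# Hypothetical arm: the reward is correlated with the cost (assumed model, for illustration only).
rng = random.Random(0)
costs = [rng.gauss(1.0, 0.5) for _ in range(5000)]
rewards = [0.8 * x + rng.gauss(0.0, 0.3) for x in costs]

omega = lmmse_coefficient(costs, rewards)

# Subtracting omega * X removes the part of the reward that is linearly explained by the cost.
residuals = [r - omega * x for x, r in zip(costs, rewards)]

var_reward = variance(rewards)
var_residual = variance(residuals)
print(f"omega={omega:.3f}, Var(R)={var_reward:.3f}, Var(R - omega*X)={var_residual:.3f}")
```

In this toy setting the residual has a smaller variance than the raw reward, which suggests why removing the information shared between cost and reward can tighten estimates for a correlated arm.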


research
07/12/2021

Continuous Time Bandits With Sampling Costs

We consider a continuous-time multi-arm bandit problem (CTMAB), where th...
research
06/09/2021

A Lyapunov-Based Methodology for Constrained Optimization with Bandit Feedback

In a wide variety of applications including online advertising, contract...
research
02/20/2015

Low-Cost Learning via Active Data Procurement

We design mechanisms for online procurement of data held by strategic ag...
research
10/23/2018

Unifying the stochastic and the adversarial Bandits with Knapsack

This paper investigates the adversarial Bandits with Knapsack (BwK) onli...
research
10/11/2022

Regret Analysis of the Stochastic Direct Search Method for Blind Resource Allocation

Motivated by programmatic advertising optimization, we consider the task...
research
06/30/2020

Continuous-Time Multi-Armed Bandits with Controlled Restarts

Time-constrained decision processes have been ubiquitous in many fundame...
research
10/20/2017

Uniformly bounded regret in the multi-secretary problem

In the secretary problem of Cayley (1875) and Moser (1956), n non-negati...
