On Submodular Contextual Bandits

12/03/2021 ∙ by Dean P. Foster, et al.
We consider the contextual bandit problem in which actions are subsets of a ground set and mean rewards are modeled by an unknown monotone submodular function belonging to a class ℱ. We allow time-varying matroid constraints on the feasible sets. Assuming access to an online regression oracle with regret 𝖱𝖾𝗀(ℱ), our algorithm efficiently randomizes around local optima of the estimated functions according to the Inverse Gap Weighting strategy. We show that the cumulative regret of this procedure over a time horizon n scales as O(√(n 𝖱𝖾𝗀(ℱ))) against a benchmark with a multiplicative factor of 1/2. On the other hand, using the techniques of Filmus and Ward (2014), we show that an ϵ-Greedy procedure with local randomization attains regret O(n^{2/3} 𝖱𝖾𝗀(ℱ)^{1/3}) against the stronger (1 − e^{−1}) benchmark.
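The Inverse Gap Weighting strategy mentioned in the abstract is a known exploration scheme from the contextual bandit literature (Abe and Long 1999; Foster and Rakhlin 2020): each suboptimal action is played with probability inversely proportional to its estimated reward gap, and the greedy action receives the remaining mass. A minimal sketch over a finite set of candidate actions (the names `est_rewards` and `gamma` are illustrative, not from the paper) might look like:

```python
import numpy as np

def inverse_gap_weighting(est_rewards, gamma):
    """Inverse Gap Weighting over K candidate actions.

    Each non-greedy action a is played with probability
    1 / (K + gamma * (max_est - est[a])); the greedy action
    (largest estimated reward) receives the remaining mass.
    Larger gamma concentrates the distribution on the greedy action.
    """
    est = np.asarray(est_rewards, dtype=float)
    K = len(est)
    greedy = int(np.argmax(est))
    # Probability for every action based on its gap to the greedy one.
    p = 1.0 / (K + gamma * (est[greedy] - est))
    # The greedy action absorbs whatever probability mass is left over.
    p[greedy] = 0.0
    p[greedy] = 1.0 - p.sum()
    return p

# Example: three actions, learning-rate parameter gamma = 10.
probs = inverse_gap_weighting([1.0, 0.5, 0.2], gamma=10.0)
```

In the paper's setting, the candidates would be local optima of the estimated submodular function rather than raw arms, but the gap-weighting step itself has this form.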

