Parallelizing Contextual Linear Bandits

05/21/2021
by Jeffrey Chan, et al.

Standard approaches to decision-making under uncertainty focus on sequential exploration of the space of decisions. However, simultaneously proposing a batch of decisions, which leverages available resources for parallel experimentation, has the potential to rapidly accelerate exploration. We present a family of (parallel) contextual linear bandit algorithms whose regret is nearly identical to that of their perfectly sequential counterparts, given access to the same total number of oracle queries, up to a lower-order "burn-in" term that depends on the context-set geometry. We provide matching information-theoretic lower bounds on parallel regret performance, establishing that our algorithms are asymptotically optimal in the time horizon. Finally, we present an empirical evaluation of these parallel algorithms in several domains, including materials discovery and biological sequence design problems, to demonstrate the utility of parallelized bandits in practical settings.
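To make the parallel setting concrete, here is a minimal sketch of a batched LinUCB-style loop. It is an illustration under stated assumptions, not the algorithm from the paper: the function names (`batched_linucb`, `sherman_morrison`), the unit-sphere contexts, the Gaussian noise, and the fixed bonus weight `beta` are all hypothetical choices. The idea it shows is that rewards within a batch arrive only after the whole batch has been proposed, so the parameter estimate is frozen for the batch, while the covariance term, which does not depend on rewards, can still be updated after each proposal.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(M, v):
    return [dot(row, v) for row in M]


def sherman_morrison(A_inv, x):
    """Return (A + x x^T)^{-1} given A^{-1}, for symmetric A."""
    Ax = matvec(A_inv, x)
    denom = 1.0 + dot(x, Ax)
    d = len(x)
    return [[A_inv[i][j] - Ax[i] * Ax[j] / denom for j in range(d)]
            for i in range(d)]


def random_unit_vector(rng, d):
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(dot(v, v))
    return [vi / norm for vi in v]


def batched_linucb(theta_star, num_rounds, batch_size,
                   num_arms=10, beta=1.0, noise=0.1, seed=0):
    """Hypothetical batched LinUCB sketch; returns (regret, theta_hat)."""
    rng = random.Random(seed)
    d = len(theta_star)
    # Inverse of the regularized design matrix A = I + sum of x x^T.
    A_inv = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
    b = [0.0] * d  # running sum of reward * context
    regret = 0.0
    for _ in range(num_rounds):
        # Ridge estimate, frozen for the batch: no rewards arrive mid-batch.
        theta_hat = matvec(A_inv, b)
        pending = []
        for _ in range(batch_size):
            arms = [random_unit_vector(rng, d) for _ in range(num_arms)]

            def ucb(a):
                width = math.sqrt(max(0.0, dot(a, matvec(A_inv, a))))
                return dot(a, theta_hat) + beta * width

            x = max(arms, key=ucb)
            # The covariance update needs no reward, so apply it right away.
            A_inv = sherman_morrison(A_inv, x)
            reward = dot(x, theta_star) + rng.gauss(0.0, noise)
            pending.append((x, reward))
            best = max(dot(a, theta_star) for a in arms)
            regret += best - dot(x, theta_star)
        # The batch is finished: fold in all delayed rewards at once.
        for x, r in pending:
            b = [bi + r * xi for bi, xi in zip(b, x)]
    return regret, matvec(A_inv, b)
```

Calling `batched_linucb(theta, 100, 1)` and `batched_linucb(theta, 10, 10)` uses the same total number of queries, which is the comparison the abstract describes between sequential and parallel regret. Updating the covariance inside the batch is one design choice among several; it discourages the batch from proposing many near-identical contexts.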


research
04/14/2020

Sequential Batch Learning in Finite-Action Linear Contextual Bandits

We study the sequential batch learning problem in linear contextual band...
research
10/15/2019

Adaptive Exploration in Linear Contextual Bandit

Contextual bandits serve as a fundamental model for many sequential deci...
research
10/23/2020

An Asymptotically Optimal Primal-Dual Incremental Algorithm for Contextual Linear Bandits

In the contextual linear bandit setting, algorithms built on the optimis...
research
10/07/2020

Instance-Dependent Complexity of Contextual Bandits and Reinforcement Learning: A Disagreement-Based Perspective

In the classical multi-armed bandit problem, instance-dependent algorith...
research
02/25/2021

Batched Neural Bandits

In many sequential decision-making problems, the individuals are split i...
research
02/07/2023

Linear Partial Monitoring for Sequential Decision-Making: Algorithms, Regret Bounds and Applications

Partial monitoring is an expressive framework for sequential decision-ma...
research
06/20/2019

Sequential Experimental Design for Transductive Linear Bandits

In this paper we introduce the transductive linear bandit problem: given...
