Adaptively Exploiting d-Separators with Causal Bandits

02/10/2022
by   Blair Bilodeau, et al.
0

Multi-armed bandit problems provide a framework to identify the optimal intervention over a sequence of repeated experiments. Without additional assumptions, minimax optimal performance (measured by cumulative regret) is well-understood. With access to additional observed variables that d-separate the intervention from the outcome (i.e., they are a d-separator), recent causal bandit algorithms provably incur less regret. However, in practice it is desirable to be agnostic to whether observed variables are a d-separator. Ideally, an algorithm should be adaptive; that is, perform nearly as well as an algorithm with oracle knowledge of the presence or absence of a d-separator. In this work, we formalize and study this notion of adaptivity, and provide a novel algorithm that simultaneously achieves (a) optimal regret when a d-separator is observed, improving on classical minimax algorithms, and (b) significantly smaller regret than recent causal bandit algorithms when the observed variables are not a d-separator. Crucially, our algorithm does not require any oracle knowledge of whether a d-separator is observed. We also generalize this adaptivity to other conditions, such as the front-door criterion.

READ FULL TEXT

page 1

page 2

page 3

page 4

01/05/2022

Bridging Adversarial and Nonstationary Multi-armed Bandit

In the multi-armed bandit framework, there are two formulations that are...
07/06/2021

Causal Bandits on General Graphs

We study the problem of determining the best intervention in a Causal Ba...
06/05/2021

Causal Bandits with Unknown Graph Structure

In causal bandit problems, the action set consists of interventions on v...
03/07/2021

Hierarchical Causal Bandit

Causal bandit is a nascent learning model where an agent sequentially ex...
03/30/2021

Optimal Stochastic Nonconvex Optimization with Bandit Feedback

In this paper, we analyze the continuous armed bandit problems for nonco...
10/16/2017

On the Hardness of Inventory Management with Censored Demand Data

We consider a repeated newsvendor problem where the inventory manager ha...
07/16/2020

Sparsity-Agnostic Lasso Bandit

We consider a stochastic contextual bandit problem where the dimension d...

Code Repositories

adaptive-causal-bandits

Code for our paper "Adaptively Exploiting d-Separators with Causal Bandits".


view repo