Pure Exploration of Causal Bandits

06/16/2022
by Nuoya Xiong, et al.

The causal bandit problem integrates causal inference with multi-armed bandits. The pure exploration of causal bandits is the following online learning task: given a causal graph with unknown causal inference distributions, in each round we can choose to either intervene on one variable or do no intervention, and observe the random outcomes of all random variables. The goal is, using as few rounds as possible, to output an intervention that gives the best (or almost best) expected outcome on the reward variable Y with probability at least 1-δ, where δ is a given confidence level. We provide the first gap-dependent, fully adaptive pure exploration algorithms for three types of causal models: parallel graphs, general graphs with a small number of backdoor parents, and binary generalized linear models. Our algorithms improve on both prior causal bandit algorithms, which are not adaptive to reward gaps, and prior adaptive pure exploration algorithms, which do not utilize the special features of causal bandits.
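
To make the protocol concrete, below is a minimal Python sketch of generic (ε, δ)-PAC successive elimination on a toy parallel graph, where each atomic intervention do(X_i = v), plus pure observation, is treated as a separate bandit arm. This is not the paper's algorithm: the helpers `best_intervention` and `make_parallel_env` and the Bernoulli reward model are hypothetical illustrations of the task described above, and this generic baseline ignores the causal structure that the paper's gap-dependent algorithms exploit.

```python
import math
import random

def best_intervention(arms, pull, delta=0.05, eps=0.05):
    """Return an arm whose mean reward is within eps of the best,
    with probability at least 1 - delta (successive elimination)."""
    active = list(arms)
    sums = {a: 0.0 for a in arms}
    t = 0
    while True:
        t += 1
        for a in active:
            sums[a] += pull(a)  # one sample of Y under intervention a
        # Hoeffding radius; the log term union-bounds over arms and rounds
        rad = math.sqrt(math.log(4 * len(arms) * t * t / delta) / (2 * t))
        means = {a: sums[a] / t for a in active}
        leader = max(active, key=means.get)
        # keep only arms still statistically compatible with being eps-best
        active = [a for a in active if means[leader] - means[a] <= 2 * rad]
        if len(active) == 1 or 2 * rad <= eps:
            return max(active, key=means.get)

def make_parallel_env(n=3, seed=0):
    """Toy parallel graph: X_1..X_n are independent Bernoulli parents of Y.
    Arms are 'observe' plus every atomic intervention do(X_i = v)."""
    rng = random.Random(seed)
    p = [rng.random() for _ in range(n)]  # P(X_i = 1)
    w = [rng.random() for _ in range(n)]  # hypothetical influence of X_i on Y
    def pull(arm):
        x = [1 if rng.random() < p[i] else 0 for i in range(n)]
        if arm != "observe":
            i, v = arm
            x[i] = v  # the intervention overrides X_i
        y_prob = sum(wi * xi for wi, xi in zip(w, x)) / sum(w)
        return 1.0 if rng.random() < y_prob else 0.0
    arms = ["observe"] + [(i, v) for i in range(n) for v in (0, 1)]
    return arms, pull

if __name__ == "__main__":
    arms, pull = make_parallel_env()
    print("best arm:", best_intervention(arms, pull))
```

Note that this baseline samples every active arm equally and learns nothing across arms; the setting in the abstract is richer, since even a no-intervention round reveals the outcomes of all variables, which is precisely the kind of structure the paper's algorithms are designed to use.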

research · 06/04/2022 · Combinatorial Causal Bandits
In combinatorial causal bandits (CCB), the learning agent chooses at mos...

research · 03/07/2021 · Hierarchical Causal Bandit
Causal bandit is a nascent learning model where an agent sequentially ex...

research · 10/02/2018 · Contextual Multi-Armed Bandits for Causal Marketing
This work explores the idea of a causal contextual multi-armed bandit ap...

research · 06/06/2023 · Pivotuner: automatic real-time pure intonation and microtonal modulation
Pivotuner is a VST3/AU MIDI effect plugin that automatically tunes note ...

research · 06/13/2023 · Additive Causal Bandits with Unknown Graph
We explore algorithms to select actions in the causal bandit setting whe...

research · 01/31/2023 · Combinatorial Causal Bandits without Graph Skeleton
In combinatorial causal bandits (CCB), the learning agent chooses a subs...

research · 09/21/2021 · Achieving Counterfactual Fairness for Causal Bandit
In online recommendation, customers arrive in a sequential and stochasti...
