Causal Bandits with Propagating Inference

06/06/2018
by   Akihiro Yabe, et al.
0

Bandit is a framework for designing sequential experiments. In each experiment, a learner selects an arm A ∈A and obtains an observation corresponding to A. Theoretically, the tight regret lower-bound for the general bandit is polynomial with respect to the number of arms |A|. This makes bandit incapable of handling an exponentially large number of arms, hence the bandit problem with side-information is often considered to overcome this lower bound. Recently, a bandit framework over a causal graph was introduced, where the structure of the causal graph is available as side-information. A causal graph is a fundamental model that is frequently used with a variety of real problems. In this setting, the arms are identified with interventions on a given causal graph, and the effect of an intervention propagates throughout all over the causal graph. The task is to find the best intervention that maximizes the expected value on a target node. Existing algorithms for causal bandit overcame the Ω(√(|A|/T)) simple-regret lower-bound; however, their algorithms work only when the interventions A are localized around a single node (i.e., an intervention propagates only to its neighbors). We propose a novel causal bandit algorithm for an arbitrary set of interventions, which can propagate throughout the causal graph. We also show that it achieves O(√(γ^*(|A|T) / T)) regret bound, where γ^* is determined by using a causal graph structure. In particular, if the in-degree of the causal graph is bounded, then γ^* = O(N^2), where N is the number N of nodes.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
07/06/2021

Causal Bandits on General Graphs

We study the problem of determining the best intervention in a Causal Ba...
research
06/05/2021

Causal Bandits with Unknown Graph Structure

In causal bandit problems, the action set consists of interventions on v...
research
10/11/2019

Regret Analysis of Causal Bandit Problems

We study how to learn optimal interventions sequentially given causal in...
research
06/13/2023

Additive Causal Bandits with Unknown Graph

We explore algorithms to select actions in the causal bandit setting whe...
research
01/10/2017

Identifying Best Interventions through Online Importance Sampling

Motivated by applications in computational advertising and systems biolo...
research
08/26/2022

Causal Bandits for Linear Structural Equation Models

This paper studies the problem of designing an optimal sequence of inter...
research
01/31/2023

Combinatorial Causal Bandits without Graph Skeleton

In combinatorial causal bandits (CCB), the learning agent chooses a subs...

Please sign up or login with your details

Forgot password? Click here to reset