Budgeted and Non-Budgeted Causal Bandits
Learning good interventions in a causal graph can be modelled as a stochastic multi-armed bandit problem with side-information. First, we study this problem when interventions are more expensive than observations and a budget is specified. If there are no back-door paths from an intervenable node to the reward node, then we propose an algorithm to minimize simple regret that optimally trades off observations and interventions based on the cost of intervention. We also propose an algorithm that accounts for the cost of interventions, utilizes causal side-information, and minimizes the expected cumulative regret without exceeding the budget. Our cumulative-regret minimization algorithm performs better than standard algorithms that do not take side-information into account. Finally, we study the problem of learning best interventions without a budget constraint in general graphs and give an algorithm that achieves constant expected cumulative regret in terms of the instance parameters when the parent distribution of the reward variable under each intervention is known. Our results are experimentally validated and compared to the best-known bounds in the current literature.
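The key observation behind the simple-regret algorithm is that when no back-door paths run from an intervenable node to the reward node, the interventional mean P(Y | do(X_i = x)) equals the conditional mean P(Y | X_i = x), so a single batch of cheap observational samples estimates every arm's value at once. The following is a minimal sketch of that idea in Python on a toy parallel graph; the graph, its parameters, and all function names are illustrative choices of ours, not the paper's algorithm or notation.

```python
import random

random.seed(0)

# Toy parallel causal graph: K independent binary parents X_0..X_{K-1}
# of a binary reward Y, so there are no back-door paths and
# P(Y=1 | do(X_i = x)) = P(Y=1 | X_i = x) for every arm.
K = 3
P_X = [0.5, 0.3, 0.8]  # hypothetical marginals P(X_i = 1)

def observe():
    """Draw one observational sample: nature sets the X's, then Y."""
    x = [int(random.random() < p) for p in P_X]
    # In this toy model the reward depends only on X_1,
    # so do(X_1 = 1) is the best intervention.
    y = int(random.random() < (0.8 if x[1] else 0.4))
    return x, y

def estimate_from_observations(n):
    """Estimate E[Y | X_i = 1] for every arm from n observational samples."""
    hits = [0] * K
    cnt = [0] * K
    for _ in range(n):
        x, y = observe()
        for i in range(K):
            if x[i]:
                cnt[i] += 1
                hits[i] += y
    return [hits[i] / cnt[i] if cnt[i] else 0.0 for i in range(K)]

means = estimate_from_observations(20000)
best_arm = max(range(K), key=lambda i: means[i])
```

In a budgeted setting, interventions would then be spent only on arms whose parents are rarely observed (here, low P_X values), since those conditional estimates accumulate samples slowly; the cost-dependent split between the two sample types is what the paper's algorithm optimizes.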