Chronological Causal Bandits

12/03/2021
by Neil Dhir, et al.

This paper studies an instance of the multi-armed bandit (MAB) problem in which several causal MABs operate chronologically in the same dynamical system. In practice, the reward distribution of each bandit is governed by the same non-trivial dependence structure, a dynamic causal model. The model is dynamic because each causal MAB is allowed to depend on the preceding MAB, which makes it possible to transfer information between agents. Our contribution, the Chronological Causal Bandit (CCB), is useful in discrete decision-making settings where the causal effects change over time and can be informed by earlier interventions in the same system. We present early findings on the CCB, demonstrated on a toy problem.
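To make the idea of transferring information between successive bandits concrete, the following is a minimal sketch and not the authors' CCB algorithm. It assumes Bernoulli rewards, Thompson sampling with Beta posteriors, and a simple discounting rule that turns the posterior of one bandit into the prior of the next. The function names (`run_bandit`, `carry_over`, `run_chronological`) and the `discount` parameter are hypothetical, and the dynamic causal model that governs the transfer in the paper is not modelled here.

```python
import random


def run_bandit(arm_probs, prior, rounds, rng):
    """Thompson sampling on Bernoulli arms.

    Returns the Beta posterior (one [alpha, beta] pair per arm)
    and the total reward collected over `rounds` pulls.
    """
    posterior = [list(ab) for ab in prior]
    total = 0
    for _ in range(rounds):
        # Sample a plausible mean for each arm and pull the best one.
        samples = [rng.betavariate(a, b) for a, b in posterior]
        arm = max(range(len(samples)), key=samples.__getitem__)
        reward = 1 if rng.random() < arm_probs[arm] else 0
        posterior[arm][0] += reward
        posterior[arm][1] += 1 - reward
        total += reward
    return posterior, total


def carry_over(posterior, discount):
    """Assumed transfer rule: shrink pseudo-counts toward Beta(1, 1).

    discount=0 forgets everything; discount=1 keeps the full posterior.
    """
    return [[1 + discount * (a - 1), 1 + discount * (b - 1)]
            for a, b in posterior]


def run_chronological(arm_probs_per_step, rounds, discount, seed=0):
    """Run one bandit per time step, seeding each with the previous posterior."""
    rng = random.Random(seed)
    n_arms = len(arm_probs_per_step[0])
    prior = [[1.0, 1.0] for _ in range(n_arms)]
    rewards = []
    for arm_probs in arm_probs_per_step:
        posterior, total = run_bandit(arm_probs, prior, rounds, rng)
        rewards.append(total)
        prior = carry_over(posterior, discount)
    return rewards, prior


if __name__ == "__main__":
    # Hypothetical toy setting: arm means drift slowly across three time steps.
    steps = [[0.2, 0.8], [0.3, 0.7], [0.4, 0.6]]
    print(run_chronological(steps, rounds=50, discount=0.5))
```

Setting `discount` to 0 recovers independent bandits that start from a uniform prior at every time step, which gives a baseline for judging whether the carried-over information helps in a given toy setting.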

research
06/05/2021

Causal Bandits with Unknown Graph Structure

In causal bandit problems, the action set consists of interventions on v...
research
09/16/2020

Causal Discovery for Causal Bandits utilizing Separating Sets

The Causal Bandit is a variant of the classic Bandit problem where an ag...
research
10/02/2018

Contextual Multi-Armed Bandits for Causal Marketing

This work explores the idea of a causal contextual multi-armed bandit ap...
research
10/26/2021

Dynamic Causal Bayesian Optimization

This paper studies the problem of performing a sequence of optimal inter...
research
12/13/2020

Budgeted and Non-budgeted Causal Bandits

Learning good interventions in a causal graph can be modelled as a stoch...
research
06/21/2022

Using cognitive psychology to understand GPT-3

We study GPT-3, a recent large language model, using tools from cognitiv...
research
08/07/2023

Provably Efficient Learning in Partially Observable Contextual Bandit

In this paper, we investigate transfer learning in partially observable ...
