Causal Reinforcement Learning using Observational and Interventional Data

06/28/2021
by Maxime Gasse, et al.

Efficiently learning a causal model of the environment is a key challenge for model-based RL agents operating in POMDPs. We consider a scenario where the learning agent can collect online experiences through direct interaction with the environment (interventional data), but also has access to a large collection of offline experiences, obtained by observing another agent interacting with the environment (observational data). A key ingredient that makes this situation non-trivial is that we allow the observed agent to act on hidden information, which the learning agent does not observe. We then ask the following questions: can the online and offline experiences be safely combined for learning a causal model? And can we expect the offline experiences to improve the agent's performance? To answer these questions, we import ideas from the well-established causal framework of do-calculus, and we express model-based reinforcement learning as a causal inference problem. We then propose a general yet simple methodology for leveraging offline data during learning. In a nutshell, the method learns a latent-based causal transition model that explains both the interventional and observational regimes, and then uses the recovered latent variable to infer the standard POMDP transition model via deconfounding. We prove that our method is correct and efficient, in the sense that it attains better generalization guarantees thanks to the offline data (in the asymptotic case), and we illustrate its effectiveness empirically on synthetic toy problems. Our contribution aims at bridging the gap between the fields of reinforcement learning and causality.
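
To make the confounding issue concrete, here is a minimal sketch in Python. It is not the method from the paper: it is a hypothetical one-step problem with a binary hidden variable `u`, and every probability in it is invented for illustration. The names `P_U`, `PI_OBS`, and `P_R` are assumptions of this sketch. It shows that when the observed agent picks actions using `u`, the observational estimate p(r=1 | a) can differ from the interventional quantity p(r=1 | do(a)), and that averaging over the latent with its prior recovers the interventional value. Here the latent distribution is assumed known, whereas the abstract describes learning it.

```python
# Toy confounded one-step problem. All numbers are made up for illustration.

# Prior over the hidden variable u (seen by the observed agent only).
P_U = {0: 0.5, 1: 0.5}

# Observed agent's policy pi(a | u), indexed as PI_OBS[u][a]:
# it mostly picks the action equal to u.
PI_OBS = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.1, 1: 0.9}}

# Reward model p(r = 1 | a, u), indexed by the pair (a, u).
P_R = {(0, 0): 0.8, (0, 1): 0.0, (1, 0): 0.5, (1, 1): 0.5}


def observational(a):
    """p(r=1 | a) in the observational regime: u is weighted by p(u | a)."""
    joint = {u: P_U[u] * PI_OBS[u][a] for u in P_U}  # p(u, a)
    norm = sum(joint.values())                       # p(a)
    return sum(joint[u] / norm * P_R[(a, u)] for u in P_U)


def interventional(a):
    """p(r=1 | do(a)): setting a leaves u at its prior p(u)."""
    return sum(P_U[u] * P_R[(a, u)] for u in P_U)


if __name__ == "__main__":
    for a in (0, 1):
        print(f"a={a}: observational={observational(a):.2f}, "
              f"interventional={interventional(a):.2f}")
```

With these invented numbers, the observational data makes action 0 look better than action 1, while under intervention action 1 is the better choice. This is the sense in which naively pooling offline and online experiences is not safe when the observed agent acts on hidden information.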

research
06/03/2022

Offline Reinforcement Learning with Causal Structured World Models

Model-based methods have recently shown promising for offline reinforcem...
research
11/02/2022

Causal Counterfactuals for Improving the Robustness of Reinforcement Learning

Reinforcement learning (RL) is applied in a wide variety of fields. RL e...
research
02/12/2022

Learning by Doing: Controlling a Dynamical System using Causality, Control, and Reinforcement Learning

Questions in causality, control, and reinforcement learning go beyond th...
research
11/28/2022

Causal Deep Reinforcement Learning using Observational Data

Deep reinforcement learning (DRL) requires the collection of plenty of i...
research
06/22/2020

Provably Efficient Causal Reinforcement Learning with Confounded Observational Data

Empowered by expressive function approximators such as neural networks, ...
research
10/04/2021

Learning to Assist Agents by Observing Them

The ability of an AI agent to assist other agents, such as humans, is an...
research
05/20/2022

Towards biologically plausible Dreaming and Planning

Humans and animals can learn new skills after practicing for a few hours...