Causal Deep Reinforcement Learning using Observational Data

11/28/2022
by Wenxuan Zhu, et al.

Deep reinforcement learning (DRL) requires collecting large amounts of interventional data, which can be expensive or even unethical in real-world domains such as autonomous driving and medicine. Offline reinforcement learning promises to alleviate this issue by exploiting the vast amount of observational data available in the real world. However, observational data may mislead the learning agent toward undesirable outcomes if the behavior policy that generated the data depends on unobserved random variables (i.e., confounders). In this paper, we propose two deconfounding methods for DRL to address this problem. The methods first compute the importance of each sample using causal inference techniques, and then adjust each sample's impact on the loss function by reweighting or resampling the offline dataset to ensure unbiasedness. These deconfounding methods can be flexibly combined with existing model-free DRL algorithms such as soft actor-critic and deep Q-learning, provided that the loss functions of these algorithms satisfy a weak condition. We prove the effectiveness of our deconfounding methods and validate them experimentally.
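
The abstract names two ways of applying per-sample importance: reweighting the loss and resampling the dataset. The sketch below illustrates only that general idea and is not the authors' implementation. The loss values and weights are hypothetical placeholders; how the paper derives its weights from causal inference is not stated in the abstract and is not reproduced here. The function names `reweighted_loss` and `resample_indices` are likewise made up for illustration.

```python
import numpy as np


def reweighted_loss(per_sample_loss, weights):
    """Self-normalized weighted average of per-sample losses."""
    w = np.asarray(weights, dtype=float)
    loss = np.asarray(per_sample_loss, dtype=float)
    return float(np.sum(w * loss) / np.sum(w))


def resample_indices(weights, n_draws, rng):
    """Draw dataset indices with probability proportional to the weights."""
    w = np.asarray(weights, dtype=float)
    return rng.choice(len(w), size=n_draws, replace=True, p=w / w.sum())


rng = np.random.default_rng(0)

# Hypothetical per-sample losses and importance weights (assumed, not from the paper).
losses = np.array([1.0, 2.0, 3.0, 4.0])
weights = np.array([0.5, 1.0, 1.5, 1.0])

# Option 1: keep the dataset as-is and weight each sample's loss.
target = reweighted_loss(losses, weights)

# Option 2: resample the dataset by weight, then take a plain average.
idx = resample_indices(weights, 200_000, rng)
resampled = float(losses[idx].mean())

print(target, resampled)
```

With enough draws, the plain average over the resampled data approaches the reweighted loss. This is why the two options can be treated as interchangeable ways of correcting the same objective.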


