Generalizing Goal-Conditioned Reinforcement Learning with Variational Causal Reasoning

07/19/2022
by Wenhao Ding, et al.

Reasoning is a pivotal component of how human intelligence attains generalizable solutions, and it offers great potential for helping reinforcement learning (RL) agents generalize to varied goals by summarizing part-to-whole arguments and discovering cause-and-effect relations. However, how to discover and represent causality remains an open problem that hinders the development of causal RL. In this paper, we augment Goal-Conditioned RL (GCRL) with a Causal Graph (CG), a structure built upon the relations between objects and events. We introduce a formulation of the GCRL problem as variational likelihood maximization with the CG as a latent variable. To optimize the derived objective, we propose a framework with theoretical performance guarantees that alternates between two steps: (1) using interventional data to estimate the posterior of the CG, and (2) using the CG to learn generalizable models and interpretable policies. Because no public benchmarks verify generalization capability under reasoning, we design nine tasks and empirically show the effectiveness of the proposed method against five baselines on these tasks. Further theoretical analysis attributes the performance improvement to the virtuous cycle of causal discovery, transition modeling, and policy training, which aligns with the experimental evidence from extensive ablation studies.
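
For readers unfamiliar with variational likelihood maximization over a latent structure, the generic evidence lower bound below may help. It is only a textbook sketch, not the objective derived in the paper: the symbols \tau (observed data), G (a latent causal graph), q(G) (an approximate posterior), p(G) (a prior), and \theta (model and policy parameters) are assumptions introduced here for illustration.

```latex
% Generic evidence lower bound with a discrete latent graph G.
% All symbols are illustrative assumptions, not the paper's notation.
\begin{align}
\log p_\theta(\tau)
  &= \log \sum_{G} p_\theta(\tau \mid G)\, p(G) \\
  &\ge \mathbb{E}_{q(G)}\!\left[ \log p_\theta(\tau \mid G) \right]
     - \mathrm{KL}\!\left( q(G) \,\|\, p(G) \right)
\end{align}
```

Under this reading, the abstract's two alternating steps would correspond to improving q(G) with \theta held fixed (estimating the graph posterior) and then improving \theta with q(G) held fixed (fitting models and policies given the graph). Each step cannot decrease the bound when it is solved exactly.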

research
06/03/2022

Offline Reinforcement Learning with Causal Structured World Models

Model-based methods have recently shown promising for offline reinforcem...
research
08/04/2020

Learning Transition Models with Time-delayed Causal Relations

This paper introduces an algorithm for discovering implicit and delayed ...
research
11/02/2020

Causal Campbell-Goodhart's law and Reinforcement Learning

Campbell-Goodhart's law relates to the causal inference error whereby de...
research
03/12/2021

Discovering Diverse Solutions in Deep Reinforcement Learning

Reinforcement learning (RL) algorithms are typically limited to learning...
research
05/14/2021

Ordering-Based Causal Discovery with Reinforcement Learning

It is a long-standing question to discover causal relations among a set ...
research
06/13/2020

Hindsight Expectation Maximization for Goal-conditioned Reinforcement Learning

We propose a graphical model framework for goal-conditioned RL, with an ...
research
05/28/2023

Learning a Structural Causal Model for Intuition Reasoning in Conversation

Reasoning, a crucial aspect of NLP research, has not been adequately add...
