Learning Transition Models with Time-delayed Causal Relations

08/04/2020
by Junchi Liang, et al.

This paper introduces an algorithm for discovering implicit and delayed causal relations between events observed by a robot at arbitrary times, with the objective of improving the data efficiency and interpretability of model-based reinforcement learning (RL) techniques. The proposed algorithm initially predicts observations under the Markov assumption, then incrementally introduces new hidden variables to explain and reduce the stochasticity of the observations. The hidden variables are memory units that keep track of pertinent past events, and these events are systematically identified by their information gain. The learned transition and reward models are then used for planning. Experiments on simulated and real robotic tasks show that this method significantly improves over current RL techniques.
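
To make the role of information gain concrete, here is a minimal toy sketch, not the authors' algorithm. It assumes discrete events, a binary outcome, and a hypothetical one-bit memory unit that switches on once a chosen trigger event has been seen. The function names (`entropy`, `information_gain`, `memory_trace`) and the toy episode are illustrative assumptions, not taken from the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (bits) of the empirical distribution of `labels`."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(memory, outcomes):
    """Estimate H(outcomes) - H(outcomes | memory) from paired samples."""
    total = len(outcomes)
    conditional = 0.0
    for value in set(memory):
        subset = [o for m, o in zip(memory, outcomes) if m == value]
        conditional += (len(subset) / total) * entropy(subset)
    return entropy(outcomes) - conditional


def memory_trace(events, trigger):
    """Hypothetical memory unit: at each step, 1 if `trigger` was seen at
    any earlier step of the episode, else 0."""
    seen = 0
    trace = []
    for event in events:
        trace.append(seen)  # memory value available before this step's event
        if event == trigger:
            seen = 1
    return trace


# Toy episode (assumed data): the outcome becomes 1 only after "press" occurred.
events = ["noop", "press", "beep", "noop", "beep", "noop"]
outcomes = [0, 0, 1, 1, 1, 1]

# Score each candidate trigger by how much its memory unit explains the outcome.
candidates = ["press", "beep"]
gains = {c: information_gain(memory_trace(events, c), outcomes) for c in candidates}
best_trigger = max(gains, key=gains.get)
```

In this toy episode the memory unit for `"press"` removes all uncertainty about the outcome, so it scores higher than the unit for `"beep"`. A selection rule of this kind would pick it as the event worth remembering.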


