Temporal Difference Variational Auto-Encoder

06/08/2018
by Karol Gregor, et al.

One motivation for learning generative models of environments is to use them as simulators for model-based reinforcement learning. Yet when time horizons are long, rolling out single-step transitions is inefficient and often prohibitively expensive. In this paper, we propose a generative model that learns state representations containing explicit beliefs about states several time steps in the future, and that can be rolled out directly in these states without executing single-step transitions. The model is trained on pairs of temporally separated time points using an analogue of the temporal difference learning used in reinforcement learning: the belief about possible futures at one time point serves as a bootstrap for training the belief at an earlier time. While we focus purely on studying the model rather than its use in reinforcement learning, the architecture we design respects agents' constraints, as it builds the representation online.
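
As a rough illustration of what training on "pairs of temporally separated time points" could look like at the data-sampling level, the sketch below draws index pairs (t1, t2) with t1 < t2 from a sequence. It is not taken from the paper: the function name `sample_time_pairs`, the uniform sampling, and the `min_gap` / `max_gap` bounds are all assumptions made for illustration, and the model and its loss are deliberately left out.

```python
import random


def sample_time_pairs(seq_len, num_pairs, min_gap=1, max_gap=4, rng=None):
    """Sample (t1, t2) index pairs with t1 < t2 inside a sequence.

    Hypothetical helper: the gap bounds and uniform sampling are
    illustrative assumptions, not the paper's procedure.
    """
    if not 1 <= min_gap <= max_gap:
        raise ValueError("need 1 <= min_gap <= max_gap")
    if seq_len <= min_gap:
        raise ValueError("sequence too short for the requested gap")
    rng = rng or random.Random()

    pairs = []
    for _ in range(num_pairs):
        # Pick the earlier time so that at least min_gap steps remain after it.
        t1 = rng.randrange(0, seq_len - min_gap)
        # Pick the gap, clipped so that t2 stays inside the sequence.
        gap = rng.randint(min_gap, min(max_gap, seq_len - 1 - t1))
        pairs.append((t1, t1 + gap))
    return pairs


if __name__ == "__main__":
    # Each pair marks an earlier time whose belief would be trained
    # against a bootstrap taken from the later time.
    for t1, t2 in sample_time_pairs(seq_len=20, num_pairs=5, rng=random.Random(0)):
        print(f"train belief at t1={t1} using belief at t2={t2}")
```

In a setup like this, the belief computed at `t2` would play the role of the bootstrap target for the belief at `t1`, mirroring the abstract's description; how those beliefs are parameterized is outside the scope of the sketch.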

research
06/21/2019

Shaping Belief States with Generative Environment Models for RL

When agents interact with a complex environment, they must form and main...
research
04/25/2018

Generative Temporal Models with Spatial Memory for Partially Observed Environments

In model-based reinforcement learning, generative and temporal models of...
research
09/16/2022

A Biologically-Inspired Dual Stream World Model

The medial temporal lobe (MTL), a brain region containing the hippocampu...
research
05/23/2018

Dyna Planning using a Feature Based Generative Model

Dyna-style reinforcement learning is a powerful approach for problems wh...
research
02/03/2021

Neural Recursive Belief States in Multi-Agent Reinforcement Learning

In multi-agent reinforcement learning, the problem of learning to act is...
research
07/12/2021

Modeling Explicit Concerning States for Reinforcement Learning in Visual Dialogue

To encourage AI agents to conduct meaningful Visual Dialogue (VD), the u...
research
01/07/2021

Learning Temporal Dynamics from Cycles in Narrated Video

Learning to model how the world changes as time elapses has proven a cha...
